Wednesday, October 5, 2016

UNIFORM RESOURCE LOCATORS (URL) AND UNIVERSAL RESOURCE IDENTIFIERS (URI)



Before turning to a description of the Hypertext Transfer Protocol (HTTP), we
need to examine two important concepts: the Uniform Resource Locator (URL)
and the Universal Resource Identifier (URI).
Uniform Resource Locator
A key concept in the operation of the World-Wide Web (WWW) is that of the Uniform
Resource Locator (URL). In the defining documents (RFC 1738, 1808), the URL
is characterized as follows:
A Uniform Resource Locator (URL) is a compact representation of the location
and access method for a resource available via the Internet. URLs are used to
locate resources by providing an abstract identification of the resource location.
Having located a resource, a system may perform a variety of operations on the
resource, as might be characterized by such words as access, update, replace, and
find attributes. In general, only the access method needs to be specified for any
URL scheme.
A resource is any object that can be accessed by the Internet, and includes file
directories, files, documents, images, audio or video clips, and any other data that
may be stored on an Internet-connected computer. The term resource in this context
also includes electronic mail addresses, the results of a finger or archie command,
USENET newsgroups, and individual messages in a USENET newsgroup.
With the exception of certain dynamic URLs, such as the email address, we
can think of a URL as a networked extension of a filename. The URL provides a
pointer to any object that is accessible on any machine connected to the Internet.
Furthermore, because different objects are accessible in different ways (e.g., via
Web, FTP, Gopher, etc.), the URL also indicates the access method that must be
used to retrieve the object.
The general form of a URL is as follows:
<scheme>:<scheme-specific-part>
The URL consists of the name of the access scheme being used, followed by a
colon, and then by an identifier of a resource whose format is specific to the scheme
being used.
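As a rough illustration of this two-part structure, the following Python sketch splits a URL at the first colon into the scheme name and the scheme-specific identifier. The function name split_url and the sample URLs are assumptions made for illustration only.

```python
def split_url(url: str) -> tuple[str, str]:
    """Split a URL into (scheme, scheme-specific part) at the first colon."""
    scheme, sep, rest = url.partition(":")
    if not sep or not scheme:
        raise ValueError(f"no scheme found in {url!r}")
    # Scheme names are case-insensitive, so normalise to lower case.
    return scheme.lower(), rest


if __name__ == "__main__":
    for sample in ("http://www.example.com/index.html", "mailto:user@example.com"):
        print(split_url(sample))
```

Everything after the colon is left untouched, since its format depends on the scheme.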
Although the scheme-specific formats differ, they have a number of points in
common, as we will see. In particular, many of the access schemes support the use
of hierarchical structures, similar to the hierarchical directory and file structures
common to file systems such as UNIX. For the URL, the components of the hierarchy
are separated by a "/", similar to the UNIX approach.
RFC 1738 defines URL formats for the following access schemes:
File Transfer Protocol (FTP)
The FTP URL scheme designates files and directories accessible using the FTP protocol.
In its simplest form, an FTP URL has the following format:
ftp://<user>:<password>@<host>:<port>/<cwd1>/<cwd2>/.../<cwdN>/<name>;type=<typecode>

After the specification of the host, with an optional user-ID and password,
and a port number, a slash indicates the beginning of the file designation. Each of
the <cwd> elements is a directory name, or, more precisely, an argument to a CWD
(change working directory) command, such as is used in UNIX. The <name> value,
if present, is the name of a file. Finally, the <typecode> value can be used to designate
a particular type of file; otherwise, the type defaults in an implementation-dependent
way.
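The components just described can be pulled apart with Python's standard urllib.parse module. This is only a sketch: the function name parse_ftp_url, the dictionary keys, and the example host are hypothetical, and the default port of 21 is the conventional FTP port rather than something stated in the text.

```python
from urllib.parse import urlsplit, unquote


def parse_ftp_url(url: str) -> dict:
    """Break an FTP URL into host details, CWD arguments, file name and type code."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an FTP URL")
    # The optional ";type=<typecode>" suffix follows the file name.
    path, _, typecode = parts.path.partition(";type=")
    segments = [unquote(s) for s in path.lstrip("/").split("/")]
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port or 21,        # assumed default FTP port
        "cwd": segments[:-1],            # one CWD argument per directory
        "name": segments[-1] or None,    # None when the URL names a directory
        "typecode": typecode or None,
    }
```

For a URL such as ftp://anonymous:guest@ftp.example.com:2121/pub/docs/readme.txt;type=a, the sketch reports the directories pub and docs as successive CWD arguments, readme.txt as the file name, and a as the type code.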
Hypertext Transfer Protocol (HTTP)
The HTTP URL scheme designates accessible Internet resources, using the HTTP
protocol, and, in particular, designates web sites. In its simplest form, an HTTP
URL has the following format:
The Gopher Protocol
The Gopher URL scheme designates Internet resources accessible using the Gopher protocol.
A Gopher URL takes the form:
Selects the Gopher-accessible telephone directory at M.I.T. This directory is searchable
by keyword. A user who accesses this directory can then interactively enter a key
word to initiate a search. Alternatively, this can be part of the URL; for example
Electronic Mail Address
The mailto URL scheme designates the Internet mailing address of an individual or
service. When invoked by a web client, it triggers the creation of an email message
to be sent by Internet electronic mail. For example,
USENET News
The news URL scheme designates either a news group or the individual articles of
USENET news. For example,
USENET News Using NNTP Access
The NNTP URL scheme is an alternative way of designating news articles, useful
for specifying articles from NNTP servers. The general form is
Reference to Interactive Sessions (TELNET)
The TELNET URL scheme designates interactive services accessible by the
TELNET protocol. Thus, this URL does not designate a data object but a service.
Wide Area Information Servers (WAIS)
The WAIS URL scheme designates WAIS databases, searches, or individual documents
available from a WAIS database. A WAIS URL takes one of the following forms:
The first form designates a WAIS database. The second form designates a
search submitted to a database. The third form designates a particular document
within a database, where <wtype> is the WAIS designation of the document type.
Host-Specific File Names
The file URL scheme differs from other URL schemes in that it does not designate
an Internet-accessible object or service. It provides a way of uniquely identifying a
directory or file on an Internet-addressable host, but does not designate an access
protocol. Thus, it has limited utility in a network context.
Prospero Directory Service
The Prospero URL scheme designates resources that are accessed via the Prospero
Directory Service. A prospero URL takes the form
where <hsoname> is the host-specific object name in the Prospero protocol. The
optional clause <field>=<value> serves to identify a particular target entry.
Universal Resource Identifier
Universal Resource Identifier (URI) is a term for a generic WWW identifier. The
URI specification (RFC 1630) defines a syntax for encoding arbitrary naming or
addressing schemes, and provides a list of such schemes. The concept of a URI, and
in particular its details, are still evolving. The URL is a type of URI, in which an
access protocol is designated and a specific Internet address is provided.
The potential advantage of the URI is that it decouples the name of a resource
from its location and even from its access method. With the URL, a specific instance
of a resource at a specific location is designated. If there are multiple instances, and
that specific instance is unavailable at the time of a request, then a requester must
determine an alternative URL and try that. In principle, with a URI, this process
could be automated. In practice, documents such as the HTTP specification refer to
the use of URIs, but are currently implemented using only URLs.
HYPERTEXT TRANSFER PROTOCOL (HTTP)
The Hypertext Transfer Protocol (HTTP) is the foundation protocol of the World-Wide
Web (WWW) and can be used in any client-server application involving
hypertext. The name is somewhat misleading in that HTTP is not a protocol for
transferring hypertext; rather, it is a protocol for transmitting information with
the efficiency necessary for making hypertext jumps. The data transferred by the
protocol can be plain text, hypertext, audio, images, or any Internet-accessible
information.
We begin with an overview of HTTP concepts and operation and then look
at some of the details. A number of important terms defined in the HTTP specification
are summarized in Table 19.11; these will be introduced as the discussion
proceeds.
HTTP Overview
HTTP is a transaction-oriented client/server protocol. The most typical use of
HTTP is between a web browser and a web server. To provide reliability, HTTP
makes use of TCP. Nevertheless, HTTP is a "stateless" protocol: Each transaction
is treated independently. Accordingly, a typical implementation will create a new
TCP connection between client and server for each transaction and then terminate
the connection as soon as the transaction completes, although the specification
does not dictate this one-to-one relationship between transaction and connection
lifetimes.
The stateless nature of HTTP is well-suited to its typical application. A normal
session of a user with a web browser involves retrieving a sequence of web
pages and documents. The sequence is, ideally, performed rapidly, and the locations
of the various pages and documents may be a number of widely distributed servers.
Another important feature of HTTP is that it is flexible in the formats that it
can handle. When a client issues a request to a server, it may include a prioritized
list of formats that it can handle, and the server replies with the appropriate format.
For example, a Lynx browser cannot handle images, so a web server need not transmit
any images on web pages. This arrangement prevents the transmission of unnecessary
information and provides the basis for extending the set of formats with new
standardized and proprietary specifications.
Figure 19.13 illustrates three examples of HTTP operation. The simplest case
is one in which a user agent establishes a direct connection with an origin server. The
user agent is the client that initiates the request, such as a web browser being run on
behalf of an end user. The origin server is the server on which a resource of interest
resides; an example is a web server at which a desired web home page resides. For
this case, the client opens a TCP connection that is end-to-end between the client
and the server. The client then issues an HTTP request. The request consists of a
specific command, referred to as a method, a URL, and a MIME-like message containing
request parameters, information about the client, and perhaps some additional
content information.
When the server receives the request, it attempts to perform the requested
action and then returns an HTTP response. The response includes status information,
a success/error code, and a MIME-like message containing information about
the server, information about the response itself, and possible body content. The
TCP connection is then closed.
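The direct-connection case can be mimicked in a few lines of Python. This is a sketch under simplifying assumptions: a plain function call stands in for the TCP connection, the resource table and the names origin_server and user_agent_get are hypothetical, and only the GET method is handled.

```python
def origin_server(request: bytes, resources: dict[str, bytes]) -> bytes:
    """Answer one request from an in-memory table of resources."""
    request_line = request.split(b"\r\n", 1)[0].decode("ascii")
    method, url, _version = request_line.split(" ")
    if method != "GET":
        status, body = "501 Not Implemented", b""
    elif url in resources:
        status, body = "200 OK", resources[url]
    else:
        status, body = "404 Not Found", b""
    # Status information first, then header fields, a blank line, and the body.
    head = f"HTTP/1.1 {status}\r\nContent-Length: {len(body)}\r\n\r\n"
    return head.encode("ascii") + body


def user_agent_get(url: str, server) -> bytes:
    """Issue a GET for url and return the raw response."""
    request = f"GET {url} HTTP/1.1\r\nUser-Agent: sketch/0.1\r\n\r\n"
    return server(request.encode("ascii"))
```

Each call to user_agent_get is one complete transaction: the server keeps no memory of earlier calls, which mirrors the stateless behavior described above.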
The middle part of Figure 19.13 shows a case in which there is not an end-to-end
TCP connection between the user agent and the origin server. Instead, there
are one or more intermediate systems with TCP connections between logically adjacent
systems. Each intermediate system acts as a relay, so that a request initiated by
the client is relayed through the intermediate systems to the server, and the
response from the server is relayed back to the client.
Three forms of intermediate systems are defined in the HTTP specification:
proxy, gateway, and tunnel, all of which are illustrated in Figure 19.14.
Proxy
A proxy acts on behalf of other clients and presents requests from other clients to
a server. The proxy acts as a server in interacting with a client, and as a client in
interacting with a server. There are several scenarios that call for the use of a proxy:
1. Security intermediary. The client and server may be separated by a security
intermediary such as a firewall, with the proxy on the client side of the firewall.
Typically, the client is part of a network secured by a firewall, and the
server is external to the secured network. In this case, the server must authenticate
itself to the firewall to set up a connection with the proxy. The proxy
accepts responses after they have passed through the firewall.
2. Different versions of HTTP. If the client and server are running different versions
of HTTP, then the proxy can implement both versions and perform the
required mapping.
In summary, a proxy is a forwarding agent, receiving a request for a URL
object, modifying the request, and forwarding that request toward the server identified
in the URL.
Gateway
A gateway is a server that appears to the client as if it were an origin server. It acts
on behalf of other servers that may not be able to communicate directly with a
client. There are several scenarios in which gateways can be used:
1. Security intermediary. The client and server may be separated by a security
intermediary such as a firewall, with the gateway on the server side of the firewall.
Typically, the server is connected to a network protected by a firewall,
with the client external to the network. In this case, the client must authenticate
itself to the gateway, which can then pass the request on to the server.
2. Non-HTTP server. Web browsers have built into them the capability to contact
servers for protocols other than HTTP, such as FTP and Gopher servers.
This capability can also be provided by a gateway. The client makes an HTTP
request to a gateway server. The gateway server then contacts the relevant
FTP or Gopher server to obtain the desired result. This result is then converted
into a form suitable for HTTP and transmitted back to the client.
Tunnel
Unlike the proxy and the gateway, the tunnel performs no operations on HTTP
requests and responses. Instead, a tunnel is simply a relay point between two TCP
connections, and the HTTP messages are passed unchanged as if there were a single
HTTP connection between user agent and origin server. Tunnels are used when
there must be an intermediary system between client and server, but it is not necessary
for that system to understand the contents of messages. An example is a firewall
in which a client or server external to a protected network can establish an
authenticated connection, and which can then maintain that connection for purposes
of HTTP transactions.
Cache
Returning to Figure 19.13, the lowest portion of the figure shows an example of a
cache. A cache is a facility that may store previous requests and responses for handling
new requests. If a new request arrives that is the same as a stored request, then
the cache can supply the stored response rather than accessing the resource indicated
in the URL. The cache can operate on a client or server, or on an intermediate
system other than a tunnel. In the figure, intermediary B has cached a
request/response transaction, so that a corresponding new request from the client
need not travel the entire chain to the origin server, but is handled by B.
Not all transactions can be cached, and a client or server can dictate that a certain
transaction may be cached only for a given time limit.
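The caching behavior can be sketched as a small lookup table with a time limit per entry. The class ResponseCache, the function fetch, and the choice of (method, URL) as the key are assumptions for illustration; a real cache would take many more factors into account when deciding whether two requests are "the same".

```python
import time


class ResponseCache:
    """Store responses keyed by (method, URL), each with an expiry time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    def store(self, method, url, response, max_age):
        self._entries[(method, url)] = (response, self._clock() + max_age)

    def lookup(self, method, url):
        entry = self._entries.get((method, url))
        if entry is None:
            return None
        response, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[(method, url)]   # time limit has passed
            return None
        return response


def fetch(cache, method, url, origin, max_age=60):
    """Serve from the cache when possible; otherwise ask the origin."""
    cached = cache.lookup(method, url)
    if cached is not None:
        return cached
    response = origin(method, url)
    cache.store(method, url, response, max_age)
    return response
```

An intermediary such as B in the figure would call fetch for each incoming request, passing a function that forwards the request along the rest of the chain as origin.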
Messages
The best way to describe the functionality of HTTP is to describe the individual elements
of the HTTP message. HTTP consists of two types of messages: requests
from clients to servers, and responses from servers to clients. The general structure
of such messages is shown in Figure 19.15. More formally, using enhanced BNF
(Backus-Naur Form) notation (Table 19.12), we have
HTTP-Message = Simple-Request | Simple-Response | Full-Request | Full-Response
Full-Request = Request-Line
               *( General-Header | Request-Header | Entity-Header )
               CRLF
               [ Entity-Body ]
Full-Response = Status-Line
                *( General-Header | Response-Header | Entity-Header )
                CRLF
                [ Entity-Body ]
Simple-Request = "GET" SP Request-URI CRLF
Simple-Response = [ Entity-Body ]
The Simple-Request and Simple-Response messages were defined in
HTTP/0.9. The request is a simple GET command with the requested URI; the
response is simply a block containing the information identified in the URI. In
HTTP/1.1, the use of these simple forms is discouraged because it prevents the
client from using content negotiation and the server from identifying the media type
of the returned entity.
With full requests and responses, the following fields are used:
Request-line. Identifies the message type and the requested resource.
Status-line. Provides status information about this response.
General-header. Contains fields that are applicable to both request and
response messages, but which do not apply to the entity being transferred.
Request-header. Contains information about the request and the client.
Response-header. Contains information about the response.
Entity-header. Contains information about the resource identified by the
request and information about the entity body.
Entity-body. The body of the message.
All of the HTTP headers consist of a sequence of fields, following the same
generic format as RFC 822 (described in Section 19.3). Each field begins on a new
line and consists of the field name followed by a colon and the field value.
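The message layout (a start line, header fields of the form name, colon, value, a blank line, then an optional body) can be taken apart with a short Python function. The name parse_message is hypothetical, and the sketch assumes each header field fits on a single line.

```python
def parse_message(raw: bytes):
    """Return (start_line, header_fields, entity_body) for a full message."""
    head, _, body = raw.partition(b"\r\n\r\n")   # blank line ends the headers
    start_line, *field_lines = head.decode("ascii").split("\r\n")
    fields = []
    for line in field_lines:
        # Split at the first colon only; values such as dates contain colons.
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header field: {line!r}")
        fields.append((name.strip(), value.strip()))
    return start_line, fields, body
```

The fields are returned as an ordered list of pairs rather than a dictionary, so that repeated field names and the original order are preserved.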
Although the basic transaction mechanism is simple, there are a large number
of fields and parameters defined in HTTP; these are listed in Table 19.13. In the
remainder of this section, we look at the general header fields. Succeeding sections
describe request headers, response headers, and entities.
General Header Fields
General header fields can be used in both request and response messages. These
fields are applicable in both types of messages and contain information that does
not directly apply to the entity being transferred. The fields are the following:
Cache-Control. Specifies directives that must be obeyed by any caching mechanisms
along the request/response chain; the purpose is to prevent a cache
from adversely interfering with this particular request or response.
Connection. Contains a list of keywords and header-field names that only
apply to this TCP connection between the sender and the nearest non-tunnel
recipient.
Date. Date and time at which the message originated.
Forwarded. Used by gateways and proxies to indicate intermediate steps
along a request or response chain. Each gateway or proxy that handles a message
may attach a Forwarded field that gives its URI.
Keep-Alive. May be present if the Keep-Alive keyword is present in an
incoming Connection field, to provide information to the requester of the persistent
connection. This field may indicate a maximum time that the sender
will keep the connection open while waiting for the next request or the maximum
number of additional requests that will be allowed on the current persistent
connection.
MIME-Version. Indicates that the message complies with the indicated version
of MIME.
Pragma. Contains implementation-specific directives that may apply to any
recipient along the request/response chain.
Upgrade. Used in a request to specify what additional protocols the client
supports and would like to use; used in a response to indicate which protocol
will be used.
Two of these fields warrant further elaboration: Cache-Control and Connection.
Cache-Control
A Cache-Control field can be attached to either a request or a response. Any
caching mechanisms that receive a message with this header must follow the directives
in the header, which may mean deviating from the default caching action. This
field has the following format:
That is, this field consists of the phrase "Cache-Control:" followed by one or more
directives.
A cachable directive is included in a response to indicate that the server generating
the response declares it to be cachable. Any caching mechanism that forwards
this response may cache it for future use.
A max-age directive is used in a request to inform any caching mechanism en
route that it may use a cached response to this message only if it has a cached
response that is no older than the age specified. A server may include this directive
in a response to inform any caching mechanism en route that it may cache this
response for future requests up to the max-age time limit.
A private directive in a response indicates that parts of the response message
are intended for a single user and must not be cached except within a non-shared
cache controlled by the user agent. If no field names are listed, the entire message
is private.
A no-cache directive in a request forces that request to be forwarded to the
origin server and not answered by an intermediate cache. This directive allows a
client to request an authoritative response or to refresh a suspect cache. The list of
field names is not used in a request message. In a response, the no-cache directive
indicates that part or all of the message must not be cached for future use.
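The way a cache might act on these directives in a request can be sketched as follows. The function names are hypothetical, and the parser is deliberately naive: it splits on commas, so a quoted list of field names that itself contains commas would need a fuller parser.

```python
def parse_cache_control(value: str) -> dict:
    """Map each directive name to its argument (None when it has none)."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, arg = item.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def may_use_cached(request_cc: str, age_seconds: int) -> bool:
    """Decide whether a cached response of the given age may answer a request."""
    d = parse_cache_control(request_cc)
    if "no-cache" in d:
        return False                      # must be forwarded to the origin server
    if d.get("max-age") is not None:
        return age_seconds <= int(d["max-age"])
    return True
```

With no directives present, the sketch falls back to using the cached copy, standing in for whatever the default caching action would be.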
Connection
A Connection field can be attached to either a request or a response. It is used to
communicate from one end point of a TCP connection to the other end point. Thus,
this field is not end-to-end at the HTTP level. When an intermediary system
receives and forwards a message containing this field, that system must remove the
field prior to forwarding.
The body of this field may include one or more field names for fields included
in this message. These fields are to be processed by the recipient and not forwarded
with the rest of the message. Alternatively, the body may consist of one or more
keywords. At present, only the Keep-Alive keyword is defined in version 1.1 of
HTTP; this indicates that the sender would like a persistent TCP connection (one
that remains open beyond the current transaction).
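An intermediary's handling of the Connection field might look like the sketch below, which removes the field itself and any fields it names before the message is forwarded. The function name strip_hop_by_hop and the sample field values are assumptions; header fields are represented as (name, value) pairs.

```python
def strip_hop_by_hop(fields):
    """Drop the Connection field and every field it names before forwarding."""
    named = set()
    for name, value in fields:
        if name.lower() == "connection":
            named.update(t.strip().lower() for t in value.split(",") if t.strip())
    return [
        (name, value)
        for name, value in fields
        if name.lower() != "connection" and name.lower() not in named
    ]
```

Field names are compared without regard to case, so a Connection value of Keep-Alive also removes a Keep-Alive field from the forwarded message.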
Request Messages
A full-request message consists of a request line followed by one or more general,
request, and entity headers, followed by an optional entity body.
Request Methods
A full request message always begins with a Request-Line, which has the following
format:
Request-Line = Method SP Request-URI SP HTTP-Version CRLF
The Method parameter indicates the actual request command, called a
method in HTTP. Request-URI is the URI of the requested resource, and HTTP-Version
is the version number of HTTP used by the sender.
The following request methods are defined in HTTP/1.1:
OPTIONS. A request for information about the options available for the
request/response chain identified by this URI.
GET. A request to retrieve the information identified in the URI and return
it in an entity body. A GET is conditional if the If-Modified-Since header field
is included, and is partial if a Range header field is included.
HEAD. This request is identical to a GET, except that the server's response
must not include an entity body; all of the header fields in the response are the
same as if the entity body were present; this enables a client to get information
about a resource without transferring the entity body.
POST. A request to accept the attached entity as a new subordinate to the
identified URI. The posted entity is subordinate to that URI in the same way
that a file is subordinate to a directory containing it, a news article is subordinate
to a newsgroup to which it is posted, or a record is subordinate to a database.
PUT. A request to accept the attached entity and store it under the supplied
URI. This may be a new resource with a new URI, or a replacement of the
contents of an existing resource with an existing URI.
PATCH. Similar to a PUT, except that the entity contains a list of differences
from the content of the original resource identified in the URI.
COPY. Requests that a copy of the resource identified by the URI in the
Request-Line be copied to the location(s) given in the URI-Header field in
the Entity-Header of this message.
MOVE. Requests that the resource identified by the URI in the Request-Line
be moved to the location(s) given in the URI-Header field in the Entity-
Header of this message; equivalent to a COPY followed by a DELETE.
DELETE. Requests that the origin server delete the resource identified by
the URI in the Request-Line.
LINK. Establishes one or more link relationships from the resource identified
in the Request-Line. The links are defined in the Link field in the Entity-
Header.
UNLINK. Removes one or more link relationships from the resource identified
in the Request-Line. The links are defined in the Link field in the Entity-
Header.
TRACE. Requests that the server return whatever is received as the entity
body of the response; this can be used for testing and diagnostic purposes.
WRAPPED. Allows a client to send one or more encapsulated requests. The
requests may be encrypted or otherwise processed. The server must unwrap
the requests and process accordingly.
Extension-method. Allows additional methods to be defined without changing
the protocol, but these methods cannot be assumed to be recognizable by
the recipient.
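A recipient could split a Request-Line and check the method against the list above along the following lines. The names DEFINED_METHODS and parse_request_line are hypothetical, and the sketch treats any method outside the list as an extension method rather than rejecting it.

```python
DEFINED_METHODS = {
    "OPTIONS", "GET", "HEAD", "POST", "PUT", "PATCH", "COPY",
    "MOVE", "DELETE", "LINK", "UNLINK", "TRACE", "WRAPPED",
}


def parse_request_line(line: str):
    """Split 'Method SP Request-URI SP HTTP-Version' into its three parts."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed Request-Line: {line!r}")
    method, uri, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError(f"bad HTTP-Version: {version!r}")
    # Anything outside the defined set is treated as an extension method.
    is_extension = method not in DEFINED_METHODS
    return method, uri, version, is_extension
```

A server would typically dispatch on the returned method and answer an unrecognized extension method with an error status.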
Request Header Fields
Request header fields function as request modifiers, providing additional information
and parameters related to the request. The following fields are defined in
HTTP/1.1:
Accept. A list of media types and ranges that are acceptable as a response to
this request.
Accept-charset. A list of character sets acceptable for the response.
Accept-encoding. List of acceptable content encodings for the entity body.
Content encodings are primarily used to allow a document to be compressed
or encrypted. Typically, the resource is stored in this encoding and only
decoded before actual use.
Accept-language. Restricts the set of natural languages that are preferred for
the response.
Authorization. Contains a field value, referred to as credentials, used by the
client to authenticate itself to the server.
From. The Internet e-mail address for the human user who controls the
requesting user agent.
Host. Specifies the Internet host of the resource being requested.
If-modified-since. Used with the GET method. This header includes a
date/time parameter; the resource is to be transferred only if it has been modified
since the date/time specified. This feature allows for efficient cache
update. A caching mechanism can periodically issue GET messages to an origin
server, and will receive only a small response message unless an update is
needed.
Proxy-authorization. Allows the client to identify itself to a proxy that
requires authentication.
Range. For future study. The intent is that, in a GET message, a client can
request only a portion of the identified resource.
Referer. The URI of the resource from which the Request-URI was obtained.
This enables a server to generate lists of back-links.
Unless. Similar in function to the If-Modified-Since field, with two differences:
(1) It is not restricted to the GET method, and (2) comparison is based
on any Entity-Header field value rather than a date/time value.
User-agent. Contains information about the user agent originating this request.
This is used for statistical purposes, the tracing of protocol violations,
and automated recognition of user agents for the sake of tailoring responses
to avoid particular user agent limitations.
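The If-Modified-Since behavior described in the list above can be sketched from the server's side. The function name conditional_get is hypothetical; the sketch assumes that the "small response" is the 304 (Not Modified) status, that request headers arrive as a dictionary, and that last_modified is a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get(headers: dict, last_modified: datetime, body: bytes):
    """Return (status_code, body), sending the body only if it has changed."""
    since = headers.get("If-Modified-Since")
    if since is not None:
        try:
            threshold = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            threshold = None              # unparseable date: ignore the field
        if threshold is not None and last_modified <= threshold:
            return 304, b""               # Not Modified, no entity body
    return 200, body
```

A cache that revalidates periodically pays the cost of a full transfer only when the resource has actually changed since the date it supplies.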
Response Messages
A full-response message consists of a status line followed by one or more general,
response, and entity headers, followed by an optional entity body.
Status Codes
A full-response message always begins with a Status-Line, which has the following
format:
Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
The HTTP-Version value is the version number of HTTP used by the sender.
The Status-Code is a 3-digit integer that indicates the response to a received
request, and the Reason-Phrase provides a short textual explanation of the status
code.
There are a rather large number of status codes defined in HTTP/1.1; these
are listed in Table 19.14, together with a brief definition. The codes are organized
into the following categories:

Informational. The request has been received and processing continues. No
entity body accompanies this response.
Successful. The request was successfully received, understood, and accepted.
The information returned in the response message depends on the request
method, as follows:
-GET: The contents of the entity body correspond to the requested
resource.
-HEAD: No entity body is returned.
-POST: The entity describes or contains the result of the action.
-TRACE: The entity contains the request message.
-Other methods: The entity describes the result of the action.
Redirection. Further action is required to complete the request.
Client error. The request contains a syntax error or the request cannot be fulfilled.
Server error. The server failed to fulfill an apparently valid request.
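A client can determine the category of a response from the Status-Line alone. The sketch below relies on the HTTP convention that the first digit of the 3-digit code selects the category (1 through 5, in the order listed above); the names STATUS_CLASSES and parse_status_line are hypothetical.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def parse_status_line(line: str):
    """Split 'HTTP-Version SP Status-Code SP Reason-Phrase' and classify the code."""
    version, code_text, reason = line.rstrip("\r\n").split(" ", 2)
    if len(code_text) != 3 or not code_text.isdigit():
        raise ValueError(f"bad Status-Code: {code_text!r}")
    code = int(code_text)
    category = STATUS_CLASSES.get(code // 100, "Unknown")
    return version, code, reason, category
```

The split is limited to two spaces because the Reason-Phrase may itself contain spaces.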
Response Header Fields
Response header fields provide additional information related to the response
that cannot be placed in the Status-Line. The following fields are defined in
HTTP/1.1:
Location. Defines the exact location of the resource identified by the
Request-URI.
Proxy-authenticate. Included with a response that has a status code of Proxy
Authentication Required. This field contains a "challenge" that indicates the
authentication scheme and parameters required.
Public. Lists the non-standard methods supported by this server.
Retry-after. Included with a response that has a status code of Service
Unavailable, and indicates how long the service is expected to be unavailable.
Server. Identifies the software product used by the origin server to handle the
request.
WWW-authenticate. Included with a response that has a status code of Unauthorized.
This field contains a challenge that indicates the authentication
scheme and parameters required.
Entities
An entity consists of an entity header and an entity body in a request or response
message. An entity may represent a data resource, or it may constitute other information
supplied with a request or response.
Entity Header Fields
Entity header fields provide optional information about the entity body or, if no
body is present, about the resource identified by the request. The following fields
are defined in HTTP/1.1:
Allow. Lists methods supported by the resource identified in the Request-
URI. This field must be included with a response that has a status code of
Method Not Allowed and may be included in other responses.
Content-encoding. Indicates what content encodings have been applied to the
resource. The only encoding currently defined is zip compression.
Content-language. Identifies the natural language(s) of the intended audience
of the enclosed entity.
Content-length. The size of the entity body in octets.
Content-MD5. For future study. MD5 refers to the MD5 hash code function,
described in Lesson 18.
Content-range. For future study. The intent is that this designation will indicate
a portion of the identified resource that is included in this response.
Content-type. Indicates the media type of the entity body.
Content-version. A version tag associated with an evolving entity.
Derived-from. Indicates the version tag of the resource from which this entity
was derived before modifications were made by the sender. This field and the
Content-Version field can be used to manage multiple updates by a group
of users.
Expires. Date/time after which the entity should be considered stale.
Last-modified. Date/time that the sender believes the resource was last modified.
Link. Defines links to other resources.
Title. A textual title for the entity.
Transfer-encoding. Indicates what type of transformation has been applied to
the message body to safely transfer it between the sender and the recipient.
The only encoding defined in the standard is chunked. The chunked option
defines a procedure for breaking an entity body into labeled chunks that are
transmitted separately.
URI-header. Informs the recipient of other URIs by which the resource can
be identified.
Extension-header. Allows additional fields to be defined without changing the
protocol, but these fields cannot be assumed to be recognizable by the recipient.
Entity Body
An entity body consists of an arbitrary sequence of octets. HTTP is designed to be
able to transfer any type of content, including text, binary data, audio, images, and
video. When an entity body is present in a message, the interpretation of the octets
in the body is determined by the entity header fields Content-Encoding, Content-
Type, and Transfer-Encoding. These define a three-layer, ordered encoding model:
entity-body := Transfer-Encoding( Content-Encoding( Content-Type( data ) ) )
The data are the contents of a resource identified by a URI. The Content-
Type field determines the way in which the data are interpreted. A Content-Encoding
may be applied to the data and stored at the URI instead of the data. Finally,
on transfer, a Transfer-Encoding may be applied to form the entity body of the
message.
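The ordering of the layers can be made concrete with a short Python sketch. Several assumptions apply: gzip from the standard library stands in for the content encoding, the chunk framing (a hexadecimal length line, the data, and a final zero-length chunk) follows the usual HTTP/1.1 chunked layout, and the chunk size of 8 octets is arbitrary. The Content-Type layer involves no transformation; it only tells the recipient how to interpret the decoded data.

```python
import gzip


def chunk(data: bytes, size: int = 8) -> bytes:
    """Apply a chunked transfer encoding: hex length, CRLF, data, CRLF."""
    out = b""
    for i in range(0, len(data), size):
        piece = data[i:i + size]
        out += f"{len(piece):x}\r\n".encode("ascii") + piece + b"\r\n"
    return out + b"0\r\n\r\n"             # zero-length chunk ends the body


def unchunk(body: bytes) -> bytes:
    """Reverse chunk(): read each length line, then that many octets."""
    data, pos = b"", 0
    while True:
        eol = body.index(b"\r\n", pos)
        length = int(body[pos:eol], 16)
        if length == 0:
            return data
        start = eol + 2
        data += body[start:start + length]
        pos = start + length + 2          # skip the CRLF after the chunk


def make_entity_body(data: bytes) -> bytes:
    """Transfer-Encoding( Content-Encoding( data ) ), outermost applied last."""
    return chunk(gzip.compress(data))


def read_entity_body(body: bytes) -> bytes:
    """Undo the layers in the opposite order."""
    return gzip.decompress(unchunk(body))
```

The recipient must remove the transfer encoding first and the content encoding second, the reverse of the order in which the sender applied them.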
Access Authentication
HTTP/1.1 defines a simple challenge-response technique for authentication. This
definition does not restrict HTTP clients and servers from using other forms of
authentication, but the current standard only covers this simple form.
Two authentication exchanges are defined: one between a client and a server,
and one between a client and a proxy. Both types of exchange use a challenge-response
mechanism. The challenge, issued by a server or proxy, is of the form
challenge = auth-scheme 1*SP realm *( "," auth-param )
auth-scheme = token
auth-param = token "=" quoted-string
realm = "realm" "=" realm-value
realm-value = quoted-string
Auth-scheme is the name of a particular authentication scheme. The realm
defines a particular protection space, which is simply a conceptual partition of the
resource, with its own authentication scheme and authorization database. For
example, a resource may define several realms, such as one for end users and one
for network managers. The latter realm may carry more privileges and therefore
require a more powerful authentication scheme.
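As a rough illustration of the challenge grammar, the following Python sketch splits a challenge into its auth-scheme and its parameters, treating realm as one of the quoted-string parameters. The function name is hypothetical, the scheme name "Strong" and the parameter "level" in the second example are invented, and the parsing is deliberately simplified rather than a full implementation of the grammar.

```python
import re

# name="quoted-string", allowing backslash-escaped characters inside the quotes
_PARAM = re.compile(r'([^\s=,"]+)\s*=\s*"((?:[^"\\]|\\.)*)"')


def parse_challenge(value: str):
    """Return (auth_scheme, params) for a challenge such as 'Basic realm="x"'."""
    parts = value.strip().split(None, 1)  # scheme, then one or more spaces
    if len(parts) != 2:
        raise ValueError("challenge needs an auth-scheme and a realm")
    scheme, rest = parts
    params = {
        name.lower(): re.sub(r"\\(.)", r"\1", raw)  # drop the escaping backslashes
        for name, raw in _PARAM.findall(rest)
    }
    if "realm" not in params:
        raise ValueError("challenge has no realm")
    return scheme, params


# One realm for end users, one (hypothetical scheme) for network managers
user_challenge = parse_challenge('Basic realm="end-users"')
manager_challenge = parse_challenge('Strong realm="network-managers", level="2"')
```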
In response to an authentication challenge, a client must provide credentials.
These are of the form
credentials = basic-credentials | auth-scheme *("," auth-param )
Basic credentials are covered below. In the general case, the client returns
the name of the authentication scheme and the set of parameters required to authenticate
itself.
Client-Server Authentication
A user agent that wishes to authenticate itself with a server may do so by including
an Authorization field in the request header; an agent may do this when initially
sending the request. An alternative, which may be more common, is that a client
sends a Request message without an Authorization field and is then required by
the server to return an authorization. Figure 19.16 illustrates this scenario, which
involves three steps:
1. The client sends a request, such as a GET request, to the server with no
Authorization field in the request header.
2. The server returns a response with a status code of Unauthorized (401) in the
Status line and a WWW-Authenticate field in the response header. The
WWW-Authenticate field consists of a challenge that indicates the type of
authentication required and may include other parameters. No entity body is
returned.
3. The client repeats the request but includes an Authorization field that contains
the authorization data needed by the server.
If authentication succeeds, the server returns a response with some other status
code and without a WWW-Authenticate field. If authentication fails, the server
can initiate a new authentication sequence by returning a response with a status of
Unauthorized and a WWW-Authenticate field containing the (possibly new) challenge.
The entity body should explain the reason for the refusal.
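The exchange can be simulated without a network by modeling the server as a plain Python function. Everything named here is hypothetical: the realm, the expected credentials string (treated as an opaque value), and the tuple layout of a response. The sketch only follows the sequence of messages and does not implement any real authentication check.

```python
CHALLENGE = 'Basic realm="WallyWorld"'           # hypothetical challenge
EXPECTED = "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="  # opaque credentials for this sketch


def server(request_headers: dict):
    """Return (status, response_headers, entity_body) for a GET request."""
    supplied = request_headers.get("Authorization")
    if supplied == EXPECTED:
        return 200, {}, "the resource"
    if supplied is None:
        # First contact: challenge the client, with no entity body
        return 401, {"WWW-Authenticate": CHALLENGE}, ""
    # Failed attempt: challenge again and explain the refusal
    return 401, {"WWW-Authenticate": CHALLENGE}, "credentials rejected"


def client_get(credentials: str):
    """Run the three steps and return the status codes seen plus the final body."""
    statuses = []
    headers = {}
    status, response_headers, body = server(headers)       # step 1: no Authorization
    statuses.append(status)
    if status == 401 and "WWW-Authenticate" in response_headers:  # step 2: challenged
        headers["Authorization"] = credentials              # step 3: repeat with credentials
        status, response_headers, body = server(headers)
        statuses.append(status)
    return statuses, body


good = client_get(EXPECTED)
bad = client_get("Basic bogus")
```

A real user agent would typically prompt the user again or give up after a second Unauthorized response; this sketch simply stops after one retry.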
In client-server authentication, any proxy or gateway must be transparent, as
far as authentication is concerned. That is, the WWW-Authenticate and Authorization
fields must be forwarded unmodified, and the response to a request containing
an Authorization field must not be cached. This latter requirement ensures
that authentication always takes place between client and server, and that a cache does not
simply replay the server's prior acceptance of authentication.
Proxy Authentication
A proxy may be configured so that a client must first authenticate itself to the proxy
before being granted access to an origin server. The sequence is similar to that
described for client-server authentication. In this case, the authentication information
is carried in the Proxy-Authorization field in the request header. A client may
authenticate itself when first issuing a Request message. Alternatively, a scenario
similar to Figure 19.16 occurs:
1. The client sends a request, such as a GET request, to the server with no
Proxy-Authorization field in the request header.
2. The proxy does not forward the request, but returns a response with a status
code of Proxy Authentication Required (407) in the Status line and a
Proxy-Authenticate field in the response header.
3. The client repeats the request but includes a Proxy-Authorization field that
contains the authorization data needed by the proxy.
If the request is authenticated, then the proxy may forward the request to a
server, but will omit the Proxy-Authorization field. The proxy could also return a
cached response.
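The proxy's part in this exchange can be sketched in Python as follows. The credentials string, the realm, and the stand-in origin server are all hypothetical, and header names are matched case-sensitively for brevity, which a real proxy would not do.

```python
PROXY_CREDENTIALS = "Basic cHJveHk6c2VjcmV0"  # opaque, hypothetical value
PROXY_CHALLENGE = 'Basic realm="proxy"'       # hypothetical challenge


def origin_server(request_headers: dict):
    """Stand-in origin server that reports which header fields reached it."""
    return 200, {}, "origin saw: " + ", ".join(sorted(request_headers))


def proxy(request_headers: dict):
    """Challenge unauthenticated clients; otherwise forward the request."""
    if request_headers.get("Proxy-Authorization") != PROXY_CREDENTIALS:
        # Do not forward; demand proxy authentication instead
        return 407, {"Proxy-Authenticate": PROXY_CHALLENGE}, ""
    # Forward the request with the Proxy-Authorization field omitted
    forwarded = {
        name: value
        for name, value in request_headers.items()
        if name != "Proxy-Authorization"
    }
    return origin_server(forwarded)


first = proxy({"Host": "example.com"})
second = proxy({"Host": "example.com", "Proxy-Authorization": PROXY_CREDENTIALS})
```

The origin server never sees the proxy credentials, which keeps the two authentication exchanges independent of each other.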
Basic Authentication Scheme
For the basic authentication scheme, a user agent authenticates itself within a particular
realm by supplying a user ID and a password. This is the simplest form of
authentication, comparable to logging on to a system. Within HTTP, there is no
provision for protecting the user ID or password with encryption, so this method
provides minimal security. The form of the credentials for basic authentication is
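A minimal Python sketch of building and reading basic credentials follows. It assumes the convention from the HTTP specification in which the scheme name Basic is followed by the base64 encoding of the user ID and password joined by a colon; the function names and the sample user ID and password are illustrative only.

```python
import base64


def make_basic_credentials(user_id: str, password: str) -> str:
    """Build the value of an Authorization field for the basic scheme."""
    if ":" in user_id:
        raise ValueError("the user ID may not contain a colon")
    pair = f"{user_id}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(pair).decode("ascii")


def read_basic_credentials(field_value: str):
    """Recover (user_id, password) from a basic Authorization field value."""
    scheme, _, token = field_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a basic credentials value")
    user_id, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user_id, password


header_value = make_basic_credentials("Aladdin", "open sesame")
recovered = read_basic_credentials(header_value)
```

Base64 is a reversible encoding rather than encryption, so anyone who observes the field can recover the password exactly as read_basic_credentials does. This is why the scheme provides only minimal security.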
