teknik informatika: UNIFORM RESOURCE LOCATORS (URL) AND UNIVERSAL RESOURCE IDENTIFIERS (URI)

UNIFORM RESOURCE LOCATORS (URL) AND UNIVERSAL RESOURCE IDENTIFIERS (URI)

Before turning to a description of the Hypertext Transfer Protocol (HTTP), we

need to examine two important concepts: the Uniform Resource Locator (URL)

and the Universal Resource Identifier (URI).

Uniform Resource Locator

A key concept in the operation of the World-Wide Web (WWW) is that of Uniform

Resource Locator (URL). In the defining documents (RFC 1738 and RFC 1808), the URL

is characterized as follows:

A Uniform Resource Locator (URL) is a compact representation of the location

and access method for a resource available via the Internet. URLs are used to

locate resources by providing an abstract identification of the resource location.

Having located a resource, a system may perform a variety of operations on the

resource, as might be characterized by such words as access, update, replace, and

find attributes. In general, only the access method needs to be specified for any

URL scheme.

A resource is any object that can be accessed by the Internet, and includes file

directories, files, documents, images, audio or video clips, and any other data that

may be stored on an Internet-connected computer. The term resource in this context

also includes electronic mail addresses, the results of a finger or archie command,

USENET newsgroups, and individual messages in a USENET newsgroup.

With the exception of certain dynamic URLs, such as the email address, we

can think of a URL as a networked extension of a filename. The URL provides a

pointer to any object that is accessible on any machine connected to the Internet.

Furthermore, because different objects are accessible in different ways (e.g., via

Web, FTP, Gopher, etc.), the URL also indicates the access method that must be

used to retrieve the object.

The general form of a URL is as follows:

<scheme>:<scheme-specific-part>

The URL consists of the name of the access scheme being used, followed by a

colon, and then by an identifier of a resource whose format is specific to the scheme

being used.

Although the scheme-specific formats differ, they have a number of points in

common, as we will see. In particular, many of the access schemes support the use

of hierarchical structures, similar to the hierarchical directory and file structures

common to file systems such as UNIX. For the URL, the components of the hierarchy

are separated by a "/", similar to the UNIX approach.
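
The two-part structure described above can be illustrated with a minimal Python sketch. It is not taken from RFC 1738; the function names split_url and hierarchy and the example.com address are assumptions made for illustration only.

```python
def split_url(url: str) -> tuple[str, str]:
    """Split a URL into (scheme, scheme-specific part) at the first colon."""
    scheme, sep, rest = url.partition(":")
    if not sep or not scheme:
        raise ValueError(f"not a URL: {url!r}")
    return scheme.lower(), rest


def hierarchy(scheme_specific: str) -> list[str]:
    """Return the "/"-separated components that follow the host part, if any."""
    if scheme_specific.startswith("//"):
        # Drop the "//host" portion that many schemes place first.
        _, _, path = scheme_specific[2:].partition("/")
    else:
        path = scheme_specific
    return [part for part in path.split("/") if part]


# Hypothetical URL used only for illustration.
scheme, rest = split_url("http://www.example.com/docs/guide/intro.html")
print(scheme)           # http
print(hierarchy(rest))  # ['docs', 'guide', 'intro.html']
```

The scheme name tells the client which access method to use; everything after the colon is interpreted according to that scheme's own rules.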

RFC 1738 defines URL formats for the following access schemes:

File Transfer Protocol (FTP)

The FTP URL scheme designates files and directories accessible using the FTP protocol.

In its simplest form, an FTP URL has the following format:

ftp://<user>:<password>@<host>:<port>/<cwd1>/<cwd2>/.../<cwdN>/<name>;type=<typecode>

After the specification of the host, with an optional user-ID and password,

and a port number, a slash indicates the beginning of the file designation. Each of

the <cwd> elements is a directory name, or, more precisely, an argument to a CWD

(change working directory) command, such as is used in UNIX. The <name> value,

if present, is the name of a file. Finally, the <typecode> value can be used to designate

a particular type of file; otherwise, the type defaults in an implementation-dependent

way.
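
As a rough illustration of how these pieces fit together, the following Python sketch separates an FTP URL into its <cwd> elements, <name>, and <typecode>. It assumes the standard-library urllib.parse module; the function name parse_ftp_url, the host, and the credentials are hypothetical.

```python
from urllib.parse import unquote, urlsplit


def parse_ftp_url(url: str) -> dict:
    """Break an FTP URL into host details, CWD arguments, file name, and type."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("expected an ftp URL")
    # Separate the optional ";type=<typecode>" suffix from the path.
    path, _, typecode = parts.path.partition(";type=")
    segments = path.lstrip("/").split("/") if path.strip("/") else []
    # A trailing slash means the URL names a directory, not a file.
    is_directory = path.endswith("/") or not segments
    name = None if is_directory else unquote(segments[-1])
    cwds = segments if name is None else segments[:-1]
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port or 21,  # 21 is the default FTP port
        "cwd": [unquote(s) for s in cwds if s],
        "name": name,
        "typecode": typecode or None,
    }


# Hypothetical URL used only for illustration.
info = parse_ftp_url("ftp://anonymous:guest@ftp.example.com/pub/docs/readme.txt;type=a")
print(info["cwd"], info["name"], info["typecode"])
```

A client following this decomposition would issue one CWD command per element of the cwd list before retrieving the named file.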

Hypertext Transfer Protocol (HTTP)

The HTTP URL scheme designates accessible Internet resources, using the HTTP

protocol, and, in particular, designates web sites. In its simplest form, an HTTP

URL has the following format:

http://<host>:<port>/<path>?<searchpart>

The Gopher Protocol

The Gopher URL scheme designates Internet resources accessible using the Gopher protocol.

A Gopher URL takes the form:

gopher://<host>:<port>/<gopher-path>

For example, a Gopher URL can select the Gopher-accessible telephone directory at M.I.T. This directory is searchable

by keyword. A user who accesses this directory can then interactively enter a

keyword to initiate a search. Alternatively, the keyword can be included as part of the URL.

Electronic Mail Address

The mailto URL scheme designates the Internet mailing address of an individual or

service. When invoked by a web client, it triggers the creation of an email message

to be sent by Internet electronic mail. For example,

USENET News

The news URL scheme designates either a news group or the individual articles of

USENET news. For example,

USENET News Using NNTP Access

The NNTP URL scheme is an alternative way of designating news articles, useful

for specifying articles from NNTP servers. The general form is

nntp://<host>:<port>/<newsgroup-name>/<article-number>

Reference to Interactive Sessions (TELNET)

The TELNET URL scheme designates interactive services accessible by the

TELNET protocol. Thus, this URL does not designate a data object but a service.

Wide Area Information Servers (WAIS)

The WAIS URL scheme designates WAIS databases, searches, or individual documents

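available from a WAIS database. A WAIS URL takes one of the following forms:

wais://<host>:<port>/<database>

wais://<host>:<port>/<database>?<search>

wais://<host>:<port>/<database>/<wtype>/<wpath>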

The first form designates a WAIS database. The second form designates a

search submitted to a database. The third form designates a particular document

within a database, where <wtype> is the WAIS designation of the document type.

Host-Specific File Names

The file URL scheme differs from other URL schemes in that it does not designate

an Internet-accessible object or service. It provides a way of uniquely identifying a

directory or file on an Internet-addressable host, but does not designate an access

protocol. Thus, it has limited utility in a network context.

Prospero Directory Service

The Prospero URL scheme designates resources that are accessed via the Prospero

Directory Service. A Prospero URL takes the form

prospero://<host>:<port>/<hsoname>;<field>=<value>

where <hsoname> is the host-specific object name in the Prospero protocol. The

optional clause <field>=<value> serves to identify a particular target entry.

Universal Resource Identifier

Universal Resource Identifier (URI) is a term for a generic WWW identifier. The

URI specification (RFC 1630) defines a syntax for encoding arbitrary naming or

addressing schemes, and provides a list of such schemes. The concept of a URI, and

in particular its details, are still evolving. The URL is a type of URI, in which an

access protocol is designated and a specific Internet address is provided.

The potential advantage of the URI is that it decouples the name of a resource

from its location and even from its access method. With the URL, a specific instance

of a resource at a specific location is designated. If there are multiple instances, and

that specific instance is unavailable at the time of a request, then a requester must

determine an alternative URL and try that. In principle, with a URI, this process

could be automated. In practice, documents such as the HTTP specification refer to

the use of URIs, but are currently implemented using only URLs.

HYPERTEXT TRANSFER PROTOCOL (HTTP)

The Hypertext Transfer Protocol (HTTP) is the foundation protocol of the World-Wide

Web (WWW) and can be used in any client-server application involving

hypertext. The name is somewhat misleading in that HTTP is not a protocol for

transferring hypertext; rather, it is a protocol for transmitting information with

the efficiency necessary for making hypertext jumps. The data transferred by the

protocol can be plain text, hypertext, audio, images, or any Internet-accessible

information.

We begin with an overview of HTTP concepts and operation and then look

at some of the details. A number of important terms defined in the HTTP specification

are summarized in Table 19.11; these will be introduced as the discussion

proceeds.

HTTP Overview

HTTP is a transaction-oriented client/server protocol. The most typical use of

HTTP is between a web browser and a web server. To provide reliability, HTTP

makes use of TCP. Nevertheless, HTTP is a "stateless" protocol: Each transaction

is treated independently. Accordingly, a typical implementation will create a new

TCP connection between client and server for each transaction and then terminate

the connection as soon as the transaction completes, although the specification

does not dictate this one-to-one relationship between transaction and connection

lifetimes.

The stateless nature of HTTP is well-suited to its typical application. A normal

session of a user with a web browser involves retrieving a sequence of web

pages and documents. The sequence is, ideally, performed rapidly, and the locations

of the various pages and documents may be a number of widely distributed servers.

Another important feature of HTTP is that it is flexible in the formats that it

can handle. When a client issues a request to a server, it may include a prioritized

list of formats that it can handle, and the server replies with the appropriate format.

For example, a Lynx browser cannot handle images, so a web server need not transmit

any images on web pages. This arrangement prevents the transmission of unnecessary

information and provides the basis for extending the set of formats with new

standardized and proprietary specifications.

Figure 19.13 illustrates three examples of HTTP operation. The simplest case

is one in which a user agent establishes a direct connection with an origin server. The

user agent is the client that initiates the request, such as a web browser being run on

behalf of an end user. The origin server is the server on which a resource of interest

resides; an example is a web server at which a desired web home page resides. For

this case, the client opens a TCP connection that is end-to-end between the client

and the server. The client then issues an HTTP request. The request consists of a

specific command, referred to as a method, a URL, and a MIME-like message containing

request parameters, information about the client, and perhaps some additional

content information.

When the server receives the request, it attempts to perform the requested

action and then returns an HTTP response. The response includes status information,

a success/error code, and a MIME-like message containing information about

the server, information about the response itself, and possible body content. The

TCP connection is then closed.
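
The request/response exchange just described can be sketched in Python without any networking. This is a toy illustration only: the RESOURCES table, the ToyServer/0.1 name, and the function names are assumptions, and a real server handles far more cases.

```python
CRLF = "\r\n"

# Hypothetical resources held by a toy origin server.
RESOURCES = {"/index.html": "<html>Welcome</html>"}


def build_request(method, uri, headers):
    """Assemble the method, URL, and MIME-like header fields into one message."""
    lines = [f"{method} {uri} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return CRLF.join(lines) + CRLF + CRLF


def origin_server(request):
    """Attempt the requested action and return a complete response message."""
    method, uri, _version = request.split(CRLF, 1)[0].split(" ")
    body = ""
    if method != "GET":
        status = "501 Not Implemented"
    elif uri in RESOURCES:
        status, body = "200 OK", RESOURCES[uri]
    else:
        status = "404 Not Found"
    head = [
        f"HTTP/1.1 {status}",                        # status information
        "Server: ToyServer/0.1",                     # information about the server
        f"Content-Length: {len(body.encode('utf-8'))}",
    ]
    return CRLF.join(head) + CRLF + CRLF + body


request = build_request("GET", "/index.html", {"Host": "www.example.com"})
print(origin_server(request))
```

Because origin_server keeps nothing between calls, each transaction is handled independently, which is the stateless behavior the text describes.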

The middle part of Figure 19.13 shows a case in which there is not an end-to-end

TCP connection between the user agent and the origin server. Instead, there

are one or more intermediate systems with TCP connections between logically adjacent

systems. Each intermediate system acts as a relay, so that a request initiated by

the client is relayed through the intermediate systems to the server, and the

response from the server is relayed back to the client.

Three forms of intermediate systems are defined in the HTTP specification:

proxy, gateway, and tunnel, all of which are illustrated in Figure 19.14.

Proxy

A proxy acts on behalf of other clients and presents requests from other clients to

a server. The proxy acts as a server in interacting with a client, and as a client in

interacting with a server. There are several scenarios that call for the use of a proxy:

1. Security intermediary. The client and server may be separated by a security

intermediary such as a firewall, with the proxy on the client side of the firewall.

Typically, the client is part of a network secured by a firewall, and the

server is external to the secured network. In this case, the server must authenticate

itself to the firewall to set up a connection with the proxy. The proxy

accepts responses after they have passed through the firewall.

2. Different versions of HTTP. If the client and server are running different versions

of HTTP, then the proxy can implement both versions and perform the

required mapping.

In summary, a proxy is a forwarding agent, receiving a request for a URL

object, modifying the request, and forwarding that request toward the server identified

in the URL.

Gateway

A gateway is a server that appears to the client as if it were an origin server. It acts

on behalf of other servers that may not be able to communicate directly with a

client. There are several scenarios in which gateways can be used:

1. Security intermediary. The client and server may be separated by a security

intermediary such as a firewall, with the gateway on the server side of the firewall.

Typically, the server is connected to a network protected by a firewall,

with the client external to the network. In this case, the client must authenticate

itself to the gateway, which can then pass the request on to the server.

2. Non-HTTP server. Web browsers have built into them the capability to contact

servers for protocols other than HTTP, such as FTP and Gopher servers.

This capability can also be provided by a gateway. The client makes an HTTP

request to a gateway server. The gateway server then contacts the relevant

FTP or Gopher server to obtain the desired result. This result is then converted

into a form suitable for HTTP and transmitted back to the client.

Tunnel

Unlike the proxy and the gateway, the tunnel performs no operations on HTTP

requests and responses. Instead, a tunnel is simply a relay point between two TCP

connections, and the HTTP messages are passed unchanged as if there were a single

HTTP connection between user agent and origin server. Tunnels are used when

there must be an intermediary system between client and server, but it is not necessary

for that system to understand the contents of messages. An example is a firewall

in which a client or server external to a protected network can establish an

authenticated connection, and which can then maintain that connection for purposes

of HTTP transactions.

Cache

Returning to Figure 19.13, the lowest portion of the figure shows an example of a

cache. A cache is a facility that may store previous requests and responses for handling

new requests. If a new request arrives that is the same as a stored request, then

the cache can supply the stored response rather than accessing the resource indicated

in the URL. The cache can operate on a client or server, or on an intermediate

system other than a tunnel. In the figure, intermediary B has cached a

request/response transaction, so that a corresponding new request from the client

need not travel the entire chain to the origin server, but is handled by B.

Not all transactions can be cached, and a client or server can dictate that a certain

transaction may be cached only for a given time limit.
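
A minimal Python sketch of such a cache follows. It is an illustration under stated assumptions, not the behavior of any particular product: the ResponseCache class, its parameters, and the injected fetch and clock callables are all hypothetical.

```python
import time


class ResponseCache:
    """Stores responses by URL and reuses them until their time limit expires."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch   # callable that forwards a request toward the origin
        self._clock = clock   # injectable so the time limit can be tested
        self._store = {}      # url -> (response, expiry time)

    def get(self, url, max_age=60.0, cacheable=True):
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and now < entry[1]:
            return entry[0]              # handled by the cache
        response = self._fetch(url)      # travel on toward the origin server
        if cacheable:
            self._store[url] = (response, now + max_age)
        return response


# Demonstration with a fake origin and a fake clock.
calls = []
now = [0.0]


def fake_fetch(url):
    calls.append(url)
    return f"response for {url}"


cache = ResponseCache(fake_fetch, clock=lambda: now[0])
cache.get("http://www.example.com/a", max_age=10)
cache.get("http://www.example.com/a", max_age=10)  # answered from the cache
now[0] = 11.0                                      # the time limit has passed
cache.get("http://www.example.com/a", max_age=10)  # fetched again
print(len(calls))  # 2
```

Passing cacheable=False models a transaction that must not be cached at all.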

Messages

The best way to describe the functionality of HTTP is to describe the individual elements

of the HTTP message. HTTP consists of two types of messages: requests

from clients to servers, and responses from servers to clients. The general structure

of such messages is shown in Figure 19.15. More formally, using enhanced BNF

(Backus-Naur Form) notation (Table 19.12), we have

HTTP-Message = Simple-Request | Simple-Response | Full-Request | Full-Response

Full-Request = Request-Line
               *( General-Header | Request-Header | Entity-Header )
               CRLF
               [ Entity-Body ]

Full-Response = Status-Line
                *( General-Header | Response-Header | Entity-Header )
                CRLF
                [ Entity-Body ]

Simple-Request = "GET" SP Request-URI CRLF

Simple-Response = [ Entity-Body ]

The Simple-Request and Simple-Response messages were defined in

HTTP/0.9. The request is a simple GET command with the requested URI; the

response is simply a block containing the information identified in the URI. In

HTTP/1.1, the use of these simple forms is discouraged because it prevents the

client from using content negotiation and the server from identifying the media type

of the returned entity.

With full requests and responses, the following fields are used:

Request-line. Identifies the message type and the requested resource.

Status-line. Provides status information about this response.

General-header. Contains fields that are applicable to both request and

response messages, but which do not apply to the entity being transferred.

Request-header. Contains information about the request and the client.

Response-header. Contains information about the response.

Entity-header. Contains information about the resource identified by the

request and information about the entity body.

Entity-body. The body of the message.

All of the HTTP headers consist of a sequence of fields, following the same

generic format as RFC 822 (described in Section 19.3). Each field begins on a new

line and consists of the field name followed by a colon and the field value.
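
Because the header fields follow the generic RFC 822 layout, a general-purpose mail header parser can read them. The Python sketch below assumes the standard-library email.parser module; the header values are invented for illustration.

```python
from email.parser import Parser

# Hypothetical header block: one field per line, then an empty line.
raw_headers = (
    "Date: Tue, 15 Nov 1994 08:12:31 GMT\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 3495\r\n"
    "\r\n"
)


def parse_header_fields(text):
    """Return the header fields as a list of (field name, field value) pairs."""
    message = Parser().parsestr(text, headersonly=True)
    return list(message.items())


for name, value in parse_header_fields(raw_headers):
    print(f"{name!r} -> {value!r}")
```

Each pair corresponds to one line of the header: the text before the colon is the field name and the text after it is the field value.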

Although the basic transaction mechanism is simple, there are a large number

of fields and parameters defined in HTTP; these are listed in Table 19.13. In the

remainder of this section, we look at the general header fields. Succeeding sections

describe request headers, response headers, and entities.

General Header Fields

General header fields can be used in both request and response messages. These

fields are applicable in both types of messages and contain information that does

not directly apply to the entity being transferred. The fields are the following:

Cache-Control. Specifies directives that must be obeyed by any caching mechanisms

along the request/response chain; the purpose is to prevent a cache

from adversely interfering with this particular request or response.

Connection. Contains a list of keywords and header-field names that only

apply to this TCP connection between the sender and the nearest non-tunnel

recipient.

Date. Date and time at which the message originated.

Forwarded. Used by gateways and proxies to indicate intermediate steps

along a request or response chain. Each gateway or proxy that handles a message

may attach a Forwarded field that gives its URI.

Keep-Alive. May be present if the Keep-Alive keyword is present in an

incoming Connection field, to provide information to the requester of the persistent

connection. This field may indicate a maximum time that the sender

will keep the connection open while waiting for the next request or the maximum

number of additional requests that will be allowed on the current persistent

connection.

MIME-Version. Indicates that the message complies with the indicated version

of MIME.

Pragma. Contains implementation-specific directives that may apply to any

recipient along the request/response chain.

Upgrade. Used in a request to specify what additional protocols the client

supports and would like to use; used in a response to indicate which protocol

will be used.

Two of these fields warrant further elaboration: Cache-Control and Connection.

Cache-Control

A Cache-Control field can be attached to either a request or a response. Any

caching mechanisms that receive a message with this header must follow the directives

in the header, which may mean deviating from the default caching action. This

field has the following format:

That is, this field consists of the phrase "Cache-Control:" followed by one or more

directives.

A cachable directive is included in a response to indicate that the server generating

the response declares it to be cachable. Any caching mechanism that forwards

this response may cache it for future use.

A max-age directive is used in a request to inform any caching mechanism en

route that it may use a cached response to this message only if it has a cached

response that is no older than the age specified. A server may include this directive

in a response to inform any caching mechanism en route that it may cache this

response for future requests up to the max-age time limit.

A private directive in a response indicates that parts of the response message

are intended for a single user and must not be cached except within a non-shared

cache controlled by the user agent. If no field names are listed, the entire message

is private.

A no-cache directive in a request forces that request to be forwarded to the

origin server and not answered by an intermediate cache. This directive allows a

client to request an authoritative response or to refresh a suspect cache. The list of

field names is not used in a request message. In a response, the no-cache directive

indicates that part or all of the message must not be cached for future use.
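
The directives above can be read with a short Python sketch. The exact field grammar is not reproduced in this text, so the syntax assumed here (comma-separated directives, an optional "=" argument, and quoted lists of field names) is an assumption, as are the function names.

```python
import re

# name, optionally followed by =value or ="quoted list of field names"
_DIRECTIVE = re.compile(r'([A-Za-z][A-Za-z0-9-]*)(?:=("[^"]*"|[^,\s]+))?')


def parse_cache_control(value):
    """Turn a Cache-Control field value into a dict of directives."""
    directives = {}
    for name, arg in _DIRECTIVE.findall(value):
        name = name.lower()
        if arg.startswith('"'):
            # Quoted argument: a list of field names.
            directives[name] = [f.strip() for f in arg.strip('"').split(",")]
        elif arg:
            directives[name] = int(arg) if arg.isdigit() else arg
        else:
            directives[name] = True
    return directives


def may_use_cached(request_directives, cached_age_seconds):
    """Decide whether a cache may answer a request from a stored response."""
    if "no-cache" in request_directives:
        return False  # must be forwarded to the origin server
    max_age = request_directives.get("max-age")
    return max_age is None or cached_age_seconds <= max_age


print(parse_cache_control('max-age=3600, private="Title, Link"'))
print(may_use_cached(parse_cache_control("max-age=60"), cached_age_seconds=90))
```

The second call shows a cached response that is older than the requested max-age, so the cache has to pass the request along.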

Connection

A Connection field can be attached to either a request or a response. It is used to

communicate from one end point of a TCP connection to the other end point. Thus,

this field is not end-to-end at the HTTP level. When an intermediary system

receives and forwards a message containing this field, that system must remove the

field prior to forwarding.

The body of this field may include one or more field names for fields included

in this message. These fields are to be processed by the recipient and not forwarded

with the rest of the message. Alternatively, the body may consist of one or more

keywords. At present, only the Keep-Alive keyword is defined in version 1.1 of

HTTP; this indicates that the sender would like a persistent TCP connection (one

that remains open beyond the current transaction).
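
The forwarding rule for the Connection field can be sketched in Python as follows. This is a simplified illustration: the function name is hypothetical, and the Keep-Alive value shown is an invented placeholder rather than the syntax defined by the specification.

```python
def strip_connection_fields(headers):
    """Return the header list an intermediary should forward.

    headers is a list of (name, value) pairs. The Connection field and every
    field it names apply only to the current TCP connection, so both are
    removed before the message is passed on.
    """
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            listed.update(token.strip().lower() for token in value.split(","))
    drop = listed | {"connection"}
    return [(n, v) for n, v in headers if n.lower() not in drop]


incoming = [
    ("Host", "www.example.com"),
    ("Connection", "Keep-Alive"),
    ("Keep-Alive", "timeout=10"),  # hypothetical value, for illustration only
    ("Accept", "text/html"),
]
print(strip_connection_fields(incoming))
# [('Host', 'www.example.com'), ('Accept', 'text/html')]
```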

Request Messages

A full-request message consists of a request line followed by one or more general,

request, and entity headers, followed by an optional entity body.

Request Methods

A full request message always begins with a Request-Line, which has the following

format:

Request-Line = Method SP Request-URI SP HTTP-Version CRLF

The Method parameter indicates the actual request command, called a

method in HTTP. Request-URI is the URI of the requested resource, and HTTP-Version

is the version number of HTTP used by the sender.

The following request methods are defined in HTTP/1.1:

OPTIONS. A request for information about the options available for the

request/response chain identified by this URI.

GET. A request to retrieve the information identified in the URI and return

it in an entity body. A GET is conditional if the If-Modified-Since header field

is included, and is partial if a Range header field is included.

HEAD. This request is identical to a GET, except that the server's response

must not include an entity body; all of the header fields in the response are the

same as if the entity body were present; this enables a client to get information

about a resource without transferring the entity body.

POST. A request to accept the attached entity as a new subordinate to the

identified URI. The posted entity is subordinate to that URI in the same way

that a file is subordinate to a directory containing it, a news article is subordinate

to a newsgroup to which it is posted, or a record is subordinate to a database.

PUT. A request to accept the attached entity and store it under the supplied

URI. This may be a new resource with a new URI, or a replacement of the

contents of an existing resource with an existing URI.

PATCH. Similar to a PUT, except that the entity contains a list of differences

from the content of the original resource identified in the URI.

COPY. Requests that a copy of the resource identified by the URI in the

Request-Line be copied to the location(s) given in the URI-Header field in

the Entity-Header of this message.

MOVE. Requests that the resource identified by the URI in the Request-Line

be moved to the location(s) given in the URI-Header field in the Entity-

Header of this message; equivalent to a COPY followed by a DELETE.

DELETE. Requests that the origin server delete the resource identified by

the URI in the Request-Line.

LINK. Establishes one or more link relationships from the resource identified

in the Request-Line. The links are defined in the Link field in the Entity-

Header.

UNLINK. Removes one or more link relationships from the resource identified

in the Request-Line. The links are defined in the Link field in the Entity-

Header.

TRACE. Requests that the server return whatever is received as the entity

body of the response; this can be used for testing and diagnostic purposes.

WRAPPED. Allows a client to send one or more encapsulated requests. The

requests may be encrypted or otherwise processed. The server must unwrap

the requests and process accordingly.

Extension-method. Allows additional methods to be defined without changing

the protocol, but these methods cannot be assumed to be recognizable by

the recipient.

Request Header Fields

Request header fields function as request modifiers, providing additional information

and parameters related to the request. The following fields are defined in

HTTP/1.1:

Accept. A list of media types and ranges that are acceptable as a response to

this request.

Accept-charset. A list of character sets acceptable for the response.

Accept-encoding. List of acceptable content encodings for the entity body.

Content encodings are primarily used to allow a document to be compressed

or encrypted. Typically, the resource is stored in this encoding and only

decoded before actual use.

Accept-language. Restricts the set of natural languages that are preferred for

the response.

Authorization. Contains a field value, referred to as credentials, used by the

client to authenticate itself to the server.

From. The Internet e-mail address for the human user who controls the

requesting user agent.

Host. Specifies the Internet host of the resource being requested.

If-modified-since. Used with the GET method. This header includes a

date/time parameter; the resource is to be transferred only if it has been modified

since the date/time specified. This feature allows for efficient cache

update. A caching mechanism can periodically issue GET messages to an origin

server, and will receive only a small response message unless an update is

needed.

Proxy-authorization. Allows the client to identify itself to a proxy that

requires authentication.

Range. For future study. The intent is that, in a GET message, a client can

request only a portion of the identified resource.

Referer. The URI of the resource from which the Request-URI was obtained.

This enables a server to generate lists of back-links.

Unless. Similar in function to the If-Modified-Since field, with two differences:

(1) It is not restricted to the GET method, and (2) comparison is based

on any Entity-Header field value rather than a date/time value.

User-agent. Contains information about the user agent originating this request.

This is used for statistical purposes, the tracing of protocol violations,

and automated recognition of user agents for the sake of tailoring responses

to avoid particular user agent limitations.
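
The If-Modified-Since behavior described in the list above can be sketched in Python as a server-side decision. This is a minimal illustration, assuming the standard-library email.utils module for date parsing; the function name and the dates are hypothetical, and 304 is the Not Modified status code.

```python
from email.utils import parsedate_to_datetime


def conditional_get(if_modified_since, last_modified, body):
    """Return (status code, entity body) for a GET carrying If-Modified-Since."""
    since = parsedate_to_datetime(if_modified_since)
    modified = parsedate_to_datetime(last_modified)
    if modified <= since:
        return 304, ""   # not modified: a small response with no entity body
    return 200, body     # modified since the given date: send the resource


# The cached copy dates from 15 Nov; the resource last changed on 29 Oct.
print(conditional_get(
    "Tue, 15 Nov 1994 12:45:26 GMT",
    "Sat, 29 Oct 1994 19:43:31 GMT",
    "<html>page</html>",
))
```

A cache that polls an origin server this way transfers the full resource only when its stored copy is actually out of date.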

Response Messages

A full-response message consists of a status line followed by one or more general,

response, and entity headers, followed by an optional entity body.

Status Codes

A full-response message always begins with a Status-Line, which has the following

format:

Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

The HTTP-Version value is the version number of HTTP used by the sender.

The Status-Code is a 3-digit integer that indicates the response to a received

request, and the Reason-Phrase provides a short textual explanation of the status

code.

There are a rather large number of status codes defined in HTTP/1.1; these

are listed in Table 19.14, together with a brief definition. The codes are organized

into the following categories:

Informational. The request has been received and processing continues. No

entity body accompanies this response.

Successful. The request was successfully received, understood, and accepted.

The information returned in the response message depends on the request

method, as follows:

- GET: The contents of the entity body correspond to the requested resource.

- HEAD: No entity body is returned.

- POST: The entity describes or contains the result of the action.

- TRACE: The entity contains the request message.

- Other methods: The entity describes the result of the action.

Redirection. Further action is required to complete the request.

Client error. The request contains a syntax error or the request cannot be fulfilled.

Server error. The server failed to fulfill an apparently valid request.
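
As a small illustration of the Status-Line format and the five categories, the Python sketch below parses a status line and classifies its code by the first digit (1 through 5, in the order the categories are listed above). The function names are hypothetical.

```python
CATEGORIES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def parse_status_line(line):
    """Split a Status-Line into (HTTP-Version, Status-Code, Reason-Phrase)."""
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"bad status code: {code!r}")
    return version, int(code), reason


def category(code):
    """Map a 3-digit status code to its category by the leading digit."""
    return CATEGORIES.get(code // 100, "Unknown")


version, code, reason = parse_status_line("HTTP/1.1 404 Not Found\r\n")
print(version, code, reason, "->", category(code))
# HTTP/1.1 404 Not Found -> Client error
```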

Response Header Fields

Response header fields provide additional information related to the response

that cannot be placed in the Status-Line. The following fields are defined in

HTTP/1.1:

Location. Defines the exact location of the resource identified by the

Request-URI.

Proxy-authenticate. Included with a response that has a status code of Proxy

Authentication Required. This field contains a "challenge" that indicates the

authentication scheme and parameters required.

Public. Lists the non-standard methods supported by this server.

Retry-after. Included with a response that has a status code of Service

Unavailable, and indicates how long the service is expected to be unavailable.

Server. Identifies the software product used by the origin server to handle the

request.

WWW-authenticate. Included with a response that has a status code of Unauthorized.

This field contains a challenge that indicates the authentication

scheme and parameters required.

Entities

An entity consists of an entity header and an entity body in a request or response

message. An entity may represent a data resource, or it may constitute other information

supplied with a request or response.

Entity Header Fields

Entity header fields provide optional information about the entity body or, if no

body is present, about the resource identified by the request. The following fields

are defined in HTTP/1.1:

Allow. Lists methods supported by the resource identified in the Request-

URI. This field must be included with a response that has a status code of

Method Not Allowed and may be included in other responses.

Content-encoding. Indicates what content encodings have been applied to the

resource. The only encoding currently defined is zip compression.

Content-language. Identifies the natural language(s) of the intended audience

of the enclosed entity.

Content-length. The size of the entity body in octets.

Content-MD5. For future study. MD5 refers to the MD5 hash code function,

described in Lesson 18.

Content-range. For future study. The intent is that this designation will indicate

a portion of the identified resource that is included in this response.

Content-type. Indicates the media type of the entity body.

Content-version. A version tag associated with an evolving entity.

Derived-from. Indicates the version tag of the resource from which this entity

was derived before modifications were made by the sender. This field and the

Content-Version field can be used to manage multiple updates by a group

of users.

Expires. Date/time after which the entity should be considered stale.

Last-modified. Date/time that the sender believes the resource was last modified.

Link. Defines links to other resources.

Title. A textual title for the entity.

Transfer-encoding. Indicates what type of transformation has been applied to

the message body to safely transfer it between the sender and the recipient.

The only encoding defined in the standard is chunked. The chunked option

defines a procedure for breaking an entity body into labeled chunks that are

transmitted separately.

URI-header. Informs the recipient of other URIs by which the resource can

be identified.

Extension-header. Allows additional fields to be defined without changing the

protocol, but these fields cannot be assumed to be recognizable by the recipient.

Entity Body

An entity body consists of an arbitrary sequence of octets. HTTP is designed to be

able to transfer any type of content, including text, binary data, audio, images, and

video. When an entity body is present in a message, the interpretation of the octets

in the body is determined by the entity header fields Content-Encoding, Content-

Type, and Transfer-Encoding. These define a three-layer, ordered encoding model:

entity-body := Transfer-Encoding( Content-Encoding( Content-Type( data ) ) )

The data are the contents of a resource identified by a URI. The Content-

Type field determines the way in which the data are interpreted. A Content-Encoding

may be applied to the data and stored at the URI instead of the data. Finally,

on transfer, a Transfer-Encoding may be applied to form the entity body of the

message.
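The ordering of the three layers may be easier to see in code. The sketch below is only an illustration under stated assumptions: UTF-8 text stands in for the Content-Type layer, gzip for the Content-Encoding, and a simplified chunked coding for the Transfer-Encoding. All function names are hypothetical.

```python
import gzip


def to_chunked(payload: bytes, size: int = 16) -> bytes:
    """Simplified chunked transfer coding: hex size, CRLF, data, CRLF."""
    parts = []
    for i in range(0, len(payload), size):
        piece = payload[i:i + size]
        parts.append(format(len(piece), "X").encode("ascii") + b"\r\n" + piece + b"\r\n")
    parts.append(b"0\r\n\r\n")  # zero-size chunk terminates the body
    return b"".join(parts)


def from_chunked(body: bytes) -> bytes:
    """Undo to_chunked by reading each size line and then that many octets."""
    out, pos = bytearray(), 0
    while True:
        eol = body.index(b"\r\n", pos)
        size = int(body[pos:eol], 16)
        if size == 0:
            return bytes(out)
        out += body[eol + 2:eol + 2 + size]
        pos = eol + 2 + size + 2


def build_entity_body(text: str) -> bytes:
    data = text.encode("utf-8")   # Content-Type layer: how the data are interpreted
    stored = gzip.compress(data)  # Content-Encoding layer: what is stored at the URI
    return to_chunked(stored)     # Transfer-Encoding layer: applied on transfer


def read_entity_body(entity_body: bytes) -> str:
    # The recipient removes the layers in the reverse order.
    stored = from_chunked(entity_body)
    data = gzip.decompress(stored)
    return data.decode("utf-8")
```

The recipient undoes the transfer encoding first and the content encoding second, and only then interprets the octets according to the media type.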

Access Authentication

HTTP/1.1 defines a simple challenge-response technique for authentication. This

definition does not restrict HTTP clients and servers from using other forms of

authentication, but the current standard only covers this simple form.

Two authentication exchanges are defined: one between a client and a server,

and one between a client and a proxy. Both types of exchange use a challenge-response

mechanism. The challenge, issued by a server or proxy, is of the form

challenge = auth-scheme 1*SP realm *( "," auth-param )

auth-scheme = token

auth-param = token "=" quoted-string

realm = "realm" "=" realm-value

realm-value = quoted-string

Auth-scheme is the name of a particular authentication scheme. The realm

defines a particular protection space, which is simply a conceptual partition of the

resource, with its own authentication scheme and authorization database. For

example, a resource may define several realms, one for end users and one for network

managers. The latter realm may have more privileges and require a more

powerful authentication scheme.
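As a rough illustration of the challenge grammar, the following sketch splits a challenge string into its scheme, realm, and remaining parameters. The function name, the returned dictionary layout, and the scheme name used in the usage note are hypothetical; the parser is also more lenient than the grammar, since it accepts the realm in any position.

```python
import re

# One auth-param: token "=" quoted-string, followed by a comma or end of input.
_PARAM = re.compile(r'\s*([^\s=",]+)\s*=\s*"((?:[^"\\]|\\.)*)"\s*(?:,|$)')


def parse_challenge(value: str) -> dict:
    """Split a challenge into scheme, realm, and other parameters (sketch)."""
    scheme, _, rest = value.strip().partition(" ")
    rest = rest.strip()
    if not scheme or not rest:
        raise ValueError("challenge needs an auth-scheme and a realm")
    params = {}
    pos = 0
    while pos < len(rest):
        match = _PARAM.match(rest, pos)
        if match is None:
            raise ValueError("malformed auth-param near position %d" % pos)
        # Remove the backslash from quoted pairs inside the quoted-string.
        params[match.group(1).lower()] = re.sub(r"\\(.)", r"\1", match.group(2))
        pos = match.end()
    if "realm" not in params:
        raise ValueError("challenge has no realm")
    return {"scheme": scheme, "realm": params.pop("realm"), "params": params}
```

For example, parse_challenge('Example realm="managers", level="high"') yields the scheme Example, the realm managers, and one extra parameter named level.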

In response to an authentication challenge, a client must provide credentials.

These are of the form

credentials = basic-credentials | auth-scheme *("," auth-param )

Basic credentials are covered below. In the general case, the client returns

the name of the authentication scheme and a set of parameters required to authenticate

itself.

Client-Server Authentication

A user agent that wishes to authenticate itself with a server may do so by including

an Authorization field in the request header; an agent may do this when initially

sending the request. An alternative, which may be more common, is that a client

sends a Request message without an Authorization field and is then required to

return an authorization by the server. Figure 19.16 illustrates this scenario, which

involves three steps:

1. The client sends a request, such as a GET request to the server, with no

Authorization field in the request header.

2. The server returns a response with a status code in the Status line of Unauthorized

and a WWW-Authenticate field in the response header. The

WWW-Authenticate field consists of a challenge that indicates the type of

authentication required and may include other parameters. No entity body is

returned.

3. The client repeats the request but includes an Authorization field that contains

the authorization data needed by the server.

If authentication succeeds, the server returns a response with some other status

code and without a WWW-Authenticate field. If authentication fails, the server

can initiate a new authentication sequence by returning a response with a status of

Unauthorized and a WWW-Authenticate field containing the (possibly new) challenge.

The entity body should explain the reason for the refusal.
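The three-step exchange and its two possible outcomes can be simulated without any networking. This is a minimal sketch, not a real server: the Response class, the Example scheme, the token value, and both function names are hypothetical, and 401 and 200 are the numeric codes that HTTP assigns to Unauthorized and OK.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    status: int
    reason: str
    headers: dict = field(default_factory=dict)
    body: str = ""


# Hypothetical credentials that the simulated server accepts.
EXPECTED = 'Example token="s3cret"'


def origin_server(method: str, path: str, headers: dict) -> Response:
    auth = headers.get("Authorization")
    if auth != EXPECTED:
        # First contact gets no entity body; a failed attempt gets an explanation.
        body = "" if auth is None else "Credentials were not accepted."
        challenge = {"WWW-Authenticate": 'Example realm="users"'}
        return Response(401, "Unauthorized", challenge, body)
    return Response(200, "OK", {}, "resource contents")


def fetch(path: str, credentials: str) -> list:
    headers = {}
    first = origin_server("GET", path, headers)    # steps 1 and 2
    if first.status != 401:
        return [first]
    headers["Authorization"] = credentials          # step 3
    second = origin_server("GET", path, headers)
    return [first, second]
```

Calling fetch("/report", EXPECTED) produces an Unauthorized response followed by a successful one, while wrong credentials produce a second Unauthorized response carrying a fresh challenge.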

In client-server authentication, any proxy or gateway must be transparent, as

far as authentication is concerned. That is, the WWW-Authenticate and Authorization

fields must be forwarded unmodified, and the response to a request containing

an Authorization field must not be cached. This latter requirement dictates

that the authentication always takes place between client and server and does not

simply replay the server's prior acceptance of authentication.
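The transparency and no-caching requirements can be sketched as a small relay. The class name, the tuple used for responses, and the cache keyed by path are all assumptions made for illustration; the sketch applies only the rule stated above and no other caching rules.

```python
class CachingRelay:
    """Toy proxy that forwards headers unmodified and never caches
    the response to a request that carries an Authorization field."""

    def __init__(self, origin):
        self.origin = origin  # callable(path, headers) -> (status, headers, body)
        self.cache = {}

    def handle(self, path: str, headers: dict):
        has_auth = any(name.lower() == "authorization" for name in headers)
        if not has_auth and path in self.cache:
            return self.cache[path]
        # Pass the request fields through exactly as received.
        response = self.origin(path, dict(headers))
        if not has_auth and response[0] == 200:
            self.cache[path] = response
        return response
```

With this rule, every authenticated request reaches the origin server, so the server itself checks the credentials each time.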

Proxy Authentication

A proxy may be configured so that a client must first authenticate itself to the proxy

before being granted access to an origin server. The sequence is similar to that

described for client-server authentication. In this case, the authentication information

is carried in the Proxy-Authorization field in the request header. A client may

authenticate itself when first issuing a Request message. Alternatively, a scenario

similar to Figure 19.16 occurs:

1. The client sends a request, such as a GET request to the server, with no Proxy-

Authorization field in the request header.

2. The proxy does not forward the request, but returns a response with a status

code in the Status line of Proxy-Authentication Required and a Proxy-

Authenticate field in the response header.

3. The client repeats the request but includes a Proxy-Authorization field that

contains the authorization data needed by the proxy.

If the request is authenticated, then the proxy may forward the request to a

server, but will omit the Proxy-Authorization field. The proxy could also return a

cached response.
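The proxy's side of this exchange can be sketched in the same spirit. The function name, the Example scheme, and the token value are hypothetical, and 407 is the numeric code that HTTP assigns to Proxy Authentication Required.

```python
def proxy(path: str, headers: dict, origin,
          accepted: str = 'Example token="proxy-pass"'):
    """Toy proxy: challenge unauthenticated clients, otherwise forward."""
    supplied = headers.get("Proxy-Authorization")
    if supplied != accepted:
        # Step 2: the request is not forwarded; the proxy issues its own challenge.
        return (407, {"Proxy-Authenticate": 'Example realm="proxy"'}, "")
    # The Proxy-Authorization field is consumed here and omitted onward.
    onward = {k: v for k, v in headers.items() if k != "Proxy-Authorization"}
    return origin(path, onward)
```

Any Authorization field meant for the origin server stays in the forwarded request; only the field addressed to the proxy is removed.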

Basic Authentication Scheme

For the basic authentication scheme, a user agent authenticates itself within a particular

realm by supplying a user ID and a password. This is the simplest form of

authentication, comparable to logging on to a system. Within HTTP, there is no

provision for protecting the user ID or password with encryption, so this method

provides minimal security. The form of the credentials for basic authentication are
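As a rough illustration, the HTTP specifications define basic credentials as the word Basic followed by the Base64 encoding of the user ID, a colon, and the password. The sketch below builds and reverses such a value; the function names are hypothetical, and the use of UTF-8 for the characters is an assumption.

```python
import base64


def basic_credentials(user_id: str, password: str) -> str:
    """Build the value of an Authorization field for the basic scheme."""
    if ":" in user_id:
        raise ValueError("a user ID cannot contain a colon")
    user_pass = (user_id + ":" + password).encode("utf-8")  # encoding is assumed
    return "Basic " + base64.b64encode(user_pass).decode("ascii")


def decode_basic(value: str):
    """Recover (user_id, password) from a basic credentials value."""
    scheme, _, cookie = value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not basic credentials")
    user_pass = base64.b64decode(cookie.strip()).decode("utf-8")
    user_id, _, password = user_pass.partition(":")
    return user_id, password
```

Since decode_basic needs no key, anyone who observes the header can recover the password, which is why the scheme offers so little protection on its own.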


Wednesday, October 5, 2016
