6.1 The Transport Service
In the following sections we will
provide an introduction to the transport service. We look at what kind of
service is provided to the application layer. To make the issue of transport
service more concrete, we will examine two sets of transport layer primitives.
First comes a simple (but hypothetical) one to show the basic ideas. Then comes
the interface commonly used in the Internet.
The ultimate goal of the transport
layer is to provide efficient, reliable, and cost-effective service to its
users, normally processes in the application layer. To achieve this goal, the
transport layer makes use of the services provided by the network layer. The
hardware and/or software within the transport layer that does the work is
called the transport entity. The transport entity can be located in the
operating system kernel, in a separate user process, in a library package bound
into network applications, or conceivably on the network interface card. The
(logical) relationship of the network, transport, and application layers is
illustrated in Fig. 6-1.
Just as there are two types of
network service, connection-oriented and connectionless, there are also two
types of transport service. The connection-oriented transport service is
similar to the connection-oriented network service in many ways. In both cases,
connections have three phases: establishment, data transfer, and release.
Addressing and flow control are also similar in both layers. Furthermore, the
connectionless transport service is also very similar to the connectionless
network service.
The obvious question is then this:
If the transport layer service is so similar to the network layer service, why
are there two distinct layers? Why is one layer not adequate? The answer is
subtle, but crucial, and goes back to Fig. 1-9. The transport code runs entirely on the
users' machines, but the network layer mostly runs on the routers, which are
operated by the carrier (at least for a wide area network). What happens if the
network layer offers inadequate service? What if it frequently loses
packets? What if routers crash from time to time?
Problems occur, that's what. The users
have no real control over the network layer, so they cannot solve the problem
of poor service by using better routers or putting more error handling in the
data link layer. The only possibility is to put on top of the network layer
another layer that improves the quality of the service. If, in a
connection-oriented subnet, a transport entity is informed halfway through a
long transmission that its network connection has been abruptly terminated,
with no indication of what has happened to the data currently in transit, it
can set up a new network connection to the remote transport entity. Using this
new network connection, it can send a query to its peer asking which data
arrived and which did not, and then pick up from where it left off.
In essence, the existence of the
transport layer makes it possible for the transport service to be more reliable
than the underlying network service. Lost packets and mangled data can be
detected and compensated for by the transport layer. Furthermore, the transport
service primitives can be implemented as calls to library procedures in order
to make them independent of the network service primitives. The network service
calls may vary considerably from network to network (e.g., connectionless LAN
service may be quite different from connection-oriented WAN service). By hiding
the network service behind a set of transport service primitives, changing the
network service merely requires replacing one set of library procedures by
another one that does the same thing with a different underlying service.
Thanks to the transport layer,
application programmers can write code according to a standard set of
primitives and have these programs work on a wide variety of networks, without
having to worry about dealing with different subnet interfaces and unreliable
transmission. If all real networks were flawless and all had the same service
primitives and were guaranteed never, ever to change, the transport layer might
not be needed. However, in the real world it fulfills the key function of
isolating the upper layers from the technology, design, and imperfections of
the subnet.
For this reason, many people have
traditionally made a distinction between layers 1 through 4 on the one hand and
layer(s) above 4 on the other. The bottom four layers can be seen as the transport
service provider, whereas the upper layer(s) are the transport service user.
This distinction of provider versus user has a considerable impact on the
design of the layers and puts the transport layer in a key position, since it
forms the major boundary between the provider and user of the reliable data
transmission service.
To allow users to access the
transport service, the transport layer must provide some operations to
application programs, that is, a transport service interface. Each transport
service has its own interface. In this section, we will first examine a simple
(hypothetical) transport service and its interface to see the bare essentials.
In the following section we will look at a real example.
The transport service is similar to
the network service, but there are also some important differences. The main
difference is that the network service is intended to model the service offered
by real networks, warts and all. Real networks can lose packets, so the network
service is generally unreliable.
The (connection-oriented) transport
service, in contrast, is reliable. Of course, real networks are not error-free,
but that is precisely the purpose of the transport layer—to provide a reliable
service on top of an unreliable network.
As an example, consider two
processes connected by pipes in UNIX. They assume the connection between them
is perfect. They do not want to know about acknowledgements, lost packets,
congestion, or anything like that. What they want is a 100 percent reliable
connection. Process A puts data into one end of the pipe, and process B takes
it out of the other. This is what the connection-oriented transport service is
all about—hiding the imperfections of the network service so that user
processes can just assume the existence of an error-free bit stream.
As an aside, the transport layer can
also provide unreliable (datagram) service. However, there is relatively little
to say about that, so we will mainly concentrate on the connection-oriented
transport service in this chapter. Nevertheless, there are some applications,
such as client-server computing and streaming multimedia, which benefit from
connectionless transport, so we will say a little bit about it later on.
A second difference between the
network service and transport service is whom the services are intended for.
The network service is used only by the transport entities. Few users write
their own transport entities, and thus few users or programs ever see the bare
network service. In contrast, many programs (and thus programmers) see the
transport primitives. Consequently, the transport service must be convenient
and easy to use.
To get an idea of what a transport
service might be like, consider the five primitives listed in Fig. 6-2. This transport interface is truly bare
bones, but it gives the essential flavor of what a connection-oriented
transport interface has to do. It allows application programs to establish,
use, and then release connections, which is sufficient for many applications.
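To make this concrete, here is one way the five primitives might be written down as C declarations. This is only a sketch of a hypothetical interface: every name, type, and signature below is invented for illustration and does not correspond to any real transport API.

/* Hypothetical C declarations for the five primitives of Fig. 6-2.
   Every name and signature here is invented for illustration; this is
   not a real transport API. */

typedef int connection_t;   /* handle identifying one transport connection */

connection_t listen_for_connection(int local_address);    /* LISTEN: block until a peer connects    */
connection_t connect_to(int remote_address);               /* CONNECT: actively set up a connection  */
int  send_tpdu(connection_t c, const void *buf, int len);  /* SEND: transmit one message             */
int  receive_tpdu(connection_t c, void *buf, int maxlen);  /* RECEIVE: block until a message arrives */
void disconnect(connection_t c);                           /* DISCONNECT: release the connection     */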
To see how these primitives might be
used, consider an application with a server and a number of remote clients. To
start with, the server executes a LISTEN primitive, typically by calling a
library procedure that makes a system call to block the server until a client
turns up. When a client wants to talk to the server, it executes a CONNECT
primitive. The transport entity carries out this primitive by blocking the
caller and sending a packet to the server. Encapsulated in the payload of this
packet is a transport layer message for the server's transport entity.
A quick note on terminology is now
in order. For lack of a better term, we will reluctantly use the somewhat
ungainly acronym TPDU (Transport Protocol Data Unit) for messages sent from
transport entity to transport entity. Thus, TPDUs (exchanged by the transport
layer) are contained in packets (exchanged by the network layer). In turn,
packets are contained in frames (exchanged by the data link layer). When a
frame arrives, the data link layer processes the frame header and passes the
contents of the frame payload field up to the network entity. The network entity
processes the packet header and passes the contents of the packet payload up to
the transport entity. This nesting is illustrated in Fig. 6-3.
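The nesting of Fig. 6-3 can also be pictured as data structures. The sketch below is purely illustrative: the field names are invented, and real headers are variable-length and far richer than a single integer.

/* Illustration of the nesting in Fig. 6-3, with made-up header fields.
   A TPDU travels in the payload of a packet, which in turn travels in
   the payload of a frame. */

#define TPDU_DATA_MAX 512

struct tpdu {                    /* exchanged by the transport layer */
    int  tpdu_header;            /* e.g., sequence number, connection identifier */
    char user_data[TPDU_DATA_MAX];
};

struct packet {                  /* exchanged by the network layer */
    int         packet_header;   /* e.g., source and destination addresses */
    struct tpdu payload;         /* the TPDU rides here */
};

struct frame {                   /* exchanged by the data link layer */
    int           frame_header;  /* e.g., link-level addressing and type */
    struct packet payload;       /* the packet rides here */
    int           frame_trailer; /* e.g., checksum */
};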
Getting back to our client-server
example, the client's CONNECT call causes a CONNECTION REQUEST TPDU to be sent
to the server. When it arrives, the transport entity checks to see that the
server is blocked on a LISTEN (i.e., is interested in handling requests). It
then unblocks the server and sends a CONNECTION ACCEPTED TPDU back to the
client. When this TPDU arrives, the client is unblocked and the connection is
established.
Data can now be exchanged using the
SEND and RECEIVE primitives. In the simplest form, either party can do a
(blocking) RECEIVE to wait for the other party to do a SEND. When the TPDU
arrives, the receiver is unblocked. It can then process the TPDU and send a
reply. As long as both sides can keep track of whose turn it is to send, this
scheme works fine.
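Using the hypothetical primitives sketched after Fig. 6-2, such a strict request/reply exchange might look roughly as follows. Again, every routine and value here is invented for illustration.

/* Sketch of a strict request/reply exchange using the hypothetical
   primitives declared earlier. Each side knows whose turn it is:
   the client sends first, then waits; the server waits first, then sends. */

#include <string.h>

void client_side(int server_address)
{
    char reply[1024];
    const char *request = "GET record 42";               /* made-up request */

    connection_t c = connect_to(server_address);          /* blocks until accepted */
    send_tpdu(c, request, strlen(request) + 1);           /* our turn to send */
    receive_tpdu(c, reply, sizeof(reply));                /* block until the reply arrives */
    disconnect(c);
}

void server_side(int local_address)
{
    char request[1024], reply[1024];

    connection_t c = listen_for_connection(local_address); /* block until a client connects */
    receive_tpdu(c, request, sizeof(request));              /* wait for the request */
    /* ... look up the answer and fill in reply ... */
    send_tpdu(c, reply, sizeof(reply));                     /* now it is our turn to send */
    disconnect(c);
}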
Note that at the transport layer,
even a simple unidirectional data exchange is more complicated than at the
network layer. Every data packet sent will also be acknowledged (eventually).
The packets bearing control TPDUs are also acknowledged, implicitly or
explicitly. These acknowledgements are managed by the transport entities, using
the network layer protocol, and are not visible to the transport users.
Similarly, the transport entities will need to worry about timers and
retransmissions. None of this machinery is visible to the transport users. To
the transport users, a connection is a reliable bit pipe: one user stuffs bits
in and they magically appear at the other end. This ability to hide complexity
is the reason that layered protocols are such a powerful tool.
When a connection is no longer
needed, it must be released to free up table space within the two transport
entities. Disconnection has two variants: asymmetric and symmetric. In the
asymmetric variant, either transport user can issue a DISCONNECT primitive,
which results in a DISCONNECT TPDU being sent to the remote transport entity.
Upon arrival, the connection is released.
In the symmetric variant, each
direction is closed separately, independently of the other one. When one side
does a DISCONNECT, that means it has no more data to send but it is still
willing to accept data from its partner. In this model, a connection is
released when both sides have done a DISCONNECT.
A state diagram for connection
establishment and release for these simple primitives is given in Fig. 6-4. Each transition is triggered by some
event, either a primitive executed by the local transport user or an incoming
packet. For simplicity, we assume here that each TPDU is separately
acknowledged. We also assume that a symmetric disconnection model is used, with
the client going first. Please note that this model is quite unsophisticated.
We will look at more realistic models later on.
Figure 6-4. A state diagram for a simple connection
management scheme. Transitions labeled in italics are caused by packet
arrivals. The solid lines show the client's state sequence. The dashed lines
show the server's state sequence.
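One way to implement such a state machine is as a switch over (state, event) pairs. The sketch below paraphrases the client's side of Fig. 6-4 with invented state and event names; a real transport entity would keep one such state variable per connection.

/* Sketch of the client's side of the state machine in Fig. 6-4.
   State and event names paraphrase the figure and are not exact. */

enum conn_state {
    IDLE,                        /* no connection */
    ESTABLISHMENT_PENDING,       /* CONNECT issued, waiting for acceptance */
    ESTABLISHED,                 /* data may flow in both directions */
    DISCONNECT_PENDING           /* our DISCONNECT sent, waiting for the peer's */
};

enum conn_event {
    EV_CONNECT_PRIMITIVE,        /* local user executed CONNECT */
    EV_CONNECTION_ACCEPTED,      /* connection accepted TPDU arrived */
    EV_DISCONNECT_PRIMITIVE,     /* local user executed DISCONNECT */
    EV_DISCONNECT_TPDU           /* disconnection request TPDU arrived */
};

enum conn_state next_state(enum conn_state s, enum conn_event e)
{
    switch (s) {
    case IDLE:
        if (e == EV_CONNECT_PRIMITIVE)    return ESTABLISHMENT_PENDING;
        break;
    case ESTABLISHMENT_PENDING:
        if (e == EV_CONNECTION_ACCEPTED)  return ESTABLISHED;
        break;
    case ESTABLISHED:
        if (e == EV_DISCONNECT_PRIMITIVE) return DISCONNECT_PENDING;
        break;
    case DISCONNECT_PENDING:
        if (e == EV_DISCONNECT_TPDU)      return IDLE;  /* both sides have now disconnected */
        break;
    }
    return s;                    /* ignore events that do not apply in this state */
}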
Let us now briefly inspect another
set of transport primitives, the socket primitives used in Berkeley UNIX for
TCP. These primitives are widely used for Internet programming. They are listed
in Fig. 6-5. Roughly speaking, they follow the model
of our first example but offer more features and flexibility. We will not look
at the corresponding TPDUs here. That discussion will have to wait until we
study TCP later in this chapter.
The first four primitives in the
list are executed in that order by servers. The SOCKET primitive creates a new
end point and allocates table space for it within the transport entity. The
parameters of the call specify the addressing format to be used, the type of
service desired (e.g., reliable byte stream), and the protocol. A successful
SOCKET call returns an ordinary file descriptor for use in succeeding calls,
the same way an OPEN call does.
Newly-created sockets do not have
network addresses. These are assigned using the BIND primitive. Once a server
has bound an address to a socket, remote clients can connect to it. The reason
for not having the SOCKET call create an address directly is that some
processes care about their address (e.g., they have been using the same address
for years and everyone knows this address), whereas others do not care.
Next comes the LISTEN call, which
allocates space to queue incoming calls for the case that several clients try
to connect at the same time. In contrast to LISTEN in our first example, in the
socket model LISTEN is not a blocking call.
To block waiting for an incoming
connection, the server executes an ACCEPT primitive. When a TPDU asking for a
connection arrives, the transport entity creates a new socket with the same
properties as the original one and returns a file descriptor for it. The server
can then fork off a process or thread to handle the connection on the new
socket and go back to waiting for the next connection on the original socket.
ACCEPT returns a normal file descriptor, which can be used for reading and
writing in the standard way, the same as for files.
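The fork-per-connection pattern just described might be sketched as follows. This is not the code of Fig. 6-6 (which is single-threaded); handle_connection is a hypothetical routine, and error handling and the reaping of finished children are omitted to keep the sketch short.

/* Sketch of the accept loop described above: accept a connection, fork a
   child to serve it on the new socket, and keep listening on the original
   socket. handle_connection() is a hypothetical per-connection routine. */

#include <sys/socket.h>
#include <unistd.h>

void handle_connection(int conn);        /* hypothetical per-connection work */

void serve_forever(int listening_socket)
{
    for (;;) {
        int conn = accept(listening_socket, 0, 0);  /* new socket for this client */
        if (conn < 0) continue;                     /* ignore failed accepts */

        if (fork() == 0) {               /* child process */
            close(listening_socket);     /* the child does not accept new clients */
            handle_connection(conn);
            close(conn);
            _exit(0);
        }
        close(conn);                     /* parent: the child owns this socket now */
    }
}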
Now let us look at the client side.
Here, too, a socket must first be created using the SOCKET primitive, but BIND
is not required since the address used does not matter to the server. The
CONNECT primitive blocks the caller and actively starts the connection process.
When it completes (i.e., when the appropriate TPDU is received from the
server), the client process is unblocked and the connection is established.
Both sides can now use SEND and RECV to transmit and receive data over the
full-duplex connection. The standard UNIX READ and WRITE system calls can also
be used if none of the special options of SEND and RECV are required.
Connection release with sockets is
symmetric. When both sides have executed a CLOSE primitive, the connection is
released.
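As an aside, the sockets interface also supports the symmetric style more directly: the shutdown call can close one direction of a connection while leaving the other open. The fragment below is only a sketch of that idea and is not part of the example code discussed next.

/* Sketch only: shutting down one direction of a socket while the other
   stays open, matching the symmetric release model described earlier. */

#include <sys/socket.h>

void no_more_to_send(int s)
{
    shutdown(s, SHUT_WR);   /* we will send no more data ...               */
                            /* ... but we can still read what the peer     */
                            /* sends until it, too, closes its direction;  */
                            /* only then is close(s) called to release     */
                            /* the socket completely.                      */
}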
As an example of how the socket
calls are used, consider the client and server code of Fig. 6-6. Here we have a very primitive Internet
file server along with an example client that uses it. The code has many
limitations (discussed below), but in principle the server code can be compiled
and run on any UNIX system connected to the Internet. The client code can then
be compiled and run on any other UNIX machine on the Internet, anywhere in the
world. The client code can be executed with appropriate parameters to fetch any
file to which the server has access on its machine. The file is written to
standard output, which, of course, can be redirected to a file or pipe.
Let us look at the server code
first. It starts out by including some standard headers, the last three of
which contain the main Internet-related definitions and data structures. Next
comes a definition of SERVER_PORT as 12345. This number was chosen arbitrarily.
Any number between 1024 and 65535 will work just as well as long as it is not
in use by some other process. Of course, the client and server have to use the
same port. If this server ever becomes a worldwide hit (unlikely, given how
primitive it is), it will be assigned a permanent port below 1024 and appear on
www.iana.org.
The next two lines in the server
define two constants needed. The first one determines the chunk size used for
the file transfer. The second one determines how many pending connections can
be held before additional ones are discarded upon arrival.
After the declarations of local
variables, the server code begins. It starts out by initializing a data
structure that will hold the server's IP address. This data structure will soon
be bound to the server's socket. The call to memset sets the data structure to
all 0s. The three assignments following it fill in three of its fields. The
last of these contains the server's port. The functions htonl and htons have to
do with converting values to a standard format so the code runs correctly on
both big-endian machines (e.g., the SPARC) and little-endian machines (e.g.,
the Pentium). Their exact semantics are not relevant here.
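For readers curious about what htons actually does, the small test program below (written for this discussion, not part of Fig. 6-6) prints the bytes of the port number before and after conversion; the values noted in the comments are what one would expect on a little-endian machine.

/* Small demonstration of byte ordering. On a little-endian machine the
   two bytes of the port number come out swapped unless htons is used. */

#include <stdio.h>
#include <arpa/inet.h>

int main(void)
{
    unsigned short port = 12345;              /* 0x3039 */
    unsigned short net  = htons(port);        /* network byte order (big-endian) */

    unsigned char *p = (unsigned char *) &port;
    unsigned char *q = (unsigned char *) &net;

    printf("host order bytes:    %02x %02x\n", p[0], p[1]);  /* 39 30 on little-endian */
    printf("network order bytes: %02x %02x\n", q[0], q[1]);  /* 30 39 everywhere */
    return 0;
}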
Next the server creates a socket and
checks for errors (indicated by s < 0). In a production version of the code,
the error message could be a trifle more explanatory. The call to setsockopt is
needed to allow the port to be reused so the server can run indefinitely,
fielding request after request. Now the IP address is bound to the socket and a
check is made to see if the call to bind succeeded. The final step in the
initialization is the call to listen to announce the server's willingness to
accept incoming calls and tell the system to hold up to QUEUE_SIZE of them in
case new requests arrive while the server is still processing the current one.
If the queue is full and additional requests arrive, they are quietly
discarded.
At this point the server enters its
main loop, which it never leaves. The only way to stop it is to kill it from
outside. The call to accept blocks the server until some client tries to
establish a connection with it. If the accept call succeeds, it returns a file
descriptor that can be used for reading and writing, analogous to how file
descriptors can be used to read and write from pipes. However, unlike pipes,
which are unidirectional, sockets are bidirectional, so sa (socket address) can
be used for reading from the connection and also for writing to it.
After the connection is established,
the server reads the file name from it. If the name is not yet available, the
server blocks waiting for it. After getting the file name, the server opens the
file and then enters a loop that alternately reads blocks from the file and
writes them to the socket until the entire file has been copied. Then the
server closes the file and the connection and waits for the next connection to
show up. It repeats this loop forever.
Now let us look at the client code.
To understand how it works, it is necessary to understand how it is invoked.
Assuming it is called client, a typical call is
client flits.cs.vu.nl /usr/tom/filename >f
This call only works if the server
is already running on flits.cs.vu.nl and the file /usr/tom/filename exists and
the server has read access to it. If the call is successful, the file is
transferred over the Internet and written to f, after which the client program
exits. Since the server continues after a transfer, the client can be started
again and again to get other files.
The client code starts with some
includes and declarations. Execution begins by checking to see if it has been
called with the right number of arguments (argc = 3 means the program name plus
two arguments). Note that argv [1] contains the server's name (e.g., flits.cs.vu.nl)
and is converted to an IP address by gethostbyname. This function uses DNS to
look up the name.
Next a socket is created and initialized.
After that, the client attempts to establish a TCP connection to the server,
using connect. If the server is up and running on the named machine and
attached to SERVER_PORT and is either idle or has room in its listen queue, the
connection will (eventually) be established. Using the connection, the client
sends the name of the file by writing on the socket. The number of bytes sent
is one larger than the name proper since the 0-byte terminating the name must
also be sent to tell the server where the name ends.
/* This is the client code. It requests a file from the server program below;
 * the server responds by sending the whole file.
 */

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>                   /* printf */
#include <stdlib.h>                  /* exit */
#include <string.h>                  /* memset, memcpy, strlen */
#include <unistd.h>                  /* read, write */

#define SERVER_PORT 12345            /* arbitrary, but client & server must agree */
#define BUF_SIZE 4096                /* block transfer size */

void fatal(char *string);

int main(int argc, char **argv)
{
  int c, s, bytes;
  char buf[BUF_SIZE];                /* buffer for incoming file */
  struct hostent *h;                 /* info about server */
  struct sockaddr_in channel;        /* holds IP address */

  if (argc != 3) fatal("Usage: client server-name file-name");
  h = gethostbyname(argv[1]);        /* look up host's IP address */
  if (!h) fatal("gethostbyname failed");

  s = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
  if (s < 0) fatal("socket");
  memset(&channel, 0, sizeof(channel));
  channel.sin_family = AF_INET;
  memcpy(&channel.sin_addr.s_addr, h->h_addr, h->h_length);
  channel.sin_port = htons(SERVER_PORT);
  c = connect(s, (struct sockaddr *) &channel, sizeof(channel));
  if (c < 0) fatal("connect failed");

  /* Connection is now established. Send file name including 0 byte at end. */
  write(s, argv[2], strlen(argv[2])+1);

  /* Go get the file and write it to standard output. */
  while (1) {
     bytes = read(s, buf, BUF_SIZE); /* read from socket */
     if (bytes <= 0) exit(0);        /* check for end of file */
     write(1, buf, bytes);           /* write to standard output */
  }
}

void fatal(char *string)
{
  printf("%s\n", string);
  exit(1);
}
/* This is the server code. */

#include <sys/types.h>
#include <sys/fcntl.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>                   /* printf */
#include <stdlib.h>                  /* exit */
#include <string.h>                  /* memset */
#include <unistd.h>                  /* read, write, close */

#define SERVER_PORT 12345            /* arbitrary, but client & server must agree */
#define BUF_SIZE 4096                /* block transfer size */
#define QUEUE_SIZE 10

void fatal(char *string);            /* same procedure as in the client; definition omitted (see text) */

int main(int argc, char *argv[])
{
  int s, b, l, fd, sa, bytes, on = 1;
  char buf[BUF_SIZE];                /* buffer for outgoing file */
  struct sockaddr_in channel;        /* holds IP address */

  /* Build address structure to bind to socket. */
  memset(&channel, 0, sizeof(channel));           /* zero channel */
  channel.sin_family = AF_INET;
  channel.sin_addr.s_addr = htonl(INADDR_ANY);
  channel.sin_port = htons(SERVER_PORT);

  /* Passive open. Wait for connection. */
  s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);  /* create socket */
  if (s < 0) fatal("socket failed");
  setsockopt(s, SOL_SOCKET, SO_REUSEADDR, (char *) &on, sizeof(on));
  b = bind(s, (struct sockaddr *) &channel, sizeof(channel));
  if (b < 0) fatal("bind failed");
  l = listen(s, QUEUE_SIZE);                      /* specify queue size */
  if (l < 0) fatal("listen failed");

  /* Socket is now set up and bound. Wait for connection and process it. */
  while (1) {
     sa = accept(s, 0, 0);                        /* block for connection request */
     if (sa < 0) fatal("accept failed");
     read(sa, buf, BUF_SIZE);                     /* read file name from socket */

     /* Get and return the file. */
     fd = open(buf, O_RDONLY);                    /* open the file to be sent back */
     if (fd < 0) fatal("open failed");

     while (1) {
          bytes = read(fd, buf, BUF_SIZE);        /* read from file */
          if (bytes <= 0) break;                  /* check for end of file */
          write(sa, buf, bytes);                  /* write bytes to socket */
     }
     close(fd);                                   /* close file */
     close(sa);                                   /* close connection */
  }
}
Now the client enters a loop,
reading the file block by block from the socket and copying it to standard
output. When it is done, it just exits.
The procedure fatal prints an error
message and exits. The server needs the same procedure, but it was omitted due
to lack of space on the page. Since the client and server are compiled
separately and normally run on different computers, they cannot share the code
of fatal.
The client and server programs can be obtained from the book's Web site by
clicking on the Web Site link next to the photo of the cover. They can be
downloaded and compiled on any UNIX system (e.g., Solaris, BSD, Linux) by
cc -o client client.c -lsocket -lnsl
cc -o server server.c -lsocket -lnsl
The server is started by just typing
server
The client needs two arguments, as
discussed above. A Windows version is also available on the Web site.
Just for
the record, this server is not the last word in serverdom. Its error checking
is meager and its error reporting is mediocre. It has clearly never heard about
security, and using bare UNIX system calls is not the last word in platform
independence. It also makes some assumptions that are technically illegal, such
as assuming the file name fits in the buffer and is transmitted atomically.
Since it handles all requests strictly sequentially (because it has only a
single thread), its performance is poor. These shortcomings notwithstanding, it
is a complete, working Internet file server. In the exercises, the reader is
invited to improve it. For more information about programming with sockets, see
(Stevens, 1997).
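As one example of such an improvement, the assumption that the file name arrives in a single read could be removed by reading it a byte at a time until the terminating 0 byte shows up. The routine below is only a sketch of that idea, written for this discussion rather than taken from the figure.

/* Sketch of one possible fix for the file-name problem noted above: read
   the name byte by byte until the terminating 0 byte arrives, instead of
   assuming a single read() delivers it all. */

#include <unistd.h>

/* Returns 0 on success, -1 on error or if the name does not fit. */
int read_file_name(int sa, char *name, int maxlen)
{
    int i = 0;
    while (i < maxlen) {
        char c;
        int n = read(sa, &c, 1);     /* one byte at a time; slow but simple */
        if (n <= 0) return -1;       /* connection closed or error */
        name[i++] = c;
        if (c == '\0') return 0;     /* whole name (including 0 byte) received */
    }
    return -1;                       /* name too long for the buffer */
}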