Monday, October 3, 2016

Transport Protocols

The transport protocol is the keystone of the whole concept of a computer-communications
architecture. Lower-layer protocols are needed, to be sure,
but they are less important for both pedagogical and design purposes. For one
thing, lower-level protocols are better understood and, on the
whole, less complex than transport protocols. Also, standards have settled out quite
well for most kinds of layer 1 to 3 transmission facilities, and there is a large body
of experience behind their use.
Viewed from the other side, upper-level protocols depend heavily on the
transport protocol. The transport protocol provides the basic end-to-end service of
transferring data between users and relieves applications and other upper-layer
protocols from the need to deal with the characteristics of intervening communications
networks and services.
We begin by looking at the services that one might expect from a transport
protocol. Next, we examine the protocol mechanisms required to provide these services.
We find that most of the complexity relates to connection-oriented services.
As might be expected, the less the network service provides, the more the transport
protocol must do. The remainder of the lesson looks at two widely used transport
protocols: transmission control protocol (TCP) and user datagram protocol (UDP).
Figure 17.1 highlights the position of these protocols within the TCP/IP protocol
suite.
TRANSPORT SERVICES
We begin by looking at the kinds of services that a transport protocol can or should
provide to higher-level protocols. Figure 17.2 places the concept of transport services
in context. In a system, there is a transport entity that provides services to TS
users, which might be an application process or a session-protocol entity. This local
transport entity communicates with some remote-transport entity, using the services
of some lower layer, such as the network layer.
We have already mentioned that the general service provided by a transport
protocol is the end-to-end transport of data in a way that shields the TS user from
the details of the underlying communications systems. To be more specific, we must
consider the specific services that a transport protocol can provide. The following
categories of service are useful for describing the transport service:
Type of service
Quality of service
Data transfer
User interface
Connection management
Expedited delivery
Status reporting
Security
Type of Service
Two basic types of service are possible: connection-oriented and connectionless, or
datagram service. A connection-oriented service provides for the establishment,
maintenance, and termination of a logical connection between TS users. This has,
so far, been the most common type of protocol service available and has a wide variety
of applications. The connection-oriented service generally implies that the service
is reliable.
The strengths of the connection-oriented approach are clear. It allows for connection-
related features such as flow control, error control, and sequenced delivery.
Connectionless service, however, is more appropriate in some contexts. At lower
layers (internet, network), connectionless service is more robust (e.g., see discussion
in Section 9.1). In addition, it represents a "least common denominator" of service
to be expected at higher layers. Further, even at transport and above, there is justification
for a connectionless service. There are instances in which the overhead of
connection establishment and maintenance is unjustified or even counterproductive.
Some examples follow:
Inward data collection. Involves the periodic active or passive sampling of
data sources, such as sensors, and automatic self-test reports from security
equipment or network components. In a real-time monitoring situation, the
loss of an occasional data unit would not cause distress, as the next report
should arrive shortly.
Outward data dissemination. Includes broadcast messages to network users,
the announcement of a new node or the change of address of a service, and
the distribution of real-time clock values.
Request-response. Applications in which a transaction service is provided by
a common server to a number of distributed TS users, and for which a single
request-response sequence is typical. Use of the service is regulated at the
application level, and lower-level connections are often unnecessary and
cumbersome.
Real-time applications. Such as voice and telemetry, involving a degree of
redundancy and/or a real-time transmission requirement; these must not have
connection-oriented functions, such as retransmission.
Thus, there is a place at the transport level for both a connection-oriented and
a connectionless type of service.
Quality of Service
The transport protocol entity should allow the TS user to specify the quality of
transmission service to be provided. The transport entity will attempt to optimize
the use of the underlying link, network, and internet resources to the best of its ability,
so as to provide the collective requested services.
Examples of services that might be requested are
Acceptable error and loss levels
Desired average and maximum delay
Desired average and minimum throughput
Priority levels
Of course, the transport entity is limited to the inherent capabilities of the
underlying service. For example, IP does provide a quality-of-service parameter. It
allows for specification of eight levels of precedence or priority as well as a binary
specification for normal or low delay, normal or high throughput, and normal or
high reliability. Thus, the transport entity can "pass the buck" to the internetwork
entity. However, the internet protocol entity is itself limited; routers have some
freedom to schedule items preferentially from buffers, but beyond that are still
dependent on the underlying transmission facilities. Here is another example: X.25
provides for throughput class negotiation as an optional user facility. The network
may alter flow control parameters and the amount of network resources allocated
on a virtual circuit to achieve desired throughput.
The transport layer may also resort to other mechanisms to try to satisfy TS
user requests, such as splitting one transport connection among multiple virtual circuits
to enhance throughput.
The TS user of the quality-of-service feature needs to recognize that
Depending on the nature of the transmission facility, the transport entity will
have varying degrees of success in providing a requested grade of service.
There is bound to be a trade-off among reliability, delay, throughput, and cost
of services.
Nevertheless, certain applications would benefit from, or even require, certain
qualities of service and, in a hierarchical or layered architecture, the easiest way for
an application to extract this quality of service from a transmission facility is to pass
the request down to the transport protocol.
Examples of applications that might request particular qualities of service are
as follows:
A file transfer protocol might require high throughput. It may also require
high reliability to avoid retransmissions at the file transfer level.
A transaction protocol (e.g., web browser-web server) may require low delay.
An electronic mail protocol may require multiple priority levels.
One approach to providing a variety of qualities of service is to include a
quality-of-service facility within the protocol; we have seen this with IP and will see
that transport protocols typically follow the same approach. An alternative is to
provide a different transport protocol for different classes of traffic; this is to some
extent the approach taken by the ISO-standard family of transport protocols.
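As a concrete illustration, the kinds of parameters listed above could be gathered into a single request record that the TS user hands to the transport entity. This is only a sketch; the field names and values are illustrative, not drawn from any real protocol interface:

```python
from dataclasses import dataclass

# Hypothetical QoS request a TS user might pass down to the transport
# entity; field names are assumptions for illustration only.
@dataclass
class QosRequest:
    max_error_rate: float     # acceptable residual error/loss level
    max_delay_ms: float       # desired maximum delay
    min_throughput_bps: int   # desired minimum throughput
    priority: int             # 0 (lowest) .. 7, as in IP precedence

# A file transfer wants throughput and reliability; a transaction
# protocol (e.g., web browsing) wants low delay.
ftp_request = QosRequest(max_error_rate=1e-9, max_delay_ms=5000,
                         min_throughput_bps=1_000_000, priority=1)
web_request = QosRequest(max_error_rate=1e-6, max_delay_ms=200,
                         min_throughput_bps=64_000, priority=5)
```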
Data Transfer
The whole purpose, of course, of a transport protocol is to transfer data between
two transport entities. Both user data and control data must be transferred, either
on the same channel or separate channels. Full-duplex service must be provided.
Half-duplex and simplex modes may also be offered to support peculiarities of particular
TS users.
User Interface
It is not clear that the exact mechanism of the user interface to the transport protocol
should be standardized. Rather, it should be optimized to the station environment.
As examples, a transport entity's services could be invoked by
* Procedure calls.
* Passing of data and parameters to a process through a mailbox.
* Use of direct memory access (DMA) between a host user and a front-end
processor containing the transport entity.
A few characteristics of the interface may be specified, however. For example,
a mechanism is needed to prevent the TS user from swamping the transport entity
with data. A similar mechanism is needed to prevent the transport entity from
swamping a TS user with data. Another aspect of the interface has to do with the
timing and significance of confirmations. Consider the following: A TS user passes
data to a transport entity to be delivered to a remote TS user. The local transport
entity can acknowledge receipt of the data immediately, or it can wait until the
remote transport entity reports that the data have made it through to the other end.
Perhaps the most useful interface is one that allows immediate acceptance or rejection
of requests, with later confirmation of the end-to-end significance.
Connection Management
When connection-oriented service is provided, the transport entity is responsible
for establishing and terminating connections. A symmetric connection-establishment
procedure should be provided, which allows either TS user to initiate connection
establishment. An asymmetric procedure may also be provided to support simplex
connections.
Connection termination can be either abrupt or graceful. With an abrupt termination,
data in transit may be lost. A graceful termination prevents either side
from shutting down until all data have been delivered.
Expedited Delivery
A service similar to that provided by priority classes is the expedited delivery of
data. Some data submitted to the transport service may supersede data submitted
previously. The transport entity will endeavor to have the transmission facility
transfer the data as rapidly as possible. At the receiving end, the transport entity
will interrupt the TS user to notify it of the receipt of urgent data. Thus, the expedited
data service is in the nature of an interrupt mechanism, and is used to transfer
occasional urgent data, such as a break character from a terminal or an alarm
condition. In contrast, a priority service might dedicate resources and adjust parameters
such that, on average, higher priority data are delivered more quickly.
Status Reporting
A status reporting service allows the TS user to obtain or be notified of information
concerning the condition or attributes of the transport entity or a transport connection.
Examples of status information are
Performance characteristics of a connection (e.g., throughput, mean delay)
Addresses (network, transport)
Class of protocol in use
Current timer values
State of protocol "machine" supporting a connection
Degradation in requested quality of service
Security
The transport entity may provide a variety of security services. Access control may
be provided in the form of local verification of sender and remote verification of
receiver. The transport service may also include encryption/decryption of data on
demand. Finally, the transport entity may be capable of routing through secure links
or nodes if such a service is available from the transmission facility.

PROTOCOL MECHANISMS
It is the purpose of this section to make good on our claim that a transport protocol
may need to be very complex. For purposes of clarity, we present the transport protocol
mechanisms in an evolutionary fashion. We begin with a network service that
makes life easy for the transport protocol, by guaranteeing the delivery of all transport
data units in order, as well as defining the required mechanisms. Then we will
look at the transport protocol mechanisms required to cope with an unreliable network
service.
Reliable Sequencing Network Service
In this case, we assume that the network service will accept messages of arbitrary
length and will, with virtually 100% reliability, deliver them in sequence to the destination.
Examples of such networks follow:
A highly reliable packet-switching network with an X.25 interface
A frame relay network using the LAPF control protocol
An IEEE 802.3 LAN using the connection-oriented LLC service
The assumption of a reliable sequencing network service allows the use of
a quite simple transport protocol. Four issues need to be addressed:
Addressing
Multiplexing
Flow control
Connection establishment/termination
Addressing
The issue concerned with addressing is simply this: A user of a given transport
entity wishes to either establish a connection with or make a connectionless data
transfer to a user of some other transport entity. The target user needs to be specified
by all of the following:
User identification
Transport entity identification
Station address
Network number
The transport protocol must be able to derive the information listed above
from the TS user address. Typically, the user address is specified as (station, port).
The port variable represents a particular TS user at the specified station; in OSI, this
is called a transport service access point (TSAP). Generally, there will be a single
transport entity at each station, so a transport entity identification is not needed. If
more than one transport entity is present, there is usually only one of each type. In
this latter case, the address should include a designation of the type of transport
protocol (e.g., TCP, UDP). In the case of a single network, station identifies an
attached network device. In the case of an internet, station is a global internet
address. In TCP, the combination of port and station is referred to as a socket.
Because routing is not a concern of the transport layer, it simply passes the
station portion of the address down to the network service. Port is included
in a transport header, to be used at the destination by the destination transport
protocol.
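This division of labor can be sketched in a few lines. The `Socket` type below mirrors TCP's (station, port) pairing; the helper showing which part each layer consumes is purely illustrative:

```python
from typing import NamedTuple

class Socket(NamedTuple):
    """A TCP-style socket: global station (host) address plus port."""
    station: str   # passed down to the network service for routing
    port: int      # carried in the transport header for demultiplexing

def split_address(sock: Socket):
    """Return what each layer consumes from the TS user address.

    Routing is not a transport concern: the station goes to the
    network layer, while the port is used only by the destination
    transport entity. (Illustrative helper, not a real API.)
    """
    return sock.station, sock.port

dst = Socket("198.51.100.7", 80)
print(split_address(dst))  # ('198.51.100.7', 80)
```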
One question remains to be addressed: How does the initiating TS user know
the address of the destination TS user? Two static and two dynamic strategies
suggest themselves:
1. The TS user must know the address it wishes to use ahead of time; this is basically
a system configuration function. For example, a process may be running
that is only of concern to a limited number of TS users, such as a process that
collects statistics on performance. From time to time, a central network management
routine connects to the process to obtain the statistics. These processes
generally are not, and should not be, well-known and accessible to all.
2. Some commonly used services are assigned "well-known addresses" (for
example, time sharing and word processing).
3. A name server is provided. The TS user requests a service by some generic or
global name. The request is sent to the name server, which does a directory
lookup and returns an address. The transport entity then proceeds with the
connection. This service is useful for commonly used applications that change
location from time to time. For example, a data entry process may be moved
from one station to another on a local network in order to balance load.
4. In some cases, the target user is to be a process that is spawned at request
time. The initiating user can send a process request to a well-known address.
The user at that address is a privileged system process that will spawn the new
process and return an address. For example, a programmer has developed a
private application (e.g., a simulation program) that will execute on a remote
mainframe but be invoked from a local minicomputer. An RJE-type request
can be issued to a remote job-management process that spawns the simulation
process.
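Strategy 3 can be illustrated with a toy name server; the directory contents and function names here are hypothetical:

```python
# A toy name server: maps a generic service name to its current
# transport address, so a service can move between stations without
# its users needing reconfiguration. All names are illustrative.
directory = {
    "data-entry": ("stationA", 5001),
}

def resolve(name):
    """Directory lookup performed by the name server."""
    return directory.get(name)

def relocate(name, station, port):
    """Move a service to a new station (e.g., to balance load)."""
    directory[name] = (station, port)

assert resolve("data-entry") == ("stationA", 5001)
relocate("data-entry", "stationB", 5001)   # service migrates
assert resolve("data-entry") == ("stationB", 5001)
```

A TS user asks for "data-entry" by name; the returned address is then used for connection establishment, exactly as if it had been statically configured.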
Multiplexing
We now turn to the concept of multiplexing, which was discussed in general terms
in Section 15.1. With respect to the interface between the transport protocol and
higher-level protocols, the transport protocol performs a multiplexing/demultiplexing
function. That is, multiple users employ the same transport protocol, and are
distinguished by either port numbers or service access points.
The transport entity may also perform a multiplexing function with respect to
the network services that it uses. Recall that we defined upward multiplexing as the
multiplexing of multiple connections on a single lower-level connection, and downward
multiplexing as the splitting of a single connection among multiple lower-level
connections.
Consider, for example, a transport entity making use of an X.25 service. Why
should the transport entity employ upward multiplexing? There are, after all, 4095
virtual circuits available. In the typical case, this is more than enough to handle all
active TS users. However, most X.25 networks base part of their charge on virtual-circuit
connect time, as each virtual circuit consumes some node buffer resources.
Thus, if a single virtual circuit provides sufficient throughput for multiple TS users,
upward multiplexing is indicated.
On the other hand, downward multiplexing or splitting might be used to
improve throughput. For example, each X.25 virtual circuit is restricted to a 3-bit or
7-bit sequence number. A larger sequence space might be needed for high-speed,
high-delay networks. Of course, throughput can only be increased so far. If there is
a single station-node link over which all virtual circuits are multiplexed, the
throughput of a transport connection cannot exceed the data rate of that link.
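The two directions of multiplexing amount to simple bookkeeping in the transport entity; the class and connection names below are illustrative only:

```python
from collections import defaultdict

class TransportEntity:
    """Toy model of transport-to-network multiplexing (illustrative)."""
    def __init__(self):
        self.upward = defaultdict(list)    # virtual circuit -> [conns]
        self.downward = defaultdict(list)  # connection -> [circuits]

    def share_circuit(self, vc, conn):
        """Upward multiplexing: several transport connections ride
        one network virtual circuit (saves per-circuit charges)."""
        self.upward[vc].append(conn)

    def split_connection(self, conn, vcs):
        """Downward multiplexing (splitting): one transport connection
        spread across several circuits to raise throughput."""
        self.downward[conn] = list(vcs)

te = TransportEntity()
te.share_circuit(vc=1, conn="telnet-session")
te.share_circuit(vc=1, conn="ftp-control")
te.split_connection("bulk-transfer", vcs=[2, 3, 4])
print(len(te.upward[1]), len(te.downward["bulk-transfer"]))  # 2 3
```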
Flow Control
Whereas flow control is a relatively simple mechanism at the link layer, it is a rather
complex mechanism at the transport layer, for two main reasons:
Flow control at the transport level involves the interaction of TS users, transport
entities, and the network service.
The transmission delay between transport entities is generally long compared
to actual transmission time, and, what is worse, it is variable.
Figure 17.3 illustrates the first point. TS user A wishes to send data to TS user
B over a transport connection. We can view the situation as involving four queues.
A generates data and queues it up to send. A must wait to send that data until
It has permission from B (peer flow control).
It has permission from its own transport entity (interface flow control).
As data flow down from A to transport entity a, a queues the data until it has
permission to send it on from b and the network service. The data are then handed
to the network layer for delivery to b. The network service must queue the data
until it receives permission from b to pass them on. Finally, b must await B's permission
before delivering the data to their destination.
To see the effects of delay, consider the possible interactions depicted in Figure
17.4. When a TS user wishes to transmit data, it sends these data to its transport
entity (e.g., using a Send call); this triggers two events. The transport entity generates
one or more transport-level protocol data units, which we will call segments,
and passes these on to the network service. It also in some way acknowledges to the
TS user that it has accepted the data for transmission. At this point, the transport
entity can exercise flow control across the user-transport interface by simply withholding
its acknowledgment. The transport entity is most likely to do this if the
entity itself is being held up by a flow control exercised by either the network service
or the target transport entity.
In any case, once the transport entity has accepted the data, it sends out a segment.
Some time later, it receives an acknowledgment that the data have been
received at the remote end. It then sends a confirmation to the sender.
At the receiving end, a segment arrives at the transport entity, which unwraps
the data and sends them on (e.g., by an Indication primitive) to the destination TS
user. When the TS user accepts the data, it issues an acknowledgment (e.g., in the
form of a Response primitive). The TS user can exercise flow control over the transport
entity by withholding its response.
The target transport entity has two choices regarding acknowledgment. Either
it can issue an acknowledgment as soon as it has correctly received the segment (the
usual practice), or it can wait until it knows that the TS user has correctly received
the data before acknowledging; the latter course is the safer, where the confirmation
is in fact a confirmation that the destination TS user received the data. In the
former case, the entity merely confirms that the data made it through to the remote
transport entity.
With the discussion above in mind, we can cite two reasons why one transport
entity would want to restrain the rate of segment transmission over a connection
from another transport entity:
* The user of the receiving transport entity cannot keep up with the flow of
data.
* The receiving transport entity itself cannot keep up with the flow of segments.
How do such problems manifest themselves? Well, presumably a transport
entity has a certain amount of buffer space, to which incoming segments are added.
Each buffered segment is processed (i.e., the transport header is examined) and the
data are sent to the TS user. Either of the two problems mentioned above will cause
the buffer to fill up. Thus, the transport entity needs to take steps to stop or slow
the flow of segments so as to prevent buffer overflow. This requirement is not so
easy to fulfill, because of the annoying time gap between sender and receiver. We
return to this point in a moment. First, we present four ways of coping with the flow
control requirement. The receiving transport entity can
1. Do nothing.
2. Refuse to accept further segments from the network service.
3. Use a fixed sliding-window protocol.
4. Use a credit scheme.
Alternative 1 means that the segments that overflow the buffer are discarded.
The sending transport entity, failing to get an acknowledgment, will retransmit. This
is a shame, as the advantage of a reliable network is that one never has to retransmit.
Furthermore, the effect of this maneuver is to exacerbate the problem! The
sender has increased its output to include new segments, plus retransmitted old
segments.
The second alternative is a backpressure mechanism that relies on the network
service to do the work. When a buffer of a transport entity is full, it refuses
additional data from the network service. This triggers flow control procedures
within the network that throttle the network service at the sending end. This service,
in turn, refuses additional segments from its transport entity. It should be clear
that this mechanism is clumsy and coarse-grained. For example, if multiple transport
connections are multiplexed on a single network connection (virtual circuit),
flow control is exercised only on the aggregate of all transport connections.
The third alternative is already familiar to you from our discussions of link
layer protocols. The key ingredients, recall, are
The use of sequence numbers on data units.
The use of a window of fixed size.
The use of acknowledgments to advance the window.
With a reliable network service, the sliding window technique would actually
work quite well. For example, consider a protocol with a window size of 7. Whenever
the sender receives an acknowledgment to a particular segment, it is automatically
authorized to send the succeeding seven segments. (Of course, some may
already have been sent.) Now, when the receiver's buffer capacity gets down to
seven segments, it can withhold acknowledgment of incoming segments to avoid
overflow. The sending transport entity can send, at most, seven additional segments
and then must stop. Because the underlying network service is reliable, the sender
will not time-out and retransmit. Thus, at some point, a sending transport entity
may have a number of segments outstanding, for which no acknowledgment has
been received. Because we are dealing with a reliable network, the sending transport
entity can assume that the segments will get through and that the lack of
acknowledgment is a flow control tactic. Such a strategy would not work well in an
unreliable network, as the sending transport entity would not know whether the
lack of acknowledgment is due to flow control or a lost segment.
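The window arithmetic described above can be sketched as follows. This is a minimal model of the sender's bookkeeping only, assuming a reliable network (so there are no timeouts or retransmissions); all names are illustrative:

```python
# Fixed sliding-window sender: an acknowledgment for segment n
# authorizes transmission up to segment n + WINDOW. Withholding
# acknowledgments is the receiver's only flow-control lever.
WINDOW = 7

class Sender:
    def __init__(self):
        self.last_acked = -1   # highest acknowledged sequence number
        self.next_seq = 0      # next segment to transmit

    def can_send(self):
        return self.next_seq <= self.last_acked + WINDOW

    def send_one(self):
        assert self.can_send()
        self.next_seq += 1

    def on_ack(self, seq):
        self.last_acked = max(self.last_acked, seq)

s = Sender()
while s.can_send():
    s.send_one()
print(s.next_seq)    # 7 -- sender must now stop and wait
s.on_ack(0)
print(s.can_send())  # True -- the window has slid forward by one
```

Note what the model leaves out: with an unreliable network the sender could not distinguish a withheld acknowledgment (flow control) from a lost segment, which is exactly the weakness described above.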
The fourth alternative, a credit scheme, provides the receiver with a greater
degree of control over data flow. Although it is not strictly necessary with a reliable
network service, a credit scheme should result in a smoother traffic flow; further, it
is a more effective scheme with an unreliable network service, as we shall see.
The credit scheme decouples acknowledgment from flow control. In fixed
sliding-window protocols, such as X.25 and HDLC, the two are synonymous. In a
credit scheme, a segment may be acknowledged without granting new credit, and
vice versa. Figure 17.5 illustrates the protocol (compare Figure 6.4). For simplicity,
we show a data flow in one direction only. In this example, data segments are numbered
sequentially modulo 8 (e.g., SN 0 = segment with sequence number 0). Initially,
through the connection-establishment process, the sending and receiving
sequence numbers are synchronized, and A is granted a credit allocation of 7. A
advances the trailing edge of its window each time that it transmits, and advances
the leading edge only when it is granted credit.
Figure 17.6 shows the view of this mechanism from the sending and receiving
sides; of course, both sides take both views because data may be exchanged in both
directions. From the sending point of view, sequence numbers fall into four regions:
Data sent and acknowledged. Beginning with the initial sequence number
used on this connection through the last acknowledged number.
Data sent but not yet acknowledged. Represents data that have already been
transmitted, with the sender now awaiting acknowledgment.
Permitted data transmission. The window of allowable transmissions, based
on unused credit allocated from the other side.
Unused and unusable numbers. Numbers above the window.
From the receiving point of view, the concern is for received data and for the
window of credit that has been allocated. Note that the receiver is not required to
immediately acknowledge incoming segments, but may wait and issue a cumulative
acknowledgment for a number of segments; this is true for both TCP and the ISO
transport protocol.
In both the credit allocation scheme and the sliding window scheme, the
receiver needs to adopt some policy concerning the amount of data it permits the
sender to transmit. The conservative approach is to only allow new segments up to
the limit of available buffer space. If this policy were in effect in Figure 17.5, the first
credit message implies that B has five free buffer slots, and the second message that
B has seven free slots.
A conservative flow control scheme may limit the throughput of the transport
connection in long-delay situations. The receiver could potentially increase
throughput by optimistically granting credit for space it does not have. For example,
if a receiver's buffer is full but it anticipates that it can release space for two segments
within a round-trip propagation time, it could immediately send a credit of 2.
If the receiver can keep up with the sender, this scheme may increase throughput
and can do no harm. If the sender is faster than the receiver, however, some segments
may be discarded, necessitating a retransmission. Because retransmissions
are not otherwise necessary with a reliable network service, an optimistic flow control
scheme will complicate the protocol.
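The decoupling of acknowledgment from credit can be sketched as follows. As a simplification, sequence numbers here are unbounded rather than modulo 8, and only the sender's bookkeeping is modeled (names are illustrative):

```python
# Credit scheme: an (ACK n, CREDIT c) message acknowledges all
# segments below n and permits segments n .. n+c-1. Because the new
# permission is recomputed from the ACK point, a segment may be
# acknowledged without granting credit, and vice versa.
class CreditSender:
    def __init__(self, credit):
        self.next_seq = 0
        self.limit = credit   # first sequence number NOT yet permitted

    def can_send(self):
        return self.next_seq < self.limit

    def send_one(self):
        assert self.can_send()
        self.next_seq += 1

    def on_ack(self, ack, credit):
        self.limit = ack + credit   # window edge set by ACK + credit

s = CreditSender(credit=7)
for _ in range(3):
    s.send_one()
s.on_ack(ack=3, credit=0)    # data acknowledged, no new credit
print(s.can_send())          # False -- sender throttled despite ACK
s.on_ack(ack=3, credit=5)    # credit granted without new ACK
print(s.can_send())          # True
```

The second `on_ack` call shows the conservative policy in action: the receiver granted credit only once buffer space was (or was expected to be) free; an optimistic receiver would grant credit it cannot yet back with buffers.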
Connection Establishment and Termination
Even with a reliable network service, there is a need for connection establishment
and termination procedures to support connection-oriented service. Connection
establishment serves three main purposes:
It allows each end to assure that the other exists.
It allows negotiation of optional parameters (e.g., maximum segment size,
maximum window size, quality of service).
It triggers allocation of transport entity resources (e.g., buffer space, entry in
connection table).
Connection establishment is by mutual agreement and can be accomplished
by a simple set of user commands and control segments, as shown in the state diagram
of Figure 17.7. To begin, a TS user is in a CLOSED state (i.e., it has no open
transport connection). The TS user can signal that it will passively wait for a request
with a Passive Open command. A server program, such as time sharing or a file
transfer application, might do this. The TS user may change its mind by sending a
Close command. After the Passive Open command is issued, the transport entity
creates a connection object of some sort (i.e., a table entry) that is in the LISTEN
state.
From the CLOSED state, the TS user may open a connection by issuing an
Active Open command, which instructs the transport entity to attempt connection
establishment with a designated user, which then triggers the transport entity to
send a SYN (for synchronize) segment. This segment is carried to the receiving
transport entity and interpreted as a request for connection to a particular port. If
the destination transport entity is in the LISTEN state for that port, then a connection
is established through the following actions by the receiving transport entity:
Signal the TS user that a connection is open.
Send a SYN as confirmation to the remote transport entity.
Put the connection object in an ESTAB (established) state.
When the responding SYN is received by the initiating transport entity, it too
can move the connection to an ESTAB state. The connection is prematurely
aborted if either TS user issues a Close command.
Figure 17.8 shows the robustness of this protocol. Either side can initiate a
connection. Further, if both sides initiate the connection at about the same time, it
is established without confusion; this is because the SYN segment functions both as
a connection request and a connection acknowledgment.
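The two-way handshake of Figures 17.7 and 17.8 can be modeled as a small state machine. This toy sketch omits ports, RST handling, and data transfer, but it does capture the dual role of SYN, including the simultaneous-open case:

```python
# Minimal two-way-handshake model: SYN serves as both connection
# request and connection acknowledgment. Illustrative only.
class Endpoint:
    def __init__(self):
        self.state = "CLOSED"

    def passive_open(self):
        self.state = "LISTEN"

    def active_open(self):
        self.state = "SYN SENT"
        return "SYN"                 # segment sent toward the peer

    def on_syn(self):
        if self.state == "LISTEN":   # SYN is a connection request
            self.state = "ESTAB"
            return "SYN"             # confirming SYN back to initiator
        if self.state == "SYN SENT": # SYN confirms our own request
            self.state = "ESTAB"
        return None

a, b = Endpoint(), Endpoint()
b.passive_open()
a.active_open()          # a -> b: SYN
b.on_syn()               # b: LISTEN -> ESTAB, returns confirming SYN
a.on_syn()               # a: SYN SENT -> ESTAB
print(a.state, b.state)  # ESTAB ESTAB
```

If both sides issue Active Open at about the same time, each ends up in SYN SENT and receives the other's SYN; `on_syn` then moves both directly to ESTAB, with no confusion.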
The reader may ask what happens if a SYN comes in while the requested TS
user is idle (not listening). Three courses may be followed:
The transport entity can reject the request by sending an RST (reset) segment
back to the other transport entity.
The request can be queued until a matching Open is issued by the TS user.
The transport entity can interrupt or otherwise signal the TS user to notify it
of a pending request.
Note that if the latter mechanism is used, a Passive Open command is not
strictly necessary, but may be replaced by an Accept command, which is a signal
from the user to the transport entity that it accepts the request for connection.
Connection termination is handled similarly. Either side, or both sides, may
initiate a close. The connection is closed by mutual agreement. This strategy allows
for either abrupt or graceful termination. To achieve the latter, a connection in the
CLOSE WAIT state must continue to accept data segments until a FIN (finish) segment
is received.
Similarly, the diagram defines the procedure for graceful termination. First,
consider the side that initiates the termination procedure:
1. In response to a TS user's Close primitive, a FIN segment is sent to the other
side of the connection, requesting termination.
2. Having sent the FIN, the transport entity places the connection in the FIN
WAIT state. In this state, the transport entity must continue to accept data
from the other side and deliver that data to its user.
3. When a FIN is received in response, the transport entity informs its user and
closes the connection.
From the point of view of the side that does not initiate a termination,
1. When a FIN segment is received, the transport entity informs its user of the
termination request and places the connection in the CLOSE WAIT state. In
this state, the transport entity must continue to accept data from its user and
transmit it in data segments to the other side.
2. When the user issues a Close primitive, the transport entity sends a responding
FIN segment to the other side and closes the connection.
This procedure ensures that both sides have received all outstanding data and
that both sides agree to connection termination before actual termination.
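The two numbered procedures above can be summarized as a small transition table for the side that initiates the close. This is only an illustrative sketch of the text's steps, not a real TCP implementation; the state names follow the text, while the function and action names are invented for the example.

```python
# Sketch of the graceful-close exchange described in steps 1-3 above,
# modelled as a transition table for the initiating side. State names
# (FIN WAIT, etc.) follow the text; helper names are illustrative.

def close_initiator(state, event):
    """Return (next_state, action) for the side that initiates termination."""
    transitions = {
        # Step 1: user issues Close -> send FIN, enter FIN WAIT
        ("ESTABLISHED", "user_close"): ("FIN WAIT", "send FIN"),
        # Step 2: while in FIN WAIT, keep accepting and delivering data
        ("FIN WAIT", "recv_data"):     ("FIN WAIT", "deliver to user"),
        # Step 3: responding FIN arrives -> inform user, close connection
        ("FIN WAIT", "recv_FIN"):      ("CLOSED",   "inform user"),
    }
    return transitions[(state, event)]
```

For example, `close_initiator("ESTABLISHED", "user_close")` yields `("FIN WAIT", "send FIN")`, matching step 1.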
Unreliable Network Service
The most difficult case for a transport protocol is that of an unreliable network service.
Examples of such networks are
An internetwork using IP
A frame relay network using only the LAPF core protocol
An IEEE 802.3 LAN using the unacknowledged connectionless LLC service
The problem is not just that segments are occasionally lost, but that segments
may arrive out of sequence due to variable transit delays. As we shall see, elaborate
machinery is required to cope with these two interrelated network deficiencies. We
shall also see that a discouraging pattern emerges. The combination of unreliability
and nonsequencing creates problems with every mechanism we have discussed so
far. Generally, the solution to each problem raises new problems, and although
there are problems to be overcome for protocols at all levels, it seems that there are
more difficulties with a reliable connection-oriented transport protocol than any
other sort of protocol.
Seven issues need to be addressed:
Ordered delivery
Retransmission strategy
Duplicate detection
Flow control
Connection establishment
Connection termination
Crash recovery
Ordered Delivery
With an unreliable network service, it is possible that segments, even if they are all
delivered, may arrive out of order. The required solution to this problem is to number
segments sequentially. We have seen that for data link control protocols, such
as HDLC, and for X.25, each data unit (frame, packet) is numbered sequentially,
with each successive sequence number being one more than the previous
sequence number; this scheme is used in some transport protocols, such as the ISO
transport protocols. However, TCP uses a somewhat different scheme in which
each data octet that is transmitted is implicitly numbered. Thus, the first segment
may have a sequence number of 0. If that segment has 1000 octets of data, then the
second segment would have the sequence number 1000, and so on. For simplicity in
the discussions of this section, we will assume that each successive segment's sequence
number is one more than that of the previous segment.
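TCP's octet-numbering scheme described above is easy to sketch: each segment carries the number of its first data octet, so the next segment's number is the previous number plus the previous segment's size. The helper name below is illustrative, not from any real implementation.

```python
# Sketch of TCP-style octet numbering: a segment's sequence number is
# the number of the first data octet it carries. Illustrative helper.

def segment_sequence_numbers(initial_seq, segment_sizes):
    """Return the sequence number of each successive segment."""
    numbers = []
    seq = initial_seq
    for size in segment_sizes:
        numbers.append(seq)
        seq += size  # next segment begins after this segment's last octet
    return numbers
```

With an initial sequence number of 0 and two 1000-octet segments, the second segment is numbered 1000, exactly as in the text's example.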
Retransmission Strategy
Two events necessitate the retransmission of a segment. First, the segment may be
damaged in transit but, nevertheless, could arrive at its destination. If a frame check
sequence is included with the segment, the receiving transport entity can detect the
error and discard the segment. The second contingency is that a segment fails to
arrive. In either case, the sending transport entity does not know that the segment
transmission was unsuccessful. To cover this contingency, we require that a positive
acknowledgment (ACK) scheme be used: The receiver must acknowledge each successfully
received segment. For efficiency, we do not require one ACK per segment.
Rather, a cumulative acknowledgment can be used, as we have seen many times in
this lesson. Thus, the receiver may receive segments numbered 1, 2, and 3, but only
send ACK 4 back. The sender must interpret ACK 4 to mean that number 3 and all
previous segments have been successfully received.
If a segment does not arrive successfully, no ACK will be issued and a retransmission
becomes necessary. To cope with this situation, there must be a timer associated
with each segment as it is sent. If the timer expires before the segment is
acknowledged, the sender must retransmit.
So, the addition of a timer solves this first problem. Next, at what value should
the timer be set? If the value is too small, there will be many unnecessary retransmissions,
thereby wasting network capacity. If the value is too large, the protocol
will be sluggish in responding to a lost segment. The timer should be set at a value
a bit longer than the round trip delay (send segment, receive ACK). Of course this
delay is variable even under constant network load. Worse, the statistics of the
delay will vary with changing network conditions.
Two strategies suggest themselves. A fixed timer value could be used, based
on an understanding of the network's typical behavior; this suffers from an inability
to respond to changing network conditions. If the value is set too high, the service
will always be sluggish. If it is set too low, a positive feedback condition can
develop, in which network congestion leads to more retransmissions, which increase
congestion.
An adaptive scheme has its own problems. Suppose that the transport entity
keeps track of the time taken to acknowledge data segments and sets its retransmission
timer based on the average of the observed delays. This value cannot be
trusted for three reasons:
The peer entity may not acknowledge a segment immediately; recall that we
gave it the privilege of cumulative acknowledgments.
If a segment has been retransmitted, the sender cannot know whether the
received ACK is a response to the initial transmission or the retransmission.
Network conditions may change suddenly.
Each of these problems is a cause for some further tweaking of the transport algorithm,
but the problem admits of no complete solution. There will always be some
uncertainty concerning the best value for the retransmission timer.
Incidentally, the retransmission timer is only one of a number of timers
needed for proper functioning of a transport protocol; these are listed in Table 17.1,
together with a brief explanation. Further discussion will be found in what follows.
Duplicate Detection
If a segment is lost and then retransmitted, no confusion will result. If, however, an
ACK is lost, one or more segments will be retransmitted and, if they arrive successfully,
will be duplicates of previously received segments. Thus, the receiver must
be able to recognize duplicates. The fact that each segment carries a sequence number
helps but, nevertheless, duplicate detection and handling is no easy thing. There
are two cases:
A duplicate is received prior to the close of the connection.
A duplicate is received after the close of the connection.
Notice that we say "a" duplicate rather than "the" duplicate. From the
sender's point of view, the retransmitted segment is the duplicate. However, the
retransmitted segment may arrive before the original segment, in which case
the receiver views the original segment as the duplicate. In any case, two tactics are
needed to cope with a duplicate received prior to the close of a connection:
The receiver must assume that its acknowledgment was lost and therefore
must acknowledge the duplicate. Consequently, the sender must not get confused
if it receives multiple ACKs to the same segment.
The sequence number space must be long enough so as not to "cycle" in less
than the maximum possible segment lifetime.
Figure 17.9 illustrates the reason for the latter requirement. In this example,
the sequence space is of length 8. For simplicity, we assume a sliding-window protocol
with a window size of 3. Suppose that A has transmitted data segments 0, 1,
and 2 and receives no acknowledgments. Eventually, it times-out and retransmits
segment 0. B has received 1 and 2, but 0 is delayed in transit. Thus, B does not send
any ACKs. When the duplicate segment 0 arrives, B acknowledges 0, 1, and 2.
Meanwhile, A has timed-out again and retransmits 1, which B acknowledges with
another ACK 3. Things now seem to have sorted themselves out, and data transfer
continues. When the sequence space is exhausted, A cycles back to sequence number
0 and continues. Alas, the old segment 0 makes a belated appearance and is
accepted by B before the new segment 0 arrives.
It should be clear that the untimely emergence of the old segment would have
caused no difficulty if the sequence numbers had not yet wrapped around. The
problem is: How big must the sequence space be? This depends on, among other
things, whether the network enforces a maximum packet lifetime, as well as the rate
at which segments are being transmitted. Fortunately, each addition of a single bit
to the sequence number field doubles the sequence space, so it is rather easy to
select a safe size. As we shall see, the standard transport protocols allow stupendous
sequence spaces.
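When the sequence space does wrap, "earlier" and "later" can no longer be decided by ordinary comparison; one common technique (serial-number arithmetic, as used with TCP's 32-bit space) compares numbers modulo the space, valid as long as the two numbers are within half the space of each other. This is a sketch of that idea, not TCP's exact code.

```python
# Sketch of wraparound-safe sequence comparison (serial-number
# arithmetic). Valid when a and b lie within half the space of
# each other, which the window-size limit guarantees.

SEQ_SPACE = 2 ** 32  # TCP's 32-bit sequence space

def seq_lt(a, b):
    """True if sequence number a precedes b, allowing for wraparound."""
    return a != b and ((b - a) % SEQ_SPACE) < SEQ_SPACE // 2
```

Note that `seq_lt(SEQ_SPACE - 1, 1)` is true: the number just before the wrap precedes the number just after it, which plain `<` would get wrong.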
Flow Control
The credit-allocation flow control mechanism described earlier is quite robust in the
face of an unreliable network service and requires little enhancement. We assume
that the credit allocation scheme is tied to acknowledgments in the following way:
To both acknowledge segments and grant credit, a transport entity sends a control
segment of the form (ACK N, CREDIT M), where ACK N acknowledges all data
segments through number N - 1, and CREDIT M allows segments numbered N
through N + M - 1 to be transmitted. This mechanism is quite powerful. Consider
that the last control segment issued by B was (ACK N, CREDIT M). Then,
To increase or decrease credit to X when no additional segments have arrived,
B can issue (ACK N, CREDIT X).
To acknowledge a new segment without increasing credit, B can issue (ACK
N + 1, CREDIT M - 1).
If an ACK/CREDIT segment is lost, little harm is done. Future acknowledgments
will resynchronize the protocol. Further, if no new acknowledgments are
forthcoming, the sender times-out and retransmits a data segment, which triggers a
new acknowledgment. However, it is still possible for deadlock to occur. Consider
a situation in which B sends (ACK N, CREDIT 0), temporarily closing the window.
Subsequently, B sends (ACK N, CREDIT M), but this segment is lost. A is awaiting
the opportunity to send data, and B thinks that it has granted that opportunity.
To overcome this problem, a window timer can be used. This timer is reset with
each outgoing ACK/CREDIT segment. If the timer ever expires, the protocol
entity is required to send an ACK/CREDIT segment, even if it duplicates a previous
one. This breaks the deadlock and also assures the other end that the protocol
entity is still alive.
An alternative or supplemental mechanism is to provide for acknowledgments
to the ACK/CREDIT segment. With this mechanism in place, the window timer can
have quite a large value without causing much difficulty.
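The (ACK N, CREDIT M) bookkeeping above reduces to simple arithmetic: the sender may transmit segments N through N + M - 1, and acknowledging one new segment without granting fresh credit leaves the window's upper edge fixed. The following sketch uses illustrative names and segment-level (rather than octet-level) numbering, as the text does in this section.

```python
# Sketch of (ACK N, CREDIT M) credit-allocation arithmetic.
# Segment-level numbering, illustrative helper names.

def sender_window(ack_n, credit_m):
    """Segments the sender may transmit: N through N + M - 1."""
    return list(range(ack_n, ack_n + credit_m))

def acknowledge_one(ack_n, credit_m):
    """Acknowledge one new segment without granting new credit:
    (ACK N+1, CREDIT M-1), so the upper edge N + M is unchanged."""
    return ack_n + 1, credit_m - 1
```

For instance, after (ACK 5, CREDIT 3) the sender may send segments 5, 6, and 7; issuing (ACK 6, CREDIT 2) still permits only through segment 7.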
Connection Establishment
As with other protocol mechanisms, connection establishment must take into account
the unreliability of a network service. Recall that a connection establishment
calls for the exchange of SYNs, a procedure sometimes referred to as a two-way
handshake. Suppose that A issues a SYN to B. It expects to get a SYN back, confirming
the connection. Two things can go wrong: A's SYN can be lost or B's
answering SYN can be lost. Both cases can be handled by use of a retransmit-SYN
timer. After A issues a SYN, it will reissue the SYN when the timer expires.
This situation gives rise, potentially, to duplicate SYNs. If A's initial SYN was
lost, there are no duplicates. If B's response was lost, then B may receive two SYNs
from A. Further, if B's response was not lost, but simply delayed, A may get two
responding SYNs; all of this means that A and B must simply ignore duplicate SYNs
once a connection is established.
There are other problems with which to contend. Just as a delayed SYN or
lost response can give rise to a duplicate SYN, a delayed data segment or lost
acknowledgment can give rise to duplicate data segments, as we have seen in Figure
17.9. Such a delayed or duplicated data segment can interfere with connection
establishment, as illustrated in Figure 17.10. Assume that with each new connection,
each transport protocol entity begins numbering its data segments with sequence
number 0. In the figure, a duplicate copy of segment 2 from an old connection
arrives during the lifetime of a new connection and is delivered to B before delivery
of the legitimate data segment number 2. One way of attacking this problem is
to start each new connection with a different sequence number, far removed from
the last sequence number of the most recent connection. For this purpose, the connection
request is of the form SYN i, where i is the sequence number of the first
data segment that will be sent on this connection.
Now, consider that a duplicate SYN i may survive past the termination of the
connection. Figure 17.11 depicts the problem that may arise. An old SYN i arrives
at B after the connection is terminated. B assumes that this is a fresh request and
responds with SYN j. Meanwhile, A has decided to open a new connection with B
and sends SYN k; B discards this as a duplicate. Now, both sides have transmitted
and subsequently received a SYN segment, and therefore think that a valid con-
nection exists. However, when A initiates data transfer with a segment numbered k,
B rejects the segment as being out of sequence.
The way out of this problem is for each side to acknowledge explicitly the
other's SYN and sequence number. The procedure is known as a three-way handshake.
The revised connection state diagram, which is the one employed by TCP, is
shown in the upper part of Figure 17.12. A new state (SYN RECEIVED) is added,
in which the transport entity hesitates during connection opening to assure that the
SYN segments sent by the two sides have both been acknowledged before the connection
is declared established. In addition to the new state, there is a control segment
(RST) to reset the other side when a duplicate SYN is detected.
Figure 17.13 illustrates typical three-way handshake operations. Transport
entity A initiates the connection; a SYN includes the sending sequence number, i.
The responding SYN acknowledges that number and includes the sequence number
for the other side. A acknowledges the SYN/ACK in its first data segment. Next is
shown a situation in which an old SYN X arrives at B after the close of the relevant
connection. B assumes that this is a fresh request and responds with SYN j, ACK i.
When A receives this message, it realizes that it has not requested a connection and
therefore sends an RST, ACK j. Note that the ACK j portion of the RST message
is essential so that an old duplicate RST does not abort a legitimate connection
establishment. The final example shows a case in which an old SYN, ACK arrives
in the middle of a new connection establishment. Because of the use of sequence
numbers in the acknowledgments, this event causes no mischief.
The upper part of Figure 17.12 does not include transitions in which RST is
sent. This was done for simplicity. The basic rule is to send an RST if the connection
state is not yet OPEN and an invalid ACK (one that does not reference something
that was sent) is received. The reader should try various combinations of
events to see that this connection establishment procedure works in spite of any
combination of old and lost segments.
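The defence against an old duplicate SYN in Figure 17.13 hinges on one check: when A receives (SYN j, ACK i), it accepts only if it actually sent SYN i; otherwise it answers with RST (carrying ACK j). The toy model below illustrates just that check; all names are invented for the example, and it is in no sense a real TCP implementation.

```python
# Toy model of the three-way handshake's defence against an old
# duplicate SYN (the Figure 17.13 scenario). Illustrative names only.

def respond_to_syn_ack(sent_syn_seqs, syn_ack):
    """A receives (SYN j, ACK i). If A never sent SYN i, the SYN/ACK
    must stem from an old duplicate, so A answers RST (with ACK j,
    so the RST itself can be validated by the other side)."""
    seq_j, ack_i = syn_ack
    if ack_i in sent_syn_seqs:
        return ("ACK", seq_j)   # legitimate: complete the handshake
    return ("RST", seq_j)       # old duplicate: reset the half-open side
```

With `sent_syn_seqs = {100}`, a (SYN 7, ACK 100) is answered with ACK 7, while a stale (SYN 7, ACK 55) draws an RST, as in the figure.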
Connection Termination
The state diagram of Figure 17.7 defines the use of a simple two-way handshake for
connection establishment, which was found to be unsatisfactory in the face of an
unreliable network service. Similarly, the two-way handshake defined in that diagram
for connection termination is inadequate for an unreliable network service.
The following scenario could be caused by a misordering of segments. A transport
entity in the CLOSE WAIT state sends its last data segment, followed by a FIN segment,
but the FIN segment arrives at the other side before the last data segment.
The receiving transport entity will accept that FIN, close the connection, and lose
the last segment of data. To avoid this problem, a sequence number can be associated
with the FIN, which can be assigned the next sequence number after the last
octet of transmitted data. With this refinement, the receiving transport entity, upon
receiving a FIN, will wait if necessary for the late-arriving data before closing the
connection.
A more serious problem is the potential loss of segments and the potential
presence of obsolete segments. Figure 17.12 shows that the termination procedure
adopts a similar solution to that used for connection establishment. Each side must
explicitly acknowledge the FIN of the other, using an ACK with the sequence number
of the FIN to be acknowledged. For a graceful close, a transport entity requires
the following:
It must send a FIN i and receive an ACK i.
It must receive a FIN j and send an ACK j.
It must wait an interval equal to twice the maximum expected segment
lifetime.
Crash Recovery
When the system upon which a transport entity is running fails and subsequently
restarts, the state information of all active connections is lost. The affected connections
become half-open, as the side that did not fail does not yet realize the problem.
The still active side of a half-open connection can close the connection using
a give-up timer. This timer measures the time the transport machine will continue
to await an acknowledgment (or other appropriate reply) of a transmitted segment
after the segment has been retransmitted the maximum number of times. When the
timer expires, the transport entity assumes that either the other transport entity or
the intervening network has failed. As a result, the timer closes the connection, and
signals an abnormal close to the TS user.
In the event that a transport entity fails and quickly restarts, half-open connections
can be terminated more quickly by the use of the RST segment. The failed
side returns an RST i to every segment i that it receives. When the RST i reaches
the other side, it must be checked for validity based on the sequence number i, as
the RST could be in response to an old segment. If the reset is valid, the transport
entity performs an abnormal termination.
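The validity check on an incoming RST i can be sketched as a range test: the surviving side accepts the reset only if i refers to a segment it still has outstanding, since an RST answering some old segment should be ignored. Names and the exact window used are illustrative assumptions, not a specification.

```python
# Sketch of validating an RST i against the surviving side's send
# state: accept only if i falls among segments still outstanding.
# Illustrative names; a real implementation checks more conditions.

def validate_rst(rst_seq, oldest_unacked, next_seq):
    """Accept the reset only if it references a segment we still have
    outstanding; otherwise it may be a response to an old segment."""
    return oldest_unacked <= rst_seq < next_seq
```

An RST referencing a segment already acknowledged (or never sent) fails the test and is discarded rather than tearing down the connection.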
These measures clean up the situation at the transport level. The decision as
to whether to reopen the connection is up to the TS users. The problem is one of
synchronization. At the time of failure, there may have been one or more outstanding
segments in either direction. The TS user on the side that did not fail
knows how much data it has received, but the other user may not if state information
were lost. Thus, there is the danger that some user data will be lost or
duplicated.
