
Monday, October 3, 2016

INTERNETWORKING



Packet-switched and packet broadcasting networks grew out of a need to
allow the computer user to have access to resources beyond those available
in a single system. In a similar fashion, the resources of a single network are
often inadequate to meet users' needs. Because the networks that might be of interest
exhibit so many differences, it is impractical to consider merging them into a single
network. Rather, what is needed is the ability to interconnect various networks
so that any two stations on any of the constituent networks can communicate.
Table 16.1 lists some commonly used terms relating to the interconnection of
networks, or internetworking. An interconnected set of networks, from a user's
point of view, may appear simply as a larger network. However, if each of the constituent
networks retains its identity, and special mechanisms are needed for communicating
across multiple networks, then the entire configuration is often referred
to as an internet, and each of the constituent networks as a subnetwork.
Each constituent subnetwork in an internet supports communication among
the devices attached to that subnetwork; these devices are referred to as end systems
(ESs). In addition, subnetworks are connected by devices referred to in the
ISO documents as intermediate systems (ISs). ISs provide a communications path
and perform the necessary relaying and routing functions so that data can be
exchanged between devices attached to different subnetworks in the internet.
Two types of ISs of particular interest are bridges and routers. The differences
between them have to do with the types of protocols used for the internetworking
logic. In essence, a bridge operates at layer 2 of the open systems interconnection
(OSI) 7-layer architecture and acts as a relay of frames between like networks.
(Bridges were examined in detail in Lesson 14.) A router operates at layer 3 of
the OSI architecture and routes packets between potentially different networks.
Both the bridge and the router assume that the same upper-layer protocols are in
use.
We begin our examination with a discussion of the principles underlying various
approaches to internetworking. We then examine the most important architectural
approach to internetworking: the connectionless router. As an example, we
describe the most widely used internetworking protocol, called simply the Internet
Protocol (IP). These three approaches are explored in some detail. The lesson
then turns to the issue of internetwork routing algorithms. Finally, we look at the
newest standardized internetworking protocol, known as IPv6.
Figure 16.1 highlights the position of the protocols discussed in this lesson
within the TCP/IP protocol suite.
PRINCIPLES OF INTERNETWORKING
 
Requirements
Although a variety of approaches have been taken to provide internetwork service,
the overall requirements on the internetworking facility can be stated in general;
these include
1. Providing a link between networks. At minimum, a physical and link control
connection is needed.
2. Providing for the routing and delivery of data between processes on different
networks.
3. Providing an accounting service that keeps track of the use of the various networks
and routers and that maintains status information.
4. Providing the services listed above in such a way as not to require modifications
to the networking architecture of any of the constituent networks; this
means that the internetworking facility must accommodate a number of differences
among networks, including
a) Different addressing schemes. The networks may use different endpoint
names and addresses and directory maintenance schemes. Some form of
global network-addressing must be provided, as well as a directory service.
b) Different maximum packet size. Packets from one network may have to be
broken up into smaller pieces for another. This process is referred to as
segmentation, or fragmentation.
c) Different network-access mechanisms. The network-access mechanism
between station and network may be different for stations on different networks.
d) Different timeouts. Typically, a connection-oriented transport service
will await an acknowledgment until a timeout expires, at which time it will
retransmit its block of data. In general, longer times are required for successful
delivery across multiple networks. Internetwork timing procedures
must allow for successful transmission that avoids unnecessary
retransmissions.
e) Error recovery. Intranetwork procedures may provide anything from no
error recovery up to reliable end-to-end (within the network) service. The
internetwork service should not depend on, nor be interfered with by, the
nature of the individual network's error recovery capability.
f) Status reporting. Different networks report status and performance differently.
Yet it must be possible for the internetworking facility to provide
such information on internetworking activity to interested and authorized
processes.
g) Routing techniques. Intranetwork routing may depend on fault detection
and congestion control techniques peculiar to each network; the internetworking
facility must be able to coordinate these to adaptively route data
between stations on different networks.
h) User-access control. Each network will have its own user-access control
technique (authorization for use of the network) that must be invoked by
the internetwork facility as needed. Further, a separate internetwork
access control technique may be required.
i) Connection, connectionless. Individual networks may provide connection-oriented
(e.g., virtual circuit) or connectionless (datagram) service. It may
be desirable for the internetwork service not to depend on the nature of
the connection service of the individual networks.
These points are worthy of further comment but are best pursued in the context
of specific architectural approaches. We outline these approaches next, and
then turn to a more detailed discussion of the router-based connectionless
approach.
Architectural Approaches
In describing the interworking function, two dimensions are important:
The mode of operation (connection-mode or connectionless)
The protocol architecture
The mode of operation determines the protocol architecture. There are two general
approaches, depicted in Figure 16.2.
Connection-Mode Operation
In the connection-mode operation, it is assumed that each subnetwork provides a
connection-mode form of service. That is, it is possible to establish a logical network
connection (e.g., virtual circuit) between any two DTEs attached to the same subnetwork.
With this in mind, we can summarize the connection-mode approach as
follows:
1. ISs are used to connect two or more subnetworks; each IS appears as a DTE
to each of the subnetworks to which it is attached.
2. When DTE A wishes to exchange data with DTE B, a logical connection is set
up between them. This logical connection consists of the concatenation of a
sequence of logical connections across subnetworks. The sequence is such that
it forms a path from DTE A to DTE B.
3. The individual subnetwork logical connections are spliced together by ISs. For
example, there is a logical connection from DTE A to IS I across subnetwork
1 and another logical connection from IS I to IS M across subnetwork 2. Any
traffic arriving at IS I on the first logical connection is retransmitted on the
second logical connection, and vice versa.
Several additional points can be made about this form of operation. First, this
approach is suited to providing support for a connection-mode network service.
From the point of view of network users in DTEs A and B, a logical network connection
is established between them that provides all of the features of a logical connection
across a single network.
The second point to be made is that this approach assumes that there is a
connection-mode service available from each subnetwork and that these services
are equivalent; clearly, this may not always be the case. For example, an IEEE 802
or FDDI local area network provides a service defined by the logical link control
(LLC). Two of the options with LLC provide only connectionless service. Therefore,
in this case, the subnetwork service must be enhanced. An example of how this
would be done is for the ISs to implement X.25 on top of LLC across the LAN.
Figure 16.3a illustrates the protocol architecture for connection-mode operation.
Access to all subnetworks, either inherently or by enhancement, is by means
of the same network layer protocol. The interworking units operate at layer 3. As
was mentioned, layer 3 ISs are commonly referred to as routers. A connection-oriented
router performs the following key functions:
Relaying. Data units arriving from one subnetwork via the network layer protocol
are relayed (retransmitted) on another subnetwork. Traffic is over logical
connections that are spliced together at the routers.
Routing. When an end-to-end logical connection, consisting of a sequence of
logical connections, is to be set up, each router in the sequence must make a
routing decision that determines the next hop in the sequence.
Thus, at layer 3, a relaying operation is performed. It is assumed that all of the
end systems share common protocols at layer 4 (transport), and above, for successful
end-to-end communication.
Connectionless-Mode Operation
Figure 16.3b illustrates the connectionless mode of operation. Whereas connection-mode
operation corresponds to the virtual circuit mechanism of a packet-switching
network (Figure 9.4c), connectionless-mode operation corresponds to the datagram
mechanism of a packet-switching network (Figure 9.4d). Each network protocol
data unit is treated independently and routed from source DTE to destination DTE
through a series of routers and networks. For each data unit transmitted by A, A
makes a decision as to which router should receive the data unit. The data unit hops
across the internet from one router to the next until it reaches the destination subnetwork.
At each router, a routing decision is made (independently for each data
unit) concerning the next hop. Thus, different data units may travel different routes
between source and destination DTE.
Figure 16.3b illustrates the protocol architecture for connectionless-mode
operation. All DTEs and all routers share a common network layer protocol known
generically as the internet protocol (IP). An internet protocol was initially developed
for the DARPA internet project and published as RFC 791, and has become
an Internet Standard. The ISO standard, ISO 8473, provides similar functionality.
Below this internet protocol, a protocol is needed to access the particular subnetwork.
Thus, there are typically two protocols operating in each DTE and router at
the network layer: an upper sublayer that provides the internetworking function,
and a lower sublayer that provides subnetwork access.
Bridge Approach
A third approach that is quite common is the use of a bridge. The bridge, also
known as a MAC-level relay, uses a connectionless mode of operation (Figure
16.2b), but does so at a lower level than a router.
The protocol architecture for a bridge is illustrated in Figure 16.3c. In this
case, the end systems share common transport and network protocols. In addition,
it is assumed that all of the networks use the same protocols at the link layer. In the
case of IEEE 802 and FDDI LANs, this means that all of the LANs share a common
LLC protocol and a common MAC protocol. For example, all of the LANs are
IEEE 802.3 using the unacknowledged connectionless form of LLC. In this case,
MAC frames are relayed through bridges between the LANs.
The bridge approach is examined in Lesson 14.
 CONNECTIONLESS INTERNETWORKING
The internet protocol (IP) was developed as part of the DARPA internet project.
Somewhat later, when the international standards community recognized the need
for a connectionless approach to internetworking, the ISO connectionless network
protocol (CLNP) was standardized. The functionality of IP and CLNP is very similar;
they differ in the formats used and in some minor functional features. In this
section, we examine the essential functions of an internetworking protocol, which
apply to both CLNP and IP. For convenience, we refer to IP, but it should be understood
that the narrative in this section applies to both IP and CLNP.
Operation of a Connectionless Internetworking Scheme
IP provides a connectionless, or datagram, service between end systems. There are
a number of advantages to this connectionless approach:
A connectionless internet facility is flexible. It can deal with a variety of networks,
some of which are themselves connectionless. In essence, IP requires
very little from the constituent networks.
A connectionless internet service can be made highly robust. This is basically
the same argument made for a datagram network service versus a virtual circuit
service. For a further discussion, the reader is referred to Section 9.1.
A connectionless internet service is best for connectionless transport
protocols.
Figure 16.4 depicts a typical example of IP, in which two LANs are interconnected
by an X.25 packet-switched WAN. The figure depicts the operation of
the internet protocol for data exchange between host A on one LAN (subnetwork
1) and host B on another departmental LAN (subnetwork 2) through the
WAN. The figure shows the format of the data unit at each stage. The end systems
and routers must all share a common internet protocol. In addition, the end systems
must share the same protocols above IP. The intermediate routers need only implement
up through IP.
The IP at A receives blocks of data to be sent to B from the higher layers of
software in A. IP attaches a header specifying, among other things, the global internet
address of B. That address is logically in two parts: network identifier and end
system identifier. The result is called an internet-protocol data unit, or simply a
datagram. The datagram is then encapsulated with the LAN protocol and sent to
the router, which strips off the LAN fields to read the IP header. The router then
encapsulates the datagram with the X.25 protocol fields and transmits it across the
WAN to another router. This router strips off the X.25 fields and recovers the datagram,
which it then wraps in LAN fields appropriate to LAN 2 and sends it to B.
Let us now look at this example in more detail. End system A has a datagram
to transmit to end system B; the datagram includes the internet address of B. The
IP module in A recognizes that the destination (B) is on another subnetwork. So,
the first step is to send the data to a router, in this case router X. To do this, IP
passes the datagram down to the next lower layer (in this case LLC) with instructions
to send it to router X. LLC in turn passes this information down to the MAC
layer, which inserts the MAC-level address of router X into the MAC header. Thus,
the block of data transmitted onto LAN 1 includes data from a layer or layers above
TCP, plus a TCP header, an IP header, an LLC header, and a MAC header and
trailer.
Next, the packet travels through subnetwork 1 to router X. The router
removes MAC and LLC fields and analyzes the IP header to determine the ultimate
destination of the data, in this case B. The router must now make a routing decision.
There are three possibilities:
1. The destination station is connected directly to one of the subnetworks to
which the router is attached. In this case, the router sends the datagram
directly to the destination.
2. To reach the destination, one or more additional routers must be traversed. In
this case, a routing decision must be made: To which router should the datagram
be sent? In both cases, the IP module in the router sends the datagram
down to the next lower layer with the destination subnetwork address. Please
note that we are speaking here of a lower-layer address that refers to this
network.
3. The router does not know the destination address. In this case, the router
returns an error message to the source of the datagram.
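The three cases above can be sketched as a small decision function. This is only an illustration under simplifying assumptions: the names `forward`, `attached`, and `routes`, and the network labels, are hypothetical and not part of any standard.

```python
from typing import Dict, Set, Tuple

def forward(dest_net: str, attached: Set[str], routes: Dict[str, str]) -> Tuple[str, str]:
    """Return (action, target) for a datagram addressed to dest_net."""
    if dest_net in attached:
        # Case 1: the destination subnetwork is directly attached.
        return ("deliver", dest_net)
    if dest_net in routes:
        # Case 2: hand the datagram to the next router on the path.
        return ("relay", routes[dest_net])
    # Case 3: destination unknown; an error is reported to the source.
    return ("error", "destination unknown")

# A router attached to LAN 1 and the WAN that reaches LAN 2 via router Y.
attached = {"LAN1", "WAN"}
routes = {"LAN2": "router Y"}
print(forward("LAN2", attached, routes))
```

In cases 1 and 2 the returned target would then be translated into a lower-layer address on the outgoing subnetwork.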
In this example, the data must pass through router Y before reaching the destination.
Router X, then, constructs a new packet by appending an X.25 header,
containing the address of router Y, to the IP data unit. When this packet arrives at
router Y, the packet header is stripped off. The router determines that this IP data
unit is destined for B, which is connected directly to a network to which this router
is attached. The router therefore creates a frame with a destination address of B and
sends it out onto LAN 2. The data finally arrive at B, where the LAN and IP headers
can be stripped off.
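The wrapping and unwrapping in this example can be traced with a toy model in which each header is a text tag rather than a real frame format. The helper names `wrap` and `unwrap` and the bracketed header strings are assumptions made for illustration, and the MAC trailer is omitted for brevity.

```python
def wrap(header: str, payload: str) -> str:
    """Prepend a symbolic lower-layer header to a payload."""
    return f"[{header}]{payload}"

def unwrap(header: str, packet: str) -> str:
    """Strip the expected outermost header, failing if it is absent."""
    prefix = f"[{header}]"
    if not packet.startswith(prefix):
        raise ValueError(f"expected {header} header")
    return packet[len(prefix):]

# IP in host A adds its header to the TCP segment.
datagram = wrap("IP dst=B", wrap("TCP", "data"))

# Host A to router X over LAN 1.
frame1 = wrap("MAC dst=X", wrap("LLC", datagram))
at_x = unwrap("LLC", unwrap("MAC dst=X", frame1))

# Router X to router Y over the X.25 WAN.
packet = wrap("X.25 dst=Y", at_x)
at_y = unwrap("X.25 dst=Y", packet)

# Router Y to host B over LAN 2.
frame2 = wrap("MAC dst=B", wrap("LLC", at_y))
at_b = unwrap("LLC", unwrap("MAC dst=B", frame2))

# The IP datagram is unchanged end to end; only its wrapping differs per hop.
assert at_x == at_y == at_b == datagram
```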
At each router, before the data can be forwarded, the router may need to segment
the data unit to accommodate a smaller maximum packet-size limitation on
the outgoing network. The data unit is split into two or more segments, each of
which becomes an independent IP data unit. Each new data unit is wrapped in a
lower-layer packet and queued for transmission. The router may also limit the
length of its queue for each network to which it attaches so as to avoid having a slow
network penalize a faster one. Once the queue limit is reached, additional data units
are simply dropped.
The process described above continues through as many routers as it takes for
the data unit to reach its destination. As with a router, the destination end system
recovers the IP data unit from its network wrapping. If segmentation has occurred,
the IP module in the destination end system buffers the incoming data until the
entire original data field can be reassembled. This block of data is then passed to a
higher layer in the end system.
This service offered by the internet protocol is an unreliable one. That is, the
internet protocol does not guarantee that all data will be delivered or that the data
that are delivered will arrive in the proper order. It is the responsibility of the next
higher layer (e.g., TCP) to recover from any errors that occur. This approach provides
for a great deal of flexibility.
With the internet protocol approach, each unit of data is passed from router
to router in an attempt to get from source to destination. Because delivery is not
guaranteed, there is no particular reliability requirement on any of the subnetworks;
thus, the protocol will work with any combination of subnetwork types. And,
since the sequence of delivery is not guaranteed, successive data units can follow
different paths through the internet; this allows the protocol to react to both congestion
and failure in the internet by changing routes.
Design Issues
With that brief sketch of the operation of an IP-controlled internet, we can now go
back and examine some design issues in greater detail:
Routing
Datagram lifetime
Segmentation and reassembly
Error control
Flow control
As we proceed with this discussion, the reader will note many similarities
between the design issues and techniques relevant to packet-switched networks. To
see the reason for these parallels, consider Figure 16.5, which compares an internet
architecture with a packet-switched network architecture. The routers (G1, G2, G3)
in the internet correspond to the packet-switched nodes (P1, P2, P3) in the network,
and the networks (N1, N2, N3) in the internet correspond to the transmission links
(T1, T2, T3) in the network. The routers perform essentially the same functions as
packet-switched nodes, and use the intervening networks in a manner analogous to
transmission links.
Routing
Routing is generally accomplished by maintaining a routing table in each end system
and router that gives, for each possible destination network, the next router to
which the internet datagram should be sent.
The routing table may be static or dynamic. A static table, however, could
contain alternate routes if a router is unavailable. A dynamic table is more flexible
in responding to both error and congestion conditions. In the Internet, for example,
when a router goes down, all of its neighbors will send out a status report, allowing
other routers and stations to update their routing tables. A similar scheme can be
used to control congestion; this is a particularly important function because of the
mismatch in capacity between local and wide-area networks. Section 16.4 discusses
routing protocols.
Routing tables may also be used to support other internetworking services,
such as those governing security and priority. For example, individual networks
might be classified to handle data up to a given security classification. The routing
mechanism must assure that data of a given security level are not allowed to pass
through networks not cleared to handle such data.
Another routing technique is source routing. The source station specifies the
route by including a sequential list of routers in the datagram. This, again, could be
useful for security or priority requirements.
Finally, we mention a service related to routing: route recording. To record a
route, each router appends its internet address to a list of addresses in the datagram.
This feature is useful for testing and debugging purposes.
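Source routing and route recording can be pictured together with a small model. The sketch below is an assumption-laden illustration, not the IP option encoding: the `Datagram` class, its field names, and `process_at_router` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Datagram:
    dest: str
    source_route: List[str] = field(default_factory=list)    # routers still to visit
    recorded_route: List[str] = field(default_factory=list)  # routers already visited

def process_at_router(dgram: Datagram, router_addr: str) -> str:
    """Record this router, then return the next hop for the datagram."""
    # Route recording: append this router's address to the list.
    dgram.recorded_route.append(router_addr)
    # Source routing: consume this router from the head of the list.
    if dgram.source_route and dgram.source_route[0] == router_addr:
        dgram.source_route.pop(0)
    # Next hop is the next listed router, or the destination once the list is empty.
    return dgram.source_route[0] if dgram.source_route else dgram.dest

d = Datagram(dest="B", source_route=["X", "Y"])
print(process_at_router(d, "X"), process_at_router(d, "Y"), d.recorded_route)
```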
Datagram Lifetime
If dynamic or alternate routing is used, the potential exists for a datagram to loop
indefinitely through the internet. This is undesirable for two reasons. First, an endlessly
circulating datagram consumes resources. Second, we will see in Lesson 17
that a transport protocol may depend on there being an upper bound on datagram
lifetime. To avoid these problems, each datagram can be marked with a lifetime.
Once the lifetime expires, the datagram is discarded.
A simple way to implement lifetime is to use a hop count. Each time that a
datagram passes through a router, the count is decremented. Alternatively, the lifetime
could be a true measure of time; this requires that the routers somehow
know how long it has been since the datagram or fragment last crossed a router, in
order to know by how much to decrement the lifetime field. This would seem to
require some global clocking mechanism. The advantage of using a true time measure
is that it can be used in the reassembly algorithm, described next.
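The hop-count form of lifetime can be illustrated with a datagram caught in a routing loop. The sketch assumes the datagram is discarded by the router at which the count reaches zero; the function name and the three-router loop are hypothetical.

```python
from itertools import cycle
from typing import List, Sequence, Tuple

def route_until_expired(lifetime: int,
                        loop: Sequence[str] = ("X", "Y", "Z")) -> Tuple[List[str], str]:
    """Follow a routing loop, decrementing the hop count at each router.

    Returns the routers that forwarded the datagram and the router that
    discarded it.
    """
    forwarded: List[str] = []
    for router in cycle(loop):
        lifetime -= 1
        if lifetime <= 0:
            # Lifetime expired here: discard instead of forwarding.
            return forwarded, router
        forwarded.append(router)
    raise AssertionError("unreachable: cycle() never ends")

print(route_until_expired(4))
```

However long the loop persists, the datagram consumes resources for at most as many hops as its initial lifetime.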
Segmentation and Reassembly
Individual subnetworks within an internet may specify different maximum packet
sizes. It would be inefficient and unwieldy to try to dictate uniform packet size
across networks. Thus, routers may need to segment incoming datagrams into
smaller pieces, called fragments, before transmitting on to the next subnetwork.
If datagrams can be segmented (perhaps more than once) in the course of
their travels, the question arises as to where they should be reassembled. The easiest
solution is to have reassembly performed at the destination only. The principal
disadvantage of this approach is that fragments can only get smaller as data move
through the internet; this may impair the efficiency of some networks. However, if
intermediate router reassembly is allowed, the following disadvantages result:
1. Large buffers are required at routers, and there is the risk that all of the buffer
space will be used up in storing partial datagrams.
2. All fragments of a datagram must pass through the same router, thereby
inhibiting the use of dynamic routing.
In IP, datagram fragments are reassembled at the destination end system. The
IP segmentation technique uses the following fields in the IP header:
Data Unit Identifier (ID)
Data Length
Offset
More-flag
The ID is a means of uniquely identifying an end-system-originated datagram.
In IP, it consists of the source and destination addresses, an identifier of the protocol
layer that generated the data (e.g., TCP), and a sequence number supplied by
that protocol layer. Data length indicates the length of the user data field in octets,
and the offset is the position of a fragment of user data in the data field of the original
datagram, in multiples of 64 bits.
The source end system creates a datagram with a data length equal to the
entire length of the data field, with Offset = 0, and a more-flag set to 0 (false). To
segment a long datagram, an IP module in a router performs the following tasks:
1. Create two new datagrams and copy the header fields of the incoming datagram
into both.
2. Divide the incoming user data field into two approximately equal portions
along a 64-bit boundary, placing one portion in each new datagram. The first
portion must be a multiple of 64 bits.
3. Set the Data Length of the first new datagram to the length of the inserted
data, and set more-flag to 1 (true). The Offset field is unchanged.
4. Set the data length of the second new datagram to the length of the inserted
data, and add the length of the first data portion divided by 8 to the Offset
field. The more-flag remains the same.
Table 16.2 gives an example. The procedure can easily be generalized to an
n-way split.
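The field updates in this procedure can be sketched as follows. Rather than the two-way, roughly equal split described above, this sketch uses the generalized form and cuts the data into pieces no larger than an assumed outgoing limit `max_data`. The `DataUnit` class and the `segment` function are hypothetical names, and the numbers in the usage lines are arbitrary rather than taken from Table 16.2.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class DataUnit:
    ident: int    # data unit identifier, copied unchanged into every fragment
    offset: int   # position in the original data field, in 8-octet (64-bit) units
    more: bool    # more-flag: True if further fragments follow this one
    data: bytes   # user data; the Data Length field is len(data)

def segment(unit: DataUnit, max_data: int) -> List[DataUnit]:
    """Split unit so that no piece carries more than max_data octets."""
    if len(unit.data) <= max_data:
        return [unit]
    chunk = (max_data // 8) * 8  # non-final pieces must be a multiple of 64 bits
    if chunk == 0:
        raise ValueError("max_data must allow at least 8 octets of data")
    pieces = []
    for start in range(0, len(unit.data), chunk):
        is_last = start + chunk >= len(unit.data)
        pieces.append(DataUnit(
            ident=unit.ident,
            # Offset advances by the preceding data length divided by 8.
            offset=unit.offset + start // 8,
            # Only the final piece keeps the incoming more-flag; the rest set it.
            more=unit.more if is_last else True,
            data=unit.data[start:start + chunk],
        ))
    return pieces

original = DataUnit(ident=1, offset=0, more=False, data=bytes(100))
for piece in segment(original, max_data=40):
    print(piece.offset, piece.more, len(piece.data))
```

Because the last piece inherits the incoming more-flag, a fragment can itself be segmented again at a later router without losing track of where the original datagram ends.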
To reassemble a datagram, there must be sufficient buffer space at the
reassembly point. As fragments with the same ID arrive, their data fields are
inserted in the proper position in the buffer until the entire data field is reassembled,
which is achieved when a contiguous set of data exists, starting with an offset
of zero and ending with data from a fragment with a false more-flag.
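The completion test just described can be sketched in a few lines. This is a simplified illustration that assumes all fragments share one ID and that no two fragments overlap; the function name `reassemble` and the tuple layout are hypothetical.

```python
from typing import Iterable, Optional, Tuple

def reassemble(fragments: Iterable[Tuple[int, bool, bytes]]) -> Optional[bytes]:
    """Rebuild a data field from (offset, more_flag, data) fragments.

    Offsets are in 8-octet units. Returns the complete data field, or None
    while pieces are still missing.
    """
    by_position = {offset * 8: (more, data) for offset, more, data in fragments}
    buffer = bytearray()
    position = 0  # next octet position expected, starting from offset zero
    while position in by_position:
        more, data = by_position[position]
        buffer += data
        if not more:
            # Contiguous from offset zero up to a fragment with a false more-flag.
            return bytes(buffer)
        if not data:
            return None  # malformed: an empty non-final fragment
        position += len(data)
    return None  # a gap remains; keep buffering

arrived = [(5, True, b"b" * 40), (0, True, b"a" * 40), (10, False, b"c" * 20)]
print(len(reassemble(arrived)))
```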
One eventuality that must be dealt with is that one or more of the fragments
may not get through; the IP service does not guarantee delivery. Some means is
needed to decide whether to abandon a reassembly effort to free up buffer space.
Two approaches are commonly used. First, one can assign a reassembly lifetime to
the first fragment to arrive. This is a local, real-time clock assigned by the reassembly
function and decremented while the fragments of the original datagram are
being buffered. If the time expires prior to complete reassembly, the received fragments
are discarded. A second approach is to make use of the datagram lifetime,
which is part of the header of each incoming fragment. The lifetime field continues
to be decremented by the reassembly function; as with the first approach, if the lifetime
expires prior to complete reassembly, the received fragments are discarded.
Error Control
The internetwork facility does not guarantee successful delivery of every datagram.
When a datagram is discarded by a router, the router should attempt to return some
information to the source, if possible. The source internet protocol entity may use
this information to modify its transmission strategy, and it may notify higher layers.
To report that a specific datagram has been discarded, some means of datagram
identification is needed.
Datagrams may be discarded for a number of reasons, including lifetime expiration,
congestion, and FCS error. In the last case, notification is not possible, as
the source address field may have been damaged.
Flow Control
Internet flow control allows routers and/or receiving stations to limit the rate at
which they receive data. For the connectionless type of service we are describing,
flow control mechanisms are limited. The best approach would seem to be to send
flow control packets, requesting reduced data flow, to other routers and source
stations.
THE INTERNET PROTOCOL
The Internet Protocol (IP) is part of the TCP/IP protocol suite, and is the most
widely used internetworking protocol. It is functionally similar to the ISO standard
connectionless network protocol (CLNP). As with any protocol standard, IP is
specified in two parts:
The interface with a higher layer (e.g., TCP), specifying the services that IP
provides
The actual protocol format and mechanisms
In this section, we first examine IP services and then the IP protocol. This is followed
by a discussion of IP address formats. Finally, the Internet Control Message
Protocol (ICMP), which is an integral part of IP, is described.
IP Services
IP provides two service primitives at the interface to the next-higher layer (Figure
16.6). The Send primitive is used to request transmission of a data unit. The Deliver
primitive is used by IP to notify a user of the arrival of a data unit. The parameters
associated with the two primitives are
Source address. Internetwork address of sending IP entity.
Destination address. Internetwork address of destination IP entity.
Protocol. Recipient protocol entity (an IP user).
Type of service indicators. Used to specify the treatment of the data unit in its
transmission through component networks.
Identifier. Used in combination with the source and destination addresses and
user protocol to identify the data unit uniquely. This parameter is needed for
reassembly and error reporting.
Don't-fragment identifier. Indicates whether IP can segment (called fragment
in the standard) data to accomplish delivery.
Time to live. Measured in network hops.
Data length. Length of data being transmitted.
Option data. Options requested by the IP user.
Data. User data to be transmitted.
Note that the identifier, don't-fragment identifier, and time-to-live parameters
are present in the Send primitive but not in the Deliver primitive. These three parameters
provide instructions to IP that are not of concern to the recipient IP user.
The sending IP user includes the type-of-service parameter to request a particular
quality of service. The user may specify one or more of the services listed in
Table 16.3. This parameter can be used to guide routing decisions. For example, if
a router has several alternative choices for the next hop in routing a datagram, it
may choose a network of a higher data rate if the high throughput option has been
selected. This parameter, if possible, is also passed down to the network access protocol
for use over individual networks. For example, if a precedence level is
selected, and if the subnetwork supports precedence or priority levels, the precedence
level will be mapped onto the network level for this hop.
The options parameter allows for future extensibility and for inclusion of
parameters that are usually not invoked. The currently defined options are
Security. Allows a security label to be attached to a datagram.
Source routing. A sequenced list of router addresses that specifies the route
to be followed. Routing may be strict (only identified routers may be visited)
or loose (other intermediate routers may be visited).
Route recording. A field is allocated to record the sequence of routers visited
by the datagram.
Stream identification. Names reserved resources used for stream service. This
service provides special handling for volatile periodic traffic (e.g., voice).
Timestamping. The source IP entity and some or all intermediate routers add
a timestamp (precision to milliseconds) to the data unit as it goes by.
IP Protocol
The protocol between IP entities is best described with reference to the IP datagram
format, shown in Figure 16.7. The fields are
Version (4 bits). Indicates the version number, to allow evolution of the protocol.
Internet header length (IHL) (4 bits). Length of header in 32-bit words. The
minimum value is five, for a minimum header length of 20 octets.
Type of service (8 bits). Specifies reliability, precedence, delay, and throughput
parameters.
Total length (16 bits). Total datagram length, in octets.
Identifier (16 bits). A sequence number that, together with the source address,
destination address, and user protocol, is intended to uniquely identify a datagram.
Thus, the identifier should be unique for the datagram's source address,
destination address, and user protocol for the time during which the datagram
will remain in the internet.
Flags (3 bits). Only two of the bits are currently defined. The More bit is used
for segmentation (fragmentation) and reassembly, as previously explained.
The Don't-Fragment bit prohibits fragmentation when set. This bit may be
useful if it is known that the destination does not have the capability to
reassemble fragments. However, if this bit is set, the datagram will be discarded
if it exceeds the maximum size of an en route subnetwork. Therefore,
if the bit is set, it may be advisable to use source routing to avoid subnetworks
with small maximum packet size.
Fragment offset (13 bits). Indicates where in the original datagram this fragment
belongs, measured in 64-bit units, implying that fragments other than
the last fragment must contain a data field that is a multiple of 64 bits.
Time to live (8 bits). Measured in router hops.
Protocol (8 bits). Indicates the next higher level protocol that is to receive the
data field at the destination.
Header checksum (16 bits). An error-detecting code applied to the header
only. Because some header fields may change during transit (e.g., time to live,
segmentation-related fields), this is reverified and recomputed at each router.
The checksum field is the 16-bit one's complement of the one's complement sum of
all 16-bit words in the header. For purposes of computation, the checksum field is
itself initialized to a value of zero.
Source address (32 bits). Coded to allow a variable allocation of bits to specify
the network and the end system attached to the specified network (7 and
24 bits, 14 and 16 bits, or 21 and 8 bits).
Destination address (32 bits). As above.
Options (variable). Encodes the options requested by the sending user.
Padding (variable). Used to ensure that the datagram header is a multiple of
32 bits.
Data (variable). The data field must be an integer multiple of 8 bits. The maximum
length of the datagram (data field plus header) is 65,535 octets.
It should be clear how the IP services specified in the Send and Deliver primitives
map into the fields of the IP datagram.
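As a rough illustration of the field layout listed above, the following Python sketch decodes the fixed 20-octet part of a header. The names IPv4Header and parse_ipv4_header and the sample octets are invented for this example, and options are ignored.

```python
import struct
from typing import NamedTuple


class IPv4Header(NamedTuple):
    version: int
    ihl: int              # header length in 32-bit words
    tos: int
    total_length: int     # octets, header plus data
    identifier: int
    dont_fragment: bool
    more: bool
    fragment_offset: int  # in 64-bit (8-octet) units
    ttl: int
    protocol: int
    checksum: int
    source: str
    destination: str


def parse_ipv4_header(raw: bytes) -> IPv4Header:
    """Decode the fixed 20-octet part of an IP header (options are ignored)."""
    if len(raw) < 20:
        raise ValueError("need at least 20 octets")
    (ver_ihl, tos, total_length, identifier, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return IPv4Header(
        version=ver_ihl >> 4,                 # upper 4 bits
        ihl=ver_ihl & 0x0F,                   # lower 4 bits
        tos=tos,
        total_length=total_length,
        identifier=identifier,
        dont_fragment=bool(flags_frag & 0x4000),
        more=bool(flags_frag & 0x2000),
        fragment_offset=flags_frag & 0x1FFF,  # low 13 bits
        ttl=ttl,
        protocol=protocol,
        checksum=checksum,
        source=".".join(str(b) for b in src),
        destination=".".join(str(b) for b in dst),
    )


# Hypothetical sample: version 4, IHL 5, 40 octets total, More bit set,
# offset 0, time to live 64, protocol 6, checksum left as zero.
sample = bytes.fromhex("45000028 1c462000 40060000 c0a80001 c0a800c7")
header = parse_ipv4_header(sample)
print(header)
```

Multiplying header.ihl by 4 gives the header length in octets (20 here), and multiplying header.fragment_offset by 8 gives the fragment's position in octets within the original datagram.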
IP Addresses
The source and destination address fields in the IP header each contain a 32-bit
global internet address, generally consisting of a network identifier and a host identifier.
The address is coded to allow a variable allocation of bits to specify network
and host, as depicted in Figure 16.8. This encoding provides flexibility in assigning
addresses to hosts and allows a mix of network sizes on an internet. In particular,
the three network classes are best suited to the following conditions:
Class A. Few networks, each with many hosts.
Class B. Medium number of networks, each with a medium number of hosts.
Class C. Many networks, each with a few hosts.
In a particular environment, it may be best to use addresses all from one class.
For example, a corporate internetwork that consists of a large number of departmental
local area networks may need to use class C addresses exclusively. However,
the format of the addresses is such that it is possible to mix all three classes of
addresses on the same internetwork; this is what is done in the case of the Internet
itself. A mixture of classes is appropriate for an internetwork consisting of a few
large networks, many small networks, plus some medium-sized networks.
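The three formats can be told apart by their leading bits: 0 for class A, 10 for class B, and 110 for class C. The sketch below splits an address into its network and host parts on that basis, using the 7/24, 14/16, and 21/8 bit allocations given earlier. The function name classify and the sample addresses are illustrative only.

```python
def classify(address: str) -> tuple[str, int, int]:
    """Return (class, network number, host number) for a dotted-decimal address."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dotted-decimal address: {address!r}")
    value = int.from_bytes(bytes(octets), "big")
    if value >> 31 == 0b0:       # leading 0: 7-bit network, 24-bit host
        return "A", (value >> 24) & 0x7F, value & 0xFFFFFF
    if value >> 30 == 0b10:      # leading 10: 14-bit network, 16-bit host
        return "B", (value >> 16) & 0x3FFF, value & 0xFFFF
    if value >> 29 == 0b110:     # leading 110: 21-bit network, 8-bit host
        return "C", (value >> 8) & 0x1FFFFF, value & 0xFF
    raise ValueError("address is outside classes A, B, and C")


for sample in ("10.1.2.3", "172.16.5.4", "192.168.1.7"):
    print(sample, classify(sample))
```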
The Internet Control Message Protocol (ICMP)
The IP standard specifies that a compliant implementation must also implement
ICMP (RFC 792). ICMP provides a means for transferring messages from routers
and other hosts to a host. In essence, ICMP provides feedback about problems in
the communication environment. It is used, for example, when a datagram cannot
reach its destination, when the router does not have the buffering capacity to forward
a datagram, and when the router can direct the station to send traffic on a
shorter route. In most cases, an ICMP message is sent in response to a datagram,
either by a router along the datagram's path, or by the intended destination host.
Although ICMP is, in effect, at the same level as IP in the TCP/IP architecture,
it is a user of IP. An ICMP message is constructed and then passed down to
IP, which encapsulates the message with an IP header and then transmits the resulting
datagram in the usual fashion. Because ICMP messages are transmitted in
IP datagrams, their delivery is not guaranteed and their use cannot be considered
reliable.
Figure 16.9 shows the format of the various ICMP message types. All ICMP
messages start with a 64-bit header consisting of the following:
Type (8 bits). Specifies the type of ICMP message.
Code (8 bits). Used to specify parameters of the message that can be encoded
in one or a few bits.
Checksum (16 bits). Checksum of the entire ICMP message. This is the same
checksum algorithm used for IP.
Parameters (32 bits). Used to specify more lengthy parameters.
These fields are generally followed by additional information fields that further
specify the content of the message.
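A minimal sketch of the 64-bit header and its checksum follows, using an echo message as the example. It assumes the RFC 792 values for echo (type 8, code 0) and assumes that the 32-bit parameters field carries a 16-bit identifier followed by a 16-bit sequence number. The function names are invented for illustration.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of all 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                            # pad odd-length input
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return ~total & 0xFFFF


def build_echo(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo message: type, code, checksum, parameters, data."""
    # The checksum is computed with the checksum field itself set to zero.
    unchecked = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence) + payload
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


message = build_echo(identifier=0x1234, sequence=1, payload=b"ping")
print(message.hex())
```

A receiver can verify the message by running internet_checksum over all of it, checksum field included. A result of zero indicates that no error was detected.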
In those cases in which the ICMP message refers to a prior datagram, the
information field includes the entire IP header plus the first 64 bits of the data field
of the original datagram. This enables the source host to match the incoming ICMP
message with the prior datagram. The reason for including the first 64 bits of the
data field is that this will enable the IP module in the host to determine which
upper-level protocol or protocols were involved. In particular, the first 64 bits would
include a portion of the TCP header or other transport-level header.
ICMP messages include the following:
Destination unreachable
Time exceeded
Parameter problem
Source quench
Redirect
Echo
Echo reply
Timestamp
Timestamp reply
Address mask request
Address mask reply
The destination-unreachable message covers a number of contingencies. A
router may return this message if it does not know how to reach the destination network.
In some networks, an attached router may be able to determine if a particular
host is unreachable, and then return the message. The destination host itself may
return this message if the user protocol or some higher-level service access point is
unreachable. This could happen if the corresponding field in the IP header was set
incorrectly. If the datagram specifies a source route that is unusable, a message is
returned. Finally, if a router must fragment a datagram but the Don't-Fragment flag
is set, a message is returned.
A router will return a time-exceeded message if the lifetime of the datagram
expires. A host will send this message if it cannot complete reassembly within a time
limit.
A syntactic or semantic error in an IP header will cause a parameter-problem
message to be returned by a router or host. For example, an incorrect argument
may be provided with an option. The parameter field contains a pointer to the octet
in the original header where the error was detected.
The source-quench message provides a rudimentary form of flow control.
Either a router or a destination host may send this message to a source host, requesting
that it reduce the rate at which it is sending traffic to the internet destination.
On receipt of a source-quench message, the source host should cut back the
rate at which it is sending traffic to the specified destination until it no longer
receives source-quench messages. The source-quench message can be used by a router or host that
must discard datagrams because of a full buffer. In this case, the router or host will
issue a source-quench message for every datagram that it discards. In addition, a
system may anticipate congestion and issue such messages when its buffers
approach capacity. In that case, the datagram referred to in the source-quench message
may well be delivered. Thus, receipt of the message does not imply delivery or
nondelivery of the corresponding datagram.
A router sends a redirect message to a host on a directly connected network to
advise the host of a better route to a particular destination; the following is an
example of its use. A router, R1, receives a datagram from a host on a network to
which the router is attached. The router, R1, checks its routing table and obtains the
address for the next router, R2, on the route to the datagram's internet destination
network, X. If R2 and the host identified by the internet source address of the datagram
are on the same network, a redirect message is sent to the host. The redirect
message advises the host to send its traffic for network X directly to router R2, as
this is a shorter path to the destination. The router forwards the original datagram
to its internet destination (via R2). The address of R2 is contained in the parameter
field of the redirect message.
The echo and echo-reply messages provide a mechanism for testing that communication
is possible between entities. The recipient of an echo message is obligated
to return the message in an echo-reply message. An identifier and sequence
number are associated with the echo message to be matched in the echo-reply message.
The identifier might be used like a service access point to identify a particular
session, and the sequence number might be incremented on each echo request sent.
The timestamp and timestamp-reply messages provide a mechanism for sampling
the delay characteristics of the internet. The sender of a timestamp message
may include an identifier and sequence number in the parameters field and include
the time that the message is sent (originate timestamp). The receiver records the
time it received the message and the time that it transmits the reply message in the
timestamp-reply message. If the timestamp message is sent using strict source routing,
then the delay characteristics of a particular route can be measured.
The address-mask-request and address-mask-reply messages are useful in an
environment that includes what are referred to as subnets. The concept of the subnet
was introduced to address the following requirement. Consider an internet that
includes one or more WANs and a number of sites, each of which has a number of
LANs. We would like to allow arbitrary complexity of interconnected LAN structures
within an organization, while insulating the overall internet against explosive
growth in network numbers and routing complexity. One approach to this problem
is to assign a single network number to all of the LANs at a site. From the point of
view of the rest of the internet, there is a single network at that site, which simplifies
addressing and routing. To allow the routers within the site to function properly,
each LAN is assigned a subnet number. The host portion of the internet
address is partitioned into a subnet number and a host number to accommodate this
new level of addressing.
Within the subnetted network, the local routers must route on the basis of an
extended network number consisting of the network portion of the IP address and
the subnet number. The bit positions containing this extended network number are
indicated by the address mask. The address-mask request and reply messages allow
a host to learn the address mask for the LAN to which it connects. The host broadcasts
an address-mask request message on the LAN. The router on the LAN
responds with an address-mask reply message that contains the address mask. The
use of the address mask allows the host to determine whether an outgoing datagram
is destined for a host on the same LAN (send directly) or another LAN (send datagram
to router). It is assumed that some other means (e.g., manual configuration)
is used to create address masks and to make them known to the local routers.
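The direct-or-router decision amounts to a bitwise comparison, sketched below. The function name next_step and the sample addresses and mask are hypothetical. The standard-library ipaddress module is used only to convert dotted-decimal strings to integers.

```python
import ipaddress


def next_step(source: str, destination: str, mask: str) -> str:
    """Decide whether to deliver directly or hand the datagram to a router."""
    src = int(ipaddress.IPv4Address(source))
    dst = int(ipaddress.IPv4Address(destination))
    m = int(ipaddress.IPv4Address(mask))
    # The extended network number is the part of the address the mask selects.
    return "direct" if (src & m) == (dst & m) else "router"


print(next_step("128.10.2.3", "128.10.2.70", "255.255.255.0"))  # same LAN
print(next_step("128.10.2.3", "128.10.5.9", "255.255.255.0"))   # another LAN
```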
ROUTING PROTOCOLS
The routers in an internet are responsible for receiving and forwarding packets
through the interconnected set of subnetworks. Each router makes routing decisions
based on knowledge of the topology and on the conditions of the internet. In
a simple internet, a fixed routing scheme is possible. In more complex internets, a
degree of dynamic cooperation is needed among the routers. In particular, the
router must avoid portions of the network that have failed and should avoid portions
of the network that are congested. In order to make such dynamic routing
decisions, routers exchange routing information using a special routing protocol for
that purpose. Information is needed about the status of the internet, in terms of
which networks can be reached by which routes, and in terms of the delay characteristics
of various routes.
In considering the routing function of routers, it is important to distinguish
two concepts:
Routing information. Information about the topology and delays of the internet.
Routing algorithm. The algorithm used to make a routing decision for a particular
datagram, based on current routing information.
Another way to partition the problem is useful from two points of view:
allocating routing functions properly and standardizing them effectively. This is
to partition the routing function into
Routing between end systems (ESs) and routers
Routing between routers
The reason for the partition is that there are fundamental differences between
what an ES must know to route a packet and what a router must know. An ES
must first know whether the destination ES is on the same subnet.
If the answer is yes, then data can be delivered directly using the subnetwork
access protocol; otherwise, the ES must forward the data to a router attached to the
same subnetwork. If there is more than one such router, it is simply a matter of
choosing one. The router forwards datagrams on behalf of other systems and needs
to have some idea of the overall topology of the network in order to make a global
routing decision.
In this section, we look at an example of an ES-to-router and router-to-router
protocol.
Autonomous Systems
In order to proceed in our discussion of router-router protocols, we need to introduce
the concept of an autonomous system. An autonomous system is an internet
connected by homogeneous routers; generally, the routers are under the administrative
control of a single entity. An interior router protocol (IRP) passes routing
information between routers within an autonomous system. The protocol used
within the autonomous system does not need to be implemented outside of the system.
This flexibility allows IRPs to be custom-tailored to specific applications and
requirements.
It may happen, however, that an internet will be constructed of more than one
autonomous system. For example, all of the LANs at a site, such as an office complex
or campus, could be linked by routers to form an autonomous system. This system
might be linked through a wide-area network to other autonomous systems.
The situation is illustrated in Figure 16.10. In this case, the routing algorithms and
routing tables used by routers in different autonomous systems may differ.
Nevertheless, the routers in one autonomous system need at least a minimal
level of information concerning networks that can be reached outside the system.
The protocol used to pass routing information between routers in different
autonomous systems is referred to as an exterior router protocol (ERP).
We can expect that an ERP will need to pass less information and be simpler
than an IRP, for the following reason. If a datagram is to be transferred from a host
in one autonomous system to a host in another autonomous system, a router in the
first system need only determine the target autonomous system and devise a route
to get into that system. Once the datagram enters the target autonomous system,
the routers there can cooperate to finally deliver the datagram.
In the remainder of this section, we look at what are perhaps the most important
examples of these two types of routing protocols.
Border Gateway Protocol
The Border Gateway Protocol (BGP) was developed for use in conjunction with
internets that employ the TCP/IP protocol suite, although the concepts are applicable
to any internet. BGP has become the standardized exterior router protocol for
the Internet.
Functions
BGP was designed to allow routers, called gateways in the standard, in different
autonomous systems (ASs) to cooperate in the exchange of routing information.
The protocol operates in terms of messages, which are sent over TCP connections.
The repertoire of messages is summarized in Table 16.4.
Three functional procedures are involved in BGP:
Neighbor acquisition
Neighbor reachability
Network reachability
Two routers are considered to be neighbors if they are attached to the same
subnetwork. If the two routers are in different autonomous systems, they may wish
to exchange routing information. For this purpose, it is necessary to first perform
neighbor acquisition. In essence, neighbor acquisition occurs when two neighboring routers
in different autonomous systems agree to regularly exchange routing information.
A formal acquisition procedure is needed because one of the routers may not wish
to participate. For example, the router may be overburdened and may not want
to be responsible for traffic coming in from outside the system. In the neighbor-acquisition
process, one router sends a request message to the other, which may
either accept or refuse the offer. The protocol does not address the issue of how one
router knows the address, or even the existence of, another router, nor how it
decides that it needs to exchange routing information with that particular router.
These issues must be addressed at configuration time or by active intervention of a
network manager.
To perform neighbor acquisition, one router sends an Open message to
another. If the target router accepts the request, it returns a Keepalive message in
response.
Once a neighbor relationship is established, the neighbor-reachability procedure
is used to maintain the relationship. Each partner needs to be assured that the
other partner still exists and is still engaged in the neighbor relationship. For this
purpose, the two routers periodically issue Keepalive messages to each other.
The final procedure specified by BGP is network reachability. Each router
maintains a database of the subnetworks that it can reach and the preferred route
for reaching that subnetwork. Whenever a change is made to this database, the
router issues an Update message that is broadcast to all other routers implementing
BGP. By the broadcasting of these Update messages, all of the BGP routers can
build up and maintain routing information.
BGP Messages
Figure 16.11 illustrates the formats of all of the BGP messages. Each message
begins with a 19-octet header containing three fields, as indicated by the shaded
portion of each message in the figure:
Marker. Reserved for authentication. The sender may insert a value in this
field that would be used as part of an authentication mechanism to enable the
recipient to verify the identity of the sender.
Length. Length of message in octets.
Type. Type of message: Open, Update, Notification, Keepalive.
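The 19-octet header can be sketched with Python's struct module as follows. The field widths (16-octet Marker, 2-octet Length, 1-octet Type), the numeric type codes 1 through 4, and the all-ones marker used when no authentication is in effect are taken from the BGP-4 specification rather than from the text above, so treat them as assumptions of this example. The helper names are invented.

```python
import struct

MESSAGE_TYPES = {1: "Open", 2: "Update", 3: "Notification", 4: "Keepalive"}
HEADER = struct.Struct("!16sHB")  # Marker (16 octets), Length (2), Type (1)


def pack_message(msg_type: int, body: bytes = b"",
                 marker: bytes = b"\xff" * 16) -> bytes:
    """Prefix a message body with the common header."""
    if msg_type not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type {msg_type}")
    # Length covers the whole message, header included.
    return HEADER.pack(marker, HEADER.size + len(body), msg_type) + body


def unpack_header(raw: bytes) -> tuple[bytes, int, str]:
    """Return (marker, length, type name) after a basic length check."""
    marker, length, msg_type = HEADER.unpack_from(raw)
    if length != len(raw):
        raise ValueError("Length field does not match the message size")
    return marker, length, MESSAGE_TYPES[msg_type]


keepalive = pack_message(4)  # a Keepalive is the header alone
print(len(keepalive), unpack_header(keepalive)[2])
```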
To acquire a neighbor, a router first opens a TCP connection to the neighbor
router of interest. It then sends an Open message. This message identifies the AS to
which the sender belongs and provides the IP address of the router. It also includes
a Hold Time parameter, which indicates the number of seconds that the sender proposes
for the value of the Hold Timer. If the recipient is prepared to open a neighbor
relationship, it calculates a value of Hold Timer that is the minimum of its Hold
Time and the Hold Time in the Open message. This calculated value is the maximum
number of seconds that may elapse between the receipt of successive
Keepalive and/or Update messages by the sender.
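The Hold Timer negotiation and expiry check can be sketched as below. The class and method names are hypothetical, times are passed in explicitly as seconds so the example is deterministic, and the choice of three Keepalives per hold period is an assumption of this sketch, not something the text specifies.

```python
class HoldTimer:
    """Tracks one BGP neighbor relationship; times are plain seconds."""

    def __init__(self, local_hold_time: float, proposed_hold_time: float,
                 now: float = 0.0):
        # The timer takes the smaller of the local value and the value
        # proposed in the Open message.
        self.limit = min(local_hold_time, proposed_hold_time)
        self.last_heard = now

    def message_received(self, now: float) -> None:
        """Call on every Keepalive or Update received from the neighbor."""
        self.last_heard = now

    def expired(self, now: float) -> bool:
        """True once more than `limit` seconds pass without a message."""
        return now - self.last_heard > self.limit

    def keepalive_interval(self) -> float:
        # Assumption: send three Keepalives per hold period.
        return self.limit / 3


timer = HoldTimer(local_hold_time=90, proposed_hold_time=180)
print(timer.limit, timer.keepalive_interval())
```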
The Keepalive message consists simply of the header. Each router issues these
messages to each of its peers often enough to prevent the Hold Timer from expiring.
The Update message communicates two types of information:
1. Information about a single route through the internet. This information is
available to be added to the database of any recipient router.
2. A list of routes previously advertised by this router that are being withdrawn.
An Update message may contain one or both types of information. Let us consider
the first type of information first. Information about a single route through the
network involves three fields: the Network Layer Reachability Information (NLRI)
field, the Total Path Attributes Length field, and the Path Attributes field. The
NLRI field consists of a list of identifiers of subnetworks that can be reached by this
route. Each subnetwork is identified by its IP address, which is actually a portion of
a full IP address. Recall that an IP address is a 32-bit quantity of the form {network,
end system}. The left-hand, or prefix portion of this quantity, identifies a particular
subnetwork.
The Path Attributes field contains a list of attributes that apply to this particular
route. The following are the defined attributes:
Origin. Indicates whether this information was generated by an interior router
protocol (e.g., OSPF) or an exterior router protocol (in particular, BGP).
AS-Path. A list of the ASs that are traversed for this route.
Next-Hop. The IP address of the border router that should be used as the
next hop to the destinations listed in the NLRI field.
Multi-Exit-Disc. Used to communicate some information about routes internal
to an AS. This is described later in this section.
Local-Pref. Used by a router to inform other routers within the same AS of
its degree of preference for a particular route. It has no significance to routers
in other ASs.
Atomic-Aggregate, Aggregator. These two fields implement the concept
of route aggregation. In essence, an internet and its corresponding address
space can be organized hierarchically, or as a tree. In this case, subnetwork
addresses are structured in two or more parts. All of the subnetworks of a
given subtree share a common partial internet address. Using this common
partial address, the amount of information that must be communicated in
NLRI can be significantly reduced.
The AS-Path attribute actually serves two purposes. Because it lists the ASs
that a datagram must traverse if it follows this route, the AS-Path information
enables a router to perform policy routing. That is, a router may decide to avoid a
particular path in order to avoid transiting a particular AS. For example, information
that is confidential may be limited to certain kinds of ASs. Or, a router may
have information about the performance or quality of the portion of the internet
that is included in an AS that leads the router to avoid that AS. Examples of performance
or quality metrics include link speed, capacity, tendency to become congested,
and overall quality of operation. Another criterion that could be used is
minimizing the number of transit ASs.
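One simple form of policy routing is sketched below: discard any candidate route whose AS-Path crosses an AS the router wants to avoid, then prefer the route with the fewest transit ASs. The function name, the AS identifiers, and the tie-breaking rule (first candidate wins) are assumptions made for the example. Real policies can weigh many other factors.

```python
from typing import Optional, Sequence


def select_route(candidates: Sequence[Sequence[str]],
                 avoid: set[str]) -> Optional[Sequence[str]]:
    """Pick the shortest AS-Path that crosses none of the ASs in `avoid`."""
    allowed = [path for path in candidates if avoid.isdisjoint(path)]
    # min() keeps the first of several equally short paths.
    return min(allowed, key=len, default=None)


# Hypothetical AS-Paths advertised for the same destination.
paths = [["AS2", "AS1"], ["AS4", "AS7", "AS1"], ["AS5", "AS1"]]
print(select_route(paths, avoid=set()))
print(select_route(paths, avoid={"AS2"}))
```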
The reader may wonder about the purpose of the Next-Hop attribute. The
requesting router will necessarily want to know which networks are reachable via
the responding router, but why provide information about other routers? This is
best explained with reference to Figure 16.10. In this example, router R1 in
autonomous system 1, and router R5 in autonomous system 2, implement BGP and
acquire a neighbor relationship. R1 issues Update messages to R5 indicating which
networks it could reach and the distances (network hops) involved. R1 also provides
the same information on behalf of R2. That is, R1 tells R5 what networks are
reachable via R2. In this example, R2 does not implement BGP. Typically, most of
the routers in an autonomous system will not implement BGP; only a few will be
assigned responsibility for communicating with routers in other autonomous systems.
A final point: R1 is in possession of the necessary information about R2, as
R1 and R2 share an interior router protocol (IRP).
The second type of update information is the withdrawal of one or more
routes. In each case, the route is identified by the IP address of the destination subnetwork.
Finally, the notification message is sent when an error condition is detected.
The following errors may be reported:
Message header error. Includes authentication and syntax errors.
Open message error. Includes syntax errors and options not recognized in an
Open message. This message can also be used to indicate that a proposed
Hold Time in an Open message is unacceptable.
Update message error. Includes syntax and validity errors in an Update
message.
Hold timer expired. If the sending router has not received successive
Keepalive and/or Update and/or Notification messages within the Hold Time
period, then this error is communicated and the connection is closed.
Finite state machine error. Includes any procedural error.
Cease. Used by a router to close a connection with another router in the
absence of any other error.
BGP Routing Information Exchange
The essence of BGP is the exchange of routing information among participating
routers in multiple ASs. This process can be quite complex. In what follows, we provide
a simplified overview.
Let us consider router R1 in autonomous system 1 (AS1), in Figure 16.10. To
begin, a router that implements BGP will also implement an internal routing protocol,
such as OSPF. Using OSPF, R1 can exchange routing information with other
routers within AS1 and build up a picture of the topology of the subnetworks and
routers in AS1 and construct a routing table. Next, R1 can issue an Update message
to R5 in AS2. The Update message could include the following:
AS-Path: the identity of AS1
Next-Hop: the IP address of R1
NLRI: a list of all of the subnetworks in AS1
This message informs R5 that all of the subnetworks listed in NLRI are reachable
via R1 and that the only autonomous system traversed is AS1.
Suppose now that R5 also has a neighbor relationship with another router in
another autonomous system, say R9 in AS3. R5 will forward the information just
received from R1 to R9 in a new Update message. This message includes the
following:
AS-Path: the list of identifiers {AS2, AS1}
Next-Hop: the IP address of R5
NLRI: a list of all of the subnetworks in AS1
This message informs R9 that all of the subnetworks listed in NLRI are reachable
via R5 and that the autonomous systems traversed are AS2 and AS1. R9 must
now decide if this is its preferred route to the subnetworks listed. It may have
knowledge of an alternate route to some or all of these subnetworks that it prefers
for reasons of performance or some other policy metric. If R9 decides that the route
provided in R5's update message is preferable, then R9 incorporates that routing
information into its routing database and forwards this new routing information
to other neighbors. This new message will include an AS-Path field of {AS3,
AS2, AS1}.
In the above fashion, routing update information is propagated through the
larger internet consisting of a number of interconnected autonomous systems. The
AS-Path field is used to assure that such messages do not circulate indefinitely: If
an Update message is received by a router in an AS that is included in the AS-Path
field, that router will not forward the update information to other routers, thereby
preventing looping of messages.
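The propagation and loop check just described can be modeled in a few lines. This is a simplified sketch, not the BGP message format: the Update class, the readvertise function, the use of router names in place of IP addresses, and the subnetwork labels are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Update:
    as_path: tuple[str, ...]   # most recently traversed AS first
    next_hop: str              # router name stands in for an IP address
    nlri: tuple[str, ...]      # reachable subnetworks


def readvertise(update: Update, local_as: str,
                local_address: str) -> Optional[Update]:
    """Build the Update a router passes on, or None if it must be dropped."""
    if local_as in update.as_path:
        return None            # our AS is already on the path: a loop
    return Update((local_as,) + update.as_path, local_address, update.nlri)


from_r1 = Update(("AS1",), "R1", ("1.1", "1.2", "1.3"))
from_r5 = readvertise(from_r1, "AS2", "R5")
from_r9 = readvertise(from_r5, "AS3", "R9")
print(from_r5.as_path, from_r9.as_path)
print(readvertise(from_r9, "AS1", "R1"))   # would loop back into AS1
```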
The preceding discussion leaves out several details that are briefly summarized
here. Routers within the same AS, called internal neighbors, may exchange
BGP information. In this case, the sending router does not add the identifier of the
common AS to the AS-Path field. When a router has selected a preferred route to
an external destination, it transmits this route to all of its internal neighbors. Each
of these routers then decides if the new route is preferred; if so, the new route is
added to its database and a new Update message goes out.
When there are multiple entry points into an AS that are available to a border
router in another AS, the Multi-Exit-Disc attribute may be used to choose
among them. This attribute contains a number that reflects some internal metric for
reaching destinations within an AS. For example, suppose in Figure 16.10 that both
R1 and R2 implemented BGP and that both had a neighbor relationship with R5.
Each provides an Update message to R5 for subnetwork 1.3 that includes a routing
metric used internally to AS1, such as a routing metric associated with the OSPF
internal router protocol. R5 could then use these two metrics as the basis for choosing
between the two routes.
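A sketch of that choice follows. It assumes that the lower metric is preferred, which is the usual convention for this attribute but is not stated in the text, and it breaks ties by router name purely to keep the example deterministic. The function name and metric values are hypothetical.

```python
def choose_entry_point(metrics: dict[str, int]) -> str:
    """Return the advertising router whose Multi-Exit-Disc value is lowest."""
    if not metrics:
        raise ValueError("no routes offered")
    # Assumption: lower metric wins; ties go to the first name in sorted order.
    return min(sorted(metrics), key=metrics.__getitem__)


# Hypothetical metrics advertised to R5 for the same subnetwork.
print(choose_entry_point({"R1": 7, "R2": 3}))
```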
Open Shortest Path First (OSPF) Protocol
The history of interior routing protocols on the Internet mirrors that of packet-switching
protocols on ARPANET. Recall that ARPANET began with a protocol
based on the Bellman-Ford algorithm. The resulting protocol required each node to
exchange path-delay information with its neighbors. Information about a change in
network conditions would gradually ripple through the network. A second generation
protocol was based on Dijkstra's algorithm and required each node to exchange
link-delay information with all other nodes using flooding. It was found that
this latter technique was more effective.
Similarly, the initial interior routing protocol on the DARPA internet was the
Routing Information Protocol (RIP), which is essentially the same protocol as the
first-generation ARPANET protocol. This protocol requires each router to transmit
its entire routing table. Although the algorithm is simple and easy to implement,
as the internet expands, routing updates grow larger and consume significantly
more network bandwidth. Accordingly, OSPF operates in a fashion similar to the
revised ARPANET routing algorithm. OSPF uses what is known as a link-state
routing algorithm. Each router maintains descriptions of the state of its local links
to subnetworks, and from time to time transmits updated state information to all of
the routers of which it is aware. Every router receiving an update packet must
acknowledge it to the sender. Such updates produce a minimum of routing traffic
because the link descriptions are small and rarely need to be sent.
The OSPF protocol (RFC 1583) is now widely used as the interior router protocol
in TCP/IP networks. OSPF computes a route through the internet that incurs
the least cost based on a user-configurable metric of cost. The user can configure
the cost to express a function of delay, data rate, dollar cost, or other factors. OSPF
is able to equalize loads over multiple equal-cost paths.
Each router maintains a database that reflects the known topology of the autonomous
system of which it is a part. The topology is expressed as a directed graph.
The graph consists of
Vertices, or nodes, of two types:
1. router
2. network, which is, in turn, of two types:
a) transit, if it can carry data that neither originates nor terminates on an
end system attached to this network.
b) stub, if it is not a transit network.
Edges of two types:
1. graph edges that connect two router vertices when the corresponding
routers are connected to each other by a direct point-to-point link.
2. graph edges that connect a router vertex to a network vertex when the
router is directly connected to the network.
Figure 16.12 shows an example of an autonomous system, and Figure 16.13 is
the resulting directed graph. The mapping is straightforward:
Two routers joined by a point-to-point link are represented in the graph as
being directly connected by a pair of edges, one in each direction (e.g., routers
6 and 10).
When multiple routers are attached to a network (such as a LAN or packet-switching
network), the directed graph shows all routers bidirectionally
connected to the network vertex (e.g., routers 1, 2, 3, and 4 all connect to network
3).
If a single router is attached to a network, the network will appear in the
graph as a stub connection (e.g., network 7).
An end system, called a host, can be directly connected to a router; such a case
is depicted in the corresponding graph (e.g., host 1).
If a router is connected to other autonomous systems, then the path cost to
each network in the other system must be obtained by some exterior routing
protocol (ERP). Each such network is represented on the graph by a stub and
an edge to the router with the known path cost (e.g., networks 12 through 15).
A cost is associated with the output side of each router interface. This cost is
configurable by the system administrator. Arcs on the graph are labeled with the
cost of the corresponding router-output interface. Arcs having no labeled cost have
a cost of 0. Note that arcs leading from networks to routers always have a cost of 0.
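A directed graph with per-interface costs of this kind lends itself to a least-cost path computation. The sketch below runs Dijkstra's algorithm over a small, made-up graph (it is not the one in Figure 16.13) and records, for each destination, only the total cost and the first hop out of the root, since that is all a forwarding decision needs. The function name and vertex names are hypothetical.

```python
import heapq


def least_cost_next_hops(graph: dict[str, dict[str, int]], root: str):
    """Return {vertex: (cost, first hop from root)} using Dijkstra's algorithm."""
    best = {root: (0, None)}
    queue = [(0, root, None)]
    done = set()
    while queue:
        cost, node, first_hop = heapq.heappop(queue)
        if node in done:
            continue               # stale queue entry
        done.add(node)
        for neighbor, edge_cost in graph.get(node, {}).items():
            new_cost = cost + edge_cost
            # Leaving the root, the neighbor itself is the first hop.
            hop = neighbor if node == root else first_hop
            if neighbor not in best or new_cost < best[neighbor][0]:
                best[neighbor] = (new_cost, hop)
                heapq.heappush(queue, (new_cost, neighbor, hop))
    return best


# Hypothetical topology: R* are routers, N* are networks. Arcs from
# networks back to routers cost 0, as in the cost model described above.
graph = {
    "RA": {"RB": 5, "RC": 1, "RD": 2},
    "RB": {"RA": 5, "N1": 2},
    "RC": {"RA": 1, "RB": 1, "N2": 4},
    "RD": {"RA": 2, "N3": 1},
    "N1": {"RB": 0},
    "N2": {"RC": 0},
    "N3": {"RD": 0},
}
table = least_cost_next_hops(graph, "RA")
for destination in sorted(table):
    print(destination, table[destination])
```

In this example RB is reached more cheaply through RC (cost 2) than over the direct link (cost 5), so the table lists RC as the first hop toward both RB and network N1.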
A database corresponding to the directed graph is maintained by each router.
It is pieced together from link-state messages from other routers in the internet.
Using Dijkstra's algorithm (see Appendix 9A), a router calculates the least-cost
path to all destination networks. The result for router 6 of Figure 16.12 is shown as
a tree in Figure 16.14, with R6 as the root of the tree. The tree gives the entire route
to any destination network or host. However, only the next hop to the destination
is used in the forwarding process. The resulting routing table for router 6 is shown
in Table 16.5. The table includes entries for routers advertising external routes
