5.3 Congestion Control Algorithms
When too many packets are present in
(a part of) the subnet, performance degrades. This situation is called congestion.
Figure 5-25 depicts the symptom. When the number
of packets dumped into the subnet by the hosts is within its carrying capacity,
they are all delivered (except for a few that are afflicted with transmission
errors) and the number delivered is proportional to the number sent. However,
as traffic increases too far, the routers are no longer able to cope and they
begin losing packets. This tends to make matters worse. At very high traffic,
performance collapses completely and almost no packets are delivered.
Congestion can be brought on by
several factors. If all of a sudden, streams of packets begin arriving on three
or four input lines and all need the same output line, a queue will build up.
If there is insufficient memory to hold all of them, packets will be lost.
Adding more memory may help up to a point, but Nagle (1987) discovered that if
routers have an infinite amount of memory, congestion gets worse, not better,
because by the time packets get to the front of the queue, they have already
timed out (repeatedly) and duplicates have been sent. All these packets will be
dutifully forwarded to the next router, increasing the load all the way to the
destination.
Slow processors can also cause
congestion. If the routers' CPUs are slow at performing the bookkeeping tasks
required of them (queueing buffers, updating tables, etc.), queues can build
up, even though there is excess line capacity. Similarly, low-bandwidth lines
can also cause congestion. Upgrading the lines but not changing the processors,
or vice versa, often helps a little; more generally, upgrading part, but not
all, of a system often just moves the bottleneck somewhere else. The real
problem is frequently a mismatch between parts of the system, a problem that
will persist until all the components are in balance.
It is worth explicitly pointing out
the difference between congestion control and flow control, as the relationship
is subtle. Congestion control has to do with making sure the subnet is able to
carry the offered traffic. It is a global issue, involving the behavior of all
the hosts, all the routers, the store-and-forward processing within the
routers, and all the other factors that tend to diminish the carrying capacity
of the subnet.
Flow control, in contrast, relates
to the point-to-point traffic between a given sender and a given receiver. Its
job is to make sure that a fast sender cannot continually transmit data faster
than the receiver is able to absorb it. Flow control frequently involves some
direct feedback from the receiver to the sender to tell the sender how things
are doing at the other end.
To see the difference between these
two concepts, consider a fiber-optic network with a capacity of 1000
gigabits/sec on which a supercomputer is trying to transfer a file to a
personal computer at 1 Gbps. Although there is no congestion (the network
itself is not in trouble), flow control is needed to force the supercomputer to
stop frequently to give the personal computer a chance to breathe.
At the other extreme, consider a
store-and-forward network with 1-Mbps lines and 1000 large computers, half of
which are trying to transfer files at 100 kbps to the other half. Here the
problem is not that of fast senders overpowering slow receivers, but that the
total offered traffic exceeds what the network can handle.
The reason congestion control and
flow control are often confused is that some congestion control algorithms
operate by sending messages back to the various sources telling them to slow
down when the network gets into trouble. Thus, a host can get a "slow down"
message either because the receiver cannot handle the load or because the
network cannot handle it. We will come back to this point later.
We will start our study of
congestion control by looking at a general model for dealing with it. Then we
will look at broad approaches to preventing it in the first place. After that,
we will look at various dynamic algorithms for coping with it once it has set
in.
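5.3.1 General Principles of Congestion Control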
Many problems in complex systems,
such as computer networks, can be viewed from a control theory point of view.
This approach leads to dividing all solutions into two groups: open loop and
closed loop. Open-loop solutions attempt to solve the problem by good design,
in essence, by making sure congestion does not occur in the first place. Once
the system is up and running, midcourse corrections are not made.
Tools for doing open-loop control
include deciding when to accept new traffic, deciding when to discard packets
and which ones, and making scheduling decisions at various points in the
network. All of these have in common the fact that they make decisions without
regard to the current state of the network.
In contrast, closed-loop solutions
are based on the concept of a feedback loop. This approach has three parts when
applied to congestion control:
- Monitor the system to detect when and where congestion occurs.
- Pass this information to places where action can be taken.
- Adjust system operation to correct the problem.
A variety of metrics can be used to
monitor the subnet for congestion. Chief among these are the percentage of all
packets discarded for lack of buffer space, the average queue lengths, the
number of packets that time out and are retransmitted, the average packet
delay, and the standard deviation of packet delay. In all cases, rising numbers
indicate growing congestion.
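To make the bookkeeping concrete, here is a minimal sketch in Python of how a router might accumulate these metrics; the class and counter names are hypothetical, not taken from any real router software.

```python
from dataclasses import dataclass, field
from statistics import mean, stdev

@dataclass
class CongestionMonitor:
    """Per-router counters for the congestion metrics listed above."""
    packets_seen: int = 0
    packets_discarded: int = 0                        # lost for lack of buffers
    queue_samples: list = field(default_factory=list)
    delay_samples: list = field(default_factory=list)  # seconds per packet

    def discard_rate(self) -> float:
        """Fraction of packets dropped for lack of buffer space."""
        return self.packets_discarded / max(self.packets_seen, 1)

    def average_queue_length(self) -> float:
        return mean(self.queue_samples) if self.queue_samples else 0.0

    def delay_stats(self) -> tuple:
        """Mean and standard deviation of packet delay."""
        if len(self.delay_samples) < 2:
            return (0.0, 0.0)
        return (mean(self.delay_samples), stdev(self.delay_samples))
```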
The second step in the feedback loop
is to transfer the information about the congestion from the point where it is
detected to the point where something can be done about it. The obvious way is
for the router detecting the congestion to send a packet to the traffic source
or sources, announcing the problem. Of course, these extra packets increase the
load at precisely the moment that more load is not needed, namely, when the
subnet is congested.
However, other possibilities also
exist. For example, a bit or field can be reserved in every packet for routers
to fill in whenever congestion gets above some threshold level. When a router
detects this congested state, it fills in the field in all outgoing packets, to
warn the neighbors.
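This is essentially the idea behind DECbit and, later, ECN. A minimal sketch, assuming packets are represented as dictionaries and the threshold is a value chosen by the operator:

```python
QUEUE_THRESHOLD = 10   # hypothetical trigger level, in queued packets

def forward(packet: dict, output_queue: list) -> None:
    """Mark outgoing packets while the router considers itself congested."""
    if len(output_queue) >= QUEUE_THRESHOLD:
        packet["congestion_warning"] = True   # the reserved bit/field
    output_queue.append(packet)
```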
Still another approach is to have
hosts or routers periodically send probe packets out to explicitly ask about
congestion. This information can then be used to route traffic around problem
areas. Some radio stations have helicopters flying around their cities to
report on road congestion to make it possible for their mobile listeners to
route their packets (cars) around hot spots.
In all feedback schemes, the hope is
that knowledge of congestion will cause the hosts to take appropriate action to
reduce the congestion. For a scheme to work correctly, the time scale must be
adjusted carefully. If every time two packets arrive in a row, a router yells
STOP and every time a router is idle for 20 µsec, it yells GO, the system will oscillate
wildly and never converge. On the other hand, if it waits 30 minutes to make
sure before saying anything, the congestion control mechanism will react too
sluggishly to be of any real use. To work well, some kind of averaging is
needed, but getting the time constant right is a nontrivial matter.
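One common way to do the averaging is an exponentially weighted moving average of the instantaneous queue length, combined with two separate thresholds so the advice does not flap between STOP and GO. The sketch below is illustrative; the gain and threshold values are arbitrary assumptions.

```python
class CongestionSignal:
    """Smooth queue samples and apply hysteresis before yelling STOP or GO."""

    ALPHA = 0.1       # smoothing gain: smaller means slower, steadier response
    STOP_LEVEL = 8.0  # the gap between the two thresholds provides
    GO_LEVEL = 2.0    # hysteresis, so the advice does not oscillate

    def __init__(self):
        self.avg_queue = 0.0
        self.congested = False

    def sample(self, instant_queue_len: int) -> str:
        # The exponentially weighted moving average filters out short bursts.
        self.avg_queue = ((1 - self.ALPHA) * self.avg_queue
                          + self.ALPHA * instant_queue_len)
        if not self.congested and self.avg_queue > self.STOP_LEVEL:
            self.congested = True
        elif self.congested and self.avg_queue < self.GO_LEVEL:
            self.congested = False
        return "STOP" if self.congested else "GO"
```

A smaller ALPHA gives steadier but more sluggish advice, which is exactly the time-constant trade-off described above.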
Many congestion control algorithms
are known. To organize them sensibly, Yang and Reddy
(1995) developed a taxonomy for congestion control algorithms. They begin
by dividing all algorithms into open loop or closed loop, as described above.
They further divide the open-loop algorithms into ones that act at the source
versus ones that act at the destination. The closed-loop algorithms are also
divided into two subcategories: explicit feedback versus implicit feedback. In
explicit feedback algorithms, packets are sent back from the point of
congestion to warn the source. In implicit algorithms, the source deduces the
existence of congestion by making local observations, such as the time needed
for acknowledgements to come back.
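As a sketch of the implicit style, a source might remember the fastest acknowledgement round trip it has ever seen and treat noticeably slower ones as evidence of queueing. The inflation factor here is an arbitrary assumption (TCP Vegas uses the base-RTT idea in a similar spirit):

```python
class ImplicitFeedback:
    """Infer congestion from local observations only: if acknowledgements
    take much longer than the best time ever seen, queues are building up."""

    def __init__(self, inflation: float = 2.0):  # threshold is an assumption
        self.base_rtt = float("inf")   # fastest round trip observed so far
        self.inflation = inflation

    def on_ack(self, rtt: float) -> bool:
        """Feed one measured round-trip time; True suggests congestion."""
        self.base_rtt = min(self.base_rtt, rtt)
        return rtt > self.inflation * self.base_rtt
```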
The presence of congestion means
that the load is (temporarily) greater than the resources (in part of the
system) can handle. Two solutions come to mind: increase the resources or
decrease the load. For example, the subnet may start using dial-up telephone
lines to temporarily increase the bandwidth between certain points. On
satellite systems, increasing transmission power often gives higher bandwidth.
Splitting traffic over multiple routes instead of always using the best one may
also effectively increase the bandwidth. Finally, spare routers that are
normally used only as backups (to make the system fault tolerant) can be put
on-line to give more capacity when serious congestion appears.
However, sometimes it is not
possible to increase the capacity, or it has already been increased to the
limit. The only way then to beat back the congestion is to decrease the load.
Several ways exist to reduce the load, including denying service to some users,
degrading service to some or all users, and having users schedule their demands
in a more predictable way.
Some of these methods, which we will
study shortly, can best be applied to virtual circuits. For subnets that use
virtual circuits internally, these methods can be used at the network layer.
For datagram subnets, they can nevertheless sometimes be used on transport
layer connections.
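5.3.2 Congestion Prevention Policies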
Let us begin our study of methods to
control congestion by looking at open-loop systems. These systems are designed
to minimize congestion in the first place, rather than letting it happen and reacting
after the fact. They try to achieve their goal by using appropriate policies at
various levels. In Fig. 5-26 we see different data link, network,
and transport policies that can affect congestion (Jain, 1990).
Let us start at the data link layer
and work our way upward. The retransmission policy is concerned with how fast a
sender times out and what it transmits upon timeout. A jumpy sender that times
out quickly and retransmits all outstanding packets using go-back-n will put a
heavier load on the system than will a leisurely sender that uses selective
repeat. Closely related to this is the buffering policy. If receivers routinely
discard all out-of-order packets, these packets will have to be transmitted
again later, creating extra load. With respect to congestion control, selective
repeat is clearly better than go-back-n.
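The difference in load is easy to quantify. In this hypothetical sketch, a timeout on one packet makes a go-back-n sender resend every outstanding packet from that point on, while a selective-repeat sender resends only the packet believed lost:

```python
def retransmission_load(window: list, first_lost: int, policy: str) -> list:
    """Packets resent after a timeout on packet `first_lost`.

    `window` holds the sequence numbers currently outstanding.
    """
    if policy == "go-back-n":
        # A jumpy go-back-n sender resends the lost packet and everything
        # after it, even packets that actually arrived intact.
        return [seq for seq in window if seq >= first_lost]
    # Selective repeat: only the packet believed lost is resent.
    return [first_lost]
```

For a window of eight outstanding packets with the third one lost, go-back-n resends six packets where selective repeat resends one.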
Acknowledgement policy also affects
congestion. If each packet is acknowledged immediately, the acknowledgement
packets generate extra traffic. However, if acknowledgements are saved up to
piggyback onto reverse traffic, extra timeouts and retransmissions may result.
A tight flow control scheme (e.g., a small window) reduces the data rate and
thus helps fight congestion.
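A sketch of the piggybacking trade-off: the receiver holds an acknowledgement back briefly in the hope of reverse traffic, but sends a bare one before the sender's timer can expire. All names and the delay value here are assumptions:

```python
ACK_DELAY = 0.5   # hypothetical maximum time to hold back an acknowledgement

class Receiver:
    def __init__(self):
        self.pending_ack = None    # sequence number awaiting acknowledgement
        self.ack_deadline = None

    def on_data(self, seq: int, now: float) -> None:
        # Save the acknowledgement, hoping to piggyback it on reverse traffic.
        self.pending_ack = seq
        self.ack_deadline = now + ACK_DELAY

    def on_reverse_packet(self, packet: dict) -> dict:
        # Reverse traffic appeared in time: piggyback, no extra packet needed.
        if self.pending_ack is not None:
            packet["ack"] = self.pending_ack
            self.pending_ack = None
        return packet

    def on_tick(self, now: float, send) -> None:
        # No reverse traffic arrived: send a bare ACK before the sender's
        # timer expires and triggers a needless retransmission.
        if self.pending_ack is not None and now >= self.ack_deadline:
            send({"ack": self.pending_ack})
            self.pending_ack = None
```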
At the network layer, the choice
between using virtual circuits and using datagrams affects congestion since
many congestion control algorithms work only with virtual-circuit subnets.
Packet queueing and service policy relates to whether routers have one queue
per input line, one queue per output line, or both. It also relates to the
order in which packets are processed (e.g., round robin or priority based).
Discard policy is the rule telling which packet is dropped when there is no
space. A good policy can help alleviate congestion and a bad one can make it
worse.
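For instance, a full queue can discard the newest arrival (drop-tail) or the oldest queued packet (drop-front); the latter can get bad news back to the sender sooner. A minimal sketch:

```python
def enqueue(queue: list, packet, capacity: int, policy: str = "drop-tail"):
    """Apply a discard policy when the output queue is full.

    Returns the dropped packet, or None if nothing was dropped.
    """
    if len(queue) < capacity:
        queue.append(packet)
        return None
    if policy == "drop-tail":
        return packet                 # newest arrival is discarded
    dropped = queue.pop(0)            # drop-front: oldest packet goes,
    queue.append(packet)              # which can signal trouble sooner
    return dropped
```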
A good routing algorithm can help
avoid congestion by spreading the traffic over all the lines, whereas a bad one
can send too much traffic over already congested lines. Finally, packet
lifetime management deals with how long a packet may live before being
discarded. If it is too long, lost packets may clog up the works for a long
time, but if it is too short, packets may sometimes time out before reaching
their destination, thus inducing retransmissions.
In the transport layer, the same
issues occur as in the data link layer, but in addition, determining the
timeout interval is harder because the transit time across the network is less
predictable than the transit time over a wire between two routers. If the
timeout interval is too short, extra packets will be sent unnecessarily. If it
is too long, congestion will be reduced but the response time will suffer
whenever a packet is lost.
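A widely used way to choose the interval adaptively is to track both the mean and the variability of measured round-trip times, in the style of Jacobson's algorithm for TCP. The gain constants below are the conventional ones, but the sketch is illustrative rather than a faithful TCP implementation:

```python
class TimeoutEstimator:
    """Adaptive retransmission timeout in the style of Jacobson (1988)."""

    def __init__(self):
        self.srtt = None     # smoothed round-trip time
        self.rttvar = 0.0    # smoothed mean deviation of the samples

    def update(self, rtt: float) -> float:
        """Feed one round-trip measurement; return the new timeout."""
        if self.srtt is None:
            self.srtt, self.rttvar = rtt, rtt / 2
        else:
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt
        # Tracking the deviation keeps the timeout generous when the
        # network's delay is unpredictable and tight when it is steady.
        return self.srtt + 4 * self.rttvar
```

5.3.3 Congestion Control in Virtual-Circuit Subnets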
The congestion control methods
described above are basically open loop: they try to prevent congestion from
occurring in the first place, rather than dealing with it after the fact. In
this section we will describe some approaches to dynamically controlling
congestion in virtual-circuit subnets. In the next two, we will look at
techniques that can be used in any subnet.
One technique that is widely used to
keep congestion that has already started from getting worse is admission
control. The idea is simple: once congestion has been signaled, no more virtual
circuits are set up until the problem has gone away. Thus, attempts to set up
new transport layer connections fail. Letting more people in just makes matters
worse. While this approach is crude, it is simple and easy to carry out. In the
telephone system, when a switch gets overloaded, it also practices admission
control by not giving dial tones.
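In code, admission control amounts to little more than a guard on circuit setup. A toy sketch, with all names hypothetical:

```python
class Subnet:
    def __init__(self):
        self.congestion_signaled = False   # set by the feedback mechanism
        self.circuits = []

    def setup_virtual_circuit(self, src: str, dst: str) -> tuple:
        # Once congestion has been signaled, refuse all new circuits
        # until the condition clears; existing circuits are untouched.
        if self.congestion_signaled:
            raise ConnectionRefusedError("subnet congested; try again later")
        vc = (src, dst)
        self.circuits.append(vc)
        return vc
```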
An alternative approach is to allow
new virtual circuits but carefully route all new virtual circuits around
problem areas. For example, consider the subnet of Fig. 5-27(a), in which two routers are congested,
as indicated.
Figure 5-27. (a) A congested subnet.
(b) A redrawn subnet that eliminates the congestion. A virtual circuit from A
to B is also shown.
Suppose that a host attached to
router A wants to set up a connection to a host attached to router B. Normally,
this connection would pass through one of the congested routers. To avoid this
situation, we can redraw the subnet as shown in Fig. 5-27(b), omitting the congested routers and
all of their lines. The dashed line shows a possible route for the virtual
circuit that avoids the congested routers.
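One way to implement this redrawing is to prune the congested routers and their lines from the topology before running an ordinary shortest-path computation. A sketch using Dijkstra's algorithm over an adjacency map (the graph representation is an assumption):

```python
from heapq import heappush, heappop

def route_avoiding_congestion(graph, src, dst, congested):
    """Shortest path that treats congested routers as if deleted.

    `graph` maps router -> {neighbor: line cost}; `congested` is a set.
    """
    pruned = {r: {n: c for n, c in nbrs.items() if n not in congested}
              for r, nbrs in graph.items() if r not in congested}
    dist, heap = {src: 0}, [(0, src, [src])]
    while heap:
        d, node, path = heappop(heap)
        if node == dst:
            return path
        for nbr, cost in pruned.get(node, {}).items():
            if d + cost < dist.get(nbr, float("inf")):
                dist[nbr] = d + cost
                heappush(heap, (d + cost, nbr, path + [nbr]))
    return None   # no congestion-free route exists
```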
Another strategy relating to virtual
circuits is to negotiate an agreement between the host and subnet when a
virtual circuit is set up. This agreement normally specifies the volume and
shape of the traffic, quality of service required, and other parameters. To
keep its part of the agreement, the subnet will typically reserve resources
along the path when the circuit is set up. These resources can include table
and buffer space in the routers and bandwidth on the lines. In this way,
congestion is unlikely to occur on the new virtual circuits because all the
necessary resources are guaranteed to be available.
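A minimal sketch of the bandwidth part of such a reservation, assuming each line's remaining free capacity is tracked in a table keyed by router pair:

```python
def reserve_bandwidth(lines: dict, path: list, demand: float) -> bool:
    """Try to reserve `demand` bandwidth on every line along `path`.

    `lines` maps a (router, router) pair to its remaining free capacity.
    Reservation is all-or-nothing: either every hop can carry the
    circuit's declared traffic, or the setup attempt fails.
    """
    hops = list(zip(path, path[1:]))
    if any(lines[hop] < demand for hop in hops):
        return False            # some line lacks capacity: refuse the VC
    for hop in hops:
        lines[hop] -= demand    # capacity is now promised to this circuit
    return True
```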
This kind of reservation can be done
all the time as standard operating procedure or only when the subnet is
congested. A disadvantage of doing it all the time is that it tends to waste
resources. If six virtual circuits that might use 1 Mbps all pass through the
same physical 6-Mbps line, the line has to be marked as full, even though it
may rarely happen that all six virtual circuits are transmitting full blast at
the same time. Consequently, the price of the congestion control is unused
(i.e., wasted) bandwidth in the normal case.