5.2.10 Routing in Ad Hoc Networks
We have now seen how to do routing
when the hosts are mobile but the routers are fixed. An even more extreme case
is one in which the routers themselves are mobile. Among the possibilities are:
- Military vehicles on a battlefield with no existing infrastructure.
- A fleet of ships at sea.
- Emergency workers at an earthquake that destroyed the infrastructure.
- A gathering of people with notebook computers in an area lacking 802.11.
In all these cases, and others, each
node consists of a router and a host, usually on the same computer. Networks of
nodes that just happen to be near each other are called ad hoc networks or
MANETs (Mobile Ad hoc NETworks). Let us now examine them briefly. More
information can be found in (Perkins, 2001).
What makes ad hoc networks different
from wired networks is that all the usual rules about fixed topologies, fixed
and known neighbors, fixed relationship between IP address and location, and
more are suddenly tossed out the window. Routers can come and go or appear in
new places at the drop of a bit. With a wired network, if a router has a valid
path to some destination, that path continues to be valid indefinitely (barring
a failure somewhere in the system). With an ad hoc network, the topology may be
changing all the time, so desirability and even validity of paths can change
spontaneously, without warning. Needless to say, these circumstances make
routing in ad hoc networks quite different from routing in their fixed
counterparts.
A variety of routing algorithms for
ad hoc networks have been proposed. One of the more interesting ones is the
AODV (Ad hoc On-demand Distance Vector) routing algorithm (Perkins and Royer,
1999). It is a distant relative of the Bellman-Ford distance vector algorithm
but adapted to work in a mobile environment and takes into account the limited
bandwidth and low battery life found in this environment. Another unusual
characteristic is that it is an on-demand algorithm, that is, it determines a
route to some destination only when somebody wants to send a packet to that
destination. Let us now see what that means.
At any instant of time, an ad hoc
network can be described by a graph of the nodes (routers + hosts). Two nodes
are connected (i.e., have an arc between them in the graph) if they can
communicate directly using their radios. Since one of the two may have a more
powerful transmitter than the other, it is possible that A is connected to B
but B is not connected to A. However, for simplicity, we will assume all
connections are symmetric. It should also be noted that the mere fact that two
nodes are within radio range of each other does not mean that they are
connected. There may be buildings, hills, or other obstacles that block their
communication.
To describe the algorithm, consider
the ad hoc network of Fig. 5-20, in which a process at node A wants to
send a packet to node I. The AODV algorithm maintains a table at each node,
keyed by destination, giving information about that destination, including
which neighbor to send packets to in order to reach the destination. Suppose
that A looks in its table and does not find an entry for I. It now has to
discover a route to I. This property of discovering routes only when they are
needed is what makes this algorithm "on demand."
Figure 5-20. (a) Range of A's
broadcast. (b) After B and D have received A's broadcast. (c) After C, F, and G
have received A's broadcast. (d) After E, H, and I have received A's broadcast.
The shaded nodes are new recipients. The arrows show the possible reverse
routes.
To locate I, A constructs a special
ROUTE REQUEST packet and broadcasts it. The packet reaches B and D, as
illustrated in Fig. 5-20(a). In fact, the reason B and D are
connected to A in the graph is that they can receive communication from A. F,
for example, is not shown with an arc to A because it cannot receive A's radio
signal. Thus, F is not connected to A.
The format of the ROUTE REQUEST
packet is shown in Fig. 5-21. It contains the source and destination
addresses, typically their IP addresses, which identify who is looking for
whom. It also contains a Request ID, which is a local counter maintained
separately by each node and incremented each time a ROUTE REQUEST is broadcast.
Together, the Source address and Request ID fields uniquely identify the ROUTE
REQUEST packet to allow nodes to discard any duplicates they may receive.
In addition to the Request ID
counter, each node also maintains a second sequence counter incremented
whenever a ROUTE REQUEST is sent (or a reply to someone else's ROUTE REQUEST).
It functions a little bit like a clock and is used to tell new routes from old
routes. The fourth field of Fig. 5-21 is A's sequence counter; the fifth
field is the most recent value of I's sequence number that A has seen (0 if it
has never seen it). The use of these fields will become clear shortly. The
final field, Hop count, will keep track of how many hops the packet has made.
It is initialized to 0.
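To make the packet layout concrete, here is a minimal sketch in Python of the six fields just described. The field names and their exact ordering are our reading of Fig. 5-21, not part of any official AODV format.

```python
from dataclasses import dataclass

@dataclass
class RouteRequest:
    source_addr: str    # who is looking (typically an IP address)
    dest_addr: str      # who is being looked for
    request_id: int     # per-source counter; (source_addr, request_id) is unique
    source_seq: int     # the source's sequence counter
    dest_seq: int       # last sequence number seen for the destination (0 if none)
    hop_count: int = 0  # incremented at every rebroadcast
```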
When a ROUTE REQUEST packet arrives
at a node (B and D in this case), it is processed in the following steps (a code sketch follows the list).
1. The (Source address, Request ID) pair is looked up in a local history table to see if this request has already been seen and processed. If it is a duplicate, it is discarded and processing stops. If it is not a duplicate, the pair is entered into the history table so future duplicates can be rejected, and processing continues.
2. The receiver looks up the destination in its route table. If a fresh route to the destination is known, a ROUTE REPLY packet is sent back to the source telling it how to get to the destination (basically: Use me). Fresh means that the Destination sequence number stored in the routing table is greater than or equal to the Destination sequence number in the ROUTE REQUEST packet. If it is less, the stored route is older than the route the source already has for the destination, so step 3 is executed.
3. Since the receiver does not know a fresh route to the destination, it increments the Hop count field and rebroadcasts the ROUTE REQUEST packet. It also extracts the data from the packet and stores it as a new entry in its reverse route table. This information will be used to construct the reverse route so that the reply can get back to the source later; the arrows in Fig. 5-20 show these reverse routes being built. A timer is also started for the newly made reverse route entry. If it expires, the entry is deleted.
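The three steps above condense into a short sketch. Here `node` is assumed to bundle the history table, route table, reverse route table, and radio primitives (`broadcast`, `send_route_reply`, `start_reverse_route_timer`); all of these names are illustrative, not taken from the AODV specification.

```python
def handle_route_request(node, req, prev_hop):
    # Step 1: duplicate suppression via the (source, request id) history table.
    key = (req.source_addr, req.request_id)
    if key in node.history:
        return                                  # duplicate; discard
    node.history.add(key)

    # Record the reverse route back to the source, with an expiry timer.
    node.reverse_routes[req.source_addr] = prev_hop
    node.start_reverse_route_timer(req.source_addr)

    # Step 2: answer from the local route table if a fresh route is known.
    entry = node.routes.get(req.dest_addr)
    if entry is not None and entry.dest_seq >= req.dest_seq:
        node.send_route_reply(req, entry, via=prev_hop)   # "use me"
        return

    # Step 3: no fresh route known; rebroadcast with an incremented hop count.
    req.hop_count += 1
    node.broadcast(req)
```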
Neither B nor D knows where I is, so
each of them creates a reverse route entry pointing back to A, as shown by the
arrows in Fig. 5-20, and broadcasts the packet with Hop
count set to 1. The broadcast from B reaches C and D. C makes an entry for it
in its reverse route table and rebroadcasts it. In contrast, D rejects it as a
duplicate. Similarly, D's broadcast is rejected by B. However, D's broadcast is
accepted by F and G and stored, as shown in Fig. 5-20(c). After E, H, and I receive the
broadcast, the ROUTE REQUEST finally reaches a destination that knows where I
is, namely, I itself, as illustrated in Fig. 5-20(d). Note that although we have shown
the broadcasts in three discrete steps here, the broadcasts from different
nodes are not coordinated in any way.
In response to the incoming request,
I builds a ROUTE REPLY packet, as shown in Fig. 5-22. The Source address, Destination
address, and Hop count are copied from the incoming request, but the
Destination sequence number is taken from its counter in memory. The Hop count
field is set to 0. The Lifetime field controls how long the route is valid.
This packet is unicast to the node that the ROUTE REQUEST packet came from, in
this case, G. It then follows the reverse path to D and finally to A. At each
node, Hop count is incremented so the node can see how far from the destination
(I) it is.
At each intermediate node on the way
back, the packet is inspected. It is entered into the local routing table as a
route to I (sketched in code after the list) if one or more of the following three conditions are met:
- No route to I is known.
- The sequence number for I in the ROUTE REPLY packet is greater than the value in the routing table.
- The sequence numbers are equal but the new route is shorter.
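In code, the acceptance test might look like the following sketch; the `Route` record and its field names are assumptions chosen to mirror the description above.

```python
from dataclasses import dataclass

@dataclass
class Route:
    next_hop: str       # neighbor to forward packets toward the destination
    hop_count: int      # distance to the destination in hops
    dest_seq: int       # destination sequence number (freshness stamp)

def accept_route_reply(node, rep, prev_hop):
    entry = node.routes.get(rep.dest_addr)
    if (entry is None                                       # condition 1
            or rep.dest_seq > entry.dest_seq                # condition 2
            or (rep.dest_seq == entry.dest_seq
                and rep.hop_count + 1 < entry.hop_count)):  # condition 3
        node.routes[rep.dest_addr] = Route(prev_hop, rep.hop_count + 1,
                                           rep.dest_seq)
```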
In this way, all the nodes on the
reverse route learn the route to I for free, as a byproduct of A's route
discovery. Nodes that got the original ROUTE REQUEST packet but were not on the
reverse path (B, C, E, F, and H in this example) discard the reverse route
table entry when the associated timer expires.
In a large network, the algorithm
generates many broadcasts, even for destinations that are close by. The number
of broadcasts can be reduced as follows. The IP packet's Time to live is
initialized by the sender to the expected diameter of the network and
decremented on each hop. If it hits 0, the packet is discarded instead of being
broadcast.
The discovery process is then
modified as follows. To locate a destination, the sender broadcasts a ROUTE
REQUEST packet with Time to live set to 1. If no response comes back within a
reasonable time, another one is sent, this time with Time to live set to 2.
Subsequent attempts use 3, 4, 5, etc. In this way, the search is first
attempted locally, then in increasingly wider rings.
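The expanding-ring search amounts to a simple loop. In this sketch the timeout and the maximum diameter are illustrative parameters, not values mandated by AODV.

```python
def discover_route(node, dest, max_diameter=16, timeout=1.0):
    # Search locally first, then in increasingly wider rings.
    for ttl in range(1, max_diameter + 1):
        node.broadcast_route_request(dest, time_to_live=ttl)
        if node.wait_for_route_reply(dest, timeout):
            return node.routes[dest]        # route found within ttl hops
    return None                             # destination appears unreachable
```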
Because nodes can move or be
switched off, the topology can change spontaneously. For example, in Fig. 5-20, if G is switched off, A will not
realize that the route it was using to I (ADGI) is no longer valid. The
algorithm needs to be able to deal with this. Periodically, each node
broadcasts a Hello message. Each of its neighbors is expected to respond to it.
If no response is forthcoming, the broadcaster knows that that neighbor has moved
out of range and is no longer connected to it. Similarly, if it tries to send a
packet to a neighbor that does not respond, it learns that the neighbor is no
longer available.
This information is used to purge
routes that no longer work. For each possible destination, each node, N, keeps
track of its neighbors that have fed it a packet for that destination during
the last ΔT seconds. These are called N's active neighbors for that
destination. N does this by having a routing table keyed by destination and
containing the outgoing node to use to reach the destination, the hop count to
the destination, the most recent destination sequence number, and the list of
active neighbors for that destination. A possible routing table for node D in
our example topology is shown in Fig. 5-23(a).
When any of N's neighbors becomes
unreachable, it checks its routing table to see which destinations have routes
using the now-gone neighbor. For each of these routes, the active neighbors are
informed that their route via N is now invalid and must be purged from their
routing tables. The active neighbors then tell their active neighbors, and so
on, recursively, until all routes depending on the now-gone node are purged from
all routing tables.
As an example of route maintenance,
consider our previous example, but now with G suddenly switched off. The
changed topology is illustrated in Fig. 5-23(b). When D discovers that G is gone, it
looks at its routing table and sees that G was used on routes to E, G, and I.
The union of the active neighbors for these destinations is the set {A, B}. In
other words, A and B depend on G for some of their routes, so they have to be
informed that these routes no longer work. D tells them by sending them packets
that cause them to update their own routing tables accordingly. D also purges
the entries for E, G, and I from its routing table.
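A sketch of this purge, assuming each `Route` entry carries the active-neighbor list of Fig. 5-23(a); `send_route_error` stands in for whatever packet D uses to notify A and B.

```python
def neighbor_lost(node, gone):
    # Find every destination whose route runs through the vanished neighbor.
    broken = [d for d, e in node.routes.items() if e.next_hop == gone]
    for dest in broken:
        # Tell the active neighbors; they purge their own entries and
        # recurse to *their* active neighbors until all dependent routes
        # have disappeared from all routing tables.
        for nb in node.routes[dest].active_neighbors:
            node.send_route_error(nb, dest)
        del node.routes[dest]
```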
It may not have been obvious from
our description, but a critical difference between AODV and Bellman-Ford is
that nodes do not send out periodic broadcasts containing their entire routing
table. This difference saves both bandwidth and battery life.
AODV is also capable of doing
broadcast and multicast routing. For details, consult (Perkins and Royer,
2001). Ad hoc routing is a red-hot research area. A great deal has been
published on the topic. A few of the papers include (Chen et al., 2002; Hu and
Johnson, 2001; Li et al., 2001; Raju and Garcia-Luna-Aceves, 2001; Ramanathan
and Redi, 2002; Royer and Toh, 1999; Spohn and Garcia-Luna-Aceves, 2001; Tseng
et al., 2001; and Zadeh et al., 2002).
A relatively new phenomenon is
peer-to-peer networks, in which a large number of people, usually with
permanent wired connections to the Internet, are in contact to share resources.
The first widespread application of peer-to-peer technology was for mass crime:
50 million Napster users were exchanging copyrighted songs without the
copyright owners' permission until Napster was shut down by the courts amid
great controversy. Nevertheless, peer-to-peer technology has many interesting
and legal uses. It also has something similar to a routing problem, although it
is not quite the same as the ones we have studied so far. Nevertheless, it is
worth a quick look.
What makes peer-to-peer systems
interesting is that they are totally distributed. All nodes are symmetric and
there is no central control or hierarchy. In a typical peer-to-peer system the
users each have some information that may be of interest to other users. This
information may be free software, (public domain) music, photographs, and so
on. If there are large numbers of users, they will not know each other and will
not know where to find what they are looking for. One solution is a big central
database, but this may not be feasible for some reason (e.g., nobody is willing
to host and maintain it). Thus, the problem comes down to how a user finds a
node that contains what he is looking for in the absence of a centralized
database or even a centralized index.
Let us assume that each user has one
or more data items such as songs, photographs, programs, files, and so on that
other users might want to read. Each item has an ASCII string naming it. A
potential user knows just the ASCII string and wants to find out if one or more
people have copies and, if so, what their IP addresses are.
As an example, consider a
distributed genealogical database. Each genealogist has some on-line records
for his or her ancestors and relatives, possibly with photos, audio, or even
video clips of the person. Multiple people may have the same great grandfather,
so an ancestor may have records at multiple nodes. The name of the record is
the person's name in some canonical form. At some point, a genealogist
discovers his great grandfather's will in an archive, in which the great
grandfather bequeaths his gold pocket watch to his nephew. The genealogist now
knows the nephew's name and wants to find out if any other genealogist has a
record for him. How, without a central database, do we find out who, if anyone,
has records?
Various algorithms have been
proposed to solve this problem. The one we will examine is Chord (Dabek et al.,
2001a; and Stoica et al., 2001). A simplified explanation of how it works is as
follows. The Chord system consists of n participating users, each of whom may
have some stored records and each of whom is prepared to store bits and pieces
of the index for use by other users. Each user node has an IP address that can
be hashed to an m-bit number using a hash function, hash. Chord uses SHA-1 for
hash. SHA-1 is a cryptographic hash function; for our purposes here, it is just
a function that takes a variable-length byte string as argument and produces a
highly random 160-bit number. Thus, we can convert any IP address to a 160-bit number called the node
identifier.
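Deriving a node identifier is then a one-line computation; here is a sketch using Python's standard hashlib (the function name node_id is ours).

```python
import hashlib

def node_id(text: str) -> int:
    # SHA-1 maps any byte string to a highly random 160-bit number.
    return int.from_bytes(hashlib.sha1(text.encode()).digest(), "big")

print(node_id("192.0.2.17"))   # a 160-bit node identifier for this address
```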
Conceptually, all the 2^160
node identifiers are arranged in ascending order in a big circle. Some of them
correspond to participating nodes, but most of them do not. In Fig. 5-24(a) we show the node identifier circle
for m = 5 (just ignore the arcs in the middle for the moment). In this example,
the nodes with identifiers 1, 4, 7, 12, 15, 20, and 27 correspond to actual
nodes and are shaded in the figure; the rest do not exist.
Figure 5-24. (a) A set of 32 node
identifiers arranged in a circle. The shaded ones correspond to actual
machines. The arcs show the fingers from nodes 1, 4, and 12. The labels on the
arcs are the table indices. (b) Examples of the finger tables.
Let us now define the function
successor(k) as the node identifier of the first actual node following k around
the circle clockwise. For example, successor (6) = 7, successor (8) = 12, and
successor (22) = 27.
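With the identifiers of the actual nodes kept sorted, successor is a binary search that wraps around the circle. A minimal sketch, checked against the examples just given:

```python
import bisect

def successor(k, nodes):
    # nodes: sorted identifiers of the actual participating nodes.
    i = bisect.bisect_left(nodes, k)
    return nodes[i % len(nodes)]        # wrap past the top of the circle

nodes = [1, 4, 7, 12, 15, 20, 27]       # the example of Fig. 5-24
assert successor(6, nodes) == 7
assert successor(8, nodes) == 12
assert successor(22, nodes) == 27
```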
The names of the records (song
names, ancestors' names, and so on) are also hashed with hash (i.e., SHA-1) to
generate a 160-bit number, called the key. Thus, to convert name (the ASCII
name of the record) to its key, we use key = hash(name). This computation is
just a local procedure call to hash. If a person holding a genealogical record
for name wants to make it available to everyone, he first builds a tuple
consisting of (name, my-IP-address) and then asks successor(hash(name)) to
store the tuple. If multiple records (at different nodes) exist for this name,
their tuples will all be stored at the same node. In this way, the index is
distributed over the nodes at random. For fault tolerance, p different hash
functions could be used to store each tuple at p nodes, but we will not
consider that further here.
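Publishing a record is then one hash plus one successor computation. A sketch reusing node_id and successor from above; the index dictionary stands in for storage that would really live at the remote node.

```python
from collections import defaultdict

index = defaultdict(list)               # node identifier -> stored tuples

def publish(name, my_ip, nodes):
    key = node_id(name)                 # same SHA-1 hash, applied to the name
    # (on the toy 5-bit circle of Fig. 5-24 one would reduce key mod 2**5)
    home = successor(key % 2**160, nodes)   # node responsible for this key
    index[home].append((name, my_ip))   # all holders of name land together
```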
If some user later wants to look up
name, he hashes it to get key and then uses successor (key) to find the IP
address of the node storing its index tuples. The first step is easy; the
second one is not. To make it possible to find the IP address of the node
corresponding to a certain key, each node must maintain certain administrative
data structures. One of these is the IP address of its successor node along the
node identifier circle. For example, in Fig. 5-24, node 4's successor is 7 and node 7's
successor is 12.
Lookup can now proceed as follows.
The requesting node sends a packet to its successor containing its IP address
and the key it is looking for. The packet is propagated around the ring until
it locates the successor to the node identifier being sought. That node checks
to see if it has any information matching the key, and if so, returns it
directly to the requesting node, whose IP address it has.
As a first optimization, each node
could hold the IP addresses of both its successor and its predecessor, so that
queries could be sent either clockwise or counterclockwise, depending on which
path is thought to be shorter. For example, node 7 in Fig. 5-24 could go clockwise to find node
identifier 10 but counterclockwise to find node identifier 3.
Even with two choices of direction,
linearly searching all the nodes is very inefficient in a large peer-to-peer
system since the mean number of nodes required per search is n/2. To greatly
speed up the search, each node also maintains what Chord calls a finger table.
The finger table has m entries, indexed by 0 through m - 1, each one pointing
to a different actual node. Each of the entries has two fields: start and the
IP address of successor(start), as shown for three example nodes in Fig. 5-24(b). The values of the fields for entry
i at node k are:

start = k + 2^i (modulo 2^m)
IP address of successor(start)
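A sketch of building these tables, reusing successor from above; running it reproduces node 1's entries in Fig. 5-24(b).

```python
def make_finger_table(k, nodes, m=5):
    table = []
    for i in range(m):
        start = (k + 2**i) % 2**m       # start = k + 2^i (modulo 2^m)
        table.append((start, successor(start, nodes)))
    return table

# Node 1: starts 2, 3, 5, 9, 17 -> successors 4, 4, 7, 12, 20
print(make_finger_table(1, [1, 4, 7, 12, 15, 20, 27]))
```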
Note that each node stores the IP
addresses of a relatively small number of nodes and that most of these are
fairly close by in terms of node identifier.
Using the finger table, the lookup
of key at node k proceeds as follows. If key falls between k and successor (k),
then the node holding information about key is successor (k) and the search
terminates. Otherwise, the finger table is searched to find the entry whose
start field is the closest predecessor of key. A request is then sent directly
to the IP address in that finger table entry to ask it to continue the search.
Since it is closer to key but still below it, chances are good that it will be
able to return the answer with only a small number of additional queries. In
fact, since every lookup halves the remaining distance to the target, it can be
shown that the average number of lookups is log₂ n.
As a first example, consider looking
up key = 3 at node 1. Since node 1 knows that 3 lies between it and its
successor, 4, the desired node is 4 and the search terminates, returning node
4's IP address.
As a second example, consider
looking up key = 14 at node 1. Since 14 does not lie between 1 and 4, the
finger table is consulted. The closest predecessor to 14 is 9, so the request
is forwarded to the IP address of 9's entry, namely, that of node 12. Node 12
sees that 14 falls between it and its successor (15), so it returns the IP address
of node 15.
As a third example, consider looking
up key = 16 at node 1. Again a query is sent to node 12, but this time node 12
does not know the answer itself. It looks for the node most closely preceding
16 and finds 14, which yields the IP address of node 15. A query is then sent
there. Node 15 observes that 16 lies between it and its successor (20), so it
returns the IP address of 20 to the caller, which works its way back to node 1.
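The three lookups just traced can be reproduced with a short recursive sketch. In a real deployment each recursive step would be a network request to a remote node; here the whole ring is visible locally, and successor and make_finger_table are the helpers defined above.

```python
def lookup(key, k, nodes, m=5):
    size = 2**m
    succ = successor((k + 1) % size, nodes)     # k's successor on the circle
    if 0 < (key - k) % size <= (succ - k) % size:
        return succ                             # succ holds key's index tuples
    # Otherwise forward to the finger whose start most closely precedes key.
    preceding = [(s, n) for s, n in make_finger_table(k, nodes, m)
                 if 0 < (s - k) % size < (key - k) % size]
    _, next_node = max(preceding, key=lambda f: (f[0] - k) % size)
    return lookup(key, next_node, nodes, m)

ring = [1, 4, 7, 12, 15, 20, 27]
assert lookup(3, 1, ring) == 4      # first example
assert lookup(14, 1, ring) == 15    # second example
assert lookup(16, 1, ring) == 20    # third example
```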
Since nodes join and leave all the
time, Chord needs a way to handle these operations. We assume that when the
system began operation it was small enough that the nodes could just exchange
information directly to build the first circle and finger tables. After that an
automated procedure is needed, as follows. When a new node, r, wants to join,
it must contact some existing node and ask it to look up the IP address of
successor (r) for it. The new node then asks successor (r) for its predecessor.
The new node then asks both of these to insert r in between them in the circle.
For example, if 24 in Fig. 5-24 wants to join, it asks any node to look
up successor (24), which is 27. Then it asks 27 for its predecessor (20). After
it tells both of those about its existence, 20 uses 24 as its successor and 27
uses 24 as its predecessor. In addition, node 27 hands over those keys in the
range 21–24, which now belong to 24. At this point, 24 is fully inserted.
However, many finger tables are now
wrong. To correct them, every node runs a background process that periodically
recomputes each finger by calling successor. When one of these queries hits a
new node, the corresponding finger entry is updated.
When a node leaves gracefully, it
hands its keys over to its successor and informs its predecessor of its
departure so the predecessor can link to the departing node's successor. When a
node crashes, a problem arises because its predecessor no longer has a valid
successor. To alleviate this problem, each node keeps track not only of its
direct successor but also its s direct successors, to allow it to skip over up
to s - 1 consecutive failed nodes and reconnect the circle.
Chord has been used to construct a
distributed file system (Dabek et al., 2001b) and other applications, and
research is ongoing. A different peer-to-peer system, Pastry, and its
applications are described in (Rowstron and Druschel, 2001a; and Rowstron and
Druschel, 2001b). A third peer-to-peer system, Freenet, is discussed in (Clarke
et al., 2002). A fourth system of this type is described in (Ratnasamy et al.,
2001).