8.1
Cryptography
Cryptography comes from the Greek
words for ''secret writing.'' It has a long and colorful history going back
thousands of years. In this section we will just sketch some of the highlights,
as background information for what follows. For a complete history of
cryptography, Kahn's (1995) book is recommended reading. For a comprehensive
treatment of the current state-of-the-art in security and cryptographic
algorithms, protocols, and applications, see (Kaufman et al., 2002). For a more
mathematical approach, see (Stinson, 2002). For a less mathematical approach,
see (Burnett and Paine, 2001).
Professionals make a distinction
between ciphers and codes. A cipher is a character-for-character or bit-for-bit
transformation, without regard to the linguistic structure of the message. In
contrast, a code replaces one word with another word or symbol. Codes are not
used any more, although they have a glorious history. The most successful code
ever devised was used by the U.S. armed forces during World War II in the
Pacific. They simply had Navajo Indians talking to each other using specific
Navajo words for military terms, for example chay-dagahi-nail-tsaidi
(literally: tortoise killer) for antitank weapon. The Navajo language is highly
tonal, exceedingly complex, and has no written form. And not a single person in
Japan knew anything about it.
In September 1945, the San Diego
Union described the code by saying ''For three years, wherever the Marines
landed, the Japanese got an earful of strange gurgling noises interspersed with
other sounds resembling the call of a Tibetan monk and the sound of a hot water
bottle being emptied.'' The Japanese never broke the code and many Navajo code
talkers were awarded high military honors for extraordinary service and
bravery. The fact that the U.S. broke the Japanese code but the Japanese never
broke the Navajo code played a crucial role in the American victories in the
Pacific.
Historically, four groups of people
have used and contributed to the art of cryptography: the military, the
diplomatic corps, diarists, and lovers. Of these, the military has had the most
important role and has shaped the field over the centuries. Within military
organizations, the messages to be encrypted have traditionally been given to
poorly-paid, low-level code clerks for encryption and transmission. The sheer
volume of messages prevented this work from being done by a few elite
specialists.
Until the advent of computers, one
of the main constraints on cryptography had been the ability of the code clerk
to perform the necessary transformations, often on a battlefield with little
equipment. An additional constraint has been the difficulty in switching over
quickly from one cryptographic method to another one, since this entails
retraining a large number of people. However, the danger of a code clerk being
captured by the enemy has made it essential to be able to change the
cryptographic method instantly if need be. These conflicting requirements have
given rise to the model of Fig. 8-2.
The messages to be encrypted, known
as the plaintext, are transformed by a function that is parameterized by a key.
The output of the encryption process, known as the ciphertext, is then
transmitted, often by messenger or radio. We assume that the enemy, or intruder,
hears and accurately copies down the complete ciphertext. However, unlike the
intended recipient, he does not know what the decryption key is and so cannot
decrypt the ciphertext easily. Sometimes the intruder can not only listen to
the communication channel (passive intruder) but can also record messages and
play them back later, inject his own messages, or modify legitimate messages
before they get to the receiver (active intruder). The art of breaking ciphers,
called cryptanalysis, and the art devising them (cryptography) is collectively
known as cryptology.
It will often be useful to have a
notation for relating plaintext, ciphertext, and keys. We will use C = EK(P)
to mean that the encryption of the plaintext P using key K gives the ciphertext
C. Similarly, P = DK(C) represents the decryption of C to get the
plaintext again. It then follows that
This notation suggests that E and D
are just mathematical functions, which they are. The only tricky part is that
both are functions of two parameters, and we have written one of the parameters
(the key) as a subscript, rather than as an argument, to distinguish it from
the message.
A fundamental rule of cryptography
is that one must assume that the cryptanalyst knows the methods used for
encryption and decryption. In other words, the cryptanalyst knows how the
encryption method, E, and decryption, D,of Fig. 8-2 work in detail. The amount of effort
necessary to invent, test, and install a new algorithm every time the old
method is compromised (or thought to be compromised) has always made it
impractical to keep the encryption algorithm secret. Thinking it is secret when
it is not does more harm than good.
This is where the key enters. The
key consists of a (relatively) short string that selects one of many potential
encryptions. In contrast to the general method, which may only be changed every
few years, the key can be changed as often as required. Thus, our basic model
is a stable and publicly-known general method parameterized by a secret and
easily changed key. The idea that the cryptanalyst knows the algorithms and
that the secrecy lies exclusively in the keys is called Kerckhoff's principle,
named after the Flemish military cryptographer Auguste Kerckhoff who first
stated it in 1883 (Kerckhoff, 1883). Thus, we have:
Kerckhoff's principle: All
algorithms must be public; only the keys are secret
The nonsecrecy of the algorithm
cannot be emphasized enough. Trying to keep the algorithm secret, known in the
trade as security by obscurity, never works. Also, by publicizing the
algorithm, the cryptographer gets free consulting from a large number of
academic cryptologists eager to break the system so they can publish papers
demonstrating how smart they are. If many experts have tried to break the
algorithm for 5 years after its publication and no one has succeeded, it is
probably pretty solid.
Since the real secrecy is in the
key, its length is a major design issue. Consider a simple combination lock.
The general principle is that you enter digits in sequence. Everyone knows
this, but the key is secret. A key length of two digits means that there are
100 possibilities. A key length of three digits means 1000 possibilities, and a
key length of six digits means a million. The longer the key, the higher the work
factor the cryptanalyst has to deal with. The work factor for breaking the
system by exhaustive search of the key space is exponential in the key length.
Secrecy comes from having a strong (but public) algorithm and a long key. To
prevent your kid brother from reading your e-mail, 64-bit keys will do. For routine
commercial use, at least 128 bits should be used. To keep major governments at
bay, keys of at least 256 bits, preferably more, are needed.
From the cryptanalyst's point of
view, the cryptanalysis problem has three principal variations. When he has a
quantity of ciphertext and no plaintext, he is confronted with the ciphertext-only
problem. The cryptograms that appear in the puzzle section of newspapers pose
this kind of problem. When the cryptanalyst has some matched ciphertext and
plaintext, the problem is called the known plaintext problem. Finally, when the
cryptanalyst has the ability to encrypt pieces of plaintext of his own
choosing, we have the chosen plaintext problem. Newspaper cryptograms could be
broken trivially if the cryptanalyst were allowed to ask such questions as:
What is the encryption of ABCDEFGHIJKL?
Novices in the cryptography business
often assume that if a cipher can withstand a ciphertext-only attack, it is
secure. This assumption is very naive. In many cases the cryptanalyst can make
a good guess at parts of the plaintext. For example, the first thing many
computers say when you call them up is login: . Equipped with some matched
plaintext-ciphertext pairs, the cryptanalyst's job becomes much easier. To
achieve security, the cryptographer should be conservative and make sure that
the system is unbreakable even if his opponent can encrypt arbitrary amounts of
chosen plaintext.
Encryption methods have historically
been divided into two categories: substitution ciphers and transposition
ciphers. We will now deal with each of these briefly as background information
for modern cryptography.
In a substitution cipher each letter
or group of letters is replaced by another letter or group of letters to
disguise it. One of the oldest known ciphers is the Caesar cipher, attributed
to Julius Caesar. In this method, a becomes D, b becomes E, c becomes F, ... ,
and z becomes C. For example, attack becomes DWWDFN. In examples, plaintext
will be given in lower case letters, and ciphertext in upper case letters.
A slight generalization of the
Caesar cipher allows the ciphertext alphabet to be shifted by k letters,
instead of always 3. In this case k becomes a key to the general method of
circularly shifted alphabets. The Caesar cipher may have fooled Pompey, but it
has not fooled anyone since.
The next improvement is to have each
of the symbols in the plaintext, say, the 26 letters for simplicity, map onto
some other letter. For example,
plaintext: a b c d e f g h i j k l m
n o p q r s t u v w x y z
ciphertext: Q W E R T Y U I O P A S
D F G H J K L Z X C V B N M
The general system of
symbol-for-symbol substitution is called a monoalphabetic substitution, with the
key being the 26-letter string corresponding to the full alphabet. For the key
above, the plaintext attack would be transformed into the ciphertext QZZQEA.
At first glance this might appear to
be a safe system because although the cryptanalyst knows the general system
(letter-for-letter substitution), he does not know which of the 26! 4 x 1026 possible keys is in use.
In contrast with the Caesar cipher, trying all of them is not a promising
approach. Even at 1 nsec per solution, a computer would take 1010
years to try all the keys.
Nevertheless, given a surprisingly
small amount of ciphertext, the cipher can be broken easily. The basic attack
takes advantage of the statistical properties of natural languages. In English,
for example, e is the most common letter, followed by t, o, a, n, i, etc. The
most common two-letter combinations, or digrams, are th, in, er, re, and an.
The most common three-letter combinations, or trigrams, are the, ing, and, and ion.
A cryptanalyst trying to break a
monoalphabetic cipher would start out by counting the relative frequencies of
all letters in the ciphertext. Then he might tentatively assign the most common
one to e and the next most common one to t. He would then look at trigrams to
find a common one of the form tXe, which strongly suggests that X is h.
Similarly, if the pattern thYt occurs frequently, the Y probably stands for a.
With this information, he can look for a frequently occurring trigram of the
form aZW, which is most likely and. By making guesses at common letters,
digrams, and trigrams and knowing about likely patterns of vowels and
consonants, the cryptanalyst builds up a tentative plaintext, letter by letter.
Another approach is to guess a
probable word or phrase. For example, consider the following ciphertext from an
accounting firm (blocked into groups of five characters):
CTBMN
BYCTC BTJDS QXBNS GSTJC BTSWX CTQTZ CQVUJ
QJSGS
TJQZZ MNQJS VLNSX VSZJU JDSTS JQUUS JUBXJ
DSKSU
JSNTK BGAQJ ZBGYQ TLCTZ BNYBN QJSW
A likely word in a message from an
accounting firm is financial. Using our knowledge that financial has a repeated
letter (i), with four other letters between their occurrences, we look for
repeated letters in the ciphertext at this spacing. We find 12 hits, at
positions 6, 15, 27, 31, 42, 48, 56, 66, 70, 71, 76, and 82. However, only two
of these, 31 and 42, have the next letter (corresponding to n in the plaintext)
repeated in the proper place. Of these two, only 31 also has the a correctly
positioned, so we know that financial begins at position 30. From this point
on, deducing the key is easy by using the frequency statistics for English
text.
Substitution ciphers preserve the
order of the plaintext symbols but disguise them. Transposition ciphers, in
contrast, reorder the letters but do not disguise them. Figure 8-3 depicts a common transposition cipher,
the columnar transposition. The cipher is keyed by a word or phrase not
containing any repeated letters. In this example, MEGABUCK is the key. The
purpose of the key is to number the columns, column 1 being under the key
letter closest to the start of the alphabet, and so on. The plaintext is
written horizontally, in rows, padded to fill the matrix if need be. The
ciphertext is read out by columns, starting with the column whose key letter is
the lowest.
To break a transposition cipher, the
cryptanalyst must first be aware that he is dealing with a transposition
cipher. By looking at the frequency of E, T, A, O, I, N, etc., it is easy to
see if they fit the normal pattern for plaintext. If so, the cipher is clearly
a transposition cipher, because in such a cipher every letter represents
itself, keeping the frequency distribution intact.
The next step is to make a guess at
the number of columns. In many cases a probable word or phrase may be guessed
at from the context. For example, suppose that our cryptanalyst suspects that
the plaintext phrase milliondollars occurs somewhere in the message. Observe
that digrams MO, IL, LL, LA, IR and OS occur in the ciphertext as a result of
this phrase wrapping around. The ciphertext letter O follows the ciphertext
letter M (i.e., they are vertically adjacent in column 4) because they are
separated in the probable phrase by a distance equal to the key length. If a
key of length seven had been used, the digrams MD, IO, LL, LL, IA, OR, and NS
would have occurred instead. In fact, for each key length, a different set of
digrams is produced in the ciphertext. By hunting for the various
possibilities, the cryptanalyst can often easily determine the key length.
The remaining step is to order the
columns. When the number of columns, k, is small, each of the k(k - 1) column
pairs can be examined to see if its digram frequencies match those for English
plaintext. The pair with the best match is assumed to be correctly positioned.
Now each remaining column is tentatively tried as the successor to this pair.
The column whose digram and trigram frequencies give the best match is
tentatively assumed to be correct. The predecessor column is found in the same
way. The entire process is continued until a potential ordering is found.
Chances are that the plaintext will be recognizable at this point (e.g., if milloin
occurs, it is clear what the error is).
Some transposition ciphers accept a
fixed-length block of input and produce a fixed-length block of output. These
ciphers can be completely described by giving a list telling the order in which
the characters are to be output. For example, the cipher of Fig. 8-3 can be seen as a 64 character block
cipher. Its output is 4, 12, 20, 28, 36, 44, 52, 60, 5, 13 , ... , 62. In other
words, the fourth input character, a, is the first to be output, followed by
the twelfth, f, and so on.
Constructing an unbreakable cipher
is actually quite easy; the technique has been known for decades. First choose
a random bit string as the key. Then convert the plaintext into a bit string,
for example by using its ASCII representation. Finally, compute the XOR
(eXclusive OR) of these two strings, bit by bit. The resulting ciphertext
cannot be broken, because in a sufficiently large sample of ciphertext, each
letter will occur equally often, as will every digram, every trigram, and so
on. This method, known as the one-time pad, is immune to all present and future
attacks no matter how much computational power the intruder has. The reason
derives from information theory: there is simply no information in the message
because all possible plaintexts of the given length are equally likely.
An example of how one-time pads are
used is given in Fig. 8-4. First, message 1, ''I love you.'' is
converted to 7-bit ASCII. Then a one-time pad, pad 1, is chosen and XORed with
the message to get the ciphertext. A cryptanalyst could try all possible
one-time pads to see what plaintext came out for each one. For example, the
one-time pad listed as pad 2 in the figure could be tried, resulting in
plaintext 2, ''Elvis lives'', which may or may not be plausible. In fact, for
every 11-character ASCII plaintext, there is a one-time pad that generates it.
That is what we mean by saying there is no information in the ciphertext: you can
get any message of the correct length out of it.
Figure 8-4. The use of a one-time pad for encryption and the
possibility of getting any possible plaintext from the ciphertext by the use of
some other pad.
One-time pads are great in theory
but have a number of disadvantages in practice. To start with, the key cannot
be memorized, so both sender and receiver must carry a written copy with them.
If either one is subject to capture, written keys are clearly undesirable.
Additionally, the total amount of data that can be transmitted is limited by
the amount of key available. If the spy strikes it rich and discovers a wealth
of data, he may find himself unable to transmit it back to headquarters because
the key has been used up. Another problem is the sensitivity of the method to
lost or inserted characters. If the sender and receiver get out of
synchronization, all data from then on will appear garbled.
With the advent of computers, the
one-time pad might potentially become practical for some applications. The
source of the key could be a special DVD that contains several gigabytes of
information and if transported in a DVD movie box and prefixed by a few minutes
of video, would not even be suspicious. Of course, at gigabit network speeds,
having to insert a new DVD every 30 sec could become tedious. And the DVDs must
be personally carried from the sender to the receiver before any messages can
be sent, which greatly reduces their practical utility.
Interestingly, there may be a
solution to the problem of how to transmit the one-time pad over the network,
and it comes from a very unlikely source: quantum mechanics. This area is still
experimental, but initial tests are promising. If it can be perfected and be
made efficient, virtually all cryptography will eventually be done using
one-time pads since they are provably secure. Below we will briefly explain how
this method, quantum cryptography, works. In particular, we will describe a
protocol called BB84 after its authors and publication year (Bennet and
Brassard, 1984).
A user, Alice, wants to establish a
one-time pad with a second user, Bob. Alice and Bob are called principals, the
main characters in our story. For example, Bob is a banker with whom Alice
would like to do business. The names ''Alice'' and ''Bob'' have been used for
the principals in virtually every paper and book on cryptography in the past
decade. Cryptographers love tradition. If we were to use ''Andy'' and
''Barbara'' as the principals, no one would believe anything in this chapter.
So be it.
If Alice and Bob could establish a
one-time pad, they could use it to communicate securely. The question is: How
can they establish it without previously exchanging DVDs? We can assume that
Alice and Bob are at opposite ends of an optical fiber over which they can send
and receive light pulses. However, an intrepid intruder, Trudy, can cut the
fiber to splice in an active tap. Trudy can read all the bits in both
directions. She can also send false messages in both directions. The situation
might seem hopeless for Alice and Bob, but quantum cryptography can shed some
new light on the subject.
Quantum cryptography is based on the
fact that light comes in little packets called photons, which have some
peculiar properties. Furthermore, light can be polarized by being passed
through a polarizing filter, a fact well known to both sunglasses wearers and
photographers. If a beam of light (i.e., a stream of photons) is passed through
a polarizing filter, all the photons emerging from it will be polarized in the
direction of the filter's axis (e.g., vertical). If the beam is now passed
through a second polarizing filter, the intensity of the light emerging from
the second filter is proportional to the square of the cosine of the angle
between the axes. If the two axes are perpendicular, no photons get through.
The absolute orientation of the two filters does not matter; only the angle
between their axes counts.
To generate a one-time pad, Alice
needs two sets of polarizing filters. Set one consists of a vertical filter and
a horizontal filter. This choice is called a rectilinear basis. A basis
(plural: bases) is just a coordinate system. The second set of filters is the
same, except rotated 45 degrees, so one filter runs from the lower left to the
upper right and the other filter runs from the upper left to the lower right.
This choice is called a diagonal basis. Thus, Alice has two bases, which she
can rapidly insert into her beam at will. In reality, Alice does not have four
separate filters, but a crystal whose polarization can be switched electrically
to any of the four allowed directions at great speed. Bob has the same equipment
as Alice. The fact that Alice and Bob each have two bases available is
essential to quantum cryptography.
For each basis, Alice now assigns
one direction as 0 and the other as 1. In the example presented below, we
assume she chooses vertical to be 0 and horizontal to be 1. Independently, she
also chooses lower left to upper right as 0 and upper left to lower right as 1.
She sends these choices to Bob as plaintext.
Now Alice picks a one-time pad, for
example based on a random number generator (a complex subject all by itself).
She transfers it bit by bit to Bob, choosing one of her two bases at random for
each bit. To send a bit, her photon gun emits one photon polarized
appropriately for the basis she is using for that bit. For example, she might choose
bases of diagonal, rectilinear, rectilinear, diagonal, rectilinear, etc. To
send her one-time pad of 1001110010100110 with these bases, she would send the
photons shown in Fig. 8-5(a). Given the one-time pad and the
sequence of bases, the polarization to use for each bit is uniquely determined.
Bits sent one photon at a time are called qubits.
Bob does not know which bases to
use, so he picks one at random for each arriving photon and just uses it, as
shown in Fig. 8-5(b). If he picks the correct basis, he
gets the correct bit. If he picks the incorrect basis, he gets a random bit
because if a photon hits a filter polarized at 45 degrees to its own
polarization, it randomly jumps to the polarization of the filter or to a
polarization perpendicular to the filter with equal probability. This property
of photons is fundamental to quantum mechanics. Thus, some of the bits are
correct and some are random, but Bob does not know which are which. Bob's
results are depicted in Fig. 8-5(c).
How does Bob find out which bases he
got right and which he got wrong? He simply tells Alice which basis he used for
each bit in plaintext and she tells him which are right and which are wrong in
plaintext, as shown in Fig. 8-5(d). From this information both of them
can build a bit string from the correct guesses, as shown in Fig. 8-5(e). On the average, this bit string will
be half the length of the original bit string, but since both parties know it,
they can use it as a one-time pad. All Alice has to do is transmit a bit string
slightly more than twice the desired length and she and Bob have a one-time pad
of the desired length. Problem solved.
But wait a minute. We forgot Trudy.
Suppose that she is curious about what Alice has to say and cuts the fiber,
inserting her own detector and transmitter. Unfortunately for her, she does not
know which basis to use for each photon either. The best she can do is pick one
at random for each photon, just as Bob does. An example of her choices is shown
in Fig. 8-5(f). When Bob later reports (in
plaintext) which bases he used and Alice tells him (in plaintext) which ones
are correct, Trudy now knows when she got it right and when she got it wrong.
In Fig. 8-5 she got it right for bits 0, 1, 2, 3, 4,
6, 8, 12, and 13. But she knows from Alice's reply in Fig. 8-5(d) that only bits 1, 3, 7, 8, 10, 11,
12, and 14 are part of the one-time pad. For four of these bits (1, 3, 8, and
12), she guessed right and captured the correct bit. For the other four (7, 10,
11, and 14) she guessed wrong and does not know the bit transmitted. Thus, Bob
knows the one-time pad starts with 01011001, from Fig. 8-5(e) but all Trudy has is 01?1??0?, from Fig. 8-5(g).
Of course, Alice and Bob are aware
that Trudy may have captured part of their one-time pad, so they would like to
reduce the information Trudy has. They can do this by performing a
transformation on it. For example, they could divide the one-time pad into
blocks of 1024 bits and square each one to form a 2048-bit number and use the
concatenation of these 2048-bit numbers as the one-time pad. With her partial
knowledge of the bit string transmitted, Trudy has no way to generate its square
and so has nothing. The transformation from the original one-time pad to a
different one that reduces Trudy's knowledge is called privacy amplification.
In practice, complex transformations in which every output bit depends on every
input bit are used instead of squaring.
Poor Trudy. Not only does she have
no idea what the one-time pad is, but her presence is not a secret either.
After all, she must relay each received bit to Bob to trick him into thinking
he is talking to Alice. The trouble is, the best she can do is transmit the
qubit she received, using the polarization she used to receive it, and about
half the time she will be wrong, causing many errors in Bob's one-time pad.
When Alice finally starts sending
data, she encodes it using a heavy forward-error-correcting code. From Bob's
point of view, a 1-bit error in the one-time pad is the same as a 1-bit
transmission error. Either way, he gets the wrong bit. If there is enough
forward error correction, he can recover the original message despite all the
errors, but he can easily count how many errors were corrected. If this number
is far more than the expected error rate of the equipment, he knows that Trudy
has tapped the line and can act accordingly (e.g., tell Alice to switch to a
radio channel, call the police, etc.). If Trudy had a way to clone a photon so
she had one photon to inspect and an identical photon to send to Bob, she could
avoid detection, but at present no way to clone a photon perfectly is known.
But even if Trudy could clone photons, the value of quantum cryptography to
establish one-time pads would not be reduced.
Although quantum cryptography has
been shown to operate over distances of 60 km of fiber, the equipment is
complex and expensive. Still, the idea has promise. For more information about
quantum cryptography, see (Mullins, 2002).
Although we will study many
different cryptographic systems in the pages ahead, two principles underlying
all of them are important to understand.
The first principle is that all
encrypted messages must contain some redundancy, that is, information not
needed to understand the message. An example may make it clear why this is
needed. Consider a mail-order company, The Couch Potato (TCP), with 60,000
products. Thinking they are being very efficient, TCP's programmers decide that
ordering messages should consist of a 16-byte customer name followed by a
3-byte data field (1 byte for the quantity and 2 bytes for the product number).
The last 3 bytes are to be encrypted using a very long key known only by the
customer and TCP.
At first this might seem secure, and
in a sense it is because passive intruders cannot decrypt the messages.
Unfortunately, it also has a fatal flaw that renders it useless. Suppose that a
recently-fired employee wants to punish TCP for firing her. Just before
leaving, she takes the customer list with her. She works through the night
writing a program to generate fictitious orders using real customer names.
Since she does not have the list of keys, she just puts random numbers in the
last 3 bytes, and sends hundreds of orders off to TCP.
When these messages arrive, TCP's
computer uses the customer's name to locate the key and decrypt the message.
Unfortunately for TCP, almost every 3-byte message is valid, so the computer
begins printing out shipping instructions. While it might seem odd for a
customer to order 837 sets of children's swings or 540 sandboxes, for all the
computer knows, the customer might be planning to open a chain of franchised
playgrounds. In this way an active intruder (the ex-employee) can cause a
massive amount of trouble, even though she cannot understand the messages her
computer is generating.
This problem can be solved by the
addition of redundancy to all messages. For example, if order messages are
extended to 12 bytes, the first 9 of which must be zeros, then this attack no
longer works because the ex-employee can no longer generate a large stream of
valid messages. The moral of the story is that all messages must contain
considerable redundancy so that active intruders cannot send random junk and
have it be interpreted as a valid message.
However, adding redundancy also
makes it easier for cryptanalysts to break messages. Suppose that the mail
order business is highly competitive, and The Couch Potato's main competitor,
The Sofa Tuber, would dearly love to know how many sandboxes TCP is selling.
Consequently, they have tapped TCP's telephone line. In the original scheme
with 3-byte messages, cryptanalysis was nearly impossible, because after
guessing a key, the cryptanalyst had no way of telling whether the guess was
right. After all, almost every message is technically legal. With the new
12-byte scheme, it is easy for the cryptanalyst to tell a valid message from an
invalid one. Thus, we have
Cryptographic principle 1: Messages
must contain some redundancy
In other words, upon decrypting a
message, the recipient must be able to tell whether it is valid by simply
inspecting it and perhaps performing a simple computation. This redundancy is
needed to prevent active intruders from sending garbage and tricking the
receiver into decrypting the garbage and acting on the ''plaintext.'' However,
this same redundancy makes it much easier for passive intruders to break the
system, so there is some tension here. Furthermore, the redundancy should never
be in the form of n zeros at the start or end of a message, since running such
messages through some cryptographic algorithms gives more predictable results,
making the cryptanalysts' job easier. A CRC polynomial is much better than a
run of 0s since the receiver can easily verify it, but it generates more work
for the cryptanalyst. Even better is to use a cryptographic hash, a concept we
will explore later.
Getting back to quantum cryptography
for a moment, we can also see how redundancy plays a role there. Due to Trudy's
interception of the photons, some bits in Bob's one-time pad will be wrong. Bob
needs some redundancy in the incoming messages to determine that errors are
present. One very crude form of redundancy is repeating the message two times.
If the two copies are not identical, Bob knows that either the fiber is very
noisy or someone is tampering with the transmission. Of course, sending
everything twice is overkill; a Hamming or Reed-Solomon code is a more
efficient way to do error detection and correction. But it should be clear that
some redundancy is needed to distinguish a valid message from an invalid
message, especially in the face of an active intruder.
The second cryptographic principle
is that some measures must be taken to ensure that each message received can be
verified as being fresh, that is, sent very recently. This measure is needed to
prevent active intruders from playing back old messages. If no such measures
were taken, our ex-employee could tap TCP's phone line and just keep repeating
previously sent valid messages. Restating this idea we get:
Cryptographic principle 2: Some
method is needed to foil replay attacks
One such
measure is including in every message a timestamp valid only for, say, 10
seconds. The receiver can then just keep messages around for 10 seconds, to
compare newly arrived messages to previous ones to filter out duplicates.
Messages older than 10 seconds can be thrown out, since any replays sent more
than 10 seconds later will be rejected as too old. Measures other than
timestamps will be discussed later.
No comments:
Post a Comment
silahkan membaca dan berkomentar