8.4
Digital Signatures
The authenticity of many legal,
financial, and other documents is determined by the presence or absence of an
authorized handwritten signature. And photocopies do not count. For
computerized message systems to replace the physical transport of paper and ink
documents, a method must be found to allow documents to be signed in an
unforgeable way.
The problem of devising a
replacement for handwritten signatures is a difficult one. Basically, what is
needed is a system by which one party can send a signed message to another
party in such a way that the following conditions hold:
- The receiver can verify the claimed identity of the sender.
- The sender cannot later repudiate the contents of the message.
- The receiver cannot possibly have concocted the message himself.
The first requirement is needed, for
example, in financial systems. When a customer's computer orders a bank's
computer to buy a ton of gold, the bank's computer needs to be able to make
sure that the computer giving the order really belongs to the company whose
account is to be debited. In other words, the bank has to authenticate the
customer (and the customer has to authenticate the bank).
The second requirement is needed to
protect the bank against fraud. Suppose that the bank buys the ton of gold, and
immediately thereafter the price of gold drops sharply. A dishonest customer
might sue the bank, claiming that he never issued any order to buy gold. When
the bank produces the message in court, the customer denies having sent it. The
property that no party to a contract can later deny having signed it is called nonrepudiation.
The digital signature schemes that we will now study help provide it.
The third requirement is needed to
protect the customer in the event that the price of gold shoots up and the bank
tries to construct a signed message in which the customer asked for one bar of
gold instead of one ton. In this fraud scenario, the bank just keeps the rest
of the gold for itself.
One approach to digital signatures
is to have a central authority that knows everything and whom everyone trusts,
say Big Brother (BB). Each user then chooses a secret key and carries it by
hand to BB's office. Thus, only Alice and BB know Alice's secret key, KA,
and so on.
When Alice wants to send a signed
plaintext message, P, to her banker, Bob, she generates KA(B, RA,
t, P), where B is Bob's identity, RA is a random number chosen by
Alice, t is a timestamp to ensure freshness, and KA(B, RA,
t, P) is the message encrypted with her key, KA. Then she sends it
as depicted in Fig. 8-18. BB sees that the message is from
Alice, decrypts it, and sends a message to Bob as shown. The message to Bob
contains the plaintext of Alice's message and also the signed message KBB
(A, t, P). Bob now carries out Alice's request.
What happens if Alice later denies
sending the message? Step 1 is that everyone sues everyone (at least, in the
United States). Finally, when the case comes to court and Alice vigorously
denies sending Bob the disputed message, the judge will ask Bob how he can be
sure that the disputed message came from Alice and not from Trudy. Bob first
points out that BB will not accept a message from Alice unless it is encrypted
with KA, so there is no possibility of Trudy sending BB a false
message from Alice without BB detecting it immediately.
Bob then dramatically produces
Exhibit A: KBB (A, t, P). Bob says that this is a message signed by BB
which proves Alice sent P to Bob. The judge then asks BB (whom everyone trusts)
to decrypt Exhibit A. When BB testifies that Bob is telling the truth, the
judge decides in favor of Bob. Case dismissed.
One potential problem with the
signature protocol of Fig. 8-18 is Trudy replaying either message. To
minimize this problem, timestamps are used throughout. Furthermore, Bob can
check all recent messages to see if RA was used in any of them. If
so, the message is discarded as a replay. Note that based on the timestamp, Bob
will reject very old messages. To guard against instant replay attacks, Bob
just checks the RA of every incoming message to see if such a
message has been received from Alice in the past hour. If not, Bob can safely
assume this is a new request.
A structural problem with using
symmetric-key cryptography for digital signatures is that everyone has to agree
to trust Big Brother. Furthermore, Big Brother gets to read all signed
messages. The most logical candidates for running the Big Brother server are
the government, the banks, the accountants, and the lawyers. Unfortunately,
none of these organizations inspire total confidence in all citizens. Hence, it
would be nice if signing documents did not require a trusted authority.
Fortunately, public-key cryptography
can make an important contribution in this area. Let us assume that the
public-key encryption and decryption algorithms have the property that E(D(P))
= P in addition, of course, to the usual property that D(E(P)) = P. (RSA has
this property, so the assumption is not unreasonable.) Assuming that this is
the case, Alice can send a signed plaintext message, P, to Bob by transmitting EB(DA(P)).
Note carefully that Alice knows her own (private) key, DA, as well
as Bob's public key, EB, so constructing this message is something
Alice can do.
When Bob receives the message, he
transforms it using his private key, as usual, yielding DA(P), as
shown in Fig. 8-19. He stores this text in a safe place
and then applies EA to get the original plaintext.
To see how the signature property
works, suppose that Alice subsequently denies having sent the message P to Bob.
When the case comes up in court, Bob can produce both P and DA(P).
The judge can easily verify that Bob indeed has a valid message encrypted by DA
by simply applying EA to it. Since Bob does not know what Alice's
private key is, the only way Bob could have acquired a message encrypted by it
is if Alice did indeed send it. While in jail for perjury and fraud, Alice will
have plenty of time to devise interesting new public-key algorithms.
Although using public-key
cryptography for digital signatures is an elegant scheme, there are problems
that are related to the environment in which they operate rather than with the
basic algorithm. For one thing, Bob can prove that a message was sent by Alice
only as long as DA remains secret. If Alice discloses her secret
key, the argument no longer holds, because anyone could have sent the message,
including Bob himself.
The problem might arise, for
example, if Bob is Alice's stockbroker. Alice tells Bob to buy a certain stock
or bond. Immediately thereafter, the price drops sharply. To repudiate her
message to Bob, Alice runs to the police claiming that her home was burglarized
and the PC holding her key was stolen. Depending on the laws in her state or
country, she may or may not be legally liable, especially if she claims not to
have discovered the break-in until getting home from work, several hours later.
Another problem with the signature
scheme is what happens if Alice decides to change her key. Doing so is clearly
legal, and it is probably a good idea to do so periodically. If a court case
later arises, as described above, the judge will apply the current EA
to DA(P) and discover that it does not produce P. Bob will look
pretty stupid at this point.
In principle, any public-key
algorithm can be used for digital signatures. The de facto industry standard is
the RSA algorithm. Many security products use it. However, in 1991, NIST
proposed using a variant of the El Gamal public-key algorithm for their new Digital
Signature Standard (DSS). El Gamal gets its security from the difficulty of
computing discrete logarithms, rather than from the difficulty of factoring
large numbers.
As usual when the government tries
to dictate cryptographic standards, there was an uproar. DSS was criticized for
being
- Too secret (NSA designed the protocol for using El Gamal).
- Too slow (10 to 40 times slower than RSA for checking signatures).
- Too new (El Gamal had not yet been thoroughly analyzed).
- Too insecure (fixed 512-bit key).
In a subsequent revision, the fourth
point was rendered moot when keys up to 1024 bits were allowed. Nevertheless,
the first two points remain valid.
One criticism of signature methods
is that they often couple two distinct functions: authentication and secrecy.
Often, authentication is needed but secrecy is not. Also, getting an export
license is often easier if the system in question provides only authentication
but not secrecy. Below we will describe an authentication scheme that does not
require encrypting the entire message.
This scheme is based on the idea of
a one-way hash function that takes an arbitrarily long piece of plaintext and
from it computes a fixed-length bit string. This hash function, MD, often
called a message digest, has four important properties:
- Given P, it is easy to compute MD(P).
- Given MD(P), it is effectively impossible to find P.
- Given P no one can find P' such that MD (P') = MD(P).
- A change to the input of even 1 bit produces a very different output.
To meet criterion 3, the hash should
be at least 128 bits long, preferably more. To meet criterion 4, the hash must
mangle the bits very thoroughly, not unlike the symmetric-key encryption
algorithms we have seen.
Computing a message digest from a
piece of plaintext is much faster than encrypting that plaintext with a public-key
algorithm, so message digests can be used to speed up digital signature
algorithms. To see how this works, consider the signature protocol of Fig. 8-18 again. Instead of signing P with KBB
(A, t, P), BB now computes the message digest by applying MD to P, yielding MD(P).
BB then encloses KBB (A, t, MD(P)) as the fifth item in the list
encrypted with KB that is sent to Bob, instead of KBB (A,
t, P).
If a dispute arises, Bob can produce
both P and KBB (A, t, MD(P)). After Big Brother has decrypted it for
the judge, Bob has MD(P), which is guaranteed to be genuine, and the alleged P.
However, since it is effectively impossible for Bob to find any other message
that gives this hash, the judge will easily be convinced that Bob is telling
the truth. Using message digests in this way saves both encryption time and
message transport costs.
Message digests work in public-key
cryptosystems, too, as shown in Fig. 8-20. Here, Alice first computes the message
digest of her plaintext. She then signs the message digest and sends both the
signed digest and the plaintext to Bob. If Trudy replaces P underway, Bob will
see this when he computes MD(P) himself.
A variety of message digest
functions have been proposed. The most widely used ones are MD5 (Rivest, 1992)
and SHA-1 (NIST, 1993). MD5 is the fifth in a series of message digests designed
by Ronald Rivest. It operates by mangling bits in a sufficiently complicated
way that every output bit is affected by every input bit. Very briefly, it
starts out by padding the message to a length of 448 bits (modulo 512). Then
the original length of the message is appended as a 64-bit integer to give a
total input whose length is a multiple of 512 bits. The last precomputation
step is initializing a 128-bit buffer to a fixed value.
Now the computation starts. Each
round takes a 512-bit block of input and mixes it thoroughly with the 128-bit
buffer. For good measure, a table constructed from the sine function is also
thrown in. The point of using a known function like the sine is not because it
is more random than a random number generator, but to avoid any suspicion that
the designer built in a clever back door through which only he can enter.
Remember that IBM's refusal to disclose the principles behind the design of the
S-boxes in DES led to a great deal of speculation about back doors. Rivest wanted
to avoid this suspicion. Four rounds are performed per input block. This
process continues until all the input blocks have been consumed. The contents
of the 128-bit buffer form the message digest.
MD5 has been around for over a
decade now, and many people have attacked it. Some vulnerabilities have been
found, but certain internal steps prevent it from being broken. However, if the
remaining barriers within MD5 fall, it may eventually fail. Nevertheless, at
the time of this writing, it was still standing.
The other major message digest
function is SHA-1 (Secure Hash Algorithm 1), developed by NSA and blessed by
NIST in FIPS 180-1. Like MD5, SHA-1 processes input data in 512-bit blocks,
only unlike MD5, it generates a 160-bit message digest. A typical way for Alice
to send a nonsecret but signed message to Bob is illustrated in Fig. 8-21. Here her plaintext message is fed into
the SHA-1 algorithm to get a 160-bit SHA-1 hash. Alice then signs the hash with
her RSA private key and sends both the plaintext message and the signed hash to
Bob.
After receving the message, Bob
computes the SHA-1 hash himself and also applies Alice's public key to the
signed hash to get the original hash, H. If the two agree, the message is
considered valid. Since there is no way for Trudy to modify the (plaintext)
message while its is in transit and produce a new one that hashes to H, Bob can
easily detect any changes Trudy has made to the message. For messages whose
integrity is important but whose contents are not secret, the scheme of Fig. 8-21 is widely used. For a relatively small
cost in computation, it guarantees that any modifications made to the plaintext
message in transit can be detected with very high probability.
Now let us briefly see how SHA-1
works. It starts out by padding the message by adding a 1 bit to the end,
followed by as many 0 bits as are needed to make the length a multiple of 512
bits. Then a 64-bit number containing the message length before padding is ORed
into the low-order 64 bits. In Fig. 8-22, the message is shown with padding on
the right because English text and figures go from left to right (i.e., the
lower right is generally perceived as the end of the figure). With computers,
this orientation corresponds to big-endian machines such as the SPARC, but
SHA-1 always pads the end of the message, no matter which endian machine is
used.
Figure 8-22. (a) A message padded
out to a multiple of 512 bits. (b) The output variables. (c) The word array.
During the computation, SHA-1
maintains five 32-bit variables, H0 through H4, where the
hash accumulates. These are shown in Fig. 8-22(b). They are initialized to constants
specified in the standard.
Each of the blocks M0
through Mn -1 is now processed in turn. For the current
block, the 16 words are first copied into the start of an auxiliary 80-word
array, W, as shown in Fig. 8-22(c). Then the other 64 words in W are
filled in using the formula
where Sb(W) represents
the left circular rotation of the 32-bit word, W, by b bits. Now five scratch
variables, A through E are initialized from H0 through H4,
respectively.
The actual calculation can be
expressed in pseudo-C as
for
(i = 0; i < 80; i++) {
temp = S5(A) + fi (B,
C, D) + E + Wi +Ki;
E=D; D=C; C=S30(B); B = A; A =
temp;
}
where the Ki constants
are defined in the standard. The mixing functions fi are defined as
When all 80 iterations of the loop
are completed, A through E are added to H 0 through H 4,
respectively.
Now that the first 512-bit block has
been processed, the next one is started. The W array is reinitialized from the
new block, but H is left as it was. When this block is finished, the next one
is started, and so on, until all the 512-bit message blocks have been tossed
into the soup. When the last block has been finished, the five 32-bit words in
the H array are output as the 160-bit cryptographic hash. The complete C code
for SHA-1 is given in RFC 3174.
New versions of SHA-1 are under
development for hashes of 256, 384, and 512 bits, respectively.
In the world of crypto, nothing is
ever what it seems to be. One might think that it would take on the order of 2m
operations to subvert an m-bit message digest. In fact, 2m/2
operations will often do using the birthday attack, an approach published by
Yuval (1979) in his now-classic paper ''How to Swindle Rabin.''
The idea for this attack comes from
a technique that math professors often use in their probability courses. The
question is: How many students do you need in a class before the probability of
having two people with the same birthday exceeds 1/2? Most students expect the
answer to be way over 100. In fact, probability theory says it is just 23.
Without giving a rigorous analysis, intuitively, with 23 people, we can form
(23 x 22)/2 = 253 different pairs, each of which has a probability of 1/365 of
being a hit. In this light, it is not really so surprising any more.
More generally, if there is some
mapping between inputs and outputs with n inputs (people, messages, etc.) and k
possible outputs (birthdays, message digests, etc.), there are n(n - 1)/2 input
pairs. If n(n - 1)/2 >k, the chance of having at least one match is pretty
good. Thus, approximately, a match is likely for . This result
means that a 64-bit message digest can probably be broken by generating about 232
messages and looking for two with the same message digest.
Let us look at a practical example.
The Department of Computer Science at State University has one position for a
tenured faculty member and two candidates, Tom and Dick. Tom was hired two
years before Dick, so he goes up for review first. If he gets it, Dick is out
of luck. Tom knows that the department chairperson, Marilyn, thinks highly of
his work, so he asks her to write him a letter of recommendation to the Dean,
who will decide on Tom's case. Once sent, all letters become confidential.
Marilyn tells her secretary, Ellen,
to write the Dean a letter, outlining what she wants in it. When it is ready,
Marilyn will review it, compute and sign the 64-bit digest, and send it to the
Dean. Ellen can send the letter later by e-mail.
Unfortunately for Tom, Ellen is
romantically involved with Dick and would like to do Tom in, so she writes the
letter below with the 32 bracketed options.
Dear Dean Smith,
This [letter | message] is to give
my [honest | frank] opinion of Prof. Tom Wilson, who is [a candidate | up] for
tenure [now | this year]. I have [known | worked with] Prof. Wilson for [about
| almost] six years. He is an [outstanding | excellent] researcher of great [talent
| ability] known [worldwide | internationally] for his [brilliant | creative]
insights into [many | a wide variety of] [difficult | challenging] problems.
He is also a [highly | greatly] [respected
| admired] [teacher | educator]. His students give his [classes | courses] [rave
| spectacular] reviews. He is [our | the Department's] [most popular |
best-loved] [teacher | instructor].
[In addition | Additionally] Prof.
Wilson is a [gifted | effective] fund raiser. His [grants | contracts] have
brought a [large | substantial] amount of money into [the | our] Department. [This
money has | These funds have] [enabled | permitted] us to [pursue | carry out]
many [special | important] programs, [such as | for example] your State 2000
program. Without these funds we would [be unable | not be able] to continue
this program, which is so [important | essential] to both of us. I strongly
urge you to grant him tenure.
Unfortunately for Tom, as soon as
Ellen finishes composing and typing in this letter, she also writes a second
one:
Dear Dean Smith,
This [letter | message] is to give
my [honest | frank] opinion of Prof. Tom Wilson, who is [a candidate | up] for
tenure [now | this year]. I have [known | worked with] Tom for [about | almost]
six years. He is a [poor | weak] researcher not well known in his [field | area].
His research [hardly ever | rarely] shows [insight in | understanding of] the [key
| major] problems of [the | our] day.
Furthermore, he is not a [respected
| admired] [teacher | educator]. His students give his [classes | courses] [poor
| bad ] reviews. He is [our | the Department's] least popular [teacher | instructor],
known [mostly | primarily] within [the | our] Department for his [tendency | propensity]
to [ridicule | embarrass] students [foolish | imprudent] enough to ask
questions in his classes.
[In addition | Additionally] Tom is
a [poor | marginal] fund raiser. His [grants | contracts] have brought only a [meager
| insignificant] amount of money into [the | our] Department. Unless new [money
is | funds are] quickly located, we may have to cancel some essential programs,
such as your State 2000 program. Unfortunately, under these [conditions | circumstances]
I cannot in good [conscience | faith] recommend him to you for [tenure | a
permanent position].
Now Ellen programs her computer to
compute the 232 message digests of each letter overnight. Chances
are, one digest of the first letter will match one digest of the second letter.
If not, she can add a few more options and try again during the weekend.
Suppose that she finds a match. Call the ''good'' letter A and the ''bad'' one B.
Ellen now e-mails letter A to
Marilyn for her approval. Letter B she keeps completely secret, showing it to
no one. Marilyn, of course, approves, computes her 64-bit message digest, signs
the digest, and e-mails the signed digest off to Dean Smith. Independently,
Ellen e-mails letter B to the Dean (not letter A, as she is supposed to).
After getting the letter and signed
message digest, the Dean runs the message digest algorithm on letter B, sees
that it agrees with what Marilyn sent him, and fires Tom. The Dean does not
realize that Ellen managed to generate two letters with the same message digest
and sent her a different one than Marilyn saw and approved. (Optional ending:
Ellen tells Dick what she did. Dick is appalled and breaks off with her. Ellen
is furious and confesses to Marilyn. Marilyn calls the Dean. Tom gets tenure
after all.) With MD5 the birthday attack is difficult because even at 1 billion
digests per second, it would take over 500 years to compute all 264
digests of two letters with 64 variants each, and even then a match is not
guaranteed. Of course, with 5000 computers working in parallel, 500 years
becomes 5 weeks. SHA-1 is better (because it is longer).
No comments:
Post a Comment
silahkan membaca dan berkomentar