Encryption for Privacy
by James Adrian
Many individuals, businesses
and other organizations have a legitimate and lawful need to keep their intellectual property and many other sorts of information confidential.
Because email has become vital to our work and because email is not secure, encryption is often necessary.
Organizations and officials often attempt to limit the use of encryption. By citing terrorism and other justifications, lawmakers in the United States
have reinterpreted the Fourth Amendment to the U.S. Constitution, and effectively modified it. The email messages sent and received by American
citizens can now be copied and saved indefinitely in case they might help justify a criminal indictment someday in the future. Perhaps many people
are afraid enough of terrorism to give up some aspects of their constitutional rights, but many are not.
According to Cryptography and Liberty 1999, there are no laws restricting the private use of
cryptography in Canada and there are no domestic use or import controls on cryptography within the United States; although, there are restrictions
on export to certain countries.
Restrictions on domestic use
and export were not always as fair as they are today. In the case Bernstein v. Department of Justice in May of 1999, a federal appeals court affirmed
the judgment of a lower court, holding that the Export Administration Regulations unconstitutionally limited the freedom to distribute encryption
software. The court said the following:
The government defendants
appeal the grant of summary judgment to the plaintiff, Professor Daniel J. Bernstein ("Bernstein"), enjoining the enforcement of certain Export
Administration Regulations ("EAR") that limit Bernstein's ability to distribute encryption software. We find that the EAR regulations operate as
a prepublication licensing scheme that burdens scientific expression, vest boundless discretion in government officials, and lack adequate
procedural safeguards. Consequently, we hold that the challenged regulations constitute a prior restraint on speech that offends the First Amendment.
Although we employ a somewhat narrower rationale than did the district court, its judgment is accordingly affirmed.
The lifting of unreasonable
restrictions might not have taken place were it not for the work of many organizations advocating privacy.
The opponents of encryption
would seem to have taken another tack. I am continually amazed by the encryption propaganda and manipulation that have been introduced into our
culture. In addition to the many government-created, government-promoted encryption standards that can be decrypted by hackers, there
is a discouraging narrative that has been repeated by several news sources, commentators, movies and sit-com episodes. The theory presented is that
no cryptic message can truly facilitate privacy. They say "What one man can assemble, another can disassemble." This is provably, absolutely false.
I will describe the unbreakable One-Time Pad
and its proper use.
The One-Time Pad
Information conveyed over the
Internet is represented by numbers whose digits include only 0 and 1. Each digit is called a bit. An "E" is a string of 8 bits equal to 01000101.
A memory cell containing 8 bits is called a byte. There is a unique byte code for each character that we use in our email messages.
The purpose of encryption is to
create as much uncertainty as possible as to which character was present in the message before it was encrypted. The one-time pad does this by adding
(without carry) a separate and randomly chosen bit to each bit in the message. Like the regular operations of adding or subtracting, add without
carry is an operation. Adding bits A and B, the following sums are created:
A B Sum
0 0 0
0 1 1
1 0 1
1 1 0
Notice that the sum of 1 and 1
in binary is 10, but the carry bit is not included in the sum by the add-without-carry operation. Only the least significant bit of the sum is used (in this case, 0).
The message is called the plaintext. The encrypted message which is to be sent and received is called the ciphertext. The bits that are used to encrypt the
message come from a store of random bits called the pad. The pad has a uniform distribution of 1's and 0's. Here is the way an
encryption is usually written:
Plaintext 0 1 0 0 0 1 0 1 = E
Bits from Pad 0 0 1 0 1 1 1 1
Ciphertext 0 1 1 0 1 0 1 0
In this case, the message is the
capital letter E. The random bits of the pad together with the add-without-carry operation produce an entirely arbitrary character code.
The receiver of the message
decrypts it by using the same string from the secret pad shared with the sender:
Ciphertext 0 1 1 0 1 0 1 0
Bits from Pad 0 0 1 0 1 1 1 1
Plaintext 0 1 0 0 0 1 0 1 = E
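The encryption and decryption shown above can be reproduced in a few lines of Python. This is only a minimal sketch: the function name xor_bits is my own choice for illustration, and the pad bits are simply the ones used in the example, not a real pad.

```python
def xor_bits(a: str, b: str) -> str:
    """Add two equal-length bit strings without carry, one bit at a time."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    # Two bits sum to 1 (without carry) exactly when they differ.
    return "".join("1" if x != y else "0" for x, y in zip(a, b))


plaintext = format(ord("E"), "08b")         # byte code of "E" as 8 bits
pad_bits = "00101111"                       # pad bits from the example above
ciphertext = xor_bits(plaintext, pad_bits)  # sender's step
recovered = xor_bits(ciphertext, pad_bits)  # receiver's step, same pad bits

print(ciphertext)              # 01101010
print(chr(int(recovered, 2)))  # E
```

Applying the same pad bits a second time undoes the first application, which is why the sender and receiver need nothing more than the shared pad.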
Notice that the same
numbers used to encrypt the message are used to decrypt the message. This is the characteristic shared by symmetric-key algorithms. In the case of the one-time pad, a string from the pad is used for these purposes, but for other algorithms and in general,
such a string is called the key.
The add-without-carry operation is usually called the Exclusive-Or operation. The sum of A and B is 1 if either A or B is 1 unless both are 1. If A and B
are both 1, the sum is 0. (The case of two 1's is excluded.) The exclusive-or operation is notated as XOR. A XOR B = C (a ciphertext bit).
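Most programming languages provide this operation directly. As a small check, assuming Python and its ^ operator on integers, the sketch below confirms that adding two bits and keeping only the least significant bit gives the same result as XOR for all four combinations. The function name add_without_carry is mine, chosen for illustration.

```python
def add_without_carry(a: int, b: int) -> int:
    """Add two bits and discard the carry, keeping the least significant bit."""
    return (a + b) % 2


for a in (0, 1):
    for b in (0, 1):
        # Python's ^ operator is bitwise exclusive-or.
        assert add_without_carry(a, b) == a ^ b
        print(a, b, a ^ b)
```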
Claude Shannon proved that
the one-time pad is unbreakable provided that the key is truly random, it is shared only by the sender and receiver, and no string of numbers from the
pad is used twice - thus the name, One-Time Pad.
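The rule that no part of the pad is used twice is easy to enforce in software by remembering how much of the pad has been consumed. The following is a sketch only. The class OneTimePad and the function xor_bytes are hypothetical names of my own, and secrets.token_bytes is just a convenient stand-in for the shared pad: it draws from the operating system's random source, which I am not claiming meets the "truly random" condition of Shannon's proof.

```python
import secrets


class OneTimePad:
    """Hypothetical helper that hands out each pad byte at most once."""

    def __init__(self, pad: bytes):
        self._pad = pad
        self._offset = 0  # index of the first pad byte not yet used

    def take(self, n: int) -> bytes:
        if self._offset + n > len(self._pad):
            raise ValueError("pad exhausted; pad bytes must never be reused")
        chunk = self._pad[self._offset:self._offset + n]
        self._offset += n  # these bytes are now spent for good
        return chunk


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the corresponding key byte."""
    return bytes(d ^ k for d, k in zip(data, key))


# Stand-in for a pad delivered to both parties ahead of time (assumption).
shared_pad = secrets.token_bytes(1024)
sender = OneTimePad(shared_pad)
receiver = OneTimePad(shared_pad)

message = "Meet at noon.".encode("ascii")
ciphertext = xor_bytes(message, sender.take(len(message)))
recovered = xor_bytes(ciphertext, receiver.take(len(ciphertext)))
print(recovered.decode("ascii"))
```

Both parties advance through the pad in step, so the next message is encrypted with bytes that have never been used before.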
I recommend the book
"Claude Elwood Shannon - Collected Papers" edited by N. J. A. Sloane and Aaron D. Wyner.
So Why Are We Not Secure In Our Papers?
There are articles all over the
Internet about the one-time pad. They quickly praise its security and then virtually all of them go on to bemoan the inconvenience of sharing a secret
pad. Imagine the indignity of handing off a big box of duplicated data in a restaurant or on a quiet street. It better be a big box. I'm not in Casablanca
very often. This reminds me of a famous quip by Yogi Berra:
Nobody goes there anymore. It's too crowded.
What does the military or
the State Department do? This is the age of the Internet and of DVD's, and an age in which 10% or more of the population is smart enough to invent
things they have never seen or heard of. (That is not to say they are all trying hard and cooperating.) A large fraction of the population has access to
a computer both in their home and at work. Pseudorandom numbers are getting less pseudo every day. UPS and armoured car services are very
affordable. The same secret pseudorandom numbers can be generated in two places at once. This is not the 1600's.
The bias against the use of
the one-time pad has been effective in diminishing the number of customers looking for it. This is despite the fact that variations on the one-time
pad that might arguably provide somewhat less than absolute security can be far more secure than the government-promoted encryption programs.
Besides, shipping DVD's
works. By one estimate,
the average size of an email message is 75 KB. That includes all of the unnecessary quoting of the entire thread, and it includes advertising
and newsletters sent to you, and large attached pictures and videos - everything. If you encrypted and sent that much ten times per day to your Philadelphia
office, you would use up less than 274 megabytes per year on that secure connection. A single DVD is about 16 times that size. 20 DVD's might last
you until you retire.
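The arithmetic behind these figures is short enough to check. In the sketch below, the 75 KB average and the ten messages per day come from the text; the 4,700 MB capacity of a single-layer DVD and the use of decimal units (1 MB = 1,000 KB) are my assumptions.

```python
avg_message_kb = 75      # average message size from the text
messages_per_day = 10    # usage assumed in the text
days_per_year = 365

kb_per_year = avg_message_kb * messages_per_day * days_per_year
mb_per_year = kb_per_year / 1000   # decimal megabytes (assumption)

dvd_capacity_mb = 4700             # single-layer DVD (assumption)
years_per_dvd = dvd_capacity_mb / mb_per_year

print(kb_per_year)                 # 273750
print(round(mb_per_year, 2))       # 273.75
print(round(years_per_dvd, 1))
```

Under these assumptions one DVD covers well over a decade of pad material, in the same neighborhood as the ratio given above; the exact figure shifts a little with the units and disc capacity chosen.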
Is It Random?
If every 1 in a file is immediately followed by a 0, and every 0 is immediately followed by a 1, there is an obvious pattern (1 0 1 0 1 0 . . . ). On the other hand, if in a long file, a 1 is followed just as often by a 1 as by a 0, and a 0 is followed just as often by a 0 as by a 1, we need to look further if a pattern is to be found. In such a case, a pattern (at a distance of a single bit place along the succession of bits) has not been found. If we find this to be true or false by using arithmetic, we call the calculation a correlation. If this correlation is equal to zero, the pattern we have tested for has not been found. Correlations far from zero, whether positive or negative, indicate some degree of pattern.
The arithmetic needs to be as simple as possible because it needs to be repeated for different distances along the string of bits and for different combinations of distances along that string. These add up to a great many calculations. The best way to do this is to change all of the zeros to minus ones so that the two members of each pair at a given distance can be multiplied together to obtain either 1 or -1. The sum of these products is a correlation. Here are two examples where N is the length of the string in bits:
The sum of (x_i x_{i+1}) as i goes from 1 to N-1. This correlates pairs at a distance of 1.
The sum of (x_i x_{i+2}) as i goes from 1 to N-2. This correlates pairs at a distance of 2.
To correlate pairs at every possible distance, take the sum of (x_i x_{i+j}) as i goes from 1 to N-j for every j from 1 to N-1. A separate correlation is calculated for each j. If each of these N-1 correlations is very near zero and N is large, we might be encouraged to believe that the string is random, but then all the triples and larger combinations of fixed distances must be addressed, such as the sum of (x_i x_{i+a} x_{i+28}), for some intermediate distance a less than 28, as i goes from 1 to N-28.
For correlations such as these, where a degree of correlation is assigned to a string without relating it to another string, the term becomes autocorrelation. So what we have here is a separate autocorrelation being calculated on each of very many subsets of the file of bits. Only if every such calculation is very near zero do we say that the file is random.
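The pairwise calculation described above can be sketched in Python as follows. The function name autocorrelation is my own, and the sketch covers only pairs at a single distance, not the triples and larger combinations. The alternating string is the obvious pattern mentioned earlier; note that its correlation at a distance of 1 is strongly negative, which is just as much a sign of pattern as a strongly positive one.

```python
def autocorrelation(bits, distance):
    """Sum of x_i * x_(i+distance), with bit 0 mapped to -1 and bit 1 to +1."""
    x = [1 if b else -1 for b in bits]
    return sum(x[i] * x[i + distance] for i in range(len(x) - distance))


alternating = [1, 0] * 8  # 1 0 1 0 ... (16 bits)

print(autocorrelation(alternating, 1))  # -15: every neighboring pair differs
print(autocorrelation(alternating, 2))  # 14: every pair two apart matches

# One correlation for every distance j from 1 to N-1.
all_distances = [autocorrelation(alternating, j)
                 for j in range(1, len(alternating))]
print(all_distances)
```

For a string with no pattern at a given distance, the sum should stay small compared with the number of pairs that went into it.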
The number of calculations
is truly enormous. For this reason, there are many estimates of randomness available in the form of perhaps complex but nonetheless inadequate
formulas. Trusting in those which do not examine all of the distance combinations is a big mistake.
Two-Time Pads Don't Work
If all of your algorithm is
published, and your algorithm uses part of a pad string twice to encrypt two pieces of plaintext, the hacker has an advantage. If a hacker knows that
your correctly spelled and grammatically correct messages are, in some known instances, encrypted by the same string, the following will take place:
I will meet you at the usual place tomorrow at the usual time.
I cannot believe you were not there when I was!
Each character is an 8-bit
string (the byte code). Both messages start with the capital letter, I, and a space following it. That's 16 bits in a row that are the same in the two
plaintext messages. Where both the plaintext and the pad strings are the same, the ciphertext will be the same. The hacker is looking at the
ciphertext. Where whole bytes are the same, the same characters are being spelled out in both plaintext messages. Even where the bytes differ,
there are plenty of places within each byte where the bits can be known to be either the same or different. This narrows
the possibilities and gives the hacker leads for an investigation.
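The leak described here can be seen by encrypting the two example messages with the same pad string. This is only a sketch of the mistake, not a code-breaking tool: xor_bytes is a helper name of my own, and secrets.token_bytes stands in for the reused pad string. Combining the two ciphertexts with XOR cancels the pad entirely, so the hacker sees exactly where the plaintexts agree without ever seeing the pad.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, stopping at the end of the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


m1 = b"I will meet you at the usual place tomorrow at the usual time."
m2 = b"I cannot believe you were not there when I was!"

# The mistake: one pad string used for two pieces of plaintext.
pad = secrets.token_bytes(max(len(m1), len(m2)))
c1 = xor_bytes(m1, pad)
c2 = xor_bytes(m2, pad)

# The pad cancels out: c1 XOR c2 equals m1 XOR m2.
leak = xor_bytes(c1, c2)
assert leak == xor_bytes(m1, m2)

# A zero byte means both messages have the same character at that position.
same_positions = [i for i, byte in enumerate(leak) if byte == 0]
print(same_positions[:2])  # [0, 1]: the shared "I" and the space after it
```

With a fresh pad string for each message, the combined ciphertexts would reveal nothing of the kind.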
Since I am on the side of
privacy and not on the side of the hackers, I will say no more about code breaking. Just don't use the same pad string for two pieces of plaintext.
Please feel free to write to me
directly for more information or to make suggestions or comments. You can also go to my contact page
to get my full contact information. Suggestions, questions, additional information and critiques are very welcome.