Future Beacon


Compiler Assisted Encryption Key Distribution

by James Adrian



Background


Absolute Security

      Claude Shannon[1] proved that any encryption algorithm possessing these characteristics is absolutely secure:

      1. The encryption keys must be random numbers of uniform distribution.

      2. The keys must be shared in absolute secrecy by the sender and receiver.

      3. Any key encrypting a message must be as at least as long as that message.

      4. Any key used to encrypt a message must not be reused.


The One-Time Pad

      The most famous encryption algorithm having all of these characteristics is the One-Time Pad. By using a random key to encrypt a message (the plain text) with the XOR logical operation, the transmitted result (the ciphertext) is rendered as random as the key. The collection of secret keys shared only by the sender and the receiver is called the pad. Keys are of any length not exceeding that of the pad. Their length matches that of the messages they encrypt. They are erased immediately after their use.


Randomness

      Of the foregoing four security requirements, number 3 and number 4 are not at all difficult to satisfy. They are software features that merely need to be deliberately included; but requirement number 1 is a serious and exacting matter. Regardless of the source of the numbers, the random pad must pass the most thorough test of randomness.


The Secrecy of Keys

      Requirement number 2 is also a serious matter and the one that has the most to do with the lack of customers for One-Time Pad programs. Requirement number 2 states that the keys must be shared in absolute secrecy by the sender and receiver. Currently, hand-delivery is the only method used to securely provide secret keys to both the sender and the receiver.



Introduction


      It is known that encryption can enforce absolute privacy if the encryption keys are not delivered through the communication channel, but instead, delivered by hand. Every attempt to achieve this level of security through a published algorithm that delivers its keys through the communication channel, or reuses keys, has failed.

      This article presents a type of encryption algorithm that has two essential features:

      1.) A protocol is defined that enables the sender to employ a compiler to modify the software and data that is held and used by the sender, and to remotely modify the software and data held and used by those who receive encrypted messages from the sender. This is done to deny the wiretapper or opponent (hereinafter the hacker) knowledge of the algorithm, except in the initial case. The compiler employed for this purpose is bundled with the original software that is installed in the computers that are located with the sender and the receivers. The initial algorithm is the One-Time Pad.

      2.) Transformations are defined and used by the algorithms in an attempt to disguise, randomize and double the size of previously transmitted strings of data. This is undertaken to produce pseudorandom keys having a uniform distribution. The transformations are designed to be exceedingly difficult to reverse.

      The first feature is intended to force hackers to obtain clues about any given message exclusively from a statistical analysis of the ciphertext. The transformations of the second feature are intended to result in shared, pseudorandom keys that pass the most thorough test of randomness. If both features are effective, all possible plaintext messages appear to be equally likely. In such a case, the plaintext messages are secure.

      The algorithm changes from the initial one-time pad algorithm to an unpublished algorithm, and then to a series of such algorithms in order to securely distribute encryption keys on the communication channel. This avoids the task of delivering keys by hand more than once. Please notice that hand delivery is not a requirement of secure encryption algorithms. The issue is the secrecy of the keys. Please also notice that no particular source or type of source of randomness is required of a secure encryption algorithm. If keys pass the most thorough test of randomness, they are no less random on account of being generated by software or hardware.



The Protocol


      Any finite string of data that passes the most thorough test of randomness is nonetheless considered not random if it is famous or well known or discoverable by the hacker. Cryptographic randomness requires secrecy. However convincingly random, data created by software is useless to an encryption effort if the software methods are not secret; therefore, any software method that generates random encryption keys cannot be distributed publicly. If such software or its source code is sent over the communication channel, it must first be securely encrypted.

      Random numbers generated by software can be generated identically in two places or in a number of specific places without becoming known to unauthorized parties. This is an important advantage over hardware generators which cannot duplicate their output in multiple places without being sent over the communication channel or hand delivered. However, this advantage is worthless unless the random strings are random to such an extent that no analysis can uncover information within them.

      Taking this into account, the features of the protocol include the following:

            Each sender is responsible for managing and sharing a pad of keys with a receiver. In the event that the sender sends encrypted messages to more than one receiver, the sender is responsible for managing and sharing a separate and unique pad with each individual receiver.

      After the installation of the software, the initial pads are installed from hand-delivered DVD's containing random numbers generated by hardware random number generators. Communication proceeds in accordance with the one-time pad algorithm until the sender decides that a diminishing pad needs to be appended with new random numbers.

      When this happens, the sender sends a message to the receiver that is encrypted by the existing pad. This message contains the source code for a program modification that is implemented by another program that uses the compiler. Let's call this other program the Installer to distinguish it from the Main Program. The compiler is a program in itself as well, but we can continue to call it the Compiler. Both the compiler and the installer come with the main program as the Software. The software is delivered by hand.

      By these means, the sender can modify the main program, create a Library of programs and functions, and create transformations shared only be the sender and receiver.

      This asymmetrical relationship between the sender and the receiver means that the user of the encryption software must, in each instance, relate to the software as either a sender or a receiver. These two modes of operation together with other functional aspects of the software requires more complexity than is required in the case of the one-time pad, but this does not mean that the user's experience needs to be complicated. That is completely under the control of the software designers. Almost all of the procedures can be automatic.

      The sender has the option to send encrypted random numbers to a receiver. The key used to encrypt this transmission is discarded. In this example, the random numbers thus sent are operated on by a transformation, resulting in a string that is extremely different from the original string and twice the size of the original string. The string can be appended to the pad. This operation is prompted by a code that is sent to the main program. Incoming messages sent to modify the pad must therefore be distinguished from ordinary communication and from messages that prompt program modifications. This makes for at least three classes of messages differentiated by a code written into some specified part of the encrypted messages.

      Because the pad can be modified, expanded, and updated by the sender, sharing the entire pad with the receiver is unnecessary. If messages are always less than a few kilobytes in size, there is no need to maintain a much larger pad at the receiver's location. Maintaining two megabyte pads for each of 400 receivers is not impractical. Also, receiver pad sizes need not be identical.

      Including a compiler for adding functions does not prevent the software from being used as a one-time pad when the need for secrecy is extreme and the availability of tested transformations is lacking.



The Transformations


      The requirement that no key shall be used twice brings up an interesting issue. Each truly random key of any substantial length contains about as many 1's as 0's. A random string of a million bits could be within 5 or 20 of having the same number of them. Clearly, it is not the balance of 1's and 0's that distinguishes keys. It is the order in which these two kinds of bits appear in the string that makes a pair of keys identical, similar, different, or statistically unrelated.

      Starting with a random key and transforming it to another key could proceed repeatedly to such an extent that further transformations could make the newest key slightly more similar to the original key as often as it makes it slightly more different. One could imagine transforming successive versions of a key until there is no correlation with the original key to be found in the latest result.

      Here is the situation that a hacker must confront: An input string is transformed into an output string with an algorithm that is not known. The output string passes the most thorough test of randomness. There is no correlation of any kind between the input string and the output string. Which input string was transformed into the output string is not known. (Strings can be constructed from parts of previous strings before being transformed further.)

      While there are functions or transformations that cannot seem to be reversed, as yet there is no proof that a transformation can be absolutely irreversible. Algorithm writers attempting to create one-way transformations seem to be tempted away from the most promising methods by an unfortunate inadequacy in today's computer hardware and programming languages. The byte, and not the bit, is the smallest unit of data that has its own numerical address in a computer. To access the third bit of a byte, for example, one must first access the byte and then manipulate the bits of the byte in various ways. If many such accesses are required, a program could take a lot of time to run. This creates the temptation to devise algorithms that manipulate bytes only.

      Bit transformations are computationally intensive. This is a case where we need to recognize that the capability of computers is increasing rapidly and we need to recall that their capability has been increasing rapidly for a long time. If we only write algorithms for today's computers, we will arrive at some results later than we could. The computationally intensive nature of bit transformations will be ameliorated in part by assigning hardware addresses to bit locations. In any case, the entire concern over algorithms of this type being too slow will be addresses in due course by faster computers.



The Compiler


      Certain difficulties associated with getting programs through some email clients prompt the creation of a specially designed compiler.

      Writers of computer program languages normally must address many concerns that do not apply in this case. For use in this system, source code specifying any program or library function can be comprised of a string of text characters that each have a purpose, and whose meaning depends on their order, but yet do not form readable statements or commands.



Conclusion


      The protocol provides the secrecy required of the keys shared between the sender and receiver if the transformations of sent data are effective in removing all correlations with that sent data. The four requirements of absolute security are satisfied if requirement number 1 is satisfied by the transformations.

      The benefit of compiler assisted encryption key distribution over known, high security encryption algorithms, is its ability to deliver new random keys over an unlimited span of time without hand delivery.


****************************************************


References

      1. "Claude Elwood Shannon - Collected Papers" edited by N. J. A. Sloane and Aaron D. Wyner.


Contact

      Please feel free to write to me directly for more information or to make suggestions or comments. My email address is jim@futurebeacon.com. You can also go to my contact page to get my full contact information. Suggestions, questions, additional information and critiques are very welcome.