Janne Frösen
Department of Computer Science
Helsinki University of Technology
Janne.Frosen@hut.fi
1. Introduction to Practical Cryptosystems
Cryptography is formally the art of encoding data in a way that only
the intended recipient can decode it, and know that the message is
authentic and unchanged. Cryptography means different things to
different people. Small children play with simple ciphers and
substitution secret languages, bigger children play with crypto
puzzles. Some people are concerned with privacy for various reasons
and use different methods to encrypt sensitive data, with standard
unix "crypt" or "rot-13" variants. None of these have anything to do
with strong encryption and real data security.
Strong encryption is not a technical standard, it means that your encryption cannot be broken by current known methods within feasible time without the data being outdated. Strong encryption can be used to protect your sensitive data against organized crime, government and multinational corporations - all instances with virtually unlimited resources. That has been a cause for recent tries to outlaw strong encryption.
Strong encryption brings many possible applications into daily life. Different applications that require privacy, trust and access control should all use strong encryption methods when possible. Applications include things like electronic money [2, 3], secure communications [4, 5, 5], passwords, and many others. It is in people's own interest that different legal/medical/personal data about their person stays confidential to the instances that have a permit to collect the databases (finnish Tietoturvalaki).
2. Different Types of Cryptosystems
Some cryptograhic methods rely on the secrecy of the algorithms used
in the cipher (security by obscurity). These ciphers are only of
historical interest and are not adequate for a real-world
situation.
All modern algorithms use a key to control the encryption and decryption. The message can only be decrypted if the key matches the one it was encrypted with. The key used for decryption can be different from the key used in encryption, and this divides the algorithms in symmetric (or secret-key) and asymmetric (or public-key) classes.
Modern cryptographic algorithms are meant to be executed by computers or specialized hardware devices for which there are several different cryptographic algorithms and methods. This paper concentrates on those commonly used in data encryption today.
3. Symmetric Algorithms
Symmetric algorithms, also called secret-key algorithms, use
the same key for both encryption and decryption (or in some
cases the key is easily derivable from the other). The key is not to
be leaked to outside enemies (hence the name), should be changed
often and be sufficiently random.
Different symmetric algorithms use different length keys, usually a longer key means higher security.
Symmetric algorithms can be further divided into two categories: stream ciphers, which take and encrypt one bit of plaintext at a time, and block ciphers, which take a number of bits (typically 64) and encrypt them as a single block. Most ciphers belong to the block cipher class.
Symmetric algorithms are generally faster than asymmetric ones and use a much shorter key.
3.1 Data Encryption Standard - DES
DES stands for "The Data Encryption Standard". DES has been certified
by NIST (National Institute of Standards and Technology) for use as
an official US Government encryption standard for less-than-top-secret
secret material. It is recertified every five years. DES was first
certified for government use in 1977, and most recently recertified in
1993, but may not be recertified again [7].
DES is a strong cipher which encrypts a block of 64 bits at a time using a 56 bit key (56 + 8 parity checks = 64) resulting in 64 encrypted bits.
DES encryption itself consists of many rounds of different transformations and permutations, which are linear and easy to reverse. The critical encryption is done using S-boxes. The S-boxes, or substitution boxes, are a set of highly non-linear functions, implemented in DES as a set of lookup tables (of 4 rows and 16 columns). The S-boxes encrypt 4 bits at a time, so encrypting is done in 16 rounds. After the S-boxes the results are still permutated.
The complicated substituting, permutating, XORs and shifts were chosen to have some useful properties:
There are numerous software applications and C libraries with the DES encryption routines widely available. Exporting DES from US is regulated (by NSA - National Security Agency).
A fast 486 PC can encrypt about 400 kb per second using DES [8].
The other more recent method is called differential cryptanalysis. This method reduces the number of keys that must be tested, but it requires that you have 2^47 chosen plaintexts encrypted with the key you are trying to recover. Since it is highly unlikely that anyone will agree to encrypt 2^47 chosen plaintexts with their secret DES key, this attack is unfeasible in practice [7].
Because the DES algorithm is based on the "mysterious" S-boxes, which are just constants without any known connections, it has lead to some rumors that there was a trapdoor in the algorithm [1].
The overall consensus is that DES, when used properly, is secure against all but the most powerful organizations. (like NSA, governments, mafia, big supercomuting/parallel computing companies, etc.). Proper use means avoiding known "weak" keys. (weak keys are result of the key being split to 16 pieces, one for each round of encryption). It's simple to avoid the 4 weak, 12 semi-weak keys and 48 possibly weak keys.
The short key is the main known risk in DES. Given all these points, using simple DES for top-secret data is not a good idea anymore, but is sufficient for everyday use.
3.2 Triple DES - 3DES
If DES with a single key is not sufficiently secure for a given
application, it can be made more secure by encrypting more than once
with different keys. It has been proven that multiple encryptions do
actually improve the security of DES (DES is not an algebraic group!),
and it is thought that triple-encryption with DES is about equivalent
to single-encryption with a 112-bit key [1].
3DES usually uses 2 different keys. First the data is DES encrypted with first key, then decrypted with the second key, and then encrypted with the first key again.
3DES has been proven much more secure than conventional DES, and is a good alternative for current designs. There still remains the same rumors that concern DES (trapdoor?). But as is, it's a better alternative to DES and with current machine technology makes brute force attacks not viable. Applications using DES can easily be converted to use 3DES instead, if the slight effect on speed is not critical.
3.3 International Data Encryption Algorithm - IDEA
IDEA (International Data Encryption Algorithm) is an algorithm
developed at ETH Zurich in Switzerland. IDEA is considered one of the
best and most secure algorithms available today. It uses a 128 bit key
to encrypt 64 bit blocks, and the same algorithm is used for
encryption and decryption. IDEA is a fairly new algorithm, but it has
already been around for several years, and no practical attacks on it
have been published despite of numerous attempts to analyze it.
IDEA mixes algorithms from three algebraic groups: XOR, Addition (modulo 2^16), Multiplication (modulo 2^16 + 1). The 64-bit data is divided into 4 16-bit sub-blocks on which the operations (8 total rounds of swapping, XORring, addition, multiplication).
IDEA is normally used in CBC and CFB modes like DES, and many software implementations are publicly available (PGP being most famous). IDEA is patented in the United States and in most of the European countries. The patent is held by Ascom-Tech. Non-commercial use of IDEA is free. Commercial licenses can be obtained by contacting idea@ascom.ch. (Patent not valid in Finland).
A Triple-IDEA (similar to 3DES) cipher should be sufficient for even the most paranoid people.
The main risk in IDEA is its relatively short age (in cryptography an algorithm needs to be around a couple of decades before it starts to get believed truly unbreakable). Only time will prove the strength of IDEA, authorities today think it is safe.
3.4 Skipjack
Skipjack is an NSA-developed encryption algorithm for the Clipper (and
Capstone) chips. The algorithm is classified as secret. What is known
about it is that it's symmetric, uses a 80-bit key and has 32 rounds
of processing per each encrypt or decrypt operation.
The Clipper-chip is a commercial chip made by NSA for encryption, using the Skipjack algorithm. At least AT&T will be using the Clipper for encrypted voice phonelines.
Clipper uses Skipjack with two keys. The idea behind it is that anyone who knows the chip's "master key" can decrypt all messages encrypted with it. So, basically NSA could decrypt Clipper-encrypted messages with this backdoor. This method of tampering with the algorithms is called key escrow.
There are many movements and foundations campaigning against the Clipper-chip, concerned about privacy. There are rumors that a similar chip nicknamed EuroClipper could be introduced within EU, but the latest more direct news convince this to be just a rumor.
3.5 RC4
RC4 is a cipher designed by RSA Data Security, Inc. It used to be a
trade secret, until someone posted the source code in Usenet
News. The main good point in RC4 is it's speed, the algorithm is very
fast. Several hundred kb per second can be encrypted on a fast 486
PC. It is unknown if RC4 is a strong cipher, but at least there hasn't
been confirmed proof against it [8].
It is interesting to know that the exportable version of SSL (Netscape's Secure Socket Layer), which uses RC4-40, was recently broken by at least two independent groups. Breaking it took about eight days [9].
3.6 Others
Some other more or less known symmetric ciphers are:
| Cipher | Security | Speed (486 pc) | Key length |
|---|---|---|---|
| DES | low | 400 kb/s | 56 bits |
| 3DES | good | 150 kb/s | 112 bits |
| IDEA | good* | 200 kb/s | 128 bits |
| 3IDEA | very good* | ~100 kb/s | 256 bits |
| Skipjack | good* | ~400 kb/s | 80 bits |
| CLIPPER chip | "good"** | - | 80 bits |
* the algorithm is believed to be strong
** the algorithm itself is good, but it has a built-in weakness
The speed at which computing power has been increasing (16 times more powerful now than 10 years ago) compared to the key-lengths of symmetric algorithms leads at a rough formula of adding one bit to key-length for every year the algorithm is wanted to still hold against brute force attacks, keeping the normal 56-bit key of DES at today's standard. That means a 112 bit 3DES key should be expected to hold at least for about 50 next years, unless any extremely dramatic technical breakthroughs are achieved.
4. Asymmetric Algorithms
4.1 RSA
RSA is a public-key cryptosystem for both encryption and
authentication, it was invented in 1977 by Ron Rivest, Adi Shamir, and
Leonard Adleman (RSA). RSA is the most widely used public-key
cryptosystem today and has often been called a de facto standard.
RSA works as follows: take two large primes, p and q, and find their product n = pq. Choose a number, e, less than n and relatively prime to (p-1)(q-1), and find its inverse, d, mod (p-1)(q-1), which means that ed = 1 mod (p-1)(q-1); e and d are called the public and private exponents, respectively. The public key is the pair (n,e); the private key is d. The factors p and q must be kept secret, or destroyed.
It is difficult (presumably) to obtain the private key d from the public key (n,e). If one could factor n into p and q, however, then one could obtain the private key d. Thus the entire security of RSA is predicated on the assumption that factoring is difficult; an easy factoring method would break RSA.
There are many applications today using RSA, most notably PGP and Ssh.
Encryption and authentication take place without any sharing of private keys: each person uses only other people's public keys and his or her own private key. Anyone can send an encrypted message or verify a signed message, using only public keys, but only someone in possession of the correct private key can decrypt or sign a message.
There are many commercially available hardware implementations of RSA, and there are frequent announcements of newer and faster chips. The fastest current RSA chip has a throughput greater than 600 Kbits per second with a 512-bit modulus, implying that it performs over 1000 RSA private-key operations per second. It is expected that RSA speeds will reach 1 Mbit/second within a year or so.
By comparison, DES is much faster than RSA. In software, DES is generally at least 100 times as fast as RSA. In hardware, DES is between 1,000 and 10,000 times as fast, depending on the implementations. RSA will probably narrow the gap a bit in coming years, as it finds growing commercial markets, but will never match the performance of DES.
Another way to break RSA is to find a technique to compute e-th roots mod n. Since c = m^e, the e-th root of c is the message m. This attack would allow someone to recover encrypted messages and forge signatures even without knowing the private key. This attack is not known to be equivalent to factoring. No methods are currently known that attempt to break RSA in this way.
RSA is very vulnerable to chosen-plaintext attacks, and a good guess can reveal the used key. It is also advisable to include some random data (at least 64 bits) to the encrypted plaintext.
4.2 Diffie-Hellman
Diffie-Hellman is a commonly used public-key algorithm for key
exchange. It is generally considered to be secure when sufficiently
long keys and proper prime generators are used. The security of
Diffie-Hellman relies on the difficulty of the discrete logarithm
problem (which is believed to be computationally equivalent to
factoring large integers).
4.3 Digital Signature Standard - DSS
DSS stands for United States government Digital Signature
Standard. It's security is based on the discrete logarithm problem
(like Diffie-Hellman), although it's design has not been made public.
DSS shouldn't be used since both Diffie-Hellman and RSA methods are better.
4.4 Others asymmetric cryptosystems
Some other symmetric ciphers include:
| Cipher | Security | Speed | Key length |
|---|---|---|---|
| RSA | good | fast | varies(1024 safe) |
| Diffie-Hellman | good | slower than RSA | varies(1028 safe) |
| DSS | low | - | 512 bits |
5.1 Message Digest Algorithm 5 - MD5
Message Digest Algorithm 5 (MD5) is a secure hash algorithm developed
at RSA Data Security, Inc. It can be used to hash an arbitrary length
byte string into a 128 bit value. MD5 is in wide use, and is
considered reasonably secure.
MD5 processes the input text in 512-bit blocks, divided into 16 32-bit sub-blocks. The output is a set of 4 32-bit blocks, which are concatenated to a single 128-bit hash value.
Still, MD5 is considered to be relatively secure and good enough for most purposes.
5.2 MD2, MD4
MD2 and MD4 are earlier attempts of RSA Inc. They have more flaws and
it's recommended to use MD5 instead.
5.3 Secure Hash Algorithm - SHA (SHS)
Secure Hash Algorithm (SHA), also Secure Hash Standard (SHS) was
published by the US Government. It produces an 160 bit hash value
from an arbitrary length string.
SHS is structurally similar to MD4 and MD5. It is roughly 25% slower than MD5 but may be more secure, because it produces message digests that are 25% longer than those produced by the MD functions.
6. Zero Knowledge Systems
Abstractly, a zero knowledge proof is an interactive proof with a
prover and a verifier, where the prover convinces the verifier of a
statement (with high probability) without revealing any information
about how to go about proving that statement. Mathematical proof is
found in [10].
Zero knowledge proofs can be used for authentication before exchanging key information. Details about the zero knowledge systems can be found in Hannu Aronsson's paper [11].
7. Conclusions
The best applications combine different cryptosystems. The primary
advantage of public-key cryptography is increased security: the
private keys do not ever need to be transmitted or revealed to anyone. In
a secret-key system, by contrast, there is always a chance that an
enemy could discover the secret key while it is being transmitted.
Another major advantage of public-key systems is that they can provide a method for digital signatures and authentication. Secret-key systems would require a third party for this.
A disadvantage of using public-key cryptography for encryption is speed.
For encryption, the best solution is to combine public- and secret-key systems in order to get both the security advantages of public-key systems and the speed advantages of secret-key systems. The public-key system can be used to encrypt a secret key which is then used to encrypt the bulk of a file or message.
| Plaintext Cleartext |
The original data. This may be a text message, file or a stream of communication |
| Encryption | Encoding a message so that hides it hides the contents from outsiders |
| Ciphertext | The encrypted message |
| Decryption | Retrieving the plaintext from ciphertext |
| Cipher | A method of encryption and decryption |
| Key | A key is usually used in encrypting, decrypting is only possible by knowing the key |
| Cryptography | Art or science of encoding data and keeping the message secret |
| Cryptoanalysis | Art of breaking ciphertext without knowing the key |
| Cryptology | Branch of mathemathics that studies cryptographic methods |
There are a huge number of governmental and other organizations that deal with cryptography. In the following text we ofter refer to some of them. The most important US governmental organizations linked with cryptoglogy can be found in the following HTML3 table.
| NSA | National Security Agency |
| NIST | National Institute of Standards and Technology |