The RSA cryptographic cipher is the most widely used example of public key cryptography. RSA exploits the mathematical properties of modular exponentiation and relies on the difficulty believed to be inherent in factoring large prime numbers and in taking discrete logarithms.
The two fundamental equations governing RSA are
C Pe (mod n)
P Cd (mod n)
Where P is the plaintext, C is the ciphertext, e is the encryption exponent, d is the decryption exponent, and n is the system modulus. The Public Key consists of the encryption key and the modulus, [e, n], while the private key consists of the decryption key and the modulus, [d, n].
In order to work, the encryption and decryption exponents must be chosen such that
P Cd (mod n) (Pe)d (mod n)
P P(ed) (mod n)
Since exponents are reduced modulo the totient of the system modulus, a more explicit representation of the above equations would be
P P((ed) mod (n)) (mod n)
Hence the requirement is that the decryption key must be the modulo (n) multiplicative inverse of the encryption key.
Recall that the public key is going to consist of the encrypting exponent, e, and the overall modulus, n. The piece of information that an attacker needs to acquire to break our system is the multiplicative inverse of e which they can easily determine if they can determine (n). However, the totient function requires knowledge not of n, but of its prime factorization. Hence, given n, the attacker must first factor it into its prime constituents. It is believed that, like the discrete logarithm problem, this is a very difficult task.
While the following algorithm implements the heart of the RSA cipher, it does so using tiny numbers for illustration purposes. Furthermore, it does not embody the extensions and techniques, such as message padding, necessary to create a secure cryptosystem in practice.
Step 1: Pick a modulus n.
Step 2: Determine m = (n).
Step 3: Pick a value of e that is relatively prime to m.
Step 4: Determine the value d, which is the modulo-m inverse of e. Note that d will only exist if e and m are relatively prime.
Step 4: Publish the pair (e, n) as the Public Encryption Key.
Step 5: Retain the pair (d, n) as the Private Decryption Key.
Step 1: Take the plaintext message and convert it to a suitable sequence of integers such that, given that sequence, the plaintext can be uniquely recovered from it. We'll call this the "plaintext sequence". Note that this mapping can be anything, as long as it is reversible. In general, we want this mapping to be such that most, if not all, of the legal values for P are possible and, ideally, that are all equally likely. More will be said concerning the consequences of not doing this later.
Step 2: Generate the sequence of integers that will constitute the ciphertext sequence by raising each integer in the plaintext sequence to the power e and reducing the result modulo-n.
Step 3: Pack the ciphertext sequence into a cipher packet that is suitable for transmission.
Step 4: Transmit the cipher packet to the intended recipient.
Step 1: Receive the cipher packet from the sender.
Step 2: Extract the ciphertext sequence from the cipher packet.
Step 3: Recover the plaintext sequence by raising each integer in the ciphertext sequence to the power d and reducing the result modulo-n.
Step 4: Recover the plaintext from the plaintext sequence.
Step 1: Pick a modulus n. (n = 55) (p = 5, q = 11)
Step 2: Determine m = (n). (m = 40) (m = (5-1)(11-1))
Step 3: Pick a value of e that is relatively prime to m. (e = 3)
Step 4: Determine the value d, which is the modulo-m inverse of e. (d = 27)
Step 5: Publish the pair (e, n) as the Public Encryption Key. (3, 55)
Step 6: Retain the pair (d, n) as the Private Decryption Key. (27, 55)
Step 1: Convert plaintext to a suitable plaintext sequence of integers.
Let's send the message "RSA".
Let's convert this to a sequence of integers based on the position of each letter in the alphabet.
MSG = "RSA"
P = {'R', 'S', 'A'} => {18, 19, 1}
Step 2: Form the ciphertext sequence by encrypting each integer in the plaintext sequence.
C = Pe mod n = {183 mod 55, 193 mod 55, 13 mod 55}
C = {5832 mod 55, 6859 mod 55, 13 mod 55} = {2, 39, 1}
Step 3: Form the cipher packet.
Since each integer in the sequence is at most two digits, we'll make them all two digits by padding with leading zeros if necessary. We'll then form the packet by simply concatenating all of the integers together.
Packet = "023901"
Step 1: Obtain the cipher packet.
Packet = "023901"
Step 2: Extract the ciphertext sequence from the cipher packet.
C = {2, 39, 1}
Step 3: Form the plaintext sequence by decrypting each integer in the ciphertext sequence.
P = Cd mod n = {227 mod 55, 3927 mod 55, 127 mod 55}
P = Cd mod n = {18, 19, 1} (using the Square and Multiply Technique for modular exponentiation)
Step 4: Recover the plaintext from the plaintext sequence.
P = {18, 19, 1} = {'R', 'S', 'A'}
MSG = "RSA"
While the fundamental security underpinnings of RSA are believed to be on solid ground, that is not sufficient to create a secure cipher in practice. This section will briefly discuss just some of the issues that must be dealt with to achieve that goal.
RSA is fundamentally a block cipher - the plaintext is broken into blocks and each block undergoes the encryption process. The simplest block size to use would be single characters. Leaving aside the fact that we want a cipher than can encrypt any form of data, not just plain text (despite the name), doing so would be disastrous since the end result would be a simple substitution cipher. As mentioned in the RSA-Lite section, the blocksize should be chosen so that the blocks used can take on a large fraction of the possible values in the plaintext space. If the system modulus is 2048 bits, then the plaintext should be a significant portion of this length.
There are also pathological block values, namely 0 and 1, that will always be unchanged in the encrypted text. Any time an attacker sees one of these values, they know what the corresponding block of plaintext is. A conscientious implementation will ensure that these values never occur, even if they do occur in the original plaintext - which means that a reversible means of removing them and then restoring them after decryption is required.
Small block values present additional problems since small encryption exponents, such as 3, are frequently used. Unless the plaintext raised to the encryption exponent exceeds the modulus, no modulo reduction will occur and then normal logarithms will produce the correct result. An attacker could assume that some fraction of the blocks will fall into this category simply decrypt all of the blocks accordingly and then look for blocks that appear to have actually been decrypted properly.
The basic cipher is also deterministic, meaning that every time the same block is encrypted it always results in the same block of ciphertext. If the data being encrypted is rigidly formatted, then there is the possibility that some blocks, such as file headers, will appear in many different ciphertexts. Since the attacker has the public key, they have the ability to encrypt many different types of files and build up a table of commonly occurring ciphertexts and what they mean. But even if the attacker doesn't know what that block represents, just knowing that files of a certain, if unknown, type are going to certain places leaks information. As does the ability to track entire documents if the same document coming from multiple places but all going to the same destination all produce the same ciphertext.
There are also a whole host of attacks that go after the peculiarities of particular implementations instead of the cipher itself. For instance, if the attacker has the ability to monitor how much power is consumed or how much time is required to encrypt - or decrypt - each block then they might be able to gain information about the structure of the key and/or the message.
To overcome most of these issues stochastic padding schemes can be used. To give just a feeling for how an ad-hoc padding scheme might be used, consider that the sender wants to encrypt blocks consisting of 2048 bits. They could build up that block by first generating five bits of random data, then taking 120 16-bit words from the data with a single random bit after each inserted word (including the last one) and ending with a final five bits of random data. The receiver, of course, would simply strip out the random bits.
Actual padding schemes, such as OAEP - Optimal Asymmetric Encryption Padding, are very carefully constructed by people that are among the best in the world at what they do and then are continually subjected to attack by many individuals on an ongoing basis. The reason is quite simple - the people attacking these algorithms for real are also among the best in the world at what they do.