Cryptanalysis is the study of ciphers and ciphertexts to find weaknesses that let an attacker recover the plaintext from the ciphertext, without necessarily knowing the key. Cryptanalysis is a branch of the science of cryptology. Several techniques can be used to attack an encrypted message, such as brute force and frequency analysis; they are illustrated here on the substitution cipher.
The substitution cipher encrypts letters rather than bits. The idea is to replace every occurrence of a given plaintext letter with the same ciphertext letter, where the assignment of ciphertext letters to plaintext letters is chosen at random. The table below shows an example substitution table.
Plaintext letters:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Ciphertext letters: D K V Q F I B J W E P S C X H T M Y A U O L R G Z N
For example, CARDIFF is encrypted to VDYQWII; also, HELLO is encrypted to JFSSH.
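The encryption above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the message contains only the letters A to Z; the names `PLAIN`, `CIPHER`, `encrypt` and `decrypt` are illustrative and not part of the original text.

```python
import string

# Substitution table from the text: PLAIN[i] is replaced by CIPHER[i].
PLAIN = string.ascii_uppercase
CIPHER = "DKVQFIBJWEPSCXHTMYAUOLRGZN"

# Translation tables for both directions.
ENCRYPT_TABLE = str.maketrans(PLAIN, CIPHER)
DECRYPT_TABLE = str.maketrans(CIPHER, PLAIN)


def encrypt(plaintext: str) -> str:
    """Replace each plaintext letter with its ciphertext letter."""
    return plaintext.upper().translate(ENCRYPT_TABLE)


def decrypt(ciphertext: str) -> str:
    """Reverse the substitution using the same table."""
    return ciphertext.upper().translate(DECRYPT_TABLE)


print(encrypt("CARDIFF"))  # VDYQWII
print(encrypt("HELLO"))    # JFSSH
print(decrypt("JFSSH"))    # HELLO
```

Characters outside A to Z (spaces, digits, punctuation) pass through unchanged, because they have no entry in the translation table.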
The advantage of this substitution cipher is that the message can be read only by Alice and Bob, since it means nothing to Oscar unless he knows the key (the substitution table).
Note: When replacing letters, you cannot map one plaintext letter to two ciphertext letters, or vice versa. Every plaintext letter must be substituted with exactly one ciphertext letter.
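The one-to-one rule in the note means that the ciphertext row of the table must be a rearrangement (permutation) of the alphabet. A small check along these lines could be used to validate a candidate key; the function name `is_valid_key` is a hypothetical helper introduced only for illustration.

```python
import string


def is_valid_key(cipher_alphabet: str) -> bool:
    """Return True if cipher_alphabet uses each letter A-Z exactly once."""
    return sorted(cipher_alphabet.upper()) == list(string.ascii_uppercase)


# The table from the text is a valid key.
print(is_valid_key("DKVQFIBJWEPSCXHTMYAUOLRGZN"))
# Using D twice (and dropping K) would map two plaintext letters to D.
print(is_valid_key("DDVQFIBJWEPSCXHTMYAUOLRGZN"))
```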
To break this cipher, the attacker can apply techniques such as a brute force attack or frequency analysis to recover the original message from the encrypted one. Brute force is a cryptanalysis attack that tries every possible key until the message is decrypted. If the attacker is lucky, he or she will find the key quickly; otherwise it might take a very long time. Some software tools automate this search, going through all the possible keys until they find the right one.
The question is: how many possible substitution tables are there? The simple answer is about 2^{88}. There are 26 choices for the ciphertext letter that replaces A, 25 remaining choices for B, and so on, so the number of possible tables is:
26 × 25 × 24 × 23 × 22 × … × 4 × 3 × 2 × 1 = 26! ≈ 2^{88}
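The size of the key space can be checked directly. The short sketch below computes 26! and its base-2 logarithm; the variable names are illustrative.

```python
import math

# Number of possible substitution tables: 26 * 25 * ... * 2 * 1.
key_count = math.factorial(26)
# Equivalent key length in bits.
key_bits = math.log2(key_count)

print(key_count)           # 403291461126605635584000000
print(round(key_bits, 1))  # 88.4
```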
A modern computer with a fast processor and plenty of memory can test keys much faster than older hardware, although a key space of about 2^{88} is still far too large to search completely in practice. This attack is also known as an exhaustive key search.
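To get a feel for the scale of an exhaustive key search, the sketch below estimates how long it would take to try every substitution table. The rate of one billion keys per second is a purely hypothetical assumption chosen for illustration, not a measured figure.

```python
import math

# Hypothetical assumption: the attacker tests one billion keys per second.
KEYS_PER_SECOND = 1_000_000_000
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def exhaustive_search_years(key_count: int, keys_per_second: int) -> float:
    """Worst-case time, in years, to try every key once."""
    return key_count / keys_per_second / SECONDS_PER_YEAR


years = exhaustive_search_years(math.factorial(26), KEYS_PER_SECOND)
print(f"{years:.2e} years")
```

Under this assumption the worst case is on the order of ten billion years, and even a machine a thousand times faster would still need millions of years.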
The second technique is frequency analysis. It exploits the fact that each plaintext letter is always replaced by the same ciphertext letter, so the statistical properties of the plaintext are preserved in the ciphertext. Note: Plaintext letter frequencies are not identical; some letters occur far more often than others. In addition, attackers can use the frequencies of common letter pairs, triples and short words, such as ‘th’, ‘the’, ‘as’, ‘he’, ‘she’, ‘I’m’, ‘is’, ‘are’ and many more. The table below shows the five most common letters in English.
Note: Even though the substitution cipher has a sufficiently large key space of about 2^{88}, it can easily be defeated with analytical methods.
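The first step of a frequency analysis, counting how often each ciphertext letter occurs, might be sketched as follows. The function name `letter_frequencies` is hypothetical, and a real attack would need a much longer ciphertext than this toy sample for the counts to be meaningful.

```python
from collections import Counter


def letter_frequencies(ciphertext: str) -> list[tuple[str, float]]:
    """Return (letter, relative frequency) pairs, most frequent first."""
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    counts = Counter(letters)
    total = len(letters)
    return [(letter, count / total) for letter, count in counts.most_common()]


# Toy sample: the ciphertext of HELLO from the example above.
for letter, share in letter_frequencies("JFSSH"):
    print(f"{letter}: {share:.0%}")
```

The attacker would then guess that the most frequent ciphertext letters stand for the most frequent plaintext letters, and refine the guess using common pairs and triples.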