"Brute
force" attacks to break the cipher are hopeless, since there are 26! = 403291461126605635584000000,
or about 4 * 10^26, possible ways to encode the 26 letters of the English alphabet. To crack
the random substitution cipher, however, we take advantage of the fact that the underlying letter
frequencies of the original plain text don't get lost. This makes the
Random Substitution Cipher very susceptible to frequency analysis. An eavesdropper
just needs to count the frequencies of the cipher letters.
Recall that the most frequent letters in the English language are ETNORIA,
which (except for the O) are also the most frequent letters in the
brief virus carrier message. And the longer the message, the more closely the
relative frequencies of the cipher letters approach the expected frequencies.
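To get a feel for the size of this key space, the following Python sketch computes 26! and estimates how long an exhaustive search would take. The rate of one billion keys per second is an assumption chosen purely for illustration, not a measured figure.

```python
import math

# Number of possible keys: every permutation of the 26 letters.
KEYSPACE = math.factorial(26)

# Assumed (hypothetical) attacker speed: one billion keys tested per second.
KEYS_PER_SECOND = 10**9
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

years_to_try_all = KEYSPACE / (KEYS_PER_SECOND * SECONDS_PER_YEAR)

print(KEYSPACE)            # 403291461126605635584000000
print(f"{KEYSPACE:.2e}")   # 4.03e+26
print(f"{years_to_try_all:.1e} years")
```

Under that assumed rate, trying every key would take on the order of ten billion years, which is why we turn to letter frequencies instead.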
Let's take a look at another message encrypted with a random substitution:
q v v   j d s   j q o s u   y q c w   a l   j d s
q i s n q e s   q t s n c z q g   q n s   u y s g j
a l   j d s   e b i s n g t s g j   c g   v s u u
j d q g   q   u s z b g w
How could we go about breaking this message? Certainly,
we shall take advantage of the known letter frequencies.
Step 1:
Compute the letter frequencies. We first find the cipher letter for E.
The letter frequencies show that "s", the most frequent cipher letter, corresponds to E.
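The counting in Step 1 can be sketched in a few lines of Python. The variable names are our own, and the ciphertext is the message shown above, typed in by hand.

```python
from collections import Counter

CIPHERTEXT = (
    "qvv jds jqosu yqcw al jds qisnqes qtsnczqg qns uysgj "
    "al jds ebisngtsgj cg vsuu jdqg q uszbgw"
)

# Count letters only; the blanks between words are ignored.
counts = Counter(ch for ch in CIPHERTEXT if ch.isalpha())
total = sum(counts.values())

# Show the five most frequent cipher letters with their relative frequency.
for letter, n in counts.most_common(5):
    print(f"{letter}: {n:2d}  ({n / total:.1%})")
```

The letter s leads with 13 of the 75 letters, followed by q with 10, which supports reading s as E.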
Step 2:
Next, we try to detect the most common English three-letter word, "THE".
We therefore look for repeated three-letter sequences ending in "s".
Even without the help of the blank spaces, we observe that jds
occurs three times, more than any other trigram. It is very likely that we
have revealed the two correspondences j=T and d=H, yielding
_ _ _   T H E   T _ _ E _   _ _ _ _   _ _   T H E
q v v   j d s   j q o s u   y q c w   a l   j d s

_ _ E _ _ _ E   _ _ E _ _ _ _ _   _ _ E   _ _ E _ T
q i s n q e s   q t s n c z q g   q n s   u y s g j

_ _   T H E   _ _ _ E _ _ _ E _ T   _ _   _ E _ _
a l   j d s   e b i s n g t s g j   c g   v s u u

T H _ _   _   _ E _ _ _ _
j d q g   q   u s z b g w
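The trigram search of Step 2 can be sketched as follows. As in the text, the blanks are removed first, so the count does not rely on knowing the word boundaries; the variable names are our own.

```python
from collections import Counter

CIPHERTEXT = (
    "qvv jds jqosu yqcw al jds qisnqes qtsnczqg qns uysgj "
    "al jds ebisngtsgj cg vsuu jdqg q uszbgw"
)

letters = CIPHERTEXT.replace(" ", "")  # drop the blanks

# Every run of three consecutive letters that ends in "s", the cipher E.
trigrams = Counter(
    letters[i:i + 3]
    for i in range(len(letters) - 2)
    if letters[i + 2] == "s"
)

print(trigrams.most_common(3))
```

Only jds occurs more than once (three times), which makes it the natural candidate for THE.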
The knowledge of the letters T, H and E reveals words like THERE, THIS, THAN, THUS, THAT, etc.
So, let's look for them. We find jdqg, which could be THIS, THUS or THAN;
however, it could not be THAT. Why not? Checking the frequency of q
shows that it is likely to be one of the most common letters ETNORIA, and
since it is a vowel (why?), we may reduce the possible choices for q to O, I or A.
I and A are more likely to follow TH, and either may form the one-letter word q,
the second to last word of the message.
Step 3:
We now form possible words from the given letters and test whether the found letters
make sense in other words. Say we choose q to be A.
What could the first word qvv be? ALL and ADD are possible; ARE is not.
If q were I, then ILL would be possible. A seems to be the
more reasonable choice. Substituting A for q yields THA_ for jdqg.
We know it cannot be THAT; therefore, THAN makes more sense.
It also makes sense that g is N, since g is one of the most
frequent letters. Thus, using altogether the correspondences q=A, g=N and v=L
yields
A L L   T H E   T A _ E _   _ A _ _   _ _   T H E
q v v   j d s   j q o s u   y q c w   a l   j d s

A _ E _ A _ E   A _ E _ _ _ A N   A _ E   _ _ E N T
q i s n q e s   q t s n c z q g   q n s   u y s g j

_ _   T H E   _ _ _ E _ N _ E N T   _ N   L E _ _
a l   j d s   e b i s n g t s g j   c g   v s u u

T H A N   A   _ E _ _ N _
j d q g   q   u s z b g w
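Filling in a partial key by hand is error-prone, so it can help to let a small program do the substitution. The sketch below assumes the six correspondences found so far; apply_partial_key is a hypothetical helper name of our own.

```python
CIPHERTEXT = (
    "qvv jds jqosu yqcw al jds qisnqes qtsnczqg qns uysgj "
    "al jds ebisngtsgj cg vsuu jdqg q uszbgw"
)

def apply_partial_key(ciphertext, key):
    """Replace known cipher letters; show '_' for letters not yet identified."""
    return "".join(
        key.get(ch, "_") if ch.isalpha() else ch
        for ch in ciphertext
    )

# Correspondences found so far (cipher letter -> plain letter).
key = {"s": "E", "j": "T", "d": "H", "q": "A", "g": "N", "v": "L"}

print(apply_partial_key(CIPHERTEXT, key))
```

The output starts with ALL THE TA_E_ and ends with THAN A _E__N_. Adding a new guess to the dictionary and printing again shows at once whether it produces sensible words elsewhere.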
Step 4:
Now words begin to appear. vsuu looks very
much like LESS; then S_ENT in uysgj
should be SPENT. A_E in qns seems to be ARE.
This is very likely, since n, the cipher letter for R,
appears frequently. We now have:
A L L   T H E   T A _ E S   P A _ _   _ _   T H E
q v v   j d s   j q o s u   y q c w   a l   j d s

A _ E R A _ E   A _ E R _ _ A N   A R E   S P E N T
q i s n q e s   q t s n c z q g   q n s   u y s g j

_ _   T H E   _ _ _ E R N _ E N T   _ N   L E S S
a l   j d s   e b i s n g t s g j   c g   v s u u

T H A N   A   S E _ _ N _
j d q g   q   u s z b g w
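Guesses such as LESS, SPENT and ARE can also be checked mechanically. A guess is acceptable only if it agrees with the letters already known and never assigns one plain letter to two different cipher letters. The helper try_guess below is a hypothetical name for such a check, written as a minimal sketch.

```python
def try_guess(cipher_word, guess, key):
    """Return the key extended by the guess, or None if the guess conflicts."""
    if len(cipher_word) != len(guess):
        return None
    extended = dict(key)
    for c, p in zip(cipher_word, guess):
        if c in extended:
            if extended[c] != p:        # contradicts a known correspondence
                return None
        elif p in extended.values():    # plain letter already taken
            return None
        else:
            extended[c] = p
    return extended

# Correspondences known after Step 3 (cipher letter -> plain letter).
key = {"s": "E", "j": "T", "d": "H", "q": "A", "g": "N", "v": "L"}

for cipher_word, guess in [("vsuu", "LESS"), ("uysgj", "SPENT"), ("qns", "ARE")]:
    key = try_guess(cipher_word, guess, key) or key

print(sorted(key.items()))
print(try_guess("jdqg", "THAT", key))  # None: g is already N, not T
```

All three guesses are accepted and add u=S, y=P and n=R to the key, while THAT for jdqg is rejected.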
Continuing to detect words, we see that A_ERA_E
in qisnqes looks like AVERAGE,
and A_ER__AN in qtsnczqg
looks very much like AMERICAN.
What other words can you find? Try to finish it by yourself. Replacing these
letters yields
A L L   T H E   T A _ E S   P A I _   _ _   T H E
q v v   j d s   j q o s u   y q c w   a l   j d s

A V E R A G E   A M E R I C A N   A R E   S P E N T
q i s n q e s   q t s n c z q g   q n s   u y s g j

_ _   T H E   G _ V E R N M E N T   I N   L E S S
a l   j d s   e b i s n g t s g j   c g   v s u u

T H A N   A   S E C _ N _
j d q g   q   u s z b g w
And we finally have:
All the taxes paid by the average American are spent by the government in less than a second
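Once all letters are identified, the whole message can be decrypted in one pass. The sketch below collects the correspondences from the steps above, plus the remaining five (o=X, w=D, a=B, l=Y, b=O) read off the final plain text, and applies them with Python's built-in str.maketrans and str.translate.

```python
CIPHERTEXT = (
    "qvv jds jqosu yqcw al jds qisnqes qtsnczqg qns uysgj "
    "al jds ebisngtsgj cg vsuu jdqg q uszbgw"
)

# The 19 cipher letters that occur in the message (cipher -> plain).
KEY = {
    "q": "A", "v": "L", "j": "T", "d": "H", "s": "E", "g": "N",
    "u": "S", "y": "P", "n": "R", "i": "V", "e": "G", "t": "M",
    "c": "I", "z": "C", "o": "X", "w": "D", "a": "B", "l": "Y",
    "b": "O",
}

# A substitution key must be one-to-one: no plain letter may appear twice.
assert len(set(KEY.values())) == len(KEY)

plaintext = CIPHERTEXT.translate(str.maketrans(KEY))
print(plaintext)
```

Letters that never occur in the message cannot be recovered from it, so the key stays incomplete for the other seven letters of the alphabet.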
Summary
Cracking Random Substitution Ciphers can be accomplished by combining a search for the
most frequent letters and trigrams with clever guessing and testing of the
missing letters. The more Random Substitution Ciphers you crack, the more
experienced you will become. As a side effect, the so-called "cryptograms"
that you find in newspapers are Random Substitution Ciphers, which you will now solve with ease.
Exercise 1:
You now have the opportunity to practice the art of cracking such random
substitution ciphers. Here is your cipher to crack:
yxdy pq yjc xzpvpyw ya icqdepzc ayjceq xq yjcw qcc yjcuqcvrcq. xzexjxu vpsdavs
Solution: tact is the ability to describe others as they see themselves. abraham lincoln
Exercise 2:
Crack the following "Cryptoclip" found in the "Daily News of the US Virgin Islands" on
March 21, 2001.
vkbjp v avpq bphvu baj hfj bahjk hy yjxbjxfjq bc bdjxbr rjvpy hx baj fccujp
(Hint: the cipher letter for E is j.)
Solution: after a hard trial the ice thief is sentenced to twenty years in the cooler
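If you want to check your own solution to an exercise, one option is to derive the key that your plain text implies and verify that it is a valid substitution. The following is a minimal sketch; derive_key is a hypothetical helper of our own.

```python
def derive_key(ciphertext, plaintext):
    """Build the cipher -> plain mapping implied by a proposed solution.

    Raises ValueError if the proposal is not a valid substitution.
    """
    if len(ciphertext) != len(plaintext):
        raise ValueError("lengths differ")
    key = {}
    for c, p in zip(ciphertext, plaintext):
        if not c.isalpha() or not p.isalpha():
            if c != p:  # blanks and punctuation must stay in place
                raise ValueError(f"mismatch: {c!r} vs {p!r}")
            continue
        if key.setdefault(c, p) != p:
            raise ValueError(f"{c!r} decrypts to both {key[c]!r} and {p!r}")
    if len(set(key.values())) != len(key):
        raise ValueError("two cipher letters share one plain letter")
    return key

ex1 = derive_key(
    "yxdy pq yjc xzpvpyw ya icqdepzc ayjceq xq yjcw qcc yjcuqcvrcq. xzexjxu vpsdavs",
    "tact is the ability to describe others as they see themselves. abraham lincoln",
)
ex2 = derive_key(
    "vkbjp v avpq bphvu baj hfj bahjk hy yjxbjxfjq bc bdjxbr rjvpy hx baj fccujp",
    "after a hard trial the ice thief is sentenced to twenty years in the cooler",
)
print(ex2["j"])  # e, matching the hint
```

A proposal that maps two cipher letters to the same plain letter, or one cipher letter to two plain letters, raises a ValueError instead of returning a key.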