The Book of Gehn

Break Hill Cipher with a Known Plaintext Attack

January 2, 2019

Given a matrix secret key \(K\) with shape \(n\textrm{x}n\), the Hill cipher splits the plaintext into blocks of length \(n\) and for each block, computes the ciphertext block doing a linear transformation in module \(m\)

$$ K p_i = c_i\quad(\textrm{mod } m)$$

For decrypting, we apply the inverse of \(K\)

$$ p_i = [K]^{-1} c_i \quad(\textrm{mod } m)$$

To make sense, the secret key \(K\) must be chosen such as its inverse exists in module \(m\).

Ready to break it?

Known plaintext attack

Because the Hill cipher is a linear cipher, it is vulnerable to a known plaintext attack.

For a secret key \(K\) with shape \(n\textrm{x}n\), we need \(n\) pairs of known plaintext and ciphertext blocks, each of length \(n\).

The resulting equations no only need to be linear independent in general but in modulo \(m\) too. If not, the calculus of the inverse of the system matrix will fail.

Let’s be the following ciphertext encrypted with an unknown matrix \(K\) with shape \(2\textrm{x}2\) module \(251\).

>>> from cryptonita import B

>>> ciphertext = B(
...     b'\x19\xdb&\x05,\x9f\x8a2\xeb.\x8fJ\x9b\xbcZb]7e\xe2f\x83\x96'
...     b'\xa8j[\xb2\x15\x89\x95\x19\xf04p\x061\xc8\xbf\xa0\xd8\xd0\xba'
...     b'L\xa4Jl\x98\xd9\x89\x95\n\x9b\xa8\x88=KL\xa0#\xddJl\xbcE\xb3'
...     b'\xad\xf5\xa5e\xe26\xf9\xc1Y\xb2\x15\x87\x08?\x95\xf4\r\xcb\x9e"'
...     b'\x85\xd8\xa0\xc8lMA\xcb\x9eZb\x97-\xb7\xd9~\xb7Bq\t\x03\x94\x1c'
...     b'@\x01/n\x83\x891\x92p\x10F\xech\xf7\xb8\xc5\xbb\xa8\x9cY\xcf\n')

>>> n = 2
>>> m = 251

Let’s be a known (partial) plaintext of 4 bytes (2 blocks of length 2)

>>> known_plaintext = B('Hill')
>>> at = 4

>>> partial_ciphertext = ciphertext[at:at+len(known_plaintext)]
>>> partial_ciphertext
',\x9f\x8a2'

With these two pairs Hi -> ,\x9f and ll -> \x8a2 we can build the following equation system:

$$ K p_1 = c_1 \quad(\textrm{mod } m) \\ K p_2 = c_2 \quad(\textrm{mod } m) $$

Each pair adds one equation or two if we see them in an unrolled way (we decompose each vector and matrix and make the dot product explicit):

$$ K_{1,1} p_{1,1} + K_{1,2} p_{1,2} = c_{1,1} \quad(\textrm{mod } m) \\ K_{2,1} p_{1,1} + K_{2,2} p_{1,2} = c_{1,2} \quad(\textrm{mod } m) \\ K_{1,1} p_{2,1} + K_{1,2} p_{2,2} = c_{2,1} \quad(\textrm{mod } m) \\ K_{2,1} p_{2,1} + K_{2,2} p_{2,2} = c_{2,2} \quad(\textrm{mod } m) \\ $$

All those equations can be seen as a single one if we see all the plaintext and ciphertext blocks/vectors as two matrices.

$$K P = C \quad(\textrm{mod } m)$$
>>> import numpy as np

>>> P = np.array(list(known_plaintext)).reshape(2,2).T
>>> P
array([[ 72, 108],
       [105, 108]])

>>> C = np.array(list(partial_ciphertext)).reshape(2,2).T
>>> C
array([[ 44, 138],
       [159,  50]])

Find the secret key matrix K

Then:

$$ K = C [P]^{-1} \quad(\textrm{mod } m)$$

Where \([P]^{-1}\) is the inverse of the matrix \(P\) in \((\textrm{mod } m)\) so we cannot apply a standard inverse operation.

Thankfully cryptonita already implements this inverse for us.

>>> from cryptonita.mod import inv_matrix

>>> iP = inv_matrix(P, m)

>>> np.dot(P, iP) % m
array([[1, 0],
       [0, 1]])

>>> K = np.dot(C, iP) % m
>>> K
array([[  4,  67],
       [123, 161]])

>>> (np.dot(K, P) % m == C).all()
True

Decrypt the plaintext

To decrypt the ciphertext we need the inverse of \(K\) in \((\textrm{mod } m)\)

>>> iK = inv_matrix(K, m)
>>> iK
array([[ 95,   1],
       [ 88, 191]])

>>> (np.dot(iK, C) % m == P).all()
True

Finally:

>>> cblocks = ciphertext.nblocks(2)

>>> plaintext = []
>>> for cblk in cblocks:
...     ci = np.array(list(cblk)).reshape(2, 1)
...     pi = np.dot(iK, ci) % m
...
...     plaintext.append(B(list(pi.ravel())))

>>> plaintext = b''.join(plaintext)
>>> print(plaintext)           # byexample: +norm-ws
b'The Hill cipher is a polygraphic substitution cipher based on linear
  algebra invented by Lester S. Hill in 1929. - From Wikipedia.'

Beyond a known plaintext attack (open questions)

Some open questions for a future post:

Break Hill Cipher with a Known Plaintext Attack - January 2, 2019 - Martin Di Paola