8.5 Cracking Keyword Substitution Cipher

Attacking a keyword substitution cipher is not always easy. Although we can benefit from frequency analysis, it may only be good enough to give us a few elements from the key, and a long and complicated key may well cause serious problems.

There is however another technique, which can also be used in solving other ciphers, that we can employ, and is known as Hill Climbing.

The idea is as follows: for each potential key we assign a score as to how suitable that key is to be the correct key (we can use any suitable technique we like, such as some other type of frequency analysis). If small changes in the key result in small changes in the score, then we can then think of the results as providing us with a graph with the higher values representing more likely keys. We then ‘guess’ an initial key and work out its score. We then change the key slightly and recalculate the score. If this new score is ‘worse’ than the original, then we continue to make small changes to the key until we either get a ‘better’ key or run out of possible changes to the key. If we get a better key, then we adopt that one as our current best bet and start the process over again, otherwise we stop the algorithm and our present key is then our best bet.

Hopefully, this will enable us to find the local maximum of the graph of scores, and with some luck, this local maximum may provide us with the correct key.

In Appendix I we have set up a form to encrypt and decrypt keyword substitution ciphers and added a version of the hill-climbing algorithm for you to play with.In our algorithm we use frequency analysis based on the frequency of trigrams, groups of three consecutive letters, but analysis using quadgrams (groups of 4 letters) is usually preferred as being more accurate.