Briefly, the CSA breaks the input ciphertext into a list of words and recursively tries to fit as many of them as possible to entries in a fairly long list of English words, longest word first. Among all the recursive tree walks it makes, it keeps the trial solution that fits the most words with a single key.
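The recursive fitting described above can be sketched roughly as follows. This is an illustration only, not the CSA's actual code: the names extend_key, solve, and decode are hypothetical, the word list is a tiny stand-in, and it assumes "longest word first" means the cipher words are tried in order of decreasing length.

```python
from typing import Dict, List, Optional, Tuple

Key = Dict[str, str]  # cipher letter -> plain letter


def extend_key(cipher_word: str, plain_word: str, key: Key) -> Optional[Key]:
    """Return a copy of key extended so cipher_word decodes to plain_word,
    or None if that would contradict the key built so far."""
    if len(cipher_word) != len(plain_word):
        return None
    new_key = dict(key)
    used = set(new_key.values())
    for c, p in zip(cipher_word, plain_word):
        if c in new_key:
            if new_key[c] != p:      # cipher letter already means something else
                return None
        elif p in used:              # plain letter already claimed by another cipher letter
            return None
        else:
            new_key[c] = p
            used.add(p)
    return new_key


def solve(cipher_words: List[str], word_list: List[str]) -> Tuple[int, Key]:
    """Walk the tree of word assignments and keep the key that fits the most words."""
    # Assumption: distinct cipher words are tried longest first.
    order = sorted(dict.fromkeys(cipher_words), key=len, reverse=True)
    best: Tuple[int, Key] = (0, {})

    def walk(i: int, key: Key, fitted: int) -> None:
        nonlocal best
        # Prune: even fitting every remaining word cannot beat the best so far.
        if fitted + (len(order) - i) <= best[0]:
            return
        if i == len(order):
            best = (fitted, key)
            return
        for candidate in word_list:
            new_key = extend_key(order[i], candidate, key)
            if new_key is not None:
                walk(i + 1, new_key, fitted + 1)
        # Also allow leaving this word unfitted (it may not be in the list).
        walk(i + 1, key, fitted)

    walk(0, {}, 0)
    return best


def decode(text: str, key: Key) -> str:
    """Apply the key; characters without a mapping (such as spaces) pass through."""
    return "".join(key.get(ch, ch) for ch in text)


count, key = solve("ifmmp xpsme".split(), ["hello", "world", "help"])
print(count, decode("ifmmp xpsme", key))
```

A real solver would index the word list by length and letter pattern instead of scanning the whole list at every step; the linear scan here is only to keep the sketch short.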
The CSA depends very strongly on the input having correct word boundaries and lengths; a ciphergram redivided into five-letter groups, for example, will not work. It is a little less dependent on the text having no wrong letters, but it assumes that every word of three letters or fewer is exact and is in its word list, so a mistake in a short word will almost surely cause it to fail. Two newspaper cryptoquotes in 1997 contained the word "oh", and because that word was not in the word list, the CSA failed on them; the list now contains "oh". The "Los" of Los Angeles tripped the CSA once. Nearly all cases that take a long time are caused by input with a letter left out of a word, or a wrong letter in a short word.
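To see why regrouping defeats a word-based solver, compare the repeated-letter pattern of each word before and after the text is redivided. The helpers word_pattern and regroup below are hypothetical names used only for this illustration; the CSA itself may represent patterns differently.

```python
from typing import List, Tuple


def word_pattern(word: str) -> Tuple[int, ...]:
    """Number each distinct letter by order of first appearance."""
    seen = {}
    return tuple(seen.setdefault(ch, len(seen)) for ch in word)


def regroup(text: str, size: int = 5) -> List[str]:
    """Discard the original word breaks and cut the letters into fixed-size groups."""
    letters = text.replace(" ", "")
    return [letters[i:i + size] for i in range(0, len(letters), size)]


original = "uif dbu tbu".split()      # three 3-letter cipher words
regrouped = regroup("uif dbu tbu")    # the same letters in groups of five

print([word_pattern(w) for w in original])
print([word_pattern(w) for w in regrouped])
```

The regrouped pieces have different lengths and letter patterns from the original words, so they no longer correspond to entries in an English word list.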
The CSA knows no grammar or word meanings. It is merely a brute-force pattern-matching algorithm. The upside is that you could replace the word list with, for example, a fairly complete and very accurate German list, and the CSA would solve German ciphergrams.
The CSA usually does not get stumped on the Cryptoquotes in the newspapers; that's probably because those have to be relatively easy.
Last updated 2002 September 17