Markov Chain Algorithms
(A not very technical explanation)

How the algorithms work.

Markov algorithms do not require either mathematics or computers. It is possible to perform a Markov text generation algorithm with a pencil and paper (I will discuss how to do this below). The only other thing required is an input text. There are mathematically oriented accounts available that also discuss such matters. This one will avoid equations entirely. To illustrate my discussion, I will use this paragraph.

Markov algorithms work with patterns.

Markov algorithms determine how likely it is that one word will follow another. What happens when two different words are equally likely to follow? The answer is that the algorithm tosses a coin, or, to put it another way, it makes a random choice.

Example: in the first paragraph the word “will” appears three times. It is followed by “discuss”, “avoid”, and “use”.

If a Markov algorithm were to encounter the word “will” in the above, it could randomly choose one of these three words, each being equally probable. It might then take, for instance, the word “discuss” and follow it up with either “how” or “such”. It continues in this way, taking a word, finding one to follow it, then taking the last word found and adding another. As you can see, it deals with one pair of words at a time: it adds a word, and this makes a new pair. (It is possible for the algorithm to use threes and fours or more, but pairs seem to produce more interesting results.)
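To make this concrete, here is a minimal sketch of the pair-based version in Python (the function names and ‘input.txt’ are illustrative only, not any particular published implementation):

    import random

    def build_table(words):
        # Map each pair of adjacent words to the list of words seen following it.
        table = {}
        for i in range(len(words) - 2):
            pair = (words[i], words[i + 1])
            table.setdefault(pair, []).append(words[i + 2])
        return table

    def generate(words, length=50):
        table = build_table(words)
        pair = (words[0], words[1])           # start from the text's first pair
        output = list(pair)
        for _ in range(length):
            followers = table.get(pair)
            if not followers:                 # a dead end: the text's final pair
                break
            word = random.choice(followers)   # the 'coin toss'
            output.append(word)
            pair = (pair[1], word)            # old second word plus new word: a new pair
        return ' '.join(output)

    words = open('input.txt').read().split()  # 'input.txt' stands for any source text
    print(generate(words))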

With short texts, like the first paragraph, it doesn’t have many options available to it, as most of the words appear only once. But with longer texts it can produce many variations.

The texts it produces often seem quite like the source text. Frequently they also sound rather strange.

One of the things about a Markov chain algorithm is that it treats punctuation as part of a word. So it would treat “word.” as a word, if you see what I mean. This is quite useful, as it frequently manages to punctuate plausibly by keeping words and their punctuation marks together. Sometimes it does not, though, and puts punctuation marks in funny places.
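In code this behaviour falls out naturally if, as in the sketch above, the input is simply split on white space, so that punctuation stays glued to its word. In Python, for instance:

    >>> 'Markov algorithms work with patterns.'.split()
    ['Markov', 'algorithms', 'work', 'with', 'patterns.']

Here “patterns.” is a different ‘word’ from “patterns”, and each collects its own followers.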

Markov algorithms have often been used to produce texts that are both nonsensical and rather plausible.

A quite well-known example is “Mark V. Shaney”. He is discussed in:
  Kernighan, B. W. and Pike, R. (1999) The Practice of Programming.
  Reading, MA: Addison-Wesley.

Apparently he confused people who thought he was ‘real’. He’s still out there living on the web somewhere.

A Few Observations.

My own use of Kernighan and Pike’s algorithm differs quite a bit, in that I am not really interested in hoodwinking anyone. The text that my Markov Generator uses is my own research into text generation. I was trying to generate a text that was a text generation text…

A slightly more technical point is that Markov algorithms tend always to start with the same word, since generation begins from the start of the input text. This becomes repetitious after a while. So I run my text through a ‘shuffle’ first, which stops it doing that.
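Exactly what gets shuffled is an implementation detail; here is a minimal sketch, assuming the text is shuffled a line at a time before being handed to the generator above:

    import random

    lines = open('input.txt').read().splitlines()
    random.shuffle(lines)               # a different line now comes first on each run
    words = ' '.join(lines).split()     # feed this to the generator as before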

Markov algorithms are, at best, only as syntactical as the texts they use. If the text fed in is just one word repeated many times, the algorithm can only produce that same text. If the text consists of a jumble of words appearing with equal frequency, then the text it makes is not likely to be any more like English than the input. (Obviously in English, for example, we choose our words with slightly more care.)

In other words, Markov algorithms just do a calculation. They cannot produce sentences for themselves. For that you need something like a recursive grammar (my Virtual Dictionary uses one to make its texts). These should always produce grammatical sentences, so long as the rules are written that way. They will probably produce nonsense too, however; see Bulhak, A. C. (1996) ‘On the Simulation of Postmodernism and Mental Debility using Recursive Transition Networks’. Natural Language Programming tries to overcome this, but that is the subject of another discussion.
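For contrast, here is a minimal sketch of a recursive grammar of the general kind meant here (the rules are a toy example, not those of the Virtual Dictionary):

    import random

    grammar = {
        'S':  [['NP', 'VP']],
        'NP': [['the', 'N'], ['the', 'A', 'N']],
        'VP': [['V', 'NP']],
        'N':  [['algorithm'], ['text'], ['word']],
        'A':  [['random'], ['plausible']],
        'V':  [['generates'], ['shuffles']],
    }

    def expand(symbol):
        if symbol not in grammar:              # a terminal: a literal word
            return [symbol]
        rule = random.choice(grammar[symbol])  # pick one expansion at random
        words = []
        for part in rule:
            words.extend(expand(part))
        return words

    print(' '.join(expand('S')) + '.')  # e.g. "the random text shuffles the word."

Every sentence it makes is grammatical, because only grammatical combinations are in the rules; whether it means anything is another matter.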

Lastly, Markov algorithms have a very short memory. That is to say, they produce a text based on their count of word frequencies, and each time the algorithm is run it starts anew. Markov processes are a kind of ‘finite state machine’: each state is determined only by the one before it. A Markov algorithm only looks through a text on a word pair by word pair basis (although, as noted above, it can handle longer sequences). If you want something that seems to live and grow, you might be interested in artificial life programming.
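Incidentally, the pair in the earlier sketch widens into a longer memory simply by making the state a longer tuple (an illustrative n parameter; the rest of the algorithm is unchanged):

    def build_table(words, n=2):
        # The 'state' is now a tuple of n consecutive words rather than a pair.
        table = {}
        for i in range(len(words) - n):
            state = tuple(words[i:i + n])
            table.setdefault(state, []).append(words[i + n])
        return table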


Shannon’s Pen and Paper Markov Method.

Shannon is the founder of information theory. In his 1948 paper ‘A Mathematical Theory of Communication’ he explains the basic technique (although the version given here is a little different).

Choose a pair of words at random from a novel. Read through the novel until the second of the two words is encountered again, and write down the word that follows it. Carry on reading until you hit the word you just wrote down, and when you find it write down the word that follows that. Continue until you have made something interesting or exhaustion sets in.
This is based on a now obscure text by J. R. Pierce describing Shannon’s techniques: Pierce, J. R. (1971) ‘A chance for art’, in Reichardt, J. (ed.) Cybernetics, Art and Ideas. London: Studio Vista.

The method may produce mostly garbage, with the occasional more interesting passage. It is also very slow. Easier to get a computer to do it?
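It probably is. Here is a sketch of the pen and paper method above, automated (‘novel.txt’ stands for whichever novel you choose):

    import random

    words = open('novel.txt').read().split()

    i = random.randrange(len(words) - 1)   # choose a pair of words at random
    output = [words[i], words[i + 1]]
    pos = i + 2                            # keep reading from here

    for _ in range(100):                   # or until exhaustion sets in
        target = output[-1]                # the word just written down
        try:
            j = words.index(target, pos)   # read on until it is encountered again
        except ValueError:
            j = words.index(target)        # reached the end: start over from the front
        if j + 1 >= len(words):
            break
        output.append(words[j + 1])        # write down the word that follows it
        pos = j + 2

    print(' '.join(output))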

Wayne Clements  01/05

  Essays published on this website are copylefted according to the GNU General Public License