deepmoo was my entry in the programmer of the month competition.
The journal I kept while working on deepmoo.
The source code to deepmoo. The version submitted for the contest was automatically generated from this. If you want to compile it you'll also need the trigraph data it #includes.
My program's guesses on the contest words:
C:\sean\potm>deepmoo -v DUCK WITHOUT RESOURCES Guessing 'DUCK' 1 ETAO 0 4 2 INSH 0 4 3 RDLU 0 2 4 MPCG 1 3 5 WYBF 0 4 6 JKQV 0 3 7 DUCK 4 0 Guessing 'WITHOUT' 1 ETTAAAA 1 5 2 OIINNNN 1 4 3 SHHRRRR 0 5 4 DLLUUUU 1 3 5 MPPCCCC 0 7 6 GWYBFJK 0 6 7 VQXXZZZ 0 7 8 BITTHOU 2 1 9 TITOUGH 2 1 10 FITUTHO 2 1 11 WITHOUT 7 0 Guessing 'RESOURCES' 1 ETTAAAAAA 0 8 2 OIINNNNNN 0 8 3 SHHRRRRRR 1 2 4 DLLUUUUUU 1 3 5 MPPCCCCCC 1 3 6 GWWYYYYBB 0 9 7 FJJKKKQXZ 0 9 8 SERECEOUS 2 0 9 SOROSCUSE 1 0 10 CERCOUSER 2 0 11 COOCUEERS 2 0 12 REOURESCE 2 0 13 RESOURCES 9 0
By total coincidence, deepmoo solves 'SEAN' in 3 guesses!
On dictionaries, lossy compression, and wordspaces:
--------------------------------------------------- Feb 15, 1999 I spent a long time exploring the notion of "wordspace". It's illegal for us to use a dictionary. It's not illegal for us to use digraphs or trigraphs or quadgraphs, though, is it? (Well, quadgraphs for 4-letter words would be a dictionary!) I don't know how to reconcile this, since any restrictions on the "set of words the program might generate" could be interpreted as a dictionary. And the workaround (explore the "dictionary" first, then fallback on the full set of strings) prevents that definition from literally applying. Even just applying a little bit of smarts (you can't have the same three letters appear in a row) restricts the wordspace, so it doesn't seem fair to define a dictionary as a restriction of a wordspace. But what if I lossily compress a dictionary so that the wordspace I pursue is the original dictionary plus another 10% of garbage words or so? What if it's 100% bigger? What if it's 10 times as big? What if it's 100 times as big? Here's information about some "restricted wordspaces" or "synthetic dictionaries" for 9-letter words: N 5,429,503,678,976 Strings from AAAAAAAAA...ZZZZZZZZZ 184,511,547,557 Three adjacent trigs (using 5693 found in the dictionary) 62,902,335,833 Three adjacent trigs (using 3977 found in 9 letter words) 1,832,210,380 Three adjacent trigs (1289 1st, 1988 2nd, 715 3rd) 1,191,748,033 All 7 trigraphs valid (of 3977) 100,129,697 All 7 trigraphs valid for that position 65,278,757 All 6 quadgraphs valid (of 12580) 3,813,306 All 6 quads positionally valid from full dict 449,505 All 6 quads positionally valid from 9-letter-word dict 6,475 9-letter words in the computer dictionary I've constructed
Unfortunately, at this point POTM became a tedious, boring exercise in trying to compress those tables, since that seemed the best way of improving my program. I never got it past "All 7 trigraphs valid for that position", though.
A run of my program using the 3,813,306 word synthetic wordspace:
C:\sean\potm>deepmoo -q -v DUCK WITHOUT RESOURCES Guessing 'DUCK' 1 ETAO 0 4 2 INSH 0 4 3 RDLU 0 2 4 MPCG 1 3 5 WYBF 0 4 6 JKQV 0 3 7 DUCK 4 0 Guessing 'WITHOUT' 1 ETTAAAA 1 5 2 OIINNNN 1 4 3 SHHRRRR 0 5 4 DLLUUUU 1 3 5 MPPCCCC 0 7 6 GWYBFJK 0 6 7 VQXXZZZ 0 7 8 TITHOUG 5 1 9 WITHOUT 7 0 Guessing 'RESOURCES' 1 ETTAAAAAA 0 8 2 OIINNNNNN 0 8 3 SHHRRRRRR 1 2 4 DLLUUUUUU 1 3 5 MPPCCCCCC 1 3 6 GWWYYYYBB 0 9 7 FJJKKKQXZ 0 9 8 SORCOUSES 2 0 9 CORRESCUE 1 0 10 CROCURESS 3 0 11 RESOURCES 9 0
Not good enough to win, anyway!