deepmoo was my entry in the Programmer of the Month (POTM) competition.
The journal I kept while working on deepmoo.
The source code to deepmoo. The version submitted for the contest was automatically generated from this. If you want to compile it you'll also need the trigraph data it #includes.
My program's guesses on the contest words:
C:\sean\potm>deepmoo -v DUCK WITHOUT RESOURCES
Guessing 'DUCK'
 1 ETAO 0 4
 2 INSH 0 4
 3 RDLU 0 2
 4 MPCG 1 3
 5 WYBF 0 4
 6 JKQV 0 3
 7 DUCK 4 0
Guessing 'WITHOUT'
 1 ETTAAAA 1 5
 2 OIINNNN 1 4
 3 SHHRRRR 0 5
 4 DLLUUUU 1 3
 5 MPPCCCC 0 7
 6 GWYBFJK 0 6
 7 VQXXZZZ 0 7
 8 BITTHOU 2 1
 9 TITOUGH 2 1
10 FITUTHO 2 1
11 WITHOUT 7 0
Guessing 'RESOURCES'
 1 ETTAAAAAA 0 8
 2 OIINNNNNN 0 8
 3 SHHRRRRRR 1 2
 4 DLLUUUUUU 1 3
 5 MPPCCCCCC 1 3
 6 GWWYYYYBB 0 9
 7 FJJKKKQXZ 0 9
 8 SERECEOUS 2 0
 9 SOROSCUSE 1 0
10 CERCOUSER 2 0
11 COOCUEERS 2 0
12 REOURESCE 2 0
13 RESOURCES 9 0
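The two numbers after each guess are not explained here. Every row of the transcript is consistent with reading them as (letters in the correct position, guess letters that do not occur anywhere in the word). That reading is inferred from the output, not taken from the contest rules, and the function name `score` is made up for this sketch:

```python
def score(guess, secret):
    """Return (hits, absent) under the reading inferred from the transcript.

    hits   = positions where guess and secret hold the same letter
    absent = guess letters that occur nowhere in secret
    This is an assumption about the contest's scoring, not its definition.
    """
    if len(guess) != len(secret):
        raise ValueError("guess and secret must have the same length")
    hits = sum(g == s for g, s in zip(guess, secret))
    absent = sum(g not in secret for g in guess)
    return hits, absent


# A few rows copied from the transcript.
print(score("MPCG", "DUCK"))            # transcript shows 1 3
print(score("SHHRRRRRR", "RESOURCES"))  # transcript shows 1 2
print(score("BITTHOU", "WITHOUT"))      # transcript shows 2 1
```

Note that repeated letters are counted by simple membership: the six R's in SHHRRRRRR are not counted as absent even though RESOURCES has only two.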
By total coincidence, deepmoo solves 'SEAN' in 3 guesses!
On dictionaries, lossy compression, and wordspaces:
---------------------------------------------------
Feb 15, 1999
I spent a long time exploring the notion of "wordspace".
It's illegal for us to use a dictionary. It's not
illegal for us to use digraphs or trigraphs or
quadgraphs, though, is it? (Well, quadgraphs
for 4-letter words would be a dictionary!)
I don't know how to reconcile this, since any restrictions
on the "set of words the program might generate" could
be interpreted as a dictionary. And the workaround
(explore the "dictionary" first, then fall back on the
full set of strings) prevents that definition from literally
applying.
Even just applying a little bit of smarts (you can't have
the same three letters appear in a row) restricts the
wordspace, so it doesn't seem fair to define a dictionary
as a restriction of a wordspace.
But what if I lossily compress a dictionary so that the
wordspace I pursue is the original dictionary plus
another 10% of garbage words or so?
What if it's 100% bigger? What if it's 10 times
as big? What if it's 100 times as big?
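To make the "lossy compression" idea concrete, here is a minimal sketch. It assumes a toy four-word list (hypothetical, not my real dictionary) and position-independent trigraphs; the helper names are invented for the example and this is not the deepmoo source. The trigraph table is all that gets stored, and the wordspace it generates contains every original word plus some garbage.

```python
# Hypothetical toy dictionary; the real one is not reproduced here.
WORDS = ["TRAIN", "BRAIN", "TRAIT", "GRAIN"]


def trigraphs(word):
    """All overlapping three-letter windows of a word."""
    return [word[i:i + 3] for i in range(len(word) - 2)]


def build_table(words):
    """The 'compressed dictionary': the set of trigraphs seen in any word."""
    table = set()
    for word in words:
        table.update(trigraphs(word))
    return table


def in_wordspace(candidate, table):
    """A string is in the wordspace if every one of its trigraphs is in the table."""
    return all(t in table for t in trigraphs(candidate))


def wordspace(table, length):
    """Enumerate every string of the given length (>= 3) whose trigraphs are all valid."""
    partial = set(table)  # all valid 3-letter prefixes
    for _ in range(length - 3):
        # Extend each string by one letter so that its newest trigraph is in the table.
        partial = {s + t[2] for s in partial for t in table if s[-2:] == t[:2]}
    return sorted(partial)


table = build_table(WORDS)
space = wordspace(table, 5)
garbage = [w for w in space if w not in WORDS]
print(len(table), "trigraphs ->", len(space), "strings;", "garbage:", garbage)
```

With this toy list the six stored trigraphs generate six strings: the four real words plus BRAIT and GRAIT, so the wordspace is 50% bigger than the dictionary it was built from.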
Here's information about some "restricted wordspaces"
or "synthetic dictionaries" for 9-letter words:
                N  Wordspace
5,429,503,678,976  Strings from AAAAAAAAA...ZZZZZZZZZ
  184,511,547,557  Three adjacent trigs (using 5693 found in the dictionary)
   62,902,335,833  Three adjacent trigs (using 3977 found in 9 letter words)
    1,832,210,380  Three adjacent trigs (1289 1st, 1988 2nd, 715 3rd)
    1,191,748,033  All 7 trigraphs valid (of 3977)
      100,129,697  All 7 trigraphs valid for that position
       65,278,757  All 6 quadgraphs valid (of 12580)
        3,813,306  All 6 quads positionally valid from full dict
          449,505  All 6 quads positionally valid from 9-letter-word dict
            6,475  9-letter words in the computer dictionary I've constructed
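The first four rows can be checked by plain arithmetic. They are consistent with reading "three adjacent trigs" as three independent trigraphs placed side by side (letters 1-3, 4-6, 7-9). The remaining rows depend on the actual tables and cannot be recomputed without them. A small check, with labels invented for the example:

```python
# Counts copied from the table above.
published = {
    "all 9-letter strings": 5_429_503_678_976,
    "three trigs, full dictionary": 184_511_547_557,
    "three trigs, 9-letter words": 62_902_335_833,
    "three trigs, per position": 1_832_210_380,
}

# Closed-form counts under the side-by-side reading.
computed = {
    "all 9-letter strings": 26 ** 9,
    "three trigs, full dictionary": 5693 ** 3,
    "three trigs, 9-letter words": 3977 ** 3,
    "three trigs, per position": 1289 * 1988 * 715,
}

for label, value in computed.items():
    status = "matches" if value == published[label] else "differs"
    print(f"{label}: {value:,} ({status})")

DICTIONARY_WORDS = 6_475  # 9-letter words in the real dictionary


def blowup(wordspace_size):
    """How many times larger a wordspace is than the real 9-letter word list."""
    return wordspace_size / DICTIONARY_WORDS


print(f"smallest synthetic wordspace: {blowup(449_505):.1f}x the dictionary")
```

All four computed values match the table. Even the tightest synthetic wordspace listed (449,505 strings) is roughly 69 times the size of the 6,475-word list.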
Unfortunately, at this point POTM became a tedious, boring exercise in trying to compress those tables, since that seemed the best way of improving my program. I never got it past "All 7 trigraphs valid for that position", though.
A run of my program using the 3,813,306-word synthetic wordspace:
C:\sean\potm>deepmoo -q -v DUCK WITHOUT RESOURCES
Guessing 'DUCK'
 1 ETAO 0 4
 2 INSH 0 4
 3 RDLU 0 2
 4 MPCG 1 3
 5 WYBF 0 4
 6 JKQV 0 3
 7 DUCK 4 0
Guessing 'WITHOUT'
 1 ETTAAAA 1 5
 2 OIINNNN 1 4
 3 SHHRRRR 0 5
 4 DLLUUUU 1 3
 5 MPPCCCC 0 7
 6 GWYBFJK 0 6
 7 VQXXZZZ 0 7
 8 TITHOUG 5 1
 9 WITHOUT 7 0
Guessing 'RESOURCES'
 1 ETTAAAAAA 0 8
 2 OIINNNNNN 0 8
 3 SHHRRRRRR 1 2
 4 DLLUUUUUU 1 3
 5 MPPCCCCCC 1 3
 6 GWWYYYYBB 0 9
 7 FJJKKKQXZ 0 9
 8 SORCOUSES 2 0
 9 CORRESCUE 1 0
10 CROCURESS 3 0
11 RESOURCES 9 0
Not good enough to win, anyway!