What would be the best way to get a function that returns a random English word (preferably a noun) without keeping a list of all possible words in a file beforehand?
- This isn't a sensible question. Could you provide some additional context or a clue as to what you're trying to do? Generating English words without an English dictionary is a logical contradiction. Please clarify this. – S.Lott Feb 27 '09 at 11:14
- Fetching a word from any online resource designed to provide random words looks like a good idea. :-) – Paulo Guedes Feb 27 '09 at 11:50
- @joshhunt: What constitutes "massive"? Spellcheck dictionaries for English are about 400K. See http://aspell.net/ for a good one. – S.Lott Feb 27 '09 at 15:56
8 Answers
Word lists need not take up all that much space.
Here's a JSON wordlist with 2,465 words, all nouns. It clocks in at under 50K, about the size of a medium-sized JPEG image.
I'll leave choosing a random one as an exercise for the reader.
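For readers who would rather skip the exercise, here is a minimal Python sketch. It assumes the wordlist has been saved locally as a flat JSON array of strings; the filename `nouns.json` and the function name `random_noun` are made up for illustration and are not part of the linked list.

```python
import json
import random


def random_noun(path="nouns.json"):
    """Return one random entry from a JSON file holding a flat array of words."""
    with open(path, encoding="utf-8") as f:
        words = json.load(f)  # assumed shape: ["apple", "bridge", ...]
    return random.choice(words)
```

If the function is called often, loading the list once and keeping it in memory would avoid re-reading the file on every call.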

- This really is the best option. You could easily keep the entire list in memory, and you'll have complete control over the source -- no unexpected changes, no connection issues, no security concerns, and it should be much easier to implement overall. – Whatsit Feb 27 '09 at 14:44
You can't. There is no algorithm to generate meaningful words. You can only generate words that sound like English, but they won't have any meaning.

You could have the function try and parse an online resource such as:

Another theoretical approach: you could scrape Wikipedia's random article page and return the N-th word of the article.
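A rough Python sketch of this idea follows. It makes several assumptions: it uses Wikipedia's REST "random summary" endpoint rather than scraping the HTML page, it picks a random word from the summary instead of a fixed N-th word, and it keeps only purely alphabetic tokens of three or more letters so that dates and numbers are dropped. The function names and the User-Agent string are hypothetical.

```python
import json
import random
import re
import urllib.request

# Assumption: Wikipedia's REST endpoint for a random page summary.
RANDOM_SUMMARY_URL = "https://en.wikipedia.org/api/rest_v1/page/random/summary"


def pick_word(text, rng=random):
    """Pick a random, purely alphabetic ASCII word of 3+ letters, or None."""
    words = re.findall(r"\b[A-Za-z]{3,}\b", text)
    return rng.choice(words).lower() if words else None


def random_wikipedia_word():
    """Fetch a random article summary and pick one word from it (needs network)."""
    req = urllib.request.Request(
        RANDOM_SUMMARY_URL,
        headers={"User-Agent": "random-word-sketch/0.1"},  # hypothetical UA string
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        summary = json.load(resp)
    return pick_word(summary.get("extract", ""))
```

Note that the result is only as "English" as the article text, and common words such as "the" will come up far more often than rare ones.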

- It's a nice idea, but you might need to filter out dates, numbers, and non-English words. – Ben Feb 27 '09 at 12:43
- The results wouldn't be very random -- you'd tend to get the same few words a lot, and all sorts of other problems. – Whatsit Feb 27 '09 at 14:36
- @Whatsit I guess you're right. On the other hand, what does "random English word" really mean? If you ask somebody for a random word, you'll get a similar statistical distribution. – splattne Feb 27 '09 at 14:41
There's a random word generator here. It's not English, but it's English-ish: the words are similar enough to real language that a user can read them and hold them in short-term memory.
Source code is in C# and a bit kludged, but you could use a similar approach in Python to generate lots of words without having to store a massive list.
Alternatively, you could call the web service on the demo page directly - it's hosted on GoDaddy though, so no guarantees it will work in production!
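To give a feel for what a Python version might look like, here is a small sketch. It is not the algorithm from the linked C# source, which isn't reproduced here; it simply alternates consonants and vowels, which is one common way to get pronounceable non-words. The letter sets and the name `pseudo_word` are assumptions chosen for illustration.

```python
import random

CONSONANTS = "bcdfghjklmnprstvw"
VOWELS = "aeiou"


def pseudo_word(syllables=2, rng=random):
    """Build a pronounceable non-word from consonant-vowel syllables."""
    parts = []
    for _ in range(syllables):
        parts.append(rng.choice(CONSONANTS) + rng.choice(VOWELS))
    # Half of the time, close the word with a consonant so the
    # output looks less uniform.
    if rng.random() < 0.5:
        parts.append(rng.choice(CONSONANTS))
    return "".join(parts)
```

Calling `pseudo_word(3)` returns a six- or seven-letter string; nothing guarantees the result isn't accidentally a real word.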
You can download the "words common to SOWPODS and TWL" lists from http://www.math.toronto.edu/jjchew/scrabble/lists/. I put all the words in those files together, and the combined list weighed in at about 642K. Not huge by any standards. The lists do contain a whole lot of obscure words though, since they are meant for tournament Scrabble use. The good thing is that the lists form a substantial subset of the English language.
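Combining the downloaded files could look something like the sketch below. It assumes each file is plain text with one word per line, which may not match the actual format of those lists; any line that isn't a single alphabetic word is skipped. The function name `load_word_lists` is made up for this example.

```python
from pathlib import Path


def load_word_lists(paths):
    """Merge several one-word-per-line files into a sorted, de-duplicated list."""
    words = set()
    for path in paths:
        for line in Path(path).read_text(encoding="utf-8").splitlines():
            word = line.strip().lower()
            if word.isalpha():  # skips blank lines and anything with digits or spaces
                words.add(word)
    return sorted(words)
```

Picking a word is then a matter of passing the merged list to `random.choice`.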

Well, you have three options:
- Hard-code the list of words and initialize an array with it.
- Fetch the list from an internet location instead of a file.
- Keep a list of possible words in a file.
The only way to avoid the above is if you're not concerned whether the word is real: you can just generate random-length strings of characters. (There's no way to programmatically generate words without a dictionary list to go from.)
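For completeness, the "random-length strings of characters" fallback might be sketched like this in Python. The length bounds and the name `random_string` are arbitrary choices for illustration, and the output will almost never be a real word.

```python
import random
import string


def random_string(min_len=3, max_len=10, rng=random):
    """Return a string of random lowercase letters with a random length."""
    length = rng.randint(min_len, max_len)  # both bounds are inclusive
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
```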
