I want a list of the most frequent words in English. I am processing Wikipedia text and am still stuck with a lot of words even after removing stop words. I tried googling for frequent words, but only found the link below.

http://en.wiktionary.org/wiki/Wiktionary:Frequency_lists#English

I would have to scrape the data from this link manually. Is there a known source for these words that can be downloaded directly?

Thank you

dnagirl
Boolean
  • Related posts / possible duplicates: http://stackoverflow.com/questions/2213607/how-to-get-english-language-word-database, http://stackoverflow.com/questions/824422/can-i-get-an-english-dictionary-word-list-somewhere, http://stackoverflow.com/questions/1594098/where-to-get-a-list-of-almost-all-the-words-in-english-language – Péter Török Sep 02 '10 at 07:57
  • Do you understand the question? – Boolean Sep 02 '10 at 08:00
  • Have you checked that none of the websites referred to in the linked threads contain the data you are looking for? – Péter Török Sep 02 '10 at 08:24

1 Answer

As with all statistics, your answer will depend on what you are sampling. Is your definition of "English" the language used in Wikipedia? As the page you linked suggests, word frequencies differ from one sample to another. Doing a literature review of language processing work may give you a dated list.
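Because the frequencies depend on the sample, one option is to count words in your own corpus and extend your stop list with the top entries. Here is a minimal sketch of that idea using only the Python standard library; the function name `top_words`, the tokenizing regex, and the sample sentence are illustrative assumptions, not part of any existing tool.

```python
import re
from collections import Counter


def top_words(text, n=10, stop_words=frozenset()):
    """Return the n most frequent lowercase words in text, skipping stop_words."""
    # Crude tokenizer (an assumption): runs of ASCII letters, with an
    # optional internal apostrophe so "don't" stays one token.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)


if __name__ == "__main__":
    # Hypothetical sample text standing in for a Wikipedia dump.
    sample = "The cat sat on the mat. The dog sat on the log."
    for word, count in top_words(sample, n=3, stop_words={"the", "on"}):
        print(word, count)
```

On real Wikipedia text you would feed in the article bodies and inspect the top few hundred entries by hand before adding them to the stop list, since some frequent words may still matter for your task.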

And trust someone to make a website with exactly that name: wordfrequency. More specifically, this.

whatnick