I'm doing a fairly simple homework problem for a Python class that involves statistics on characters, words, their relative frequencies and so on. At the moment I'm trying to analyse a string of text and produce a list of every unique word in it, each followed by the number of times it is used. This is an introductory course and I have very limited knowledge of Python (or any language, for that matter), so I have only come up with the following code:

for k in (""",.’?/!":;«»"""):
    text=text.replace(k,"")
text=text.split()
list1=[(text.count(text[n]),text[n]) for n in range(0,len(text))]
for item in sorted(list1, reverse=True):
    print("%s : %s" % (item[1], item[0]))

Unfortunately, this prints out each individual word of the text (in order of appearance) followed by its frequency n, and it does so n times for a word that occurs n times. Obviously this is not useful, so I'm wondering whether I can add a small bit of code to what I've already written so that each word appears in the list only once, and eventually in descending order of frequency. All the other questions like this that I've seen use a lot of code we haven't learned, so I think the answer should be relatively simple.

Bart
  • How do you think you are asking `sorted(..)` to sort on the "frequencies"? – UltraInstinct Jun 11 '12 at 10:46
  • Are you already familiar with dictionaries (dict, {})? You could use one to associate words with their appearance count. A Counter, as Martijn suggests, is a specialized kind of dictionary. – Pieter Witvoet Jun 11 '12 at 10:47
  • Also see: http://stackoverflow.com/questions/4088265/word-frequency-count-using-python – BioGeek Jun 11 '12 at 11:13
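
The dictionary idea from the comments can be sketched roughly as follows. This is only an illustration, not the asker's or any commenter's code: the function names `count_words` and `by_frequency` and the sample sentence are made up for the example, and the punctuation string is copied from the question.

```python
def count_words(text):
    """Return a dict mapping each distinct word to its number of occurrences."""
    # Strip the same punctuation characters that the question removes.
    for ch in """,.’?/!":;«»""":
        text = text.replace(ch, "")
    counts = {}
    for word in text.split():
        # get() returns 0 the first time a word is seen.
        counts[word] = counts.get(word, 0) + 1
    return counts


def by_frequency(counts):
    """Return (word, count) pairs, most frequent first; ties in alphabetical order."""
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical sample text, used only to show the output format.
sample = "the cat saw the dog. the dog ran!"
for word, n in by_frequency(count_words(sample)):
    print("%s : %s" % (word, n))
```

Because each word is a dictionary key, it can appear only once, which removes the repeated lines the question describes.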

1 Answer

Take a look at `collections.Counter`. You can use it to count your word frequencies, and its `most_common` method will help you print out the list in sorted order.

(No example code, as this is a homework question; you'll have to do some of the work yourself.)
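
For readers who want to check their own attempt afterwards, a minimal sketch of the `Counter` approach might look like this. The helper name `word_frequencies` and the sample sentence are assumptions for illustration; `collections.Counter` and its `most_common` method are the standard-library pieces named above, and the punctuation string is taken from the question.

```python
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each word in text to its number of occurrences."""
    # Remove the punctuation characters listed in the question.
    for ch in """,.’?/!":;«»""":
        text = text.replace(ch, "")
    # Counter tallies how often each item of the word list occurs.
    return Counter(text.split())


# Hypothetical sample text, used only to show the output format.
sample = "the cat saw the dog. the dog ran!"
freqs = word_frequencies(sample)

# most_common() yields (word, count) pairs from most to least frequent.
for word, n in freqs.most_common():
    print("%s : %s" % (word, n))
```

Called with no argument, `most_common()` returns every pair; passing an integer, as in `most_common(10)`, limits the result to that many of the most frequent words.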

Martijn Pieters