Right now my code prints how many times every word is used in the txt file. I'm trying to get it to ONLY print the top 3 words used with capital letters within the txt file...

file=open("novel.txt","r+")
wordcount={}
for word in file.read().split():
    if word not in wordcount:
        wordcount[word] = 1
    else:
        wordcount[word] += 1
for a,b in wordcount.items():
    print (b, a)
  • [`str.istitle`](https://docs.python.org/3/library/stdtypes.html#str.istitle) or [`str.isupper`](https://docs.python.org/3/library/stdtypes.html#str.isupper) should help you. – Jose A. García Nov 06 '17 at 21:02
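To make the difference between the two methods mentioned in the comment concrete, here is a small sketch. The helper name `capital_flags` and the sample words are made up for illustration; only `str.istitle` and `str.isupper` come from the standard library.

```python
def capital_flags(word):
    """Return (istitle, isupper) for a single word."""
    return word.istitle(), word.isupper()

# Hypothetical sample words, not taken from novel.txt
for word in ["Cats", "CATS", "cats", "McCavity", "Moon,", "I"]:
    print(word, capital_flags(word))
```

Note that `istitle()` rejects words with an uppercase letter in the middle (such as "McCavity"), while trailing punctuation does not affect either check.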

2 Answers


First you want to limit your result to only capitalized words using str.istitle():

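# Open read-only; the with block closes the file automatically
with open("novel.txt", "r") as file:
    words = file.read().split()
wordcount = {}
for word in words:
    if word.istitle():  # keep only capitalized words
        if word not in wordcount:
            wordcount[word] = 1
        else:
            wordcount[word] += 1
for a, b in wordcount.items():
    print(b, a)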

Then sort the results using sorted() and print out the first three:

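# Open read-only; the with block closes the file automatically
with open("novel.txt", "r") as file:
    words = file.read().split()
wordcount = {}
for word in words:
    if word.istitle():  # keep only capitalized words
        if word not in wordcount:
            wordcount[word] = 1
        else:
            wordcount[word] += 1

# Sort the (word, count) pairs by count, highest first
items = sorted(wordcount.items(), key=lambda tup: tup[1], reverse=True)

for item in items[:3]:
    print(item[0], item[1])  # print() call works in Python 3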
crenshaw-dev
  • Thank you! It is working to sort the capital letters but is still producing an error... `6 Jellicle 5 Cats 2 And 1 They 1 Moon Traceback (most recent call last): File "program.py", line 13, in items.sort(key=lambda tup: tup[1], reverse=True) AttributeError: 'dict_items' object has no attribute 'sort'` –  Nov 06 '17 at 21:23
  • @hendro3 The code would only work in Python 2. I've updated it to be Python 3 friendly. – crenshaw-dev Nov 06 '17 at 21:30
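The `AttributeError` in the comment above comes from the fact that in Python 3 `dict.items()` returns a view object, which has no `.sort()` method, whereas `sorted()` accepts any iterable and returns a new list. A minimal sketch, using the counts quoted in that comment as stand-in data rather than reading the real file:

```python
# Counts copied from the output quoted in the comment; used only as sample data
wordcount = {"Jellicle": 6, "Cats": 5, "And": 2, "They": 1, "Moon": 1}

items = wordcount.items()
print(hasattr(items, "sort"))  # the dict view has no .sort() in Python 3

# sorted() works on the view and returns a list of (word, count) pairs
top_three = sorted(items, key=lambda pair: pair[1], reverse=True)[:3]
for word, count in top_three:
    print(word, count)
```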

The `collections` module has a `Counter` class: https://docs.python.org/2/library/collections.html

from collections import Counter
cnt = Counter([w for w in file.read().split() if w.lower() != w]).most_common(3)
Jack Homan
  • He wants only the ones with capital letters. I suggest you change `file.read().split()` to `[w for w in file.read().split() if w.istitle()]` – Jose A. García Nov 06 '17 at 21:06
  • Thank you Jose! Yes, I am trying to read the text file and print the top 3 words used that are in capital letters –  Nov 06 '17 at 21:10
  • `[w for w in file.read().split() if w.istitle()]` seems concise to a fault for a beginner... probably better just to loop through everything and use `if word.istitle():`. I don't think there's a performance or readability benefit to the list comprehension. – crenshaw-dev Nov 06 '17 at 21:17
  • Well, I can read `[w for w in file.read().split() if w.istitle()]` as "word for each word in the file that I read and split into words, only if the word is title case". I think it is more concise, more readable, and much simpler. I mean, he just wants to count; if he wants to implement a counter, he can look at the [standard library](https://github.com/python/cpython/blob/3.6/Lib/collections/__init__.py) or ask Raymond Hettinger. If it is a simple thing to say, then it MUST be a simple thing to code (in Python). – Jose A. García Nov 06 '17 at 21:56
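Putting the `Counter` suggestion and the `istitle()` filter from this thread together, a self-contained sketch might look like the following. The function name `top_capitalized` and the sample text are invented for illustration; it works on a string so it can run without `novel.txt`.

```python
from collections import Counter

def top_capitalized(text, n=3):
    """Return the n most common title-cased words as (word, count) pairs."""
    words = (w for w in text.split() if w.istitle())
    return Counter(words).most_common(n)

# Made-up sample text standing in for the contents of the file
sample = ("Jellicle Cats and Jellicle Cats and Jellicle Cats and "
          "Jellicle Moon over the Moon while They sleep")
print(top_capitalized(sample))
```

For the real file, the same function could be called as `top_capitalized(file.read())` inside a `with open("novel.txt") as file:` block. Because the text is split on whitespace only, a word followed by punctuation (for example "Moon,") is counted separately from the bare word.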