Here is my code:

def getInputFile():
    bad = True
    while bad:
        try:
            fileName = input("Enter file name: ")
            # Open file for input
            f = open(fileName, "r")  # Note: "r" means open for reading.
            bad = False
        except Exception as err:
            print("Please enter a valid file name:")
    return f


lines=0
wordCount=0
fileHandler=getInputFile()


for lineOfText in fileHandler.readlines():
    lines += 1
    print(str(lines),str(lineOfText))
    f1=lineOfText.split()
    wordCount=wordCount+len(f1)
    print ("Word count:" +str(wordCount))

Currently, my program prints only the running total of words in the text file, but I want it to count the words in each line separately. I would also like the program to analyze the text file at the end and print out things such as "most words in a line" and "average words per line", but I can't do that with my current approach. Any help would be greatly appreciated.

A Butler
  • You are doing cumulative addition `wordCount=wordCount+len(f1)` .. of course you will get total at the end – Iron Fist Dec 03 '15 at 03:29
  • Also see: [Counting lines, words, and characters within a text file using Python](http://stackoverflow.com/questions/4783899/counting-lines-words-and-characters-within-a-text-file-using-python) – l'L'l Dec 03 '15 at 03:33

2 Answers

Create a list out of it:

result = [len(line.split()) for line in fileHandler]

Then you can find the total word count:

print(sum(result))

The word count for each line:

print(*result, sep='\n')

The highest word count:

print(max(result))

The average word count:

print(sum(result) / len(result))

If you also want to keep the text of each line, read the lines into a list first:

lines = fileHandler.readlines()

Then count the words:

result = [len(line.split()) for line in lines]

Then zip() these two lists:

print(*('{} -- {}'.format(line.rstrip('\n'), count) for line, count in zip(lines, result)), sep='\n')
TigerhawkT3
  • I appreciate the help. Would there be any way for me to print the word count for each line after printing out each line? – A Butler Dec 03 '15 at 03:39
  • @AButler - You should save the file contents as well, then, as shown in my edited answer. – TigerhawkT3 Dec 03 '15 at 03:44
  • since `split` splits on whitespace by default, that won't count exactly words. For example `some - sentence` will result in a length of 3. – OneCricketeer Dec 03 '15 at 03:45
  • The rules for counting words are, in a word, complex. They are also somewhat arbitrary, with different implementations having different definitions of what constitutes a word. `split()` may be naive and basic, but it's easy to use and understand. Better word counting methods are not in the scope of a question like this. – TigerhawkT3 Dec 03 '15 at 03:49
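
To illustrate the point raised in the comments above, here is a rough sketch of one alternative to `split()`. It assumes a "word" is any run of ASCII letters, digits, or apostrophes, which is only one of many possible definitions. The names `WORD_PATTERN` and `count_words` are made up for this example.

```python
import re

# Assumption: a word is a run of ASCII letters, digits, or apostrophes.
# Other definitions (hyphenated words, Unicode letters) need a different pattern.
WORD_PATTERN = re.compile(r"[A-Za-z0-9']+")


def count_words(text):
    """Count the substrings of text that match WORD_PATTERN."""
    return len(WORD_PATTERN.findall(text))


line = "some - sentence"
print(len(line.split()))   # whitespace splitting counts the lone dash
print(count_words(line))   # the pattern skips the lone dash
```

Whether this is "better" depends entirely on what you want to count; for a simple exercise, `split()` is usually good enough.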

You're almost there; you just need to add a couple of things:

lines=0
wordCount=0
mostWordsInLine = 0
fileHandler=getInputFile()


for lineOfText in fileHandler.readlines():
    lines += 1
    print(str(lines),str(lineOfText))
    f1=lineOfText.split()
    wordCount=wordCount+len(f1)
    if len(f1) > mostWordsInLine:
        mostWordsInLine = len(f1)
    print ("Word count:" +str(wordCount))

print("Average words per line: {}".format(wordCount / lines))
print("Most words in a single line: {}".format(mostWordsInLine))

EDIT: To print out the number of words in each line, you can change the print call inside the for loop.

Currently you're calling `print ("Word count:" +str(wordCount))`, which prints the cumulative total. Simply change this to `print('Word count: {}'.format(len(f1)))`, which prints the count for the current line only.
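
One thing to watch for: the average divides by the number of lines, so an empty file would raise `ZeroDivisionError`. Below is a small sketch of the summary step with that case guarded. The function name `summarize` and the dictionary keys are hypothetical, and returning zeros for an empty file is just one possible choice.

```python
def summarize(lines):
    """Return per-line word counts plus the maximum and the average."""
    counts = [len(line.split()) for line in lines]
    if not counts:
        # Assumption: report zeros for an empty file instead of raising.
        return {"per_line": [], "most": 0, "average": 0.0}
    return {
        "per_line": counts,
        "most": max(counts),
        "average": sum(counts) / len(counts),
    }


# Stand-in for the lines of a real file; hypothetical sample data.
stats = summarize(["a b c\n", "d\n", "e f\n"])
print("Most words in a single line: {}".format(stats["most"]))
print("Average words per line: {}".format(stats["average"]))
```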

dursk