I am having trouble with a program I need to write for my programming class. The class uses Python 3, and the assignment is to write a program that reads a file and displays a concordance for it. The problem is that when I run a file through the program, it counts how many times each character appears in the file rather than each word. Here is my program:

print ("enter file name")
f = input()
file = open(f)
z = file.read()
numdict = {}
my_num = 0
with open(f) as file:
    [word for line in z for word in line.split()]
for word in z:
    if not word in numdict:
        numdict[word] = []

    numdict[word].append(my_num)

print("word" , "frequency")
for key in sorted(numdict):
    print(key, len(numdict[key]))

2 Answers

First of all, I'm not entirely sure what you're trying to accomplish with

with open(f) as file:
    [word for line in z for word in line.split()]

That statement doesn't do anything, though: the list the comprehension builds is never saved anywhere, so you can delete it.

Your problem is in for word in z:. Since z = file.read(), z holds the text of the whole file as a single string, so when you iterate over it (for word in z:), word is set to each character of z in turn.
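To see the difference for yourself, here is a small sketch. The sample string is made up and only stands in for whatever file.read() returns:

```python
z = "to be\nor not"  # stand-in for the string returned by file.read()

# Iterating over a string yields one character at a time,
# including spaces and newlines.
chars = [ch for ch in z]

# Splitting on whitespace first yields whole words instead.
words = z.split()

print(chars[:5])
print(words)
```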

What you want instead is to iterate over the result of z.split(), which gives you every whitespace-separated word in z, i.e. for word in z.split():
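As a rough sketch of that change, the counting loop could be pulled into a small function. The name count_words and the sample string are invented for illustration; in your program the text would come from file.read() instead:

```python
def count_words(text):
    """Return a dict mapping each whitespace-separated word to its count."""
    counts = {}
    # str.split() with no arguments splits on any run of whitespace,
    # including newlines, so this loops over words, not characters.
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


sample = "the cat saw the dog"  # stand-in for file.read()
counts = count_words(sample)

print("word", "frequency")
for key in sorted(counts):
    print(key, counts[key])
```

Storing the count directly also avoids keeping a list per word only to take its length at the end.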

JJ Xu

It seems like you're confusing yourself a bit. Here's a naive but instructional approach (naive because it doesn't deal with edge cases or punctuation). The comments should help you understand what's going on:

wordDict = {}

with open('file.txt') as file:
    for line in file:
        # line is one line of the file
        words = line.strip().split()
        # words is a list of the words that were in line
        for word in words:
            # word is one word from the words list
            if word in wordDict:
                wordDict[word] += 1
            else:
                wordDict[word] = 1

If file.txt looks like this:

Hello Hello Hello Hello Hello Hello
What is going going on on on

Your wordDict will look like this:

{'Hello': 6, 'What': 1, 'is': 1, 'going': 2, 'on': 3}
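If you later want to handle some of those edge cases, one possible direction is sketched below. It is an assumption about what "a word" should mean, not part of the assignment: it lowercases each token and strips leading and trailing punctuation, using collections.Counter and string.punctuation from the standard library. The function name and sample text are made up.

```python
import string
from collections import Counter


def count_words_normalized(text):
    """Count words, ignoring case and surrounding punctuation."""
    counts = Counter()
    for raw in text.split():
        # Strip punctuation from both ends only; characters inside a
        # word (as in "don't") are left alone.
        word = raw.strip(string.punctuation).lower()
        if word:  # skip tokens that were nothing but punctuation
            counts[word] += 1
    return counts


sample = "Hello, hello! What is going on? Going on..."
print(count_words_normalized(sample))
```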
Mark