1

I'm taking an online class and we were assigned the following task:

"Write a program that prompts for a file name, then opens that file and reads through the file, looking for lines of the form: X-DSPAM-Confidence: 0.8475 Count these lines and extract the floating point values from each of the lines and compute the average of those values and produce an output as shown below. You can download the sample data at http://www.pythonlearn.com/code/mbox-short.txt when you are testing below enter mbox-short.txt as the file name."

The desired output is: "Average spam confidence: 0.750718518519"

Here is the code I've written:

fname = raw_input("Enter file name: ")
fh = open(fname)
inp = fh.read()
for line in inp:
    if not line.strip().startswith("X-DSPAM-Confidence: 0.8475") : continue
pos = line.find(':')
num = float(line[pos+1:]) 
total = float(num)
count = float(total + 1)
print 'Average spam confidence: ', float( total / count )

The output I get is: "Average spam confidence: nan"

What am I missing?

jayro
  • Try to find out what the values of your variables are. – Scott Hunter Mar 19 '15 at 02:33
  • For one thing, you're not finding anything. The test if not line.strip().startswith("X-DSPAM-Confidence: 0.8475") only keeps lines that start with exactly X-DSPAM-Confidence: 0.8475, so every other X-DSPAM-Confidence line is skipped. That means float( total / count ) never gets real numbers to work with, since nothing is being added up. – reticentroot Mar 19 '15 at 02:34
  • You don't break out of the for loop when you get the right prefix. You should invert that condition. – sje397 Mar 19 '15 at 02:37
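
The point about the overly specific prefix in the comments above can be checked with a small sketch. It is written for Python 3 (the question itself uses Python 2), and the sample lines and the helper name is_confidence_line are made up for illustration; only the header name and the 0.8475 value come from the assignment text.

```python
# Hypothetical sample lines; only the header name and 0.8475 come from the assignment.
sample_lines = [
    "X-DSPAM-Confidence: 0.8475",
    "X-DSPAM-Confidence: 0.6178",
    "X-DSPAM-Probability: 0.0000",
]


def is_confidence_line(line):
    """Match on the header name only, not on one particular value."""
    return line.strip().startswith("X-DSPAM-Confidence:")


# Prefix that includes the value: matches only the 0.8475 line.
too_specific = [l for l in sample_lines
                if l.strip().startswith("X-DSPAM-Confidence: 0.8475")]

# Prefix that stops at the colon: matches every confidence line.
general = [l for l in sample_lines if is_confidence_line(l)]

print(len(too_specific), len(general))
```

With the value baked into the prefix only one line survives; matching on the header name keeps both confidence lines and still rejects the other header.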

15 Answers

1
values = []
#fname = raw_input("Enter file name: ")
fname = "mbox-short.txt"
with open(fname, 'r') as fh:
    for line in fh.read().split('\n'): #creating a list of lines
        if line.startswith('X-DSPAM-Confidence:'):
            values.append(line.replace('X-DSPAM-Confidence: ', '')) # I don't know whats after the float value

values = [float(i) for i in values] # need to convert the string to floats
print 'Average spam confidence: %f' % float( sum(values) / len(values))

I just tested this against the sample data and it works just fine.

reticentroot
  • Just make sure the file name you enter is correct, or that the path to the file is correct, and it should work just fine. If you want to type in just the file name, e.g. "mbox-short.txt", then make sure the script and the file are in the same directory – reticentroot Mar 19 '15 at 02:55
  • So I tried that and I got an error that stated I was trying to divide by zero on the final line – jayro Mar 19 '15 at 03:58
  • I ran it just like that, with the sample data from the link you provided, using Python 2.7 in the Sublime Text editor. – reticentroot Mar 19 '15 at 04:04
  • This is the output from terminal Average spam confidence: 0.750719. I just re-ran that code with zero changes. Copied and pasted it in an editor right off the website. – reticentroot Mar 19 '15 at 04:06
  • well idk why the autograder isn't accepting it and giving me that error, but it is happening and I didn't change a thing in it.... – jayro Mar 19 '15 at 04:24
  • jayro, note that in the code that hrand provided the filename is hardcoded, not read from stdin. Is that possibly your problem when uploading it? – Eric Renouf Mar 19 '15 at 13:23
  • Maybe. How could I adjust it to work for what I need? – jayro Mar 20 '15 at 03:50
  • Check out this thread to find out how to read until end of file http://stackoverflow.com/questions/21235855/how-to-read-user-input-until-eof – reticentroot Mar 20 '15 at 04:10
  • Also, attach the link to the problem, not the data. If you haven't figured it out by tomorrow evening, I'll make an account on the site and try submitting the answer, so I know exactly what the issue is – reticentroot Mar 20 '15 at 04:24
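
The divide-by-zero error reported in the comments above is what Python raises when no matching line was found, so the list of values is empty and its length is 0. A minimal sketch of a guard, assuming Python 3; the function name average_or_none is hypothetical and not part of any answer here:

```python
def average_or_none(values):
    """Return the mean of values, or None when the list is empty."""
    if not values:
        return None  # avoids ZeroDivisionError from dividing by len([]) == 0
    return sum(values) / len(values)


print(average_or_none([]))
print(average_or_none([0.5, 1.0]))
```

Returning None is only one possible choice; printing a message such as "no matching lines found" would also make it obvious that the wrong file was read.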
1
#try the code below, it is working.
fname = raw_input("Enter file name: ")
count=0
value = 0
sum=0
fh = open(fname)
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:") : continue
    pos = line.find(':')
    num = float(line[pos+1:])
    sum=sum+num
    count = count+1    
print "Average spam confidence:", sum/count
N_D
0

My guess from the question is that 0.8475 is just an example, and you should be finding all the X-DSPAM-Confidence: lines and reading those numbers.

Also, the indenting on the code you added has all the calculations outside the for loop. I'm hoping that is just a formatting error from the upload; otherwise that would also be a problem.

As a matter of simplification you can also skip the

inp = fh.read()

line and just do

for line in fh:

Another thing to look at is that total will always only be the last number you read.
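
To see why dropping fh.read() matters for more than simplicity, here is a small sketch, assuming Python 3 and using io.StringIO as a stand-in for the opened file (the two-line sample text is made up). Iterating over the string returned by read() yields single characters, so startswith can never match; iterating over the file object yields whole lines.

```python
import io

# Made-up two-line sample standing in for the contents of the real file.
text = "From: someone@example.com\nX-DSPAM-Confidence: 0.8475\n"

# Iterating over the string from read() yields one character at a time.
chars = [item for item in io.StringIO(text).read()]

# Iterating over the file object itself yields one line at a time.
lines = [item for item in io.StringIO(text)]

print(len(chars[0]), len(lines))
print(any(c.startswith("X-DSPAM-Confidence:") for c in chars))
print(any(l.startswith("X-DSPAM-Confidence:") for l in lines))
```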

Eric Renouf
  • You're right about the formatting error and I took the 0.8475 out, but I'm still getting the same error. – jayro Mar 19 '15 at 03:59
0
# Use the file name mbox-short.txt as the file name
fname = raw_input("Enter file name: ")
fh = open(fname)
count = 0
total = 0
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:") :     continue
    count = count + 1
   # print count
    num = float(line[20:])
    total +=num
   # print total
    average = total/count
print "Average spam confidence:", average
rocktheartsm4l
0

The way you're checking if it is the correct field is too specific. You need to look for the field title without a value (see code below). Also, your counting and totaling need to happen within the loop. Here is a simpler solution that makes use of Python's built-in functions. Using a list like this takes a little more space but makes the code easier to read, in my opinion.

How about this? :D

with open(raw_input("Enter file name: ")) as f:
    values = [float(line.split(":")[1]) for line in f.readlines() if line.strip().startswith("X-DSPAM-Confidence")]
    print 'Average spam confidence: %f' % (sum(values)/len(values))

My output:

Average spam confidence: 0.750719

If you need more precision on that float: Convert floating point number to certain precision, then copy to String
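
As a sketch of the precision point, assuming Python 3: %f rounds to six decimal places, while %.12g keeps twelve significant digits, which is the shape of the desired output in the question. The value below is simply copied from the question's desired output, not computed from the file.

```python
average = 0.750718518519  # value quoted in the question's desired output

six_decimals = "%f" % average      # fixed six decimal places
twelve_digits = "%.12g" % average  # twelve significant digits

print("Average spam confidence:", six_decimals)
print("Average spam confidence:", twelve_digits)
```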

Edit: Since you're new to Python, that may be a little too Pythonic :P Here is the same code expanded out a little bit:

fname = raw_input("Enter file name: ")
values = []
with open(fname) as f:
    for line in f.readlines():
        if line.strip().startswith("X-DSPAM-Confidence"):
            values.append(float(line.split(":")[1]))

print 'Average spam confidence: %f' % (sum(values)/len(values))
Community
rocktheartsm4l
0
fname = raw_input("Enter file name: ")
fh = open(fname)
x_count = 0
total_count = 0
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:") : continue
    line = line.strip()
    x_count = x_count + 1
    num = float(line[20:])  # "X-DSPAM-Confidence: " is 20 characters, so the value starts at index 20
    total_count = num + total_count
aver = total_count / x_count

print "Average spam confidence:", aver
0
user_data = raw_input("Enter the file name: ")
lines_list = [line.strip("\n") for line in open(user_data, 'r')]


def find_spam_confidence(data):
    confidence_sum = 0
    confidence_count = 0
    for line in data:  # use the argument rather than the global list
        if line.find("X-DSPAM-Confidence") == -1:
            pass
        else:
            confidence_index = line.find(" ") + 1
            confidence = float(line[confidence_index:])
            confidence_sum += confidence
            confidence_count += 1
    print "Average spam confidence:", str(confidence_sum / confidence_count)

find_spam_confidence(lines_list)
noobninja
0
fname = raw_input("Enter file name: ")
fh = open(fname)
c = 0
t = 0
for line in fh:
    if line.startswith("X-DSPAM-Confidence:") : 
        c = c + 1
        p = line.find(':')
        n = float(line[p+1:])
        t = t + n

print "Average spam confidence:", t/c
0
fname = input("Enter file name: ")
fh = open(fname)
count = 0
add = 0
for line in fh:
    if line.startswith("X-DSPAM-Confidence:"):
        count = count+1
        pos = float(line[20:])  # the value starts right after "X-DSPAM-Confidence: "
        add = add+pos
print("Average spam confidence:", add/count)  # divide the running total, not the built-in sum
0
fname = input('Enter the file name : ') # file name is mbox-short.txt
try:
    fopen = open(fname,'r') # open the file to read through it
except OSError:  # e.g. the file does not exist or cannot be read
    print('Wrong file name') #if user input wrong file name display 'Wrong file name'
    quit()
count = 0  # variable for number of 'X-DSPAM-Confidence:' lines
total = 0  # variable for the sum of the floating numbers

for line in fopen: # start the loop to go through file line by line
    if line.startswith('X-DSPAM-Confidence:'): # check whether a line starts with 'X-DSPAM-Confidence:'
        count = count + 1 # counting total no of lines starts with 'X-DSPAM-Confidence:'
        strip = line.strip() # remove leading and trailing whitespace from the selected line
        nline = strip.find(':') #find out where is ':' in selected line
        wstring = strip[nline+2:] # extract the string decimal value
        fstring = float(wstring) # convert decimal value to float
        total = total + fstring  # add the whole float values and put sum in to variable named 'total'
print('Average spam confidence:',total/count) # printout the average value
ofirule
0
total = float(num)

You forgot here to sum the num floats. It should have been

total = total+num 
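
A minimal sketch of the difference, assuming Python 3 and three made-up numbers. Plain assignment keeps only the last value, while adding to a running total (and to a separate counter) gives what the average needs.

```python
nums = [0.25, 0.5, 0.75]  # made-up values for illustration

# Assignment: the variable is overwritten on every pass, as in the question.
last_only = 0.0
for num in nums:
    last_only = float(num)

# Accumulation: total and count both grow on every iteration.
total = 0.0
count = 0
for num in nums:
    total = total + num
    count = count + 1

print(last_only, total, count, total / count)
```

The same reasoning applies to the counter in the question: count = float(total + 1) derives the count from the total instead of adding 1 to the previous count.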
β.εηοιτ.βε
0
fname = input("Enter file name: ")
fh = open(fname)
count=0
avg=0
cal=0
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:") :
        continue
    else:
        count=count+1
        pos = line.find(':')
        num=float(line[pos+1:])
        cal=float(cal+num)
        #print cal,count
avg=float(cal/count)
print ("Average spam confidence:",avg)
  • Please don't post only code as answer, but also provide an explanation what your code does and how it solves the problem of the question. Answers with an explanation are usually more helpful and of better quality, and are more likely to attract upvotes – Pouria Hemi Feb 05 '21 at 17:38
  • Hello and welcome to SO! While this code may answer the question, providing additional context regarding how and/or why it solves the problem would improve the answer's long-term value. Please read the [tour](https://stackoverflow.com/tour), and [How do I write a good answer?](https://stackoverflow.com/help/how-to-answer) – Tomer Shetah Feb 06 '21 at 05:48
0

IT WORKS JUST FINE !!!

# Use the file name mbox-short.txt as the file name

fname = raw_input("Enter file name: ")

if len(fname) == 0:
    fname = 'mbox-short.txt'

fh = open(fname)
count = 0
tot = 0
ans = 0

for line in fh:
    if not line.startswith("X-DSPAM-Confidence:") : continue
    count = count + 1
    num = float(line[20:])  # "X-DSPAM-Confidence: " is 20 characters, so the value starts at index 20
    tot = num + tot

ans = tot / count
print "Average spam confidence:", ans
Muhammad Mohsin Khan
-1
# Use the file name mbox-short.txt as the file name
fname = raw_input("Enter file name: ")
fh = open(fname,'r')
count=0
avg=0.0
cal=0.00 
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:") :        
        continue
    else:
        count=count+1
        pos = line.find(':')
        num=float(line[pos+1:])
        cal=cal+num
        #print cal,count
avg=float(cal/count)
print "Average spam confidence:",avg
-3
fname = raw_input("Enter file name: ")
fh = open(fname)
total = 0
count = 0
for line in fh:  # iterate over lines; iterating over fh.read() would yield single characters
    if not line.strip().startswith("X-DSPAM-Confidence:"):
        continue
    pos = line.find(':')
    num = float(line[pos+1:]) 
    total += num
    count += 1
print 'Average spam confidence: ', float( total / count )
ziwen lv