I'm trying to compute the average of the numbers in a specified column of a text file. Python raises an error saying it could not convert a string to a float, but I don't see where I could be passing it an invalid string.

def avg_col(f, col, delim=None, nhr=0):
    """
    file, int, str, int -> float

    Produces average of data stored in column col of file f

    Requires: file has nhr header rows of data; data is separated by delim

    >>> test_file = StringIO('0.0, 3.5, 2.0, 5.8, 2.1')
    >>> avg_col(test_file, 2, ',', 0)
    2.0

    >>> test_file = StringIO('0.0, 3.5, 2.0, 5.8, 2.1')
    >>> avg_col(test_file, 3, ',', 0)
    5.8
    """
    total = 0 
    count = 0


    skip_rows(f, nhr)
    for line in f: 
        if line.strip() != '':
            data = line.split(delim)
            col_data = data[col]
            total = sum_data(col_data) + total
            count = len(col_data) + count 
    return total / count

def sum_data(lod):
    '''
    (listof str) -> Real 

    Consume a list of string-data in a file and produce the sum

    >>> sum_data(['0'])
    0.0
    >>> sum_data(['1.5'])
    1.5

    '''
    data_sum = 0
    for number in lod: 
        data_sum = data_sum + float(number)
    return data_sum
Martijn Pieters
Kyle Dewhurst
    You need to share the full traceback, and if you can provide some sample data that reproduces the exception, that'd be far more helpful than just the code you posted. – Martijn Pieters Jan 26 '15 at 20:24

1 Answer

You are passing a single string to sum_data():

data = line.split(delim)
col_data = data[col]
total = sum_data(col_data) + total

data is a list of strings, so data[col] is a single element of that list: one string.

sum_data() on the other hand expects an iterable:

def sum_data(lod):
    # ...
    for number in lod: 

Iterating over a string gives you its individual characters:

>>> for element in '5.8':
...     print(element)
... 
5
.
8

Converting each character of such a string separately means you end up passing characters that are not numbers, such as the decimal point, to float():

>>> float('.')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float: .
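
The same failure can be reproduced with the question's sum_data() itself. This is a minimal sketch, assuming Python 3, where the error message puts the offending string in quotes:

```python
def sum_data(lod):
    """Sum an iterable of numeric strings (same logic as in the question)."""
    data_sum = 0
    for number in lod:
        data_sum = data_sum + float(number)
    return data_sum

# A list containing one string works as intended.
print(sum_data(['5.8']))

# A bare string is iterated character by character: '5', '.', '8'.
# float('.') is what raises the exception.
try:
    sum_data('5.8')
except ValueError as exc:
    print('ValueError:', exc)
```

The first call succeeds because the loop sees one element, '5.8'. The second fails on the second character.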

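Either pass in a list of strings, and count one value per line rather than len(col_data), which is the number of characters in the string: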

total += sum_data([col_data])
count += 1

or simply use float() on the one element you have:

total += float(col_data)
count += 1
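
Putting the second option into the original function gives something like the sketch below. It assumes Python 3, and skip_rows() is a hypothetical stand-in written for this example, since the question does not show its own version. The sample data is made up for illustration.

```python
from io import StringIO


def skip_rows(f, nhr):
    """Hypothetical stand-in for the question's helper: discard nhr header lines."""
    for _ in range(nhr):
        f.readline()


def avg_col(f, col, delim=None, nhr=0):
    """Average the values in column col of file f, skipping nhr header rows."""
    total = 0.0
    count = 0
    skip_rows(f, nhr)
    for line in f:
        if line.strip() != '':
            data = line.split(delim)
            total += float(data[col])  # data[col] is one string, so convert it directly
            count += 1                 # one value per non-blank row
    return total / count


# Illustrative data: one header row, then two data rows.
test_file = StringIO('x, y, z\n0.0, 3.5, 2.0\n1.0, 4.5, 4.0\n')
print(avg_col(test_file, 2, ',', nhr=1))  # prints 3.0
```

float() ignores surrounding whitespace, so the space after each comma and the trailing newline need no extra stripping. A file with no data rows would still raise ZeroDivisionError at the final division.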
Martijn Pieters