I am still very new to Python. I have a large tab-delimited text file with a lot of data. I'm trying to take the values from columns 18 and 22, compute a value called "rate" from them, and append that value to the end of each line in my text file.

I cannot get any attempt at this to work. My attempt is shown below, but I always get an error (shown in the output below my code).

with open(fieldDataFile, 'rw') as f:
    lines = f.readlines()[1:]
    for i, line in enumerate(lines):
        ratecalc = (float(lines[i+1][21]) - float(lines[i][21]))/(float(lines[i+1][17]) - float(lines[i][17]))
        line[i] = line[i].strip() + str(ratecalc)    
    for line in lines:
        f.write(line)   

Output:

ValueError: could not convert string to float: n

I can't find any letters at all in columns 18 or 22, so I have no idea where this conversion error comes from. Even if the code works, I'm not sure if it would actually append the value.

Any help is greatly appreciated! Thank you!

EDIT: I tried printing out the lines I needed using:

print fieldDataFile
with open(fieldDataFile, 'rw') as f:
    lines = f.readlines()[1:]
    for i, line in enumerate(lines):
        print (lines[i][21])
        print (lines[i+1][21])
        print (lines[i][17])
        print (lines[i+1][17])

with the result:

n
n
i
i

Although according to the data in the file, it should be:

1452.1
1509.5
0
5.52

Unfortunately, I cannot share headers or more info since the file is proprietary, so I know this makes it hard to help. However, I don't notice anything particularly wrong with it, since it has headers like any other file and columns of numbers below each one in a tab delimited format.

AJS
  • have you tried printing out `lines[i+1][21]`,`lines[i][21]` , `lines[i+1][17]` and `lines[i][17]`? That's the first step in debugging. It will tell you if the assumptions you are making about the data are correct. According to the error message, one of those has the value "n". – Bryan Oakley Nov 25 '14 at 16:38
  • Apparently, all of column 21 has 'n' and all of column 18 has 'i', which makes no sense because that's definitely not what's in my file. I'm going to have to investigate, since I'm not allowed to post my text file here. – AJS Nov 25 '14 at 16:47
  • You can also print out the value of `line`, which will help you decide if you're reading the correct file. – Bryan Oakley Nov 25 '14 at 16:48
  • Yes, it's reading the right file, but there are so many columns of data, the rows are displayed in a wrapped format. Perhaps this can be causing the error? – AJS Nov 25 '14 at 16:52
  • It depends on what you mean by "wrap". If it's only wrapped by whatever you're using to view the data, no, it doesn't matter. If the data has physical newlines in the file causing what you think should be one line of data to be two physical lines in the file, it matters. – Bryan Oakley Nov 25 '14 at 16:56
  • It looks like it's only wrapped by the viewing window, since expanding the window unwraps it to a degree. I'll have to look at it again after lunch. Thanks so far! – AJS Nov 25 '14 at 17:11

4 Answers

I can't put this as a comment, but here is a piece of advice to help troubleshoot.

Read a smaller piece of your data and use type() on the column/field of interest to check whether spaces or other hidden characters are causing the values to be read as strings. If so, take a close look at the data. If you can post a small piece of it, I'm sure the people here will quickly help.
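A minimal sketch of that kind of inspection, written for Python 3 (the thread itself uses Python 2 print statements). The helper name `describe_fields` and the sample line are made up for illustration. Note that every field read from a text file is a `str`, so `repr()` is shown alongside `type()` because it makes stray spaces, tabs, and newlines visible.

```python
def describe_fields(line, columns, sep="\t"):
    """Return (index, type name, repr) for the requested fields of one line.

    `describe_fields` is a hypothetical helper, not part of any library.
    """
    fields = line.rstrip("\r\n").split(sep)
    return [(c, type(fields[c]).__name__, repr(fields[c])) for c in columns]


# Made-up sample line: the second field carries hidden padding spaces.
sample_line = "1\t 2.5 \tx\n"
for index, type_name, shown in describe_fields(sample_line, [0, 1]):
    print(index, type_name, shown)
```

On the real file, the same call with `columns=[17, 21]` on the first few data lines would show exactly what Python sees in columns 18 and 22.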

Minnow
  • Apparently, all of column 21 has 'n' and all of column 18 has 'i', which makes no sense because that's definitely not what's in my file. I'm going to have to investigate, since I'm not allowed to post my text file here. I'll have to keep looking into it. – AJS Nov 25 '14 at 16:48

As the output suggests, you have probably picked up a character "n" from your text file. Try printing out ratecalc to see what you are getting, and go from there.

chad

  1. As suggested, try printing out the components of ratecalc.
  2. While debugging with print, drop the float() calls and see what values are printed.
  3. Because you use i+1, on the last iteration your loop tries to read lines[i+1], which doesn't exist since you are already on the last line.
  4. Also, n may be a leftover newline character (\n) that is errant in your file.
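Points 2 and 3 can be illustrated with a small Python 3 sketch (the thread's own code is Python 2). The helper names and the sample data are made up. The first helper shows that indexing a line directly returns a single character, while indexing the result of `split` returns a whole field. The second pairs each line with its successor using `zip`, so no lookup ever reaches past the last line.

```python
def field_vs_character(line, index, sep="\t"):
    """Return (character at index, field at index) for one line of text."""
    fields = line.rstrip("\r\n").split(sep)
    return line[index], fields[index]


def consecutive_pairs(lines):
    """Pair each line with the one after it; the last line starts no pair."""
    return list(zip(lines, lines[1:]))


# Made-up data standing in for the proprietary file.
sample = ["time\tvalue\n", "0\t1452.1\n", "5.52\t1509.5\n"]
print(field_vs_character(sample[1], 1))
print(len(consecutive_pairs(sample[1:])))
```

If the header text happens to have the letter n at character position 21, or the data lines contain text fields, this is one way `lines[i][21]` could print a letter even though the numeric columns hold only numbers.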

You could also do a lookup similar to the accepted answer of "Get Line Number of certain phrase in file Python":

lookup = 'n'

with open(filename) as myFile:
    for num, line in enumerate(myFile, 1):
        if lookup in line:
            print 'found at line:', num
Douglas Denhartog
  • Okay I will try to lookup the 'n' and 'i' values with this. The file itself is full of numbers so I don't know where it's finding these strings. EDIT: apparently 'n' is in every single line using the above code. I'm so confused, they're all numbers... – AJS Nov 25 '14 at 16:56
  • Please open `fieldDataFile` in a text editor and paste the first 3 lines into your Question – Douglas Denhartog Nov 25 '14 at 17:22
  • Unfortunately I am not allowed to post it here from where I work. – AJS Nov 25 '14 at 18:21
  • Well, best of luck in your debugging then! – Douglas Denhartog Nov 25 '14 at 19:27

It turns out it was an issue with splitting the lines. I had to split each line on the tab delimiter before indexing into it, and then it worked. Thank you for your help!

with open(fieldDataFile, 'r') as f:
    lines = f.readlines()[1:]
    for i in range(len(lines[:-1])):
        status=0
        try:
            curralt= float(lines[i].split('\t')[21])
            currtime= float(lines[i].split('\t')[17])
            nextalt= float(lines[i+1].split('\t')[21])
            nexttime= float(lines[i+1].split('\t')[17])
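The snippet above is cut off in the source, so what follows is not the author's remaining code. It is a hedged Python 3 sketch of the overall task from the question: split each line on tabs, compute the rate between consecutive rows, and append it as a new column. The function names, the "rate" header label, the empty value for the last row, and writing to a separate output file are all assumptions made for illustration.

```python
def append_rates(lines, value_col=21, time_col=17, sep="\t"):
    """Return the data lines with a trailing rate field appended.

    `lines` must not include the header. The last row has no successor,
    so it gets an empty rate field (an assumption, as is the empty field
    used when two rows share the same time value).
    """
    rows = [line.rstrip("\r\n").split(sep) for line in lines]
    out = []
    for i, row in enumerate(rows):
        rate = ""
        if i + 1 < len(rows):
            nxt = rows[i + 1]
            dt = float(nxt[time_col]) - float(row[time_col])
            dv = float(nxt[value_col]) - float(row[value_col])
            if dt != 0:  # avoid dividing by zero
                rate = str(dv / dt)
        out.append(sep.join(row + [rate]))
    return out


def append_rates_to_file(in_path, out_path, **kwargs):
    """Copy in_path to out_path with a 'rate' column added to every line."""
    with open(in_path) as src:
        all_lines = src.read().splitlines()
    if not all_lines:
        return
    sep = kwargs.get("sep", "\t")
    header = all_lines[0] + sep + "rate"
    body = append_rates(all_lines[1:], **kwargs)
    with open(out_path, "w") as dst:
        dst.write("\n".join([header] + body) + "\n")


# Made-up three-column data; the real file would use the default indices.
demo = ["a\t0\t1452.1\n", "b\t5.52\t1509.5\n"]
for row in append_rates(demo, value_col=2, time_col=1):
    print(repr(row))
```

Writing to a second file sidesteps the `'rw'` mode used in the question, which is not a valid mode for `open` in Python 3, and keeps the original data untouched if a row fails to parse.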
AJS