Problem

I have two files that I need to read and extract data from. The data contains lines like these:

162,   520,20121028,  -28,    0
162,   520,20121029,   54,    0
162,   520,20121102,   48,    0
162,   520,20121103,   33,    0
162,   520,20121104,   12,    0

I need to unpack the data into a variable, which I want to add to a list so that I can analyze it.

What I tried

hoog = '/Users/arkin/programming/TX_STAID000162.txt'

data_laag = open('/Users/arkin/programming/TN_STAID000162.txt')

temp = []

for line in data_laag:
    niks = line.split(',')
    temp.append(niks[3])

I wanted to take the value at index 3 of each split line and append it to a list called temp. However, I get the error

index out of range 

The result of the split is:

['   162', '   520', '20121130', '  -28', '    0\r\n']
  • Note that some of those lines are empty... – jonrsharpe Dec 11 '14 at 10:59
  • Consider printing out the lines you're splitting and the results of the split. – bereal Dec 11 '14 at 11:00
  • In the file I read from, the lines are not empty. I added the blank lines here to clarify, but I will edit the question. –  Dec 11 '14 at 11:01
  • Your code should work, so the issue must be in your input file! – Don Dec 11 '14 at 11:16
  • I've tried to reproduce your error and it worked fine for me. Maybe your error description is wrong and you gave us the wrong data, or vice versa: the data here is good, but the file you are actually reading is not. – VP. Dec 11 '14 at 11:19
  • Victor, you are right: my input file has some text at the beginning. I now start splitting after the correct line. Pretty stupid of me, but a valuable lesson learned. Thank you! –  Dec 11 '14 at 11:23

1 Answer

Your code is correct, so something is wrong with your input file. Perhaps there is a line with only 2 comma-separated values, or just a blank line?
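One way to confirm this is to list the lines that split into too few fields. The following is only a minimal sketch: the helper name find_short_lines and the sample lines are made up for illustration, and in practice you would pass the open file object instead of the list.

```python
def find_short_lines(lines, min_fields=4):
    """Return (line_number, raw_line) pairs for lines with too few fields."""
    bad = []
    for number, line in enumerate(lines, start=1):
        # A blank line or a header without commas splits into a single field.
        if len(line.split(',')) < min_fields:
            bad.append((number, line))
    return bad


# Hypothetical sample: a header line and a blank line before the real data.
sample = [
    "Some header text\r\n",
    "\r\n",
    "162,   520,20121028,  -28,    0\r\n",
    "162,   520,20121029,   54,    0\r\n",
]

for number, line in find_short_lines(sample):
    # repr() makes whitespace and line endings visible.
    print(number, repr(line))
```

Any line reported here would raise the index error when niks[3] is accessed.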

You could use a regular expression to check that the line is valid before attempting to grab data that doesn't exist. Here's an example:

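import re

# Accept only lines that begin with four comma-terminated fields made up
# of whitespace, minus signs, and digits.
regex = re.compile(r"([\s -]*\d*,){4}")
temp = []

with open("data_file.txt") as data_laag:
    for line in data_laag:
        if regex.match(line):
            niks = line.split(',')
            temp.append(niks[3].strip())

print(temp)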

With this code, I get the output:

['-28', '54', '48', '33', '12']
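
As a possible alternative to the regular expression (a sketch, not part of the code above), the standard csv module can do the splitting, and any row that is too short or whose field is not a number can be skipped. The function name read_temps and the sample text are assumptions made for illustration.

```python
import csv
import io


def read_temps(handle, column=3):
    """Collect the integer in the given column, skipping unusable rows."""
    temps = []
    for row in csv.reader(handle):
        if len(row) <= column:
            continue  # blank line or a row with too few fields
        try:
            temps.append(int(row[column]))  # int() ignores surrounding spaces
        except ValueError:
            continue  # header text or other non-numeric content
    return temps


# Hypothetical sample standing in for the real file.
sample = io.StringIO(
    "Header text, with a comma\r\n"
    "\r\n"
    "162,   520,20121028,  -28,    0\r\n"
    "162,   520,20121029,   54,    0\r\n"
)
print(read_temps(sample))  # [-28, 54]
```

Unlike the version above, this returns integers rather than strings, which may be more convenient for analysis.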
linzwatt