I am using the following code to import data in Python:

 one=[]
 two=[]
 three=[]
 four=[]
 five=[]
 six=[]
 x=0
 for line in open('text.txt', 'r'):
     if x==2:
         column0, column1, column2, column3, column4, column5 = line.split(',')
 else:
    column0, column1, column2, column3, column4, column5 = line.split(' ')
    one.append(column0)
    two.append(column1)
    three.append(column2)
    four.append(column3)
    five.append(column4)
    six.append(column5)
    x=x+1

This code imports this text file:

 1 2 3 4 5 6 
 1 2 3 4 5 6 
 1,2,3,4,5,6
 1 2 3 4 5 6
 1 2 3 4 5 6

But I am having trouble importing the following file:

 1 2 3 4 5 6 
 1 2 3 4 5 6 
 1,2,3,
 4,5,6 
 1 2 3 4 5 6
 1 2 3 4 5 6

Even though the data has a break on the third line, I want it to be imported the same way as the first text file. I tried importing by row and then using the number of variables for the third row, but I could not get it to work.

Does anyone know of any resources or examples that can help? Thanks!

user1026987

3 Answers

  1. You've asked nine questions on StackOverflow and only accepted an answer to one of them? Please show your appreciation for the kindness of strangers and accept some answers.

  2. Saying "I tried it and it doesn't work" is good, but how did it fail to work? What was the error message? If the program behaved unexpectedly, what did you actually expect?

  3. For reading comma-separated (or space-separated) data, I think you actually want Python's `csv` module. With an appropriate `delimiter` argument it will also read space-separated data, or you can replace the spaces with commas before you read the data.
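
A minimal sketch of the "replace the spaces with commas first" idea from point 3, using an in-memory string in place of the file. The names `sample`, `normalised` and `rows` are illustrative only. It covers the first file layout; it does not by itself rejoin a row that has been split across two lines.

```python
import csv
import io

# Stand-in for the contents of the first text file (assumed layout).
sample = "1 2 3 4 5 6 \n1 2 3 4 5 6 \n1,2,3,4,5,6\n"

# Strip trailing whitespace, then turn every space into a comma so that
# all lines share one delimiter before csv sees them.
normalised = "\n".join(line.strip().replace(" ", ",")
                       for line in sample.splitlines())

# csv.reader accepts any iterable of lines, including a StringIO object.
rows = list(csv.reader(io.StringIO(normalised)))
print(rows)
```

Stripping each line before the replacement matters here: a trailing space would otherwise become a trailing comma and produce an empty last field.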
Li-aung Yip
  • I don't think `csv` will work here since the data is not tabular and has inconsistent delimiters. – Mike DeSimone Mar 17 '12 at 03:18
  • If by 'not tabular' you mean that it's not a rectangular array of numbers, it actually doesn't matter - `csv` will split each line up into as many tokens as required. It doesn't matter if some lines have more tokens than others. Inconsistent delimiters can be fixed with a pre-processing run (replace spaces with commas) before you start. – Li-aung Yip Mar 17 '12 at 03:21
  • By "not tabular" I mean "a row may be split across multiple lines", see the second example. `csv` will do the same thing as his current code and read the data as two rows of three columns each. – Mike DeSimone Mar 17 '12 at 03:32

The line `for line in open('text.txt', 'r'):` iterates through the lines in the file. It pays no attention to your commas.

If you want to iterate by element instead of by line, you have to use a different loop.

You probably want to read this question: How to read numbers from file in Python? It shows how to read a number at a time, ignoring line breaks. You'll need to handle commas as well as whitespace, for example by replacing the commas with spaces before calling `split`, since `str.split` accepts only a single separator.

P.S. In your current code, the `if x==2:` does nothing. If you really want to count lines, you need the `enumerate` function.
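
A small sketch of counting lines with `enumerate`, using a list of strings in place of the open file. The names `lines`, `sep` and `rows` are illustrative, and the data follows the first file layout.

```python
# Stand-in for the lines of the first text file.
lines = ["1 2 3 4 5 6 ", "1 2 3 4 5 6 ", "1,2,3,4,5,6", "1 2 3 4 5 6"]

rows = []
# enumerate yields (index, item) pairs, so no manual counter is needed.
for index, line in enumerate(lines):
    # Only the third line (index 2) is comma-separated.
    sep = "," if index == 2 else " "
    rows.append(line.strip().split(sep))

print(rows)
```

The same loop works on a file object, since iterating over a file also yields one line at a time.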

Mike DeSimone

Here's one way to get the lines right. It's not a complete answer, but I think it addresses your problem:

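with open('Documents/import.txt') as f:
    fh = f.read()

for line in fh.split('\n'):
    print(line.strip())
    splits = line.split()  # split on whitespace first
    # a single token equal to the whole line means no spaces: use commas instead
    if len(splits) == 1 and splits[0] == line.strip():
        # drop the empty field left by a trailing comma
        splits = [item for item in line.strip().split(',') if item]
    print(splits)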

Whoops, I didn't read what you wanted closely enough. Try this instead:

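with open('Documents/import.txt') as f:
    fh = f.read()

the_list = []
for line in fh.split('\n'):
    print(line.strip())
    splits = line.split()  # split on whitespace first
    # a single token equal to the whole line means no spaces: use commas instead
    if len(splits) == 1 and splits[0] == line.strip():
        splits = line.strip().split(',')  # keeps '' after a trailing comma
    if splits:  # skip blank lines
        the_list.append(splits)

# second pass: a row ending in '' came from a line with a trailing comma,
# so the row that follows is its continuation and is merged into it
merged = []
for row in the_list:
    print(row)
    if merged and merged[-1][-1] == '':
        merged[-1].pop(-1)
        merged[-1].extend(row)
    else:
        merged.append(row)
the_list = merged

print(the_list)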
Joran Beasley
  • Would you mind explaining why you added the second for loop: `for i in range(len(the_list)): print the_list[i] if the_list[i][-1]=='': the_list[i].pop(-1) the_list[i].extend(the_list[i+1]) i += 1` – user1026987 Mar 18 '12 at 14:21