Removing tab delimited spaces from a text file using for loop

Question

For my Python class, I am working on opening a .tsv file that holds 15 rows of data in 4 columns, and turning each line into a list. To do this, I must remove the tabs between the columns.

I've been advised to use a for loop and loop through each line. This makes sense but I can't figure out how to remove the tabs.

Any help?

score 4 · Answer 1 · answered Mar 21 '11 at 00:44

To read lines from a file, and split each line on the tab delimiter, you can do this:

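rows = []
# Open in text mode so each line is a str, and close the file when done.
with open('file.tsv') as f:
    for line in f:
        # Drop the trailing newline, then split on tabs.
        rows.append(line.rstrip('\n').split('\t'))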

answered Mar 21 '11 at 00:44

samplebias

score 4 · Answer 2 · edited May 23 '17 at 11:47

Properly, this should be done using Python's csv module (as mentioned in another answer), as it handles escaped separators, quoted values, and so on.
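
As a rough illustration of that difference, the sketch below runs both approaches on a single line whose quoted field contains a tab. The sample line and the variable names are made up for this example.

```python
import csv
import io

# Hypothetical sample line: the second field is quoted and contains a tab.
line = 'name\t"has\ttab"\t3\n'

# Plain split breaks the quoted field in two and keeps the quote characters.
naive = line.rstrip('\n').split('\t')

# csv.reader treats the quoted text as a single field.
parsed = next(csv.reader(io.StringIO(line), delimiter='\t'))

print(naive)
print(parsed)
```

If the data never contains quoted fields or embedded tabs, both approaches give the same result.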

In the more general sense, this can be done with a list comprehension (here file is an already-open file object):

rows = [line.rstrip('\n').split('\t') for line in file]

And, as suggested in the comments, in some cases a generator expression would be a better choice:

rows = (line.rstrip('\n').split('\t') for line in file)

See Generator Expressions vs. List Comprehensions for some discussion on when to use each.
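
A minimal sketch of the practical difference, assuming a small in-memory stand-in for the file (the data and variable names here are invented for illustration):

```python
import io

# Made-up tab-separated data standing in for the contents of a file.
data = 'a\tb\tc\td\ne\tf\tg\th\n'

# List comprehension: every row is split and stored up front.
rows_list = [line.rstrip('\n').split('\t') for line in io.StringIO(data)]

# Generator expression: rows are split one at a time, as they are requested.
rows_gen = (line.rstrip('\n').split('\t') for line in io.StringIO(data))
first = next(rows_gen)  # only the first line has been consumed so far
rest = list(rows_gen)   # consumes the remaining rows

# A generator can be iterated only once; it is now exhausted.
leftover = list(rows_gen)
```

The list is the simpler choice when the rows are needed more than once or need indexing; the generator mainly helps when the file is large and each row is processed once.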

answered Mar 21 '11 at 03:28

Blair

I would actually use a [generator expression](http://www.python.org/dev/peps/pep-0289/) here instead of a list comprehension, so that you're not holding a bunch of lists in memory while you process them. Depends on what you're doing with the results, though. – Sasha Chedygov Dec 12 '12 at 22:10

score 3 · Answer 3 · answered Mar 21 '11 at 03:36

You should use Python's stdlib csv module, particularly the csv.reader function.

import csv
rows = list(csv.reader(open('yourfile.tsv', newline=''), delimiter='\t'))

There's also a dialect parameter that accepts 'excel-tab' to conform to Microsoft Excel's tab-delimited format.
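
A short sketch of the dialect variant, assuming a throwaway file written just for the example (the file contents and variable names are hypothetical):

```python
import csv
import os
import tempfile

# Write a small made-up TSV file so the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.tsv', delete=False, newline='') as tmp:
    tmp.write('id\tname\tqty\tprice\r\n1\twidget\t3\t9.99\r\n')
    path = tmp.name

# Opening with newline='' is what the csv documentation recommends.
with open(path, newline='') as f:
    rows = list(csv.reader(f, dialect='excel-tab'))

os.remove(path)  # clean up the temporary file
print(rows)
```

Each row comes back as a list of strings, so numeric columns still need converting with int() or float() if required.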

score 2 · Answer 4 · answered Mar 21 '11 at 00:45

Check out the built-in string methods. split() should do the job.

>>> line = 'word1\tword2\tword3'
>>> line.split('\t')
['word1', 'word2', 'word3']

answered Mar 21 '11 at 00:45

user35147863
