
I'm trying to read a file in Python that looks something like this:

hello\t\tsecondhello\n
this\t\tsecondthis\n
is\t\tsecondis\n
data\t\tseconddata\n

I'm only interested in the second piece of information on each line, so I'm trying to get rid of the two tabs and the newlines. I tried this:

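# Read all lines; the with-block closes the file afterwards
with open("data.txt", "r") as f:
    documents = f.readlines()
for line in documents:
    # Strip surrounding whitespace, then split on a tab character
    splitted = line.strip().split("\t")
    print(splitted)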

But this only gives me list objects that look like this:

['hello\t\tsecondhello']

I've also looked at the accepted answer to "splitting a string based on tab in the file", but it gives me the same result, except that the newlines are kept as well.

EDIT: Found the error: the input file was incorrectly formatted. Still, thanks for your help, everyone.

dot
  • `line.strip().split("\t\t")`? – Ozgur Vatansever Jan 30 '17 at 22:17
  • nope, tried that already, getting the exact same output – dot Jan 30 '17 at 22:20
  • I don't get the same result. I'm using Python 2.7, and I get each line splitting into three fields, as I expect: lines such as ['hello', '', 'secondhello']. Can you try printing the line and the split string, one character at a time? – Prune Jan 30 '17 at 22:21
  • I am using Python 3.5, and .split("\t\t") does the job; .split("\t") gives the same result that Prune mentioned. – Steve Deng Zishi Jan 30 '17 at 22:30
  • splitted = line.strip().split("\t")[2] gives the second value – Vaishali Jan 30 '17 at 22:33
  • the code basically just converts each line into a list object. so the line looks exactly like the list object. it never seems to actually split by tab – dot Jan 30 '17 at 22:33
  • @VaishaliGarg I'm getting an `index out of range` when I try to access it with [2] – dot Jan 30 '17 at 23:07
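
Following up on the suggestion above to inspect the line character by character: a quick way to see what a line really contains is `repr()`. This is only a minimal sketch; the two sample strings are made up for illustration and stand in for lines read from the file.

```python
# Two hypothetical lines that can look the same when viewed in an editor.
real_tabs = "hello\t\tsecondhello\n"            # two actual tab characters
literal_escapes = "hello\\t\\tsecondhello\\n"   # backslash followed by 't' or 'n'

for line in (real_tabs, literal_escapes):
    # repr() renders a real tab as \t and a literal backslash as \\
    print(repr(line), len(line))
    # Splitting on a real tab only works when the line contains real tabs
    print(line.strip().split("\t"))
```

If the split returns a single-element list, as in the question, the line most likely contains no real tab characters at all.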

2 Answers


It looks like the \t sequences in your file are literal backslash-and-t characters (escaped), not actual tab characters. So try:

line.strip().split("\\t\\t")
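
As a rough illustration of the idea, here is a small sketch that pulls out the second field when the separators are literal backslash-t sequences. The helper name `second_field` is hypothetical, and the handling of a trailing literal `\n` is an assumption about how the file might look, not something confirmed in the question.

```python
def second_field(line):
    """Return the text after a literal backslash-t pair, or None if absent.

    Hypothetical helper: assumes the separator is the four characters
    backslash, t, backslash, t rather than two real tabs.
    """
    line = line.strip()
    # Assumption: the line may also end with a literal backslash + n
    if line.endswith("\\n"):
        line = line[:-2]
    _, sep, second = line.partition("\\t\\t")
    return second if sep else None


print(second_field("hello\\t\\tsecondhello\\n"))
```

`str.partition` is used here instead of `split` so that a line without the separator yields `None` rather than an index error.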
koalo

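This works with the data you provided, assuming `documents` is the whole file read as a single string (for example via `.read()` rather than `.readlines()`):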

data = documents.strip().split('\n')
wanted_data = [item.split('\t')[2] for item in data if item]
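
For comparison, a self-contained variant of the same idea that splits on the double tab directly. It assumes the file contains real tab characters, which the question's EDIT suggests was not the case for the asker's actual file; the string below merely mirrors the sample from the question and stands in for the result of `open("data.txt").read()`.

```python
# Stand-in for the file contents read as one string (assumed to use real tabs).
documents = (
    "hello\t\tsecondhello\n"
    "this\t\tsecondthis\n"
    "is\t\tsecondis\n"
    "data\t\tseconddata\n"
)

# splitlines() drops the newlines; splitting on "\t\t" yields two fields per line.
wanted_data = [line.split("\t\t")[1] for line in documents.splitlines() if line]
print(wanted_data)  # ['secondhello', 'secondthis', 'secondis', 'seconddata']
```

Splitting on `"\t\t"` avoids the empty middle field that `split("\t")` produces, so the wanted value sits at index 1 instead of index 2.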
zipa
  • Getting this `AttributeError: 'list' object has no attribute 'strip'`. – dot Jan 30 '17 at 22:40
  • If my answer was helpful, don't forget to [accept](http://meta.stackexchange.com/a/5235/345643) it. Thanks. – zipa Jan 31 '17 at 10:41