1

I need to read a text.txt file and extract very specific data from it. The entries in text.txt look like this:

b(1,4,8,1,4,TEST,0,3,AAAA,Test,2-150,000)
a(1,1,3,1,3,BBBB,0,3,BBBB,Test,2-150,000)
a(1,0,2,1,4,TEST,0,3,CCCC,Test,2-150,000)
b(1,1,0,1,4,TEST,0,3,DDDD,Test,2-150,000)

I only want the lines that start with "a(", and from those I only need the strings after the 5th and 8th commas, so for line 2 that would be BBBB, BBBB.

My code so far is:

infile = open("text.txt","r") 
numlines = 0
found = []

for line in infile:
 numlines += 1
 if "a" in line:
  line=line[line.find("(")+1:line.find(")")]
  found.append(line.split(','))

wordLed=len(found)
for i in range(0,wordLed):
    print found[i]


infile.close()

This just gives me the full lines separated at the ",", but how can I index into them?

GEnGEr

4 Answers

4

The quick, short and dirty way:

with open('text.txt') as f:
    result = [line.split(',')[5:9:3] for line in f if line.startswith("a(")]
#                            ^^^^^^^
#                       "5 to 9 (excl.) by step of 3"
#                       that is items 5 and 5+3
#
#                       replace by [5] if you only want item 5
#                       replace by [5:9] if you want items 5 to 9 (excl.)

from pprint import pprint
pprint(result)

It is dirty because of the lack of error handling...

... anyway, given your sample data, this produces:

[['BBBB', 'BBBB'], ['TEST', 'CCCC']]
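To see why the `[5:9:3]` slice picks exactly the two wanted fields, here is a small sketch that applies it to one sample line from the question and compares it with plain indexing. The variable names are illustrative only.

```python
# One "a(" line taken from the question's sample data
line = "a(1,1,3,1,3,BBBB,0,3,BBBB,Test,2-150,000)\n"

fields = line.split(',')

# Indices 5 and 8 hold the strings after the 5th and 8th comma
by_index = [fields[5], fields[8]]

# The slice starts at 5, stops before 9, and steps by 3: indices 5 and 8
by_slice = fields[5:9:3]

print(by_index)
print(by_slice)
```

Both lists contain the same two strings, so the slice is just a compact way of writing the two index lookups.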
Sylvain Leroux
  • 1
    I like this solution the best, however, why 5:9:3... am I missing something? Nevermind... I understand.. and now I like it just a little bit less. :D – dss539 Aug 26 '14 at 14:17
  • @dss539 that's for selecting both values in one shot. – Raydel Miranda Aug 26 '14 at 14:19
  • @dss539 `[5:9:3]` means _"take values from 5 to 9 (excl.) by step of 3"_. It's a _trick_ to get item 5 and 5+3. – Sylvain Leroux Aug 26 '14 at 14:20
  • Sounds nice. Is it then possible to work with just one value of "result", like "CCCC", or do I need to index the whole thing again? For example, would print result[2,2] then be CCCC? – GEnGEr Aug 26 '14 at 14:21
  • @GEnGEr Dude, what do you mean by just one value? Please elaborate on your question. – Raydel Miranda Aug 26 '14 at 14:24
  • 1
    @GEnGEr This is the [slice notation](http://stackoverflow.com/questions/509211/pythons-slice-notation). Notice too that the result is a list of lists. Given the code in the answer, if you only want to _display_ the column with `CCCC` you can write `for row in result: print(row[1])`. Feel free to *ask another question* if you need more information. – Sylvain Leroux Aug 26 '14 at 14:27
  • Ah OK, http://stackoverflow.com/questions/509211/pythons-slice-notation sounds like just what I need now. Yes, getting a list of lists was intended. – GEnGEr Aug 26 '14 at 14:32
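As a sketch of the indexing discussed in these comments: `result` is a list of lists, so a single value is reached with two separate indices (row, then column), both counted from 0. An expression like `result[2,2]` would raise a `TypeError`, because lists cannot be indexed with a tuple. The `result` literal below simply copies the output shown in the answer.

```python
# Result produced by the answer's code for the sample data
result = [['BBBB', 'BBBB'], ['TEST', 'CCCC']]

# First index selects the row, second index selects the column
value = result[1][1]
print(value)

# Collect the second column of every row
second_column = [row[1] for row in result]
print(second_column)
```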
3

I would use the readlines function:

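with open("data.txt", "r") as f:
    lines = f.readlines()
# The with block has already closed the file, so no f.close() is needed
for line in lines:
    if line[0:2] == 'a(':
        fields = line.split(',')  # split once and reuse the list
        data1 = fields[5]
        data2 = fields[8]
        print(data1, data2)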
Northern
  • If a line is empty (ie ""), `line[0:2]` will crash the program – Pphoenix Aug 26 '14 at 14:22
  • 1
    That's right. Error handling is necessary to make it robust. Or simply check the length of the line and of the split list before using `line[0:2]`, `line.split(',')[5]` and `line.split(',')[8]` – Northern Aug 26 '14 at 14:35
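A minimal sketch of the length check suggested above. Note that slicing an empty string (`""[0:2]`) returns an empty string rather than raising, so the real risk is an `IndexError` when a matching line has fewer fields than expected. The helper name `extract_fields` and its parameters are hypothetical and not part of any answer here.

```python
def extract_fields(lines, prefix="a(", wanted=(5, 8)):
    """Return the wanted fields of every line that starts with prefix.

    Hypothetical helper written for illustration only.
    """
    rows = []
    for line in lines:
        if not line.startswith(prefix):
            continue  # non-matching and empty lines are skipped safely
        fields = line.strip().split(',')
        if len(fields) <= max(wanted):
            continue  # too few fields: skip instead of raising IndexError
        rows.append([fields[i] for i in wanted])
    return rows


# Sample lines from the question, plus an empty line and a short line
sample = [
    "b(1,4,8,1,4,TEST,0,3,AAAA,Test,2-150,000)\n",
    "a(1,1,3,1,3,BBBB,0,3,BBBB,Test,2-150,000)\n",
    "\n",
    "a(1,2)\n",
    "a(1,0,2,1,4,TEST,0,3,CCCC,Test,2-150,000)\n",
]
print(extract_fields(sample))
```

Skipping malformed lines silently is only one possible policy; logging or raising a descriptive error may suit other use cases better.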
1

You should check for the full prefix at the start of the line, i.e. `a(` instead of just `a`. You can also use split to turn the string into a list, splitting on `,`:

with open("text.txt", "r") as infile:
    for line in infile:
        if line.startswith("a("):  # Only lines that start with a(
            data = line.split(',')
            print(data[5])  # Field at index 5
            print(data[8])  # Field at index 8
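The difference between the two checks can be sketched with a made-up line. With the sample data in the question, `"a" in line` happens to work because no `b(` line contains a lowercase `a`, but that is a coincidence. The line below is hypothetical.

```python
# Hypothetical line: starts with "b(" but has a lowercase "a" in one field
line = "b(1,1,0,1,4,data,0,3,DDDD,Test,2-150,000)\n"

matches_substring = "a" in line          # True: "data" contains an "a"
matches_prefix = line.startswith("a(")   # False: the line starts with "b("

print(matches_substring, matches_prefix)
```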
Pphoenix
1
for line in [l for l in infile if l.startswith('a(')]:
    line = line[line.find('('):].strip('()\n').split(',')
    a_field, other_field = line[5], line[8]

You have already split the string; just index into the resulting list to get the fields you want.

dss539
  • @Raydel the OP had problems with extracting specific fields. I can include the filtering in my answer, too, I suppose – dss539 Aug 26 '14 at 14:13
  • Don't forget the `'\n'` that may end a line. A small modification can answer the question: `s[s.find('('):].strip('()\n').split(',')` – Taha Aug 26 '14 at 14:22
  • @Taha Added your suggestion. Thanks. – dss539 Aug 26 '14 at 14:24