5

I have this simple file:

1|2|234 A=Jim 33
1|2|765 A=Sam 44
1|2|561 A=Edy 55

I want to parse the file to get the following output:

["1","2","Jim 33"]
["1","2","Sam 44"]
["1","2","Edy 55"]

I tried splitting by "|", but I don't know how to also split by "A=", or how to make the program recognize "A=" and print what comes after it.

The algorithm I have in mind is to iterate through the split items and check whether each one contains the substring "A=". I am not sure how to translate that into Python code. Is there a Pythonic way to do this?

falsetru
MEhsan
  • You could use a regex, like so: `(\d)\|(\d)\|(\d{3}) A=(.+)`, then get the groups. – Asad Saeeduddin Dec 09 '15 at 03:48
  • 1
    Does each line always have the same length for each part? Or might there be a line with, say, `AC=Alice`? – TigerhawkT3 Dec 09 '15 at 03:50
  • Thank you for asking... The length of the lines are not consistent ... The sample of the file I put is simplistic ... Any idea how to tackle this issue? @TigerhawkT3 – MEhsan Dec 09 '15 at 03:54

1 Answer

7

You can use a regular expression with re.split:

>>> import re
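>>> re.split(r'\|| A=', '1|2|234 A=Jim 33')
['1', '2', '234', 'Jim 33']

The pattern \|| A= matches either | or the string " A=" (a space followed by A=). The first | is escaped because an unescaped | has a special meaning in regular expressions (alternation, i.e. OR). The pattern is written as a raw string (r'...') so that the backslash reaches the regex engine unchanged; without the r prefix, newer Python versions warn about the invalid escape sequence \|.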
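Note that the split above still contains the field before " A=" ('234'), which the desired output in the question leaves out. A minimal sketch of one way to drop it, assuming every line contains exactly one " A=" separator and that the unwanted field is always the one immediately before it (parse_line is a hypothetical helper name, not part of any library):

```python
import re

def parse_line(line):
    """Split a line on '|' or ' A=' and drop the field just before ' A='.

    Assumes the line contains exactly one ' A=' separator.
    """
    # Raw string so the backslash reaches the regex engine unchanged.
    parts = re.split(r'\|| A=', line.rstrip('\n'))
    # parts[-2] is the field before ' A=' (e.g. '234'); keep everything else.
    return parts[:-2] + parts[-1:]

# Sample lines taken from the question.
lines = [
    '1|2|234 A=Jim 33',
    '1|2|765 A=Sam 44',
    '1|2|561 A=Edy 55',
]

for line in lines:
    print(parse_line(line))
```

Because the slice counts from the end of the list, this should also work when a line has more than two "|"-separated fields before the " A=" part. Lines without " A=" would need separate handling.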

falsetru