reading lines starting specifically with some criteria

Question

I have a huge data file and need to extract the lines that start with, say, U(1.0 ----), regardless of where they appear, because the line numbers vary with each run.
I tried splitting and reading, but the output is unmanageable. Can anyone help me?

https://developers.google.com/edu/python/regular-expressions?hl=fr — Luca Davanzo, Jul 29 '14 at 11:03
1) Show us a sample of the input. 2) Tell us which lines of the input you want to extract and why. 3) Show us what you have done in Python to solve the problem. — , Jul 29 '14 at 11:06
possible duplicate of [Grep and Python](http://stackoverflow.com/questions/1921894/grep-and-python) — Luca Davanzo, Jul 29 '14 at 11:11

score 0 · Answer 1 · answered Jul 29 '14 at 11:12

You have to read the file (https://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files).
Then loop through the lines and look at the first part of each line.
Then check whether it matches a regular expression you design for that task.
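
A minimal sketch of those steps, assuming the wanted lines begin with "U(1.0" followed by whitespace. The pattern, the helper name matching_lines, and the sample data are illustrative assumptions and are not taken from the question.

```python
import io
import re

# Assumed pattern: the line begins with "U(1.0" followed by whitespace.
# re.match only matches at the start of the string, so no "^" is needed.
PATTERN = re.compile(r"U\(1\.0\s")


def matching_lines(lines, pattern=PATTERN):
    """Return the lines that match the pattern at their start."""
    return [line for line in lines if pattern.match(line)]


# Hypothetical sample standing in for the real data file.
sample = io.StringIO(
    "header text\n"
    "U(1.0 ----) first\n"
    "U(1.0234 xxxx) second\n"
    "  U(1.0 ----) indented\n"
    "U(1.0 ----) third\n"
)

selected = matching_lines(sample)
print(selected)
```

With a real file, an open file object can be passed in place of sample, since iterating over a file also yields one line at a time.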

Hope it helps you :)

score 0 · Answer 2 · answered Jul 29 '14 at 11:13

Use the startswith() string method on each line and add the matching lines to a separate list for analysis:

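# "whatever" is a placeholder for the path to your data file
with open("whatever") as f:
    data = f.readlines()  # reads the whole file into a list

results = []
for line in data:
    # Keep only the lines that begin with the prefix
    if line.startswith("U(1.0"):
        results.append(line)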

manicphase

score 0 · Answer 3 · answered Jul 29 '14 at 11:22

Similar to manicphase's answer, use Python's startswith string method to pick out the lines you are interested in.

with open('mydata.txt') as data:
    for line in data:
        if line.startswith('U(1.0 '):
            # Do stuff here
            pass  # placeholder so the block is valid Python

This is a little simpler and quicker than manicphase's solution: the file is read line by line, so you don't load everything into a list and then iterate over it a second time, which can hurt memory use and speed if you have a lot of data.

I don't have enough reputation to comment on manicphase's answer, so I shall make a note here instead: the space after the 1.0 matters if the values can have more than one decimal place (the question doesn't specify); otherwise the prefix would match U(1.0234 xxxx) as well.
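
To illustrate the point about the trailing space, here is a small sketch on made-up lines (the sample strings are assumptions, not data from the question). It compares the two prefixes side by side.

```python
# Hypothetical sample lines; the real file contents are not shown in the question.
lines = [
    "U(1.0 ----) wanted\n",
    "U(1.0234 xxxx) not wanted\n",
    "V(1.0 ----) other\n",
]

# Prefix without the trailing space also picks up U(1.0234 ...).
without_space = [line for line in lines if line.startswith("U(1.0")]

# Prefix with the trailing space only picks up U(1.0 ...).
with_space = [line for line in lines if line.startswith("U(1.0 ")]

print(len(without_space))  # 2
print(len(with_space))     # 1
```

If the separator after 1.0 might be a tab rather than a space, startswith also accepts a tuple of prefixes, for example ("U(1.0 ", "U(1.0\t").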
