How does one split a file into two parts at a keyword, and then search that file for the expression 'chain "edt_'?

martineau
Cookies Monster
    I think you are looking for this: [Binary search over a huge file with unknown line length](http://stackoverflow.com/questions/8369175/binary-search-over-a-huge-file-with-unknown-line-length) – Prashanth Kondedath Apr 25 '17 at 23:44

1 Answer

Here is an example text file (named temp.txt):

hello 10 20 30
goodbye 20 30 40
keyword 5 6 7
chain "edt_ 0 1 2
rubbish 2 3 4
more 6 7 8
test 1 2 3

If we want to split the file when we find keyword, and then search for the expression chain "edt_, this is one way to do it:

# Read the file
with open('temp.txt','r') as f:
    data = f.readlines()

# Strip out newline char
data = [i.strip('\n') for i in data]

# Look for the first line that starts with keyword (0-based index)
kwLoc = [i.find('keyword') for i in data].index(0)

print('Keyword found on line {0} - splitting file.'.format(kwLoc))

# Split the file
partOne = data[:kwLoc]
partTwo = data[kwLoc:]

# Optionally save each part to its own file
with open('temp_1.txt','w') as f:
    for row in partOne:
        f.write(row + '\n')

with open('temp_2.txt','w') as f:
    for row in partTwo:
        f.write(row + '\n')

# Now search the second part for the expression - returning the row where it occurs.
# A substring test is used because find(...) > 0 misses a match at the very start of a line.
exLoc = ['chain "edt_' in i for i in partTwo].index(True)

print('Found expression chain "edt_ on row {0}.'.format(exLoc))
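
As a variation on the steps above, here is a minimal sketch that wraps the split-and-search logic in a function. It uses a substring test (`in`) for both the keyword and the expression, so a match does not depend on where in the line the text appears. The function name `split_and_find`, its parameters, and the choice to return `None` when the expression is missing are all assumptions made for illustration and are not part of the original answer.

```python
def split_and_find(lines, keyword, expression):
    """Split lines at the first line containing keyword, then search the second part.

    Returns (part_one, part_two, row), where row is the 0-based index of the
    first line in part_two containing expression, or None if there is none.
    Raises ValueError if keyword is not found at all.
    """
    for kw_loc, line in enumerate(lines):
        if keyword in line:
            break
    else:
        # The loop finished without a break: no line contained the keyword
        raise ValueError('keyword {0!r} not found'.format(keyword))

    part_one = lines[:kw_loc]
    part_two = lines[kw_loc:]

    # First matching row in the second part, or None when nothing matches
    row = next((i for i, line in enumerate(part_two) if expression in line), None)
    return part_one, part_two, row


# Sample lines mirroring temp.txt (hypothetical in-memory data)
sample = [
    'hello 10 20 30',
    'goodbye 20 30 40',
    'keyword 5 6 7',
    'chain "edt_ 0 1 2',
    'rubbish 2 3 4',
]
part_one, part_two, row = split_and_find(sample, 'keyword', 'chain "edt_')
print(row)
```

Because the row index is relative to the second part, add `len(part_one)` to it if you need the position in the original file.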

Is this what you wanted to do?

Robbie
  • Thanks so much, Robbie. :) I did run into one error though: Traceback (most recent call last): File "split_file.py", line 32, in exLoc = [i.find('chain "edt_') for i in partTwo].index(0) ValueError: 0 is not in list – Cookies Monster Apr 26 '17 at 22:31
  • The error is saying there are no lines with the string 'chain "edt' in it. Is this what's happening? – Robbie Apr 26 '17 at 22:37
  • No, I just checked and found this in the second file produced by the script: chain "edt_disppipebpar1_channel1" = "0 – Cookies Monster Apr 26 '17 at 22:41
  • Okay - I'm not at my computer at the moment but will check it out soon and edit my answer – Robbie Apr 26 '17 at 22:42
  • Okay check out the new code (just changed the line with exLoc = ...) – Robbie Apr 27 '17 at 10:46
  • Perfect! =) Thank you again, Robbie! – Cookies Monster Apr 28 '17 at 17:49
  • Hey Robbie...for this line: kwLoc = [i.find('measure') for i in data].index(0) How might you make it search backwards? IE: from the end of the file to the beginning of the file? – Cookies Monster May 01 '17 at 05:06
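
For the backwards search asked about in the last comment, one possible approach (a sketch, not something from the original answer) is to walk the line indices in reverse and stop at the first line that starts with the keyword. The helper name `find_last` and the sample data are hypothetical.

```python
def find_last(lines, keyword):
    """Return the 0-based index of the last line starting with keyword.

    Scans from the end of the list toward the start; raises ValueError
    if no line starts with keyword.
    """
    for idx in range(len(lines) - 1, -1, -1):
        if lines[idx].startswith(keyword):
            return idx
    raise ValueError('{0!r} not found'.format(keyword))


# Hypothetical data with the keyword on two lines
data = ['measure 1', 'other 2', 'measure 3', 'tail 4']
kwLoc = find_last(data, 'measure')
print(kwLoc)
```

The returned index still refers to the original list order, so `data[:kwLoc]` and `data[kwLoc:]` split the file the same way as before.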