I'm trying to extract lines from a very large text file (10Gb). The text file contains the output from an engineering software (it's not a CSV file). I want to copy from line 1 to the first line containing the string 'stop' and then resume from the first line containing 'restart' to the end of the file.
The following code works but it's rather slow (about a minute). Is there a better way to do it using pandas? I have tried the read_csv function but I don't have a delimiter to input.
file_to_copy = r"C:\Users\joedoe\Desktop\C ANSYS R1\PATCHED\modes.txt"
output = r"C:\Users\joedoe\Desktop\C ANSYS R1\PATCHED\modes_extract.txt"
stop = '***** EIGENVECTOR (MODE SHAPE) SOLUTION *****'
restart = '***** PARTICIPATION FACTOR CALCULATION ***** X DIRECTION'
with open(file_to_copy) as f:
orig = f.readlines()
newf = open(output, "w")
write = True
first_time = True
for line in orig:
if first_time == True:
if stop in line:
first_time = False
write = False
for i in range(300):
newf.write(
'\n -------------------- MIDDLE OF THE FILE -------------------')
newf.write('\n\n')
if restart in line: write = True
if write: newf.write(line)
newf.close()
print('Done.')