Let's say I have a text file (input_file.txt, size ~10 GB) with the content below:
Jun 6 17:58:13 other strings not unique
Jun 6 17:58:13 other strings not unique
Jun 6 17:58:14 other strings not unique
Jun 6 17:58:14 other strings not unique
Jun 6 17:58:15 other strings not unique
Jun 6 17:58:15 other strings not unique
Jun 6 17:58:15 other strings not unique
Jun 6 17:58:15 other strings not unique
Jun 6 17:58:16 other strings not unique
Jun 6 17:58:16 other strings not unique
Jun 6 17:58:16 other strings not unique
Jun 6 17:58:17 other strings not unique
Jun 6 17:58:18 other strings not unique
Jun 6 17:58:19 other strings not unique
Jun 6 17:58:20 other strings not unique

Now I need to write Python code that reads the text file and copies the lines between a start and an end timestamp (both inclusive) to another file.

I wrote the following code.

import re
with open(r'C:\Python27\log\master_input.txt', 'r') as infile, open(r'C:\Python27\log\output.txt', 'w') as outfile:
    copy = False
    for line in infile:
        if re.match("Jun  6 17:58:14(.*)", line):
            copy = True
        elif re.match("Jun  6 17:58:16(.*)", line):
            copy = False
        elif copy:
            outfile.write(line)

Output of my code (this is not the output I expected):
Jun 6 17:58:14 other strings not unique
Jun 6 17:58:14 other strings not unique
Jun 6 17:58:16 other strings not unique
Jun 6 17:58:16 other strings not unique
Jun 6 17:58:16 other strings not unique
Jun 6 17:58:16 other strings not unique
Jun 6 17:58:16 other strings not unique

Expected output:
Jun 6 17:58:14 other strings not unique
Jun 6 17:58:14 other strings not unique
Jun 6 17:58:15 other strings not unique
Jun 6 17:58:15 other strings not unique
Jun 6 17:58:15 other strings not unique
Jun 6 17:58:15 other strings not unique
Jun 6 17:58:16 other strings not unique
Jun 6 17:58:16 other strings not unique
Jun 6 17:58:16 other strings not unique

Please help me do this in the best way. Thanks in advance.

HelloR

1 Answer

Not Python, but every self-respecting developer should know sed. With -n and an address range such as 2,4p, it prints only lines 2 through 4:

$ cat data.txt 
foo
bar
baz
qux
quux
quuux

$ sed -n "2,4p" < data.txt
bar
baz
qux
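
The sed example selects by line number, which means the line numbers have to be known up front. For the timestamp-based copy the question asks for, a Python version could look like the sketch below. It is not the original poster's code: the function name copy_between and the in-memory sample data are made up for illustration, and it assumes Python 3, that the log lines are in chronological order, and that the start timestamp actually occurs in the file. Note that in the question's loop the lines matching the start and end patterns are never written, because each match is handled in its own branch and outfile.write is only reached in the last one.

```python
import io


def copy_between(infile, outfile, start, end):
    """Copy every line stamped `start` through every line stamped `end`.

    `start` and `end` are timestamps such as "Jun 6 17:58:14" (hypothetical
    helper; assumes the lines are in chronological order).
    """
    copying = False
    seen_end = False
    for line in infile:
        # First three whitespace-separated fields, e.g. "Jun 6 17:58:14".
        # Splitting on any whitespace makes "Jun  6" and "Jun 6" equivalent.
        stamp = " ".join(line.split(None, 3)[:3])
        if not copying:
            if stamp != start:
                continue  # still before the requested range
            copying = True
        if stamp == end:
            seen_end = True
        elif seen_end:
            break  # first line after the end block: stop reading the file
        outfile.write(line)


# Small in-memory stand-in for the 10 GB log, using the seconds from the question.
seconds = [13, 13, 14, 14, 15, 15, 15, 15, 16, 16, 16, 17, 18, 19, 20]
sample = io.StringIO(
    "".join("Jun  6 17:58:%02d other strings not unique\n" % s for s in seconds)
)
result = io.StringIO()
copy_between(sample, result, "Jun 6 17:58:14", "Jun 6 17:58:16")
copied = result.getvalue().splitlines()
print("\n".join(copied))
```

With real files, the same function can be called with two objects returned by open(). Because it iterates line by line and stops at the first line past the end timestamp, memory use stays small and the rest of a large file is not read.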