I am processing a 500 MB file, and the processing time increased when I used re.search.
Below are the cases I tested. In all of them I read the file line by line and use only one if
condition.
Case1:
    prnt = re.compile(r"(?i)<spanlevel level='7'>")
    if prnt.search(line):
        print "Matched"
        out_file.write(line)
    else:
        out_file.write(line)
This took 16 seconds to read the entire file.
Case2:
    if re.search(r"(?i)<spanlevel level='7'>", line):
        print "Matched"
        out_file.write(line)
    else:
        out_file.write(line)
This took 25 seconds to read the file.
Case3:
    if "<spanlevel level='7'>" in line:
        print "Matched"
        out_file.write(line)
    else:
        out_file.write(line)
This took only 8 seconds to read the file.
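To make the comparison reproducible without the real file, the three checks can be timed on the same in-memory data. This is only a minimal sketch: the sample lines, the function names (case1, case2, case3) and the repeat count are made up for illustration, and the numbers it prints will not match the timings quoted above.

```python
import re
import timeit

# Hypothetical sample data standing in for the real 500 MB file:
# many non-matching lines plus one mixed-case matching line.
LINES = ["<p>plain text line</p>\n"] * 1000 + ["<SpanLevel Level='7'>x</SpanLevel>\n"]

# Compiled once, outside the loop, as in Case1.
PRNT = re.compile(r"(?i)<spanlevel level='7'>")


def case1(lines):
    """Count matches using the precompiled, case-insensitive pattern."""
    return sum(1 for line in lines if PRNT.search(line))


def case2(lines):
    """Count matches using module-level re.search with the pattern string each time."""
    return sum(1 for line in lines if re.search(r"(?i)<spanlevel level='7'>", line))


def case3(lines):
    """Count matches using a plain, case-sensitive substring test."""
    return sum(1 for line in lines if "<spanlevel level='7'>" in line)


if __name__ == "__main__":
    for fn in (case1, case2, case3):
        seconds = timeit.timeit(lambda: fn(LINES), number=100)
        print("%s: %d matches, %.3f s" % (fn.__name__, fn(LINES), seconds))
```

With this sample data, Case3 misses the mixed-case line that the two regex versions find, which is exactly the case-sensitivity problem.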
Can anyone please explain the difference between the three cases? Case3 is very fast, but I am unable to do a case-insensitive match with it. How can I do a case-insensitive match in Case3?
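One possibility for a case-insensitive version of Case3, as a rough sketch only, is to lowercase each line before the `in` test. The helper name contains_ci is hypothetical, and the sketch assumes the search string is already lowercase and the text is plain ASCII.

```python
# Search string kept in lowercase so it can be compared with a lowercased line.
NEEDLE = "<spanlevel level='7'>"


def contains_ci(line, needle=NEEDLE):
    """Return True if needle occurs in line, ignoring case.

    Assumes needle is already lowercase.
    """
    return needle in line.lower()
```

For example, contains_ci("<SPANLEVEL LEVEL='7'>text") is True. Since line.lower() builds a copy of every line, whether this is still faster than the compiled regex on the real file would have to be measured.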