I want to create a function that reads a CSV file line by line and keeps only the lines that satisfy two regex conditions. First, the line must include a Roman numeral, i.e. one or more of the characters IVXLCDM. Second, among the lines that pass the first condition, I need to filter out the ones that contain the following pattern: .od.s
So if I have a csv file like this:
547 I. Line 1
479 II. Todos Line 2
897 Line 3
879 XI. Line 4
It should only load these lines:
547 I. Line 1
879 XI. Line 4
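The intended two-step filter can be sketched on the sample lines directly, without any file handling. This is only a minimal sketch under assumptions: the names `has_roman`, `is_excluded`, and `keep` are hypothetical, and the first pattern assumes the Roman numeral follows the leading number and ends with a period, which fits the sample above but may not fit every real file.

```python
import re

# Assumption: a line starts with a number, whitespace, then a Roman numeral and a period.
has_roman = re.compile(r"^\d+\s+[IVXLCDM]+\.")
# The exclusion pattern from the description: any character, "od", any character, "s".
is_excluded = re.compile(r".od.s")

sample = [
    "547 I. Line 1",
    "479 II. Todos Line 2",
    "897 Line 3",
    "879 XI. Line 4",
]

def keep(line):
    """Return True if the line has a Roman numeral and lacks the excluded pattern."""
    return bool(has_roman.search(line)) and not is_excluded.search(line)

kept = [line for line in sample if keep(line)]
print(kept)  # ['547 I. Line 1', '879 XI. Line 4']
```

Requiring the trailing period keeps "897 Line 3" out, since the capital "L" in "Line" would otherwise count as a Roman numeral character.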
So far I have this:
def load_file(file_extension):
    import re
    file = open(file_extension,'r')
    filter1 = re.compile("\d{3}\s+.([.IVXLCDM.]+)")
    filter2 = re.compile(".od.s")
    final_list = []
    for line in file:
        if re.search(filter1,line):
            if not re.search(filter2,line):
                final_list.append(line)
    return(final_list)
    file.close()
print(load_file('file.csv'))
But it keeps returning an empty list.
I am not sure whether this can be done in a single function. I also tried splitting it into two functions: one that filters a list with both regex conditions, and another that reads the CSV file and calls the first. That didn't work either.
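For illustration, the two-function split described above might look like the sketch below. It is not a definitive fix: `ROMAN`, `EXCLUDE`, and `filter_lines` are hypothetical names, the Roman numeral pattern assumes the "number, numeral, period" layout of the sample, and a temporary file stands in for the real CSV.

```python
import os
import re
import tempfile

# Assumed layout: leading number, whitespace, Roman numeral, period.
ROMAN = re.compile(r"^\d+\s+[IVXLCDM]+\.")
# Exclusion pattern taken from the description.
EXCLUDE = re.compile(r".od.s")

def filter_lines(lines):
    """Keep lines with a Roman numeral that do not contain the excluded pattern."""
    return [ln for ln in lines if ROMAN.search(ln) and not EXCLUDE.search(ln)]

def load_file(path):
    """Read the file and return the filtered lines without trailing newlines."""
    # The with block closes the file even though we return from inside it.
    with open(path, "r") as handle:
        return filter_lines(line.rstrip("\n") for line in handle)

# Usage with a temporary file holding the sample data.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("547 I. Line 1\n479 II. Todos Line 2\n897 Line 3\n879 XI. Line 4\n")
result = load_file(tmp.name)
os.remove(tmp.name)
print(result)  # ['547 I. Line 1', '879 XI. Line 4']
```

Using `with open(...)` avoids depending on a `close()` call placed after a `return`, which would never run.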