I have a very large CSV file with millions of rows
and a list of the row numbers that I need, like:
rownumberList = [1,2,5,6,8,9,20,22]
I know there is a read_csv parameter called skiprows
that skips the given rows when reading a CSV file,
like this:
import pandas as pd
df = pd.read_csv('myfile.csv', skiprows=skiplist)
# skiplist would contain every row number in the file except those in rownumberList
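To make the skiprows workaround concrete, here is a minimal sketch. It assumes the row numbers are 0-based line numbers in the file, with line 0 being the header, and it uses a small in-memory CSV (csv_text, a made-up stand-in for myfile.csv) so that it runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-in for 'myfile.csv': a header line plus 25 data lines.
# Data line i (1-based) holds the value i * 10.
csv_text = "value\n" + "\n".join(str(i * 10) for i in range(1, 26)) + "\n"

rownumberList = [1, 2, 5, 6, 8, 9, 20, 22]
wanted = set(rownumberList)

# Assumption: row numbers are 0-based line numbers, and line 0 is the header.
total_lines = csv_text.count("\n")  # header + 25 data lines

# Every line number except the header and the wanted rows gets skipped.
skiplist = [i for i in range(1, total_lines) if i not in wanted]

df = pd.read_csv(io.StringIO(csv_text), skiprows=skiplist)
print(df)
```

With the real file, skiplist has to cover every unwanted line, so the total line count must be known (or counted) beforehand.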
However, since the CSV file is very large, directly selecting the rows that I need could be more efficient than listing every row to skip.
Is there a way to select specific rows by number while calling read_csv?
I do not want to read the whole file and select the rows from the DataFrame afterwards,
because I am trying to minimize the time spent reading the file. Thanks.