I want to randomly sample rows from a CSV file into a pandas DataFrame without reading the entire file. Is this possible?
read_csv has an nrows argument, but I think it just returns the first n rows, so the result isn't actually random.
I don't want to use .sample(), because that means I have to read the entire CSV first.
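My code:
import pandas as pd

sample_size = 10
# input_data is the path to the CSV file; nrows only takes the first sample_size rows
df = pd.read_csv(input_data, nrows=sample_size)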
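For reference, here is a minimal sketch of one possible direction, not a definitive solution. It relies on the fact that `skiprows` in `pd.read_csv` accepts a callable that receives each row index and returns True for rows to skip. The function name `sample_csv_rows` and its parameters are hypothetical. It assumes the file has a single header row, one record per physical line (no quoted fields containing newlines), and no blank lines. It still scans the file once to count lines, but only the chosen rows are parsed into the DataFrame.

```python
import random

import pandas as pd


def sample_csv_rows(path, sample_size, seed=None):
    """Return a DataFrame with `sample_size` randomly chosen rows of a CSV.

    Hypothetical helper. Assumes one header row, one record per line,
    and no blank lines in the file.
    """
    # First pass: count data rows without building a DataFrame.
    with open(path) as f:
        n_rows = sum(1 for _ in f) - 1  # subtract the header line

    rng = random.Random(seed)
    # File line 0 is the header, so data rows are lines 1..n_rows.
    keep = set(rng.sample(range(1, n_rows + 1), min(sample_size, n_rows)))

    # Second pass: skip every data line that was not selected.
    return pd.read_csv(path, skiprows=lambda i: i > 0 and i not in keep)
```

With the variables from the question, this would be called as `df = sample_csv_rows(input_data, sample_size)`. If an approximate sample size is acceptable, the counting pass could be dropped by skipping each row with a fixed probability instead.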