
I have a rather large dataset (~15 GB zipped). What is the most efficient way of random sampling from this dataset using Pandas? Currently I do the following:

df = pd.read_csv(file, names=[], sep='|', nrows=10000000)

However, this really does not serve my need, since the first ten million rows are not a random sample. Additionally, is there a way I can filter the data before creating the DataFrame?
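One approach I have seen sketched elsewhere (the 10% sampling fraction, the in-memory data, and the even-number predicate below are illustrative assumptions, not part of my actual setup) is to pass a callable to `skiprows` so rows are sampled while parsing, and to filter chunk-by-chunk with `chunksize` so the full file never has to fit in one DataFrame:

```python
import io
import random
import pandas as pd

# Hypothetical pipe-delimited data standing in for the real 15 GB file.
data = "a|b\n" + "\n".join(f"{i}|{i * 2}" for i in range(1000))

# Sample ~10% of rows at parse time: read_csv calls the skiprows callable
# with each row index; returning True skips that row. Index 0 is the
# header, so keep it unconditionally.
random.seed(0)
sample = pd.read_csv(
    io.StringIO(data), sep="|",
    skiprows=lambda i: i > 0 and random.random() > 0.10,
)

# Filter before building the full DataFrame: read in chunks and keep only
# rows matching a predicate, then concatenate the surviving pieces.
chunks = pd.read_csv(io.StringIO(data), sep="|", chunksize=200)
filtered = pd.concat(c[c["a"] % 2 == 0] for c in chunks)
```

Is something like this the idiomatic way, or is there a better option?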

Any help is appreciated :)

MSeifert
