Pandas: What is the fastest way to get a list of single pandas Series from a large DataFrame?

Question

As the title says, I am looking for a very fast way to split a pandas DataFrame into a list of all its Series so that I can process them with multiprocessing. `hsplit` from NumPy splits it into single-column DataFrames rather than Series. Is there a method I'm not aware of?

Yeah, iterate through `df.columns`. How fast do you want it to be? Columns are superficial, they're basically just keys holding Series objects. – roganjosh, Apr 12 '19 at 10:49
Actually, this really doesn't make sense. If you're now turning to multiprocessing, the segmentation of data by column is likely <1% of the overall processing time for whatever you're doing. – roganjosh, Apr 12 '19 at 10:53
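
A minimal sketch of the column iteration suggested in the comment above. The sample frame and its column names are made up for illustration; only `df.columns`, label-based selection, and `DataFrame.items()` are standard pandas.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0], "c": ["x", "y", "z"]})

# One Series per column, selected by label.
series_list = [df[name] for name in df.columns]

# Equivalent: DataFrame.items() yields (column label, Series) pairs.
series_list_alt = [col for _, col in df.items()]
```

If the column labels are not unique, `df[name]` can return a DataFrame instead of a Series, so the `items()` variant is probably the safer of the two in that case.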

score 0 · Answer 1 · answered Apr 12 '19 at 10:53


If you're trying to iterate through all the columns, `df.iloc[:, index]` or `df.loc[:, label]` is probably the best method. However, I don't see how this would be that important. Pandas is already exceptionally efficient for these operations, and it's hard to beat the core DataFrame indexer speed.

answered Apr 12 '19 at 10:53

Zubin Aysola

Why do any of that when you can just get the column by name? – roganjosh Apr 12 '19 at 10:55
It all depends on @Varlor's use case: whether they need the records column-wise or row-wise. See this similar case: https://stackoverflow.com/questions/33246771/convert-pandas-data-frame-to-series – Chirag Apr 12 '19 at 12:21
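
To illustrate the column-wise versus row-wise distinction raised in the comment above, here is a small sketch. The frame, its labels, and the variable names are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical two-by-two frame with explicit row labels.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["r0", "r1"])

# Column-wise: one Series per column, indexed by the row labels.
columns_as_series = [df.iloc[:, i] for i in range(df.shape[1])]

# Row-wise: one Series per row, indexed by the column labels.
rows_as_series = [df.iloc[i] for i in range(len(df))]

# df.loc["r0"] selects the same first row by label instead of position.
first_row_by_label = df.loc["r0"]
```

Note that a single positional index such as `df.iloc[i]` selects a row, so a column has to be addressed as `df.iloc[:, i]` or simply by name.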
