-1

As the title says, I'm looking for a very fast way to split a pandas DataFrame into a list of all its pandas Series so that I can process them with multiprocessing. `hsplit` from NumPy splits it into single-column DataFrames instead. Is there a method I'm not aware of?

Varlor
  • Yeah, iterate through `df.columns`. How fast do you want it to be? Columns are superficial; they're basically just keys holding Series objects. – roganjosh Apr 12 '19 at 10:49
  • You can use `df.iloc[index]`; it will return a row. – Chirag Apr 12 '19 at 10:50
  • Actually, this really doesn't make sense. If you're now turning to multiprocessing, the segmentation of data by column is likely <1% of the overall processing time for whatever you're doing – roganjosh Apr 12 '19 at 10:53
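
To illustrate the suggestion of iterating through `df.columns`, here is a minimal sketch. The DataFrame and its column names are made up for the example, and it assumes the column labels are unique (with duplicate labels, `df[name]` returns a DataFrame rather than a Series).

```python
import pandas as pd

# Hypothetical example data; any DataFrame with unique column labels works.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0], "c": ["x", "y", "z"]})

# One Series per column, in column order.
series_list = [df[name] for name in df.columns]

# Each element is a pandas Series named after its column.
for s in series_list:
    print(s.name, type(s).__name__, len(s))
```

A list like `series_list` can then be handed to something like `multiprocessing.Pool.map`. Each Series is pickled when it is sent to a worker, so that transfer, not the split itself, is the more likely cost.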

1 Answer

0

If you're trying to iterate through all the columns, `df.iloc[:, index]` or `df.loc[:, name]` is probably the best method (plain `df.iloc[index]` or `df.loc[index]` returns a row, not a column). However, I don't see how this would be that important. Pandas is already very efficient at these operations, and it's hard to beat the speed of the core DataFrame indexers.
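
To make the indexing distinction concrete, here is a small sketch using a made-up two-column frame. It shows that `df.iloc[i]` selects a row, while `df.iloc[:, i]`, `df.loc[:, name]`, and `df[name]` all select a column.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

row = df.iloc[0]               # first ROW as a Series, indexed by column labels
col_by_pos = df.iloc[:, 0]     # first COLUMN, selected by position
col_by_label = df.loc[:, "a"]  # same column, selected by label
col_by_name = df["a"]          # same column, plain name lookup

print(list(row.index))                 # column labels of the frame
print(col_by_pos.equals(col_by_name))  # the column selections agree
```

For simply getting each column as a Series, the plain name lookup is the shortest form, which is the point raised in the comment below.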

  • 1
    Why do any of that when you can just get the column by name? – roganjosh Apr 12 '19 at 10:55
  • It all depends on @Varlor's use case: whether he requires the records column-wise or row-wise. See this similar case: https://stackoverflow.com/questions/33246771/convert-pandas-data-frame-to-series – Chirag Apr 12 '19 at 12:21