I have a dataframe with 2000 columns and 730 rows.

Each column is a time series for a variable.

The distribution of some of the columns is uni-modal and that of others is bi-modal.

I would like to run a correlation analysis on the columns, but before that I need to split the columns into two dataframes: one containing all columns with a uni-modal distribution and the other containing all columns with a bi-modal distribution.

Is anyone able to help me with this? I suspect I need to loop through the columns of the dataframe, check each one for normality or number of modes, and partition them that way. I just don't know where to start.
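A minimal sketch of the per-column loop described above, assuming the number of modes can be estimated by counting prominent peaks in a Gaussian kernel density estimate. The helper names `count_modes` and `split_by_modality`, the grid size, and the 20% prominence cut-off are all hypothetical choices for illustration, not a definitive test for modality, and the demo data is synthetic:

```python
import numpy as np
import pandas as pd
from scipy.signal import find_peaks
from scipy.stats import gaussian_kde


def count_modes(values, grid_size=256, min_prominence=0.2):
    """Estimate the number of modes by counting prominent KDE peaks.

    Assumes the column is numeric and not constant (gaussian_kde fails
    on zero-variance data). min_prominence is a fraction of the tallest
    peak and is a tuning parameter, not a standard value.
    """
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]
    grid = np.linspace(values.min(), values.max(), grid_size)
    density = gaussian_kde(values)(grid)
    # Pad with zeros so a peak sitting at the edge of the grid still counts.
    padded = np.concatenate(([0.0], density, [0.0]))
    peaks, _ = find_peaks(padded, prominence=min_prominence * density.max())
    return len(peaks)


def split_by_modality(df):
    """Return (unimodal columns, bimodal columns, mode count per column)."""
    mode_counts = df.apply(count_modes)  # one integer per column
    unimodal = df.loc[:, mode_counts == 1]
    bimodal = df.loc[:, mode_counts == 2]
    return unimodal, bimodal, mode_counts


# Synthetic stand-in for the real data: 730 rows, one column of each kind.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "uni": rng.normal(0, 1, 730),
    "bi": np.concatenate([rng.normal(-4, 1, 365), rng.normal(4, 1, 365)]),
})
unimodal_df, bimodal_df, mode_counts = split_by_modality(demo)
```

Columns with more than two detected peaks end up in neither dataframe, so it is worth inspecting `mode_counts` and plotting a few borderline columns before trusting the split.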

Thanks!

chrisSpaceman
  • It seems like you've already suspected the right way to do it; all that's left is actually starting. – Rocky Li Nov 14 '19 at 21:28
  • According to the SO guidelines (see [mcve]), you need to provide an attempt. Once you share your attempt, you can get some feedback. – Yuca Nov 14 '19 at 21:29
  • Looping is not actually a great idea. It will work, don't get me wrong, but it would take some time to complete. You need a vectorized approach. To show you how, we need a sample of your data. [You can refer to this to provide a great pandas example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and [refer to this one on how to provide a minimal, complete, and verifiable example](https://stackoverflow.com/help/minimal-reproducible-example), then revise your question accordingly so people in the community can easily help you. – Joe Nov 15 '19 at 13:07
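As one possible illustration of the vectorized approach mentioned in the last comment, the sketch below uses Sarle's bimodality coefficient, which is computed from the skewness and kurtosis of every column at once. This is a swapped-in heuristic, not something proposed in the thread: values above 5/9 (the coefficient of a uniform distribution) only suggest bimodality, and strongly skewed uni-modal columns can exceed the threshold too. The function name `bimodality_coefficient` and the demo data are hypothetical.

```python
import numpy as np
import pandas as pd


def bimodality_coefficient(df):
    """Sarle's bimodality coefficient for every column, without a loop."""
    n = df.count()    # non-missing observations per column
    skew = df.skew()  # sample skewness per column
    kurt = df.kurt()  # sample excess kurtosis per column
    # Finite-sample correction term; it approaches 3 as n grows.
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return (skew ** 2 + 1) / (kurt + correction)


THRESHOLD = 5 / 9  # coefficient of a uniform distribution

# Synthetic stand-in for the real data: 730 rows, one column of each kind.
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "uni": rng.normal(0, 1, 730),
    "bi": np.concatenate([rng.normal(-4, 1, 365), rng.normal(4, 1, 365)]),
})

bc = bimodality_coefficient(demo)
likely_bimodal = demo.loc[:, bc > THRESHOLD]
likely_unimodal = demo.loc[:, bc <= THRESHOLD]
```

Because `skew`, `kurt`, and `count` operate on all columns together, this scales to 2000 columns without an explicit Python loop, at the cost of being a cruder check than inspecting each column's density.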

0 Answers