
I have some time series data that looks like this, and I need to do the following:

enter image description here

  1. Import it as a dataframe
  2. Set the date column as a datetime index
  3. Add some indicators, such as a moving average, as new columns
  4. Round the values of a whole column
  5. Shift a column one row up or down (just to manipulate the data)
  6. Convert the df to a list, because I need to loop over it based on some conditions and looping over a list is a lot faster than looping over a df
  7. Now, however, I want to convert the df to a dict instead of a list, because keeping the column names is more convenient
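For reference, steps 1 to 6 could be sketched roughly as follows. The sample data, the column names (`date`, `close`, `sma_2`, `prev_close`) and the 2-row window are made up for illustration; the real file and indicators are not shown in the question.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real file.
csv_text = """date,close
2020-05-25,10.123
2020-05-26,10.456
2020-05-27,10.789
2020-05-28,11.012
"""

# Steps 1-2: import as a DataFrame and use the date column as a DatetimeIndex.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], index_col="date")

# Step 3: add an indicator, here a 2-row simple moving average.
df["sma_2"] = df["close"].rolling(window=2).mean()

# Step 4: round the values of a whole column.
df["close"] = df["close"].round(2)

# Step 5: shift a column one row down (use -1 to shift it up).
df["prev_close"] = df["close"].shift(1)

# Step 6: convert to a list of rows for a plain Python loop.
rows = df.reset_index().values.tolist()
```

Each element of `rows` is a plain list, so the column names are lost at this point, which is the motivation for step 7.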

But I found that converting to a dict takes a lot longer than converting to a list, even when I do it manually instead of using the built-in method.

My question: is there a better way to do this? Maybe by not importing the data as a dataframe in the first place, while still being able to do points 2 to 5? In the end I need a dict that I can loop over and that keeps the column names as keys. Thanks.

P.S. The dict should look something like this. The format is similar to the df: each row is basically the date with its corresponding data.

enter image description here

hpaulj
saga
  • What should the dict look like? One dict, where the keys are column names and the values are the full columns? Or a list of dicts, one for each row? – Michael Butscher May 31 '20 at 04:08
  • @MichaelButscher Post updated, it should look like that thanks. – saga May 31 '20 at 04:16
  • A pandas dataframe has the method "to_dict" for that. – Michael Butscher May 31 '20 at 04:28
  • @MichaelButscher As I said, I tried both the built-in to_dict() and a manual approach. The manual approach is faster than to_dict() but it’s still slow. – saga May 31 '20 at 04:36
  • What do you need the dicts for? Maybe [named tuples](https://docs.python.org/3/library/collections.html#collections.namedtuple) can be a replacement (but I don't know if they are really much faster). – Michael Butscher May 31 '20 at 05:51
  • Does this answer your question? [python pandas dataframe columns convert to dict key and value](https://stackoverflow.com/questions/18012505/python-pandas-dataframe-columns-convert-to-dict-key-and-value) – Joe May 31 '20 at 06:25
  • https://stackoverflow.com/questions/49077008/pandas-datarame-to-dict – Joe May 31 '20 at 06:26

1 Answer


On item #7: If you want to convert to a dictionary, you can use `df.to_dict()`.
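As a rough sketch of what that can look like: `to_dict` accepts an `orient` argument, and `orient="index"` gives one inner dict per row keyed by the index value, which seems closest to the "date with the corresponding data" layout described in the question. The data and column names below are invented, and the manual variant is only one possible hand-written alternative, not necessarily the one the asker tried; whether it is faster should be measured on the real data.

```python
import pandas as pd

# Hypothetical data; the real columns are not shown in the question.
df = pd.DataFrame(
    {"close": [10.12, 10.46, 10.79], "sma_2": [10.0, 10.29, 10.62]},
    index=pd.to_datetime(["2020-05-25", "2020-05-26", "2020-05-27"]),
)

# One inner dict per row, keyed by the index value (here, the date).
by_date = df.to_dict(orient="index")

# A list with one dict per row; the index is dropped unless it is reset first.
records = df.to_dict(orient="records")

# Manual alternative: pull each column out as a plain list once, then zip
# the columns back together row by row.
columns = list(df.columns)
column_lists = [df[c].tolist() for c in columns]
manual = {
    date: dict(zip(columns, values))
    for date, values in zip(df.index, zip(*column_lists))
}
```

Both `by_date` and `manual` can then be looped over with `.items()`, and each row's values are available by column name.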

On item #6: You don't need to convert the df to a list or loop over it. Here are better options; look for the second answer (the one that says DON'T).
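To illustrate the idea of avoiding the loop, here is a minimal sketch that applies a condition to every row at once. The rule (close above its moving average) and the column names are hypothetical, since the question does not say what the loop conditions are. If a Python loop really is unavoidable, `itertuples` yields named tuples, so the column names stay available as attributes.

```python
import numpy as np
import pandas as pd

# Hypothetical data and columns, for illustration only.
df = pd.DataFrame(
    {"close": [10.0, 10.5, 10.2, 10.9], "sma": [10.1, 10.2, 10.4, 10.5]}
)

# Hypothetical rule: flag rows where close is above the moving average.
df["above_sma"] = df["close"] > df["sma"]

# Hypothetical rule: +1 when above, -1 otherwise, computed for all rows at once.
df["signal"] = np.where(df["above_sma"], 1, -1)

# Fallback when a loop is needed: each row is a named tuple.
total = 0
for row in df.itertuples(index=False):
    if row.above_sma:
        total += row.signal
```

Conditions that depend on state carried from one row to the next may not vectorize this easily, in which case the `itertuples` loop or the dict conversion is the more realistic route.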

Leon Lu
  • I tried to_dict(); it’s slow. I tried a manual approach; it's a bit faster but still slow. As for iterrows, as I said, it’s super slow to loop over a df. – saga May 31 '20 at 04:39
  • Try .apply() to apply a function to all the rows in a column. I'm unaware of a faster method to convert to a dict than .to_dict(). – Leon Lu May 31 '20 at 05:01