I have a pandas dataframe in Python.

          datetime           machineID
0   2021-10-01 00:00:00        1.0
1   2021-10-01 00:00:00        2.0
2   2021-10-01 00:00:00        3.0
3   2021-10-01 00:00:00        4.0
4   2021-10-01 00:00:00        5.0
... ... ...
443 2021-10-07 12:00:00       28.0
444 2021-10-07 12:00:00       29.0
445 2021-10-07 12:00:00       30.0
446 2021-10-07 12:00:00       31.0
447 2021-10-07 12:00:00       32.0

The dataframe covers 7 days, from 2021-10-01 to 2021-10-07. It is currently ordered by datetime: all machineIDs appear for the first datetime, then all machineIDs for the next datetime, and so on.

I want to reorder the dataframe so that, for each machineID, all of its datetimes appear in order, followed by all datetimes for the next machineID. Something like this:

          datetime           machineID
0   2021-10-01 00:00:00        1.0
1   2021-10-02 00:00:00        1.0
2   2021-10-03 00:00:00        1.0
3   2021-10-04 00:00:00        1.0
4   2021-10-05 00:00:00        1.0
... ... ...
443 2021-10-03 12:00:00       32.0
444 2021-10-04 12:00:00       32.0
445 2021-10-05 12:00:00       32.0
446 2021-10-06 12:00:00       32.0
447 2021-10-07 12:00:00       32.0

I can't find a method that does this.

PeakyBlinder
  • What does the second group look like? – jezrael May 31 '22 at 11:05
  • You need `df.sort_values(by=['machineID', 'datetime'], ignore_index=True)` – jezrael May 31 '22 at 11:10
  • If you want to sort the values and also reset the index (if the new df is the format you want), you could sort in place with `df.sort_values(['machineID', 'datetime'], inplace=True)` and then reset the index to match the sorted order with `df.reset_index(drop=True, inplace=True)`. – MrWorldWide May 31 '22 at 11:36

3 Answers

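I think you may be looking for df.sort_values(), sorting by machineID first and then by datetime:

df.sort_values(by=['machineID', 'datetime'])

Note that sort_values returns a new dataframe rather than modifying df, so assign the result if you want to keep it. If you need a different sort direction, set the ascending parameter to True or False (or to a list with one value per column).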

Documentation available here:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html
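
As a minimal sketch of the above, here is the sort applied to a small made-up frame (3 dates and 3 machines, not the asker's actual data). The variable names by_machine and newest_first are only illustrative.

```python
import pandas as pd

# Hypothetical miniature of the question's data: 3 dates x 3 machines,
# laid out date by date as in the original frame.
dates = pd.date_range("2021-10-01", periods=3, freq="D")
df = pd.DataFrame(
    [(d, float(m)) for d in dates for m in (1, 2, 3)],
    columns=["datetime", "machineID"],
)

# Sort by machine first, then by date within each machine.
# sort_values returns a new DataFrame; ignore_index=True relabels rows 0..n-1.
by_machine = df.sort_values(by=["machineID", "datetime"], ignore_index=True)

# A list for `ascending` sets the direction per column:
# machines ascending, dates newest-first within each machine.
newest_first = df.sort_values(
    by=["machineID", "datetime"], ascending=[True, False], ignore_index=True
)

print(by_machine)
```

Passing a list to ascending is only needed when the two columns should sort in different directions; a single True or False applies to both.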

bwe69

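You can sort on the columns you need and have pandas rebuild the index with ignore_index=True. Sorting by datetime alone would just reproduce the current order, so put machineID first:

df = df.sort_values(by=['machineID', 'datetime'], ignore_index=True)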
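
To illustrate what ignore_index changes, here is a small sketch on a made-up four-row frame (not the asker's data). It compares a plain sort, a sort with ignore_index=True, and the two-step sort plus reset_index(drop=True) mentioned in the comments.

```python
import pandas as pd

# Hypothetical two-machine, two-date frame in date-major order.
df = pd.DataFrame({
    "datetime": pd.to_datetime(
        ["2021-10-01", "2021-10-01", "2021-10-02", "2021-10-02"]
    ),
    "machineID": [1.0, 2.0, 1.0, 2.0],
})

keys = ["machineID", "datetime"]

# Without ignore_index the rows move but keep their old labels.
kept = df.sort_values(by=keys)

# ignore_index=True relabels the sorted rows 0..n-1 in one step ...
renumbered = df.sort_values(by=keys, ignore_index=True)

# ... which matches sorting and then dropping the old index.
two_step = df.sort_values(by=keys).reset_index(drop=True)

print(list(kept.index), list(renumbered.index))
```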
Hadi Rahjoo

You can simply sort your dataframe by machineID, with datetime as a second key so that the dates stay in order within each machine, something like:

df = df.sort_values(by=['machineID', 'datetime'], ignore_index=True)
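
If the rows are already in datetime order, as in the question, another option is a stable sort on machineID alone. The sketch below assumes that pre-existing order and uses a made-up frame; kind='mergesort' is chosen because the default quicksort does not promise to keep tied rows in their original order.

```python
import pandas as pd

# Hypothetical frame already in date order, with machines interleaved.
dates = pd.date_range("2021-10-01", periods=3, freq="D")
df = pd.DataFrame(
    [(d, float(m)) for d in dates for m in (1, 2)],
    columns=["datetime", "machineID"],
)

# mergesort is stable, so rows with equal machineID keep their
# existing (date) order relative to each other.
stable = df.sort_values(by="machineID", kind="mergesort", ignore_index=True)

print(stable)
```

Sorting on both columns explicitly is the safer choice when the incoming order is not guaranteed.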
mrCopiCat