I have a CSV file that contains 3 years of time-series data. After importing the data in Python, I created a DataFrame. The problem is that the index starts at 0 and keeps increasing, but I want it to restart at 0 for each month, so that the day number on the plot starts from 0. How can I create a separate DataFrame for each month by extracting only a certain number of rows each time?

I tried creating an empty DataFrame and importing a certain number of rows (for February) into it, but the index still starts at 31, since January ends at 30 (counting from 0). I imported the CSV, created a DataFrame, and used iloc for the January data (row indexing from 0 to 31); for February I sliced rows 31:59, so both the printed output and the plot start from 31.

But I want to make each month start from 1.

  • `df.groupby('month')` and then plot? – SiP Apr 11 '23 at 08:08
  • Are you using pandas? If so, please add the respective tag to the question. Please also add an [MRE](https://stackoverflow.com/help/minimal-reproducible-example) (also look [here](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples)) that replicates your problem. – Timus Apr 11 '23 at 09:19

1 Answer

IIUC, use resample/reset_index inside a dict comprehension:

import numpy as np
import pandas as pd

# Sample data: 3 years of daily values indexed by date
dates = pd.date_range(start="2021-01-01", end="2023-12-31", freq="D")
df = pd.DataFrame({"value": np.random.rand(len(dates))}, index=dates)

# One DataFrame per calendar month, each with a fresh 0-based index
# (pandas >= 2.2 prefers the alias "ME" over "M")
dfs = {f"{n.month_name()}_{n.year}": g.reset_index(drop=True) for n, g in df.resample("M")}

Output:

{'January_2021':        value
 0   0.061758
 1   0.727779
 2   0.330694
 3   0.987844
 4   0.558633
 ..       ...
 26  0.866195
 27  0.304799
 28  0.972036
 29  0.691263
 30  0.966565
 
 [31 rows x 1 columns],
 'February_2021':        value
 0   0.426072
 1   0.866830
 2   0.662469
 3   0.449467
 4   0.335500
 ..       ...
 23  0.409101
 24  0.790689
 25  0.946540
 26  0.022972
 27  0.648176
 
[28 rows x 1 columns],
...
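
If the index should start from 1 rather than 0 (as the last line of the question asks), one possible variation is to shift the index after resetting it. This is only a sketch on the same synthetic data as above; the name `dfs_from_one` is made up for illustration, and it assumes the DataFrame has a DatetimeIndex with one row per day.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the CSV: 3 years of daily values (assumed layout)
dates = pd.date_range(start="2021-01-01", end="2023-12-31", freq="D")
df = pd.DataFrame({"value": np.random.rand(len(dates))}, index=dates)

# Group rows by (year, month) and renumber each group starting from 1
dfs_from_one = {}
for _, g in df.groupby([df.index.year, df.index.month]):
    first_day = g.index[0]                 # first Timestamp of this month
    month_df = g.reset_index(drop=True)    # fresh 0-based RangeIndex
    month_df.index = month_df.index + 1    # shift to 1-based numbering
    dfs_from_one[f"{first_day.month_name()}_{first_day.year}"] = month_df

feb = dfs_from_one["February_2021"]
print(feb.index.min(), feb.index.max())
```

Each entry can then be plotted on its own, e.g. `dfs_from_one["February_2021"]["value"].plot()`. When the data has exactly one row per day, `g.index.day` gives the same 1-based day number without resetting the index.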
Timeless