I have 4 DataFrames with COVID case data and I want to concatenate them so I can plot them:

    df1:

             Date  active
    0       March      29
    1       April    3332
    2         May    8257
    3        June    5912
    4        July   11418
    5      August   11292
    6   September    4386
    7     October    1024
    8    November    1883
    9    December    1934
    10    January    1653
    11   February     255

    df2:

             Date  cases
    0       March      6
    1       April    241
    2         May    637
    3        June    671
    4        July   1512
    5      August   1304
    6   September    271
    7     October     72
    8    November    182
    9    December    152
    10    January     68
    11   February     14

    df3:

            Date  deaths
    0      April       1
    1        May       2
    2       June      14
    3       July      29
    4     August      13
    5  September      10
    6    October       9
    7   November       2
    8   December       3
    9    January       3

    df4:

             Date  recovories
    0       April          43
    1         May         652
    2        June         704
    3        July        1239
    4      August        1259
    5   September         632
    6     October          69
    7    November         150
    8    December         148
    9     January          78
    10   February          16

When I concatenate them I expect 5 columns (Date, cases, active, deaths, recovories) and 12 rows (index 0 to 11), but instead this happens (the Date column repeats and the rows are misaligned):

             Date  active       Date  cases       Date  deaths       Date  recovories
    0       March      29      March      6      April     1.0      April        43.0
    1       April    3332      April    241        May     2.0        May       652.0
    2         May    8257        May    637       June    14.0       June       704.0
    3        June    5912       June    671       July    29.0       July      1239.0
    4        July   11418       July   1512     August    13.0     August      1259.0
    5      August   11292     August   1304  September    10.0  September       632.0
    6   September    4386  September    271    October     9.0    October        69.0
    7     October    1024    October     72   November     2.0   November       150.0
    8    November    1883   November    182   December     3.0   December       148.0
    9    December    1934   December    152    January     3.0    January        78.0
    10    January    1653    January     68          0     0.0   February        16.0
    11   February     255   February     14          0     0.0          0         0.0

How can I prevent this from happening? Here is the code:

    import pandas as pd

    frames = [df1, df2, df3, df4]  # renamed from `all`, which shadows the built-in
    df_new = pd.concat(frames, axis=1)
    df_new = df_new.fillna(0)

Info: Windows 10, Python 3.9.1, beginner.

1 Answer


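First set Date as the index of each DataFrame. With axis=1, concat aligns rows on the index, so with the default 0, 1, 2, ... index the frames are matched by position and every Date column is kept; with Date as the index they are matched by month name instead: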

    dfs = [df1, df2, df3, df4]
    df_new = pd.concat([x.set_index('Date') for x in dfs], axis=1).fillna(0)
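A minimal, self-contained sketch of this approach, using shortened stand-in frames built from the first rows in the question. The final rename_axis/reset_index step is an addition, assuming Date is wanted back as an ordinary column rather than as the index:

```python
import pandas as pd

# Shortened stand-ins for the question's frames (first rows only).
df1 = pd.DataFrame({"Date": ["March", "April", "May"], "active": [29, 3332, 8257]})
df2 = pd.DataFrame({"Date": ["March", "April", "May"], "cases": [6, 241, 637]})
df3 = pd.DataFrame({"Date": ["April", "May"], "deaths": [1, 2]})
df4 = pd.DataFrame({"Date": ["April", "May"], "recovories": [43, 652]})

dfs = [df1, df2, df3, df4]

# With Date as the index, concat lines rows up by month name, not by position.
# Months missing from a frame become NaN, which fillna(0) turns into 0.
df_new = pd.concat([x.set_index("Date") for x in dfs], axis=1).fillna(0)

# Assumption: Date is wanted as a regular column again (one Date column in total).
df_new = df_new.rename_axis("Date").reset_index()

print(df_new)
```

The deaths and recovories columns come out as floats because they held NaN before fillna(0). If matplotlib is installed, df_new.plot(x="Date") should then draw one line per value column.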
jezrael