1

I have a data frame where the data looks something like this:

ds                    Name     Up     Down

2021-03-01 02:55:00   name1    14094  13384
2021-03-01 03:00:00   name1    16124  18283
2021-03-01 03:05:00   name1    20838  23124
2021-03-01 02:55:00   name2    15094  33384
2021-03-01 03:00:00   name2    16174  48283
2021-03-01 03:05:00   name2    30838  63124

So I have a column with different names, but they can have the same datetime.

What I would like to do is to merge rows with the same date into a new data frame or something like that, but where the Up/Down values are added together, i.e. the resulting data frame should be something like:

ds                    Name           Up     Down

2021-03-01 02:55:00   name1+name2    29188  46768
2021-03-01 03:00:00   name1+name2    32298  66569
2021-03-01 03:05:00   name1+name2    51676  86248

The naming convention for the Name column I haven't really given much thought. I don't think that is that important.

Denver Dang
  • 2,433
  • 3
  • 38
  • 68
  • Does this answer your question? [Concatenate strings from several rows using Pandas groupby](https://stackoverflow.com/questions/27298178/concatenate-strings-from-several-rows-using-pandas-groupby) – Frodnar Apr 28 '21 at 09:34

3 Answers3

3

It's a straight forward groupby().agg()

import io
df = pd.read_csv(io.StringIO("""ds                    Name     Up     Down
2021-03-01 02:55:00   name1    14094  13384
2021-03-01 03:00:00   name1    16124  18283
2021-03-01 03:05:00   name1    20838  23124
2021-03-01 02:55:00   name2    15094  33384
2021-03-01 03:00:00   name2    16174  48283
2021-03-01 03:05:00   name2    30838  63124"""), sep="\s\s+", engine="python")
df["ds"] = pd.to_datetime(df["ds"])

df.groupby("ds", as_index=False).agg({"Name":lambda s: "+".join(list(s)), "Up":"sum","Down":"sum"})

ds Name Up Down
0 2021-03-01 02:55:00 name1+name2 29188 46768
1 2021-03-01 03:00:00 name1+name2 32298 66566
2 2021-03-01 03:05:00 name1+name2 51676 86248
Rob Raymond
  • 29,118
  • 3
  • 14
  • 30
  • i think the `Name` lambda can be simplified to `{"Name":"+".join, ...`, but +1 overall – tdy Apr 28 '21 at 09:46
1
df = df.set_index('Date')
df
                    Name    Up      Down
Date            
2021-03-01 02:55:00 name1   14094   13384
2021-03-01 03:00:00 name1   16124   18283
2021-03-01 03:05:00 name1   20838   23124
2021-03-01 02:55:00 name2   15094   33384
2021-03-01 03:00:00 name2   16174   48283
2021-03-01 03:05:00 name2   30838   63124

df['Name'] = df['Name'].astype('str')

df = df.groupby(df.index).agg({'Name' : lambda x: '+'.join(x),
                               'Up':'sum',
                               'Down':'sum'})
df

                    Name        Up      Down
Date            
2021-03-01 02:55:00 name1+name2 29188   46768
2021-03-01 03:00:00 name1+name2 32298   66566
2021-03-01 03:05:00 name1+name2 51676   86248

df.reset_index()
                 Date         Name     Up   Down
0 2021-03-01 02:55:00  name1+name2  29188  46768
1 2021-03-01 03:00:00  name1+name2  32298  66566
2 2021-03-01 03:05:00  name1+name2  51676  86248
imdevskp
  • 2,103
  • 2
  • 9
  • 23
0

You can do this with Pandas in Python.

Considering your dataframe as df with column ds then you can use the fuction groupby() like this df_new = df.groupby(df['ds']).

yum
  • 1,125
  • 1
  • 8
  • 14