Let's say I have this dataframe:
> df <- data.frame(a1=c(1, 2, 3, 4, 5), b1=c(5, 4, 3, 2, 1), b2=c(10, 11, 12, 13, 14), a2=c(5, 6, 7, 8, 9), c1=c(100, 200, 300, 400, 500))
> df
a1 b1 b2 a2 c1
1 1 5 10 5 100
2 2 4 11 6 200
3 3 3 12 7 300
4 4 2 13 8 400
5 5 1 14 9 500
>
As you can see, I have columns a1
a2
that both start with a
, I want to sum them row-wise and generate just one column of a
, same for column b
. Also for c
column, just the c
column should stay the same as it is since there are only one column that starts with c
.
Desired output:
a b c
1 6 15 100
2 8 15 200
3 10 15 300
4 12 15 400
5 14 15 500
Coming from a Python, in pandas (Python Library), I could just simply use:
df.groupby(df.columns.str[:1], axis=1).sum()
But I am not sure how to do this in R.
I tried dplyr
:
df %>% group_by(sub(".$", "", colnames(df))) %>%
mutate(across(colnames(.), sum))
But an error popped up. New to R.