1

I have a data frame similar to this. I want to sum up values for rows if the text in column "Name" is the same before the - sign.

enter image description here

jaco0646
  • 15,303
  • 7
  • 59
  • 83
  • 1
    Welcome to StackOverflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – Ronak Shah Sep 25 '19 at 08:31

3 Answers3

1

Remove everything after "-" using sub and then use How to sum a variable by group

df$Name <- sub("-.*", "",df$Name)
aggregate(cbind(val1, val2)~Name, df, sum)
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
1

Below is a data.table solution.

Data:

df = data.table(
  Name = c('IRON - A', 'IRON - B', 'SABBATH - A', 'SABBATH - B'),
  val1 = c(1,2,3,4),
  val2 = c(5,6,7,8)
)

Code:

df[, Name := sub("-.*", "", Name)]

mat = df[, .(sum(val1), sum(val2)), by = Name]

> mat
       Name V1 V2
1:    IRON   3 11
2: SABBATH   7 15
JDG
  • 1,342
  • 8
  • 18
0

You can rbind your 2 tables (top and bottom) into one data frame and then use dplyr or data.table. The data.table would be much faster for large tables.

data_framme$Name <- sub("-.*", "", data_frame$Name)

library(dplyr)
data_frame %>%
  group_by(Name) %>%
  summarise_all(sum)


library(data.table)
data.frame <- data.table(data.frame)
data.frame[, lapply(.SD, sum, na.rm=TRUE), by=Name ]
haci
  • 241
  • 1
  • 8
  • This solution would separate IRON MAIDEN - A and IRON MAIDEN - B. OP wants to identify Name groups with the same name before the "-" sign. – JDG Sep 25 '19 at 08:36
  • 1
    Thanks @J.G. for pointing this out, I have edited my answer. – haci Sep 25 '19 at 08:38