I have a similar question to this: (Sum the duplicate rows of particular columns in dataframe) but the solution does not work for me or I don't know how to modify it.
I need the add the Number column values together if both Reference and NODCCODE match, even if NODCCODE are not adjacent within Reference number.
I have this:
structure(list(Reference = c("BBM101", "BBM102",
"BBM102", "BBM102", "BBM103", "BBM103",
"BBM104", "BBM105", "BBM105", "BBM105"),
NODCCODE = c("101","301", "201", "201", "201", "401", "401", "201", "102", "201"),
Number = c(2, 1, 3, 1, 3, 14, 3, 24, 2, 1)),
row.names = c(NA, 10L), class = "data.frame")
Reference NODCCODE Number
1 BBM101 101 2
2 BBM102 301 1
3 BBM102 201 3
4 BBM102 201 1
5 BBM103 201 3
6 BBM103 401 14
7 BBM104 401 3
8 BBM105 201 24
9 BBM105 102 2
10 BBM105 201 1
I want this:
structure(list(Reference = c("BBM101", "BBM102", "BBM102", "BBM103", "BBM103", "BBM104", "BBM105", "BBM105"),
NODCCODE = c("101","301", "201", "201", "401", "401", "201", "102"),
Number = c(2, 1, 4, 3, 14, 3, 25, 2)),
row.names = c(NA, 8L), class = "data.frame")
Reference NODCCODE Number
1 BBM101 101 2
2 BBM102 301 1
3 BBM102 201 4
4 BBM103 201 3
5 BBM103 401 14
6 BBM104 401 3
7 BBM105 201 25
8 BBM105 102 2
notice that lines 3 and 4 Reference and NODCCODE were merged and Number column was added. Also lines 8 and 10 even though there was a 102 value in between the 201 values, all with the same Reference number, so they were added. I do not care whether the remaining rows are at the beginning or the end of that group of reference numbers.