I have a dataframe that looks like this:

Name   Fruit   Cost
Adam   Orange     2
Adam   Apple      3
Bob    Orange     3
Cathy  Orange     4
Cathy  Orange     5

Dataframe creation:

df <- data.frame(
  Name  = c("Adam", "Adam", "Bob", "Cathy", "Cathy"),
  Fruit = c("Orange", "Apple", "Orange", "Orange", "Orange"),
  Cost  = c(2, 3, 3, 4, 5)
)

I would like to combine rows where both Name and Fruit match: add up their Cost and keep a single row. For the example, the result would look like this, with the two Cathy costs combined because the Name and Fruit are the same:

Name   Fruit   Cost
Adam   Orange     2
Adam   Apple      3
Bob    Orange     3
Cathy  Orange     9

I was thinking of writing a for loop that goes line by line, compares the values, adds the costs, and then deletes the duplicate row. But I have to imagine there's a faster/cleaner way.

Daniel T.
EricW

2 Answers

What you are trying to do is sum Cost within a group.

In base R:

aggregate(Cost ~ Name + Fruit, df, sum)
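For reference, here is a self-contained sketch of the base R approach on the question's data. The name `res` and the final re-sorting step are my own additions for illustration, not part of the original one-liner. Note that `aggregate()` returns rows ordered by the grouping columns rather than in input order, and its formula method drops rows containing `NA` by default.

```r
# Rebuild the example data from the question.
df <- data.frame(
  Name  = c("Adam", "Adam", "Bob", "Cathy", "Cathy"),
  Fruit = c("Orange", "Apple", "Orange", "Orange", "Orange"),
  Cost  = c(2, 3, 3, 4, 5)
)

# Sum Cost within each Name/Fruit combination.
res <- aggregate(Cost ~ Name + Fruit, data = df, FUN = sum)

# Optional (illustrative): sort the result by Name, then Fruit.
res <- res[order(res$Name, res$Fruit), ]
print(res)
```

The two Cathy/Orange rows collapse into one row with a Cost of 9, and the other rows are unchanged.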

Or using dplyr:

library(dplyr)

df %>% 
  group_by(Name, Fruit) %>% 
  summarize(Cost = sum(Cost), .groups = "drop")
LMc

We may use data.table:

library(data.table)
setDT(df)[, .(Cost = sum(Cost)), .(Name, Fruit)]
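One thing to be aware of: `setDT()` converts `df` to a data.table by reference, so `df` itself is changed. If you would rather leave the original data.frame untouched, a possible variant is to work on a copy made with `as.data.table()`. The names `dt` and `res` below are just for illustration.

```r
library(data.table)

# Rebuild the example data from the question.
df <- data.frame(
  Name  = c("Adam", "Adam", "Bob", "Cathy", "Cathy"),
  Fruit = c("Orange", "Apple", "Orange", "Orange", "Orange"),
  Cost  = c(2, 3, 3, 4, 5)
)

# as.data.table() makes a copy, so df stays a plain data.frame;
# setDT(df) would instead convert df in place.
dt <- as.data.table(df)

# Sum Cost within each Name/Fruit group.
res <- dt[, .(Cost = sum(Cost)), by = .(Name, Fruit)]
print(res)
```

Grouping with `by` keeps groups in order of first appearance, so the rows come out in the same order as the desired output in the question.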
akrun