0

In my data.frame I have a column with a list of items and a column with each item's amount. Each item belongs to a group. I want to calculate the sum of amounts for a selection of items across all groups and enter a value in a new column based on the sum.

Here is my table:

GROUP ITEM AMOUNT
1 A 10
1 AT 20
1 EP 5
1 S 5
2 AC 10
2 A 5
2 P 61
2 SB 39
3 Z 25
3 AC 80
3 P 80
3 na 0
4 Z 0
4 M 100
4 na 0
4 na 0

The statements I want to apply:

  • if a group contains any of A, AT, EP, AC then perform calculations
  • if A+AT+EP+AC >= 80 enter "red" in new "SUM" column for all values in group
  • if A+AT+EP+AC <= 20 enter "yellow" in new "SUM" column for all values in group
  • if 20< A+AT+EP+AC >80 enter "orange" in new "SUM" column for all values in group
  • if a group DOES NOT contain any of A, AT, EP, AC then enter "black"

I tried using mutate with multiple SUMs that I coalesced into SUM, but I can't figure out how to apply the calculations.

This is what I want:

GROUP ITEM AMOUNT SUM
1 A 10 orange
1 AT 20 orange
1 EP 5 orange
1 S 5 orange
2 AC 10 yellow
2 A 5 yellow
2 P 61 yellow
2 SB 39 yellow
3 Z 25 red
3 AC 80 red
3 P 80 red
3 na 0 red
4 Z 0 black
4 M 100 black
4 na 0 black
4 na 0 black

Thanks for the help!

Phil
  • 7,287
  • 3
  • 36
  • 66

1 Answers1

0

I'm sure there's a more efficient way to do this, but you could give this a try - you may have to adjust the regex if the sample data doesn't contain all possible values from ITEM. I flipped your > for Orange because what you had didn't make much sense and the flip made the output match what you have as an expected output.

df = data.frame(GROUP = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4),
                ITEM = c("A", "AT", "EP", "S", "AC", "A", "P", "SB", "Z", "AC", "P", NA, "Z", "M", NA, NA),
                AMOUNT = c(10, 20, 5, 5, 10, 5, 61, 39, 25, 80, 80, 0, 0, 100, 0, 0)) 

df1 <- df %>% group_by(GROUP) %>% mutate(in_group = paste(ITEM, collapse = " ")) %>% 
mutate(SUM = ifelse(grepl("^A| A|AT|EP|AC", in_group), "TBD", "black")) %>% 
mutate(amount_placeholder = ifelse(grepl("^A$|^AT$|^EP$|^AC$", ITEM), AMOUNT, 0)) %>% 
group_by(GROUP) %>% summarize(total = sum(amount_placeholder), SUM = first(SUM)) %>% 
mutate(SUM = ifelse(SUM == "black", SUM, ifelse(total >= 80, "red", ifelse(total <= 20, "yellow", ifelse(total > 20 & total < 80, "orange", NA))))) %>% right_join(df, by = "GROUP") %>% 
select(GROUP, ITEM, AMOUNT, SUM)

In the future, a reproducible example makes it much easier to help you -- you can use the data.frame() function I use above or dput

Laura
  • 97
  • 6