1

I've been teaching myself R from scratch, basically by undertaking something, then reading posts like these, and trial-and-error based on that. Sometimes I hit a wall and reach out.

I've hit a wall. I have dplyr 0.7 installed. I have a tibble with a column - call it contract_key - I added by applying mutate(coalesce()) to three other columns in the tibble. Here's sample data:

product <- c("655393265191","655393265191","168145850127","168145850127","350468621217","350468621217","977939797847","NA","928893912852")
supplier <- c("person5","person3","person10","person5","person11","person5","person11","person14","person5")
vendor <- c("org2","org3","org3","org2","org1","org2","org1","org5","org2")
quantity <- c(7,5,6,1,2,1,18,2,2)
gross <- c(0.0419,0.0193,0.0439,0.0069,0.0027,0.0055,0.0233,NA,0.0004)

df <- data_frame(product,supplier,vendor,quantity,gross)

Here's how I generated contract_key:

df <- df %>% 
  mutate(contract_key = coalesce(product,supplier,vendor))

I now want to add another column that categorizes the contents of contract_key based on which of the three columns provided the content (through coalesce()). So if contract_key ="person5", for example, the new column, contract_level, would be "supplier". And contract_key="org2" would map to contract_level = "vendor", etc.

Essentially, I'll be using contract_level as a join variable to another tibble.

I'm stumped. I've tried if_else, and I see that I shouldn't bother trying case_when (because it's inside mutate()). I've also tried nesting if_else's to no avail.

It's probably basic R syntax that I just don't know. Something to do with dot notation and grammar. If someone supplies the answer, I will backtrace until I figure out what you did. (And I'll have learned a new lesson in R!)

Thanks!

Steve
  • 575
  • 4
  • 18
  • 1
    In the data sample for `product`, change `"NA"` to `NA` so that it will be coded correctly as a missing value, rather than as the string `"NA"`. – eipi10 Sep 12 '17 at 16:23
  • Does this answer your question? [Can dplyr package be used for conditional mutating?](https://stackoverflow.com/questions/24459752/can-dplyr-package-be-used-for-conditional-mutating) – divibisan Jun 16 '20 at 14:21

1 Answers1

3

How about this:

df %>% mutate(contract_key = coalesce(product,supplier,vendor),
              contract_level = case_when(contract_key %in% product ~ "product",
                                         contract_key %in% supplier ~ "supplier",
                                         contract_key %in% vendor ~ "vendor",
                                         TRUE ~ "none"))
       product supplier vendor quantity  gross contract_key contract_level
1 655393265191  person5   org2        7 0.0419 655393265191        product
2 655393265191  person3   org3        5 0.0193 655393265191        product
3 168145850127 person10   org3        6 0.0439 168145850127        product
4 168145850127  person5   org2        1 0.0069 168145850127        product
5 350468621217 person11   org1        2 0.0027 350468621217        product
6 350468621217  person5   org2        1 0.0055 350468621217        product
7 977939797847 person11   org1       18 0.0233 977939797847        product
8         <NA> person14   org5        2     NA     person14       supplier
9 928893912852  person5   org2        2 0.0004 928893912852        product

Other options that require less code:

df %>% mutate(contract_key = coalesce(product,supplier,vendor),
              contract_level = if_else(!is.na(product), 'product', 
                                       if_else(!is.na(supplier), 'supplier', 'vendor')))

df %>% mutate(contract_key = coalesce(product,supplier,vendor),
              contract_level = apply(., 1, function(x) names(.)[min(which(!is.na(x)))]))
eipi10
  • 91,525
  • 24
  • 209
  • 285
  • Thanks! Really that simple I guess. For any future readers: I had not even *tried* case_when because the dplyr documentation specifically says case_when doesn't work inside mutate(). But it did, at least in this case! Thanks @eipi10 – Steve Sep 12 '17 at 16:31
  • 1
    Just curious where the documentation says `case_when` doesn't work inside mutate. The help for `case_when` has an example that begins `case_when is particularly useful inside mutate...`. – eipi10 Sep 12 '17 at 16:36
  • Ah, I think that was an issue in earlier versions of `dplyr`, but not any more. – eipi10 Sep 12 '17 at 16:37
  • https://blog.rstudio.com/2016/06/27/dplyr-0-5-0/, about halfway down the page in the case_when() section: "case_when() is still somewhat experiment and does not currently work inside mutate(). That will be fixed in a future version." That's the doc for 0.7, which I think is the current major release? – Steve Sep 12 '17 at 16:38
  • 1
    anyway, now i have to figure out how to do a conditional join based on contract_level. back to the salt mines!! – Steve Sep 12 '17 at 16:39