1

I have a sample data frame as such:

dat <- data.frame(V1 = c("Non Debt Balance","Non Debt Income","Non Debt Cost"), 
              V2 = c("Average Balance","Income","Cost"), 
              V3 = c("Trade","Sales Finance","Trade"))

I would like to create a new column based on the value of column V2. If column V2 is %in% ("Income","Cost") then I would like the new column to pick column V3, if not in the list then V2.

I was thinking about using the %in% function but not sure how I would go about doing an if in this list then get this column else get this column logic.

Help is most appreciated. Regards,

smci
  • 32,567
  • 20
  • 113
  • 146
Aksel Etingu
  • 195
  • 2
  • 15
  • Seems your dataframe behavior is dependent on the option default `options('stringsAsFactors'=FALSE)`. It's best not to rely on obscure non-default behavior like that... – smci Feb 01 '18 at 22:30
  • See also [Change stringsAsFactors settings for data.frame](https://stackoverflow.com/questions/11538532/change-stringsasfactors-settings-for-data-frame#11538828) – smci Feb 01 '18 at 22:32
  • Anyway, there are many existing duplicates of this and related-but-distinct questions eg [1](https://stackoverflow.com/questions/18685519/adding-multiple-ranges-of-values-from-a-column-in-the-ifelse-statement-in-r),[2](https://stackoverflow.com/questions/37321020/adding-new-column-with-conditional-values-using-ifelse),[3](https://stackoverflow.com/questions/43461414/ifelse-on-data-frame-column-to-replace-with-date-time-column-values-if-matched),[4](https://stackoverflow.com/questions/35868728/how-to-use-ifelse-when-comparing-two-columns-and-changing-a-third), must canonicalize all the variants – smci Feb 01 '18 at 23:00

3 Answers3

3

Is this what you're looking for?

library(dplyr)

dat %>% 
  mutate(V4 = case_when(V2 == "Income" | V2 == "Cost" ~ V3,
                        TRUE ~ V2))

                V1              V2            V3              V4
1 Non Debt Balance Average Balance         Trade Average Balance
2  Non Debt Income          Income Sales Finance   Sales Finance
3    Non Debt Cost            Cost         Trade           Trade

Alternatively, using %in% as you mentioned:

dat %>% 
  mutate(V4 = case_when(V2 %in% c("Cost", "Income") ~ V3,
                        TRUE ~ V2))

Data:

dat <- data.frame(V1 = c("Non Debt Balance","Non Debt Income","Non Debt Cost"), 
                  V2 = c("Average Balance","Income","Cost"), 
                  V3 = c("Trade","Sales Finance","Trade"),
                  stringsAsFactors = FALSE)
tyluRp
  • 4,678
  • 2
  • 17
  • 36
  • 1
    Yes, this is the output I was trying to get to. However, I do get an error message when I run your code. Warning message: In `[<-.factor`(`*tmp*`, i, value = 1L) : invalid factor level, NA generated – Aksel Etingu Feb 01 '18 at 22:15
  • 2
    Ah, forgot to mention that I used `stringsAsFactors = FALSE`. I'll update my answer. – tyluRp Feb 01 '18 at 22:16
  • 2
    Perfect! Thank you very much:) – Aksel Etingu Feb 01 '18 at 22:19
  • AkselEtingu, you didn't show reproducible data before we answered. You said the data was string but it was actually factor. There's no point complaining "Your solution doesn't work" if you didn't give reproducible data in the first place... use the `dput()` command to write out exactly what your data is. – smci Feb 01 '18 at 22:25
  • Anyway, there are many existing duplicates of this and related-but-distinct questions eg [1](https://stackoverflow.com/questions/18685519/adding-multiple-ranges-of-values-from-a-column-in-the-ifelse-statement-in-r),[2](https://stackoverflow.com/questions/37321020/adding-new-column-with-conditional-values-using-ifelse),[3](https://stackoverflow.com/questions/43461414/ifelse-on-data-frame-column-to-replace-with-date-time-column-values-if-matched),[4](https://stackoverflow.com/questions/35868728/how-to-use-ifelse-when-comparing-two-columns-and-changing-a-third), must canonicalize all the variants – smci Feb 01 '18 at 23:01
2
dat$NewCol <- ifelse(dat$V2 %in% c('Income','Cost'), dat$V3, dat$V2)
  • You can also use dplyr/data.table
  • ifelse doesn't have the best performance. Also you can't do NA-aware comparisons (i.e. three-way selects with an NA clause)
smci
  • 32,567
  • 20
  • 113
  • 146
  • Thank you very much for your help. This code doesn't, seem to work. Instead of getting the values from those two columns, I am getting 1,1,2 as values for the new column. Perhaps I am missing something. – Aksel Etingu Feb 01 '18 at 22:18
  • This is also because of the `stringsAsFactors = FALSE` argument missing when reading your data. Alternatively, you could read it without that argument and then convert everything to character using `dat[] <- lapply(dat, as.character)`. After that, smci's solution should work. – tyluRp Feb 01 '18 at 22:21
  • Thank you! This works great as well. Much appreciated! – Aksel Etingu Feb 01 '18 at 22:25
1

data.table way:

 result <- setDT(dat)[,result_col:= ifelse(V2 %in% c('Income','Cost'), V3, V2)]


> result

    V1                    V2            V3             result_col
1  Non Debt Balance   Average Balance  Trade         Average Balance
2  Non Debt Income       Income     Sales Finance     Sales Finance
3  Non Debt Cost          Cost         Trade             Trade
Rushabh Patel
  • 2,672
  • 13
  • 34