I'm sure there is an easy answer to this, but I have searched Stack Overflow and haven't been able to find a solution. A combination of the sapply and ifelse functions might do the job, but I'm not sure.

I have a dataframe of character columns, plus one column that holds a numeric value.

# Create the dataframe that needs converting
df <- data.frame(Sample_1 = rep(letters[1:3], each = 3),
                 Sample_2 = rep("a", times = 9))
df$Number <- rep(1:3, times = 3)

I would like to convert the characters in this dataframe to a specific number. What the character needs to be converted to depends on the number in the final column. So the criteria would be:

  • If Number = 1, then a should change to 30, b should change to 20 and c should change to 10
  • If Number = 2, then a should change to 35, b should change to 25 and c should change to 15
  • If Number = 3, then a should change to 40, b should change to 30 and c should change to 20

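Here is a dataframe summarising this conversion (columns A, B and C hold the values for Number = 1, 2 and 3; the rows correspond to a, b and c):

A <- c(30, 20, 10)
B <- c(35, 25, 15)
C <- c(40, 30, 20)
Conversion_df <- data.frame(A, B, C)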

And here is the desired output.

Final <- data.frame(Sample_1 = c(30, 35, 40, 20, 25, 30, 10, 15, 20),
                    Sample_2 = c(30, 35, 40, 30, 35, 40, 30, 35, 40))

Thank you in advance for any help.

James White

3 Answers

First, we can create a function that evaluates the sample using nested ifelse() calls:

valuate_sample <- function(x,y) {
    ifelse(y==1, ifelse(x=='a',30, ifelse(x=='b',20, 10)),
           ifelse(y==2, ifelse(x=='a',35, ifelse(x=='b',25, 15)),
                  ifelse(y==3, ifelse(x=='a',40, ifelse(x=='b',30, 20)),0)))
}
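
Before applying it to the dataframe, a quick check on small vectors may help show how the nested ifelse() calls behave. This is only a sketch: the function is repeated so the snippet runs on its own, and the input values and variable names (paired, unknown_letter, unknown_number) are made up for illustration.

```r
# Function from the answer above, repeated so this snippet runs on its own
valuate_sample <- function(x, y) {
    ifelse(y == 1, ifelse(x == 'a', 30, ifelse(x == 'b', 20, 10)),
           ifelse(y == 2, ifelse(x == 'a', 35, ifelse(x == 'b', 25, 15)),
                  ifelse(y == 3, ifelse(x == 'a', 40, ifelse(x == 'b', 30, 20)), 0)))
}

# ifelse() is vectorised, so each element of x is paired with the
# element of y in the same position
paired <- valuate_sample(c("a", "b", "c"), c(1, 2, 3))  # 30 25 20

# Any letter that is not "a" or "b" falls through to the "c" value
unknown_letter <- valuate_sample("z", 1)  # 10

# A Number outside 1, 2 or 3 returns the final fallback, 0
unknown_number <- valuate_sample("a", 4)  # 0
```

The last two calls show that unexpected inputs are converted silently rather than flagged, so it may be worth checking the data for stray values first.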

Then we just need to apply the function to the columns of your dataframe:

library(dplyr)

df <- df %>% 
    mutate(
        Sample_1 = valuate_sample(Sample_1, Number),
        Sample_2 = valuate_sample(Sample_2, Number)
        )


skulden
  • Thank you for this answer @skulden. Do you know of a way that I can mutate all of the 'Sample' columns simultaneously? I actually have several hundred sample columns and would rather avoid specifying this conversion for each individual column. Thank you. – James White Feb 18 '19 at 09:51
  • Yeah, dplyr has a function called `mutate_all()`; take a look at it. – skulden Feb 18 '19 at 12:10
  • I have tried this, but however I code it I get error messages like "no applicable method for 'tbl_vars' applied to an object of class 'function'", or that it can't recognise the y value. If you do get a chance to slightly modify your answer above so that it applies the valuate_sample function across all columns (except the Number column), I'd be very grateful. Thank you for any help on this. – James White Feb 18 '19 at 12:15
  • Maybe you have coded it wrong, I don't know. Post your code for us. – skulden Feb 18 '19 at 12:18
  • I have gone through various iterations with different error messages. Here is my latest attempt, where I try to mutate only the 'factor' columns (i.e. the samples): `Test <- df %>% mutate_all(is.factor, valuate_sample, df$Number)` – James White Feb 18 '19 at 12:22
  • Sometimes I prefer to hard-code it; it's more effective. Try this: `for(column in names(df)) { if(is.factor(df[,column])){ df[,column] <- valuate_sample(df[,column], df[,'Number']) } }` – skulden Feb 18 '19 at 14:05
  • Perfect! Thank you so much! I've added this answer in case it might be of interest to others – James White Feb 18 '19 at 14:18
  • This is a great reply Skulden. I faced another problem where I needed to update multiple columns based on multiple conditions, and your solutions also generalised well to that other problem. Thank you! – Sandy Nov 06 '22 at 22:53

I also have a dplyr solution, but using case_when, which is perhaps a bit more transparent. The idea is taken from this answer https://stackoverflow.com/a/24459900/5795592

library(dplyr)

df %>%
  mutate(
    # Sample_1
    Sample_1_conv = case_when(
      Number == 1 & Sample_1 == "a" ~ 30,
      Number == 1 & Sample_1 == "b" ~ 20,
      Number == 1 & Sample_1 == "c" ~ 10,
      Number == 2 & Sample_1 == "a" ~ 35,
      Number == 2 & Sample_1 == "b" ~ 25,
      Number == 2 & Sample_1 == "c" ~ 15,
      Number == 3 & Sample_1 == "a" ~ 40,
      Number == 3 & Sample_1 == "b" ~ 30,
      Number == 3 & Sample_1 == "c" ~ 20
    ),
    # Sample_2
    Sample_2_conv = case_when(
      Number == 1 & Sample_2 == "a" ~ 30,
      Number == 1 & Sample_2 == "b" ~ 20,
      Number == 1 & Sample_2 == "c" ~ 10,
      Number == 2 & Sample_2 == "a" ~ 35,
      Number == 2 & Sample_2 == "b" ~ 25,
      Number == 2 & Sample_2 == "c" ~ 15,
      Number == 3 & Sample_2 == "a" ~ 40,
      Number == 3 & Sample_2 == "b" ~ 30,
      Number == 3 & Sample_2 == "c" ~ 20
    )
  )
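
Because the nine rules form a 3 x 3 table, another option is to store them once in a lookup matrix and index it by letter and Number, rather than spelling out every case. The following is a minimal base R sketch of that idea, not part of the answer above; the names lookup, convert_column and converted are hypothetical, and the data is the example from the question.

```r
# Example data from the question
df <- data.frame(Sample_1 = rep(letters[1:3], each = 3),
                 Sample_2 = rep("a", times = 9))
df$Number <- rep(1:3, times = 3)

# Hypothetical lookup matrix: rows are the letters, columns are the Number
# values. matrix() fills column by column, so the first column is 30, 20, 10.
lookup <- matrix(c(30, 20, 10,
                   35, 25, 15,
                   40, 30, 20),
                 nrow = 3,
                 dimnames = list(c("a", "b", "c"), c("1", "2", "3")))

# A two-column character matrix used as an index selects one cell per row,
# matching on the row and column names of lookup
convert_column <- function(x, number) {
  lookup[cbind(as.character(x), as.character(number))]
}

converted <- df
converted$Sample_1 <- convert_column(df$Sample_1, df$Number)
converted$Sample_2 <- convert_column(df$Sample_2, df$Number)
```

With this layout, changing a rule means editing one cell of the matrix, and the same helper can be reused for any further sample columns.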
hannes101

As per the code described by @skulden in the comments, you can also apply the 'valuate_sample' function automatically across all of the desired columns (i.e. those coded as factors within the dataframe).

Here is the function highlighted by @skulden in a previous answer.

valuate_sample <- function(x, y) {
    ifelse(y == 1, ifelse(x == 'a', 30, ifelse(x == 'b', 20, 10)),
           ifelse(y == 2, ifelse(x == 'a', 35, ifelse(x == 'b', 25, 15)),
                  ifelse(y == 3, ifelse(x == 'a', 40, ifelse(x == 'b', 30, 20)), 0)))
}

And here is how this can be applied to all columns.

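for (column in names(df)) {
    # is.character() covers R >= 4.0, where data.frame() no longer creates factors by default
    if (is.factor(df[[column]]) || is.character(df[[column]])) {
        df[[column]] <- valuate_sample(df[[column]], df[["Number"]])
    }
}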
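
The same loop can also be written without an explicit for, which may be convenient with several hundred sample columns. This is a sketch in base R that assumes every column to convert has a name starting with "Sample_"; sample_cols is a hypothetical helper variable, and the data and function are repeated so the snippet runs on its own.

```r
# Example data from the question; stringsAsFactors = TRUE reproduces the
# factor columns that data.frame() created by default before R 4.0
df <- data.frame(Sample_1 = rep(letters[1:3], each = 3),
                 Sample_2 = rep("a", times = 9),
                 stringsAsFactors = TRUE)
df$Number <- rep(1:3, times = 3)

valuate_sample <- function(x, y) {
    ifelse(y == 1, ifelse(x == 'a', 30, ifelse(x == 'b', 20, 10)),
           ifelse(y == 2, ifelse(x == 'a', 35, ifelse(x == 'b', 25, 15)),
                  ifelse(y == 3, ifelse(x == 'a', 40, ifelse(x == 'b', 30, 20)), 0)))
}

# Select the columns to convert by name (assumes a "Sample_" prefix)
sample_cols <- grep("^Sample_", names(df), value = TRUE)

# lapply() passes each selected column as x; Number is supplied as y
df[sample_cols] <- lapply(df[sample_cols], valuate_sample, y = df$Number)
```

Selecting by name rather than by type leaves the Number column untouched even if it is stored as character or factor. With dplyr 1.0 or later, `mutate(across(starts_with("Sample_"), ~ valuate_sample(.x, Number)))` should be the equivalent pipeline form.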
skulden
James White