6

I am trying to apply different functions to different rows based on the value of a string in an adjacent column. My dataframe looks like this:

type    size
A       1
B       3
A       4
C       2
C       5
A       4
B       32
C       3

and I want to apply different functions to types A, B, and C to produce a third column, "size2". For example, let's say the following functions apply to A, B, and C:

for A: size2 = 3*size
for B: size2 = size
for C: size2 = 2*size 

I'm able to do this for each type separately using this code

df$size2 <- ifelse(df$type == "A", 3*df$size, NA)
df$size2 <- ifelse(df$type == "B", 1*df$size, NA)
df$size2 <- ifelse(df$type == "C", 2*df$size, NA)

However, I can't seem to do it for all of the types at once, because each assignment erases the values written by the previous one. I tried to restrict the function to values that were still NA (i.e., keep existing values and fill in only the NA ones), but this code didn't work:

df$size2 <- ifelse(is.na(df$size2), ifelse(df$type == "C", 2*df$size, NA), NA)

Does anyone have any ideas? Is it possible to use some kind of AND statement with "is.na(df$size2)" and "ifelse(df$type == "C""?

Many thanks!

Thomas

3 Answers

5

This might be a bit more R-ish (I called my data frame 'dat' instead of 'df', since df is a commonly used function).

> facs <- c(3,1,2)
> dat$size2= dat$size* facs[ match( dat$type, c("A","B","C") ) ]
> dat
  type size size2
1    A    1     3
2    B    3     3
3    A    4    12
4    C    2     4
5    C    5    10
6    A    4    12
7    B   32    32
8    C    3     6

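The match function returns, for each row, the position of its type in c("A","B","C"); those positions are then used as indexes for the extract function [ to pick the corresponding factor from facs.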
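
To make the intermediate step visible, here is a minimal sketch that prints the index vector before it is used. The data frame is rebuilt from the table in the question, and the names `types` and `idx` are introduced only for illustration.

```r
# Sample data rebuilt from the question's table
dat <- data.frame(
  type = c("A", "B", "A", "C", "C", "A", "B", "C"),
  size = c(1, 3, 4, 2, 5, 4, 32, 3),
  stringsAsFactors = FALSE
)

types <- c("A", "B", "C")  # illustrative name for the lookup keys
facs  <- c(3, 1, 2)        # one multiplier per key, in the same order

# For each row, the position of its type within `types`
idx <- match(dat$type, types)
print(idx)
# 1 2 1 3 3 1 2 3

# Indexing facs by those positions yields one multiplier per row
dat$size2 <- dat$size * facs[idx]
```

A type that does not appear in `types` gets an `NA` index from `match`, so its `size2` ends up `NA` rather than raising an error.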

IRTFM
3

If you want, you can nest the ifelse calls:

df$size2 <- ifelse(df$type == "A", 3*df$size,
                       ifelse(df$type == "B", 1*df$size,
                           ifelse(df$type == "C", 2*df$size, NA)))


#    > df
#  type size size2
#1    A    1     3
#2    B    3     3
#3    A    4    12
#4    C    2     4
#5    C    5    10
#6    A    4    12
#7    B   32    32
#8    C    3     6
user1317221_G
  • This worked, but was a bit tedious as I have 32 different "types" LOL – Thomas Feb 02 '13 at 23:52
  • I offer a more compact approach. I try to avoid nested `ifelse`s if the nesting is more than 4 deep. Ray Waldin's and my approach are very similar. – IRTFM Feb 03 '13 at 01:34
0

You could do it like this, creating a separate logical vector for each type:

As <- df$type == 'A'
Bs <- df$type == 'B'
Cs <- df$type == 'C'
df$size2[As] <- 3*df$size[As]
df$size2[Bs] <- df$size[Bs]
df$size2[Cs] <- 2*df$size[Cs]

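but a more direct approach would be to create a separate lookup table, a named vector indexed by type (as.character ensures that a factor column is looked up by name rather than by its integer codes), like this: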

df$size2 <- c(A=3,B=1,C=2)[as.character(df$type)] * df$size
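
If the per-type rule is not a plain multiplier, the same lookup idea can hold functions instead of numbers. The following is only a sketch under that assumption: the list `funs` and the loop variable `ty` are hypothetical names, and the data frame is rebuilt from the question's table.

```r
# Sample data rebuilt from the question's table
df <- data.frame(
  type = c("A", "B", "A", "C", "C", "A", "B", "C"),
  size = c(1, 3, 4, 2, 5, 4, 32, 3),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: one function per type
funs <- list(
  A = function(x) 3 * x,
  B = function(x) x,
  C = function(x) 2 * x
)

# Start with NA so rows whose type has no entry stay missing
df$size2 <- NA_real_

# Apply each function only to the rows of its own type
for (ty in names(funs)) {
  rows <- df$type == ty
  df$size2[rows] <- funs[[ty]](df$size[rows])
}
```

With many types, only the `funs` list grows; the loop stays the same.
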
Ray Waldin