0

I have the following macro in SAS, and I'm trying to migrate it to R:

%macro recode_var(var=);
    if &var in (1,2,3,4,5) then do;
        if &var = 1 then &var._r = 0;
        if &var = 2 then &var._r = 25;
        if &var = 3 then &var._r = 50;
        if &var = 4 then &var._r = 75;
        if &var = 5 then &var._r = 100;
    end;
%mend;

This is called by the line

%recode_var(var=Q7_1);

This creates a new variable Q7_1_r with values based on the values of Q7_1. I have 100 variables to call the macro for (e.g. Q7_5, Q15_6, Q35_4). What is the best way to write an R equivalent?

I wrote the following code (newfile is the name of the data frame)

sevenvar <- function(var){
newfile <- mutate(newfile, Q7_1_r = case_when(var == 1 ~ 0,
                                    var == 2 ~ 25,
                                    var == 3 ~ 50,
                                    var == 4 ~ 75,
                                    var == 5 ~ 100))
}

sevens <- sevenvar(newfile$Q7_1)
head(sevens)

It worked for a specific variable (Q7_1), but if I were to do it all this way, I'd have to rewrite the above block of code 100 times, for each of the variables I want to apply it do. I'm thinking surely there has to be a more efficient way - something that could take in a variable (var) and create a new variable var_r, with the function then called upon for each variable it applies to by means of a one liner:

sevens <- sevenvar(var)

changing the value of var every time I run the above. e.g. sevens <- sevenvar(newfile$Q7_5), sevens <- sevenvar(newfile$Q15_6), sevens <- sevenvar(newfile$Q35_4). I just need to get the block of code right to begin with so I can replicate the results of the SAS macro in R.

Any advice welcome. Thank you.

Barry
  • 1
  • 2
  • 1
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Most likely you'll want to use `across()` with `mutate()` to apply the same transformation to multiple columns. You could do it all in one invocation instead of calling a function a bunch of times. – MrFlick Jun 22 '23 at 14:49
  • See reproduceable example below. The output of this is right, but the variable name of the output is "var_r". I would like output to be Q1_2_r if var is Q1_2, Q1_1_r if var is Q1_1 etc. Also, I'm looking to run this 100 times with different values for var and have all results on a single table. – Barry Jun 23 '23 at 10:08
  • 'code'Q1_1 <- c(1,2,3,4,5) Q1_2 <- c(5,4,5,3,2) Q1_3 <- c(3,4,5,1,1) df <- data.frame(Q1_1,Q1_2,Q1_3) df testvar <- function(var){ newfile <- mutate(df, var_r = case_when(var == 1 ~ 0, var == 2 ~ 25, var == 3 ~ 50, var == 4 ~ 75, var == 5 ~ 100)) } testing <- testvar(df$Q1_2) testing <- testvar(df$Q1_3) testing'code' – Barry Jun 23 '23 at 10:10
  • Apologies, not worked out how to get the formatting correct in the comments – Barry Jun 23 '23 at 10:31

1 Answers1

0

To create a custom column programmatically you can use [[ ]] to make new columns based on a formula.

m <- mtcars[1:5, c("mpg","cyl")]
for(i in 1:5){
    m[[paste0("cyl", i)]] <- (m$cyl+i)
}
                   mpg cyl cyl1 cyl2 cyl3 cyl4 cyl5
Mazda RX4         21.0   6    7    8    9   10   11
Mazda RX4 Wag     21.0   6    7    8    9   10   11
Datsun 710        22.8   4    5    6    7    8    9
Hornet 4 Drive    21.4   6    7    8    9   10   11
Hornet Sportabout 18.7   8    9   10   11   12   13

here (m$cyl + i) is just a simple function. But you could instead do a vectorized function like

recode <- function(value){
(value-1)*25
}

and swap m$cyl+1 for recode(m$cyl) any function you want as long as it returns the same length as the number of rows in the data.

In case case of wanting to do this for multiple columns, you can wrap the process in a function and then pass it to all the columns, or do a nested loop.

for(j in c("mpg","cyl"){
 for(i in 1:5){
    m[[paste0(j, i)]] <- (m[[j]]+i)
 }
}
Carlos M.
  • 303
  • 2
  • 7
  • Thank you for the response. A merge wouldn't work as I'm not trying to connect another table, and there is no other table to merge to. I'm trying to create new variables in the same table If Q6_1 = 1, I'm trying to create Q6_1_r with a value of 0, If Q6_1 = 2, Q6_1_r = 25 etc. I can hard code this for "Q6_1", but looking to do it for a dynamic "var", which uses same rules to create var_r. Then call the code with a one liner such as saying "var = Q6_1" – Barry Jun 23 '23 at 09:54
  • I understand, creating a custom reference table would be harder. I've edited to include a loop method including a loop method that programmatically takes specified column names and loops through those. You would still need to define a recode() function (possibly 1 per column if the rules change and are not (value-1)*25; but this structure scales to however many columns you desire. – Carlos M. Jun 23 '23 at 14:46
  • Thank you for the modifications to the code you posted. I'm trying to work out whether or not I can apply the code to what I'm working on. In order to aid this, can you please post what the output looks like? Thanks. – Barry Jun 26 '23 at 13:02
  • I did, the data frame of cyl, cyl1, cyl2, ... cyl5. You can write w/e function you like. – Carlos M. Jun 26 '23 at 15:12
  • Thank you. Unfortunately that wouldn't work for what I'm doing, as the variable naming isn't a clean 1-n. It's more like Q6_1...Q6_7, Q7_1...Q7_5,...,Q51. some of the Q numbers have more options than others. For example, Q6 has Q6_1 to Q6_7, while Q51 only has one option. Thus, something that could be written for a generic variable first, before another line to assign the value to apply, would be best – Barry Jun 27 '23 at 13:45
  • That's easy, just assume all columns have the max() number of options, wrap it in a tryCatch() and remove the columns that end up NULL at the end. Who cares if Q51_2 doesn't exist, you can just make a column of NULLs or use trycatch to skip it. – Carlos M. Jul 13 '23 at 14:46
  • Thank you. I need to play around with it more before I crack it. – Barry Jul 18 '23 at 09:52