I have a dataset that currently has 4 rows (subjects; more to come, as this is ongoing research) and 259 columns (variables). 240 of these variables are ratings of fit ("How well does the following adjective match the dimension X?") and 19 are sociodemographic.
For these 240 rating variables, my subjects could give a rating ranging from 1 ("fits very badly") to 7 ("fits very well"). Consequently, I have 240 variables with values from 1 to 7. I would like to change these numeric values as follows (the procedure being the same for all 240 columns):
1 should change to 0, 2 should change to 1/6, 3 should change to 2/6, 4 should change to 3/6, 5 should change to 4/6, 6 should change to 5/6 and 7 should change to 1. So no matter where in the 240 columns, a 1 should change to 0 and so on.
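To make the goal concrete, here is a minimal sketch of the result I am after. It uses a made-up toy data frame rather than my real data: the values and the column age are assumptions for illustration only (age stands in for the sociodemographic columns), while AD01_01 and AD01_02 are two of my rating columns. As far as I can tell, the mapping above is the same as (x - 1) / 6, so that is what the sketch uses.

```r
# Toy stand-in for ds (hypothetical values; `age` stands in for the
# 19 sociodemographic columns, which must stay unchanged)
toy <- data.frame(
  AD01_01 = c(1, 4, 7, 2),
  AD01_02 = c(3, 5, 6, 1),
  age     = c(23, 31, 45, 28)
)

# Only these columns should be rescaled
rating_cols <- c("AD01_01", "AD01_02")

# 1 -> 0, 2 -> 1/6, ..., 7 -> 1 is the same as (x - 1) / 6
wanted <- toy
wanted[rating_cols] <- (toy[rating_cols] - 1) / 6

wanted
```

In the real dataset the same thing would have to happen for all 240 rating columns, with the 19 sociodemographic columns left untouched.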
I have tried the following approaches:
In this post, it says that
x <- 1:10
# With recode function using backquotes as arguments
dplyr::recode(x, `2` = 20L, `4` = 40L)
# [1] 1 20 3 40 5 6 7 8 9 10
# With case_when function
dplyr::case_when(
x %in% 2 ~ 20,
x %in% 4 ~ 40,
TRUE ~ as.numeric(x)
)
# [1] 1 20 3 40 5 6 7 8 9 10
Consequently, I tried this:
df = ds %>% select(AD01_01:AD01_20,AD02_01:AD02_20,AD03_01:AD03_20,AD04_01:AD04_20,AD05_01:AD05_20,AD06_01:AD06_20, AD09_01:AD09_20,AD10_01:AD10_20,AD11_01:AD11_20,AD12_01:AD12_20,AD13_01:AD13_20,AD14_01:AD14_20)
%>% recode(.,`1`=0,`2`=-1/6,`3`=-2/6, `4`=3/6,`5`=4/6, `6`=5/6, `7`=1))
with AD01_01 etc. being the column names for the adjectives my subjects should rate. I also tried it without the "., " after "recode(", to no avail.
This code is flawed because it omits the 19 columns of sociodemographic data I want to keep in my dataset. Moreover, I get the error unexpected SPECIAL in "%>%".
I thought R might accept my selected columns, passed along by the pipe operator, as the "x" in recode. Apparently, this is not the case. I also tried to read the R documentation for recode, but it made things much more confusing for me, as there were a lot of technical terms I don't understand.
As there is another option mentioned in the post, I also tried this:
df = df %>% select(AD01_01:AD01_20,AD02_01:AD02_20,AD03_01:AD03_20,AD04_01:AD04_20,AD05_01:AD05_20,AD06_01:AD06_20, AD09_01:AD09_20,AD10_01:AD10_20,AD11_01:AD11_20,AD12_01:AD12_20,AD13_01:AD13_20,AD14_01:AD14_20) %>% case_when (.,%in% 1~0,%in% 2~1/6,%in%3~2/6,%in%4~3/6,%in%5~4/6,%in%6~5/6,%in%7~1)
I thought I could give the output of the select function to the case_when function. Apparently, this is also not the case.
When I execute this command, I get
Error: unexpected SPECIAL in:
"df = df %>% select(AD01_01:AD01_20,AD02_01:AD02_20,AD03_01:AD03_20,AD04_01:AD04_20,AD05_01:AD05_20,AD06_01:AD06_20, AD09_01:AD09_20,AD10_01:AD10_20,AD11_01:AD11_20,AD12_01:AD12_20,AD13_01:AD13_20,AD14_01:AD14_20) %>% case_when (%in%"
Reading up on other possibilities, I found this
https://rstudio-education.github.io/hopr/modify.html
Example dataset:
head(dplyr::storms)
## # A tibble: 6 x 13
## name year month day hour lat long status category wind pressure
## <chr> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <chr> <ord> <int> <int>
## 1 Amy 1975 6 27 0 27.5 -79 tropi… -1 25 1013
## 2 Amy 1975 6 27 6 28.5 -79 tropi… -1 25 1013
## 3 Amy 1975 6 27 12 29.5 -79 tropi… -1 25 1013
## 4 Amy 1975 6 27 18 30.5 -79 tropi… -1 25 1013
## 5 Amy 1975 6 28 0 31.5 -78.8 tropi… -1 25 1012
## 6 Amy 1975 6 28 6 32.4 -78.7 tropi… -1 25 1012
## # ... with 2 more variables: ts_diameter <dbl>, hu_diameter <dbl>
# We decide that we want to recode all NAs to 9999.
storm <- storms
storm$ts_diameter[is.na(storm$ts_diameter)] <- 9999
summary(storm$ts_diameter)
Based on this, I tried the following:
ds$AD01_01:AD01_20[1(ds$AD01_01:AD01_20)] <- 0, ds$AD01_01:AD01_20[2(ds$AD01_01:AD01_20)] <- 1/6, ds$AD01_01:AD01_20[3(ds$AD01_01:AD01_20)] <- 2/6,
ds$AD01_01:AD01_20[4(ds$AD01_01:AD01_20)] <- 3/6, ds$AD01_01:AD01_20[5(ds$AD01_01:AD01_20)] <- 4/6, ds$AD01_01:AD01_20[6(ds$AD01_01:AD01_20)] <- 5/6,
ds$AD01_01:AD01_20[7(ds$AD01_01:AD01_20)] <- 1
My idea in this case was to assign to multiple columns at a time (this attempt covers just 20 of my 240 columns), and it also didn't work. I got the error could not find function ":<-", which is weird because I thought this was a basic command. The only noteworthy thing that might explain it is that I executed library(readr) and library(tidyverse) beforehand.
Disclaimer: I am an R newbie and have spent 2 hours trying to solve this issue. I would also like to know where I went wrong and why my code doesn't work.