
We have a large MySQL database with 1189 records and 330 variables that was exported to a comma-delimited CSV file. I am using `read_csv` from readr to import it, and the import appears to work fine. However, when I try to recode some of the variables I get this error:

Error: unexpected symbol in "test.data$coder<-test.data$User Name"
library(readr)
test.data<-read_csv("test_hcsv_file.csv",  col_names = TRUE, show_col_types=FALSE)
str(test.data)
spc_tbl_ [1,109 × 363] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ FormId                                                                                                : num [1:1109] 17 18 19 20 21 198 113 25 26 27 ...
 $ UserId                                                                                                : num [1:1109] 67 67 64 67 67 71 71 71 67 67 ...
 $ User Name : chr [1:1109] "user1" "user2" "user3" "user4" ...
test.data$coder<-test.data$User Name

> test.data$coder<-test.data$User Name
Error: unexpected symbol in "test.data$coder<-test.data$User Name"

  • Your issue is that R doesn't accept spaces in a bare variable name. If you insist on having a space in a column name in a data frame, wrap the name in backticks, namely ``test.data$coder <- test.data$`User Name` ``. See here: https://stackoverflow.com/questions/3411201/specifying-column-names-in-a-data-frame-changes-spaces-to – David Jul 16 '23 at 05:21
  • David, thanks a lot. Of the 300 variables, at least 200 have this "Variable 5" format. Can you suggest a method to fix this before or after importing? I cannot modify the MySQL export to CSV because it is the output we have. Thanks – Poggio Jul 16 '23 at 06:37
  • Poggio, you could try `read.csv` with the argument `check.names=FALSE`. That is what they suggest here: https://www.statology.org/r-read-csv-column-names-with-spaces/ By default, `read.csv` (from base R, not from the `readr` package) replaces each space in a column name with a ".". If you don't want this, you'd have something like `df <- read.csv("string to your file.csv", check.names=FALSE)`. A call such as `df$your variable name` will then fail, so you have to use backticks to handle the spaces: ``df$`your variable name` ``. – David Jul 16 '23 at 09:43
  • In this post, Ben says that `readr::read_csv` doesn't alter column names that have spaces in them: https://stackoverflow.com/questions/6124519/r-import-csv-with-column-names-that-contain-spaces . This is confirmed here as well: https://r4ds.hadley.nz/data-import.html. So either way works. The point is that `read.csv` changes the column names by inserting a period in place of each space, unless `check.names=F` is specified. Your `read_csv` call works and preserves spaces in column names by default. But whenever a column name still contains a space, you need backticks to refer to that column. – David Jul 16 '23 at 09:56
  • David, it worked. Many thanks!! – Poggio Jul 17 '23 at 06:25
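
For anyone landing here with the same problem, the options discussed in the comments can be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the inline `csv_text` sample and the names `raw`, `fixed`, `same`, `renamed` and `coder2` are made up for the example and are not part of the original data. The last step, renaming every column in one go with `gsub`, is one possible answer to the "fix this after importing" question, not something the thread itself confirms.

```r
# Hypothetical three-column sample standing in for the real CSV export.
csv_text <- "FormId,UserId,User Name\n17,67,user1\n18,67,user2\n"

# check.names = FALSE keeps the headers exactly as they appear in the file,
# which matches what readr::read_csv does by default.
raw <- read.csv(text = csv_text, check.names = FALSE)

# With the default check.names = TRUE, base R rewrites "User Name" to "User.Name".
fixed <- read.csv(text = csv_text)

# A name containing a space needs backticks with $, or a quoted string with [[ ]].
raw$coder <- raw$`User Name`
same <- raw[["User Name"]]

# Assumed clean-up step: rename all columns after import by replacing
# each run of whitespace with an underscore.
renamed <- raw
names(renamed) <- gsub("[[:space:]]+", "_", names(renamed))

# The renamed column can now be used without backticks.
renamed$coder2 <- renamed$User_Name
```

The same `names(x) <- gsub(...)` assignment should also work on the tibble returned by `read_csv`, since it only touches the names attribute. Recent versions of readr additionally expose a `name_repair` argument on `read_csv`, which may be worth checking in its documentation if you would rather fix the names at import time.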
