I'm inferring you're hoping to bulk-insert the data using bcp
or sqlcmd
. While neither one deals well with commas, embedded commas, and embedded quotes, you can work around this by using a different field separator (that is not contained within the data).
Setup:
evil_data <- data.frame(
id = 1:2,
chr = c('evil"string ,;\n\'', '",";:|"'),
stringsAsFactors = FALSE
)
# con <- DBI::dbConnect(...)
DBI::dbExecute(con, "create table r2test (id INT, chr nvarchar(64))")
# [1] 0
DBI::dbWriteTable(con, "r2test", evil_data, create = FALSE, append = TRUE)
DBI::dbGetQuery(con, "select * from r2test")
# id chr
# 1 1 evil"string ,;\n'
# 2 2 ",";:|"
First, I'll use \U001
as the field separator and \U002
as the row separator. Those two should be "good enough", but if you have non-printable characters in your data, then you might either change your separators to other values or look for encoding options for the data (e.g., base64, though it might need to be stored that way).
write.table(evil_data, "evil_data.tab", sep = "\U001", eol = "\U002", quote = FALSE)
# or data.table::fwrite or readr::write_delim
Since I'm using bcp
, it can use a "format file" to indicate separators and which columns on the source correspond with columns in the database. See references for how to create this file, but for this example I'll use:
fmt <- c("12.0", "2",
"1 SQLCHAR 0 0 \"\001\" 1 id \"\" ",
"2 SQLCHAR 0 0 \"\002\" 2 chr SQL_Latin1_General_CP1_CI_AS")
writeLines(fmt, "evil_data.fmt")
From here, assuming bcp
is in your PATH
(you'll need an absolute path for bcp otherwise), run this in a terminal (I'm using git-bash on windows, but this should be the same in others). The second line is all specific to my database connection, you'll need to omit or change all of this for your own connection. The first line is your stuff
$ bcp [db_owner].[r2test] in evil_data.tab -F2 -f evil_data.fmt -r '\002' \
-S '127.0.0.1,21433' -U 'myusername' -d 'mydatabase' -P ***MYPASS***
Starting copy...
2 rows copied.
Network packet size (bytes): 4096
Clock Time (ms.) Total : 235 Average : (8.51 rows per sec.)
Proof that it worked:
DBI::dbGetQuery(con, "select * from r2test")
# id chr
# 1 1 evil"string ,;\n'
# 2 2 ",";:|"
# 3 1 1\001evil"string ,;\r\n'
# 4 2 2\001",";:|"
References: