I suggest working around this problem without dbplyr. As the overwhelming majority of encoding questions have nothing to do with dbplyr (the encoding tag has 23k questions, while the dbplyr tag has <400 questions) this may be a non-trivial problem to resolve without dbplyr.
Here are two work-arounds to consider:
Use a text file as an intermediate step
R will have no problem writing an in-memory table out to a text file/csv. And SQL server has standard ways of reading in a text file/csv. This gives you the added advantage of validating the contents of the text file before loading it into SQL.
Documentation for SQL Server BULK INSERT
can be found here. This answer gives instructions for using UTF-8 encoding: CODEPAGE = '65001'
. And this documentation gives instructions for unicode: DATAFILETYPE = 'widechar'
.
If you want to take this approach entirely within R, it will likely look something like:
write.csv(failing_row, "output_file.csv")
query_for_creating_table = "CREATE TABLE schema_name.table_name (
col1 INT,
col2 NCHAR(10),
)"
# each column listed along with suitable data types
query_for_bulk_insert = "BULK INSERT schema_name.table_name
FROM 'output_file.csv'
WITH
(
DATAFILETYPE = 'widechar',
FIRSTROW = 2,
ROWTERMINATOR = '\n'
)"
DBI::dbExecute(con, query_for_creating_table)
DBI::dbExecute(con, query_for_bulk_insert)
Load all the non-error rows and append the final row after
I have has some success in the past using the INSERT INTO
syntax. So would recommend loading the failing row using this approach.
Something like the following:
failing_row = local_df %>%
filter(condition_to_get_just_the_failing_row)
non_failing_rows = local_df %>%
filter(! condition_to_get_just_the_failing_row)
# write non-failing rows
dbWriteTable(con, SQL("schema.table"), non_failing_rows, append = T)
# prep for insert failing row
insert_query = "INSERT INTO schema.table VALUES ("
for(col in colnames(failing_row)){
value = failing_row[[col]]
if(is.numeric(value)){
insert_query = paste0(insert_query, value, ", ")
} else {
insert_query = paste0(insert_query, "'", value, "', ")
}
}
insert_query = paste0(insert_query, ");")
# insert failing row
dbExecute(con, insert_query)
other resources
If you have not seen them already, here are several related Q&A that might assist: Arabic characters, reading UTF-8, encoding to MySQL, and non-Latin characters as question marks. Though some of these are for reading data into R.