
I have this data set:

var_1 = rnorm(1000,1000,1000)
var_2 = rnorm(1000,1000,1000)
var_3 = rnorm(1000,1000,1000)

sample_data = data.frame(var_1, var_2, var_3)

I broke this data set into groups of 100 rows, thus creating 10 mini datasets:

list_of_dfs <- split(
  sample_data, (seq(nrow(sample_data))-1) %/% 100
)



table_names <- paste0("sample_", 1:10)

I now want to upload these 10 mini datasets to an SQL server:

library(odbc)
library(DBI)
library(RODBC)
library(purrr)

#establish connection
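# A hypothetical connection for illustration only; the original post never
# shows how `connection` is created. Adjust the driver, server, database,
# and credentials to your environment:
connection <- dbConnect(
  odbc::odbc(),
  Driver   = "ODBC Driver 17 for SQL Server",
  Server   = "your_server_name",
  Database = "your_database_name",
  Trusted_Connection = "Yes"
)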

map2(
  table_names, 
  list_of_dfs, 
  function(x,y) dbWriteTable(connection, x, y)
)

The problem is that one of these mini datasets (e.g. sample_6) is not accepted by the SQL server and produces this error:

Error in result_insert_dataframe(rs@prt, values):  nanodbc/nanodbc.cpp:1587 : HY008 : Operation canceled

This means that "sample_1", "sample_2", "sample_3", "sample_4", "sample_5" were all successfully uploaded - but since "sample_6" was rejected, "sample_7", "sample_8", "sample_9" and "sample_10" were never attempted.

  • Is there a way to "override" this error and ensure that if one of these "sample_i" data sets is rejected, R will skip it and attempt to upload the remaining datasets?

If I were to do this manually, I could just "force" R to skip over the problem data set. For example, imagine if "sample_2" was causing the problem:

dbWriteTable(connection, SQL("sample_1"), sample_1)

dbWriteTable(connection, SQL("sample_2"), sample_2)
    Error in result_insert_dataframe(rs@prt, values):  nanodbc/nanodbc.cpp:1587 : HY008 : Operation canceled

dbWriteTable(connection, SQL("sample_3"), sample_3)

In the above code, "sample_1" and "sample_3" are successfully uploaded even though "sample_2" was causing a problem.
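
A minimal sketch of this manual skip, assuming the same `connection` and `list_of_dfs` from above: wrapping each call in `tryCatch` lets a failing table report its error without stopping the script.

for (i in seq_along(list_of_dfs)) {
  tbl <- paste0("sample_", i)
  tryCatch(
    # attempt the upload for this table only
    dbWriteTable(connection, SQL(tbl), list_of_dfs[[i]]),
    # on failure, report and move on to the next table
    error = function(e) message("Skipping ", tbl, ": ", conditionMessage(e))
  )
}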

  • Is it possible to override these errors when "bulk-uploading" the datasets?

Thank you!

stats_noob
  • The only way for anything to continue past an error is to wrap the risky code in `try` or `tryCatch`. Have you tried either of those? (I really think you should look into why SQL Server is killing the upload or why `odbc` (using `nanodbc`) is failing to upload. If the data is all homogeneous, then there should be no logical problems with uploading them all.) – r2evans Mar 10 '22 at 18:54
  • wrap it around try function: `map2(table_names, list_of_dfs, ~try(dbWriteTable(connection, .x, .y)))` – Onyambu Mar 10 '22 at 18:54
  • Another thought, though: is there a reason you must upload them separately? One could always combine them into one frame (using `do.call(rbind, ..)` or `data.table::rbindlist` or `dplyr::bind_rows`) and upload *once*. – r2evans Mar 10 '22 at 18:55
  • I now see your [other question](https://stackoverflow.com/questions/71417218/splitting-a-dataset-into-arbitrary-sections) asking to split the data in the first place ... I have to ask, **Why?** If it's all going to the same table, why split them? I saw no rationale for this in that question. – r2evans Mar 10 '22 at 18:56
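
Building on the `try` suggestion in the comments, here is a fuller sketch that skips failing tables and records the outcome of each upload (again assuming `connection` is a live DBI connection):

results <- map2(
  table_names,
  list_of_dfs,
  function(x, y) {
    tryCatch(
      {
        # attempt the upload; return a success marker if it completes
        dbWriteTable(connection, SQL(x), y)
        "uploaded"
      },
      # on failure, capture the error message instead of aborting
      error = function(e) paste("failed:", conditionMessage(e))
    )
  }
)

# Named summary of which tables succeeded or failed:
setNames(results, table_names)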
