So there's many problems on StackOverflow with the RODBC problem, but I haven't seen this specific variant of attempting to append to an AZURE SQL db. I really think there should be a parameter that allows you to identify the KEY and NOT try to load into that... I'd prepare a pull request, but RODBC doesn't have a dev branch on github? Anyway, I'll post my problem and what I tried to do, and then my nasty workaround.
I have my data in a table called ActDF.new
Here are the properties:
str(ActDF.new)
'data.frame': 52 obs. of 10 variables:
$ Date : Date, format: "2016-03-23" "2016-03-23" "2016-03-23" "2016-03-23" ...
$ Project : Factor w/ 1 level "x": 1 1 1 1 1 1 1 1 1 1 ...
$ IndName : Factor w/ 26 levels "x x...etc",..: 2 17 1 4 11 12 8 3 25 6 ...
$ IndNum : num 1 2 3 4 5 6 7 8 9 10 ...
$ ProjectYear : Factor w/ 2 levels "bla","blabla": 1 1 1 1 1 1 1 1 1 1 ...
$ Value : num NA NA NA NA 4883 ...
$ NoteTitle : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
$ NoteAnnotation: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
$ ID : num 1 1 1 1 1 1 1 1 1 1 ...
$ CorpCode : ch
I want to APPEND this info into a db, with a newish table called Actuals
. So I tried using RODBC::sqlSave to accomplish this. Below find the blow-by-blow:
Connect to the db
d <- "Actuals RW"
p <- "xx"
u <- "xx"
channel <- odbcConnect(d,u,p)
Get number of rows on the db (to know where the KEY should start)
PresentNum <- sqlQuery(channel, 'SELECT count(*) FROM Actuals', rows_at_time = 5)
PresentNum <- PresentNum[1,1]
OK, so let's add an ID to my DF, but also, there's tons of NAs for Value... and I don't need those, so let's convert to a better DF
## Initialize ID on this df
ActDF.new$ID <- 1
## Remove NAs from ActDF.new, and organize
toSave <- ActDF.new %>% filter(!is.na(Value)) %>%
select (ID,Date,Project,FiscalYear=ProjectYear,IndNum,IndName,CorpCode,CurrentValue=Value,NoteTitle,NoteAnnotation)
## And now issue correct numbers to the ID
toSave$ID <- (PresentNum+1):(nrow(toSave)+PresentNum)
There are many blank values, so let's convert those to NA (this is an annoying way to do it... I know)
toSave <-
apply(toSave, 2, function(x) gsub("^$|^ $", NA, x)) %>% as.data.frame()
## Now everything is a factor, convert to correct format
toSave$ID <- as.numeric(toSave$ID)
toSave$Date<- as.Date(toSave$Date)
toSave$Project<- as.character(toSave$Project)
toSave$FiscalYear<- as.character(toSave$FiscalYear)
toSave$IndNum<- as.character(toSave$IndNum)
toSave$IndName<- as.character(toSave$IndName)
toSave$CorpCode<- as.character(toSave$CorpCode)
toSave$CurrentValue<- as.numeric(toSave$CurrentValue)
toSave$NoteTitle<- as.character(toSave$NoteTitle)
toSave$NoteAnnotation <- as.character(toSave$NoteAnnotation)
### OK, we're ready! So try to append! ###
################# Test 1 TRY APPENDING DATA AS IS
sqlSave(channel, toSave, tablename = 'Actuals', append = T,
rownames = F, colnames = F, verbose = T,
safer = T, addPK = F,
fast = T, test = F)
### RETURNS ERROR: Cannot insert explicit value for identity column in table 'Actuals' when IDENTITY_INSERT is set to OFF.
Huh... fair enough...
################# Test 2 TRY TO CHANGE THE IDENTITY_INSERT PROPERTY
sqlQuery(channel, "Set IDENTITY_INSERT Actuals ON", errors = TRUE)
### RETURNS ERROR: Cannot find the object \"Actuals\" because it does not exist or you do not have permissions."
Oh... but... wait, what? The table for sure exists and I have RW rights. So maybe IDENTITY_INSERT is different somehow... what's the status anyway?
sqlQuery(channel, "SELECT OBJECTPROPERTY(OBJECT_ID('Actuals'), 'TableHasIdentity')")
### RETURNS 1.
Huh. Don't know what that means... let's try again w/ different addPK
settings
################# Test 3, try to use addPK = TRUE to see if it makes difference.
sqlSave(channel, toSave, tablename = 'Actuals', append = T,
rownames = F, colnames = F, verbose = T,
safer = T, addPK = T,
fast = T, test = F)
### RETURNS ERROR: Cannot find the object \"Actuals\" because it does not exist or you do not have permissions."
That's not annoying at all. OK FINE, let's remove the ID all-together
################# Test 4, Try to remove the ID
NoID <- toSave[,-grep("ID",names(toSave))]
sqlSave(channel, NoID, tablename = 'Actuals', append = T,
rownames = FALSE, colnames = FALSE, verbose = T,
safer = TRUE, addPK = F,
fast = T, test = F)
### RETURNS ERROR: Error in odbcUpdate(channel, query, mydata, coldata[m, ], test = test, : missing columns in 'data'
Oh really? There's missing columns???????? FINE
## So add back in a dummy column
NoID$dummy <- 0
sqlSave(channel, NoID, tablename = 'Actuals', append = T,
rownames = FALSE, colnames = FALSE, verbose = T,
safer = TRUE, addPK = F,
fast = T, test = F)
### RETURNS ERROR: Error in odbcUpdate(channel, query, mydata, coldata[m, ], test = test, : missing columns in 'data'
Setting fast=F
returns the error: length of 'dimnames' [2] not equal to array extent
OK. gg sqlSave
you win, I lose. Here is where I think a modification is in order... checking the SQL query that it's building, we see:
Query: INSERT INTO "Actuals" ( "ID", "Date", "Project", "FiscalYear", "IndNum", "CorpCode", "CurrentValue", "NoteTitle", "NoteAnnotation", "IndName" ) VALUES ( ?,?,?,?,?,?,?,?,? )
I think there needs to be some parameter where I could simply specify that query to NOT try to append to the column ID
... right?
Am I missing something?