
I have a script that works fine when I run it manually in R Studio, but does not work when I run it from another program through a wrapper.

I get this info in my debug output:

[912] Error in xj[i] : only 0's may be mixed with negative subscripts 
[912] Calls: GetTopN -> cor -> is.data.frame -> [ -> [.data.frame 

If I save the image right before I get the error and then load it in R Studio I get the same error when I execute GetTopN(10). However, if I re-run the statement actionlist<- sqlQuery(channel,al_string) within R Studio and then execute GetTopN(10) everything works as it should.

I even tried to save the image within R Studio right before the critical call, and then load it through the wrapper before executing GetTopN(10) and I got the same error.

I checked and all of the relevant variables (crs,z,x,n) appear to have the proper values. I have no idea what could be the cause of this, and I'd really appreciate some help!

Here is what is being executed (in order):

#INIT:
library(RODBC)
library(stats)

channel<- odbcConnect("data")
crs<-mat.or.vec(3000,5) #will hold correlations
n1<-seq(-33,0)

#Get whole series
z <- sqlQuery(channel,"SELECT RPos,M1,M2,M3,M4 FROM `data`.`z` ")
al_string <- "SELECT RPos,OpenTime FROM z JOIN actionlist on(OpenTime = pTime)"
trim_string<- "DELETE FROM ActionList WHERE OpenTime NOT IN (SELECT OpenTime FROM ReducedList)"

GetTopN<-function(n)
{ 
  for(i in 1:nrow(actionlist))
  {
   crs[i,1]<-actionlist$OpenTime[i]
   for(j in 2:ncol(z)) 
   {
    crs[i,j]<-cor(z[actionlist$RPos[i]+n1,j],x[,j])
   }
  }
  avc <- (cbind(crs[,1],rowSums(crs[,2:5])))
  sorted <- crs[order(avc[,2], decreasing=T),1] 
  topx<- head(sorted,n)
  bottomx <- tail(sorted,n)
  DF<-as.data.frame(c(topx,bottomx),row.names=NULL) 
  colnames(DF)[1]<-'OpenTime'
  sqlSave(channel,dat=DF,tablename='ReducedList',append=F,rownames=F,safer=F) 
  sqlQuery(channel,trim_string)
}


curpTime <- 1275266400
actionlist<- sqlQuery(channel,al_string)

x<- sqlQuery(channel,paste('SELECT pTime,M1,M2,M3,M4 FROM z WHERE pTime <= ',curpTime,' AND 
pTime > ',curpTime,'-(300*34) ORDER BY pTime ASC'))

GetTopN(10)

I saved my workspace too, in case it helps (4.7 MB): workspace. If connecting to my MySQL database would help, it should be open on 74.73.17.163:3306

Mike Furlender
  • Although this wasn't your problem, it is worth noting that code written for packages that change the meaning of [i,j], e.g. data.table, can also produce this kind of error when it is run against a data.frame. This is especially likely in a situation like yours, where the code runs inside a wrapper and the required package may not have been loaded. – russellpierce Oct 31 '14 at 10:56

2 Answers


The problem: actionlist$RPos[1000] has a value of 21. n1 ranges from -33 to 0. When you add them you get an index vector with a mix of positive and negative values, which isn't allowed in subsetting.
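A minimal reproduction of that arithmetic, assuming a made-up 50-row data frame (z_demo, hypothetical) in place of the real z. The exact wording of the error message may differ between R versions, so the sketch only checks that an error is raised:

```r
# Hypothetical stand-in for z; only the shape matters here.
z_demo <- data.frame(RPos = 1:50, M1 = seq(0.5, 25, by = 0.5))
n1 <- seq(-33, 0)

# Same arithmetic as actionlist$RPos[i] + n1 with RPos[i] == 21.
idx <- 21 + n1
print(range(idx))  # -12 21

# Capture the error instead of stopping, so the script keeps running.
res <- tryCatch(z_demo[idx, 2], error = function(e) e)
if (inherits(res, "error")) print(conditionMessage(res))
```

Any RPos value small enough that RPos + min(n1) drops below 1 will run into the same problem.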

How I got there: First check traceback():

traceback()
5: `[.data.frame`(z, actionlist$RPos[i] + n1, j) at #8
4: z[actionlist$RPos[i] + n1, j] at #8
3: is.data.frame(x) at #8
2: cor(z[actionlist$RPos[i] + n1, j], x[, j]) at #8
1: GetTopN(10)

This tells me the problem is most likely in actionlist$RPos[i] + n1. Then I just added a simple print(i) statement to tell me which iteration was the problem. (Alternatively, you could probably have just checked actionlist$RPos + n1 for trouble spots manually.)
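The manual check could look something like this. It is only a sketch: actionlist_demo and its RPos values are invented for illustration, whereas in the question they come from sqlQuery().

```r
n1 <- seq(-33, 0)

# Hypothetical RPos values standing in for the real query result.
actionlist_demo <- data.frame(RPos = c(100, 21, 250, 34))

# The smallest index requested for each row is RPos + min(n1).
# Anything below 1 yields zero or negative subscripts.
lowest <- actionlist_demo$RPos + min(n1)
bad_rows <- which(lowest < 1)

print(bad_rows)                         # rows of actionlist that would fail
print(actionlist_demo$RPos[bad_rows])   # the offending RPos values
```

This finds every problem row at once, without stepping through the loop.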

joran
  • Excellent! An `if (actionlist$RPos[i]>34)` statement fixed it! Lightning fast response :) I'll use traceback() in the future. Is it possible to have R print all of the variables when an error occurs? – Mike Furlender Feb 10 '12 at 01:57
  • @MikeFurlender Not off the top of my head (but you learn quickly not to say something isn't possible in R). However, it will probably be more efficient to zero in on the trouble spot using `traceback`, and then investigate more closely using `browser` or `debug`. – joran Feb 10 '12 at 02:04
  • If you can anticipate trouble at a given point in a function, wrapping that point in `tryCatch` with an error handler that prints a bunch of values would sort of do what @Mike asked. – Carl Witthoft Feb 10 '12 at 13:23
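A rough sketch of that tryCatch idea, assuming a hypothetical helper safe_window() and made-up data (neither is from the original script). On failure the handler prints the values in play and then re-raises the error:

```r
# Hypothetical helper: do one iteration's subsetting and, on error,
# report the values involved before re-raising the original error.
safe_window <- function(df, pos, offsets, col) {
  tryCatch(
    df[pos + offsets, col],
    error = function(e) {
      message("failed for pos = ", pos, "; index range = ",
              paste(range(pos + offsets), collapse = " to "))
      stop(e)
    }
  )
}

# Made-up stand-in for z.
z_demo <- data.frame(RPos = 1:50, M1 = seq(0.5, 25, by = 0.5))
n1 <- seq(-33, 0)

ok <- safe_window(z_demo, 40, n1, 2)  # rows 7 to 40, all valid
print(length(ok))                     # 34

# pos = 21 mixes negative and positive indices, so the handler fires.
failed <- tryCatch(safe_window(z_demo, 21, n1, 2),
                   error = function(e) "error")
print(failed)
```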

For tidyverse users: dplyr gives this error if you try to subset a grouped data frame. You need to call ungroup() first.

tauft