
I just started using the R package RMySQL to get around some memory limitations on my computer. I am trying to take a matrix with 100 columns in R (called data.df) and then create a new table in a MySQL database with "100 choose 2" (= 4950) columns, where each column is a linear combination of two columns from the initial matrix. So far I have something like this:

countnumber <- 1
con <- dbConnect(MySQL(), user = "root", password = "password", dbname = "myDB")

# First pair: name the column after the two source columns and create the table
temp <- as.data.frame(data.df[, 1] - data.df[, 2])
colnames(temp) <- paste(pairs[[countnumber]][1], pairs[[countnumber]][2], sep = "")
dbWriteTable(con, "spreadtable", temp, row.names = TRUE, overwrite = TRUE)

# Remaining pairs
for (i in 1:(n - 1)) {
    for (j in (i + 1):n) {
        if (!((i == 1) && (j == 2))) {    # skip the first pair, already written above
            temp <- as.data.frame(data.df[, i] - data.df[, j])
            colnames(temp) <- "hola"      # placeholder column name
            dbWriteTable(con, "spreadtable", value = temp, append = TRUE, overwrite = FALSE, row.names = FALSE)
            countnumber <- countnumber + 1
        }
    }
}
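For reference, "pairs" and "n" above are set up along these lines (a sketch, with dummy data standing in for the real data.df):

```r
# Dummy stand-in for the real 100-column matrix
data.df <- as.data.frame(matrix(rnorm(10 * 100), ncol = 100))
n <- ncol(data.df)

# All unordered column pairs: choose(100, 2) = 4950 of them
pairs <- combn(colnames(data.df), 2, simplify = FALSE)
length(pairs)  # 4950
```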

I've also tried toying around with the "field.types" argument of RMySQL::dbWriteTable(), which was suggested at "RMySQL dbWriteTable with field.types". Sadly it hasn't helped me out much.

Questions:

  1. Is making your own sql database a valid solution to the memory-bound nature of R, even if it has 4950 columns?
  2. Is dbWriteTable() the proper function to be using here?
  3. Assuming the answer is "yes" to both of the previous questions...why isn't this working?

Thanks for any help.



[EDIT]: code with error output:

names <- as.data.frame(index)
names <- t(names)
# dim(names) is 1 409
con <- dbConnect(MySQL(), user = "root", password = "password", dbname = "taylordatabase")

dbGetQuery(con, dbBuildTableDefinition(MySQL(), name = "spreadtable", obj = names, row.names = FALSE))
# I would prefer these to be double types with 8 decimal places instead of text

temp <- as.data.frame(data.df[, 1] - (ratios[countnumber] * data.df[, 2]))
temp <- t(temp)
temp <- as.data.frame(temp)
# dim(temp) is 1 409
dbWriteTable(con, name = "spreadtable", temp, append = TRUE)
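The field.types attempt mentioned earlier looked roughly like this (a sketch; the column names below are placeholders, and the commented dbWriteTable() call assumes an open connection con):

```r
# Stand-in for the one-row data frame of spreads built above
temp <- data.frame(A_B = 0.12345678, A_C = -0.50000000)

# Declare every column as DOUBLE instead of letting RMySQL guess "text"
types <- setNames(rep("double", ncol(temp)), colnames(temp))

# Requires an open connection `con`:
# dbWriteTable(con, "spreadtable", temp,
#              field.types = types, row.names = FALSE, overwrite = TRUE)
```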

The table is created successfully in the database (I will change the variable types later), but the dbWriteTable() line produces the error:

Error in mysqlExecStatement(conn, statement, ...) : 
RS-DBI driver: (could not run statement: Unknown column 'row_names' in 'field list')
[1] FALSE
Warning message:
In mysqlWriteTable(conn, name, value, ...) : could not load data into table

If I make a slight change, I get a different error message:

dbWriteTable(con, name = "spreadtable", temp, append = T, row.names = F)

and

Error in mysqlExecStatement(conn, statement, ...) : 
RS-DBI driver: (could not run statement: Unknown column 'X2011_01_03' in 'field list')
[1] FALSE
Warning message:
In mysqlWriteTable(conn, name, value, ...) : could not load data into table

I just want to use "names" as a bunch of column labels (they were initially dates); the actual data should come from "temp".
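For what it's worth, the mangled column in the second error looks like R's own name sanitization at work (a guess on my part):

```r
# A date header like "2011-01-03" is not a syntactic R name, so
# data.frame()/make.names() silently rewrites it:
make.names("2011-01-03")  # "X2011.01.03"

# RMySQL then maps the remaining "." to "_" when it builds the INSERT
# statement, which would produce the "X2011_01_03" in the error above.
```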


1 Answer


Having a table with 4950 rows is OK; the problem is which columns you need.

If you always "SELECT *", you will eventually exhaust your system memory (in the case that the table has 100 columns).

Why not give us the error message if you have encountered any problems?
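A minimal sketch of that row-oriented layout (the function and column names here are just illustrative): store one row per (pair, observation) with three columns, instead of one column per pair, and then select only the pairs you need.

```r
# Build the 4950 spreads as rows of one long data frame:
# one row per (pair, observation), three columns total.
pair_spreads <- function(df) {
  idx <- combn(ncol(df), 2)
  do.call(rbind, lapply(seq_len(ncol(idx)), function(k) {
    i <- idx[1, k]
    j <- idx[2, k]
    data.frame(pair  = paste(colnames(df)[i], colnames(df)[j], sep = "_"),
               obs   = seq_len(nrow(df)),
               value = df[, i] - df[, j])
  }))
}

# Then a single append-by-rows write (assumes an open connection `con`):
# dbWriteTable(con, "spreadtable", pair_spreads(data.df),
#              append = TRUE, row.names = FALSE)
```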

  • That makes sense--index the columns by time. Now I'm trying to figure out how to do this from within R. The "append" probably works better when you're trying to append rows instead of columns. – Taylor Aug 16 '12 at 18:20