2

I have a number of summary tables located in a mysql database. The database names are labelled as follows. I have created a list of these filenames- databasenameslist

AAA_1
AAA_2
AAA_3
AAA_4
AAA_5
AAA_6
AAA_7

The databases each contain a summary table. The names of these are as follows and are in the form of a list- querylist

AAA_1_Summary
AAA_2_Summary
AAA_3_Summary
AAA_4_Summary
AAA_5_Summary
AAA_6_Summary
AAA_7_Summary

Each table has a set of variables. I am running a random forest on the variables.

table AAA_1

   Variable1    Variable2   Variable3   Outcome
     A           1           123         100
     B           2           145         150
     C           3           167         130
     D           4           111         111
     A           5           133         123
     B           6          131.6        111

I want to import each table via an RMySQL query and a for loop. I Want to run a random forest on each table separately and the output should be written to a separate series of files.

I have tried the following code

  require(RMySQL)# Import library
  require(DBI)# import library
  require(sqldf)# import library
  require(dbConnect)
  username = 'username'
  userpass = 'userpass'
  hostname = '111.11.11.1111'

  for (i in databasenameslist){
                      cdb<-i}
 for(i in querylist){
                      Query<- i}

  cdbconn = dbConnect(MySQL(), user=username, password=userpass, 
  dbname=cdb[i], 
  host=hostname)

The above step is to connect each db separately. Now each file has to be read. For this, I am using the following code. database<-dbGetQuery(cdbconn, paste(querylist[i]))

The code above does not work.

Note: The final solution will be similar to this code: Write many files in a for loop

Ferdi
  • 540
  • 3
  • 12
  • 23
Raghavan vmvs
  • 1,213
  • 1
  • 10
  • 29

2 Answers2

1

I did not put the code for the random forest in here. I'm not familiar enough to give you an answer on that but maybe you can edit this loop to give you what you want. But here is the code to loop through your databases and summary tables and put the results of each query into a new R Data Frame. Move the code for cdbconn up with the rest of your password and username variables.

`for (i in 1:length(databasenamelist)){
     query = sprintf('SELECT * FROM %s.%s',databasenamelist[i],querylist[i])
     assign(paste0('table',databasenamelist[i]), dbGetQuery(cdbconn, query))
}`
Charlie
  • 61
  • 3
1

Consider Map to iterate through both lists element-wise since you do not need to nest the looping. Also, add SELECT * FROM to your querylist names to send appropariate SQL calls.

# IF QUERYLIST IS A CHARACTER VECTOR
querylist <- paste0("SELECT * FROM ", querylist)

# IF QUERYLIST IS A LIST
querylist <- lapply(querylist, function(q) paste0("SELECT * FROM ", q))

proc_data <- function(db, qry) {      
   conn <- dbConnect(MySQL(), user=username, password=userpass, dbname=db, host=hostname)
   df <- dbGetQuery(conn, qry)    
   dbDisconnect(conn)

   # ... DO OTHER WORK ON DF: random forest, csv output    

   return(df)    # OR WHATEVER OTHER OBJ IN PROCESSING
}

df_List <- Map(proc_data, databasenameslist, querylist)
Parfait
  • 104,375
  • 17
  • 94
  • 125