0

My sqldf set up in R uses SQLite by default. I tried the following query without success:

query =  "UPDATE t1
       SET Actual = t2.AvgRevenue,
           Total = t2.AvgRevenue
       WHERE  Name=t2.Name AND 
              Pillar= 'HW' AND
              (Status <> 'Lost') AND
              Revenue=0"

t1 = sqldf(c(query,"select * from pl0"))

t1 has columns Name, Pillar, Status, Revenue, Actual, Total t2 is a lookup table with columns Name, AvgRevenue

After doing some research, I found that SQLite does not currently support UPDATE queries involving two or more tables.

My question is this: can I do the equivalent of the query above using only R?

To get an answer, I tried the following:

test <- t1[t1$Revenue == 0 & t1$Status == 'Lost' & t1$Pillar == 'HW',]
test$Actual <- test$Name
mapvalues(test$Actual,
          t2$Name,
          t2$AvgRevenue,
          warn_missing = FALSE)
t1 <- test

but mapvalues is not updating column test$Actual as I expected. The right values of t2$AvgRevenue are output to the console, but test$Actual is not updated. By the way, I want t1 to be the same data frame as before, but with the appropriate rows in columns Actual and Total updated.

Any suggestions will be greatly appreciated!

  • If you would say actually what you wanted in natural language rather than showing failures in SQL, the R users might be of some help. I cannot tell what is source and what is destination and there are no examples to work with.. – IRTFM Apr 12 '16 at 23:30
  • Welcome to SO! You should edit to make your example [reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610), though it looks like all you need to do is assign the results of `mapvalues` to a variable. You may also like `dplyr`, whose syntax is pretty SQL-like. – alistaire Apr 12 '16 at 23:31
  • Thank you. Assigning the results of mapvalues to test$Actual worked! – Joaquin Marques Apr 13 '16 at 01:08

2 Answers2

0

You can use the dplyr library to select the variables:

    library(dplyr)
    Actual <- select(t1, Name, Pillar, Status, Revenue)
    Avg_Revenue < select(t2, Name, AvgRevenue)

    complete_data = cbind(Actual, Avg_Revenue) 

You can also use the filter:

   filter(Actual, Revenue==0, Status =="lost")  

Hope it helps

Boro Dega
  • 393
  • 1
  • 3
  • 13
  • Right. Seems overkill to reach for `dplyr` for that purpose. You can use `[` to select columns as well. The question is what his data looks like and what he actually wants to do. It appears to me that he wants to do a selective update of certain columns with values from another table, but there was too much SQL code that he said didn't work and not enough data and explanation. – IRTFM Apr 13 '16 at 00:12
  • Thank you for the feedback. Here is my data: – Joaquin Marques Apr 13 '16 at 01:09
  • My data is: t1 ---------------------------------------------------- Name Pillar Status Revenue Actual Total A SW Won 5 5 5 B HW Open 0 0 0 C HW Won 0 0 0 D SW Lost 0 0 0 t2 ------------------- Name AvgRevenue A 5 B 3 C 7 D 10 and the result I want is: t1 ---------------------------------------------------- Name Pillar Status Revenue Actual Total A SW Won 5 5 5 B HW Open 0 3 3 C HW Won 0 7 7 D SW Lost 0 0 0 Hope this helps! – Joaquin Marques Apr 13 '16 at 01:27
0

I found an answer to my question, based on R. Here it is:

t1 <- data.frame(Name=c("A","B","C","D"), 
                 Pillar=c("SW","HW","HW","SW"),
                 Status=c("Won","Open","Won","Lost"),
                 Revenue=c(5,0,0,0),
                 Actual=c(5,0,0,0),
                 Total=c(5,0,0,0))

t2 <- data.frame(Name=c("A","B","C","D"),
                 AvgRevenue=c(5,3,7,10))

t1[t1$Revenue == 0 & t1$Status != 'Lost' & t1$Pillar == 'HW',]$Actual <- 
  as.character(t1[t1$Revenue == 0 & t1$Status != 'Lost' & t1$Pillar == 'HW',]$Name)

t1[t1$Revenue == 0 & t1$Status != 'Lost' & t1$Pillar == 'HW',"Actual"] <-
  mapvalues(t1[t1$Revenue == 0 & t1$Status != 'Lost' & t1$Pillar == 'HW',"Actual"],
            t2$Name,
            t2$AvgRevenue,
            warn_missing = FALSE)

t1[t1$Revenue == 0 & t1$Status != 'Lost' & t1$Pillar == 'HW',"Total"] <- 
  as.character(t1[t1$Revenue == 0 & t1$Status != 'Lost' & t1$Pillar ==      'HW',"Name"])

t1[t1$Revenue == 0 & t1$Status != 'Lost' & t1$Pillar == 'HW',"Total"] <-
  mapvalues(t1[t1$Revenue == 0 & t1$Status != 'Lost' & t1$Pillar == 'HW',"Total"],
            t2$Name,
            t2$AvgRevenue,
            warn_missing = FALSE)

t1

The trick is to use the common key between t1 & t2 (Name) as an intermediate step to be able to use mapvalues to do the final step. This is the equivalent of the original SQL UPDATE query. Thank you very much for all your suggestions!