-1

I have two tables of customer and order,I want to filter out the customer_id which meets the demand of step1 and step2,while when I do the step2.5,the console shows

Error: Cannot pass NA to dbQuoteIdentifier()
In addition: Warning message:
In field_types[] <- field_types[names(data)] :
  number of items to replace is not a multiple of replacement length

Table

step1<- sqldf("select * from customer_table as ct inner join order_table as ot ON ct.customer_id=ot.customer_id where order_date<20161222 and order_amount=1 group by ct.customer_id;")

step2<- sqldf("select ot.customer_id from  order_table as ot 
              where order_date between 20161222 and 20170222
              and order_amount=0
              group by ot.customer_id;")

step2.5<- sqldf("select * from step1 as s1 inner join step2 as s2 on s1.customer_id=s2.customer_id; ")

Someone could help?Thanks

Jeff Hoffman
  • 71
  • 1
  • 2
  • 9
  • 2
    Your queries are rife with errors. Please explain what you are trying to do, and ideally give us some sample table data. Usually it makes little sense to be doing a `GROUP BY` without also selecting one or more aggregates of columns. Your `step2.5` is almost certain to fail; you can't just refer to an R query result set like that. – Tim Biegeleisen Aug 24 '17 at 17:19
  • @TimBiegeleisen because I want to filter out the – Jeff Hoffman Aug 24 '17 at 17:35
  • Same customer_id which step1 and step2 have – Jeff Hoffman Aug 24 '17 at 17:35
  • 2
    Please provide some sample data frames with the minimum number of rows to reproduce the problem. so that anyone else can run it. See [mcve]. – G. Grothendieck Aug 24 '17 at 17:36
  • https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Hack-R Aug 24 '17 at 17:48

1 Answers1

1

I am not able to replicate any error. I've made a few improvements to the SQL, but if this does not solve your problem, please provide your data in a reproducible format within your question.

data(iris)
customer_table <- iris
order_table    <- iris

customer_table$customer_id <- 1:nrow(iris)
order_table$customer_id    <- 1:nrow(iris)

customer_table$order_amount <- 1
order_table$order_amount    <- 0
order_table$order_date      <- rep(c(20161221, 20161223))

step1 <- sqldf("select ct.* 
                from customer_table ct 
                join order_table ot on 
                ct.customer_id=ot.customer_id 
                where ot.order_date < 20161222 
                and ct.order_amount=1
                group by ct.customer_id")

step2 <- sqldf("select customer_id 
                from order_table 
                where order_date 
                between 20161222 and 20170222
                and order_amount=0
                group by customer_id")

step2.5 <- sqldf("select * from step1 s1 
                 join step2 s2 
                 on s1.customer_id=s2.customer_id")

This proof of concept creates a table without error. That table correctly has 0 rows using this example data.

Hack-R
  • 22,422
  • 14
  • 75
  • 131
  • 1
    Thanks and I will add my data later – Jeff Hoffman Aug 24 '17 at 17:49
  • This is what I want to do:Select out customers in the customer table, who only made one purchase before 2016/12/22; For users in step 1, filter out users who didn’t purchase anything between 2016/12/22 and 2017/02/22, aka this group of users go dormant for 3 months. – Jeff Hoffman Aug 24 '17 at 17:54
  • @JeffHoffman I don't see why there would be an error. It really must be your data. Please visit the link in my comment on your question to see what reproducible forms of data are required for the R tag, for instance dput() – Hack-R Aug 24 '17 at 18:44
  • Since my data is very large (about 2G) ,do you know how could I attach it here? – Jeff Hoffman Aug 24 '17 at 18:53
  • @JeffHoffman Big data and secret data are an extremely common situation here. That's why they encourage you to make up data within the code of your question that's comparable in the ways that matter but minimal and reproducible. Alternately you can just take a small sample. – Hack-R Aug 24 '17 at 19:27