1

I am trying to use a SQL instruction on R to JOIN on the nearest past date.

I have the table JOIN :

        CODE   DATECREATION      PRIX
1         4   2015-12-31           813
2         4   2015-12-31           824
3         4   2016-07-20           864
4         4   2016-07-22           318.61
5         4   2016-07-22           256.52          

and table TGA :

     CODE      DATE         TGA
1     10      2012-06-01    19.29
2     20      2012-06-01    19.29
3     21      2012-06-01    19.29
4     23      2012-06-01    19.29
5     35      2012-06-01    19.29

I want to join that dataframes by CODE and DATE. But the dates are not the same, I want all lines of table JOIN, and take the nearest date of TGA to add the column TGA on the dataframe JOIN

I tried :

TEST = sqldf("SELECT A.*,
             (SELECT TOP 1 B.TGA
              FROM dataTGA B
              WHERE B.DATEDEBUTAPPLICATION < A.DATECREATION
              ORDER BY B.DATEDEBUTAPPLICATION DESC) AS TGA 
             FROM JOIN A")

The error is :

Error in rsqlite_send_query(conn@ptr, statement) : near "1": syntax error

And I didn't add yet the condition on the codes

celianou
  • 207
  • 4
  • 25
  • This is a duplicate: https://stackoverflow.com/questions/2728999/how-to-get-top-5-records-in-sqlite – Heikki Nov 08 '17 at 14:07
  • It is not the same problem. Since I am on R and not sqlLite. Using the LIMIT instead of TOP `TEST = sqldf("SELECT A.*, (SELECT B.TGAP FROM dataTGAP B WHERE B.DATEDEBUTAPPLICATION < A.DATECREATION ORDER BY B.DATEDEBUTAPPLICATION DESC LIMIT 5 ) AS TGAP FROM JOIN A")` gives me the error `Error in rsqlite_send_query(conn@ptr, statement) : near "JOIN": syntax error` – celianou Nov 08 '17 at 14:10
  • Possible duplicate of [How to get Top 5 records in SqLite?](https://stackoverflow.com/questions/2728999/how-to-get-top-5-records-in-sqlite) – mlegge Nov 08 '17 at 14:11
  • I don't think you can use `LIMIT` in a subquery in the select clause. Let me see if I can refactor for you. – Tim Biegeleisen Nov 08 '17 at 14:11
  • mlegge not the same thing. I am not working with SqlLite but the package sqldf or R. It would be so cool Tim Biegeleisen – celianou Nov 08 '17 at 14:13
  • 1
    It sure looks to me like you're sending these queries to a SQLite engine...therefore you are using SQLite. – Tim Biegeleisen Nov 08 '17 at 14:13
  • Ok my bad, but it didn't change the fact that I can't use LIMIT in a subquery... – celianou Nov 08 '17 at 14:14
  • Please edit the SQL query so that it uses SQLite dialect. – Heikki Nov 08 '17 at 14:16
  • Not every correlated subquery can be rewritten as a join, and the one you have may not be rewritable. But you should edit your question, show some sample data, along with what your query is supposed to be doing. Maybe there is still another workaround. – Tim Biegeleisen Nov 08 '17 at 14:20
  • @TimBiegeleisen Do you have some ideas ? – celianou Nov 08 '17 at 14:48
  • I upvoted your question, this is as much as I think I can do. – Tim Biegeleisen Nov 08 '17 at 14:53
  • (1) `TOP` is not a valid keyword in SQLite which is the default backend used by `sqldf`, i.e. if you do not specify an alternate backend. You can use `limit` in SQLite and that does work in subqueries. For example, `sqldf("select * from (select * from iris limit 1)")` works. (2) `DATEDEBUTAPPLICATION` is not a column in any of the inputs. – G. Grothendieck Nov 08 '17 at 15:01
  • I tried with LIMIT in subquery but it didn't work. I tried the instruction `TEST = sqldf("SELECT A.*, (SELECT B.TGAP FROM dataTGAP B WHERE B.DATEDEBUTAPPLICATION < A.DATECREATION ORDER BY B.DATEDEBUTAPPLICATION DESC LIMIT 1) AS TGAP FROM JOIN A")` and it gives me the error `Error in rsqlite_send_query(conn@ptr, statement) : near "JOIN": syntax ` – celianou Nov 08 '17 at 15:13
  • @G.Grothendieck But the nature of the query requires a subquery in the select clause, along with `LIMIT`. This is a tough one to refactor; if we had exact data and a SQLite console, it might be possible. – Tim Biegeleisen Nov 08 '17 at 15:13
  • That is not the source of the error. There are at least two other errors in the SQL statement (1) use of an SQL keyword as a table name, (2) there is no table called `dataTGAP` in the question. – G. Grothendieck Nov 08 '17 at 15:45
  • SQLite is not very good at inner queries. Try to write your query without inner queries. And edit the question above how far you have reworked the question. – Heikki Nov 08 '17 at 19:54

0 Answers0