0

I am trying to do a simple merge of two datasets using sqldf in R. The code line is as follows:

data_complete <- sqldf("SELECT df1.*, df2.*
                    FROM df1 LEFT JOIN df2 
                    USING(ccn_number)")

I get the following error message: "Error: duplicate column name: zip5"

However, the variable zip5 is ONLY in df1, not df2, because I had already removed it from the latter, so there is no duplicate column name. I have tried some variations on the syntax, but continue getting the same error. I even tried deleting zip5 from a CSV version of df2 and then importing that, but still got the error. Any advice would be greatly appreciated, and thank you!

thelatemail
  • 91,185
  • 12
  • 128
  • 188
  • Please check your submission after posting. The `*`'s in your code weren't working because the code wasn't in a code block (hit the `{}` button in the editor). I've fixed it up now. – thelatemail Jun 16 '21 at 21:00
  • Can you also let us know which database backend you are using for *sqldf*? I don't get this error when using the default *SQLite* backend, even with duplicate columns. Going by a quick search I'm guessing *MySQL*? – thelatemail Jun 16 '21 at 21:02
  • 1
    Please set up a [mcve] where you `dput` a sample of `df1` and `df2` to be able to reproduce your error. Also, see [Why is SELECT * considered harmful?](https://stackoverflow.com/q/3639861/1422451). – Parfait Jun 16 '21 at 21:04
  • Thanks for the fix! Sorry, this is my first post ever- appreciate the advice on how to cordon off the code block properly. Correct, the backend is MySQL, so I'll look into how to change it. I'm not sure how to reproduce this example because I have never had this happen before. I also tried removing the offending variable and got the same error but instead for another (also unduplicated) variable. – Bijan Niknam Jun 17 '21 at 15:52
  • Check that there are not any left over tables from prior attempts in the backend database that MySQL is using. – G. Grothendieck Jun 17 '21 at 17:47
  • Thanks for the tip- not sure how to clear the backend database, though. I did try a merge with base R but got the same problem- it thinks there is a duplicate column, but there isn't. – Bijan Niknam Jun 21 '21 at 15:13

0 Answers0