I was given these crude database exports to fix up. In SQL
have two tables that each have three columns: ID, first name and last name. In sheet1
, the ID column is empty at the start whereas there are ID's in sheet2
. I needed to join sheet1
and sheet2
where both first and last name matched and if there was an ID present in sheet2
from the match, to set it in the ID column in sheet1
. Very simple SQL query:
UPDATE sheet1
JOIN sheet2
ON sheet1.NAME_LAST = sheet2.Last_Name
AND sheet1.NAME_FIRST = sheet2.First_Name
SET sheet1.CONSTITUENT_ID = sheet2.ID
As this is a task that needs to be done often, what I'd like to do is set up my less savvy co-workers with a simple R script. I'm still a total beginner at R however. At present, I've got sheet1
and sheet2
in two data.frames (oh-so-descriptively named book1 and book2) via the native R method: book2 = read.csv("sheet2.csv")
Through some googling it was suggested I take a look at the data.table library, but I'm still at a bit of a loss for where to start. Can someone help me translate SQL in to R?