I am a beginner at R and I am trying to figure out what the limits of this remarkable (and sometimes nerve-wracking) program are.
Here is my problem: I have two data frames (dfs) from two different files of raw data. Both data frames have a column with ID numbers for individuals. I know how to merge the two dfs by ID. The problem is that the person who registered the ID numbers in one of the data frames typed some of them incorrectly. For example, an ID is supposed to look like this: NK-02-0028. But it was typed in like this: NK-020028.
Hence, the IDs won't match when I merge the two data frames. If the data frames had only 10 observations this wouldn't be such a big problem, but I have approx. 8000 observations in one df and 355 in the other. The correct IDs are in the df with 355 obs and the wrong ones are in the df with 8000 obs. I want to match the ID numbers in the df with 355 observations against those in the other df based on the last 4 digits, to see how many matches I get and which ones they are, or whether there are any matches at all.
Is this possible? Hopefully someone understands my problem and can help me.
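
To make the goal concrete, here is a minimal sketch of the kind of thing I have in mind, using base R only. The data are made up, and the names `df_small`, `df_big`, `ID`, `value`, `key` and `last_n` are placeholders for illustration, not my real columns. It assumes every ID ends in exactly four digits.

```r
# Hypothetical toy data standing in for the real files.
# df_small plays the role of the 355-row df with correct IDs,
# df_big the 8000-row df where some IDs are missing a hyphen.
df_small <- data.frame(ID = c("NK-02-0028", "NK-03-0115", "NK-01-0007"),
                       stringsAsFactors = FALSE)
df_big <- data.frame(ID = c("NK-020028", "NK-03-0115", "NK-049999"),
                     value = c(1.5, 2.0, 3.2),
                     stringsAsFactors = FALSE)

# Helper (hypothetical name): return the last n characters of each string.
last_n <- function(x, n = 4) {
  x <- as.character(x)
  substr(x, nchar(x) - n + 1, nchar(x))
}

# Build a matching key from the last four characters of each ID.
df_small$key <- last_n(df_small$ID)
df_big$key <- last_n(df_big$ID)

# Inner join on the key; the suffixes keep the two original ID columns apart.
matches <- merge(df_small, df_big, by = "key", suffixes = c("_small", "_big"))

nrow(matches)  # how many matches there are
matches        # which IDs were paired up
```

One caveat with this idea: the last four digits alone may not be unique (for instance NK-02-0028 and NK-03-0028 would share a key), so it is probably worth checking `anyDuplicated(df_small$key)` first. If the only typing error is the missing hyphen, a stricter alternative might be to remove all hyphens from both ID columns with `gsub("-", "", ID, fixed = TRUE)` and merge on that instead.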