
I merged two dataframes in R; both have id and date columns. I decided to merge by id, and after merging there are two date columns: one is date.x and the other is date.y. But I only need one date column.

I have searched online for solutions but found none.

Yomson
  • 1
    Hello Yomson, please share the code you used and a sample of your data, so your question can be reproduced and answered. – Ric Dec 06 '22 at 19:24
  • 2
    That is because the `date` column is present in both dataframes, and when merging you need a way to differentiate them, so the first one gets the suffix `.x` and the second one `.y`. I guess you can just drop one and rename the remaining column to `date`? Or, before merging, drop the `date` column from one of the dataframes. – cucurbit Dec 06 '22 at 19:25
  • 2
    If the date column data necessarily match, another solution would be to join by both id and date. – Jon Spring Dec 06 '22 at 19:29
  • 1
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Dec 06 '22 at 19:49
  • 1
    A reproducible example of your problem would be very nice, as the others have already stated. Nevertheless, I think you should explore the mutating joins of 'dplyr'. As cucurbit wrote, with these functions you can define a suffix for duplicate columns after merging or even drop them: https://dplyr.tidyverse.org/reference/mutate-joins.html – fbeese Dec 06 '22 at 20:37
  • The dataframe is like this:
    **data1**: Id, date, first_test, second_test, exam_score
    **data2**: Id, date, grade, graduate_school
    ```{r}
    new_data <- merge(data1, data2, by = "Id", all = TRUE) %>% drop_na()
    ```
    When I run the code it gave me new_data with these columns:
    **new_data**: Id, *date.x*, first_test, second_test, exam_score, *date.y*, grade, graduate_school – Yomson Dec 07 '22 at 09:14
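The suggestions in the comments above can be sketched in base R as follows. This is a minimal illustration, not a definitive fix: the column names come from the question, but the sample values are invented, and the object names `by_both`, `drop_first`, and `merged` are hypothetical. Option 1 assumes the dates agree for each `Id`; if they do not, a full join on both keys produces unmatched rows instead of combined ones.

```r
# Hypothetical sample data: column names follow the question, values are made up.
data1 <- data.frame(
  Id = c(1, 2, 3),
  date = as.Date(c("2022-01-10", "2022-01-11", "2022-01-12")),
  first_test = c(70, 65, 80),
  second_test = c(75, 60, 85),
  exam_score = c(72, 68, 90)
)
data2 <- data.frame(
  Id = c(1, 2, 3),
  date = as.Date(c("2022-01-10", "2022-01-11", "2022-01-12")),
  grade = c("B", "C", "A"),
  graduate_school = c("yes", "no", "yes")
)

# Option 1: join on both Id and date, so date is a key and is not duplicated.
# Only appropriate when the dates match for the same Id.
by_both <- merge(data1, data2, by = c("Id", "date"), all = TRUE)

# Option 2: drop date from one data frame before merging on Id alone.
data2_no_date <- data2[, names(data2) != "date", drop = FALSE]
drop_first <- merge(data1, data2_no_date, by = "Id", all = TRUE)

# Option 3: merge on Id, then rename date.x to date and remove date.y.
merged <- merge(data1, data2, by = "Id", all = TRUE)
names(merged)[names(merged) == "date.x"] <- "date"
merged$date.y <- NULL

names(merged)
```

Which option fits depends on the data: if the two date columns can legitimately differ for the same `Id`, options 2 and 3 silently discard one of them, so it is worth deciding which date is the one to keep.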

0 Answers