0

I am trying to transform my data-frame from wide format to long format. I have seen many questions already posted here regarding this, but it is not quite what I am looking for / I do not see how to apply it to my problem.
The data-frames share some columns like Name, SharedVal etc. but also have columns the other dataset does not have.
What I want to achieve: Merge these two dataframes based on the UserId, but per UserID have as many rows as there are MeasureNo.
So if there have been two measurements for a user, there will be two rows with the same user id. And the rows have the same length, but some columns have different entries/no entry at all.

Example:

Dataset1:

UserID Name MeasureNo SharedVal1 SpecificVal1
     1 Anna         1         42            8
     2 Alex         1         28           50

and

Dataset2:

UserID Name MeasureNo SharedVal1 DifferentVal1
     1 Anna         2         15            99
     2 Alex         2         33            45

And they should be merged into:

UserID Name MeasureNo SharedVal1 SpecificVal1 DifferentVal1
     1 Anna         1         42            8             -
     1 Anna         2         15            -            99
     2 Alex         1         28           50             -
     2 Alex         2         33            -            45

and so on...

problem is, the dataset is huge and there are a lot of rows and columns, so I thought that somehow merging them on the id and than reshaping is the most generic approach. But I could not achieve the expected behaviour.

What I am trying to say programatically is: "Merge the two dataframes based on the userid and create a as much rows per userid as there are different times of measurement(MeasureNo). Both rows obviously have the same amount of columns. So im both rows, some values in certain columns cannot be filled.

Sorry I am new to SO and this was my best approach to visualizing a table with rows starting in a new line and the Key:Val representing a columing inside that row.

thelatemail
  • 91,185
  • 12
  • 128
  • 188
Phitje
  • 1
  • 1
  • 2
    What happened to the values for "Alex" in your output? This seems like you need just a simple "merge". It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Use `dput()` or other techniques included in the link to properly share sample data. – MrFlick Jan 09 '19 at 22:19
  • 2
    it would be difficult to help you unless you take the time to learn how to create a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Eric Jan 09 '19 at 22:19
  • I've made an attempt at making your examples more obvious. If you are struggling to make an example object exactly as it is, it's often best to just type it out as columns like R would show. Please let me know if these changes don't represent what you want. – thelatemail Jan 09 '19 at 22:39
  • thanks for the suggestion, Ill try to build an reproducible example as suggested. and thank you for styling it properly – Phitje Jan 09 '19 at 23:04

1 Answers1

1

You can do outer join:

new_df <- merge(df1, df2, all = T)
YOLO
  • 20,181
  • 5
  • 20
  • 40