I have two data sets that need to be joined, with observations duplicated so that there are no NAs, because I'm going to run a regression on the joined data. However, I'm struggling to make it work.
I currently have two data sets similar to these:
Children
Participant_id  Family_no  Score_c  Gender_c
A1              1          300      .5
B1              1          400      -.5
C1              2          500      -.5
D1              2          450      .5
E1              2          600      .5
F1              3          350      -.5
Parents
Participants_id  Family_no  Score_p  Gender_p  Q_score
A2               1          200      .5        3
B2               1          350      -.5       3.5
C2               2          300      .5        2
D2               3          250      -.5       3.9
E2               3          300      -.5       4
I would like to join them to create a data set in which each child is paired with each parent in the same family. E.g., if a family has two parents and one child, the child's data appears twice (once per parent), and vice versa; if there are two parents and two children, each child's and each parent's observation appears twice in that family. I.e., like this (the participant column is not necessary):
Participant_id  Family_no  Score_c  Score_p  Gender_c  Gender_p  Q_score
A1+A2           1          300      200      .5        .5        3
A1+B2           1          300      350      .5        -.5       3.5
B1+A2           1          400      200      -.5       .5        3
B1+B2           1          400      350      -.5       -.5       3.5
C1+C2           2          500      300      -.5       .5        2
D1+C2           2          450      300      .5        .5        2
E1+C2           2          600      300      .5        .5        2
F1+D2           3          350      250      -.5       -.5       3.9
F1+E2           3          350      300      -.5       -.5       4
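In case it helps, here is a minimal reproducible sketch of the example data and the kind of join I think I'm after. It assumes dplyr 1.1.0 or later (for the `relationship` argument of `inner_join()`), and the object names `children`, `parents` and `combined` are just placeholders for this example. I'm not sure this is the right approach for my real data:

```r
library(dplyr)
library(tibble)

# Example data copied from the tables above
children <- tribble(
  ~Participant_id, ~Family_no, ~Score_c, ~Gender_c,
  "A1", 1, 300,  0.5,
  "B1", 1, 400, -0.5,
  "C1", 2, 500, -0.5,
  "D1", 2, 450,  0.5,
  "E1", 2, 600,  0.5,
  "F1", 3, 350, -0.5
)

parents <- tribble(
  ~Participants_id, ~Family_no, ~Score_p, ~Gender_p, ~Q_score,
  "A2", 1, 200,  0.5, 3,
  "B2", 1, 350, -0.5, 3.5,
  "C2", 2, 300,  0.5, 2,
  "D2", 3, 250, -0.5, 3.9,
  "E2", 3, 300, -0.5, 4
)

# Pair every child with every parent that shares the same Family_no.
# relationship = "many-to-many" (assumes dplyr >= 1.1.0) declares that
# repeated keys on both sides are intended, so no warning is raised.
combined <- children %>%
  inner_join(parents, by = "Family_no", relationship = "many-to-many") %>%
  # Optional: build the combined label shown in the desired output
  mutate(Participant_id = paste(Participant_id, Participants_id, sep = "+")) %>%
  select(Participant_id, Family_no, Score_c, Score_p,
         Gender_c, Gender_p, Q_score)

print(combined)
```

Because `inner_join()` keeps only families present in both tables, a child without any parent in `parents` (or the reverse) would be dropped rather than filled with NAs, which seems to be what I want for the regression.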
I'd ideally like to use tidyverse but am open to other suggestions!
Thanks in advance,
Julia