0

I have multiple Excel sheets of a dataset that I need to align. Simplifying, it looks like this (this is just a simplification, in fact it has hundreds of columns and rows and there are multiple sheets of data): enter image description here As you can see, in the first sheet every patient has age, however in sheet n° 2 only patients 1, 3 and 4 have data on sex (and also those with sex missing are not present in the sheet). The result I want to have is in the third image so that the patients with sex missing are still reported but with blank. So I want to perform the alignment based on PATIENT NUMBER. I want to know if there are packages or rapid ways to do it.

user19745561
  • 145
  • 10

1 Answers1

1

Merge them with a left join:

library(tidyverse)

a <- tibble(
  patient = 1:5,
  age = c(22, 30, 31, 50, 60)
)

b <- tibble(
  patient = c(1, 3, 4),
  sex = c(0, 1, 0)
)

left_join(a, b, join_by(patient))
#> # A tibble: 5 × 3
#>   patient   age   sex
#>     <dbl> <dbl> <dbl>
#> 1       1    22     0
#> 2       2    30    NA
#> 3       3    31     1
#> 4       4    50     0
#> 5       5    60    NA

Created on 2023-03-09 with reprex v2.0.2

dufei
  • 2,166
  • 1
  • 7
  • 18