Apologies in advance for the basic question; I am an R beginner. I need to reshape my long-format dataset into wide format, turning a categorical variable into binary (0/1) indicator columns. I checked the previous similar posts, but none of them seems to help. My dataset looks like this:

ID   Disease
1    Measles
1    Measles
1    Pox
2    Measles
2    Pox
2    Chicken Pox
3    Pox 
3    Pox
3    Chicken Pox

And I would like the output to look like this:

ID    Measles     Pox     Chicken Pox
1        1         1            0
2        1         1            1
3        0         1            1

Does anybody have an idea of how I can do that? Thank you so much for your help. I am grateful.

Maël
steffie22

1 Answer

You can use table() to cross-tabulate ID against Disease, convert the result to a data frame, and cap any count above 1 at 1:

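# Cross-tabulate ID against Disease (the IDs end up as row names)
df <- as.data.frame.matrix(table(df))
# Cap counts above 1 so each cell is a 0/1 indicator
df[df > 1] <- 1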

  ChickenPox Measles Pox
1          0       1   1
2          1       1   1
3          1       0   1
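
For anyone who wants to try this end to end, here is a minimal base R sketch of the same table() idea. The sample data is retyped from the question (its exact column types are an assumption), and the names counts and wide are only illustrative. It also moves the IDs out of the row names into a regular ID column, which the snippet above does not do.

```r
# Sample data retyped from the question (column types are an assumption)
df <- data.frame(
  ID = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  Disease = c("Measles", "Measles", "Pox",
              "Measles", "Pox", "Chicken Pox",
              "Pox", "Pox", "Chicken Pox")
)

# Cross-tabulate ID by Disease, then turn the counts into 0/1 flags
counts <- table(df$ID, df$Disease)
wide <- as.data.frame.matrix((counts > 0) * 1L)

# table() leaves the IDs in the row names; move them into a regular column
wide <- cbind(ID = as.numeric(rownames(wide)), wide)
rownames(wide) <- NULL
wide
```

Comparing with > 0 instead of overwriting values above 1 is just a design preference here; both give the same indicators for count data.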

Or, using tidyr::pivot_wider:

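library(dplyr)
library(tidyr)
df %>% 
  # Drop repeated ID/Disease pairs so each count is at most 1
  distinct(ID, Disease) %>% 
  # One column per disease; combinations that never occur are filled with 0
  pivot_wider(names_from = "Disease", values_from = "Disease", 
              values_fn = list(Disease = length), values_fill = 0)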

# A tibble: 3 x 4
     ID Measles   Pox ChickenPox
  <int>   <int> <int>      <int>
1     1       1     1          0
2     2       1     1          1
3     3       0     1          1
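
A possible variant of the pivot_wider approach, sketched here as an alternative rather than as the method above: add an explicit indicator column and pivot on that, so no values_fn is needed. The column name present and the result name wide are hypothetical, the sample data is retyped from the question, and a scalar values_fill assumes tidyr 1.1.0 or later.

```r
library(dplyr)
library(tidyr)

# Sample data retyped from the question (column types are an assumption)
df <- tibble(
  ID = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
  Disease = c("Measles", "Measles", "Pox",
              "Measles", "Pox", "Chicken Pox",
              "Pox", "Pox", "Chicken Pox")
)

wide <- df %>%
  # Keep one row per ID/Disease pair
  distinct(ID, Disease) %>%
  # Hypothetical helper column: 1 marks that the pair occurs
  mutate(present = 1L) %>%
  # Spread diseases into columns; pairs that never occur become 0
  pivot_wider(names_from = Disease, values_from = present, values_fill = 0L)

wide
```

The new columns follow the order in which each disease first appears in the data, so the result is not sorted alphabetically the way the table() output is.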
Maël