
I'm an absolute R newbie, and I know this should be simple, but I have spent 2 hours on it without any success.

How can I convert my dataframe from this (first 6 lines of the dataframe)

Symbol  AF  wave
CUX1    0.0975  1
CUX1    0.0337  3
CUX1    0.0217  4
LUC7L2  0.0488  1
LUC7L2  0.0515  3
LUC7L2  0.0422  4

to something like this?

Symbol  AF  wave 1  wave 2  wave 3  wave 4
CUX1    0.0975  0.0975  NA  0.0337  0.0217
LUC7L2  0.0337  0.0488  NA  0.0515  0.0422

Hi akrun,

With your advice I get something like this:

Symbol  AF  wave 1  wave 2  wave 3  wave 4
CUX1    0.0975  0.0975  NA  NA  NA
LUC7L2  0.0337  0.0337  NA  NA  NA
CUX1    0.0975  NA  0.0337  NA  NA
LUC7L2  0.0337  NA  0.0515  NA  NA
CUX1    0.0975  NA  NA  0.082   NA
LUC7L2  0.0337  NA  NA  0.0781  NA

So almost there...

marc_s

1 Answer


We first need to complete the missing 'wave' levels (here wave 2, which never occurs in the data) and then use pivot_wider to reshape from 'long' to 'wide' format.

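library(dplyr)
library(tidyr)
library(stringr)
library(data.table)  # only needed for rowid()
df1 %>% 
    # turn 1, 3, 4 into the future column names 'wave1', 'wave3', 'wave4'
    mutate(wave = str_c('wave', wave)) %>%
    # add a placeholder row (Symbol and AF are NA) for the missing 'wave2'
    complete(wave = str_c('wave', 1:4)) %>%
    # number the rows within each wave (helper id column for the pivot)
    mutate(rn = rowid(wave)) %>%
    pivot_wider(names_from = wave, values_from = AF) %>%
    # drop the placeholder row and the helper column
    filter(!is.na(Symbol)) %>%
    select(-rn)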
# A tibble: 2 x 5
#  Symbol  wave1 wave2  wave3  wave4
#  <chr>   <dbl> <dbl>  <dbl>  <dbl>
#1 CUX1   0.0975    NA 0.0337 0.0217
#2 LUC7L2 0.0488    NA 0.0515 0.0422

data

df1 <- structure(list(Symbol = c("CUX1", "CUX1", "CUX1", "LUC7L2", "LUC7L2", 
"LUC7L2"), AF = c(0.0975, 0.0337, 0.0217, 0.0488, 0.0515, 0.0422
), wave = c(1L, 3L, 4L, 1L, 3L, 4L)), class = "data.frame", row.names = c(NA, 
-6L))
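
As a possible shorter variant of the same complete-then-pivot idea, the sketch below completes over every Symbol/wave combination and lets pivot_wider build the column names through its names_prefix argument, so neither stringr nor data.table is needed. The result name `wide` is just an illustrative choice, and the sketch assumes each Symbol/wave pair occurs at most once, as in the sample data.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question
df1 <- data.frame(
  Symbol = c("CUX1", "CUX1", "CUX1", "LUC7L2", "LUC7L2", "LUC7L2"),
  AF     = c(0.0975, 0.0337, 0.0217, 0.0488, 0.0515, 0.0422),
  wave   = c(1L, 3L, 4L, 1L, 3L, 4L)
)

wide <- df1 %>%
  # add an explicit row with AF = NA for every absent Symbol/wave combination
  complete(Symbol, wave = 1:4) %>%
  # spread the waves into columns named wave1 ... wave4
  pivot_wider(names_from = wave, values_from = AF, names_prefix = "wave")

wide
```

If a data set contains further columns that differ between the rows of one Symbol, pivot_wider treats them as identifiers and keeps those rows separate, so such columns would have to be dropped or listed explicitly via id_cols first.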
akrun