
I have an issue that should be straightforward with dplyr (I think) but I can't seem to find a resolution.

My dataframe comprises numbers and factors. Each observation is represented by two rows which have either a value or NA in one of two columns (Agg_Entropy and Av_Amplitude). I want to combine each observation's rows into a single row (without summarising), so that the NAs are replaced with the relevant values.

A simple excerpt of the dataframe:

 Selection Low   High    Agg_Entropy Av_Amplitude Filename                  
  <fct>     <fct> <fct>         <dbl>        <dbl> <fct>                     
1 1         368.2 13747.8       NA           -17.5 20180110_182800_Sunset.wav
2 1         368.2 13747.8        5.62         NA   20180110_182800_Sunset.wav
3 2         142   13926.3       NA           -17.4 20180110_182800_Sunset.wav
4 2         142   13926.3        5.96         NA   20180110_182800_Sunset.wav

What I want:

 Selection   Low    High Agg_Entropy Av_Amplitude                   Filename
1         1 368.2 13747.8       5.623        -17.5 20180110_182800_Sunset.wav
2         2 142.0 13926.3       5.958        -17.4 20180110_182800_Sunset.wav

Any help is very much appreciated. Thank you!

1 Answer


Group by 'Selection', 'Filename', 'Low', and 'High', then summarise the remaining columns by dropping their NA elements with na.omit. This assumes that each group has exactly one non-NA element in each of those columns, so every group collapses to a single row.

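library(tidyverse)

# Group on the columns that identify an observation, then keep the
# non-NA value of every remaining column within each group.
# (summarise_all() still works but is superseded by across() in dplyr >= 1.0.0.)
df1 %>%
  group_by(Selection, Filename, Low, High) %>%
  summarise_all(na.omit)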
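
For anyone who wants to try this end to end, here is a minimal sketch, not a definitive implementation. The data frame is rebuilt by hand from the excerpt in the question, with the two entropy values taken from the desired output. The name `res` is hypothetical. Instead of na.omit it uses a swapped-in variation that takes the first non-NA value per column via across(), which assumes dplyr 1.0.0 or later; a group with no non-NA value then gets NA in that column.

```r
library(dplyr)

# Toy data rebuilt from the question's excerpt (an assumption, not the
# asker's real data frame); df1 is the name used in the answer.
df1 <- tibble(
  Selection    = factor(c(1, 1, 2, 2)),
  Low          = factor(c("368.2", "368.2", "142", "142")),
  High         = factor(c("13747.8", "13747.8", "13926.3", "13926.3")),
  Agg_Entropy  = c(NA, 5.623, NA, 5.958),
  Av_Amplitude = c(-17.5, NA, -17.4, NA),
  Filename     = factor(rep("20180110_182800_Sunset.wav", 4))
)

# Collapse each observation's two rows into one: for every non-grouping
# column, keep the first non-NA value (NA if the group has none).
res <- df1 %>%
  group_by(Selection, Filename, Low, High) %>%
  summarise(across(everything(), ~ .x[!is.na(.x)][1]), .groups = "drop")

print(res)
```

Taking the first non-NA value rather than calling na.omit is a design choice: the result is always exactly one row per group, even if the one-value-per-group assumption does not hold for some observation.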
akrun