
I'd like help dealing with NaN values when using summarise from the "tidyverse" package in R.

    library(tidyverse)

    smple <- tibble(
      g = c('pina', 'pina', 'arroz', 'arroz', 'arroz', 'manzana', 'manzana'),
      x = c(1, 2, NaN, 2, 3, 4, 5),
      y = c(3, 4, 3, NaN, 2, 3, 2)
    )

    # output 1: some of the sums come back as NaN
    smple %>% group_by(g) %>% summarise_all(sum)

    # output 2: the condition doesn't work
    smple %>% group_by(g) %>% summarise_if(~ !is.nan(.), sum)

    # output 3: the condition works with filter, but the results are wrong
    smple %>% group_by(g) %>% filter_all(~ !is.nan(.)) %>%
      summarize_all(sum)

What I am trying to do is skip the NaN values and aggregate the values that are available.

If you can explain why this doesn't work with summarise_if, I'd appreciate it, as it would help me understand better. I'd also like to know why I can't get the right values in output 3 (it should be x=5, y=5 for g='arroz'). I think filter_all removes the whole row if a NaN appears in at least one column, but I need to remove the NaN values from each column separately and apply the aggregation function to each one. Thanks
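The row-dropping behaviour suspected for output 3 can be checked directly. The following is a minimal sketch, assuming dplyr is installed; it uses a plain `filter()` with one condition per column as a stand-in for the `filter_all` call, and the name `kept` is only illustrative.

```r
library(dplyr)

smple <- tibble(
  g = c('pina', 'pina', 'arroz', 'arroz', 'arroz', 'manzana', 'manzana'),
  x = c(1, 2, NaN, 2, 3, 4, 5),
  y = c(3, 4, 3, NaN, 2, 3, 2)
)

# filter() works on whole rows: a row survives only if BOTH x and y are
# not NaN, so a NaN in one column also discards the valid value in the other.
kept <- smple %>% filter(!is.nan(x), !is.nan(y))

# Two of the three 'arroz' rows are gone; only x = 3, y = 2 remains,
# which is why the grouped sums come out as 3 and 2 instead of 5 and 5.
kept %>% filter(g == 'arroz')
```

Filtering is therefore the wrong level for this task: the NaN values have to be skipped inside each column's aggregation rather than by removing rows.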

  • `smple %>% group_by(g) %>% summarise_all(sum, na.rm = TRUE)` or `smple %>% group_by(g) %>% summarise(across(.fns = sum, na.rm = TRUE))` – Ronak Shah Dec 22 '20 at 06:34
  • It's basically the same as with base R functions. `sum(c(NaN, 1,2))` is NaN, so the result of `summarise` using `sum` is determined by how `sum` behaves. (So even if you use tidyverse, you may still need to look at the help pages for the base functions.) – IRTFM Dec 22 '20 at 06:44
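The two comments above can be combined into one runnable check. This is a minimal sketch, assuming dplyr 1.0 or later; the explicit column selection `c(x, y)` and the name `result` are choices made here for illustration, not taken from the comments. It also hints at why `summarise_if` did not help: its predicate is applied to each whole column to decide which columns get summarised, not to individual values.

```r
library(dplyr)

# Base R: NaN propagates through sum() unless na.rm = TRUE is passed
sum(c(NaN, 1, 2))                # NaN
sum(c(NaN, 1, 2), na.rm = TRUE)  # 3

smple <- tibble(
  g = c('pina', 'pina', 'arroz', 'arroz', 'arroz', 'manzana', 'manzana'),
  x = c(1, 2, NaN, 2, 3, 4, 5),
  y = c(3, 4, 3, NaN, 2, 3, 2)
)

# Column-wise aggregation: NaN is dropped separately inside each column's
# sum, so no rows are lost and every available value is counted.
result <- smple %>%
  group_by(g) %>%
  summarise(across(c(x, y), ~ sum(.x, na.rm = TRUE)))

result
# arroz:   x = 5, y = 5
# manzana: x = 9, y = 5
# pina:    x = 3, y = 7
```

Because `na.rm = TRUE` treats NaN like NA, the 'arroz' group gives x = 5 and y = 5, the values the question asks for.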
