I have a data frame called data. It looks like this:

Train Local Arrival  
   A1   Yes       1  
   A2   Yes       3  
   A3   Yes       5  
   A4    No       2  
   A5    No       3  

I get this table by doing the following:

library(data.table)  # provides fread()
library(dplyr)       # provides %>%, select(), group_by()
data <- fread(file) %>%
  select(Train, Local, Arrival) %>%
  group_by(Local)

I know that I can calculate the differences between consecutive arrival times with diff(). However, diff() does not take into account where the group changes, e.g. between A3 and A4, so it also returns a difference that spans both groups.
How can I use the function so that I get two series of differences, one for Local == "Yes" and another for Local == "No"?
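
To make the problem concrete, here is a minimal sketch that assumes the Arrival column holds the five values from the table above, in row order. The name all_diffs is only used for illustration:

```r
# Arrival values from the table above, in row order (A1 to A5)
arrival <- c(1, 3, 5, 2, 3)

# diff() knows nothing about Local, so the third value mixes
# A3 (Local == "Yes") with A4 (Local == "No")
all_diffs <- diff(arrival)
print(all_diffs)
#> [1]  2  2 -3  1
```

The -3 is the unwanted difference across the group boundary.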

Expected output:
sol_yes <- c(2, 2)
sol_no <- c(1)

CroatiaHR

1 Answer

If, as your expected output suggests, you just need a list of vectors, you could go with this:

tapply(df$Arrival, df$Local, FUN = diff)

#> $No
#> [1] 1
#> 
#> $Yes
#> [1] 2 2
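
If you want the two separate vectors named in the question, one possible sketch in base R uses split() and lapply() instead of tapply(). It assumes the same data as df (rebuilt here as a plain data.frame so the snippet runs on its own); the name diffs is only used for illustration:

```r
# Same data as df, rebuilt as a plain data.frame so the snippet is self-contained
df <- data.frame(
  Train   = c("A1", "A2", "A3", "A4", "A5"),
  Local   = c("Yes", "Yes", "Yes", "No", "No"),
  Arrival = c(1, 3, 5, 2, 3)
)

# split() cuts Arrival into one vector per Local value,
# lapply() then applies diff() to each vector separately
diffs <- lapply(split(df$Arrival, df$Local), diff)

sol_yes <- diffs[["Yes"]]  # 2 2
sol_no  <- diffs[["No"]]   # 1
```

Like tapply(), this relies on the rows already being in arrival order within each group.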

But if you need to keep the data frame, I'd suggest this:

library(dplyr)

df %>%
    group_by(Local) %>% 
    mutate(diff = Arrival - lag(Arrival, 1)) %>% 
    ungroup()

#>   Train Local Arrival  diff
#>   <chr> <chr>   <dbl> <dbl>
#> 1 A1    Yes         1    NA
#> 2 A2    Yes         3     2
#> 3 A3    Yes         5     2
#> 4 A4    No          2    NA
#> 5 A5    No          3     1
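
If you later need the two vectors from that result as well, a minimal sketch is to drop the NA rows and pull the column per group. The name with_diff is only used for illustration, and the data is rebuilt as a plain data.frame so the snippet runs on its own:

```r
library(dplyr)

# Same data as df, rebuilt here so the snippet is self-contained
df <- data.frame(
  Train   = c("A1", "A2", "A3", "A4", "A5"),
  Local   = c("Yes", "Yes", "Yes", "No", "No"),
  Arrival = c(1, 3, 5, 2, 3)
)

# Per-group difference to the previous row; the first row of each group is NA
with_diff <- df %>%
  group_by(Local) %>%
  mutate(diff = Arrival - lag(Arrival)) %>%
  ungroup()

# Keep one group, drop its leading NA, and extract the column as a vector
sol_yes <- with_diff %>% filter(Local == "Yes", !is.na(diff)) %>% pull(diff)
sol_no  <- with_diff %>% filter(Local == "No",  !is.na(diff)) %>% pull(diff)
```

Inside filter(), diff refers to the new column rather than the base function, because dplyr looks up column names first.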

Where df is:

df <- tibble::tribble(~Train, ~Local, ~Arrival,  
                      "A1", "Yes", 1,  
                      "A2", "Yes", 3,  
                      "A3", "Yes", 5,  
                      "A4", "No", 2,  
                      "A5", "No", 3)
Edo