0

I have a data set of Emergency Department attendances in an "un"-tidy format as below:


Date          T0    T1    T2    T3    T4    T5    T6    T7    T8
   <chr>      <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1 01/01/2010     9    13    13    19    14    10    15     7     3
 2 01/01/2011    13    13    10    16    19    11    12     8     7
 3 01/01/2012    12    17    20    12    17    13     6     6     7
 4 01/01/2013    14    15    14    18     8    11     9     3     9
 5 01/01/2014    10    20    17    15    11    14     6     8    13
 6 01/01/2015    12    20    13    13    16     7     6     7    16
 7 01/01/2016     8    19    17    15     8    15    12     7     6
 8 01/01/2017    12    13    17    11    20    10     8     8     9
 9 01/01/2018    13    15    16    18    13    11     8     8     4
10 01/01/2019     5     9    19    15    19    10     7     8    11
# ... with 3,642 more rows, and 15 more variables: T9 <dbl>,
#   T10 <dbl>, T11 <dbl>, T12 <dbl>, T13 <dbl>, T14 <dbl>,
#   T15 <dbl>, T16 <dbl>, T17 <dbl>, T18 <dbl>, T19 <dbl>,
#   T20 <dbl>, T21 <dbl>, T22 <dbl>, T23 <dbl>

The T1 - T23 stands for the hours in the day.

Ideally I would like to gather these but the gather function does not seem to allow me to gather by multiple rows.

My ideal "tidy" data would be:

Date | Time | Number of attendees


I have tried applying some of the answers from other questions but the approaches do not seem to work for me


After this I am going to recategorise the data and play around with lubridate but for now would be great if I could get it all into better columns


Any help would be greatly appreciated

BW, Jack

Ronak Shah
  • 377,200
  • 20
  • 156
  • 213

1 Answers1

0

The function gather from dplyr has been replaced by a function called pivot_longer that might be easier to understand. I think that a solution to your question could be:

library(tidyverse)

# In the future, if you supply a copy-pasteable test data like the one just below,
# it becomes much easier for someone to help you with your question :)

df <- tibble(
  date = c("01/01/2010", "01/01/2011", "01/01/2012"),
  T0 = c(1, 2, 3),
  T1 = c(4, 5, 6),
  T2 = c(7, 8, 9),
  T23 = c(10, 11, 12)
)

df 
#> # A tibble: 3 x 5
#>   date          T0    T1    T2   T23
#>   <chr>      <dbl> <dbl> <dbl> <dbl>
#> 1 01/01/2010     1     4     7    10
#> 2 01/01/2011     2     5     8    11
#> 3 01/01/2012     3     6     9    12

df %>% 
  pivot_longer(
    -date, 
    names_to = "time",
    names_prefix = "T",
    names_ptypes = list(time = double()),
    values_to = "n_attendees"
  )
#> # A tibble: 12 x 3
#>    date        time n_attendees
#>    <chr>      <dbl>       <dbl>
#>  1 01/01/2010     0           1
#>  2 01/01/2010     1           4
#>  3 01/01/2010     2           7
#>  4 01/01/2010    23          10
#>  5 01/01/2011     0           2
#>  6 01/01/2011     1           5
#>  7 01/01/2011     2           8
#>  8 01/01/2011    23          11
#>  9 01/01/2012     0           3
#> 10 01/01/2012     1           6
#> 11 01/01/2012     2           9
#> 12 01/01/2012    23          12

Created on 2020-01-28 by the reprex package (v0.3.0)

Peter H.
  • 1,995
  • 8
  • 26