In R, I have some class attendance data in a tidy data set. Here's a MWE:

library(lubridate)

students <- c("Alice", "Bob", "Alice", "Bob", "Alice", "Bob")
presences <- c("Present", "Present", "Present", "Absent", "Absent", "Present")
dates <- mdy(c("2/17/2020", "2/17/2020", "2/18/2020", "2/18/2020", "2/19/2020", "2/19/2020"))

df <- data.frame(Student=students,
                 Presence=presences, 
                 Date=dates, 
                 stringsAsFactors=FALSE) 

which produces

df

  Student Presence       Date
1   Alice  Present 2020-02-17
2     Bob  Present 2020-02-17
3   Alice  Present 2020-02-18
4     Bob   Absent 2020-02-18
5   Alice   Absent 2020-02-19
6     Bob  Present 2020-02-19

For a report, I want to produce a spreadsheet-style table where the rows are by student, the columns are by date, and the cell values are presence status. I've typed up the expected output explicitly below.

        02/17/20    02/18/20    02/19/20
Alice   Present     Present     Absent
Bob     Present     Absent      Present

How do I achieve this using R? I think my difficulty is that all the documentation I can find is for tidying data, and my goal here is essentially to untidy it.

Rob Creel
  • Yes, I think so. I've already accepted a solution, as I was kind of looking for a tidyverse solution, but this does help explain what I perhaps should have thought to search, so thank you. – Rob Creel Feb 21 '20 at 18:48
  • You can have an accepted solution and also mark it as a duplicate of another post; that doesn't discredit the answers here. There are at least 2 answers on that post that use `tidyr`, which is where within the tidyverse the relevant functions come from – camille Feb 21 '20 at 18:58

2 Answers

We can use pivot_wider from tidyr:

library(tidyr)
library(dplyr)
df %>% 
    pivot_wider(names_from = Date, values_from  = Presence)
# A tibble: 2 x 4
#  Student `2020-02-17` `2020-02-18` `2020-02-19`
#  <chr>   <chr>        <chr>        <chr>       
#1 Alice   Present      Present      Absent      
#2 Bob     Present      Absent       Present     

If 'Student' needs to become the row names instead of a column, use column_to_rownames from tibble:

library(tibble)
df %>% 
    pivot_wider(names_from = Date, values_from  = Presence) %>%
    column_to_rownames('Student')
#      2020-02-17 2020-02-18 2020-02-19
#Alice    Present    Present     Absent
#Bob      Present     Absent    Present
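
The output above uses ISO dates as column names, while the question asked for headers like 02/17/20. One possible way to get there, offered as a sketch rather than as part of the original answer, is to convert the dates to text with format() before pivoting. The example rebuilds the question's data with base as.Date (instead of lubridate::mdy) so that it is self-contained; the object name wide is only illustrative.

```r
library(tidyr)

# Rebuild the example data from the question
# (as.Date on ISO strings stands in for lubridate::mdy here)
df <- data.frame(
  Student  = c("Alice", "Bob", "Alice", "Bob", "Alice", "Bob"),
  Presence = c("Present", "Present", "Present", "Absent", "Absent", "Present"),
  Date     = as.Date(c("2020-02-17", "2020-02-17", "2020-02-18",
                       "2020-02-18", "2020-02-19", "2020-02-19")),
  stringsAsFactors = FALSE
)

# Convert the dates to mm/dd/yy text first; these strings become the column names
df$Date <- format(df$Date, "%m/%d/%y")

# Same pivot as above, now with the formatted labels as headers
wide <- pivot_wider(df, names_from = Date, values_from = Presence)
wide
```

Because the labels are plain text after format(), the columns appear in the order the dates first occur in the data, not in calendar order, so sorting df by Date before formatting is a sensible precaution for larger data sets.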
akrun
do.call(rbind, lapply(split(df, df$Student), function(x){
    with(x, setNames(data.frame(t(Presence)), Date))
}))
#      2020-02-17 2020-02-18 2020-02-19
#Alice    Present    Present     Absent
#Bob      Present     Absent    Present
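
Since the one-liner above comes without explanation, here is a step-by-step unpacking of the same base R idea, as a sketch. The helper name to_row and the intermediate objects are hypothetical, and the explicit as.character() on the dates and stringsAsFactors = FALSE are additions for clarity that the original code does not use.

```r
# Rebuild the example data from the question
df <- data.frame(
  Student  = c("Alice", "Bob", "Alice", "Bob", "Alice", "Bob"),
  Presence = c("Present", "Present", "Present", "Absent", "Absent", "Present"),
  Date     = as.Date(c("2020-02-17", "2020-02-17", "2020-02-18",
                       "2020-02-18", "2020-02-19", "2020-02-19")),
  stringsAsFactors = FALSE
)

# Step 1: split into one small data frame per student ("Alice", "Bob")
by_student <- split(df, df$Student)

# Step 2: turn each student's Presence column into a single row,
# labelling the columns with that student's dates
# (to_row is a hypothetical helper name)
to_row <- function(x) {
  setNames(data.frame(t(x$Presence), stringsAsFactors = FALSE),
           as.character(x$Date))
}
rows <- lapply(by_student, to_row)

# Step 3: stack the one-row data frames; the list names become the row names
wide <- do.call(rbind, rows)
wide
```

This approach assumes every student has a record for every date; if one is missing, the per-student rows end up with different columns and rbind fails.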
d.b