How to condense a dataset with dates in R

Question

I'm trying to take some data and clean it for end user visibility, however I'm new to R and can't quite seem to figure out how to go about it. Also, this is my first post, so please let me know if there are any formatting or structural issues to the way I wrote this question.

What the data looks like right now:

name	date	reason
john	1/1/2022	late
john	1/2/2022	late
john	1/4/2022	absent
betty	1/3/2022	absent
betty	1/5/2022	no call
betty	1/7/2022	no call
kyle	1/3/2022	absent
kyle	1/5/2022	no call
kyle	1/7/2022	no call

I want to see if there's a way to condense this so that for every name, you have the date and reason all on the same line. Like this:

name	date1	reason1	date2	reason2	date3	reason3
john	1/1/2022	late	1/2/2022	late	1/4/2022	absent
betty	1/3/2022	absent	1/5/2022	no call	1/7/2022	no call
kyle	1/3/2022	absent	1/5/2022	no call	1/7/2022	no call

Alternatively, I tried using dcast, but my code produced numbers instead of dates.

new db <- dcast(db, name ~ reason, fun.aggregate = list, value.var = "date")

What I wanted:

name	late	absent	no call
john	1/1/2022,1/2/2022	1/4/2022
betty		1/3/2022	1/5/2022,1/7/2022
kyle		1/3/2022	1/5/2022,1/7/2022

What I got:

name	late	absent	no call
john	c(1620708300,1627236300)	1639328820	numeric(0)
betty	numeric(0)	1612973940	c(1611937080, 1612455480)
kyle	numeric(0)	1639329540	c(1635526800, 1639760400)

EDIT:

I was able to get it inches away from where I need it to be using as.data.frame(pivot_wider(df, names_from=reason, values_from=date, values_fn=list, values_fill=list(""))) thanks to @Andre Wildberg, the last step that I need is to remove the c() from the cells and be able to display clean dates in those fields.

db<-structure(list(name = c("Debby", "Debby", "Debby", 
"Debby", "Robert", "Robert", "Robert", 
"Ryan", "Ryan", "Ryan", "Ryan", 
"Ryan", "Ryan", "Brandon", "Brandon"
), reason = c("Absent", "Leave Early", "Late", "Leave Early", 
"Leave Early", "Leave Early", "Absent", "Absent", "Absent", "Absent", 
"Absent", "Leave Early", "Late", "Leave Early", "Leave Early"
), date = c("2021-05-11 04:45:00", "2021-05-15 04:02:00", "2021-07-25                     
18:05:00", 
"2021-09-19 20:01:00", "2021-11-25 01:02:00", "2021-12-08 20:56:00", 
"2021-12-16 17:30:00", "2021-10-09 17:00:00", "2021-11-07 17:00:00", 
"2021-11-12 17:00:00", "2021-11-28 17:00:00", "2021-12-11 01:31:00", 
"2021-12-12 17:07:00", "2021-05-03 23:58:00", "2021-05-15 23:31:00"
)), row.names = c(NA, -15L), class = c("tbl_df", "tbl", "data.frame"
))

Does this answer your question? [How to reshape data from long to wide format](https://stackoverflow.com/questions/5890584/how-to-reshape-data-from-long-to-wide-format) — NelsonGon, Jan 12 '22 at 21:35

Andre Wildberg · Accepted Answer · 2022-01-12T23:18:26.910

0

Try this if you want to keep the observations combined

library(tidyr)

as.data.frame(pivot_wider(df, names_from=reason, values_from=date, 
  values_fn=list, values_fill=list("")))
     name
1   Debby
2  Robert
3    Ryan
4 Brandon
                                                                              Absent
1                                                                2021-05-11 04:45:00
2                                                                2021-12-16 17:30:00
3 2021-10-09 17:00:00, 2021-11-07 17:00:00, 2021-11-12 17:00:00, 2021-11-28 17:00:00
4                                                                                   
                               Leave Early                Late
1 2021-05-15 04:02:00, 2021-09-19 20:01:00 2021-07-25 18:05:00
2 2021-11-25 01:02:00, 2021-12-08 20:56:00                    
3                      2021-12-11 01:31:00 2021-12-12 17:07:00
4 2021-05-03 23:58:00, 2021-05-15 23:31:00

Data

df <- structure(list(name = c("Debby", "Debby", "Debby", "Debby", "Robert", 
"Robert", "Robert", "Ryan", "Ryan", "Ryan", "Ryan", "Ryan", "Ryan", 
"Brandon", "Brandon"), reason = c("Absent", "Leave Early", "Late", 
"Leave Early", "Leave Early", "Leave Early", "Absent", "Absent", 
"Absent", "Absent", "Absent", "Leave Early", "Late", "Leave Early", 
"Leave Early"), date = c("2021-05-11 04:45:00", "2021-05-15 04:02:00", 
"2021-07-25 18:05:00", "2021-09-19 20:01:00", "2021-11-25 01:02:00", 
"2021-12-08 20:56:00", "2021-12-16 17:30:00", "2021-10-09 17:00:00", 
"2021-11-07 17:00:00", "2021-11-12 17:00:00", "2021-11-28 17:00:00", 
"2021-12-11 01:31:00", "2021-12-12 17:07:00", "2021-05-03 23:58:00", 
"2021-05-15 23:31:00")), row.names = c(NA, -15L), class = c("tbl_df", 
"tbl", "data.frame"))

edited Jan 12 '22 at 23:18

answered Jan 12 '22 at 21:44

Andre Wildberg

12,344
3
12
29

This is making my dates read as "c(1637802120, 1638996960)" , is there a way to format these as dates at this point? – Quinn Langton Jan 12 '22 at 21:52
@QuinnLangton You can prevent this by storing the date as character. – Andre Wildberg Jan 12 '22 at 21:56
@QuinnLangton You can alternatively do `dput(head(data,10))` and update the question. I can then take a closer look! – Andre Wildberg Jan 12 '22 at 21:58
Updated! Though, I used the actual sheet I'm using, not the examples. So many more dates on the actual sheet. – Quinn Langton Jan 12 '22 at 22:31
@QuinnLangton Hm, something must have gone wrong while using `dput`. It should have `structure(` at the beginning like in my example in the Data section. – Andre Wildberg Jan 12 '22 at 22:36
I think that was just a me being new and not knowing what I'm doing thing. Should be fixed now. – Quinn Langton Jan 12 '22 at 22:42
@QuinnLangton You sure this is the right dataset? Has no column `reason`. Looks like its already in wide format. – Andre Wildberg Jan 12 '22 at 22:48
So sorry! I sent my `dput` after I already ran your code, this is what I have before. – Quinn Langton Jan 12 '22 at 23:05
@QuinnLangton I ran the same line on the updated `df` and it seems to be ok. See edit. I manually corrected the newline `\n` though. The dates are already in character form, so that should also be ok! – Andre Wildberg Jan 12 '22 at 23:21

How to condense a dataset with dates in R

1 Answers1

Data