Creating wide data that has only 1 ID column

Question

I have a data frame that looks like this:

  ID Code_Type Code date 
   1   10        4    1
   1    9        5    2
   2    10       6    3
   2    9        7    4

and I would like it to look like this:

  ID date.1 date.2  9  10
   1   1        2   5  4
   2   3        4   7  6

Where the different dates have different columns on the same row.

My current code is this:

#Example df
df <- data.frame("ID" = c(1,1,2,2), 
                 "Code_Type" = c(10,9,10,9), 
                 "Code" = c(4,5,6,7),
                 "date"= c(1,2,3,4))

spread(df, Code_Type,Code)

This outputs:

  ID date   9  10
   1   1    NA  4
   1   2    5  NA
   2   3    NA  6
   2   4    7  NA

Which is similar to what I want I just have no idea how to make the date column turn into multiple columns. Any help or extra reading is appreciated.

To clarify this is my expected output data frame

  ID date.1 date.2  9  10
   1   1        2   5  4
   2   3        4   7  6

Please add an example data frame of your expected output to your question. — jay.sf, Apr 24 '19 at 04:59
It may not have been clear it is the second data frame I will edit for clarity — Dillon Lloyd, Apr 24 '19 at 05:03

symbolrush · Answer 1 · 2019-04-24T05:32:17.213

1

Here's a dplyr / tidyr alternative:

df %>% mutate(date.1 = date %% 2 * date) %>% mutate(date.2 = - (date %% 2 - 1) * date) %>% select(-date) %>% spread(Code_Type, Code) %>% group_by(ID) %>% summarise_all(list(~ sum(.[!is.na(.)])))

# A tibble: 2 x 5
     ID date.1 date.2   `9`  `10`
  <dbl>  <dbl>  <dbl> <dbl> <dbl>
1     1      1      2     5     4
2     2      3      4     7     6

The idea is to split the date column into two columns whether date is even or odd. This is done using the modulo (%%) operator (and some additional number crunching). date.1 = date %% 2 * date catches the odd numbers in date and is 0 for all the others; date.2 = - (date %% 2 - 1) * date catches the even numbers and is 0 for all the others.

Afterwards it's straight forward: select all columns but date; spread it to wide format and, a bit tricky again, summarise by ID and drop all NAs (group_by(ID) %>% summarise_all(list(~ sum(.[!is.na(.)]))).

edited Apr 24 '19 at 05:32

answered Apr 24 '19 at 04:52

symbolrush

7,123
1
39
67

I get this error Error in UseMethod("tbl_vars") : no applicable method for 'tbl_vars' applied to an object of class "list" When I run it. Do you know what this is? – Dillon Lloyd Apr 24 '19 at 05:23
@DillonLloyd: No idea. It works over here.. I reformatted the code, maybe it was because of the `+` sign in it.. Does it work now? – symbolrush Apr 24 '19 at 05:25
Yes it does, however, I do not fully understand what you are doing in the mutate statements – Dillon Lloyd Apr 24 '19 at 05:26
@DillonLloyd: I made an edit with some explanations. – symbolrush Apr 24 '19 at 05:32

jay.sf · Accepted Answer · 2019-04-24T05:08:06.980

1

You could use reshape from base R.

reshape(dat, idvar=c("ID"), timevar="Code_Type", direction="wide")
#   ID Code.10 date.10 Code.9 date.9
# 1  1       4       1      5      2
# 3  2       6       3      7      4

Data

dat <- structure(list(ID = c(1, 1, 2, 2), Code_Type = c(10, 9, 10, 9
), Code = c(4, 5, 6, 7), date = c(1, 2, 3, 4)), class = "data.frame", row.names = c(NA, 
-4L))

edited Apr 24 '19 at 05:08

answered Apr 24 '19 at 04:53

jay.sf

60,139
8
53
110

That does not create the wide data format where there is 1 ID and everything else is on that same row. – Dillon Lloyd Apr 24 '19 at 04:56
See the edit to my answer. – jay.sf Apr 24 '19 at 05:09

Creating wide data that has only 1 ID column

2 Answers2