Reshaping/Deduping long data to wide in R

Question

I've been trying to reshape this long data into wide format by ID within excel with no luck. I tried using dcast but it did not give me the results I was expecting.

I've attached a csv of how the data is currently formatted in cells: (a1:c10), and cells: (f1:n4) is how I'd like it to be formatted. I tried in excel first, but have no experience with power query and thought perhaps reshape2 or dcast could do something similar.

In r I did:

olddata_wide$ID <- factor(olddata_wide$ID)

widedf <- dcast(df1, ID  ~ paydate, value.var="Type")

This just gave me an output of dates.

`df %>% dplyr::mutate(trans_no = row_number(), .by = ID) %>% tidyr::pivot_wider(names_from = trans_no, values_from = paydate:type)`. Note you’ll have to update to dplyr >=1.1.0. See [this thread](https://stackoverflow.com/questions/5890584/how-to-reshape-data-from-long-to-wide-format) for more on reshaping from long to wide, and [this](https://stackoverflow.com/q/2185252/17303805) for wide to long. — zephryl, Feb 25 '23 at 01:22
Also for your next question, please either post data in a copy-pasteable form, eg using `dput()`, or use a built-in dataset. See [How to make a great R reproducible example](https://stackoverflow.com/q/5963269/17303805). — zephryl, Feb 25 '23 at 01:27

Andre Wildberg · Answer 1 · 2023-02-25T02:02:05.227

Using pivot_wider and rename

library(dplyr)
library(tidyr)

repl <- c("1st_transaction" = "type_1", "2nd_transaction" = "type_2", 
  "3rd_transaction" = "type_3", "4th_transaction" = "type_4")

df %>% 
  mutate(n = row_number(), .by = ID) %>% 
  pivot_wider(names_from = n, values_from = c(type, paydate)) %>% 
  rename(all_of(repl))
# A tibble: 3 × 9
  ID     1st_transacti…¹ 2nd_t…² 3rd_t…³ 4th_t…⁴ payda…⁵ payda…⁶ payda…⁷ payda…⁸
  <chr>  <chr>           <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>  
1 AAL100 H               H       B       NA      8/28/2… 8/28/2… 8/28/2… NA     
2 AAC926 H               H       NA      NA      8/28/2… 8/28/2… NA      NA     
3 ABR765 V               H       H       B       8/17/2… 8/28/2… 8/28/2… 8/28/2…
# … with abbreviated variable names ¹`1st_transaction`, ²`2nd_transaction`,
#   ³`3rd_transaction`, ⁴`4th_transaction`, ⁵paydate_1, ⁶paydate_2, ⁷paydate_3,
#   ⁸paydate_4

Data

df <- structure(list(ID = c("AAL100", "AAL100", "AAL100", "AAC926", 
"AAC926", "ABR765", "ABR765", "ABR765", "ABR765"), paydate = c("8/28/2019", 
"8/28/2020", "8/28/2021", "8/28/2017", "8/28/2018", "8/17/2016", 
"8/28/2020", "8/28/2021", "8/28/2022"), type = c("H", "H", "B", 
"H", "H", "V", "H", "H", "B")), class = "data.frame", row.names = c(NA, 
-9L))

Hey Sorry for just getting back to you. This was super helpful & cant thank you enough! — willr, Mar 01 '23 at 17:15
Thx for mentioning! Glad it helped! Even though it was marked as a duplicate question the renaming element makes it unique imho, so if you wish you can mark it as solved, it might help others too. — Andre Wildberg, Mar 01 '23 at 18:54

Reshaping/Deduping long data to wide in R

1 Answers1

Data