Note:
Please note that I have tried the following to solve my problem before posting:
- Transpose / reshape dataframe without "timevar" from long to wide format
- How to Reshape Long to Wide While Preserving Some Variables in R
- Convert data from long format to wide format with multiple measure columns
to try to solve my problem, but haven't been successful
Problem
Suppose I have the following data that shows the way that items have flowed from a start to an end
> run = c(1, 2, 3, 3, 4, 5, 5, 5, 6, 7, 7, 7, 8, 9, 10, 10, 11)
> start_location = c("A", "C", "A", "B", "A", "B", "C", "A", "B", "C", "B", "A", "A", "A", "A", "B", "C")
> end_location = c("B", "B", "B", "C", "C", "C", "A", "C", "A", "B", "A", "C", "B", "C", "B", "C", "B")
> df = data.frame(run, start_site, end_site)
> df
run start_site end_site
1 1 A B
2 2 A C
3 3 A B
4 3 B C
5 4 A C
6 5 B C
7 5 C A
8 5 A C
9 6 B A
10 7 C B
11 7 B A
12 7 A C
13 8 A B
14 9 A C
15 10 A B
16 10 B C
17 11 C B
I would like to convert the data into a "wide" format that looks like the following, with a new column for every instance of a stage by the run.
> # Desired result
run first_location second_location third_location fourth_location
[1,] "1" "A" "B" NA NA
[2,] "2" "C" "B" NA NA
[3,] "3" "A" "B" "C" NA
[4,] "4" "A" "C" NA NA
[5,] "5" "B" "C" "A" "C"
[6,] "6" "C" "A" NA NA
[7,] "7" "C" "B" "A" "C"
[8,] "8" "A" "B" NA NA
[9,] "9" "A" "C" NA NA
[10,] "10" "A" "B" "C" NA
[11,] "11" "C" "B" NA NA
Attempted Solution
I have tried the following but I haven't got the desired result. I have more columns than I need.
> library(dplyr)
> library(tidyr)
>
> # Unsuccessful attempt
> df_long = melt(df, id.vars=c("run"))
> df_long %>%
select(!variable) %>%
group_by(run) %>%
dplyr::mutate(rn = paste0("location_",row_number())) %>%
spread(rn, value)
# A tibble: 11 x 7
# Groups: run [11]
run location_1 location_2 location_3 location_4 location_5 location_6
<dbl> <chr> <chr> <chr> <chr> <chr> <chr>
1 1 A B NA NA NA NA
2 2 A C NA NA NA NA
3 3 A B B C NA NA
4 4 A C NA NA NA NA
5 5 B C A C A C
6 6 B A NA NA NA NA
7 7 C B A B A C
8 8 A B NA NA NA NA
9 9 A C NA NA NA NA
10 10 A B B C NA NA
11 11 C B NA NA NA NA
Can someone help me figure out my mistake and help me get the desired output please?
Thank you for looking at my post.