1

With the data below (included with dput), I have a varying number of repeat Lat and Long locations for three individuals and would like to spread them into wide format using dplyr.

The data look like so:

> head(Dat)
  IndIDII      IndYear  WintLat  WintLong
1 BHS_265 BHS_265-2015 47.61025 -112.7210
2 BHS_265 BHS_265-2016 47.59884 -112.7089
3 BHS_770 BHS_770-2016 42.97379 -109.0400
4 BHS_770 BHS_770-2017 42.97129 -109.0367
5 BHS_770 BHS_770-2018 42.97244 -109.0509
6 BHS_377 BHS_377-2015 43.34744 -109.4821

This post provided a slick solution that has been a big help. Nonetheless, I am unable to obtain my desired result. Modifying the code I have the following:

Dat %>%  
  group_by(IndIDII) %>%
  #Make YearNum (as intiger not calnader year) for each IndIDII
  mutate(YearNum = row_number()) %>% 
  gather(Group, LatLong, c(WintLat,  WintLong)) %>% 
  unite(GroupNew, YearNum, Group, sep = "-") %>% 
  spread(GroupNew, LatLong) %>% 
  as.data.frame()

Which produces a nearly correct result, but has multiple rows for each IndIDII, each containing the lat and long for a single year.

  IndIDII      IndYear 1-WintLat 1-WintLong 2-WintLat 2-WintLong 3-WintLat 3-WintLong 4-WintLat 4-WintLong
1 BHS_265 BHS_265-2015  47.61025  -112.7210        NA         NA        NA         NA        NA         NA
2 BHS_265 BHS_265-2016        NA         NA  47.59884  -112.7089        NA         NA        NA         NA
3 BHS_377 BHS_377-2015  43.34744  -109.4821        NA         NA        NA         NA        NA         NA
4 BHS_377 BHS_377-2016        NA         NA  43.35559  -109.4445        NA         NA        NA         NA
5 BHS_377 BHS_377-2017        NA         NA        NA         NA  43.35195  -109.4566        NA         NA
6 BHS_377 BHS_377-2018        NA         NA        NA         NA        NA         NA  43.34765  -109.4892
7 BHS_770 BHS_770-2016  42.97379  -109.0400        NA         NA        NA         NA        NA         NA
8 BHS_770 BHS_770-2017        NA         NA  42.97129  -109.0367        NA         NA        NA         NA
9 BHS_770 BHS_770-2018        NA         NA        NA         NA  42.97244  -109.0509        NA         NA

I am trying to have all lat and longs for an IndIDII in a single row (i.e. wide format) as shown below. NA values will appear when individuals have fewer than the max number of years. I suspect the issue is with the GroupNew variable and have tried different options, but to no avail...

enter image description here

Dat <- structure(list(IndIDII = c("BHS_265", "BHS_265", "BHS_770", "BHS_770", 
"BHS_770", "BHS_377", "BHS_377", "BHS_377", "BHS_377"), IndYear = c("BHS_265-2015", 
"BHS_265-2016", "BHS_770-2016", "BHS_770-2017", "BHS_770-2018", 
"BHS_377-2015", "BHS_377-2016", "BHS_377-2017", "BHS_377-2018"
), WintLat = c(47.6102519805014, 47.5988417247191, 42.9737859090909, 
42.9712914772727, 42.9724390816327, 43.3474354347826, 43.3555934579439, 
43.3519543396226, 43.3476466990291), WintLong = c(-112.720994832869, 
-112.708887595506, -109.039964727273, -109.036693522727, -109.050923061224, 
-109.482114456522, -109.444522149533, -109.45659254717, -109.489241553398
)), class = "data.frame", row.names = c(NA, -9L))
B. Davis
  • 3,391
  • 5
  • 42
  • 78

1 Answers1

1

You are almost there. The lat and long go into different rows because their IndYear is different. As you only keep the first value of IndYear for each IndiDII in the final data.frame, add IndYear = first(IndYear) will give you the desired result.

Dat %>%  
    group_by(IndIDII) %>%
    mutate(YearNum = row_number(), IndYear = first(IndYear)) %>% 
    gather(Group, LatLong, c(WintLat,  WintLong)) %>% 
    unite(GroupNew, YearNum, Group, sep = "-") %>% 
    spread(GroupNew, LatLong) %>% 
    as.data.frame()

#   IndIDII      IndYear 1-WintLat 1-WintLong 2-WintLat 2-WintLong 3-WintLat 3-WintLong 4-WintLat 4-WintLong
# 1 BHS_265 BHS_265-2015  47.61025  -112.7210  47.59884  -112.7089        NA         NA        NA         NA
# 2 BHS_377 BHS_377-2015  43.34744  -109.4821  43.35559  -109.4445  43.35195  -109.4566  43.34765  -109.4892
# 3 BHS_770 BHS_770-2016  42.97379  -109.0400  42.97129  -109.0367  42.97244  -109.0509        NA         NA
mt1022
  • 16,834
  • 5
  • 48
  • 71