0

I have a dataset that I left joined some environmental variables, including yearly NDVI averages, to by location - I would like to have the yearly averages in long format aligned with the Year column.

Currently, my data looks roughly like:

###  Long   Lat    Year    Species   Presence  (..~ 50 other vars...) ndvi_2017  ndvi_2018   ndvi_2019 
#    long1  lat1   2017      sp1       N                                  .91       .72        .83                                               
#    long1  lat1   2017      sp2       Y                                  .91       .72        .83 
#    long1  lat1   2018      sp1       Y                                  .91       .72        .83 
#    long1  lat1   2018      sp2       Y                                  .91       .72        .83 
#    long1  lat1   2019      sp1       N                                  .91       .72        .83 
#    long1  lat1   2019      sp2       Y                                  .91       .72        .83 
#    long2  lat2   2017      sp1       Y                                  .91       .72        .83 
#    long2  lat2   2017      sp2       N                                  .91       .72        .83 
#    long2  lat2   2018      sp1       Y                                  .91       .72        .83 
#    long2  lat2   2018      sp2       N                                  .91       .72        .83 
#    long2  lat2   2019      sp1       Y                                  .91       .72        .83 
#    long2  lat2   2019      sp2       N                                  .91       .72        .83 

But I would like to just have the NDVI by year, so like:

###  Long   Lat    Year    Species   Presence  (..~ 50 other vars...) NDVI 
#    long1  lat1   2017      sp1       N                               .91                                              
#    long1  lat1   2017      sp2       Y                               .91 
#    long1  lat1   2018      sp1       Y                               .72   
#    long1  lat1   2018      sp2       Y                               .72   
#    long1  lat1   2019      sp1       N                               .83    
#    long1  lat1   2019      sp2       Y                               .83   
#    long2  lat2   2017      sp1       Y                               .91   
#    long2  lat2   2017      sp2       N                               .91 
#    long2  lat2   2018      sp1       Y                               .72 
#    long2  lat2   2018      sp2       N                               .72 
#    long2  lat2   2019      sp1       Y                               .83 
#    long2  lat2   2019      sp2       N                               .83

How do I associate the NDVI columns with values in the 'Year' column so I can achieve this?

Thank you in advance for your help!

westpier
  • 45
  • 3

1 Answers1

0

We can use pivot_longer, specify the cols with the column names that starts_with 'ndvi' substring

library(dplyr)
library(tidyr)
 df1 %>%
     pivot_longer(cols = starts_with('ndvi'), values_to = 'NDVI') %>%
     separate(name, into = c('name1', 'year1'), convert = TRUE) %>%
     filter(Year == year1)
akrun
  • 874,273
  • 37
  • 540
  • 662
  • I don't think this fully solves my problem, as now all of the annual NDVI values are displayed for all the years of species presence observation, without describing which year the NDVI values are from? – westpier Jul 22 '20 at 20:29
  • @westpier you already have a 'Year' column – akrun Jul 22 '20 at 20:30
  • @westpier do you consider the duplicate link solves your problem – akrun Jul 22 '20 at 20:32
  • Yes, I know I have a year column - I was hoping there was some solution where I could link ndvi_2017 to Year == 2017 etc. somehow, rather than have all the years of NDVI data repeat over all the years of species observation as I need them distinct by year. – westpier Jul 22 '20 at 21:28
  • @westpier can you show an example with dput – akrun Jul 22 '20 at 21:29
  • @westpier can you check my update – akrun Jul 22 '20 at 21:45