1

I am working with the R programming language.

I have the following data frame that contains the Latitude and the Longitude of a ordered list of cities :

map_data <- data.frame("Lat" = c(43.5426, 43.2424, 43.6544, 43.6452, 43.6629), "Long" = c(-79.3871, -79.3860, -79.3807, -79.3806,-79.3957 ), id = c(1,2,3,4,5))

map_data$id = as.factor(map_data$id)

      Lat     Long id
1 43.5426 -79.3871  1
2 43.2424 -79.3860  2
3 43.6544 -79.3807  3
4 43.6452 -79.3806  4
5 43.6629 -79.3957  5

I would like to convert this data frame into the following format:

  start_lat start_long end_lat end_long
1   43.5426   -79.3871 43.2424  -79.386
2   43.2424   -79.3860 43.6540  -79.386

In the above data frame:

  • the first row represents the "trip" from "city 1 to city 2"
  • the second row represents the "trip" from "city 2 to city 3"
  • etc.

Currently, I am doing this in Microsoft Excel manually - I only have a few cities so I can manage this manually.

But can someone please show me how to do this for large amounts of data?

Thanks!

stats_noob
  • 5,401
  • 4
  • 27
  • 83

1 Answers1

1

Use data.table

library(data.table)

result=setDT(map_data)[
  , c("end_lat","end_long"):= shift(map_data[,c(1,2)],-1)][
    , .(id, start_lat = Lat, start_long=Long, end_lat,end_long)]

result[.N, c("end_lat", "end_long"):=result[1,.(start_lat,start_long)]]

Output:

       id start_lat start_long end_lat end_long
   <fctr>     <num>      <num>   <num>    <num>
1:      1   43.5426   -79.3871 43.2424 -79.3860
2:      2   43.2424   -79.3860 43.6544 -79.3807
3:      3   43.6544   -79.3807 43.6452 -79.3806
4:      4   43.6452   -79.3806 43.6629 -79.3957
5:      5   43.6629   -79.3957 43.5426 -79.3871 

Here is a baseR approach if you prefer:

result = rbind(
  cbind(map_data[1:nrow(map_data)-1,c(1,2)], map_data[-1,c(1,2)]),
  cbind(map_data[nrow(map_data), c(1,2)], map_data[1,c(1,2)])
)
colnames(result) <- c("start_lat", "start_long", "end_lat", "end_long")

Output

  start_lat start_long end_lat end_long
1   43.5426   -79.3871 43.2424 -79.3860
2   43.2424   -79.3860 43.6544 -79.3807
3   43.6544   -79.3807 43.6452 -79.3806
4   43.6452   -79.3806 43.6629 -79.3957
5   43.6629   -79.3957 43.5426 -79.3871
langtang
  • 22,248
  • 1
  • 12
  • 27
  • @ langtang: thank you so much for your answer! Just a question : do you know how to make it so that the last row "loops back" to the first city? I.e. in this example, for the 5th row, the values in the "end_lat" and "end_long" column would equal to the "start_lat" and the "start_long" values of the first row. Thank you so much! – stats_noob Feb 22 '22 at 04:19
  • 1
    sure, I've edited above, and also added another approach. I prefer the data.table approach however, because I find it more immediately extendable to multiple different sets of trips – langtang Feb 22 '22 at 13:27
  • Thank you so much for all your help! I am trying to learn how to visualize different "routes" that can be taken between a group of cities (travelling salesman) – stats_noob Feb 22 '22 at 15:54
  • In the end, it will look something like this: https://stackoverflow.com/questions/71197813/r-connecting-dots-on-a-map – stats_noob Feb 22 '22 at 15:54