
I am unable to resolve the error "wrong length for vector, should be 2" when trying to calculate the distance between two points, in my case the runway length between the two runway thresholds (ends). To make things worse, I do not understand existing answers, such as the one to "R error: Wrong length for a vector, should be 2", well enough to apply them to my case. A simplified data frame of runway end positions looks like this:

runways <-  data.frame(
 RWY_ID = c(1,2,3)
,RWY    = c("36R", "36L","01")
,LAT    = c(40.08, 40.12, 40.06)
,LON    = c(116.59, 116.57, 116.62)
,LAT2   = c(40.05, 40.07,40.09)
,LON2   = c(116.6, 116.57, 116.61)
)

Using the distHaversine() function from geosphere, I try to calculate the distance:

runways <- mutate(runways
                 , CTD = distHaversine( c(LON, LAT), c(LON2, LAT2))
                 )

I am not sure what I am doing wrong here. If I pull out a single LON/LAT position, it is a numeric vector of the right length:

myv <- c(runways$LON[1], runways$LAT[1])
myv

[1] 116.59  40.08
str(myv)
num [1:2] 116.6 40.1
    You need to operate `rowwise`, or it's passing all rows at once: `runways %>% rowwise() %>% mutate(CTD = geosphere::distHaversine( c(LON, LAT), c(LON2, LAT2)))` – alistaire Nov 11 '16 at 19:10
    THANKS !!! Alistaire ... life can be so easy. I assume the error message then points to the fact that the sort-of vectorised approach exceeds the required length of 2. – Rainer Nov 11 '16 at 19:16

1 Answer


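Inside mutate, LON and LAT refer to entire columns, so c(LON, LAT) builds one vector of length 6 rather than a single longitude/latitude pair, and that is what triggers the error. You need to operate rowwise, so that distHaversine is passed one pair of points at a time instead of all the rows: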

library(dplyr)      # mutate(), rowwise(), %>%
library(geosphere)  # distHaversine()

runways %>% rowwise() %>% 
    mutate(CTD = distHaversine(c(LON, LAT), c(LON2, LAT2)))

## Source: local data frame [3 x 7]
## Groups: <by row>
## 
## # A tibble: 3 × 7
##   RWY_ID    RWY   LAT    LON  LAT2   LON2      CTD
##    <dbl> <fctr> <dbl>  <dbl> <dbl>  <dbl>    <dbl>
## 1      1    36R 40.08 116.59 40.05 116.60 3446.540
## 2      2    36L 40.12 116.57 40.07 116.57 5565.975
## 3      3     01 40.06 116.62 40.09 116.61 3446.509
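
To see why the original, non-rowwise call failed, it can help to reproduce what c(LON, LAT) evaluates to when the names refer to whole columns. The following is a minimal sketch, assuming the geosphere package is installed; the names p1_all and res are made up for illustration, and with() stands in for the column lookup that mutate performs.

```r
library(geosphere)

# Same coordinates as in the question, reduced to the columns that matter here
runways <- data.frame(
  LAT  = c(40.08, 40.12, 40.06),
  LON  = c(116.59, 116.57, 116.62),
  LAT2 = c(40.05, 40.07, 40.09),
  LON2 = c(116.6, 116.57, 116.61)
)

# With whole columns, c() glues all longitudes and all latitudes together
p1_all <- with(runways, c(LON, LAT))
length(p1_all)  # 6, not 2

# distHaversine expects a single lon/lat pair or a two-column matrix,
# so a plain vector of length 6 is rejected; capture the message instead of stopping
res <- tryCatch(
  distHaversine(p1_all, with(runways, c(LON2, LAT2))),
  error = function(e) conditionMessage(e)
)
res
```

With rowwise(), each call sees one row, so the same c(LON, LAT) expression yields a vector of length 2.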

Alternatively, distHaversine can handle matrices, so you can use cbind instead of c:

runways %>% mutate(CTD = distHaversine(cbind(LON, LAT), cbind(LON2, LAT2)))

##   RWY_ID RWY   LAT    LON  LAT2   LON2      CTD
## 1      1 36R 40.08 116.59 40.05 116.60 3446.540
## 2      2 36L 40.12 116.57 40.07 116.57 5565.975
## 3      3  01 40.06 116.62 40.09 116.61 3446.509

At scale, the latter approach is almost certainly better, as operating rowwise doesn't take advantage of vectorization and can therefore get slow.
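
The difference can be checked on a larger table. This is a rough sketch rather than a proper benchmark: the data are randomly generated points near the coordinates in the question, the size n and the names pts, by_row, by_matrix, t_rowwise and t_matrix are arbitrary choices for illustration, and the timings will vary by machine.

```r
library(dplyr)
library(geosphere)

set.seed(1)
n <- 2000  # arbitrary table size, chosen only for illustration

# Hypothetical point pairs scattered around the area used in the question
pts <- data.frame(
  LON  = runif(n, 116.5, 116.7), LAT  = runif(n, 40.0, 40.2),
  LON2 = runif(n, 116.5, 116.7), LAT2 = runif(n, 40.0, 40.2)
)

# One distHaversine call per row
t_rowwise <- system.time(
  by_row <- pts %>%
    rowwise() %>%
    mutate(CTD = distHaversine(c(LON, LAT), c(LON2, LAT2))) %>%
    ungroup()
)

# A single call on two-column matrices
t_matrix <- system.time(
  by_matrix <- pts %>%
    mutate(CTD = distHaversine(cbind(LON, LAT), cbind(LON2, LAT2)))
)

# Both approaches should give the same distances
all.equal(by_row$CTD, by_matrix$CTD)

print(t_rowwise)
print(t_matrix)
```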

    Thanks, Alistaire, I owe you a beer. Now seeing the solution, I feel a bit humble for having asked. The move to a matrix via cbind() feels a bit more elegant than the rowwise iterations. Still a lot ahead of me to learn about R ... :) – Rainer Nov 11 '16 at 19:22
    No worries, it's a good question. Thinking about what you're passing to a function when you refer to a column name in dplyr is one of those constant and not immediately transparent tasks in order to avoid errors with finicky functions. – alistaire Nov 11 '16 at 19:27
  • How would you add a column of this output permanently to the original dataset? When adding runways$CTD –>... and removing CTD=... I end up getting a matrix inside the original dataset... By solely leaving CTD it can only be seen when running the lines of code above... – Joehat Jun 24 '21 at 00:56
  • This is the type I get when adding runways$CTD –>... and removing CTD=... – Joehat Jun 24 '21 at 00:58
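
On the follow-up about keeping the new column: mutate returns a modified copy, so the result has to be assigned back to runways. The sketch below shows two ways this could be done, assuming dplyr and geosphere are installed. The column name CTD_base is made up here so both variants can sit side by side, and this is not necessarily what the commenter tried.

```r
library(dplyr)
library(geosphere)

runways <- data.frame(
  RWY_ID = c(1, 2, 3),
  RWY    = c("36R", "36L", "01"),
  LAT    = c(40.08, 40.12, 40.06),
  LON    = c(116.59, 116.57, 116.62),
  LAT2   = c(40.05, 40.07, 40.09),
  LON2   = c(116.6, 116.57, 116.61)
)

# Option 1: assign the result of the pipeline back to the data frame
runways <- runways %>%
  mutate(CTD = distHaversine(cbind(LON, LAT), cbind(LON2, LAT2)))

# Option 2: base R, assigning only the vector of distances to a new column
# (CTD_base is a hypothetical name used for comparison)
runways$CTD_base <- distHaversine(
  cbind(runways$LON, runways$LAT),
  cbind(runways$LON2, runways$LAT2)
)

# Both columns are plain numeric vectors, not matrices
str(runways)
```

Assigning the whole result of a mutate call to runways$CTD would instead store a data frame inside the column, which may be what produced the nested structure described in the comment.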