I have a data frame containing geographic positions stored as strings. This is my function to parse a string and convert the position to decimal degrees.

Example position 23º 30.0'N

latitud.decimal <- function(y) {
  latregex <- str_match(y, "(\\d+)º\\s(\\d*.\\d*).(.)")
  latitud <- as.numeric(latregex[1, 2]) + as.numeric(latregex[1, 3]) / 60
  if (latregex[1, 4] == "S") {latitud <- -1 * latitud}
  return(latitud)
}

Results> 23.5

Then I would like to create a new column in my original data frame by applying the function to every item in the Latitude column. The same goes for the longitude, which needs another new column.

I know how to do this using Python and pandas, but I am a newbie in R and cannot find the solution.

I am trying with

lapply(datos$Latitude, 2 , FUN= latitud.decimal(y)) 

but it does not read the y "argument", which should be each value in the column.

kamome
  • try `sapply(datos$Latitude, latitud.decimal)` – Clemsang Feb 21 '19 at 14:20
  • Your function looks very close to vectorized. If you post some sample data (a few values in a vector) we can help complete the vectorization, which would mean you would just do `latitud.decimal(datos$Latitude)`. – Gregor Thomas Feb 21 '19 at 14:22

2 Answers

Note that str_match is vectorized, as stated on the function's help page, help("str_match").

The question lacks a reproducible example and data. This page describes how to write questions that are easier to reproduce and therefore more likely to get good answers. Without data and code I cannot test whether I am actually hitting the spot, but I will give it a shot anyway.

Because str_match is vectorized, we can apply the whole function to the column at once, without lapply, and create the new column directly. I'll slightly rewrite your function to make use of the vectorization. Note that the row index 1 is dropped in latregex[, .], so every row of the match matrix is used instead of only the first.

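library(stringr)

latitud.decimal <- function(y) {
  # str_match returns one row per element of y; columns 2-4 hold the capture groups
  latregex <- str_match(y,"(\\d+)º\\s(\\d*.\\d*).(.)")
  latitud <- as.numeric(latregex[, 2]) + as.numeric(latregex[, 3]) / 60
  # flip the sign for southern latitudes
  which_south <- which(latregex[, 4] == "S")
  latitud[which_south] <- -latitud[which_south]
  latitud
}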

Now that the function is ready, the column can be created with the $ operator. If the data is very large, the assignment can be done more efficiently with the data.table package. See this Stack Overflow page for an example of how to assign columns with data.table.

In base R we would simply perform the action as

datos$new_column <- latitud.decimal(datos$Latitude)
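Since the question also asks about longitude, here is a minimal sketch of one vectorized helper that handles both columns. The name `dm_to_decimal`, the stricter regex, and the sample data are my own assumptions for illustration, not something from the question, so adjust the pattern to your real strings.

```r
library(stringr)

# Hypothetical helper: converts "DDº MM.M'H" strings to signed decimal degrees.
# H is the hemisphere letter; S and W give negative values.
dm_to_decimal <- function(x) {
  # One row per input string; columns 2-4 are degrees, minutes, hemisphere.
  m <- str_match(x, "(\\d+)º\\s*(\\d+\\.?\\d*)'\\s*([NSEW])")
  value <- as.numeric(m[, 2]) + as.numeric(m[, 3]) / 60
  # Strings that do not match the pattern come out as NA.
  ifelse(m[, 4] %in% c("S", "W"), -value, value)
}

# Made-up sample data, only for illustration.
datos <- data.frame(
  Latitude  = c("23º 30.0'N", "10º 15.0'S"),
  Longitude = c("45º 45.0'W", "120º 6.0'E"),
  stringsAsFactors = FALSE
)

datos$lat_decimal <- dm_to_decimal(datos$Latitude)
datos$lon_decimal <- dm_to_decimal(datos$Longitude)
```

Because the function works on whole vectors, no apply-style loop is needed for either column.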
Oliver
datos$lat_decimal = sapply(datos$Latitude, latitud.decimal)
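Some background on why this works while the attempt in the question did not, as far as I can tell: `FUN = latitud.decimal(y)` calls the function right away with an undefined `y` instead of passing the function itself, and the `2` looks like the MARGIN argument of `apply`, which `lapply` does not have. The sketch below uses the original one-value-at-a-time function with made-up sample values. It uses `vapply` rather than `sapply`, which is my own substitution: it checks that each call returns a single number, and `USE.NAMES = FALSE` stops the input strings from being attached as names.

```r
library(stringr)

# The scalar function from the question: handles one string at a time.
latitud.decimal <- function(y) {
  latregex <- str_match(y, "(\\d+)º\\s(\\d*.\\d*).(.)")
  latitud <- as.numeric(latregex[1, 2]) + as.numeric(latregex[1, 3]) / 60
  if (latregex[1, 4] == "S") latitud <- -1 * latitud
  latitud
}

# Made-up sample values, only for illustration.
datos <- data.frame(Latitude = c("23º 30.0'N", "10º 15.0'S"),
                    stringsAsFactors = FALSE)

# Pass the function itself, not the result of calling it.
datos$lat_decimal <- vapply(datos$Latitude, latitud.decimal,
                            FUN.VALUE = numeric(1), USE.NAMES = FALSE)

# lapply returns a list, so it needs unlist() before it can be a numeric column.
lat_list <- lapply(datos$Latitude, latitud.decimal)
```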
Adam Waring
  • @Clemsang ... not sure what your point is here. It'd be great if this answer provided a little more explanation. If you're working on an answer that offers more explanation with the solution in your comment, please post it, and it will probably be upvoted. But in general, commenting instead of answering is discouraged. Answers belong in answers, not in comments. Comments are for seeking clarification, light discussion, tangentially related bits, etc. – Gregor Thomas Feb 21 '19 at 14:30
  • I don't see more explanation here than in my comment. I understand that I should have answered instead of commenting. – Clemsang Feb 21 '19 at 14:34
  • Right. Still don't understand your purpose in commenting here @Clemsang. – Gregor Thomas Feb 21 '19 at 15:08
  • transform "23º30`N" to 23.5 using a function – kamome Feb 21 '19 at 15:26