I have a column that was combined when I imported data into R. Each entry looks like c(-122.430061, 37.785553). How can I split it into two columns, long and lat?

Data looks like this:

#dput(coords[1:5,])
structure(list(type = c("Point", "Point", "Point", "Point", "Point"
), coordinates = list(c(-122.191986, 37.752671), c(-122.20254, 
37.777845), c(-122.250701, 37.827707), c(-122.270252, 37.806838
), c(-122.259369, 37.809819))), .Names = c("type", "coordinates"
), row.names = c(1L, 2L, 3L, 5L, 6L), class = "data.frame")
Zheyuan Li
Ted Mosby
  • Sorry, maybe my question wasn't really clear. The string above is in one column, not two, so the data frame has 2 columns: x[1] = Point and x[2] = c(-122.430061, 37.785553). – Ted Mosby Oct 20 '16 at 19:33
  • The data is already imported as a json from a url. So scan wouldn't work on a data.frame. – Ted Mosby Oct 20 '16 at 19:36
  • You could do `cbind(df["type"], do.call(rbind, df$coordinates))` and then set the names. – Rich Scriven Oct 20 '16 at 21:24

2 Answers

Well, after looking at your data, this seems the right way to go:

x <- structure(list(type = c("Point", "Point", "Point", "Point", "Point"
), coordinates = list(c(-122.191986, 37.752671), c(-122.20254, 
37.777845), c(-122.250701, 37.827707), c(-122.270252, 37.806838
), c(-122.259369, 37.809819))), .Names = c("type", "coordinates"
), row.names = c(1L, 2L, 3L, 5L, 6L), class = "data.frame")

x$coordinates is not a string column, but a list:

#[[1]]
#[1] -122.19199   37.75267
#
#[[2]]
#[1] -122.20254   37.77784
#
#[[3]]
#[1] -122.25070   37.82771
#
#[[4]]
#[1] -122.27025   37.80684
#
#[[5]]
#[1] -122.25937   37.80982

We can use sapply with "[" to pull out the first and second element of each list entry:

long <- sapply(x$coordinates, "[", 1)
# [1] -122.1920 -122.2025 -122.2507 -122.2703 -122.2594

lat <- sapply(x$coordinates, "[", 2)
# [1] 37.75267 37.77784 37.82771 37.80684 37.80982
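To make the "[" trick a little less cryptic, here is a minimal sketch on toy values copied from the question. It only illustrates that "[" is an ordinary function that can be called with an index, and it swaps in vapply (my substitution, not part of the answer above) so that the result is guaranteed to be a numeric vector. The names v, first and coords are made up for the illustration.

```r
# One coordinate pair, as stored in a single list element
v <- c(-122.191986, 37.752671)

# "[" is a regular function: this is the same as v[1]
first <- `[`(v, 1)

# A small list shaped like x$coordinates
coords <- list(v, c(-122.20254, 37.777845))

# vapply works like sapply but checks that each result is one number;
# the trailing 1 and 2 are passed on to "[" as the index
long <- vapply(coords, `[`, numeric(1), 1)
lat  <- vapply(coords, `[`, numeric(1), 2)
```

If any list element were missing or not numeric, vapply would stop with an error instead of silently returning a list, which may or may not be what you want.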

But a more efficient way is via the trick used in my original answer below:

xx <- unlist(x$coordinates)

long <- xx[seq(1,length(xx),2)]
# [1] -122.1920 -122.2025 -122.2507 -122.2703 -122.2594

lat <- xx[-seq(1,length(xx),2)]
# [1] 37.75267 37.77784 37.82771 37.80684 37.80982
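If the goal is a data frame with long and lat as real columns, the do.call(rbind, ...) idea from Rich Scriven's comment can be sketched as below. This uses a shortened three-row copy of the question's data, and the name out is just a placeholder; it assumes every list element has exactly two numbers.

```r
# Toy data shaped like the question's data frame:
# a character column plus a list column of length-2 numeric vectors
x <- data.frame(type = c("Point", "Point", "Point"), stringsAsFactors = FALSE)
x$coordinates <- list(c(-122.191986, 37.752671),
                      c(-122.20254, 37.777845),
                      c(-122.250701, 37.827707))

# Stack the vectors row-wise into a 3 x 2 numeric matrix
m <- do.call(rbind, x$coordinates)

# Rebuild the data frame with one column per coordinate
out <- data.frame(type = x$type, long = m[, 1], lat = m[, 2])
```

This keeps the coordinates numeric throughout, so no string cleanup is needed afterwards.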

Original Answer

I think this is possibly what you are looking for, assuming you have a character column (if it is a factor at the moment, use as.character for coercion first):

## example column
x <- c("12.3, 15.2", "9.2,11.1", "13.7,22.5")
#[1] "12.3, 15.2" "9.2,11.1"   "13.7,22.5"

xx <- scan(text = x, what = numeric(), sep = ",")
#[1] 12.3 15.2  9.2 11.1 13.7 22.5

long <- xx[seq(1,length(xx),2)]
#[1] 12.3  9.2 13.7

lat <- xx[-seq(1,length(xx),2)]
#[1] 15.2 11.1 22.5
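For completeness, if the column really did hold text of the form "c(-122.430061, 37.785553)", the same scan approach should still work after stripping the wrapper characters. This is a rough sketch under that assumption; the example strings and the names clean and coords are invented for illustration.

```r
# Hypothetical character column that still carries the "c(...)" wrapper
x <- c("c(-122.430061, 37.785553)", "c(-122.20254, 37.777845)")

# Drop the letter c and both parentheses, leaving "long, lat"
clean <- gsub("[c()]", "", x)

# Read all numbers into one vector: long, lat, long, lat, ...
xx <- scan(text = clean, what = numeric(), sep = ",", quiet = TRUE)

# Reshape into a two-column matrix, one row per original string
coords <- matrix(xx, ncol = 2, byrow = TRUE,
                 dimnames = list(NULL, c("long", "lat")))
```

The matrix form is just a convenient alternative to the seq() indexing above; coords[, "long"] and coords[, "lat"] give the two vectors.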
Zheyuan Li

If you don't want to rerun the import, tidyr has a nice function for this, separate():

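datf <- tidyr::separate(datf, coordinates, into = c("long", "lat"), sep = ",")
# strip the leftover "c(" and ")" and convert the remaining text to numbers
datf$long <- as.numeric(gsub("c\\(", "", datf$long))
datf$lat <- as.numeric(gsub("\\)", "", datf$lat))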

The gsub() cleanup is a little gross, but it gets the job done. Maybe someone can improve on my separate() call.

Nate