I am generating a map with 6.8 million different points (latitude and longitude combinations) using ggmap and ggplot2. I succeeded but it took quite a while (6 hours).
I noticed that R does not use all my cores, so my next challenge is to use all the processing power at my disposal.
How would I do that?
I know the 'parallel' package does that, but since I am still learning R, I am not sure how to use it. Here is a sample of the data (it's from the NYC open data platform):
df <- structure(list(pickup_datetime = structure(c(19L, 7L, 13L, 10L,
9L, 9L, 14L, 4L, 16L, 1L, 3L, 12L, 18L, 11L, 2L, 17L, 5L, 15L,
8L, 6L), .Label = c("01/02/2015 03:40:12 PM", "01/04/2015 01:03:42 AM",
"01/05/2015 12:22:10 PM", "01/05/2015 12:58:10 PM", "01/06/2015 02:16:47 PM",
"01/08/2015 12:19:51 PM", "01/09/2015 03:45:22 PM", "01/10/2015 07:15:39 PM",
"01/11/2015 08:37:20 PM", "01/13/2015 06:57:29 PM", "01/15/2015 03:03:59 AM",
"01/15/2015 10:55:29 PM", "01/16/2015 10:07:38 PM", "01/21/2015 02:04:33 AM",
"01/22/2015 04:48:35 PM", "01/23/2015 11:14:52 PM", "01/24/2015 06:35:44 PM",
"01/25/2015 07:32:09 PM", "01/27/2015 07:30:40 PM"), class = "factor"),
Pickup_latitude = c(40.8353157043457, 40.6699333190918, 40.7466583251953,
40.7337608337402, 40.8157424926758, 40.8157424926758, 40.7239418029785,
40.8073272705078, 40.7512817382812, 40.8260154724121, 40.7934989929199,
40.7457313537598, 40.6872291564941, 40.6822357177734, 40.8117980957031,
40.7610969543457, 40.7501640319824, 40.7329254150391, 40.7140312194824,
40.8164672851562), Pickup_longitude = c(-73.9201583862305,
-73.9856719970703, -73.8925704956055, -73.8689346313477,
-73.9182586669922, -73.9182586669922, -73.950813293457, -73.9444198608398,
-73.9399795532227, -73.9514389038086, -73.9496078491211,
-73.9035873413086, -73.990119934082, -73.9935302734375, -73.9296035766602,
-73.9349060058594, -73.8618927001953, -73.9548034667969,
-73.9550933837891, -73.953971862793)), .Names = c("pickup_datetime",
"Pickup_latitude", "Pickup_longitude"), row.names = c(NA, 20L
), class = "data.frame")
Here is my code:
library(plyr)
pickup <- count(df, c("Pickup_latitude", "Pickup_longitude"))  # df is the sample defined above
detach("package:plyr", unload=TRUE)
library(dplyr)
pickup <- filter(pickup, Pickup_latitude != 0 | Pickup_longitude != 0)
library(ggplot2)
library(ggmap)
library(maps)
basemap <- get_map(location=c(lon= -73.8896695, lat= 40.74086), zoom = 11)
# Map the columns directly in aes() instead of copying them into global vectors
map1 <- ggmap(basemap, extent='panel', base_layer=ggplot(pickup, aes(x=Pickup_longitude, y=Pickup_latitude)))
print(map1)
map2 <- map1 + geom_point(color = "blue", size = 0.05)
print(map2)  # draw the map with the points added
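For reference, here is a minimal sketch of how the 'parallel' package could be applied to the counting step. It is an illustration under assumptions, not a tested solution for the full data: `pts` is a hypothetical stand-in data frame, the worker count is capped at 2 for the example, and base R's `aggregate` replaces `plyr::count`. It only parallelizes the aggregation; as far as I can tell, the drawing done by ggplot2 itself still runs on a single core.

```r
library(parallel)

# Hypothetical stand-in for the pickup data (column names follow the sample above).
set.seed(1)
pts <- data.frame(
  Pickup_latitude  = round(runif(1000, 40.6, 40.9), 2),
  Pickup_longitude = round(runif(1000, -74.0, -73.8), 2)
)

# Number of workers: capped at 2 for this sketch; detectCores() may return NA.
n_cores <- min(2L, detectCores(), na.rm = TRUE)

# Split the rows into one chunk per worker.
chunks <- split(pts, rep_len(seq_len(n_cores), nrow(pts)))

# Count identical latitude/longitude pairs within each chunk on a worker.
cl <- makeCluster(n_cores)
partial <- parLapply(cl, chunks, function(d) {
  aggregate(n ~ Pickup_latitude + Pickup_longitude,
            data = transform(d, n = 1L), FUN = sum)
})
stopCluster(cl)

# Combine the per-chunk counts and sum counts for pairs seen in several chunks.
combined <- do.call(rbind, partial)
pickup_counts <- aggregate(n ~ Pickup_latitude + Pickup_longitude,
                           data = combined, FUN = sum)
```

The resulting `pickup_counts` has one row per distinct coordinate pair with its count in `n`, which could then be passed to ggplot in place of `pickup`.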
Thank you.