The goal of the project is to determine how far casual riders travel vs. membership riders and have a visual graphic that supports the data conclusions.

I have plotted the start and end points on maps using mapview and ggplot. I would like to see which type of rider travels farther and which routes they take. Below is the script I used for the first portions. Any insight into tracking how the bikes traveled, and between which start and end points, would be immensely helpful. Thanks in advance.

Best,

Kyle


```
library(ggplot2)
library(mapview)

mapview(testmap, map.types = "Stamen.Toner", zcol = "member_casual")

# Longitude on x and latitude on y so the plot is oriented like a map
ggplot(A202004.divvy.tripdata, aes(end_lng, end_lat, col = member_casual)) + geom_point() + ggtitle("end clustering")

set.seed(55)
# Columns 11:12 are assumed to hold end_lat and end_lng
cluster.A202004.divvy.tripdata <- kmeans(A202004.divvy.tripdata[, 11:12], 3, nstart = 20)
cluster.A202004.divvy.tripdata
table(cluster.A202004.divvy.tripdata$cluster, A202004.divvy.tripdata$member_casual)
# shape needs a discrete variable, so the cluster id is converted to a factor
ggplot(A202004.divvy.tripdata, aes(end_lng, end_lat, color = member_casual, shape = factor(cluster.A202004.divvy.tripdata$cluster))) + geom_point() + ggtitle("A202004 Cluster Plot")
```

A sample set of data is included here. I'm just looking for potential steps in the right direction.

https://docs.google.com/spreadsheets/d/1VpIwNKK6uvK90rstTQbLNi-auzV-Z79hbpx3wJjdrzM/edit?usp=sharing

[Images: start clustering, end clustering, mapviewer]

I've done the legwork of getting the data cleaned and loaded, but I haven't been able to find a tool that can draw a line from each ride's (start_lng, start_lat) point to its (end_lng, end_lat) point. Ideally I'd want to color the lines by the length of the ride or by other parameters.
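
For illustration, here is a minimal sketch of one way such lines could be drawn with ggplot2 alone, using `geom_segment()` and a straight-line (haversine) distance for the color. The `rides` data frame and the `haversine_km()` helper are hypothetical: the column names follow the question, but the coordinates are made up and this is not necessarily the best fit for the full dataset.

```r
library(ggplot2)

# Hypothetical toy rides using the column names from the question;
# the coordinates are invented for illustration, not real Divvy data.
rides <- data.frame(
  start_lng = c(-87.6298, -87.6500, -87.6244),
  start_lat = c(41.8781, 41.9000, 41.8827),
  end_lng   = c(-87.6400, -87.6298, -87.6100),
  end_lat   = c(41.8900, 41.8781, 41.8920),
  member_casual = c("member", "casual", "casual")
)

# Hypothetical helper: great-circle (haversine) distance in km
# between two longitude/latitude points given in degrees.
haversine_km <- function(lng1, lat1, lng2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlng <- (lng2 - lng1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlng / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

rides$dist_km <- with(rides, haversine_km(start_lng, start_lat, end_lng, end_lat))

# One segment per ride, colored by distance, one panel per rider type.
p <- ggplot(rides) +
  geom_segment(aes(x = start_lng, y = start_lat,
                   xend = end_lng, yend = end_lat,
                   colour = dist_km)) +
  facet_wrap(~ member_casual) +
  coord_quickmap() +
  ggtitle("Start-to-end lines colored by straight-line distance")

print(p)
```

Note that this measures the straight line between the two stations, not the route actually ridden, and round trips that end at their start station come out as zero.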

Kyle Kulinski
  • did you test the package "leaflet"? (https://stackoverflow.com/questions/32275213/how-do-i-connect-two-coordinates-with-a-line-using-leaflet-in-r) – demarsylvain Jul 06 '23 at 00:42
  • This is a methods question with no data and no code in text format. You seem to expect us to give you a recommendation for a package that will create a "visualization" that you cannot even specify. Voting to close as a package rec request. – IRTFM Jul 06 '23 at 01:49
  • Thanks for the leaflet suggestion. I'll be messing around with it to see if it can do what I need. I have no expectations for code; just a suggestion would help me in my learning. I'm still new to data analytics and am trying to see where my knowledge of things "breaks" and I have to learn more. Specifically, I would like to connect the start and stop coordinates (latitude and longitude) that sit in the same row for each ride. It's easy to plot the individual points, but it has been difficult to understand how to draw the interconnecting lines and color them. Thanks! – Kyle Kulinski Jul 06 '23 at 03:05
  • I would start with the `st_linestring()` function from the `{sf}` package. Or use `st_distance()` to calculate the distance between end and start points and then try to run some statistics on it. – Grzegorz Sapijaszko Jul 06 '23 at 06:45
  • Sounds like a graph problem (igraph / tidygraph / sfnetworks, with or without a road network) with temporal elements (hour of the day / day of the week / day of the year). This is likely related to some Kaggle task, and as Kaggle already lists a lot of content based on Divvy datasets, perhaps you can find some inspiration there? +1 for close. – margusl Jul 06 '23 at 11:01
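
The `{sf}` suggestion in the comments could look roughly like the sketch below: one `st_linestring()` per ride, `st_length()` for the geodesic length, then a per-group summary. The `rides` data frame is hypothetical toy data using the question's column names, and the choice of the median is only an example. Computing lengths on longitude/latitude coordinates relies on sf's geodesic support (s2 in recent versions; older versions may need `{lwgeom}`).

```r
library(sf)

# Hypothetical toy rides using the column names from the question;
# the coordinates are invented for illustration, not real Divvy data.
rides <- data.frame(
  start_lng = c(-87.6298, -87.6500, -87.6244, -87.6350),
  start_lat = c(41.8781, 41.9000, 41.8827, 41.8850),
  end_lng   = c(-87.6400, -87.6298, -87.6100, -87.6200),
  end_lat   = c(41.8900, 41.8781, 41.8920, 41.8950),
  member_casual = c("member", "casual", "casual", "member")
)

# Build one two-point LINESTRING per row (start -> end).
lines <- lapply(seq_len(nrow(rides)), function(i) {
  st_linestring(rbind(
    c(rides$start_lng[i], rides$start_lat[i]),
    c(rides$end_lng[i],   rides$end_lat[i])
  ))
})

# Attach the geometries; EPSG:4326 is plain longitude/latitude (WGS 84).
rides_sf <- st_sf(rides, geometry = st_sfc(lines, crs = 4326))

# Geodesic length of each line in meters (straight line, not the ridden route).
rides_sf$dist_m <- as.numeric(st_length(rides_sf))

# Compare rider types, here with the median distance per group.
dist_by_type <- aggregate(dist_m ~ member_casual,
                          data = st_drop_geometry(rides_sf),
                          FUN = median)
print(dist_by_type)
```

Since `rides_sf` is an ordinary sf object with line geometries, it should also be possible to pass it to `mapview()` with `zcol = "dist_m"` to get the colored lines on an interactive map, although I have not tested that against the full dataset.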

0 Answers