5

I am working with the R and Python languages.

Suppose I search for the following Canadian Postal Code (M5V 3L9) on Google Maps:

https://www.google.com/maps/place/Toronto,+ON+M5V+3L9/@43.642566,-79.3875851,18z/data=!4m6!3m5!1s0x882b34d436f9c825:0x9e9c6195e38030f2!8m2!3d43.6429129!4d-79.3853443!16s%2Fg%2F1tvq4rqd?entry=ttu

When I search for this, I can see that the "perimeter" of this Postal Code is highlighted in red:


My Question: (Using Selenium via R/Python) From an HTML/CSS/XML perspective - I am trying to get a list of all coordinates that make up the boundary of this perimeter.

I have been trying to explore the source code that is generated from this website to try and see if there is something I can do to see where the source code of this perimeter (e.g. in JSON) is being stored - but so far, I can't find anything:


I was hoping that perhaps there might be something which would allow me to use Selenium to repeatedly click around this perimeter and extract the longitude/latitude points - but so far, I can not find anything.

Can someone please show me how to do this?

Thanks!

Note: Generic Selenium Code:

library(RSelenium)
library(wdman)
library(netstat)

# Start a Selenium server via wdman
selenium()
# retcommand = TRUE only returns the command that would be run (no server is started)
selenium_object <- selenium(retcommand = TRUE, check = FALSE)

# Launch Chrome on a free port and get the client used to drive the browser
remote_driver <- rsDriver(browser = "chrome", chromever = "114.0.5735.90", verbose = FALSE, port = free_port())

remDr <- remote_driver$client

remDr$navigate("https://www.google.com/maps/place/Toronto,+ON+M5V+3L9/@43.642566,-79.3875851,18z/data=!4m6!3m5!1s0x882b34d436f9c825:0x9e9c6195e38030f2!8m2!3d43.6429129!4d-79.3853443!16s%2Fg%2F1tvq4rqd?entry=ttu")
stats_noob
  • 2
    hi stats_noob. I had a play around with Google Maps a couple of weeks ago. Oddly, I too couldn't see anything in the Inspector I could scrape. They do have an API though, which could be useful – Mark Jul 28 '23 at 03:24
  • 1
    @ Mark: Thank you so much for your reply! I hope someone might be able to figure out something that we are missing :) – stats_noob Jul 30 '23 at 07:15
  • If you don't need to use selenium, you could certainly use Google's geocoding API to retrieve the geographical data as JSON/XML – Captain Hat Jul 31 '23 at 12:56
  • 1
    Google Maps' `robots.txt` disallows programmatic access to the `/maps` path. It makes sense since they offer a paid-for API for that purpose. If you are interested in the geometry of Canadian postal code areas, you could look into another source for that. It looks like these are in general not available for free though. – Till Jul 31 '23 at 15:56
  • There are similar questions on the network. The answers to the following suggest there is an open feature request to make the polygon available through the Maps API: https://stackoverflow.com/questions/12630696/where-how-can-i-get-polygon-data-from-google-maps-api . You could also try OpenStreetMap, which does provide polygons, as suggested in https://gis.stackexchange.com/questions/183248/getting-polygon-boundaries-of-city-in-json-from-google-maps-api . – kubatucka Aug 04 '23 at 16:26
  • Do you need to scrape / use Google data? It would be easier to use the OSM API, and the Canadian government / authorities are most likely publishing official boundaries of postal codes. If the data (and not the Selenium code) is your priority, you should seek answers on [GIS stackexchange](https://gis.stackexchange.com) – vidlb Aug 05 '23 at 10:32
  • @vidlb : there is no such official boundary file published unfortunately. Can this be done using osm? – stats_noob Aug 05 '23 at 11:34
  • Well there's the [postal code](https://wiki.openstreetmap.org/wiki/Key:postal_code) tag but if the geometries aren't published by anyone, it's most likely also missing from OSM. – vidlb Aug 05 '23 at 11:41
  • An alternative solution would be to generate envelopes from address point / line data, which is easier to find (I wouldn't be surprised if that's how Google does it!) – vidlb Aug 05 '23 at 11:46
  • @vidlb: thank you for your reply! Can you please show me how to do this? Any links online? Thank you so much! – stats_noob Aug 05 '23 at 12:16
  • 1
    Get the data [here](https://www.statcan.gc.ca/en/lode/databases/oda) then do some research about GIS, download QGIS, search for "convex hull" or "minimum bounding geometry". There are also python / r packages for this. You may find existing answers on gis.stackexchange.com – vidlb Aug 05 '23 at 20:18
  • @vidlb: thank you for this suggestion! this is a really cool answer! I spent the whole night reading about this and have started to attempt this and made some progress! would you like to see my work? (i.e. I can post a new question with my progress) – stats_noob Aug 06 '23 at 15:12
  • Just a question - why convex hull and not concave hull? – stats_noob Aug 06 '23 at 15:50
  • You're right ! I said convex hull because it's the most known / used, but for your problem concave hull is the one you need ;) – vidlb Aug 08 '23 at 12:57
  • And yes I'd be happy to see the result ! – vidlb Aug 08 '23 at 12:57
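
The hull idea discussed in the comments above can be sketched in a few lines. This is a minimal pure-Python convex hull (Andrew's monotone chain, chosen here for brevity; it is not necessarily what QGIS or any GIS package uses). The address points are hypothetical placeholders, not real data. As the comments note, a concave hull would follow the true boundary more closely, so treat this as a first approximation only.

```python
def convex_hull(points):
    """Return the convex hull of 2D points in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # > 0 when o -> a -> b makes a counter-clockwise turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other one
    return lower[:-1] + upper[:-1]


# Hypothetical (longitude, latitude) address points sharing one postal code
address_points = [
    (-79.388, 43.642),
    (-79.385, 43.642),
    (-79.385, 43.644),
    (-79.388, 43.644),
    (-79.3865, 43.643),  # interior point, should not appear in the hull
]
hull = convex_hull(address_points)
print(hull)
```

With real data, the input would be the coordinates of all addresses that share a postal code, for example taken from the open address database linked in the comments.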

3 Answers

1

You could use the PyAutoGUI library to scan the screen for the RGB colour of the red outline, move the mouse to that point, right-click, and then use a text-recognition library (such as pytesseract) to read the latitude and longitude coordinates that appear in the right-click menu. I'm not sure about the text recognition, but the PyAutoGUI part is easy to implement, in around 10 lines of code. This is an example of how you could implement it:

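import pyautogui

def find_and_right_click(color):
    # Take one screenshot and scan it; calling pyautogui.pixel() per pixel is very slow
    screenshot = pyautogui.screenshot().convert("RGB")
    width, height = screenshot.size
    for x in range(width):
        for y in range(height):
            if screenshot.getpixel((x, y)) == color:
                pyautogui.moveTo(x, y)
                pyautogui.rightClick()
                return (x, y)  # stop at the first match instead of clicking every red pixel
    return None  # colour not found on screen

# RGB value of the red outline
target_color = (234, 68, 54)

find_and_right_click(target_color)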

And here is how to do part of it with Selenium:

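from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.google.com/maps/place/Toronto,+ON+M5V+3L9/@43.642566,-79.3875851,18z/data=!4m6!3m5!1s0x882b34d436f9c825:0x9e9c6195e38030f2!8m2!3d43.6429129!4d-79.3853443!16s%2Fg%2F1tvq4rqd?entry=ttu")

# Placeholder target: replace <body> with the element you want to right-click
element = driver.find_element(By.TAG_NAME, "body")
actions = ActionChains(driver)
actions.context_click(element).perform()  # right-click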

From my understanding, you cannot get the RGB value of an individual pixel with Selenium, only the RGB value of an element. This means we won't be able to find the pixels we want to move the cursor to (the red boundary) using Selenium alone. Installing PyAutoGUI would be much easier (pip install pyautogui).
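
The text-recognition step is not shown above. As a rough sketch of its last part, the snippet below parses a latitude/longitude pair out of a string such as the one an OCR call (for example pytesseract's `image_to_string`) might return for the right-click menu. The sample text and the exact menu format are assumptions, and `parse_coords` is a hypothetical helper name.

```python
import re

# A "latitude, longitude" pair in decimal degrees, e.g. "43.64257, -79.38759"
COORD_PATTERN = re.compile(r"(?<![\d.])(-?\d{1,2}\.\d+)\s*,\s*(-?\d{1,3}\.\d+)")


def parse_coords(text):
    """Return (lat, lng) as floats for the first valid pair in text, or None."""
    for match in COORD_PATTERN.finditer(text):
        lat, lng = float(match.group(1)), float(match.group(2))
        # Reject matches that cannot be geographic coordinates
        if -90 <= lat <= 90 and -180 <= lng <= 180:
            return lat, lng
    return None


# Assumed OCR output for the context menu (made-up sample text)
ocr_text = "43.64257, -79.38759\nDirections from here"
print(parse_coords(ocr_text))
```

OCR output is often noisy, so the range check and the `None` fallback are there to avoid silently recording bad points.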

SSam202020
  • @ SSam202020: thank you so much for your answer! can you please show me how this can be used alongside with selenium? – stats_noob Aug 01 '23 at 22:18
  • @stats_noob I have edited the post to add selenium code :) – SSam202020 Aug 01 '23 at 22:37
  • thank you so much! If you have time on your side - I would be curious to see if you can get this to work on your computer? thank you for all your help! – stats_noob Aug 01 '23 at 22:40
  • Np :), but the pyautogui code is working perfectly fine, selenium isn't needed here. Why do you need to use selenium? – SSam202020 Aug 01 '23 at 22:42
  • I am new to pyautogui ... I am trying to understand how this would work. Here is the hyperlink to the map in the question: https://www.google.com/maps/place/Toronto,+ON+M5V+3L9/@43.642566,-79.3875851,18z/data=!4m6!3m5!1s0x882b34d436f9c825:0x9e9c6195e38030f2!8m2!3d43.6429129!4d-79.3853443!16s%2Fg%2F1tvq4rqd?entry=ttu ....... based on this ... how would I use your code from start to finish? i.e. in your code, where would I define the hyperlink to the map? – stats_noob Aug 01 '23 at 22:46
  • maybe you can please modify your answer to show where I would "feed" this link to the code you have? thank you so much! – stats_noob Aug 01 '23 at 22:47
  • Ah ok, I see what you mean now. Yes, for this case you would need to use selenium, as you cannot do that purely in pyautogui without using other libraries. You could use pyautogui for finding the coords and selenium for opening the hyperlink and copying the coords. Just asking, are you new to selenium as well? In which case I can give you a starting point for using it in this situation. – SSam202020 Aug 01 '23 at 22:51
1
library(RSelenium)
library(wdman)
library(netstat)

selenium()
selenium_object <- rsDriver(browser = "chrome", chromever = "114.0.5735.90", verbose = FALSE, port = free_port())

remDr <- selenium_object$client

remDr$navigate("https://www.google.com/maps/place/Toronto,+ON+M5V+3L9/@43.642566,-79.3875851,18z/data=!4m6!3m5!1s0x882b34d436f9c825:0x9e9c6195e38030f2!8m2!3d43.6429129!4d-79.3853443!16s%2Fg%2F1tvq4rqd?entry=ttu")
toyota Supra
arta rzv
  • thank you for your answer! but what exactly is this supposed to do? – stats_noob Aug 05 '23 at 05:08
  • As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Aug 08 '23 at 23:56
1

# Import the required library
from geopy.geocoders import Nominatim

# Initialize Nominatim API
geolocator = Nominatim(user_agent="MyApp")

location = geolocator.geocode("Hyderabad")

print("The latitude of the location is: ", location.latitude)
print("The longitude of the location is: ", location.longitude)
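
The snippet above returns a single point, whereas the question asks for a boundary. As a hedged sketch, geopy's `Nominatim.geocode` accepts a `geometry="geojson"` argument, and when OpenStreetMap holds a polygon for the place it is returned under the `"geojson"` key of `location.raw`. Whether a polygon exists for a given postal code depends on the OSM data, so the result may be empty. `exterior_ring` and `fetch_boundary` are hypothetical helper names.

```python
def exterior_ring(geojson):
    """Return the outer boundary of a GeoJSON Polygon/MultiPolygon as (lat, lng) tuples."""
    kind = geojson.get("type")
    if kind == "Polygon":
        ring = geojson["coordinates"][0]
    elif kind == "MultiPolygon":
        ring = geojson["coordinates"][0][0]  # first polygon only
    else:
        return []  # e.g. a Point: no boundary available
    # GeoJSON positions are [longitude, latitude]; swap them to (lat, lng)
    return [(pos[1], pos[0]) for pos in ring]


def fetch_boundary(query):
    """Geocode `query` with Nominatim and return its boundary, if any (needs network)."""
    # Imported here so exterior_ring can be used without geopy installed
    from geopy.geocoders import Nominatim

    geolocator = Nominatim(user_agent="MyApp")
    location = geolocator.geocode(query, geometry="geojson")
    if location is None:
        return []
    return exterior_ring(location.raw.get("geojson", {}))


# Offline demonstration with a hand-made polygon (made-up coordinates)
sample = {
    "type": "Polygon",
    "coordinates": [[[-79.39, 43.64], [-79.38, 43.64], [-79.38, 43.65], [-79.39, 43.64]]],
}
print(exterior_ring(sample))
```

Calling `fetch_boundary("Hyderabad")` would query the live Nominatim service, which is rate-limited, so it is left out of the demonstration.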