15

Sorry for this question, I am new to this.

I have a project where I need to scrape Google Maps to find all the companies in a region. I only heard about scraping when we decided on the project. I have done some research and found that most scraping services require a specific company category to start the search, but I need ALL companies in that area. Can someone explain how I should start?

I saw that in this thread, "Is it ok to scrape data from Google results?", they talk about IPs. I know that ISPs allocate public IP addresses to certain regions, but how do I go about using that to scrape?

I also came across an article that said I had to use a Google API, but looking at their website, https://developers.google.com/maps/web-services/, which API should I use?

I am using an Ubuntu system. If I need to install anything, should I use a Windows OS instead?

Thanks and kind regards

UPDATE :

I found at http://py-googlemaps.sourceforge.net/ that I could use this Python code:

local = gmaps.local_search('cafe near ' + destination)
print local['responseData']['results'][0]['titleNoFormatting']
Vie De France Bakery & Cafe

If I replace "cafe" with "companies" or whatever name, I believe I will get the information I am looking for, right? Also, could someone tell me how to go about getting into the configuration interface?

BeachSamurai
  • 1
    Read about [google text search](https://developers.google.com/places/web-service/search#TextSearchRequests) and it may help you solve your problem. If all your companies have some particular word in common you can text search for the common word to get the places. Implementing in Javascript would be easier if you already know JS. – ihimv Apr 11 '16 at 08:18
  • Does bash work? Or python? – BeachSamurai Apr 11 '16 at 11:27
  • Actually, their terms prohibit scraping, and they probably have protection mechanisms, but they ultimately can not prevent scraping. – neverMind9 Jul 12 '18 at 20:42
  • for this reason OpenStreetMap is used and advocated for. You can go on https://overpass-turbo.eu/ and directly pull the data you need by querying by tags. Can save it in your preferred format. – Nikhil VJ Mar 25 '22 at 10:33
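
Following the OpenStreetMap suggestion in the last comment, here is a rough sketch of how a "businesses in a bounding box" query could be built for the Overpass API from Python. The tag keys (`office`, `shop`, `craft`) are only my guess at what counts as a "company" in OSM tagging, and the helper names and the choice of the public `overpass-api.de` endpoint are assumptions, not something from the comment.

```python
import json
import urllib.parse
import urllib.request

# Public Overpass endpoint (assumption: any Overpass instance would do).
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_overpass_query(south, west, north, east, tag_keys=("office", "shop", "craft")):
    """Build an Overpass QL query for nodes and ways carrying any of tag_keys.

    Overpass QL expects the bounding box as (south, west, north, east).
    The default tag keys are an illustrative guess, not a complete list.
    """
    bbox = f"({south},{west},{north},{east})"
    clauses = []
    for key in tag_keys:
        clauses.append(f'  node["{key}"]{bbox};')
        clauses.append(f'  way["{key}"]{bbox};')
    body = "\n".join(clauses)
    return f"[out:json][timeout:60];\n(\n{body}\n);\nout center;"


def fetch_overpass(query):
    """POST the query to the Overpass API and return parsed JSON (needs network)."""
    data = urllib.parse.urlencode({"data": query}).encode("utf-8")
    with urllib.request.urlopen(OVERPASS_URL, data=data, timeout=90) as response:
        return json.load(response)


if __name__ == "__main__":
    # Rough bounding box over part of Manhattan, purely as an example.
    print(build_overpass_query(40.74, -74.01, 40.76, -73.98))
```

The printed query can also be pasted into overpass-turbo.eu directly to check what it returns before automating anything.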

2 Answers

5

You can use the google-search-results package (SerpApi's Python client) to scrape Google Maps.

Full example at Repl.it.

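import os
from serpapi import GoogleSearch

# Search parameters: "ll" is "@latitude,longitude,zoom"; the API key is read
# from the API_KEY environment variable.
params = {
    "engine": "google_maps",
    "q": "coffee",
    "type": "search",
    "ll": "@40.7455096,-74.0083012,14z",
    "api_key": os.getenv("API_KEY")
}

client = GoogleSearch(params)
data = client.get_dict()  # parsed JSON response as a dict

print("Local results")

# Optional fields (for example rating) can be absent, so fall back to "n/a".
for result in data.get('local_results', []):
    print(f"""
Title: {result.get('title', 'n/a')}
Address: {result.get('address', 'n/a')}
Rating: {result.get('rating', 'n/a')}
Reviews: {result.get('reviews', 'n/a')}""")

# Ads are only present for some queries.
if 'ads_results' in data:
    print("Ads")

    for result in data['ads_results']:
        print(f"""
Title: {result.get('title', 'n/a')}
Address: {result.get('address', 'n/a')}""")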

JSON Response

{
  "local_results": [
    {
      "position": 1,
      "title": "Birch Coffee",
      "data_id": "0x89c258ef40975c2b:0x4fa24ff965c3f3e",
      "gps_coordinates": {
        "latitude": 40.7638094,
        "longitude": -73.9666075
      },
      "rating": 4.5,
      "reviews": 477,
      "price": "$$",
      "type": "Coffee shop",
      "address": "134 1/2 E 62nd St, New York, NY 10065",
      "hours": "Open until 7:00 PM",
      "phone": "(212) 686-1444",
      "website": "http://www.birchcoffee.com/",
      "description": "Hip spot offering house-roasted brews. Local coffeehouse chain serving thoughtfully-sourced, house-roasted brews in a hip, bustling space.",
      "thumbnail": "https://lh5.googleusercontent.com/p/AF1QipPy035-T0IVHuC3CffD8UEf0n70HkkZXvkb7gSJ=w122-h92-k-no"
    },
    {
      "position": 2,
      "title": "Think Coffee",
      "data_id": "0x89c259ca0a28731f:0xd3d13e0daf7fae6c",
      "gps_coordinates": {
        "latitude": 40.7522222,
        "longitude": -74.0016667
      },
      "rating": 3.9,
      "reviews": 467,
      "price": "$$",
      "type": "Coffee shop",
      "address": "500 W 30th St, New York, NY 10001",
      "website": "http://www.thinkcoffee.com/",
      "thumbnail": "https://lh5.googleusercontent.com/p/AF1QipMIVRZJMr-bnGKw28VTrctmhVYQOnIKBRj0NmnN=w122-h92-k-no"
    }
    
    // Stripped...
  ]
}
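
Some keys are optional in the response above (the second result has no phone or hours, for instance), so if you want to save the results it helps to flatten each entry with defaults. A minimal sketch, assuming you only care about the columns listed in `FIELDS`; the helper names are mine and not part of the package:

```python
import csv
import io

# Columns taken from the sample response; optional ones may be missing.
FIELDS = ["title", "type", "address", "phone", "website",
          "rating", "reviews", "latitude", "longitude"]


def flatten_result(result):
    """Turn one local_results entry into a flat dict, using "" for missing keys."""
    coords = result.get("gps_coordinates", {})
    row = {field: result.get(field, "") for field in FIELDS}
    row["latitude"] = coords.get("latitude", "")
    row["longitude"] = coords.get("longitude", "")
    return row


def results_to_csv(data):
    """Render data["local_results"] as CSV text (header row included)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for result in data.get("local_results", []):
        writer.writerow(flatten_result(result))
    return buffer.getvalue()


if __name__ == "__main__":
    # One entry copied from the sample response, without the optional phone.
    sample = {
        "local_results": [
            {
                "title": "Think Coffee",
                "type": "Coffee shop",
                "address": "500 W 30th St, New York, NY 10001",
                "website": "http://www.thinkcoffee.com/",
                "rating": 3.9,
                "reviews": 467,
                "gps_coordinates": {"latitude": 40.7522222, "longitude": -74.0016667},
            }
        ]
    }
    print(results_to_csv(sample))
```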

Output

Local results
Title: Think Coffee
Address: 73 8th Ave, New York, NY 10014
Rating: 4.2
Reviews: 741

Title: Birch Coffee @Flatiron
Address: 21 E 27th St, New York, NY 10016
Rating: 4.4
Reviews: 940

Title: Irving Farm New York
Address: 135 E 50th St, New York, NY 10022
Rating: 4.3
Reviews: 248

// Stripped...

Ads
Title: Gotham Coffee Roasters
Address: 23 W 19th St, New York, NY 10011
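
The example searches around a single `ll` center. To get closer to "all companies in a region", one option is to sweep a grid of centers over the region's bounding box, run one search per center, and merge the results on `data_id`. This is only a sketch of that idea: the grid spacing `step_deg` is a made-up default you would need to tune to your zoom level, and the helper names are hypothetical.

```python
import math


def grid_centers(south, west, north, east, step_deg=0.02):
    """Yield (lat, lng) centers covering the bounding box on a regular grid.

    step_deg is a hypothetical spacing in degrees; choose it so that
    neighbouring searches overlap at the zoom level you use.
    """
    # The small epsilon keeps floating-point noise from adding an extra row.
    rows = math.ceil((north - south) / step_deg - 1e-9) + 1
    cols = math.ceil((east - west) / step_deg - 1e-9) + 1
    for i in range(rows):
        for j in range(cols):
            yield (round(south + i * step_deg, 7), round(west + j * step_deg, 7))


def ll_param(lat, lng, zoom=14):
    """Format a center the way the example's "ll" parameter is written."""
    return f"@{lat},{lng},{zoom}z"


def merge_by_data_id(pages):
    """Combine several local_results lists, keeping one entry per data_id."""
    seen = {}
    for results in pages:
        for result in results:
            seen.setdefault(result["data_id"], result)
    return list(seen.values())


if __name__ == "__main__":
    for lat, lng in grid_centers(40.74, -74.01, 40.76, -73.99):
        print(ll_param(lat, lng))
```

Each printed value would replace `"ll"` in `params`, and the `local_results` list from every response would go into `merge_by_data_id`. Keep in mind that each grid cell is a separate request against your quota.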

Disclosure: I work at SerpApi.

ilyazub
-16

You are not legally allowed to scrape data from the Google Maps API. A better practice would be to store the place_id of each place and use it to retrieve the place's details later.

See the Google Maps terms of use:

10.1.3 Restrictions against Data Export or Copying.

(a) No Unauthorized Copying, Modification, Creation of Derivative Works, or Display of the Content. You must not copy, translate, modify, or create a derivative work (including creating or contributing to a database) of, or publicly display any Content or any part thereof except as explicitly permitted under these Terms. For example, the following are prohibited: (i) creating server-side modification of map tiles; (ii) stitching multiple static map images together to display a map that is larger than permitted in the Maps APIs Documentation; (iii) creating mailing lists or telemarketing lists based on the Content; or (iv) exporting, writing, or saving the Content to a third party's location-based platform or service.

(b) No Pre-Fetching, Caching, or Storage of Content. You must not pre-fetch, cache, or store any Content, except that you may store: (i) limited amounts of Content for the purpose of improving the performance of your Maps API Implementation if you do so temporarily, securely, and in a manner that does not permit use of the Content outside of the Service; and (ii) any content identifier or key that the Maps APIs Documentation specifically permits you to store. For example, you must not use the Content to create an independent database of "places" or other local listings information.

(c) No Mass Downloads or Bulk Feeds of Content. You must not use the Service in a manner that gives you or any other person access to mass downloads or bulk feeds of any Content, including but not limited to numerical latitude or longitude coordinates, imagery, visible map data, or places data (including business listings). For example, you are not permitted to offer a batch geocoding service that uses Content contained in the Maps API(s).
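
As a rough illustration of the place_id approach mentioned at the top of this answer: keep only the identifiers, then build a Place Details request when you need fresh data. This is a sketch, assuming the Places API web service's Place Details endpoint with its `place_id`, `fields`, and `key` parameters; the helper names, the file-based storage, and the placeholder ID are mine.

```python
import json
import urllib.parse

# Place Details endpoint of the Places API web service.
DETAILS_URL = "https://maps.googleapis.com/maps/api/place/details/json"


def details_url(place_id, api_key, fields=("name", "formatted_address")):
    """Build a Place Details request URL for a stored place_id."""
    query = urllib.parse.urlencode({
        "place_id": place_id,
        "fields": ",".join(fields),
        "key": api_key,
    })
    return f"{DETAILS_URL}?{query}"


def save_place_ids(place_ids, path):
    """Store only the identifiers (deduplicated and sorted) as a JSON list."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(sorted(set(place_ids)), handle)


def load_place_ids(path):
    """Read the identifiers back from the JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # "EXAMPLE_PLACE_ID" and "YOUR_API_KEY" are placeholders, not real values.
    print(details_url("EXAMPLE_PLACE_ID", "YOUR_API_KEY"))
```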

ihimv
  • 48
    What a terrible answer - this is not a forum to discuss ToS. Crawling public data is legal and discussing it does not break any stackexchange rules. OP asked how to do it, not whether it breaks google's terms of service. – Granitosaurus Apr 26 '18 at 05:06
  • 2
    I won't advise something which is not legal but you are welcome to do so at your own risk! – ihimv Apr 30 '18 at 09:53
  • 17
    "You should not scrape data from Google Maps API" that sounds like a challenge. – kjdion84 Jun 05 '18 at 23:41
  • 1
    I think it is important to point out such things. You obviously want to do things while staying within your legal limits. If a company has created terms of use for its product, they should be respected. – Sahil Singh Apr 14 '20 at 06:59
  • 1
    It may not be strictly TOS-compliant, but there are plenty of ethical use cases for scraping for research. The exceptions in their TOS may be possible to obey by randomly obfuscating the lat/lon data and only retaining your own query information (such as classification and ID). The terms are obviously laughably restrictive and protective of their (overwhelmingly volunteer-gathered) data. Personally, I try to be compliant by crawling only what I need instead of every inch of my search area. I'm sure if google decides to ban me despite all that it doesn't really matter what I believe though. – kkjelgard Feb 01 '22 at 17:30