Also check Joel's Answer who did some research and refined some of the following information.
Pagination
You can use this tool to decrypt the pb-parameter. PB stands for protocol buffer (protobuf) and Google uses its own kind of it for maps. You can find different decoders for this by googling it.
In my case, the pagination was done via one parameter (8iX0). It seems, that it always comes with another similar parameter (7i20) but I don't know that it does. I can't yet confirm that this is always the case, but from my experience you're basically looking for two integers that are 20/40/60 etc. apart.
Here's what this looks like for me:
- page 2 (7i20, 8i20)
- page 3 (7i20, 8i40)
- page 4 (7i20, 8i60)
From this information, I tried 7i20 8i00 for page 1, that seemed to work. For lists with >100 items, it just continues like that (8i120, 8i140 etc.)
Here's a code snippet in python (quick & dirty). Make sure to add (long) delays if your list has many pages as you will get rate-limited by captchas eventually if you don't. Notice the 8i%s0 in the url, make sure to put the %s back when you paste your pb-block.
url = "https://www.google.com:443/search?tbm=map&pb=!7i20!8i%s0!..."
headers = {"Referer": "https://www.google.com/"}
def fetch_stops_from_maps():
new_results = -1
page = 0
results = []
while new_results != 0:
new_results = 0
x = requests.get(url % page, headers=headers)
txt = html.unescape(x.text)
txt = txt.split("\n")[1]
results = re.findall(r"\[null,null,[0-9]{1,2}\.[0-9]{4,15},[0-9]{1,2}\.[0-9]{4,15}]", txt)
print(len(results))
for cord in results:
# curr = the description you can manually type in when saving
curr = txt.split(cord)[1].split("\"]]")[0]
curr = curr[curr.rindex(",\"") + 2:]
cords = str(cord).split(",")
lat = cords[2]
lon = cords[3][:-1]
results.append(s)
new_results += 1
page += 2
Actually getting the correct url
Getting the correct url currently seems to be the hardest part when doing this and I have not fully figured this out aswell. However, for my use-case this is not really important, so I extracted the correct pb-block once and called it a day.
As explained in the other answers, the id of the list is visible in the basic url (here, the 2sXX...) when you navigate to the list in your browser. It seems to usually be 24-32 (?) characters long.
.../maps/<coords>/data=!4m3!11m2!2sXXXX...XXXX!3e3
If you have this id, you can put it into an existing protobuf-block and it may work (I only tested this with 3 different lists, which were all created by the same account, so this theory is far from proven).
Now, how do you get the block? I would just share the one I have, but because I only understand parts of what it does, I fear that it may contain some personal info. Instead, I will share my process of getting it. For this I use Burpsuite. It's a program mainly used for web-security testing and has a free community edition, however for our use-case it is the perfect tool, because with it you can easily tinker with requests, change small parts in the request, send it again and immediately see if your changes changed the response. However for extracting the pb-block, one should also be able to use any program that can intercept browser traffic.
Heres the basic rundown with burp:
From GMaps, share a list that has >20 items (this is important) and copy the public link
In Burp, go to the tab "Proxy", make sure "Intercept" is off and click "Open browser" to open the integrated chromium browser
There, paste the link and wait until maps loaded completely
In Burp, turn "Intercept" on, then in google maps, scroll down in the list, until it starts loading new results (always blocks of 20)
Burp now intercepted all requests the browser made since you turned intercepting on. Click "Forward" and go through all requests, until you see a request in the format
GET /search?tbm=map&authuser=0&hl=de&gl=de&pb=!7i20....
This is what you're looking for.
Optionally, you can now right click into the request-text and click "send to repeater", then switch to the repeater-tab. Here you can edit the request and then send it again, being able to see the response immediately. For example, removing the authuser, hl, gl, q, ech, psi
url parameters, the request still works flawlessly. If you remove the tch=1
parameter, the response you get will be in a more human readable format.
In the request-text you should now be able to just search for the list-id you got from the link previously and replace it with the id of another list (search bar is at the bottom in burp). As I said, this worked for me, but it may be possible that the pb-block contains some additional metadata that makes lists from different google-accounts or different types of lists incompatible with specific pb-blocks. Just a theory though. Let me know how it goes!
Further automating
I have theorised that one could automate getting the pb-block using requests-html because it can load html-sites fully but it doesn't get updated anymore. Another option (probably the better one) is Selenium Wire, as you should be able to load the page and intercept the requests, like we did in burp. Seems like a whole lot of work tho :D