I am trying to write a Python program to gather data from Google Trends (GT) - specifically, I want to automatically open URLs and read the values displayed in the line graphs:
I would be happy either downloading the CSV files or web-scraping the values (based on my reading of Inspect Element, cleaning the data would only require a simple split or two). I have many searches to run (many different keywords).
I am generating many URLs to gather data from Google Trends, modelled on the URL from a test search. Example of a URL: https://trends.google.com/trends/explore?q=sports%20cars&geo=US Opening this URL in a browser shows the relevant GT page; the problem comes when I try to access it from a program.
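For what it's worth, building the URLs for many keywords needs nothing beyond the standard library. A minimal sketch (the helper name `trends_url` is mine, and `quote_via=quote` is only there to get `%20` for spaces, matching the browser URL above):

```python
from urllib.parse import urlencode, quote

def trends_url(keyword, geo="US"):
    # quote_via=quote percent-escapes spaces as %20 (the default would use '+')
    query = urlencode({"q": keyword, "geo": geo}, quote_via=quote)
    return "https://trends.google.com/trends/explore?" + query

urls = [trends_url(kw) for kw in ["sports cars", "electric bikes"]]
```

`trends_url("sports cars")` reproduces the example URL exactly.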
Most answers I have seen suggest third-party packages from PyPI (e.g. pytrends, the "Unofficial Google Trends API"), but my project manager has insisted I not use modules that are not created by the site itself (i.e. APIs are acceptable, but only an official Google API). Only BeautifulSoup has been sanctioned as a third-party package (don't ask why).
Below is an example of the code I have tried. I know it is basic, but on the very first request I got:
HTTPError: HTTP Error 429: unknown (too many requests)
Some answers to other questions mention a "Google Trends API" - is that real? I could not find any documentation for an official API.
Here is another post outlining a solution that I tried without success:
https://codereview.stackexchange.com/questions/208277/web-scraping-google-trends-in-python
from urllib.request import urlopen
from bs4 import BeautifulSoup

def get_divs():
    url = 'https://trends.google.com/trends/explore?q=sports%20cars&geo=US'
    html = urlopen(url).read()  # this line raises HTTP Error 429
    soup = BeautifulSoup(html, 'html.parser')
    return soup.find_all('div')
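One variant I have been considering is sending a browser-like User-Agent and backing off when a 429 comes back. This is only a sketch (the `fetch` helper and the retry parameters are mine), and as far as I can tell it may still not be enough: the explore page appears to be rendered by JavaScript, so the graph values likely are not in the raw HTML that urllib receives.

```python
import time
from urllib.request import Request, urlopen
from urllib.error import HTTPError

# An illustrative browser-like User-Agent string; any current browser UA works.
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"

def fetch(url, retries=3, backoff=60):
    """Request url with a browser-like User-Agent, sleeping and retrying on 429."""
    req = Request(url, headers={"User-Agent": USER_AGENT})
    for attempt in range(retries):
        try:
            return urlopen(req).read()
        except HTTPError as e:
            if e.code == 429 and attempt < retries - 1:
                time.sleep(backoff * (attempt + 1))  # wait longer on each retry
            else:
                raise
```

Usage would be `html = fetch(url)` in place of the bare `urlopen(url).read()` above, but I still get rate-limited eventually.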