I am using pytrends to download search interest in newspapers by metro area. Here is an example for one metro area (Austin, TX):
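from pytrends.request import TrendReq
import pandas as pd

# Geo code for the Austin, TX metro area
code = 'US-TX-635'
papers = ['The Wall Street Journal', 'New York Post', 'The New York Times',
          'Boston Herald', 'San Francisco Chronicle']

pytrend = TrendReq()
# Same category filter (cat=408) and full time range as the query on the site
pytrend.build_payload(kw_list=papers, cat=408, timeframe='all', geo=code)
# DataFrame with one column per paper, plus an 'isPartial' flag column
test = pytrend.interest_over_time()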
I understand that Google Trends results involve some randomness (referenced in this post), but the differences I am seeing are larger than that randomness should produce, and they persist even when I take many samples and average across them. For example, when I search for the five newspapers on the Google Trends site, the exact numbers vary, but the papers always rank in the same order of popularity: New York Times, Wall Street Journal, New York Post, San Francisco Chronicle, Boston Herald. None of the samples I get from pytrends reproduces this ordering. Further, as one would expect, the site data shows search interest for most of the papers peaking during the financial crisis, but the pytrends data does not show this either.
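To make the averaging check concrete, here is a minimal sketch of ranking the papers by mean interest across several samples. It is not the exact script used for the results above: `rank_by_mean_interest` is a hypothetical helper name, and the numbers in `samples` are made up purely for illustration. Each sample is assumed to be anything with an `.items()` method that yields (keyword, values) pairs, which covers both a plain dict and the DataFrame returned by `interest_over_time()`.

```python
from statistics import fmean


def rank_by_mean_interest(samples):
    """Return keywords ordered from highest to lowest mean interest.

    Values are pooled over every time point of every sample before
    averaging. `samples` is an iterable of mappings (or DataFrames)
    from keyword to a sequence of interest values.
    """
    pooled = {}
    for sample in samples:
        for keyword, values in sample.items():
            # pytrends adds an 'isPartial' flag column; it is not a keyword.
            if keyword == 'isPartial':
                continue
            pooled.setdefault(keyword, []).extend(float(v) for v in values)
    means = {keyword: fmean(values) for keyword, values in pooled.items()}
    return sorted(means, key=means.get, reverse=True)


# Made-up illustrative numbers, NOT real Google Trends output.
samples = [
    {'The New York Times': [60, 80, 70],
     'New York Post': [20, 25, 30],
     'Boston Herald': [5, 10, 6]},
    {'The New York Times': [65, 75, 72],
     'New York Post': [22, 24, 28],
     'Boston Herald': [4, 9, 7]},
]

print(rank_by_mean_interest(samples))
```

With real data, `samples` would be a list of DataFrames collected by repeating the `build_payload` and `interest_over_time()` calls, and the resulting order can be compared directly with the ranking shown on the site.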
For reference, here is the query I ran on the site.
Does anyone know why this might be happening or if there is another API that might yield more accurate results?