I am trying to iterate through the symbols of different mutual funds and use them to scrape some information from their Morningstar profiles. The URL has the following form: https://www.morningstar.com/funds/xnas/ZVGIX/quote.html, where ZVGIX is the symbol. I have tried using XPath to find the data I need, but it returns empty lists. The code I used is below:

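import requests
from lxml import html

symbols = ['ZVGIX']  # fund ticker symbols to look up
for item in symbols:
    url = 'https://www.morningstar.com/funds/xnas/' + item + '/quote.html'
    page = requests.get(url)
    tree = html.fromstring(page.content)
    # Text of the nested spans inside the total-assets element
    totalAssets = tree.xpath('//*[@id="gr_total_asset_wrap"]/span/span/text()')
    print(totalAssets)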

According to "Blank List returned when using XPath with Morningstar Key Ratios" and "Web scraping, getting empty list", this happens because the page content is downloaded in stages. The answer to the first question suggests using selenium and chromedriver, but that is impractical given the amount of data I want to scrape. The answer to the second suggests the content can be loaded with further requests, but it does not explain how to formulate those requests. So, how can I apply that solution to my case?

Edit: The code above returns [], in case that was not clear.

MrKaplan

1 Answer

In case anyone else ends up here: I eventually solved my problem by analyzing the network requests made while the desired pages load (for example, in the browser's developer tools). Following those request URLs led to very simple HTML pages, each holding a different part of the original page. So rather than scraping one page, I ended up scraping around five pages for each fund.
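
For anyone who wants a starting point, here is a rough sketch of that approach rather than the exact code I used. The fragment URLs below are placeholders (example.com), since the real ones have to be copied from the network tab for your own pages. It is also only an assumption that the element id from the question appears in one of the fragments. The sketch sticks to the standard library so it runs without extra packages; requests and lxml from the question should work just as well.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Hypothetical fragments: name -> (URL template, id of the element to read).
# Replace the URLs with the ones seen in the browser's network tab.
FRAGMENTS = {
    "total_assets": ("https://example.com/fragment/header?t={symbol}",
                     "gr_total_asset_wrap"),
    "performance": ("https://example.com/fragment/performance?t={symbol}",
                    "performance_wrap"),
}

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TextById(HTMLParser):
    """Collect the text inside the first element that has the given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0       # > 0 while inside the target element
        self.done = False    # True once the target element has closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(fragment_html, element_id):
    """Return the text of the element with element_id, or '' if absent."""
    parser = TextById(element_id)
    parser.feed(fragment_html)
    return " ".join(parser.chunks)


def fetch_fragment(url):
    """Download one fragment page and return it as text."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape_fund(symbol, fetch=fetch_fragment):
    """Fetch every fragment for one fund and pull out the wanted text."""
    results = {}
    for name, (template, element_id) in FRAGMENTS.items():
        page = fetch(template.format(symbol=symbol))
        results[name] = extract_text(page, element_id)
    return results
```

Calling scrape_fund('ZVGIX') inside the loop over symbols would then make one request per fragment for each fund. The fetch parameter is only there so the parsing can be tried on saved HTML without hitting the network.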

MrKaplan