I am trying to open a URL and read the data it returns. The URL itself is valid, but when I pass it to `urlopen` it raises a "404 Not Found" error. Please help. Thank you! Code below.

```python
from bs4 import BeautifulSoup
import urllib.request as ur

# Enter a stock symbol
index = "MSFT"
# URL link
url_is = "https://finance.yahoo.com/quote/" + index + "/financials?p=" + index
url_bs = "https://finance.yahoo.com/quote/" + index + "/balance-sheet?p=" + index
url_cf = "https://finance.yahoo.com/quote/" + index + "/cash-flow?p=" + index

read_data = ur.urlopen(url_is).read()
soup_is = BeautifulSoup(read_data, "lxml")
```
Isaac
  • If the URL shown by `print(url_is)` opens in a browser, it's probably some sort of anti-bot measure. Try [adding a normal-looking user-agent](https://stackoverflow.com/questions/24226781/changing-user-agent-in-python-3-for-urrlib-request-urlopen) to your request; that works most of the time. – jedwards Mar 20 '22 at 20:29
  • Servers may check many elements to stop bots/spammers/hackers. The first is the `User-Agent` header. Servers may also need it to send different content to different browsers and different devices (phone, notebook, desktop). – furas Mar 20 '22 at 20:32
  • Maybe try `requests`; sometimes it is simpler to do something in `requests` than in `urllib`. – furas Mar 20 '22 at 20:35
  • I would actually suggest using the `yfinance` library for interacting with Yahoo Finance programmatically. – evanstjabadi Mar 20 '22 at 20:42
  • @evanstjabadi Thank you for the suggestion. I will check it out. – Isaac Mar 22 '22 at 22:32
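
The `User-Agent` suggestion from the comments could look roughly like the sketch below, which stays with `urllib` and wraps the URL in a `Request` that carries a browser-like header. The helper names `build_request` and `fetch_html` and the `BROWSER_UA` string are made up for illustration; they are not from the question. Whether Yahoo accepts this particular header value is an assumption, so treat it as a starting point, not a guaranteed fix.

```python
import urllib.request as ur

# Assumed browser-like User-Agent string (hypothetical value, not from the
# question); the string a current browser sends should work equally well.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_UA):
    """Wrap the URL in a Request that carries a browser-like User-Agent."""
    return ur.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, timeout=10):
    """Open the URL with the custom header and return the raw response bytes."""
    with ur.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

In the question's code, `read_data = ur.urlopen(url_is).read()` would then become `read_data = fetch_html(url_is)`, and the `BeautifulSoup(read_data, "lxml")` call stays the same. If the server still rejects the request, it may be checking more than this one header.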