
I am trying to get the hyperlink (href) of an anchor (a) element, but I keep getting:

https://in.finance.yahoo.com/https://in.finance.yahoo.com/

I have tried all solutions provided here: link

Here's my code:

import requests
import pandas as pd
from bs4 import BeautifulSoup

href_links = []
symbols = []
prices = []
commodities = []

CommoditiesUrl = "https://in.finance.yahoo.com/commodities"
r = requests.get(CommoditiesUrl)
data = r.text
soup = BeautifulSoup(data)

counter = 40
for i in range(40, 404, 14):
    for row in soup.find_all('tbody'):
        for srow in row.find_all('tr'):
            for symbol in srow.find_all('td', attrs={'class':'data-col0'}):
                symbols.append(symbol.text)
                href_link = soup.find('a').get('href')
                href_links.append('https://in.finance.yahoo.com/' + href_link)
            for commodity in srow.find_all('td', attrs={'class':'data-col1'}):
                commodities.append(commodity.text)
            for price in srow.find_all('td', attrs={'class':'data-col2'}):
                prices.append(price.text)


pd.DataFrame({"Links": href_links, "Symbol": symbols, "Commodity": commodities, "Prices": prices })

Also, I would like to know whether it's feasible, as on the website, to have the symbol of the commodity as a hyperlink in my pandas dataframe.


Alan
  • What on earth is the outermost `for` loop for? In `for i in range(40, 404, 14):`, `i` isn't even referenced in the body of the loop. – GordonAitchJay Mar 15 '20 at 08:41

1 Answer


I'm not sure what's going on with the code you posted, but you can get that URL by finding an `a` element whose `data-symbol` attribute is set to `GC=F`. The HTML has two such elements. The one you want is the first, which is what `soup.find('a', {'data-symbol': 'GC=F'})` returns; calling `.get('href')` on it gives you the link.

import requests
from urllib.parse import urljoin

from bs4 import BeautifulSoup

CommoditiesUrl = "https://in.finance.yahoo.com/commodities"
r = requests.get(CommoditiesUrl)
data = r.text
# Name the parser explicitly so bs4 doesn't have to guess one
soup = BeautifulSoup(data, "html.parser")

# find() returns the first matching element
gold_href = soup.find('a', {'data-symbol': 'GC=F'}).get('href')

# If it is a relative URL, we need to transform it into an absolute URL (it always is, fwiw)
if not gold_href.startswith('http'):
    # If you insist, you can do "https://in.finance.yahoo.com" + gold_href
    gold_href = urljoin(CommoditiesUrl, gold_href)

print(gold_href)
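
The same idea extends to every row if you look for the anchor inside each symbol cell rather than in the whole document. Below is a rough sketch, not tested against the live page: `SAMPLE_HTML` is a made-up, simplified stand-in for the real markup (the `href` values in it are assumptions), and `BASE_URL`, `rows` and `first_anchor_href` are names I picked for illustration. It also shows that `soup.find('a')` returns the first anchor of the document every time, which would explain the doubled URL in the question.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://in.finance.yahoo.com/commodities"

# Hypothetical, simplified stand-in for the page's table markup.
SAMPLE_HTML = """
<a href="https://in.finance.yahoo.com/">Home</a>
<table><tbody>
  <tr>
    <td class="data-col0"><a href="/quote/GC=F?p=GC=F" data-symbol="GC=F">GC=F</a></td>
    <td class="data-col1">Gold</td>
  </tr>
  <tr>
    <td class="data-col0"><a href="/quote/SI=F?p=SI=F" data-symbol="SI=F">SI=F</a></td>
    <td class="data-col1">Silver</td>
  </tr>
</tbody></table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

rows = []
for cell in soup.find_all("td", attrs={"class": "data-col0"}):
    # Search inside this cell only, not the whole document
    anchor = cell.find("a")
    if anchor is None:
        continue
    # urljoin resolves relative hrefs and leaves absolute ones untouched
    link = urljoin(BASE_URL, anchor.get("href"))
    rows.append((cell.get_text(strip=True), link))

# For comparison: the document-wide lookup always yields the same first anchor
first_anchor_href = soup.find("a").get("href")

for symbol, link in rows:
    print(symbol, link)
```

Because `urljoin` already copes with both relative and absolute values, the `startswith('http')` check is not needed in this variant.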

Also, I would like to know whether it's feasible, as on the website, to have the symbol of the commodity as a hyperlink in my pandas dataframe.

I'm not familiar with pandas, but I'd say the answer is yes. See: "How to create a table with clickable hyperlink in pandas & Jupyter Notebook"
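
For what it's worth, here is one possible sketch of that approach, assuming you already have the symbols and their absolute links: wrap each symbol in an `<a>` tag and render the frame with `DataFrame.to_html(escape=False)` so the tags are not escaped. The sample rows, the URLs, and the helper name `as_anchor` are made up for illustration.

```python
from html import escape

import pandas as pd

# Made-up sample rows; in practice these come from the scraped lists.
df = pd.DataFrame({
    "Symbol": ["GC=F", "SI=F"],
    "Link": [
        "https://in.finance.yahoo.com/quote/GC=F",
        "https://in.finance.yahoo.com/quote/SI=F",
    ],
    "Commodity": ["Gold", "Silver"],
})


def as_anchor(row):
    """Build an HTML anchor that shows the symbol and points at its link."""
    return '<a href="{}">{}</a>'.format(escape(row["Link"]), escape(row["Symbol"]))


df["Symbol"] = df.apply(as_anchor, axis=1)

# Lift the column-width limit so long anchor strings cannot be truncated
with pd.option_context("display.max_colwidth", None):
    # escape=False keeps the <a> tags as real HTML instead of literal text
    table_html = df.drop(columns="Link").to_html(escape=False, index=False)

print(table_html)
```

In a Jupyter notebook, passing `table_html` to `IPython.display.HTML` should display the table with clickable symbols. Only use `escape=False` on content you trust, since it lets any HTML in the cells through.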

GordonAitchJay