I want to get the value of the data-pair-id attribute from a div, which should be "14958".

This is my code:

import requests
from bs4 import BeautifulSoup

urlheader = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest"
}

url = "https://www.investing.com/indices/nasdaq-composite"
req = requests.get(url, headers=urlheader)
soup = BeautifulSoup(req.content, "lxml")
x = soup.find('div', id="data-pair-id")

But x comes up blank.

What's wrong with my code?

anarchy

2 Answers

1
import requests
from bs4 import BeautifulSoup
import re

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:76.0) Gecko/20100101 Firefox/76.0'
}


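def main(url):
    r = requests.get(url, headers=headers)
    soup = BeautifulSoup(r.content, 'html.parser')
    # Match the first div that has a data-pair-id attribute, then read its value.
    target = soup.find("div", {'data-pair-id': True}).get('data-pair-id')
    # Pull the smlID value out of the inline JavaScript in the raw page text.
    match = re.search(r'smlID = (.*?);', r.text).group(1)
    print(target)
    print(match)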


main("https://www.investing.com/indices/nasdaq-composite")

Output:

14958
2035293
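
The regex step can be tried without hitting the site. This is only a minimal sketch: sample_html is a made-up stand-in for r.text, assuming the page contains a line of the form window.siteData.smlID = 2035293; and extract_sml_id is a hypothetical helper name. It uses a slightly stricter pattern (digits only) and returns None instead of raising when nothing matches.

```python
import re

# Hypothetical stand-in for r.text; the real page's markup may differ.
sample_html = """
<script>
window.siteData = {};
window.siteData.smlID = 2035293;
</script>
"""


def extract_sml_id(text):
    """Return the smlID digits as a string, or None if the pattern is absent."""
    match = re.search(r"smlID\s*=\s*(\d+)\s*;", text)
    return match.group(1) if match else None


print(extract_sml_id(sample_html))
```

Checking for None before calling .group(1) avoids an AttributeError if the site changes its script.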
  • If you have some extra time, could you see if it's also possible to get window.siteData.smlID = 2035293 from the JavaScript part of the webpage? I need the "2035293". – anarchy May 23 '20 at 11:39
0

On the given page, data-pair-id appears in only two places. In both, it is not the div's id but a separate attribute on the div, and its value is 14958. That is why find('div', id="data-pair-id") matches nothing.

So, to read the value, find the first div that has a data-pair-id attribute by passing a second argument that names the required attribute, then read that attribute from the result.

div = soup.find('div', {"data-pair-id": True})
print(div.get('data-pair-id'))

See: https://stackoverflow.com/a/39055066/11890300
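
The difference between the lookups can be checked offline. This is a rough sketch on made-up markup: the first div copies the attributes reported in the comments below, the second div and its class name are assumptions, and the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the first div's attributes come from this thread.
html = (
    '<div id="sideNotificationZone" class="sideNotificationZone"></div>'
    '<div class="example" data-pair-id="14958"></div>'
)
soup = BeautifulSoup(html, "html.parser")

# Looks for id="data-pair-id", which no element has, so this is None.
by_id = soup.find("div", id="data-pair-id")

# A bare find("div") returns the first div, which lacks the attribute,
# so indexing it with ['data-pair-id'] would raise KeyError.
first_div = soup.find("div")

# True means "the attribute must exist, with any value".
by_attr = soup.find("div", attrs={"data-pair-id": True})
pair_id = by_attr.get("data-pair-id") if by_attr is not None else None

print(by_id)
print(first_div.attrs)
print(pair_id)
```

Using .get() rather than indexing returns None for a missing attribute instead of raising.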

tanmay_garg
  • So does that mean I can't get the value 14958 in any way on that page? – anarchy May 23 '20 at 11:25
  • I need to key in something generic to get that value of 14958, because it'll be different on a different page. – anarchy May 23 '20 at 11:26
  • Oh, OK. I misunderstood your question. I'll edit and fix it. – tanmay_garg May 23 '20 at 11:27
  • Typo, perhaps? Did you accidentally write `data-pair-id` as `data-pair-is` in your code? – tanmay_garg May 23 '20 at 11:34
  • My comment had a typo; I meant `KeyError: 'data-pair-id'`. I checked the attributes and got this: `{'id': 'sideNotificationZone', 'class': ['sideNotificationZone']}` – anarchy May 23 '20 at 11:36
  • OK, I've worked out how to do it. But someone else has already posted an answer doing the same thing, so I'll just add a URL to my answer. – tanmay_garg May 23 '20 at 11:47