I am trying to write a Python web scraper to get the price of gold from https://www.jmbullion.com/charts/gold-price/. When I run the code, it returns the span I'm looking for, but the span is empty: <span id="oz_display"></span>. I checked the site, and it seems to run some JavaScript that fills in the value: jQuery("#oz_display").html("$ "+gold_oz.toString().replace(/(\d)(?=(\d\d\d)+(?!\d))/g,"$1,")). How can I get this data?

import re
from bs4 import BeautifulSoup
from urllib.request import urlopen

my_url = "https://www.jmbullion.com/charts/gold-price/"

gold_url = urlopen(my_url)
page_html = gold_url.read()
gold_url.close()

page_soup = BeautifulSoup(page_html, "html.parser")

containers = page_soup.findAll("td", {"class": "td_2"})
print(containers)
input("end?")

Killer Kat
  • Does this answer your question? [Web-scraping JavaScript page with Python](https://stackoverflow.com/questions/8049520/web-scraping-javascript-page-with-python) – Manikiran Dec 14 '19 at 09:40

2 Answers

To answer your question: yes, there are several ways to evaluate JavaScript from Python. A common choice nowadays is [Selenium](https://selenium.dev/).

But in your particular case, if you look a little earlier in the JavaScript code, you will see that it gets the value from a div with id gounce:

var gold_oz=jQuery("#gounce").html()

So you just need to get the value from there. As of this writing the value is:

<div id="gounce">1478.12</div>
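
As a minimal sketch of reading that div from the static HTML, the snippet below uses only the standard library's `html.parser` and runs against a hard-coded sample string rather than the live page. The names `GounceParser`, `extract_gounce`, and `sample` are made up for illustration, and the sample markup is an assumption based on the snippets quoted above; the real page may be structured differently. With BeautifulSoup, the equivalent lookup would be `page_soup.find("div", {"id": "gounce"})`.

```python
from html.parser import HTMLParser


class GounceParser(HTMLParser):
    """Collect the text inside the first element whose id is "gounce"."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.value = None

    def handle_starttag(self, tag, attrs):
        # Only start capturing if we have not found a value yet.
        if self.value is None and dict(attrs).get("id") == "gounce":
            self._inside = True

    def handle_data(self, data):
        if self._inside:
            self.value = data.strip()
            self._inside = False

    def handle_endtag(self, tag):
        self._inside = False


def extract_gounce(html):
    """Return the price held in the #gounce element as a float."""
    parser = GounceParser()
    parser.feed(html)
    if parser.value is None:
        raise ValueError("no element with id 'gounce' found")
    # Strip thousands separators in case the page includes them.
    return float(parser.value.replace(",", ""))


# Hypothetical sample markup; the live page is fetched separately.
sample = '<span id="oz_display"></span><div id="gounce">1478.12</div>'
print(extract_gounce(sample))
```

In a real scraper, `sample` would be replaced by the HTML you already download with `urlopen`.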
Josep Anguera

The values are calculated and written into the page body by jQuery, so you have two options:

  1. Use Selenium, let it render the JavaScript for you, then scrape the data you need from the DOM.
  2. Follow the jQuery code and apply the same logic in Python.

Approach 1:

from selenium import webdriver
from selenium.webdriver.common.by import By

# Selenium 4 removed find_elements_by_id; use find_elements(By.ID, ...) instead.
driver = webdriver.Chrome()
try:
    driver.get("https://www.jmbullion.com/charts/gold-price/")
    # find_elements returns an empty list (instead of raising) when nothing matches
    gold_value = driver.find_elements(By.ID, 'oz_display')
    if gold_value:
        print('Gold Price Per Ounce ==>', gold_value[0].text)
    gold_per_gram = driver.find_elements(By.ID, 'gr_display')
    if gold_per_gram:
        print('Gold Price Per Gram ==>', gold_per_gram[0].text)
    gold_per_kilo = driver.find_elements(By.ID, 'kl_display')
    if gold_per_kilo:
        print('Gold Price Per Kilo ==>', gold_per_kilo[0].text)
except Exception as e:
    print(e)
finally:
    # quit() ends the browser session and closes every window
    driver.quit()

Output:

Gold Price Per Ounce ==> $ 1,478.12
Gold Price Per Gram ==> $ 47.52
Gold Price Per Kilo ==> $ 47,522.66

Approach 2:

import re

import requests
from bs4 import BeautifulSoup

url = "https://www.jmbullion.com/charts/gold-price/"

res = requests.get(url)

page_soup = BeautifulSoup(res.text, "html.parser")
gold_ask_value = page_soup.find("div", {"id": "gounce"}).text.strip()

# Gold Price Per Ounce
# var gold_oz = jQuery("#gounce").html();
# This reads the value of the div with id gounce.
# jQuery("#oz_display").html("$ " + gold_oz.toString().replace(/(\d)(?=(\d\d\d)+(?!\d))/g, "$1,"));
# This is a thousands formatter: $1 is the first capture group (the matched digit),
# so every digit that is followed by a multiple of three digits gets a comma after it.
# For example, 5324 becomes 5,324. In Python the backreference is written \1, not $1.
formatted_gold_value = "$ " + re.sub(r"(\d)(?=(\d\d\d)+(?!\d))", r"\1,", gold_ask_value)

# Gold Price Per Gram
# var gold_oz2 = gold_oz.replace(/,/g, "");
# This strips any commas so the text can be parsed as a number.
gold_oz2 = float(gold_ask_value.replace(",", ""))
# var gold_gr = Math.round((gold_oz2 / 31.1034768) * 100) / 100;
# This divides the ounce price by 31.1034768 (grams per troy ounce) and rounds to two decimals.
gold_per_gram = round((gold_oz2 / 31.1034768) * 100) / 100
formatted_gold_per_gram = f'$ {gold_per_gram}'  # to match the website

# Gold Price Per Kilo
# var gold_kl = Math.round((gold_oz2 / 0.0311034768) * 100) / 100;
# Same as the per-gram price, except for the divisor.
gold_kl = round((gold_oz2 / 0.0311034768) * 100) / 100
# var gold_kl2 = gold_kl.toFixed(2).replace(/\d(?=(\d{3})+\.)/g, '$&,');
# Another thousands formatter: $& is the whole match, written \g<0> in Python.
gold_per_kilo = re.sub(r"\d(?=(\d{3})+\.)", r"\g<0>,", f"{gold_kl:.2f}")
formatted_gold_per_kilo = f'$ {gold_per_kilo}'  # to match the website

print('Gold Price Per Ounce ==>', formatted_gold_value)
print('Gold Price Per Gram ==>', formatted_gold_per_gram)
print('Gold Price Per Kilo ==>', formatted_gold_per_kilo)

Output:

Gold Price Per Ounce ==> $ 1,478.12
Gold Price Per Gram ==> $ 47.52
Gold Price Per Kilo ==> $ 47,522.66
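
If matching the site's JavaScript regex for regex is not a requirement, the same formatting can probably be had from Python's format specification instead. The sketch below is an alternative, not what the page itself does: `format_prices` and `TROY_OUNCE_IN_GRAMS` are names invented here, the input is assumed to be the text of the `gounce` div, and all three values are shown with two decimals, which is an assumption about the site's display.

```python
TROY_OUNCE_IN_GRAMS = 31.1034768  # same divisor the page's script uses


def format_prices(gounce_text):
    """Return (per ounce, per gram, per kilo) as '$ 1,234.56' style strings."""
    # Strip thousands separators before parsing, as the site's script does.
    ounce = float(gounce_text.replace(",", ""))
    gram = ounce / TROY_OUNCE_IN_GRAMS
    kilo = gram * 1000
    # ",.2f" inserts thousands separators and rounds to two decimals.
    return tuple(f"$ {value:,.2f}" for value in (ounce, gram, kilo))


per_ounce, per_gram, per_kilo = format_prices("1478.12")
print('Gold Price Per Ounce ==>', per_ounce)
print('Gold Price Per Gram ==>', per_gram)
print('Gold Price Per Kilo ==>', per_kilo)
```

The format specification handles any number of digit groups, so there is no need to track individual matched digits.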
Ahmed Soliman