Parsing Text From Page With BS4

Question

On the page https://bittrex.com/api/v2.0/pub/Markets/GetMarketSummaries I am trying to parse the text that I pull with requests. The code I am using to pull the text is below:

import requests
from bs4 import BeautifulSoup

link = 'https://bittrex.com/api/v2.0/pub/Markets/GetMarketSummaries'
html = requests.get(link).text
print(html)

I can easily pull all the text from the page, but now I want to parse it with bs4 so that it only gets the numbers for a specific currency, such as ADX or ADT (shown as "MarketCurrency":"ADX"). I want it to find information such as the High, Low, Volume, and Last from the page without pulling all the other junk. For example, I input the code for the currency I want (e.g. ADX), and it then parses the text and prints just the day's high, low, volume, and last. Thanks for any help!

That API appears to return JSON, not HTML. BeautifulSoup is an HTML parser; for JSON you can just use the native Python JSON parser: https://docs.python.org/2/library/json.html – Hamms, Aug 18 '17 at 22:24
Any thoughts on where to start, then? Sorry, I'm pretty new to this. – Braden Fenlong, Aug 18 '17 at 22:26
Start by parsing the results of the API from JSON into a Python dictionary, which you can learn more about here: http://introtopython.org/dictionaries.html – Hamms, Aug 18 '17 at 22:29

score 0 · Accepted Answer · answered Aug 18 '17 at 23:46

Actually, you're pretty close. As the comments say, the output is JSON, not HTML. Luckily, Python has nice built-in functionality for this. The following code parses the JSON text returned by the site into a native Python dictionary (json_dict).

import requests
import json

link = 'https://bittrex.com/api/v2.0/pub/Markets/GetMarketSummaries'
# Fetch the response body as text; the endpoint returns JSON, not HTML
raw_json = requests.get(link).text
# Parse the JSON text into a Python dictionary
json_dict = json.loads(raw_json)
print(json_dict)

answered Aug 18 '17 at 23:46

somil

Thanks for this. The problem I'm having now is that there are multiple dictionaries, all with the same keys in them. An example is "BaseCurrency", which is repeated throughout the entire page. What would be the best way to pull only the one we want? – Braden Fenlong Aug 18 '17 at 23:55
You would have to get the value of the key "result", which is a list of dictionaries that each hold a market summary. Then iterate through this list and process each dictionary however you wish. They behave like ordinary Python lists and dictionaries. – somil Aug 19 '17 at 16:49
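
The iteration described in the comment above could look roughly like the sketch below. It is only an illustration under some assumptions: the sample payload is made up (the numbers are not real market data), and the nesting under "Market" and "Summary" is a guess at the response layout, so check it against the actual output of print(json_dict). The helper names flatten and summaries_for are hypothetical, not part of any library. Only the "result" and "MarketCurrency" keys and the High, Low, Volume, and Last field names come from this thread.

```python
import json

# Hypothetical sample shaped like the response described in this thread.
# The "Market"/"Summary" nesting and every number here are assumptions.
SAMPLE_JSON = """
{
  "success": true,
  "result": [
    {"Market": {"MarketCurrency": "ADX", "BaseCurrency": "BTC"},
     "Summary": {"High": 0.0003, "Low": 0.0002, "Volume": 1500.0, "Last": 0.00025}},
    {"Market": {"MarketCurrency": "ADT", "BaseCurrency": "BTC"},
     "Summary": {"High": 0.00002, "Low": 0.00001, "Volume": 900.0, "Last": 0.000015}}
  ]
}
"""


def flatten(entry):
    """Merge one level of nested dicts into a single flat dict."""
    flat = {}
    for key, value in entry.items():
        if isinstance(value, dict):
            flat.update(value)
        else:
            flat[key] = value
    return flat


def summaries_for(payload, currency, fields=("High", "Low", "Volume", "Last")):
    """Return the requested fields for every market whose MarketCurrency matches."""
    matches = []
    for entry in payload.get("result") or []:
        flat = flatten(entry)
        if flat.get("MarketCurrency") == currency:
            matches.append({name: flat.get(name) for name in fields})
    return matches


payload = json.loads(SAMPLE_JSON)
for row in summaries_for(payload, "ADX"):
    print(row)
```

With the real response you would pass json_dict in place of payload. The function returns a list because the same currency may be listed against more than one base currency, and flattening one level keeps the lookup working whether the fields sit at the top of each entry or one level down.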

efeakaroz13 · Answer 2 · 2022-07-16T17:28:57.820

-1

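If you are using Selenium, try this:

from bs4 import BeautifulSoup
# driver is assumed to be an existing Selenium WebDriver with the page loaded
soup = BeautifulSoup(driver.page_source, "html.parser")
page_text = soup.find_all("body")[0].get_text()

I tested it and it works.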

edited Jul 16 '22 at 17:28

answered Jul 16 '22 at 12:12

efeakaroz13

Downvoted this answer because it is of very low quality and does not work at all in this context. For future answers, double-check them and add more information. *I tested it and it works* gives no benefit to readers. – HedgeHog Jul 16 '22 at 13:44
