Firstly, I would like to point out that I am very much a beginner at web scraping. I am just starting a project that scrapes data from https://coinmarketcap.com. Currently, I am focused on scraping the names of the cryptocurrencies (e.g. Bitcoin, Ethereum, Tether). However, the best I can get is the name of the currency followed by a bunch of formatting such as color, font-size, class, etc. How can I code this so that I store just the names of the currencies, without this extra information? Here is my current code:
import requests
from bs4 import BeautifulSoup
#array of just crypto names
names = []
#gets content from site
site = requests.get("https://coinmarketcap.com")
#opens content from site
info = site.content
soup = BeautifulSoup(info,"html.parser")
#class ID for name of crypto
type_name = 'sc-1eb5slv-0 iJjGCS'
#crypto names + other unnecessary info
names_raw = soup.find_all('p', attrs={'class': 'sc-1eb5slv-0 iJjGCS'})
for type_name in names_raw:
    print(type_name.text, type_name.next_sibling)
As you can see, I am only 20 lines in but having a pretty tough time figuring this out. I appreciate any help or advice you can give me.
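One likely source of the extra formatting is `type_name.next_sibling`: when the sibling is a tag, printing it emits the whole element, attributes included, whereas `.get_text()` returns only the text inside. The sketch below illustrates this on a small inline HTML string. The markup (including the `coin-item-symbol` class) is a hypothetical stand-in, not a copy of the real page, and the helper name `names_only` is made up for the example. The class string from the question looks auto-generated, so it may change, and the live page may render part of its table with JavaScript.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for two entries; the real page may differ.
SAMPLE_HTML = (
    '<div><p class="sc-1eb5slv-0 iJjGCS" color="text" font-size="1">Bitcoin</p>'
    '<p class="coin-item-symbol" color="text3">BTC</p></div>'
    '<div><p class="sc-1eb5slv-0 iJjGCS" color="text" font-size="1">Ethereum</p>'
    '<p class="coin-item-symbol" color="text3">ETH</p></div>'
)

# Class string taken from the question; assumed to identify the name tags.
NAME_CLASS = "sc-1eb5slv-0 iJjGCS"


def names_only(html):
    """Return the text of every name tag, with no markup or attributes."""
    soup = BeautifulSoup(html, "html.parser")
    tags = soup.find_all("p", attrs={"class": NAME_CLASS})
    return [tag.get_text(strip=True) for tag in tags]


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
first = soup.find("p", attrs={"class": NAME_CLASS})
print(first.next_sibling)             # a Tag: prints the element with its attributes
print(first.next_sibling.get_text())  # only the text inside that element
print(names_only(SAMPLE_HTML))
```

Collecting the results with a list comprehension also fills the `names` list directly instead of only printing each entry.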
… and … tags, join them together, and split the name and abbreviation.
– Andrej Kesely Jul 27 '21 at 00:21
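The join-and-split idea from the comment could look roughly like the sketch below. Since the tag names in the comment did not survive, the markup here is purely hypothetical (a `td` with class `coin` holding a name tag and an abbreviation tag), and `name_and_symbol` is a made-up helper name. It assumes the abbreviation is always the last whitespace-separated token and contains no spaces itself.

```python
from bs4 import BeautifulSoup

# Hypothetical rows; tag and class names are placeholders, not the real page.
SAMPLE_ROWS = """
<table>
  <tr><td class="coin"><p>Bitcoin</p><p>BTC</p></td></tr>
  <tr><td class="coin"><p>Bitcoin Cash</p><p>BCH</p></td></tr>
  <tr><td class="coin"><a href="/currencies/tether/">Tether</a><p>USDT</p></td></tr>
</table>
"""


def name_and_symbol(html):
    """Return (name, abbreviation) pairs, one per hypothetical coin cell."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for cell in soup.find_all("td", class_="coin"):
        # Join the text of all tags in the cell with single spaces.
        joined = cell.get_text(" ", strip=True)
        # Split once from the right: the last token is taken as the abbreviation.
        name, symbol = joined.rsplit(" ", 1)
        pairs.append((name, symbol))
    return pairs


for name, symbol in name_and_symbol(SAMPLE_ROWS):
    print(name, symbol)
```

Splitting from the right keeps multi-word names such as "Bitcoin Cash" intact, which a plain `split()` would break apart.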