I have the table on this page: https://en.wikipedia.org/wiki/List_of_cakes and I want to grab the text from the first, third and fourth columns and format each row like this:
Amandine - Romania - Chocolate layered cake filled with chocolate, caramel and fondant cream
So far I have this bit of code, which I modified from this post: How do I extract text data in first column from Wikipedia table?
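import requests
from bs4 import BeautifulSoup

url = "https://en.wikipedia.org/wiki/List_of_cakes"
res = requests.get(url)
soup = BeautifulSoup(res.text, "lxml")

# Walk every row of the first wikitable, skipping the header row
for items in soup.find(class_="wikitable").find_all("tr")[1:]:
    # get_text(strip=True) joins all of the row's text with no separator
    data = items.get_text(strip=True)
    print(data)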
This outputs each row with all of the cell text run together:
AmandineRomaniaChocolate layered cake filled with chocolate, caramel and fondant cream
AmygdalopitaGreeceAlmond cake made with ground almonds, flour, butter, egg and pastry cream
Angel cakeUnited Kingdom[1]Sponge cake,cream,food colouring
Angel food cakeUnited StatesEgg whites, vanilla, andcream of tartar
etc...
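For reference, here is a minimal sketch of one way the format I am after could be produced. It reads each row's cells separately instead of calling get_text on the whole row. The helper names `clean` and `format_rows` are made up for illustration, and I am assuming the columns are name, image, origin and description, and that footnote markers look like `[1]`. It uses the built-in "html.parser" so the sketch does not depend on lxml, though "lxml" should work the same way here.

```python
import re
from bs4 import BeautifulSoup

# Assumption: footnote markers are purely numeric, e.g. [1]
FOOTNOTE = re.compile(r"\[\d+\]")

def clean(cell):
    """Return a cell's text without footnote markers and with whitespace collapsed."""
    text = FOOTNOTE.sub("", cell.get_text())
    return " ".join(text.split())

def format_rows(html, columns=(0, 2, 3)):
    """Build one 'Name - Origin - Description' line per data row (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find(class_="wikitable")
    if table is None:
        return []
    lines = []
    for row in table.find_all("tr")[1:]:  # skip the header row
        cells = row.find_all(["th", "td"])
        if len(cells) <= max(columns):
            continue  # skip rows that do not have all the wanted columns
        lines.append(" - ".join(clean(cells[i]) for i in columns))
    return lines
```

Passing `res.text` from the request above to `format_rows` should give a list of lines that can be written to a file, one per line. Not stripping each text fragment individually is what keeps the space in "Sponge cake, cream".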
I am just trying to scrape this wiki page into a text file, so that when someone on my Twitch channel uses the command !cake it will pick one of the cakes at random.
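The text file part would presumably look something like the sketch below, using only the standard library. The file name `cakes.txt` and the function names `save_cakes` and `random_cake` are placeholders I picked for illustration, and how the result gets sent to chat depends on the bot, so that is left out.

```python
import random
from pathlib import Path

def save_cakes(lines, path="cakes.txt"):
    """Write one cake per line; the file name is a hypothetical choice."""
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")

def random_cake(path="cakes.txt"):
    """Return one non-empty line chosen at random from the file."""
    text = Path(path).read_text(encoding="utf-8")
    lines = [line for line in text.splitlines() if line.strip()]
    return random.choice(lines)
```

The idea is that the scrape runs once to build the file, and the !cake command only calls something like `random_cake()` so it never has to hit Wikipedia.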