Export results to excel file python BeautifulSoup

Question

With the great help of @αԋɱҽԃ αмєяιcαη, I now have the following code:

import requests
from bs4 import BeautifulSoup
import pandas as pd

masterlist = []

def main(url):
    with requests.Session() as req:
        for item in range(1, 2):
            r = req.get(url.format(item))
            print(r.url)
            soup = BeautifulSoup(r.content, 'html.parser')
            goal = [(x.h3.a['title'], x.select_one("p.price_color").text, x.select_one("p.star-rating")['class'][-1], 'http://books.toscrape.com' + x.a.img['src'].replace('..',''))
                    for x in soup.select("li.col-xs-6")]
            #print(goal)
            masterlist.extend(goal)  # extend keeps one row per book instead of one nested list per page

main("http://books.toscrape.com/catalogue/page-{}.html")
df = pd.DataFrame(masterlist)
df

The result is perfect. Now I need to learn how to export the results to an Excel file. Forgive me, as I am trying to learn step by step. I think I have to use the pandas package. Will it be easy to use pandas in this case?

From the [docs](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#multi-valued-attributes) -- "The most common multi-valued attribute is class (that is, a tag can have more than one CSS class). Others include rel, rev, accept-charset, headers, and accesskey. Beautiful Soup presents the value(s) of a multi-valued attribute as a list" — , Nov 26 '20 at 20:58
Thanks a lot. But I didn't get what you mean. Can you give me a solution to the problem as I am a newbie to python? — YasserKhalil, Nov 26 '20 at 21:00
`(x.h3.a.text, x.select_one("p.star-rating")['class'][-1], x.select_one("p.price_color").text)` — αԋɱҽԃ αмєяιcαη, Nov 26 '20 at 21:02
@YasserKhalil you're welcome, glad to help. Please avoid posting the same question multiple times, so that you don't get down-votes or a close vote — αԋɱҽԃ αмєяιcαη, Nov 26 '20 at 21:05
OK my bro. I will try to be more patient. Last question: I tried this `, x.select_one("div.image_container.a")['href'])` to get the link of the image, but it throws an error. Why do I keep failing at this kind of thing? — YasserKhalil, Nov 26 '20 at 21:10
@YasserKhalil `x.a.img['src']`. You need to read the `bs4` documentation and understand how `CSS` selectors work — αԋɱҽԃ αмєяιcαη, Nov 26 '20 at 21:12
Amazing. Thanks a lot for the great support. I have edited the question to be for another issue >> sorry if I was disturbing you. — YasserKhalil, Nov 26 '20 at 21:17
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/225179/discussion-between-yasserkhalil-and--c). — YasserKhalil, Nov 26 '20 at 21:33
@YasserKhalil you need to check [ask], as what you are doing is against community rules. — αԋɱҽԃ αмєяιcαη, Nov 26 '20 at 22:30
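
To make the docs quote in the comments above concrete, here is a minimal sketch of how Beautiful Soup returns `class` as a list. The HTML string is a made-up stand-in for the rating markup on the scraped page, not content copied from it.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet imitating the rating markup; not fetched from the site.
html = '<p class="star-rating Three"></p>'
soup = BeautifulSoup(html, "html.parser")

tag = soup.select_one("p.star-rating")
classes = tag["class"]   # class is multi-valued, so this is a list of names
rating = classes[-1]     # the last class name carries the rating word
print(classes, rating)
```

This is why `['class'][-1]` in the code above picks out the rating word rather than the whole attribute.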

score 0 · Answer 1 · answered Nov 26 '20 at 21:05

from bs4 import BeautifulSoup
import requests


def main(url):
    with requests.Session() as req:
        for item in range(1, 2):
            r = req.get(url.format(item))
            print(r.url)
            soup = BeautifulSoup(r.content, 'html.parser')
            # Each tuple: title, price, star rating (last class name of p.star-rating)
            goal = [(x.h3.a.text, x.select_one("p.price_color").text, x.select_one("p.star-rating")['class'][-1])
                    for x in soup.select("li.col-xs-6")]
            if goal:
                print(goal[0][2])  # star rating of the first book on the page


main("http://books.toscrape.com/catalogue/page-{}.html")
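
For the export itself, one possible approach is pandas' `DataFrame.to_excel`. The sketch below is not part of the answer above: the sample rows, the column names, the `export_to_excel` helper and the `books.xlsx` file name are all assumptions for illustration, and writing `.xlsx` files requires an Excel engine such as `openpyxl` to be installed.

```python
import pandas as pd

# Illustrative rows in the same shape as the scraped tuples
# (title, price, rating, image URL); the values are placeholders.
rows = [
    ("Book A", "£10.00", "Three", "http://books.toscrape.com/media/a.jpg"),
    ("Book B", "£20.50", "One", "http://books.toscrape.com/media/b.jpg"),
]

# Column names are an assumption; pick whatever headers you want in the sheet.
columns = ["Title", "Price", "Rating", "Image"]
df = pd.DataFrame(rows, columns=columns)


def export_to_excel(frame, path="books.xlsx"):
    """Write the frame to an Excel file (hypothetical helper)."""
    # index=False leaves the numeric row index out of the sheet.
    frame.to_excel(path, index=False)


# Call export_to_excel(df) to write the file; needs openpyxl installed.
```

With the scraped data, passing the flat list of tuples to `pd.DataFrame(..., columns=...)` in place of `rows` should give one spreadsheet row per book.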
