Enter query in search bar and scrape results

Question

I have a database with the ISBN numbers of different books. I gathered them using Python and BeautifulSoup. Next, I would like to add categories to the books. Book categories follow a standard, and the website https://www.bol.com/nl/ lists all books with their categories according to that standard.

Start URL: https://www.bol.com/nl/

ISBN: 9780062457738

URL after search: https://www.bol.com/nl/p/the-subtle-art-of-not-giving-a-f-ck/9200000053655943/

HTML class of categories: <li class="breadcrumbs__item"

Does anyone know how to (1) enter the ISBN value in the search bar, (2) then submit the search query and use the page for scraping?

Step (3) scraping all the categories is something I can do. But I don't know how to do the first 2 steps.

Code that I have so far for step (2)

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager

webpage = "https://www.bol.com/nl/" # edit me
searchterm = "9780062457738" # edit me

driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get(webpage)

sbox = driver.find_element_by_class_name("appliedSearchContextId")
sbox.send_keys(searchterm)

submit = driver.find_element_by_class_name("wsp-search__btn  tst_headerSearchButton")
submit.click()

Code that I have so far for step (3)

import requests
from bs4 import BeautifulSoup

data = requests.get('https://www.bol.com/nl/p/the-subtle-art-of-not-giving-a-f-ck/9200000053655943/')

soup = BeautifulSoup(data.text, 'html.parser')

categoryBar = soup.find('ul',{'class':'breadcrumbs breadcrumbs--show-last-item-small'})

for category in categoryBar.find_all('span',{'class':'breadcrumbs__link-label'}):
    print(category.text)

@Dev I don't get any errors. I just don't know where to start. The code from (2) is from the internet, but I don't know how to use webdriver properly. Do you know how to do this? — T. de Jong, Sep 21 '19 at 13:34
This question is great and shows a lot of prior work. Yet I think you could have split it up into two different questions, where you ask for step 1 and 2 separately. — saQuist, May 10 '22 at 09:18

score 4 · Accepted Answer · answered Sep 21 '19 at 15:03

You can use Selenium to locate the search box, then loop over your ISBNs and enter each one:

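from selenium import webdriver
from selenium.webdriver.common.keys import Keys

# Selenium 3 style API, as current when this answer was written;
# Selenium 4 replaces find_element_by_id(...) with find_element(By.ID, ...).
d = webdriver.Chrome('/path/to/chromedriver')
books = ['9780062457738']
for book in books:
    d.get('https://www.bol.com/nl/')       # start from the home page each time
    e = d.find_element_by_id('searchfor')  # the search input box
    e.send_keys(book)                      # type the ISBN
    e.send_keys(Keys.ENTER)                # submit the search
    # scrape the page here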

For each ISBN in books, this enters the value into the search box, submits it, and loads the resulting page.

score 1 · Answer 2 · answered Sep 21 '19 at 15:52

You could write a function that returns the category. Base it on the search request the page itself makes: tidy up the parameters, and a plain GET is enough.

import requests
from bs4 import BeautifulSoup as bs

def get_category(isbn):
    # Request the search results page directly instead of driving a browser
    r = requests.get(f'https://www.bol.com/nl/rnwy/search.html?Ntt={isbn}&searchContext=books_all')
    soup = bs(r.content, 'lxml')
    # Label inside the last item of the #option_block_4 list
    category = soup.select_one('#option_block_4 > li:last-child .breadcrumbs__link-label')

    if category is None:
        return 'Not found'
    return category.text

isbns = ['9780141311357', '9780062457738', '9780141199078']

for isbn in isbns:
    print(get_category(isbn))
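If you want every level of the category path rather than only the last one, a variation along these lines might work. This is a sketch, not something tested against the live site: it assumes that `#option_block_4` holds the whole breadcrumb path, and the names `parse_breadcrumbs`, `get_categories`, and `SEARCH_URL` are invented for the example. Passing the query through `params` lets requests build and encode the query string.

```python
import requests
from bs4 import BeautifulSoup

# Search endpoint and parameters as used in the answer above.
SEARCH_URL = "https://www.bol.com/nl/rnwy/search.html"


def parse_breadcrumbs(html):
    """Return all breadcrumb labels found in the given HTML, in page order."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumption: every level of the path is an <li> inside #option_block_4.
    labels = soup.select("#option_block_4 > li .breadcrumbs__link-label")
    return [label.get_text(strip=True) for label in labels]


def get_categories(isbn):
    """Fetch the search page for one ISBN and return its breadcrumb labels."""
    response = requests.get(
        SEARCH_URL,
        params={"Ntt": isbn, "searchContext": "books_all"},
        timeout=10,
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return parse_breadcrumbs(response.text)
```

Keeping the parsing in its own function means you can check the selector against a saved copy of the page without hitting the site. An empty list from `get_categories` plays the role of 'Not found' in the version above.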

Thanks for your help. But the solution from Ajax1234 worked for me — T. de Jong, Sep 22 '19 at 17:44
That's OK. Did this one not work for you? I tested it and it seems to work perfectly. — QHarr, Sep 22 '19 at 17:45
