
I'm trying to scrape this site to collect the total number of products. I've tried BeautifulSoup and Selenium without success; neither captures the correct information. See the code below.

The site is this: https://resgatepontos.latampass.latam.com/fornecedor/fastshop?modalidade=ac&produtoHome=false&relevancia=SCORE&categorias=b086c071-6d3e-485f-9a55-4f4510bec929&page=1

The information I need is the total number of products. See the image:

Site Image

import numpy as np
import pandas as pd
from bs4 import BeautifulSoup
import requests
import math
import re
from requests_html import HTMLSession, AsyncHTMLSession
from lxml import etree
import xlwt
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC



## WITH SELENIUM 

url_base = 'https://resgatepontos.latampass.latam.com/fornecedor/fastshop?modalidade=ac&produtoHome=false&relevancia=SCORE&categorias=b086c071-6d3e-485f-9a55-4f4510bec929&page=1'
executable_path = r'C:\Users\vinig\Downloads\chromedriver_win32\chromedriver.exe'
browser = webdriver.Chrome(executable_path=executable_path)
browser.get(url_base)
element = WebDriverWait(browser, 20).until(EC.visibility_of_element_located((By.CLASS_NAME,"ng-star-inserted")))
qtd = element.text
browser.quit()


## WITH BEAUTIFULSOUP

url_base = 'https://resgatepontos.latampass.latam.com/fornecedor/fastshop?modalidade=ac&produtoHome=false&relevancia=SCORE&categorias=b086c071-6d3e-485f-9a55-4f4510bec929&page=1'
headers = {'User-Agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36"}
executable_path = r'C:\Users\vinig\Downloads\chromedriver_win32\chromedriver.exe'
browser = webdriver.Chrome(executable_path=executable_path)
browser.get(url_base)
html = browser.page_source
soup = BeautifulSoup(html, 'lxml')
qtd_itens = soup.find_all('div', attrs={'class':"resultado d-none d-lg-block ng-star-inserted"})


The results are as follows:

#SELENIUM: This is not the output I expected.

Selenium_Return

#BEAUTIFULSOUP: This is not the output I expected. Does not contain information on the total number of products.

BeautifulSoup_Return

Davi Riani

2 Answers


That page pulls its data from an API via an XHR call. One way to get the data is to query the API endpoint directly (you can find the endpoint in the browser's dev tools, under the Network tab):

import requests
import pandas as pd

url = 'https://service-resgatepontos.latampass.latam.com/produtos/buscar?ordenacao=SCORE&categorias=b086c071-6d3e-485f-9a55-4f4510bec929&habilitado=true&publicado=true&produtoHome=false&page=1&pageSize=999&codigosFornecedor=fastshop&codigoFornecedor=fastshop&modalidade=ACUMULO'
headers = {
    'accept-language': 'pt-BR',
    'accept': 'application/json, text/plain, */*',
    'origin': 'https://resgatepontos.latampass.latam.com',
    'referer': 'https://resgatepontos.latampass.latam.com/',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36'
}


# Call the JSON endpoint directly and flatten the 'itens' list into a DataFrame
r = requests.get(url, headers=headers)
df = pd.json_normalize(r.json()['itens'])
print(df)

Result printed in terminal:

    id  idProdutoFornecedor nome    skusCatalogo    categoriasFornecedor    categoriasGoPoints  descricao   idProdutoGopoints   avulso  habilitado  ... prioridade  temDuplaCustodiaPendente    fornecedor.idFornecedor fornecedor.nome fornecedor.imagemLogo   fornecedor.codigoFornecedor fornecedor.hasResgate   fornecedor.hasAcumulo   fornecedor.nomeExibicaoCampos.nome  fornecedor.temDuplaCustodiaPendente
0   B02E2803-8606-495B-A374-27511E1120EC    LJLTAPCABKPTO_PRD   Estojo Protetor para Fone Air Pod Impkt em Pol...   [{'idSkuGopoints': '67780DFD-C194-4CAD-8DEE-A8...   [{'name': 'Smartphones', 'categoryId': 'B086C0...   [{'name': 'Cuidados Pessoais', 'categoryId': '...   Estojo Protetor em Policarbonato para Fone Air...   B02E2803-8606-495B-A374-27511E1120EC    False   True    ... 0   False   6   Fast Shop   fornecedor0fastshop.png fastshop    False   False   Fornecedor  False
1   B4D3656C-F9CD-449D-8A76-097CA875EB32    AEMX4A2ZMAMRM_PRD   Laço para AirTag em Couro Castanho - Apple - M...   [{'idSkuGopoints': 'AF0864F4-A6C5-4CA1-9F56-F8...   [{'name': 'Smartphones', 'categoryId': 'B086C0...   [{'name': 'Celulares e Acessórios', 'categoryI...   Laço em couro europeu curtido que se fixa com ...   B4D3656C-F9CD-449D-8A76-097CA875EB32    False   True    ... 0   False   6   Fast Shop   fornecedor0fastshop.png fastshop    False   False   Fornecedor  False
2   675FBA16-E2FD-459B-B28B-E98673ACE43E    AECJMQ1F3BEA    iPhone 14 Pro Apple (256GB) Roxo-Profundo, Tel...   [{'idSkuGopoints': '2C733DB1-F606-49F7-B60E-FA...   [{'name': 'Smartphones', 'categoryId': 'B086C0...   [{'name': 'Celulares e Acessórios', 'categoryI...   iPhone 14 Pro Apple (256GB) Roxo-Profundo, Tel...   675FBA16-E2FD-459B-B28B-E98673ACE43E    False   True    ... 0   False   6   Fast Shop   fornecedor0fastshop.png fastshop    False   False   Fornecedor  False
3   B7200399-A96E-4641-B0C4-24B70EDDCC5B    AEMNKQ3BZADRD_PRD   Apple Watch Series 8 (GPS + Cellular 45 mm) Ca...   [{'idSkuGopoints': '06AF686D-73E9-406D-A665-AD...   [{'name': 'Smartphones', 'categoryId': 'B086C0...   [{'name': 'Celulares e Acessórios', 'categoryI...   <p>O Apple Watch Series 8 tem sensores e apps ...   B7200399-A96E-4641-B0C4-24B70EDDCC5B    False   True    ... 0   False   6   Fast Shop   fornecedor0fastshop.png fastshop    False   False   Fornecedor  False
4   3583A0DF-A855-424F-BA43-E98535D1C6A6    AEMG3T3AMABCO_PRD   Pulseira para Apple Watch 40mm Nike Sport Band...   [{'idSkuGopoints': '296BD544-F399-4657-B90A-6B...   [{'name': 'Smartphones', 'categoryId': 'B086C0...   [{'name': 'Celulares e Acessórios', 'categoryI...   Pulseira em Fluorelastômero, Compatível com Ap...   3583A0DF-A855-424F-BA43-E98535D1C6A6    False   True    ... 0   False   6   Fast Shop   fornecedor0fastshop.png fastshop    False   False   Fornecedor  False
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
430 E3C9E5FE-54B1-464E-BAE0-AEFFD626F5E9    AEMQ293BEARXO_PRD   iPhone 14 Pro Apple (512GB) Roxo-Profundo, Tel...   [{'idSkuGopoints': '4BE0787E-F087-4DE2-9D14-78...   [{'name': 'Smartphones', 'categoryId': 'B086C0...   [{'name': 'Celulares e Acessórios', 'categoryI...   <p><br />iPhone 14 Pro. C&acirc;mera grande-an...   E3C9E5FE-54B1-464E-BAE0-AEFFD626F5E9    False   True    ... 0   False   6   Fast Shop   fornecedor0fastshop.png fastshop    False   False   Fornecedor  False
431 A32303B0-F229-4D9D-B3C9-C0FF70755CD5    AECJMQ2N3BEA    iPhone 14 Pro Apple (1TB) Prateado, Tela de 6,...   [{'idSkuGopoints': '938444C1-5DB4-429E-9AF1-FF...   [{'name': 'Smartphones', 'categoryId': 'B086C0...   [{'name': 'Utilidades', 'categoryId': '203', '...   iPhone 14 Pro Apple (1TB) Prateado, Tela de 6,...   A32303B0-F229-4D9D-B3C9-C0FF70755CD5    False   True    ... 0   False   6   Fast Shop   fornecedor0fastshop.png fastshop    False   False   Fornecedor  False
432 C8733B15-B5D1-4279-B03D-C1FB973D3B46    AEMTFF2ZMALRJ_PRD   Capa para iPhone XS Max de Silicone Nectarina ...   [{'idSkuGopoints': 'E9E6C71C-D61E-4CD0-8FD2-15...   [{'name': 'Smartphones', 'categoryId': 'B086C0...   [{'name': 'Celulares e Acessórios', 'categoryI...   <P>Criada pela Apple para complementar seu iPh...   C8733B15-B5D1-4279-B03D-C1FB973D3B46    False   True    ... 0   False   6   Fast Shop   fornecedor0fastshop.png fastshop    False   False   Fornecedor  False
433 8CBB76D0-5221-4FD0-8D07-A724DDE5B6A2    AEMX532BEA_PRD  Apple AirTag (pacote com 1) ... [{'idSkuGopoints': 'A89DFBFB-2A02-45A6-97CE-A7...   [{'name': 'Smartphones', 'categoryId': 'B086C0...   [{'name': 'Celulares e Acessórios', 'categoryI...   Com o AirTag, fica muito fácil encontrar suas ...   8CBB76D0-5221-4FD0-8D07-A724DDE5B6A2    False   True    ... 0   False   6   Fast Shop   fornecedor0fastshop.png fastshop    False   False   Fornecedor  False
434 B0E41CD1-6B6E-4EA1-BFAB-AEDAD89B0542    AEMM2G3ZEAAZL_PRD   Capa para iPhone 13 Pro com MagSafe de Silicon...   [{'idSkuGopoints': '486E9B49-0BEB-46F0-AA86-1D...   [{'name': 'Smartphones', 'categoryId': 'B086C0...   [{'name': 'Celulares e Acessórios', 'categoryI...   Compativel com iPhone 13 Pro em Silicone<p sty...   B0E41CD1-6B6E-4EA1-BFAB-AEDAD89B0542    False   True    ... 0   False   6   Fast Shop   fornecedor0fastshop.png fastshop    False   False   Fornecedor  False
435 rows × 24 columns

EDIT: I changed the page size in the URL so you can get all the products in a single call (I replaced pageSize=12 with pageSize=999).
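Since the question only asks for the total, counting the entries under `itens` is enough. Here is a minimal sketch that does not rely on one oversized page: it walks the pages and sums the items. The names `build_url`, `count_products`, `fetch_page` and `max_pages` are hypothetical, and the stop condition assumes the API returns an empty `itens` list past the last page, which I have not verified.

```python
from typing import Any, Callable, Dict
from urllib.parse import urlencode

# Endpoint and query parameters copied from the URL in the answer above.
API_URL = "https://service-resgatepontos.latampass.latam.com/produtos/buscar"
BASE_PARAMS = {
    "ordenacao": "SCORE",
    "categorias": "b086c071-6d3e-485f-9a55-4f4510bec929",
    "habilitado": "true",
    "publicado": "true",
    "produtoHome": "false",
    "codigosFornecedor": "fastshop",
    "codigoFornecedor": "fastshop",
    "modalidade": "ACUMULO",
}


def build_url(page: int, page_size: int = 100) -> str:
    """Build the search URL for one page of results."""
    params = dict(BASE_PARAMS, page=page, pageSize=page_size)
    return f"{API_URL}?{urlencode(params)}"


def count_products(fetch_page: Callable[[int], Dict[str, Any]],
                   max_pages: int = 100) -> int:
    """Sum the length of 'itens' over consecutive pages.

    fetch_page(page) must return the parsed JSON for that page.
    ASSUMPTION: a page past the end comes back with an empty 'itens' list;
    max_pages is a safety cap in case that assumption is wrong.
    """
    total = 0
    for page in range(1, max_pages + 1):
        items = fetch_page(page).get("itens") or []
        if not items:
            break
        total += len(items)
    return total


# Possible usage with requests and the headers dict from the answer above:
#   import requests
#   total = count_products(
#       lambda p: requests.get(build_url(p), headers=headers).json())
#   print(total)
```

With the single `pageSize=999` call shown earlier, `len(r.json()['itens'])` gives the same number as long as the category holds fewer than 999 products.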

Barry the Platipus

Using only Selenium, to print the number of products displayed on the page and extract their names, you have to induce WebDriverWait for visibility_of_all_elements_located(). You can use the following locator strategy:

  • Code block:

    driver.get('https://resgatepontos.latampass.latam.com/fornecedor/fastshop?modalidade=ac&produtoHome=false&relevancia=SCORE&categorias=b086c071-6d3e-485f-9a55-4f4510bec929&page=1')
    print(len(WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.ng-star-inserted > img[src$='jpg']")))))
    print([my_elem.get_attribute("alt") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.ng-star-inserted > img[src$='jpg']")))])
    driver.quit()
    
  • Console Output:

    12
    ['Smartphone Samsung Galaxy Z Flip4 5G Preto, 256GB, 8GB RAM e Câmera Dupla de 12MP                                                                                                                                                                         ', 'Película Protetora para iPhone 14 Pro Max de Vidro Temperado Transparente - Laut - LT-IP22DPG                                                                                                                                                             ', 'Capa para iPhone 12 Mini em Silicone Laranja kinkan - Apple - MHKN3ZE/A                                                                                                                                                                                   ', 'Adaptador para MacBook Pro com tela Retina de 13" Magsafer2 85 W Branco Apple - MD506BZA                                                                                                                                                                  ', 'Carregador MagSafe Apple - MHXH3BE/A                                                                                                                                                                                                                      ', 'Capa para iPhone 12 e iPhone 12 Pro em Silicone Escalarte - Apple - MHL63ZE/A                                                                                                                                                                             ', 'Capa para iPhone 7 e 8 Plus de Couro Azul-Cosmo - Apple - MQHR2ZM/A                                                                                                                                                                                       ', 'Capa para iPhone 14 Plus com MagSafe em Silicone Suculenta - Apple - MPTC3ZE/A                                                                                                                                          
                                  ', 'Capa para iPhone 12 Mini em Couro Papoula Laranja California - Apple - MHK63ZE/A                                                                                                                                                                          ', 'Capa para iPhone XS Max de Couro Taupe - Apple - MRWR2ZM/A                                                                                                                                                                                                ', 'Cabo Adaptador USB-C para USB 3.0, HDMI e Leitor de Cartões Prateado - Geonav - UCA10                                                                                                                                                                     ', 'iPhone 14 Pro Apple (256GB) Dourado, Tela de 6,1", 5G, Câmera Tripla de 48MP + 12MP + 12MP
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
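The alt values in the console output above are padded with long runs of trailing spaces. A small post-processing step can tidy them up before use. This is only a sketch; `clean_names` is a hypothetical helper name, not part of Selenium.

```python
from typing import Iterable, List, Optional


def clean_names(raw_names: Iterable[Optional[str]]) -> List[str]:
    """Trim each name, collapse internal whitespace runs, drop empty values.

    get_attribute() can return None when the attribute is missing,
    so None entries are skipped as well.
    """
    return [" ".join(name.split()) for name in raw_names if name and name.strip()]
```

It could be applied to the list built in the code block above, for example `clean_names(my_elem.get_attribute("alt") for my_elem in elements)`, where `elements` is the list returned by the wait.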
    
undetected Selenium