I'm trying to scrape (parse) the data shown in a graph from https://www.poder360.com.br/agregador-de-pesquisas/.

I have tried requests, requests-html and BeautifulSoup, but I'm unable to parse the whole page. Even when I right-click and view the page source, it doesn't show the table with the data, whose id is "method-table".

Code from last attempt:

from requests_html import HTMLSession

def get_data(url_path):
    session = HTMLSession()
    r = session.get(url_path)
    # Run the page's JavaScript in headless Chromium before reading the HTML.
    r.html.render(wait=8, sleep=8)
    return r.html

url_path = 'https://www.poder360.com.br/agregador-de-pesquisas'
content = get_data(url_path)
print(content.html)

I also tried the following code:

import requests
from bs4 import BeautifulSoup

url = 'https://www.poder360.com.br/agregador-de-pesquisas'

r = requests.get(url)

soup = BeautifulSoup(r.content, 'html.parser')

print(soup)
rici
encrypted
  • That data table is created dynamically with JavaScript, so it doesn't exist in the page source. You'll need to use a scraper which is capable of running the JavaScript, or figure out the URL from which the data is downloaded (probably as JSON). – rici Sep 20 '22 at 19:55
  • I tried replicating the fetch request for the API but it needs authentication. The link is https://pesquisas.poder360.com.br/api/consulta/fetch?data_pesquisa_de=1900-01-01&data_pesquisa_ate=2999-12-31&cargos_id=3&tipo_id=2&unidades_federativas_id=6&ano=2022&turno=1 – encrypted Sep 20 '22 at 20:35
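The fetch URL quoted in the comment above can be assembled from its query parameters, which makes it easier to vary the filters and to see how the server answers. This is only a minimal sketch using the standard library: `build_fetch_url` and `try_fetch` are hypothetical helper names, the parameter values are copied from the comment, and `headers` is a placeholder because the kind of authentication the API expects is not known here.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import urlencode

# Endpoint and parameter values copied from the comment above.
API_BASE = "https://pesquisas.poder360.com.br/api/consulta/fetch"
PARAMS = {
    "data_pesquisa_de": "1900-01-01",
    "data_pesquisa_ate": "2999-12-31",
    "cargos_id": 3,
    "tipo_id": 2,
    "unidades_federativas_id": 6,
    "ano": 2022,
    "turno": 1,
}


def build_fetch_url(params, base=API_BASE):
    """Return the API URL with the given query parameters appended."""
    return f"{base}?{urlencode(params)}"


def try_fetch(url, headers=None, timeout=30):
    """Request the URL and return (status_code, parsed JSON or None).

    `headers` is a placeholder for whatever credentials the API expects;
    what those are (token, cookie, ...) is an open question in this thread.
    """
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8")
            return response.status, json.loads(body)
    except urllib.error.HTTPError as err:
        # A 401 or 403 here would confirm that the endpoint wants credentials.
        return err.code, None
```

Calling `try_fetch(build_fetch_url(PARAMS))` without headers would presumably return a 401 or 403 status with `None`, matching what the comment reports. If the browser's network tab shows which header or cookie the page itself sends, that could be passed in through `headers`.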

1 Answer

I think that is because the page needs JavaScript to run in order to render fully and show the graph, which does not work with HTMLSession or requests.

If you click "Inspect" in the browser on the page and look at the live DOM instead of the page source, you can search for "circle" and find the data points of the graph.
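Once you have the rendered markup as a string (for example copied from the inspector with "Copy outerHTML"), pulling out the circle elements could look roughly like the sketch below. It uses only the standard library; `CircleCollector`, `extract_circles`, `circle_points` and the sample markup are made up for illustration, and it assumes the chart uses the standard SVG `cx`/`cy` attributes, which I have not verified for this page.

```python
from html.parser import HTMLParser


class CircleCollector(HTMLParser):
    """Collect the attributes of every <circle> element in an HTML/SVG string."""

    def __init__(self):
        super().__init__()
        self.circles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; self-closing tags land here too.
        if tag == "circle":
            self.circles.append(dict(attrs))


def extract_circles(rendered_html):
    """Return one attribute dict per <circle> found in the rendered markup."""
    parser = CircleCollector()
    parser.feed(rendered_html)
    parser.close()
    return parser.circles


def circle_points(circles):
    """Convert circle attribute dicts to (x, y) floats, skipping incomplete ones."""
    return [
        (float(c["cx"]), float(c["cy"]))
        for c in circles
        if "cx" in c and "cy" in c
    ]


# Made-up sample markup standing in for the rendered page.
SAMPLE = (
    '<svg><g>'
    '<circle cx="10" cy="20.5" r="3"></circle>'
    '<circle cx="30" cy="12" r="3"/>'
    '</g></svg>'
)

points = circle_points(extract_circles(SAMPLE))
print(points)
```

Note that these would be pixel coordinates inside the SVG, not the poll percentages themselves; mapping them back to values would require reading the axis scale as well.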

Maybe this could help: Using python Requests with javascript pages

JHNT
  • Thanks for your help JHNT but before posting, I had already found the link you suggested. I tried implementing as shown in the first code snippet but I had no luck. – encrypted Sep 20 '22 at 20:33