I want to scrape some data from https://gps24.juwentus.pl, but to do this it is necessary to log in. I don't know how to authenticate and then fetch the data. Of course, I have a valid login and password. The login page is https://gps24.juwentus.pl/login.
After inspecting the page, I found that the login field is named "login" and the password field is named "pass", from the markup below:
<input class="loginInput" type="text" name="login" value="" placeholder="Login" id="log">
<input class="loginInput" type="password" name="pass" value="" placeholder="Hasło" id="pwd">
I think the login request goes to "https://gps24.juwentus.pl/openid/examples/consumer/try_auth.php", based on this form:
<form method="get" action="/openid/examples/consumer/try_auth.php">
<input type="hidden" name="action" value="verify">
<input type="hidden" name="openid_identifier" value="https://juweid.juwentus.pl:9443/openid/">
<input type="submit" id="submitloginOpenid" value="Zaloguj przez OpenID" style="padding-left: 30px; white-space: normal; padding-right: 30px;" class="login">
</form>
(I also tried https://juweid.juwentus.pl:9443/openid/ as the action, in several different ways.)
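For reference, a form's method, action, and named fields can be read programmatically rather than by eye. The sketch below is a minimal illustration using only the standard library; "FormInspector" and "inspect_forms" are hypothetical helper names, and the HTML is the form quoted above with the inline style omitted. It only reports what the markup declares and does not perform a login.

```python
from html.parser import HTMLParser

# The OpenID form quoted in the question (inline style attribute omitted).
FORM_HTML = """
<form method="get" action="/openid/examples/consumer/try_auth.php">
<input type="hidden" name="action" value="verify">
<input type="hidden" name="openid_identifier" value="https://juweid.juwentus.pl:9443/openid/">
<input type="submit" id="submitloginOpenid" value="Zaloguj przez OpenID" class="login">
</form>
"""


class FormInspector(HTMLParser):
    """Hypothetical helper: records method, action and named inputs per form."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # the form being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "method": (attrs.get("method") or "get").lower(),
                "action": attrs.get("action") or "",
                "fields": {},
            }
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None and attrs.get("name"):
            # Only inputs with a name attribute are sent on submit.
            self._current["fields"][attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def inspect_forms(html):
    """Return a list of dicts describing every form found in the HTML."""
    parser = FormInspector()
    parser.feed(html)
    return parser.forms


forms = inspect_forms(FORM_HTML)
for form in forms:
    print(form["method"], form["action"], form["fields"])
```

For this snippet it reports method "get" and two hidden fields, "action" and "openid_identifier"; the "login" and "pass" inputs are not among this form's fields in the markup quoted here, so they may belong to a different form on the page.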
I tried requests with a session, but I still get the 'not logged in' page (based on the question How to "log in" to a website using Python's Requests module?):
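import requests

# Form field names taken from the <input> elements of the login page
payload = {'login': 'good_login',
           'pass': 'good_password'}

with requests.Session() as c:
    # Submit the credentials; the session keeps any cookies the server sets
    c.post('https://gps24.juwentus.pl/openid/examples/consumer/try_auth.php', data=payload)
    # Request the main page with the same session
    response = c.get('https://gps24.juwentus.pl')
    print(response.text)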
I also tried to reuse the cookies from a logged-in browser session, but that did not work either (I don't want to post them here because I don't know if that is safe).
I also tried http.cookiejar, urllib.request and urllib.parse, following other posts, but I couldn't work out what to put where. I have looked for help in other posts, but many of them seem to be outdated. Any advice on where I am making a mistake? Or is the security on this page simply too strong?
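To make the "what to put where" part concrete, here is a minimal sketch of how the three standard-library modules usually fit together: http.cookiejar stores the cookies, urllib.parse encodes the form fields, and urllib.request sends them. The function names are hypothetical, and LOGIN_URL is an assumption (the address the login form actually submits to is not confirmed in this question), so this shows the wiring rather than a working login for this site.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Assumption: the real submit target of the login form is not confirmed.
LOGIN_URL = "https://gps24.juwentus.pl/login"
DATA_URL = "https://gps24.juwentus.pl"


def build_opener_with_cookies():
    """Create an opener that stores cookies from responses and resends them."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(url, login, password):
    """Encode the form fields ('login' and 'pass') as a POST body."""
    body = urllib.parse.urlencode({"login": login, "pass": password}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def login_and_fetch(login, password):
    """Log in, then fetch the data page with the same opener (same cookies)."""
    opener, jar = build_opener_with_cookies()
    with opener.open(build_login_request(LOGIN_URL, login, password)) as resp:
        resp.read()  # the response cookies are now stored in the jar
    with opener.open(DATA_URL) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling login_and_fetch('good_login', 'good_password') would perform both requests. If the site logs in through an OpenID redirect chain or sets cookies from JavaScript, this plain form POST may not be enough.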
EDIT: I am now using Selenium in headless mode, but it is still very slow. Does anyone know how to make it faster?
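import os
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys

# Selenium 3 API: find_element_by_* and executable_path were removed in later Selenium 4 releases
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.binary_location = r"C:\my_path\chrome.exe"
driver = webdriver.Chrome(executable_path=os.path.abspath("chromedriver"), options=chrome_options)
driver.get("https://gps24.juwentus.pl/")
# Matches the first element with class 'loginInput' (the login field in the markup above)
driver.find_element_by_class_name('loginInput').send_keys('***')
# send_keys() returns None, so it cannot be chained; pass the password and ENTER in one call
driver.find_element_by_name('pass').send_keys('***', Keys.ENTER)
print(driver.find_element_by_name('something'))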
Maybe somebody knows how to scrape a page that is already open and logged in in a browser? That way, getting the data would surely be much faster.
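One approach I have seen suggested for this is to log in once with the browser and then reuse its cookies in plain HTTP requests, so that only the login goes through Selenium. The sketch below is a minimal illustration under stated assumptions: the helper names are hypothetical, the cookie names in the example are made up, and the input is a list of dicts with 'name' and 'value' keys, which is the shape Selenium's driver.get_cookies() returns. Whether this site accepts a session based on cookies alone is not something I can confirm.

```python
import urllib.request


def cookie_header(browser_cookies):
    """Join cookie dicts ('name'/'value' keys) into a Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in browser_cookies)


def build_authenticated_request(url, browser_cookies):
    """Build a request that carries the browser's cookies."""
    return urllib.request.Request(url, headers={"Cookie": cookie_header(browser_cookies)})


# Made-up example values; in practice these would come from driver.get_cookies()
example_cookies = [
    {"name": "PHPSESSID", "value": "abc123"},
    {"name": "lang", "value": "pl"},
]
request = build_authenticated_request("https://gps24.juwentus.pl", example_cookies)
print(request.get_header("Cookie"))
```

The request can then be sent with urllib.request.urlopen(request), which avoids starting a browser for every page fetch.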