After researching and tinkering, I'm stumped on what to try next. I'm essentially looking to do the reverse of this question: Is it possible to "transfer" a session between selenium.webdriver and requests.session

I want to "click" on a JavaScript button on a webpage that I've "reached" through a series of GET/POST requests in a session (it's important that the cookies are maintained and seamlessly transferred since my GET/POST requests are on pages that require a logged-in user).

However, after some googling, I found that requests doesn't seem to offer something like that. I found selenium and have since been trying to properly transfer the cookies over (unsuccessfully).

import requests, requests.utils, lxml.html
from lxml.cssselect import CSSSelector
from selenium import webdriver

# urls which requests will be made to
login_url = 'login-url-here'
logged_in_data_url = 'logged-in-data-here'

# create my Session to contain my cookies
with requests.Session() as s:
    login_html = s.get(login_url)
    tree = lxml.html.fromstring(login_html.text)
    important_key1 = list(set(tree.xpath('//*[@id="fm1"]/div/div[3]/input[1]/@value')))[0]
    important_key2 = list(set(tree.xpath('//*[@id="fm1"]/div/div[3]/input[2]/@value')))[0]
    form_value = "submit"

    login_payload = {
        'post-field-1': 'post-data-1',
        'post-field-2': 'post-data-2',
        'important_key1': 'important_value1',
        'important_key2': 'important_value2',
        'important_key3': 'important_value3'
    }

    login_result = s.post(login_url,
                    data=login_payload,
                    headers = dict(referer=login_url))

    logged_in_data_html = s.get(logged_in_data_url)
    tree = lxml.html.fromstring(logged_in_data_html.text)
    print(logged_in_data_html.text)

    # Attempt at transferring cookies, currently fails
    cookie_dict = requests.utils.dict_from_cookiejar(s.cookies)
    driver = webdriver.Firefox()
    for cookie in cookie_dict:
        driver.add_cookie(cookie)

    driver.get(logged_in_data_url)

    # prints same contents as login_html.text,
    # meaning cookie transfer failed and the session was thrown out
    print(driver.page_source) 

Any advice or pointers on what to do from here?

EDIT: My attempt with selenium-requests:

import seleniumrequests
import lxml.html
from lxml.cssselect import CSSSelector

# urls which requests will be made to
login_url = 'login-url-here'
logged_in_data_url = 'logged-in-data-here'

driver = seleniumrequests.Firefox()

login_html = driver.request('GET', login_url)
tree = lxml.html.fromstring(login_html.text)
important_key1 = list(set(tree.xpath('//*[@id="fm1"]/div/div[3]/input[1]/@value')))[0]
important_key2 = list(set(tree.xpath('//*[@id="fm1"]/div/div[3]/input[2]/@value')))[0]
form_value = "submit"

# the following print statements print value1 and value2, respectively
print("important_key1 = " + important_key1)
print("important_key2 = " + important_key2)

login_payload = {
    'post-field-1': 'post-data-1',
    'post-field-2': 'post-data-2',
    'important_key1': 'important_value1',
    'important_key2': 'important_value2',
    'important_key3': 'important_value3'
}

login_result = driver.request('POST', login_url,
                              data=login_payload,
                              headers = dict(referer=login_url))

# this should print out the landing page after being logged in
# source code contains important_key1, 2, and 3 with different values
# the GET and POST requests seem to be in different sessions
# how do I fix that?
print(login_result.text)
Sean Pianka

1 Answer

I don't believe it is possible to do that natively. There is, however, an extension to Selenium called selenium-requests that you should be able to use.

EDIT:

Try adding the following to your code. From reading the source, this should work (and should use the requests Session that is auto-initialized during the POST request).

response = driver.request('GET', logged_in_data_url)
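
Separately, on the manual cookie transfer attempted in the question: one likely reason it fails is that iterating over the dict returned by `dict_from_cookiejar` yields only the cookie names (plain strings), whereas Selenium's `add_cookie` expects a dict with at least `name` and `value` keys. Selenium also generally needs the browser to be on the cookie's domain before a cookie can be added. The following is a minimal sketch under those assumptions, not a tested fix for the asker's site; `to_selenium_cookies` and `make_cookie` are hypothetical helper names, and `example.com` is a placeholder. It uses only the standard library's `http.cookiejar`, relying on the fact that the jar in `s.cookies` is a `CookieJar` subclass.

```python
from http.cookiejar import Cookie, CookieJar


def to_selenium_cookies(jar):
    """Convert cookies from a CookieJar (such as requests' s.cookies) into
    the dict shape that Selenium's driver.add_cookie() accepts."""
    cookies = []
    for c in jar:
        cookie = {
            "name": c.name,
            "value": c.value,
            "path": c.path,
            "secure": c.secure,
        }
        if c.domain:
            cookie["domain"] = c.domain
        if c.expires is not None:
            cookie["expiry"] = int(c.expires)
        cookies.append(cookie)
    return cookies


def make_cookie(name, value, domain):
    """Hypothetical helper that builds a session cookie for the demo below."""
    return Cookie(
        version=0, name=name, value=value, port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True, secure=False, expires=None,
        discard=True, comment=None, comment_url=None, rest={},
    )


# Stand-in for s.cookies after a successful login (placeholder values).
jar = CookieJar()
jar.set_cookie(make_cookie("sessionid", "abc123", "example.com"))
selenium_cookies = to_selenium_cookies(jar)
print(selenium_cookies)

# With a real browser (not run here), the order of calls matters:
# driver = webdriver.Firefox()
# driver.get(login_url)                        # visit the cookie's domain first
# for cookie in to_selenium_cookies(s.cookies):
#     driver.add_cookie(cookie)
# driver.get(logged_in_data_url)               # now request the protected page
```

Whether the site then accepts the session may still depend on other things the server checks, such as the user agent, so treat this as a starting point.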
2Cubed
  • I did try to use selenium-requests, but I ran into a separate issue when I did the following: 1) create a webdriver via seleniumrequests.Firefox(); 2) issue a GET request on the login_url; 3) perform the xpath scraping to get the data needed for the upcoming POST; 4) attempt to POST with that data; 5) read the page_source from the driver (which still showed the same source as the login_url page, meaning it hadn't logged in). I suppose I can try it again though... – Sean Pianka Apr 12 '16 at 00:10
  • If you post your code as an **Update** to your original post, I may be able to help. – 2Cubed Apr 12 '16 at 00:22
  • I apologize for the long wait -- I have added my attempt/code to the original post (and described more specifically (hopefully) what my issue is with selenium-requests)! – Sean Pianka Apr 12 '16 at 01:05
  • Can you elaborate on the requests.Session() that is auto-initialized during a POST request? The POST request cannot log in because the "important_key#" values are wrong, since the session is not maintained between the GET and POST requests. The GET request obtains the "important_key#" values from the response source, but the POST request accesses the page again in a context where those previously stored values are invalid. This is the issue I'm trying to correct, and I do not see how your code could solve it. Thank you for your time, it's very appreciated! – Sean Pianka Apr 12 '16 at 01:28
  • You may find this helpful. :) http://docs.python-requests.org/en/master/user/advanced/ – 2Cubed Apr 12 '16 at 02:09