
I am trying to automatically process some data on a web page using Python's requests package, but all I get back is "Response [403]". What am I doing wrong? Here is a minimal (non-)working example:

import requests

url = 'http://tools.iedb.org/mhcii/'
seq = "GGPKLRGNVTSNIKFPSDNKGKIIRGSNDKLNKNSEDVLEQSEKSLVSENVPSGLDIDDI"

form_data = {'sequence_text': seq,
             'submit': 'submit'}

response = requests.post(url, data=form_data)

print(response)

I have also tried sending an extra header with browser information, as suggested here and here, but the response stays the same. Any ideas?

UPDATE:

I have incorporated the suggestions below and the 403 error is gone! The web form now returns a valid HTML page that contains the message "You must select an allele". I am not sure what I am doing wrong, as I have set "allele_list" in my request:

import requests

url = 'http://tools.iedb.org/mhcii/'
seq = "GGPKLRGNVTSNIKFPSDNKGKIIRGSNDKLNKNSEDVLEQSEKSLVSENVPSGLDIDDI"

client = requests.Session()

# retrieve the CSRF token
client.get(url)  # sets cookie
if 'csrftoken' in client.cookies:
    csrftoken = client.cookies['csrftoken'] # Django 1.6 and up
else:
    csrftoken = client.cookies['csrf'] # older versions

form_data = {"sequence_text": seq,
             "method": "1",
             "sequence_format": "fasta",
             "locus_list": "Human, HLA-DR",
             "allele_list": "DRB1*01:01", # or HLA-DRB1*01:01
             "sort_output": "MHC_IC50",
             "output_format": "ascii",
             "submit": "submit",
             "csrfmiddlewaretoken": csrftoken}

header_data = {"Referer": url,
               "User-Agent": "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"}

r = client.post(url, data=form_data, headers=header_data)

print(r.text)
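
One way to narrow down a message like "You must select an allele" is to compare the field names the form actually declares with the keys sent in form_data. The following is only a minimal sketch using the standard library: FormFieldParser and compare_fields are hypothetical helper names, and the sample markup and field names are made up to show the mechanics, not copied from the real page. In practice the HTML would come from the text of the initial client.get(url) response.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collects the name attribute of every input, select, and textarea."""

    def __init__(self):
        super().__init__()
        self.field_names = []

    def handle_starttag(self, tag, attrs):
        if tag in ("input", "select", "textarea"):
            name = dict(attrs).get("name")
            if name and name not in self.field_names:
                self.field_names.append(name)


def compare_fields(html, form_data):
    """Return (fields on the page but not sent, fields sent but not on the page)."""
    parser = FormFieldParser()
    parser.feed(html)
    page_fields = set(parser.field_names)
    sent_fields = set(form_data)
    return sorted(page_fields - sent_fields), sorted(sent_fields - page_fields)


# Hypothetical markup and payload; the field names are invented for illustration.
sample_html = """
<form method="post">
  <textarea name="sequence_text"></textarea>
  <select name="hypothetical_allele_field"><option>A</option></select>
</form>
"""
sample_payload = {"sequence_text": "GGPKLRG", "allele_list": "DRB1*01:01"}

missing, unexpected = compare_fields(sample_html, sample_payload)
print("declared by the form but not sent:", missing)
print("sent but not declared by the form:", unexpected)
```

Any name that shows up in only one of the two lists is a candidate for a misspelled or missing form field. Fields that the page adds with JavaScript would not be visible to this kind of static check.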
lordy
  • 403 response means forbidden. So, I would say you are being blocked from making that request. – luis.galdo Feb 22 '19 at 08:53
  • The first thing to do would be to submit all fields in the request. – Klaus D. Feb 22 '19 at 08:55
  • Looking at the webpage, the form is protected by a CSRF token. You will probably need to get the page in a requests session, extract the token, and then post your form. – mfrackowiak Feb 22 '19 at 08:56
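
The "extract the token" step from the last comment can be sketched as follows, for the case where the token has to be read from the page itself rather than from a cookie. This assumes the form embeds it as a hidden input named csrfmiddlewaretoken (the key used in the updated question). CsrfTokenParser and extract_csrf_token are hypothetical helper names, and the sample markup is invented rather than taken from the real page.

```python
from html.parser import HTMLParser


class CsrfTokenParser(HTMLParser):
    """Remembers the value of the first input named csrfmiddlewaretoken."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (tag == "input"
                and attributes.get("name") == "csrfmiddlewaretoken"
                and self.token is None):
            self.token = attributes.get("value")


def extract_csrf_token(html):
    """Return the hidden-field CSRF token found in html, or None if absent."""
    parser = CsrfTokenParser()
    parser.feed(html)
    return parser.token


# Hypothetical sample markup, not copied from the real page.
sample_html = """
<form method="post">
  <input type="hidden" name="csrfmiddlewaretoken" value="abc123">
  <textarea name="sequence_text"></textarea>
</form>
"""
print(extract_csrf_token(sample_html))
```

With a session object, the HTML passed to extract_csrf_token would be the text of the initial GET response, and the returned value would go into the POST data under the same field name.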
