
A similar post asked how to change a form value from [on] to off; with Mechanize, the answer was simply to set the value to 'True' or 'False'.

How would this be accomplished with Scrapy's FormRequest.from_response?

EDIT
For example, using mechanize to get form information,
this is the default that comes with the webpage form.
By default, everything on the form is checked:

<CheckboxControl(ac=[*on])>
type=checkbox, name=ac value=['on']
<CheckboxControl(<None>=[*on])>
type=checkbox, name=None value=[]
<TextControl(p=)>
type=text, name=p value=
<CheckboxControl(pr[]=[*0, *1, *2])>
type=checkbox, name=pr[] value=['0', '1', '2']
<CheckboxControl(a[]=[*0, *1, *2, *3, *4])>
type=checkbox, name=a[] value=['0', '1', '2', '3', '4']
<CheckboxControl(pl=[*on])>
type=checkbox, name=pl value=['on']
<CheckboxControl(sp[]=[*1, *2, *3])>
type=checkbox, name=sp[] value=['1', '2', '3']
<SelectControl(pp=[0, 1, *2, 3])>
type=select, name=pp value=['2']

Note 'ac', '<None>' and 'pl': each has a value of [*on].
The goal is to turn them off (uncheck them).

FormRequest.from_response(response, formnumber=0, formdata={'pr[]': '2', 'sp[]': '3', 'pp': '3', 'a[]': ['3', '4']})

This returns a form with the modified boxes per the formdata. Those keys not mentioned in the formdata are still checked.

Following the example in the above post:

FormRequest.from_response(response, formdata={'live': 'False'})

I have tried FormRequest with a variety of values ('False', 'True', '', [''], 'on', 'off' and 'None') but can't seem to get the right response.

Any suggestions?

EDIT:
Have attempted:

FormRequest(url, formdata = {'pl': 'False'}, callback=parse_this)  
FormRequest(url, formdata = {'pl': 'off'}, callback=parse_this)  
FormRequest(url, formdata = {'pl': ''}, callback=parse_this) 
FormRequest(url, formdata = {'pl': 'None'}, callback=parse_this)
FormRequest(url, formdata = {'pl': None}, callback=parse_this) 

FormRequest.from_response(response, formdata = {'pl': 'False'})  
FormRequest.from_response(response, formdata = {'pl': 'off'})  
FormRequest.from_response(response, formdata = {'pl': ''})

By default, the webpage provides a form whose checkboxes are already checked. The goal is to submit the form with some of those checkboxes turned off; each checkbox has only two states: 'on' or 'off'.

user1460015

1 Answer

A checkbox is an input field like any other: it has a value attribute, which is sent to the server. The only difference is that an unchecked checkbox is not sent at all, while a checked one is sent along with the other fields. A server usually decides whether a checkbox is checked simply by testing whether its name is present in the form data.

You want to "uncheck" the checkbox called 'live'. That means it must not be sent to the server at all.
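
To make the rule above concrete, here is a minimal sketch using only the standard library. The field names 'ac' and 'pl' come from the question's form dump; the (name, value, checked) tuples are a hypothetical stand-in for the form state, not anything Scrapy produces.

```python
from urllib.parse import urlencode

# Hypothetical snapshot of a form: (name, value, checked) per checkbox.
checkboxes = [
    ("ac", "on", True),
    ("pl", "on", False),  # the box we want "off"
]

# Only checked boxes contribute a name=value pair to the request body.
submitted = [(name, value) for name, value, checked in checkboxes if checked]
body = urlencode(submitted)
print(body)  # ac=on
```

The unchecked box leaves no trace in the body: there is no 'pl=off' or 'pl=False', the name is simply missing.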

I would use a subclass of FormRequest (not tested, but you should get the idea):

from scrapy.http import FormRequest


class MyFormRequest(FormRequest):
    """FormRequest subclass that drops fields whose value is None from the submitted form data.
    This allows removing fields that FormRequest.from_response collects automatically from a form."""

    def __init__(self, *args, **kwargs):
        formdata = kwargs.get('formdata')
        if formdata:
            # formdata may be a dict or an iterable of (name, value) pairs
            items = formdata.items() if isinstance(formdata, dict) else formdata
            # filter out input fields with None values
            kwargs['formdata'] = [(name, value) for name, value in items if value is not None]

        super().__init__(*args, **kwargs)

Then use MyFormRequest.from_response instead of FormRequest.from_response.

Another option is to construct the FormRequest manually, passing it only the form data that is needed, without using FormRequest.from_response.
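
A rough sketch of the manual route, kept free of Scrapy imports so it runs on its own. build_formdata is a hypothetical helper, not part of Scrapy, and the default values are copied from the question's form dump.

```python
def build_formdata(defaults, overrides):
    """Merge overrides into the form defaults; a None override removes the field."""
    merged = dict(defaults)
    merged.update(overrides)
    return {name: value for name, value in merged.items() if value is not None}


# Defaults as reported by the question's form dump (everything checked).
defaults = {"ac": "on", "pl": "on", "pp": "2", "sp[]": ["1", "2", "3"]}

# Uncheck 'ac' and 'pl', change the select, keep only one 'sp[]' box.
formdata = build_formdata(defaults, {"ac": None, "pl": None, "pp": "3", "sp[]": ["3"]})
print(formdata)
```

The resulting dict could then be passed as FormRequest(url, formdata=formdata, callback=...), where url and the callback are whatever the spider already uses. Recent Scrapy documentation also describes from_response as leaving out any field whose formdata value is None, which may make the subclass unnecessary on current versions; that is worth checking against the version in use.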

Here is an example of what happens with unchecked checkboxes:

In the PHP script (checkbox-form.php), we can get the submitted option from the $_POST array. If $_POST['formWheelchair'] is "Yes", then the box was checked. If the check box was not checked, $_POST['formWheelchair'] won't be set.
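
The same presence test can be sketched in Python, assuming a server that only checks whether the name was submitted, as the quoted PHP script does. is_checked is a hypothetical helper, and the request bodies are made up from the question's field names.

```python
from urllib.parse import parse_qs


def is_checked(body, name):
    """Presence test: the box counts as checked if its name was submitted at all."""
    # keep_blank_values=True so that an empty value such as "pl=" still counts as present
    return name in parse_qs(body, keep_blank_values=True)


print(is_checked("pl=False&pp=3", "pl"))  # True
print(is_checked("pl=&pp=3", "pl"))       # True
print(is_checked("pp=3", "pl"))           # False
```

Under that assumption, sending 'False', 'off' or an empty string still reads as checked, which would explain why the attempts listed in the question had no effect.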

warvariuc
  • What if some checkboxes are checked by default? So when you load the webpage, the form provides checked boxes (the boxes are already checked by default). The goal is to resubmit the form with some boxes checked and others not. – user1460015 Jul 10 '12 at 19:42
  • Then something like `MyFormRequest.from_response(response, formname='form', formdata = {'live': None})` should work. Keys in the `formdata` dict with `None` values will be excluded from the form data sent to the server. – warvariuc Jul 11 '12 at 03:57
  • I don't have anything to add to my answer. Just see my little update – warvariuc Jul 11 '12 at 17:26