9

I've just discovered that Django doesn't automatically strip out extra whitespace from form field inputs, and I think I understand the rationale ('frameworks shouldn't be altering user input').

I think I know how to remove the excess whitespace using Python's re module:

import re

#data = re.sub(r'\A\s+|\s+\Z', '', data)
data = data.strip()
data = re.sub(r'\s+', ' ', data)
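
The two steps above can be wrapped in a small helper so the same logic is reusable from wherever the cleaning ends up living. This is only a minimal sketch; the name `normalize_whitespace` is hypothetical and not something Django provides.

```python
import re

# Compile once; \s+ matches any run of whitespace (spaces, tabs, newlines).
_WHITESPACE_RUN = re.compile(r"\s+")


def normalize_whitespace(text):
    """Trim both ends and collapse internal whitespace runs to one space."""
    return _WHITESPACE_RUN.sub(" ", text.strip())


print(normalize_whitespace("  hello \t\n  world  "))  # prints: hello world
```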

The question is: where should I do this? Presumably it should happen in one of the form's clean stages, but which one? Ideally, I would like to clean all my fields of extra whitespace. If it should be done in the clean_field() method, I would need a lot of clean_field() methods that all do basically the same thing, which seems like a lot of repetition.

If not the form's cleaning stages, then perhaps in the model that the form is based on?

Braiam
Westerley

8 Answers

11

My approach is borrowed from here. But instead of subclassing django.forms.Form, I use a mixin. That way I can use it with both Form and ModelForm. The method defined here overrides BaseForm's _clean_fields method.

from django.core.exceptions import ValidationError
from django.forms import FileField

class StripWhitespaceMixin(object):
    def _clean_fields(self):
        for name, field in self.fields.items():
            # value_from_datadict() gets the data from the data dictionaries.
            # Each widget type knows how to retrieve its own data, because some
            # widgets split data over several HTML fields.
            value = field.widget.value_from_datadict(self.data, self.files, self.add_prefix(name))

            try:
                if isinstance(field, FileField):
                    initial = self.initial.get(name, field.initial)
                    value = field.clean(value, initial)
                else:
                    if isinstance(value, str):  # use basestring on Python 2
                        value = field.clean(value.strip())
                    else:
                        value = field.clean(value)
                self.cleaned_data[name] = value
                if hasattr(self, 'clean_%s' % name):
                    value = getattr(self, 'clean_%s' % name)()
                    self.cleaned_data[name] = value
            except ValidationError as e:
                self._errors[name] = self.error_class(e.messages)
                if name in self.cleaned_data:
                    del self.cleaned_data[name]

To use it, add the mixin to your form:

class MyForm(StripWhitespaceMixin, ModelForm):
    ...

Also, if you want to trim whitespace when saving models that do not have a form, you can use the following mixin. Models without forms aren't validated by default. I use this when I create objects from JSON data returned by an external REST API call.

class ValidateModelMixin(object):
    def clean(self):
        for field in self._meta.fields:
            value = getattr(self, field.name)

            if value:
                # ducktyping attempt to strip whitespace
                try:
                    setattr(self, field.name, value.strip())
                except Exception:
                    pass

    def save(self, *args, **kwargs):
        self.full_clean()
        super(ValidateModelMixin, self).save(*args, **kwargs)

Then in your models.py:

class MyModel(ValidateModelMixin, Model):
    ....
pymarco
  • Only thing missing is `super(ValidateModelMixin, self).clean()` in case the model it is applied to already has a clean method. – mkoistinen Jul 16 '14 at 16:24
  • 1
    StripWhitespaceMixin is now problematic, because more recent versions of Django have changed the implementation of the `_clean_fields` method, so it misses some behaviour in the base class. See my answer below which won't have this problem. – spookylukey Jul 06 '15 at 14:27
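
To illustrate the point about `super().clean()` without a Django project, here is a rough sketch using plain Python classes. `BaseRecord`, `StripStringsMixin`, and `Person` are hypothetical stand-ins, not Django classes; the only thing the sketch tries to show is that delegating to `super().clean()` lets a `clean()` defined further along the MRO still run.

```python
class BaseRecord:
    """Hypothetical stand-in for a base class that has its own clean()."""

    def clean(self):
        self.base_clean_ran = True


class StripStringsMixin:
    """Strip string attributes, then defer to the next clean() in the MRO."""

    def clean(self):
        # Copy the items so the instance dict is not mutated while iterating.
        for name, value in list(vars(self).items()):
            if isinstance(value, str):
                setattr(self, name, value.strip())
        super().clean()


class Person(StripStringsMixin, BaseRecord):
    def __init__(self, name, age):
        self.name = name
        self.age = age


person = Person("  Ada  ", 36)
person.clean()
print(repr(person.name), person.age, person.base_clean_ran)
```

The `isinstance` check replaces the try/except duck typing, so non-string values such as integers are skipped explicitly.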
9

Create a custom model field so that your custom form field will be used automatically.

from django import forms
from django.db import models

class TrimmedCharFormField(forms.CharField):
    def clean(self, value):
        if value:
            value = value.strip()
        return super(TrimmedCharFormField, self).clean(value)

# (If you use South) add_introspection_rules([], ["^common\.fields\.TrimmedCharField"])
class TrimmedCharField(models.CharField):
    __metaclass__ = models.SubfieldBase

    def formfield(self, **kwargs):
        return super(TrimmedCharField, self).formfield(form_class=TrimmedCharFormField, **kwargs)

Then, in your models, just replace django.db.models.CharField with TrimmedCharField.

philfreo
6

How about doing that in the form's clean() method?

For further documentation see: https://docs.djangoproject.com/en/dev/ref/forms/validation/#cleaning-and-validating-fields-that-depend-on-each-other

Your method could look something like this:

import re

def clean(self):
    cleaned_data = self.cleaned_data
    for k in list(cleaned_data):
        value = cleaned_data[k]
        # Only strings can be normalized; leave other values untouched.
        if isinstance(value, str):
            # Trim both ends, then collapse internal runs to one space.
            cleaned_data[k] = re.sub(r'\s+', ' ', value.strip())
    return cleaned_data
tutuDajuju
Arthur Neves
4

Use the following mixin:

class StripWhitespaceMixin(object):

    def full_clean(self):
        # self.data can be dict (usually empty) or QueryDict here.
        self.data = self.data.copy()
        is_querydict = hasattr(self.data, 'setlist')
        strip = lambda val: val.strip()
        for k in list(self.data.keys()):
            if is_querydict:
                # Build a real list: on Python 3, map() returns a one-shot iterator.
                self.data.setlist(k, [strip(v) for v in self.data.getlist(k)])
            else:
                self.data[k] = strip(self.data[k])
        super(StripWhitespaceMixin, self).full_clean()

Add this as a mixin to your form e.g.:

class MyForm(StripWhitespaceMixin, Form):
    pass

This is similar to pymarco's answer, but doesn't involve copy-pasting and then modifying Django code (the contents of the _clean_fields method).

Instead, it overrides full_clean but calls the original full_clean method after making some adjustments to the input data. This makes it less dependent on implementation details of Django's Form class that might change (and in fact have changed since that answer).
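
The data-preparation step can also be looked at in isolation. Below is a rough sketch on plain dicts, assuming each value is either a single string or a list of strings (roughly the two shapes the mixin deals with). `strip_values` is a hypothetical helper name, and the function does not use Django's QueryDict.

```python
def strip_values(data):
    """Return a copy of data with string values stripped.

    Lists are handled element by element; anything that is not a
    string is passed through unchanged.
    """
    def strip_one(value):
        return value.strip() if isinstance(value, str) else value

    cleaned = {}
    for key, value in data.items():
        if isinstance(value, list):
            cleaned[key] = [strip_one(item) for item in value]
        else:
            cleaned[key] = strip_one(value)
    return cleaned


raw = {"name": "  Ada ", "tags": [" a", "b "], "age": 36}
print(strip_values(raw))
```

Returning a new dict mirrors the `self.data.copy()` call in the mixin: the submitted data is left untouched and only the copy is modified.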

spookylukey
4

Since Django 1.9, you can use the strip keyword argument on the fields in your form definition:

strip (new in Django 1.9)

If True (default), the value will be stripped of leading and trailing whitespace.

Which should give something like:

class MyForm(forms.Form):

    myfield = forms.CharField(min_length=42, strip=True)

And since its default value is True, this should be automatic with Django >= 1.9.

RegexField accepts the same argument.

Braiam
vmonteco
0

If you want to strip() every CharField in your project, it may be simplest to monkeypatch CharField's default cleaning method.

In monkey_patch/__init__.py:

from django.forms.fields import CharField

def new_clean(self, value):
    """Strip leading and trailing whitespace on all CharFields."""
    if value:
        # Other fields subclass CharField, so value may not always be
        # strippable; ignore values that have no strip() method.
        try:
            value = value.strip()
        except AttributeError:
            pass
    return super(CharField, self).clean(value)

CharField.clean = new_clean
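
One variation on this patch is to wrap the existing method rather than replace it, so the patch does not need to know which base class implements clean(). The sketch below uses a hypothetical stand-in class `TextField` and a hypothetical decorator `strip_before`; it is plain Python and not Django's CharField, so treat it as an illustration of the wrapping idea only.

```python
import functools


class TextField:
    """Hypothetical stand-in for a form field class; not Django's CharField."""

    def clean(self, value):
        if not value:
            raise ValueError("This field is required.")
        return value


def strip_before(method):
    """Wrap a clean-style method so string input is stripped first."""
    @functools.wraps(method)
    def wrapper(self, value):
        if isinstance(value, str):
            value = value.strip()
        return method(self, value)
    return wrapper


# Patch the class in place while still delegating to the original method.
TextField.clean = strip_before(TextField.clean)

print(repr(TextField().clean("  hello  ")))
```

Because the stripping happens before the original method runs, whitespace-only input reaches the required check as an empty string and is rejected.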
Community
Aaron
0

In this case, it could be useful to create your own form field (it's not as hard as it sounds). In the clean() method you would remove the extra whitespace.

Quoting the documentation:

You can easily create custom Field classes. To do this, just create a subclass of django.forms.Field. Its only requirements are that it implement a clean() method and that its __init__() method accept the core arguments (required, label, initial, widget, help_text).

More about it: https://docs.djangoproject.com/en/1.3/ref/forms/fields/#creating-custom-fields

juliomalegria
  • This looks very promising! It could address the problem closer to the source, as it were. However, I'm using ModelForms, so to change the form field associated with each model field, I would have to 'redeclare' all the modified fields in the form specification as well. Is there a way around this? Would I have to customize the model field as well to use the modified form field? – Westerley Nov 30 '11 at 04:57
0

One way to do this is to define a custom form field that strips whitespace:

>>> from django import forms
>>> class StripTextField(forms.CharField):
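...     def clean(self, value):
...         # Strip string input, then run CharField's normal validation.
...         if isinstance(value, str):
...             value = value.strip()
...         return super().clean(value)
...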
>>> f = StripTextField()
>>> f.clean('  hello  ')
'hello'

Then to use this in your ModelForm:

class MyForm(ModelForm):
    strip_field = StripTextField()

    class Meta:
        model = MyModel

However, the best place to do this is in your view after the form has been validated; before you do any inserts into the db or other manipulation of data if you are using ModelForms.

You can always create your own non-ModelForm forms and control every aspect of the field and validation that way.

ModelForm's validation adds checks for values that would violate the database constraints. If the field can accept ' hello ' as valid input, ModelForm's is_valid() has no reason to strip the whitespace: doing so would be arbitrary cleaning logic, on top of the point you mentioned that "frameworks shouldn't alter user's input".

Burhan Khalid