11

I have a Post model with a FileField that is used to upload files. How can I validate the file type (PDF for now, possibly other types later)? Preferably I'd like to validate the content, but if that isn't possible I guess checking the suffix would do too. I looked online, but most of the solutions I found are old and no longer work with current Django versions. Any help is appreciated. Thanks.

from django.db import models
from django.utils import timezone


class Post(models.Model):
    author = models.ForeignKey('auth.User',default='')
    title = models.CharField(max_length=200)
    text = models.TextField()
    PDF = models.FileField(null=True, blank=True)
    created_date = models.DateTimeField(
            default=timezone.now)
    published_date = models.DateTimeField(
            blank=True, null=True)

    def publish(self):
        self.published_date = timezone.now()
        self.save()

    def __str__(self):
        return self.title
kichik
hakuro

4 Answers

23

With Django 1.11 or later, you can use FileExtensionValidator. With earlier versions, or for extra validation, you can build your own validator based on it. You should probably write your own validator either way, because of this warning in the documentation:

Don’t rely on validation of the file extension to determine a file’s type. Files can be renamed to have any extension no matter what data they contain.

Here's sample code using the existing validator:

from django.core.validators import FileExtensionValidator
from django.db import models


class Post(models.Model):
    PDF = models.FileField(null=True, blank=True, validators=[FileExtensionValidator(['pdf'])])

The validator's source code is also available, so you can easily create your own:

https://docs.djangoproject.com/en/1.11/_modules/django/core/validators/#FileExtensionValidator
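For projects on older versions, the extension comparison itself is small. The following is a minimal, framework-free sketch of that kind of check; the helper name `has_allowed_extension` is hypothetical and is not part of Django's API. It also illustrates the warning quoted above: a renamed file passes, because only the name is inspected.

```python
from pathlib import Path


def has_allowed_extension(filename, allowed_extensions):
    """Return True if the filename's extension is in the allowed list.

    Hypothetical helper for illustration only; it looks at the name,
    never at the file's bytes.
    """
    extension = Path(filename).suffix[1:].lower()
    return extension in {ext.lower() for ext in allowed_extensions}


# A PDF name passes, and the comparison is case-insensitive.
print(has_allowed_extension("report.PDF", ["pdf"]))
# A renamed executable also passes, since only the last suffix is checked.
print(has_allowed_extension("malware.exe.pdf", ["pdf"]))
# A different extension is rejected.
print(has_allowed_extension("notes.txt", ["pdf"]))
```

This is why an extension check is best treated as a convenience for honest users rather than a security control.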

kichik
  • You **REALLY** should not trust only validation of the file name extension! Django really should include a validator that checks the file's content to be what is claimed by using `libmagic`. See @bimsapi's answer below, and check out https://github.com/mbourqui/django-constrainedfilefield/ or write a custom validator that uses `libmagic`! – peterhil May 22 '19 at 19:09
2

Think of validation in terms of:

  • Name/extension
  • Metadata (content type, size)
  • Actual content (is it really a PNG as the content-type says, or is it a malicious PDF?)

The first two are mostly cosmetic: that information is easy to spoof. By adding content validation (via file magic - https://pypi.python.org/pypi/filemagic) you add a little additional protection.

Here is a good, related answer: "Django: Validate file type of uploaded file". It may be old, but the core idea should be easy to adapt.
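As a rough illustration of content validation without any third-party library, the sketch below checks whether a file-like object starts with the PDF header bytes `%PDF-`. The function name `looks_like_pdf` is hypothetical, and this is a simplification: a library such as libmagic recognises many more formats, and a matching header does not prove the file is well-formed or safe.

```python
import io

PDF_MAGIC = b"%PDF-"  # PDF files begin with this header


def looks_like_pdf(fileobj):
    """Return True if the stream starts with the PDF header bytes.

    Reads only the first few bytes and restores the stream position,
    so later code (e.g. saving the upload) still sees the whole file.
    """
    position = fileobj.tell()
    fileobj.seek(0)
    header = fileobj.read(len(PDF_MAGIC))
    fileobj.seek(position)
    return header == PDF_MAGIC


# Simulated uploads: one with a PDF header, one PNG renamed to .pdf.
real_pdf = io.BytesIO(b"%PDF-1.4\n...rest of file...")
fake_pdf = io.BytesIO(b"\x89PNG\r\n\x1a\n...image data...")

print(looks_like_pdf(real_pdf))
print(looks_like_pdf(fake_pdf))
```

In a Django validator, one would call a check like this on the uploaded file and raise `django.core.exceptions.ValidationError` when it returns False.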

bimsapi
0

Firstly, I'd advise renaming the field from 'PDF' to 'pdf'. Then, to validate in older versions of Django, you could do this:

forms.py

from django import forms

from .models import Post  # assumes Post is defined in this app's models.py


class PostForm(forms.ModelForm):
    # fields here
    class Meta:
        model = Post
        fields = ["title", "text", "pdf"]

    def clean(self):
        cd = self.cleaned_data
        pdf = cd.get('pdf', None)
        # Only a freshly uploaded file has content_type; an unchanged or
        # cleared attachment does not, so the check is skipped for those.
        content_type = getattr(pdf, 'content_type', None)
        if content_type:
            main, _, sub = content_type.partition('/')
            # main will most likely be "application", since the PDF MIME type is
            # application/pdf, but "octet-stream" is accepted too, to be safe.
            if not (main in ["application", "octet-stream"] and sub == "pdf"):
                raise forms.ValidationError(u'Please use a PDF file')
        return cd
phourxx
  • This works great when I upload a file, but somehow gives an error when I try to edit my file and delete the attachment – hakuro Jul 18 '17 at 18:07
  • content_type comes from the user, and they could say it's any filetype they want. It's bad practice to trust that, please use the answer provided by @bimsapi instead – Calum Mackervoy Feb 23 '23 at 13:13
0

Here is a simple example of a form with file type validation, based on the Django 1.11 FileExtensionValidator:

import os

from django import forms

from .models import Profile  # assumes Profile is defined in this app's models.py


class ImageForm(forms.ModelForm):
    ALLOWED_TYPES = ['jpg', 'jpeg', 'png', 'gif']

    class Meta:
        model = Profile
        fields = ['image', ]

    # Django calls clean_<fieldname>(), so the name must match the 'image' field.
    def clean_image(self):
        image = self.cleaned_data.get('image', None)
        if not image:
            raise forms.ValidationError('Missing image file')
        # Extension without the leading dot, lowercased ('' if there is none).
        extension = os.path.splitext(image.name)[1][1:].lower()
        if extension not in self.ALLOWED_TYPES:
            raise forms.ValidationError('File type is not allowed')
        return image
Rani