I have a model class KeyWord:

from django.db import models

class KeyWord(models.Model):
    keyword = models.CharField(verbose_name="Topic", max_length=50)
    link = models.URLField(verbose_name="Next Best Action", max_length=500, null=True)

    def __str__(self):
        return self.keyword

If I create an object like this:

KeyWord.objects.create(keyword="hello", link="world")

I would expect an error to be raised, because I am assigning plain text to link, which is a URLField. Instead, the object is created successfully.

Which field should I use, or what should I do, so that only objects with valid links are saved?

Amandeep Singh
  • Why is it not a valid URL? This resolves to the url "world" relative to the current document location. So if document location is http://hello.com/ then the URL world would be http://hello.com/world. There is no way to force the vanilla Django URLField to require fully qualified URLs. –  Oct 27 '20 at 13:45
  • Actually, I am saving objects by reading data from an Excel file, and I want to raise an error if I don't get a valid link. E.g. if a row has ```keyword="macbook"``` and ```link="https://apple.com"```, I will save that object; if it has ```keyword="macbook"``` and ```link="apple"```, the error message ```apple is not a valid url``` should be passed back and the function should return. – Amandeep Singh Oct 27 '20 at 13:54
  • I hope I have made my question clear now and you can guide me on this. – Amandeep Singh Oct 27 '20 at 13:55

1 Answer

Prerequisite: model validation only happens when you call full_clean(). If you use a ModelForm, this is done for you when you call form.is_valid(), but if you process an uploaded Excel file with custom view logic, then you need to call it yourself. The Django documentation describes what it does:

There are three steps involved in validating a model:

  • Validate the model fields - Model.clean_fields()
  • Validate the model as a whole - Model.clean()
  • Validate the field uniqueness - Model.validate_unique()

All three steps are performed when you call a model’s full_clean() method.
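
For the Excel import case, a minimal sketch of "validate, then save" could look like the following. The helper name create_validated is hypothetical (it is not part of Django), and it assumes model_cls is a Django model class such as KeyWord; it only relies on the instance having full_clean() and save() methods.

```python
def create_validated(model_cls, **fields):
    """Build an instance, validate it, and only then save it.

    Hypothetical helper: model_cls is assumed to be a Django model
    class (for example KeyWord); fields are its field values.
    """
    instance = model_cls(**fields)
    # full_clean() runs clean_fields(), clean() and validate_unique().
    # On a Django model it raises django.core.exceptions.ValidationError
    # for invalid data, so save() below is never reached for a bad row.
    instance.full_clean()
    instance.save()
    return instance
```

Calling create_validated(KeyWord, keyword="macbook", link=row_link) for each row instead of KeyWord.objects.create(...) lets the exception propagate to the view, which can then report the offending value and stop the import.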

As I said earlier, there's no way to tell the URLField to require fully qualified URLs. For that you need to override the URLValidator.

URLValidator is built around a very nasty regular expression that you probably do not want to mess with, so an alternative is to add an additional validator:

from django.core.exceptions import ValidationError
from django.db import models
from django.utils.deconstruct import deconstructible

@deconstructible  # lets migrations serialize the validator instance
class RequireHttpOrHttpsUrl:
    def __call__(self, value):
        # str.startswith accepts a tuple of prefixes
        if not value.startswith(("http://", "https://")):
            raise ValidationError("Please provide an http or https resource")

class KeyWord(models.Model):
    keyword = models.CharField(verbose_name="Topic", max_length=50)
    link = models.URLField(
        verbose_name="Next Best Action",
        max_length=500,
        null=True,
        # Pass an instance of the validator, not the class itself
        validators=[RequireHttpOrHttpsUrl()],
    )

    def __str__(self):
        return self.keyword

On using urllib.parse.urlparse() as suggested in the comments:

I highly suggest playing around with URLValidator to see where urlparse() does better. URLValidator rejects:

  • http://
  • http://bla
  • https://bla

Accepts:

So, I can't find the upside to adding another parser.
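
For comparison, the urlparse() check proposed in the comments can be sketched with the standard library alone. The function name has_http_scheme_and_host is hypothetical, and restricting the scheme to http/https is an assumption taken from the validator above rather than from the comment itself.

```python
from urllib.parse import urlparse

def has_http_scheme_and_host(value):
    """Return True if value parses with an http(s) scheme and a non-empty host part."""
    parsed = urlparse(value)
    # scheme is "" for a bare word such as "apple";
    # netloc is "" when nothing follows "http://".
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

With this sketch, "apple" and "http://" fail the check, while "http://bla" passes because urlparse() only splits the string into components and does not judge whether the host name is plausible.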

  • You can use urllib to make it a bit more rigorous too (example with Python 3.8) ```python from urllib.parse import urlparse parsed = urlparse('https://apple.com/foo') bool(parsed.scheme) and bool(parsed.netloc) # True ``` – Andrew Ingram Oct 27 '20 at 14:36
  • That's not needed, because the default URLValidator is still there, which does a good enough job. –  Oct 27 '20 at 14:37
  • First I applied migrations with ```python manage.py makemigrations appName``` and then ```python manage.py migrate```. Then I uploaded the Excel file again. Values that are ```non http and https``` still got saved. However, I followed this answer https://stackoverflow.com/questions/7160737/how-to-validate-a-url-in-python-malformed-or-not and found the fix. Thanks for your support though. – Amandeep Singh Oct 27 '20 at 14:38
  • @Melvyn it's not about verifying that it's a valid URL, it's about letting the parser do the heavy-lifting of determining which components of the URL are present. In my example case, verifying that we have both the schema and the domain. – Andrew Ingram Oct 27 '20 at 14:40
  • @AmandeepSinghSawhney You probably used the earlier version, I added the validator as a class not an instance before the edit. –  Oct 27 '20 at 14:42
  • @Melvyn Sir, now I am getting this error: ```ValueError: Cannot serialize: There are some values Django cannot serialize into migration files. For more, see https://docs.djangoproject.com/en/1.11/topics/migrations/#migration-serializing``` – Amandeep Singh Oct 27 '20 at 14:46
  • @AndrewIngram Yeah, true I guess. It's all not too computation heavy anyway. I find it a bit overkill for the fact that you can still create urls with hostnames that aren't valid, but give an illusion that you have "real" urls. –  Oct 27 '20 at 14:46
  • @AmandeepSinghSawhney Ack, sorry, I did this off the top of my head and I know I shouldn't. Should be fixed now. –  Oct 27 '20 at 14:49
  • @Melvyn I wish I could show you, but again it saved ```non http and https``` values. Please correct the spelling ```deconstrucible``` to ```deconstructible``` in the answer. This time no error was encountered during migrations. – Amandeep Singh Oct 27 '20 at 15:00
  • Then you are not validating your model. You must call `instance.full_clean()` for validators to work. If you just call `save()`, then your validation is bypassed. See [the documentation](https://docs.djangoproject.com/en/3.1/ref/models/instances/#validating-objects). –  Oct 27 '20 at 15:12
  • @Melvyn I called ```instance.full_clean()``` method before saving object and it worked. Thanks ! – Amandeep Singh Oct 28 '20 at 08:12