0

I have a Django form entry where I want the model saved to the database to have a unique title, or rather a unique "slug" (derived from the title). I do this by appending a counter to the slug. So something like "this-is-a-title-1" and possibly then later "this-is-a-title-2".

But if two users at the same time submit an entry with the same title, only one entry will make it to the database (the slug field is unique); saving the other will throw an IntegrityError. To create a unique counter, I now loop upon encountering such a a duplicate entry, as follows:

class MyClass(models.Model):
    self.title = models.CharField()
    self.slug = models.SlugField(unique=True, blank=False)

    def save(*args, **kwargs):
        baseslug = slugify(self.title)
        self.slug = baseslug + "-1"
        count = 1
        while count < MAXCOUNT:  # MAXCOUNT something like 1000
            try:
                super(MyClass, self).save(*args, **kwargs)
            except IntegrityError as exc:
                count += 1
                self.slug = baseslug + "-{:d}".format(count)
            else:
                break  # we're fine; break out of the loop

I do this through the admin, where the slug is not entered explicitly (but is created in the save() method).

My question is: is this a good idea? Somehow it feels like a work-around that has a better solution.

Another solution I can think of is to pop-up a message to the unfortunate second user, but that would only say that "your entry can't be saved at this time. Try again in a few seconds." I would then also first need to get the current count from existing (identical-except-for-the-counter) slugs. The above method automatically avoids that, although it will run into problems if ever there are more than MAXCOUNT identical slugs.

Are there any other suggestions, or does this seems like a viable solution?

(Note: I came across this similar question suggested by SO while entering my question, but there, a get() is first performed before saving the entry. This potentially allows enough time between the get() and save() to still attempt to save duplicate entries, thus causing an IntegrityError. Otherwise it appears to suggest pretty much the same solution as I have now.)

Community
  • 1
  • 1
  • 2
    Why is it important the slug be unique? – yuvi Jan 09 '14 at 15:09
  • could you use the `pk`? which is guarenteed to be unique across the table. – dm03514 Jan 09 '14 at 15:34
  • @yuvi It provides unique URLs. –  Jan 09 '14 at 15:57
  • 1
    @dm03514 Not sure I understand: I'd need to fetch the `pk` from the table, and thus I'd need to save the entry first, for which I want the `pk`. Or do you mean, when I need to read the entry, fetch the (non-unique) slug and an (automatically generated) `pk`, and generate a unique URL (see comment above) from that? Much like the URL of an SO question. –  Jan 09 '14 at 16:00
  • Using them together to create unique urls is the answer. It's neat, natural and practical. btw you're question is a good example of why you should always prefer surrogate keys over natural ones – yuvi Jan 09 '14 at 21:46
  • @yuvi, what are the surrogate versus natural keys here? I would think `pk` is the natural one (automatically generated by the database), and thus preferred over a manually created one such as I do above (through the loop counter). In which case natural is preferred instead. –  Jan 10 '14 at 09:31
  • nope. surrogate keys are business-meaningless keys that are only used to reference a row in a table and natural keys are ones that usually have meaning (that's what makes them natural - just like making a url with a slug instead of a number is more natural). You can read about it [here](http://stackoverflow.com/a/590469/2387772) and [here](http://decipherinfosys.wordpress.com/2007/02/01/surrogate-keys-vs-natural-keys-for-primary-key/) – yuvi Jan 10 '14 at 09:42
  • Ah, cool, thanks for that. But that's behind the scenes then. In something more visible like a URL, I'd prefer both, and I see that e.g. SO does the same. In fact, when you leave off the natural part in the URL, SO happily uses the surrogate key to direct you to the full URL; neat. –  Jan 10 '14 at 10:04

3 Answers3

2

If you're just needing to have a value to pass through the querystring, you could just add a property to the model, that combines the slug + the pk:

class MyClass(models.Model):
    title = models.CharField(max_length=255)
    slug = models.SlugField(max_length=300)

    @property
    def url_slug(self):
        return '{}-{}'.format(self.slug, self.id)

Then you don't have to worry about counts being off, race conditions, etc and you only have one write operation.

Brandon Taylor
  • 33,823
  • 15
  • 104
  • 144
  • I guess this is much like dm03514's suggestion in the comments above. Or at least, my second understanding in my reply to that comment. –  Jan 09 '14 at 16:07
  • Yes, except that you don't have to do the save override, or piece together the parameters in a url tag. – Brandon Taylor Jan 09 '14 at 16:36
  • Thanks. This seems most logical and practical, and indeed what I see on lots of other sites (including SO). –  Jan 10 '14 at 09:37
1

If You must have an counter inside slug, much better way will be:

class MyClass(models.Model):
    self.title = models.CharField()
    self.slug = models.SlugField(unique=True, blank=False)

    def save(self, *args, **kwargs):
        baseslug = slugify(self.title)
        try:
            last = MyClass.objects.filter(slug__regex="%s-\d+" % baseslug).latest('id')
            # safe, becouse slugify won't return regex special chars
            # also field with latest id will have highest counter in slug if it was populated that way

            trash, count = last.slug.rsplit('-', 1)            
            count = int(count)+1
        except:
            count = 1
        while count < MAXCOUNT:
            self.slug = "%s-%d" % (baseslug, count)
            try: 
                super(MyClass, self).save(*args, **kwargs)
                break
            except IntegrityError as exc:
                count += 1

it will query database only once to check what number put at the end of slug.

GwynBleidD
  • 20,081
  • 5
  • 46
  • 77
  • This is mentioned in the last paragraph in my question (pointing to a related question): between filtering the last slug and saving a new one, another entry could also query the database in that fraction of time (basically a race condition). The second one would still get the same slug counter, thus two identical slugs would be saved in the end. –  Jan 09 '14 at 17:24
  • Yeah, You have right... but we can combine it with Your solution from question. I've arleady edited my answer. You can avoid lot of queries with this solution. – GwynBleidD Jan 10 '14 at 08:58
0

You could do two writes in your save method:

  1. Save the instance with a temporary random slug (to satisfy uniqueness constraint)
  2. Calculate the slug using the PK and save again

Something like this:

class MyClass(models.Model):
    title = models.CharField(max_length=255)
    slug = models.SlugField(max_length=300, blank=True)

    def save(self, *args, **kwargs):
        if not self.slug:
            # Slug field must be unique, so give it a temporary throw-away value
            temp_slug = uuid.uuid4().hex
            self.slug = temp_slug
            super(MyClass, self).save(*args, **kwargs)
            self.slug = "{}-{}".format(slugify(self.title), self.pk)
        super(MyClass, self).save(*args, **kwargs)

This has the advantage of only requiring a max of two DB writes per save call.

Chris Lawlor
  • 47,306
  • 11
  • 48
  • 68