34
class Tag(models.Model):
  name = models.CharField(maxlength=100)

class Blog(models.Model):
  name = models.CharField(maxlength=100)
  tags =  models.ManyToManyField(Tag)

Simple models just to ask my question.

I wonder how can i query blogs using tags in two different ways.

  • Blog entries that are tagged with "tag1" or "tag2": Blog.objects.filter(tags_in=[1,2]).distinct()
  • Blog objects that are tagged with "tag1" and "tag2" : ?
  • Blog objects that are tagged with exactly "tag1" and "tag2" and nothing else : ??

Tag and Blog is just used for an example.

Josh Scholl
  • 143
  • 15
hamdiakoguz
  • 15,795
  • 9
  • 33
  • 27
  • Check out [this question](http://stackoverflow.com/q/12752601/1226722) with a really great answer. Might be helpful (I am aware this question is ~6 years old, but I still found it while searching for answers!) – gregoltsov Feb 26 '14 at 18:33
  • This is more of an issue of an or in the where clause rather than an actual SQL union. If you're looking for a union look at https://stackoverflow.com/questions/4411049/how-can-i-find-the-union-of-two-django-querysets – jocassid Jul 13 '17 at 22:30

4 Answers4

23

You could use Q objects for #1:

# Blogs who have either hockey or django tags.
from django.db.models import Q
Blog.objects.filter(
    Q(tags__name__iexact='hockey') | Q(tags__name__iexact='django')
)

Unions and intersections, I believe, are a bit outside the scope of the Django ORM, but its possible to to these. The following examples are from a Django application called called django-tagging that provides the functionality. Line 346 of models.py:

For part two, you're looking for a union of two queries, basically

def get_union_by_model(self, queryset_or_model, tags):
    """
    Create a ``QuerySet`` containing instances of the specified
    model associated with *any* of the given list of tags.
    """
    tags = get_tag_list(tags)
    tag_count = len(tags)
    queryset, model = get_queryset_and_model(queryset_or_model)

    if not tag_count:
        return model._default_manager.none()

    model_table = qn(model._meta.db_table)
    # This query selects the ids of all objects which have any of
    # the given tags.
    query = """
    SELECT %(model_pk)s
    FROM %(model)s, %(tagged_item)s
    WHERE %(tagged_item)s.content_type_id = %(content_type_id)s
      AND %(tagged_item)s.tag_id IN (%(tag_id_placeholders)s)
      AND %(model_pk)s = %(tagged_item)s.object_id
    GROUP BY %(model_pk)s""" % {
        'model_pk': '%s.%s' % (model_table, qn(model._meta.pk.column)),
        'model': model_table,
        'tagged_item': qn(self.model._meta.db_table),
        'content_type_id': ContentType.objects.get_for_model(model).pk,
        'tag_id_placeholders': ','.join(['%s'] * tag_count),
    }

    cursor = connection.cursor()
    cursor.execute(query, [tag.pk for tag in tags])
    object_ids = [row[0] for row in cursor.fetchall()]
    if len(object_ids) > 0:
        return queryset.filter(pk__in=object_ids)
    else:
        return model._default_manager.none()

For part #3 I believe you're looking for an intersection. See line 307 of models.py

def get_intersection_by_model(self, queryset_or_model, tags):
    """
    Create a ``QuerySet`` containing instances of the specified
    model associated with *all* of the given list of tags.
    """
    tags = get_tag_list(tags)
    tag_count = len(tags)
    queryset, model = get_queryset_and_model(queryset_or_model)

    if not tag_count:
        return model._default_manager.none()

    model_table = qn(model._meta.db_table)
    # This query selects the ids of all objects which have all the
    # given tags.
    query = """
    SELECT %(model_pk)s
    FROM %(model)s, %(tagged_item)s
    WHERE %(tagged_item)s.content_type_id = %(content_type_id)s
      AND %(tagged_item)s.tag_id IN (%(tag_id_placeholders)s)
      AND %(model_pk)s = %(tagged_item)s.object_id
    GROUP BY %(model_pk)s
    HAVING COUNT(%(model_pk)s) = %(tag_count)s""" % {
        'model_pk': '%s.%s' % (model_table, qn(model._meta.pk.column)),
        'model': model_table,
        'tagged_item': qn(self.model._meta.db_table),
        'content_type_id': ContentType.objects.get_for_model(model).pk,
        'tag_id_placeholders': ','.join(['%s'] * tag_count),
        'tag_count': tag_count,
    }

    cursor = connection.cursor()
    cursor.execute(query, [tag.pk for tag in tags])
    object_ids = [row[0] for row in cursor.fetchall()]
    if len(object_ids) > 0:
        return queryset.filter(pk__in=object_ids)
    else:
        return model._default_manager.none()
Clint Ecker
  • 1,522
  • 1
  • 10
  • 12
17

I've tested these out with Django 1.0:

The "or" queries:

Blog.objects.filter(tags__name__in=['tag1', 'tag2']).distinct()

or you could use the Q class:

Blog.objects.filter(Q(tags__name='tag1') | Q(tags__name='tag2')).distinct()

The "and" query:

Blog.objects.filter(tags__name='tag1').filter(tags__name='tag2')

I'm not sure about the third one, you'll probably need to drop to SQL to do it.

Ycros
  • 1,720
  • 2
  • 14
  • 10
  • Hrm, this "and" query looks like a handy trick, except you won't know at the outset how many times .filter will need to be applied. The user could be looking for dog+goat+cat, in which case you'd need .filter twice. – mlissner Jun 05 '11 at 23:37
  • Regarding dynamic application of the "and" query - just iterate through tags and accumulate filtering using: query = query.filter(tags__name='tagN') – Lukasz Aug 15 '16 at 18:58
  • I think that first example does the trick. I was pondering whether or not the distinct was needed. In SQL terms you would have 2 joins Blog to BlogTagLink and BlogTagLink to Tag a given Blog record would be listed multiple times in the result set. – jocassid Jul 13 '17 at 22:28
10

Please don't reinvent the wheel and use django-tagging application which was made exactly for your use case. It can do all queries you describe, and much more.

If you need to add custom fields to your Tag model, you can also take a look at my branch of django-tagging.

zuber
  • 3,449
  • 2
  • 24
  • 19
6

This will do the trick for you

Blog.objects.filter(tags__name__in=['tag1', 'tag2']).annotate(tag_matches=models.Count(tags)).filter(tag_matches=2)
amit
  • 61
  • 1
  • 1