Django documentation

  • 1.7
  • Documentation version: development

Custom lookups

New in Django 1.7.

By default Django offers a wide variety of built-in lookups for filtering (for example, exact and icontains). This documentation explains how to write custom lookups and how to alter the working of existing lookups.

A simple Lookup example

Let’s start with a simple custom lookup. We will write a custom lookup ne which works opposite to exact. Author.objects.filter(name__ne='Jack') will translate to the SQL:

"author"."name" <> 'Jack'

This SQL is backend independent, so we don’t need to worry about different databases.

There are two steps to making this work. Firstly we need to implement the lookup, then we need to tell Django about it. The implementation is quite straightforward:

from django.db.models import Lookup

class NotEqual(Lookup):
    lookup_name = 'ne'

    def as_sql(self, qn, connection):
        lhs, lhs_params = self.process_lhs(qn, connection)
        rhs, rhs_params = self.process_rhs(qn, connection)
        params = lhs_params + rhs_params
        return '%s <> %s' % (lhs, rhs), params

To register the NotEqual lookup we will just need to call register_lookup on the field class we want the lookup to be available. In this case, the lookup makes sense on all Field subclasses, so we register it with Field directly:

from django.db.models.fields import Field
Field.register_lookup(NotEqual)

We can now use foo__ne for any field foo. You will need to ensure that this registration happens before you try to create any querysets using it. You could place the implementation in a models.py file, or register the lookup in the ready() method of an AppConfig.

Taking a closer look at the implementation, the first required attribute is lookup_name. This allows the ORM to understand how to interpret name__ne and use NotEqual to generate the SQL. By convention, these names are always lowercase strings containing only letters, but the only hard requirement is that it must not contain the string __.

We then need to define the as_sql method. This takes a SQLCompiler object, called qn, and the active database connection. SQLCompiler objects are not documented, but the only thing we need to know about them is that they have a compile() method which returns a tuple containing a SQL string, and the parameters to be interpolated into that string. In most cases, you don’t need to use it directly and can pass it on to process_lhs() and process_rhs().

A Lookup works against two values, lhs and rhs, standing for left-hand side and right-hand side. The left-hand side is usually a field reference, but it can be anything implementing the query expression API. The right-hand is the value given by the user. In the example Author.objects.filter(name__ne='Jack'), the left-hand side is a reference to the name field of the Author model, and 'Jack' is the right-hand side.

We call process_lhs and process_rhs to convert them into the values we need for SQL using the qn object described before. These methods return tuples containing some SQL and the parameters to be interpolated into that SQL, just as we need to return from our as_sql method. In the above example, process_lhs returns ('"author"."name"', []) and process_rhs returns ('"%s"', ['Jack']). In this example there were no parameters for the left hand side, but this would depend on the object we have, so we still need to include them in the parameters we return.

Finally we combine the parts into a SQL expression with <>, and supply all the parameters for the query. We then return a tuple containing the generated SQL string and the parameters.

A simple transformer example

The custom lookup above is great, but in some cases you may want to be able to chain lookups together. For example, let’s suppose we are building an application where we want to make use of the abs() operator. We have an Experiment model which records a start value, end value and the change (start - end). We would like to find all experiments where the change was equal to a certain amount (Experiment.objects.filter(change__abs=27)), or where it did not exceed a certain amount (Experiment.objects.filter(change__abs__lt=27)).

Note

This example is somewhat contrived, but it demonstrates nicely the range of functionality which is possible in a database backend independent manner, and without duplicating functionality already in Django.

We will start by writing a AbsoluteValue transformer. This will use the SQL function ABS() to transform the value before comparison:

from django.db.models import Transform

class AbsoluteValue(Transform):
    lookup_name = 'abs'

    def as_sql(self, qn, connection):
        lhs, params = qn.compile(self.lhs)
        return "ABS(%s)" % lhs, params

Next, lets register it for IntegerField:

from django.db.models import IntegerField
IntegerField.register_lookup(AbsoluteValue)

We can now run the queries we had before. Experiment.objects.filter(change__abs=27) will generate the following SQL:

SELECT ... WHERE ABS("experiments"."change") = 27

By using Transform instead of Lookup it means we are able to chain further lookups afterwards. So Experiment.objects.filter(change__abs__lt=27) will generate the following SQL:

SELECT ... WHERE ABS("experiments"."change") < 27

Subclasses of Transform usually only operate on the left-hand side of the expression. Further lookups will work on the transformed value. Note that in this case where there is no other lookup specified, Django interprets change__abs=27 as change__abs__exact=27.

When looking for which lookups are allowable after the Transform has been applied, Django uses the output_type attribute. We didn’t need to specify this here as it didn’t change, but supposing we were applying AbsoluteValue to some field which represents a more complex type (for example a point relative to an origin, or a complex number) then we may have wanted to specify output_type = FloatField, which will ensure that further lookups like abs__lte behave as they would for a FloatField.

Writing an efficient abs__lt lookup

When using the above written abs lookup, the SQL produced will not use indexes efficiently in some cases. In particular, when we use change__abs__lt=27, this is equivalent to change__gt=-27 AND change__lt=27. (For the lte case we could use the SQL BETWEEN).

So we would like Experiment.objects.filter(change__abs__lt=27) to generate the following SQL:

SELECT .. WHERE "experiments"."change" < 27 AND "experiments"."change" > -27

The implementation is:

from django.db.models import Lookup

class AbsoluteValueLessThan(Lookup):
    lookup_name = 'lt'

    def as_sql(self, qn, connection):
        lhs, lhs_params = qn.compile(self.lhs.lhs)
        rhs, rhs_params = self.process_rhs(qn, connection)
        params = lhs_params + rhs_params + lhs_params + rhs_params
        return '%s < %s AND %s > -%s' % (lhs, rhs, lhs, rhs), params

AbsoluteValue.register_lookup(AbsoluteValueLessThan)

There are a couple of notable things going on. First, AbsoluteValueLessThan isn’t calling process_lhs(). Instead it skips the transformation of the lhs done by AbsoluteValue and uses the original lhs. That is, we want to get 27 not ABS(27). Referring directly to self.lhs.lhs is safe as AbsoluteValueLessThan can be accessed only from the AbsoluteValue lookup, that is the lhs is always an instance of AbsoluteValue.

Notice also that as both sides are used multiple times in the query the params need to contain lhs_params and rhs_params multiple times.

The final query does the inversion (27 to -27) directly in the database. The reason for doing this is that if the self.rhs is something else than a plain integer value (for example an F() reference) we can’t do the transformations in Python.

Note

In fact, most lookups with __abs could be implemented as range queries like this, and on most database backends it is likely to be more sensible to do so as you can make use of the indexes. However with PostgreSQL you may want to add an index on abs(change) which would allow these queries to be very efficient.

Writing alternative implementations for existing lookups

Sometimes different database vendors require different SQL for the same operation. For this example we will rewrite a custom implementation for MySQL for the NotEqual operator. Instead of <> we will be using != operator. (Note that in reality almost all databases support both, including all the official databases supported by Django).

We can change the behavior on a specific backend by creating a subclass of NotEqual with a as_mysql method:

class MySQLNotEqual(NotEqual):
    def as_mysql(self, qn, connection):
        lhs, lhs_params = self.process_lhs(qn, connection)
        rhs, rhs_params = self.process_rhs(qn, connection)
        params = lhs_params + rhs_params
        return '%s != %s' % (lhs, rhs), params
Field.register_lookup(MySQLNotExact)

We can then register it with Field. It takes the place of the original NotEqual class as it has the same lookup_name.

When compiling a query, Django first looks for as_%s % connection.vendor methods, and then falls back to as_sql. The vendor names for the in-built backends are sqlite, postgresql, oracle and mysql.

How Django determines the lookups and transforms which are used

In some cases you may which to dynamically change which Transform or Lookup is returned based on the name passed in, rather than fixing it. As an example, you could have a field which stores coordinates or an arbitrary dimension, and wish to allow a syntax like .filter(coords__x7=4) to return the objects where the 7th coordinate has value 4. In order to do this, you would override get_lookup with something like:

class CoordinatesField(Field):
    def get_lookup(self, lookup_name):
        if lookup_name.startswith('x'):
            try:
                dimension = int(lookup_name[1:])
            except ValueError:
                pass
            finally:
                return get_coordinate_lookup(dimension)
        return super(CoordinatesField, self).get_lookup(lookup_name)

You would then define get_coordinate_lookup appropriately to return a Lookup subclass which handles the relevant value of dimension.

There is a similarly named method called get_transform(). get_lookup() should always return a Lookup subclass, and get_transform() a Transform subclass. It is important to remember that Transform objects can be further filtered on, and Lookup objects cannot.

When filtering, if there is only one lookup name remaining to be resolved, we will look for a Lookup. If there are multiple names, it will look for a Transform. In the situation where there is only one name and a Lookup is not found, we look for a Transform and then the exact lookup on that Transform. All call sequences always end with a Lookup. To clarify:

  • .filter(myfield__mylookup) will call myfield.get_lookup('mylookup').
  • .filter(myfield__mytransform__mylookup) will call myfield.get_transform('mytransform'), and then mytransform.get_lookup('mylookup').
  • .filter(myfield__mytransform) will first call myfield.get_lookup('mytransform'), which will fail, so it will fall back to calling myfield.get_transform('mytransform') and then mytransform.get_lookup('exact').

Lookups and transforms are registered using the same API - register_lookup.

The Query Expression API

A lookup can assume that the lhs responds to the query expression API. Currently direct field references, aggregates and Transform instances respond to this API.

as_sql(qn, connection)

Responsible for producing the query string and parameters for the expression. The qn is a SQLCompiler object, which has a compile() method that can be used to compile other expressions. The connection is the connection used to execute the query.

Calling expression.as_sql() directly is usually incorrect - instead qn.compile(expression) should be used. The qn.compile() method will take care of calling vendor-specific methods of the expression.

as_vendorname(qn, connection)

Works like as_sql() method. When an expression is compiled by qn.compile(), Django will first try to call as_vendorname(), where vendorname is the vendor name of the backend used for executing the query. The vendorname is one of postgresql, oracle, sqlite or mysql for Django’s built-in backends.

get_lookup(lookup_name)

The get_lookup() method is used to fetch lookups. By default the lookup is fetched from the expression’s output type in the same way described in registering and fetching lookup documentation below. It is possible to override this method to alter that behavior.

get_transform(lookup_name)

The get_transform() method is used when a transform is needed rather than a lookup, or if a lookup is not found. This is a more complex situation which is useful when there arbitrary possible lookups for a field. Generally speaking, you will not need to override get_lookup() or get_transform(), and can use register_lookup() instead.

output_type

The output_type attribute is used by the get_lookup() method to check for lookups. The output_type should be a field.

Note that this documentation lists only the public methods of the API.

Lookup reference

class Lookup

In addition to the attributes and methods below, lookups also support as_sql and as_vendorname from the query expression API.

lhs

The lhs (left-hand side) of a lookup tells us what we are comparing the rhs to. It is an object which implements the query expression API. This is likely to be a field, an aggregate or a subclass of Transform.

rhs

The rhs (right-hand side) of a lookup is the value we are comparing the left hand side to. It may be a plain value, or something which compiles into SQL, for example an F() object or a Queryset.

lookup_name

This class level attribute is used when registering lookups. It determines the name used in queries to trigger this lookup. For example, contains or exact. This should not contain the string __.

process_lhs(qn, connection)

This returns a tuple of (lhs_string, lhs_params). In some cases you may wish to compile lhs directly in your as_sql methods using qn.compile(self.lhs).

process_rhs(qn, connection)

Behaves the same as process_lhs but acts on the right-hand side.

Transform reference

class Transform

In addition to implementing the query expression API Transforms have the following methods and attributes.

lhs

The lhs (left-hand-side) of a transform contains the value to be transformed. The lhs implements the query expression API.

lookup_name

This class level attribute is used when registering lookups. It determines the name used in queries to trigger this lookup. For example, year or dayofweek. This should not contain the string __.

Registering and fetching lookups

The lookup registration API is explained below.

classmethod register_lookup(lookup)

Registers the Lookup or Transform for the class. For example DateField.register_lookup(YearExact) will register YearExact for all DateFields in the project, but also for fields that are instances of a subclass of DateField (for example DateTimeField). You can register a Lookup or a Transform using the same class method.

get_lookup(lookup_name)

Django uses get_lookup(lookup_name) to fetch lookups. The implementation of get_lookup() looks for a subclass which is registered for the current class with the correct lookup_name.

get_transform(lookup_name)

Django uses get_transform(lookup_name) to fetch lookups. The implementation of get_transform() looks for a subclass which is registered for the current class with the correct transform_name.

The lookup registration API is available for Transform and Field classes.

Questions/Feedback

Having trouble? We'd like to help!

This document is for Django's development version, which can be significantly different from previous releases. For older releases, use the version selector floating in the bottom right corner of this page.