PostgreSQL specific model fields¶
All of these fields are available from the django.contrib.postgres.fields
module.
ArrayField
¶
-
class
ArrayField
(base_field, size=None, **options)[source]¶ A field for storing lists of data. Most field types can be used, you simply pass another field instance as the
base_field
. You may also specify asize
.ArrayField
can be nested to store multi-dimensional arrays.If you give the field a
default
, ensure it’s a callable such aslist
(for an empty default) or a callable that returns a list (such as a function). Incorrectly usingdefault=[]
creates a mutable default that is shared between all instances ofArrayField
.-
base_field
¶ This is a required argument.
Specifies the underlying data type and behavior for the array. It should be an instance of a subclass of
Field
. For example, it could be anIntegerField
or aCharField
. Most field types are permitted, with the exception of those handling relational data (ForeignKey
,OneToOneField
andManyToManyField
).It is possible to nest array fields - you can specify an instance of
ArrayField
as thebase_field
. For example:from django.db import models from django.contrib.postgres.fields import ArrayField class ChessBoard(models.Model): board = ArrayField( ArrayField( models.CharField(max_length=10, blank=True), size=8, ), size=8, )
Transformation of values between the database and the model, validation of data and configuration, and serialization are all delegated to the underlying base field.
-
size
¶ This is an optional argument.
If passed, the array will have a maximum size as specified. This will be passed to the database, although PostgreSQL at present does not enforce the restriction.
-
Note
When nesting ArrayField
, whether you use the size parameter or not,
PostgreSQL requires that the arrays are rectangular:
from django.contrib.postgres.fields import ArrayField
from django.db import models
class Board(models.Model):
pieces = ArrayField(ArrayField(models.IntegerField()))
# Valid
Board(pieces=[
[2, 3],
[2, 1],
])
# Not valid
Board(pieces=[
[2, 3],
[2],
])
If irregular shapes are required, then the underlying field should be made
nullable and the values padded with None
.
Querying ArrayField
¶
There are a number of custom lookups and transforms for ArrayField
.
We will use the following example model:
from django.db import models
from django.contrib.postgres.fields import ArrayField
class Post(models.Model):
name = models.CharField(max_length=200)
tags = ArrayField(models.CharField(max_length=200), blank=True)
def __str__(self): # __unicode__ on Python 2
return self.name
contains
¶
The contains
lookup is overridden on ArrayField
. The
returned objects will be those where the values passed are a subset of the
data. It uses the SQL operator @>
. For example:
>>> Post.objects.create(name='First post', tags=['thoughts', 'django'])
>>> Post.objects.create(name='Second post', tags=['thoughts'])
>>> Post.objects.create(name='Third post', tags=['tutorial', 'django'])
>>> Post.objects.filter(tags__contains=['thoughts'])
<QuerySet [<Post: First post>, <Post: Second post>]>
>>> Post.objects.filter(tags__contains=['django'])
<QuerySet [<Post: First post>, <Post: Third post>]>
>>> Post.objects.filter(tags__contains=['django', 'thoughts'])
<QuerySet [<Post: First post>]>
contained_by
¶
This is the inverse of the contains
lookup -
the objects returned will be those where the data is a subset of the values
passed. It uses the SQL operator <@
. For example:
>>> Post.objects.create(name='First post', tags=['thoughts', 'django'])
>>> Post.objects.create(name='Second post', tags=['thoughts'])
>>> Post.objects.create(name='Third post', tags=['tutorial', 'django'])
>>> Post.objects.filter(tags__contained_by=['thoughts', 'django'])
<QuerySet [<Post: First post>, <Post: Second post>]>
>>> Post.objects.filter(tags__contained_by=['thoughts', 'django', 'tutorial'])
<QuerySet [<Post: First post>, <Post: Second post>, <Post: Third post>]>
overlap
¶
Returns objects where the data shares any results with the values passed. Uses
the SQL operator &&
. For example:
>>> Post.objects.create(name='First post', tags=['thoughts', 'django'])
>>> Post.objects.create(name='Second post', tags=['thoughts'])
>>> Post.objects.create(name='Third post', tags=['tutorial', 'django'])
>>> Post.objects.filter(tags__overlap=['thoughts'])
<QuerySet [<Post: First post>, <Post: Second post>]>
>>> Post.objects.filter(tags__overlap=['thoughts', 'tutorial'])
<QuerySet [<Post: First post>, <Post: Second post>, <Post: Third post>]>
len
¶
Returns the length of the array. The lookups available afterwards are those
available for IntegerField
. For example:
>>> Post.objects.create(name='First post', tags=['thoughts', 'django'])
>>> Post.objects.create(name='Second post', tags=['thoughts'])
>>> Post.objects.filter(tags__len=1)
<QuerySet [<Post: Second post>]>
Index transforms¶
This class of transforms allows you to index into the array in queries. Any
non-negative integer can be used. There are no errors if it exceeds the
size
of the array. The lookups available after the
transform are those from the base_field
. For
example:
>>> Post.objects.create(name='First post', tags=['thoughts', 'django'])
>>> Post.objects.create(name='Second post', tags=['thoughts'])
>>> Post.objects.filter(tags__0='thoughts')
<QuerySet [<Post: First post>, <Post: Second post>]>
>>> Post.objects.filter(tags__1__iexact='Django')
<QuerySet [<Post: First post>]>
>>> Post.objects.filter(tags__276='javascript')
<QuerySet []>
Note
PostgreSQL uses 1-based indexing for array fields when writing raw SQL.
However these indexes and those used in slices
use 0-based indexing to be consistent with Python.
Slice transforms¶
This class of transforms allow you to take a slice of the array. Any two non-negative integers can be used, separated by a single underscore. The lookups available after the transform do not change. For example:
>>> Post.objects.create(name='First post', tags=['thoughts', 'django'])
>>> Post.objects.create(name='Second post', tags=['thoughts'])
>>> Post.objects.create(name='Third post', tags=['django', 'python', 'thoughts'])
>>> Post.objects.filter(tags__0_1=['thoughts'])
<QuerySet [<Post: First post>, <Post: Second post>]>
>>> Post.objects.filter(tags__0_2__contains=['thoughts'])
<QuerySet [<Post: First post>, <Post: Second post>]>
Note
PostgreSQL uses 1-based indexing for array fields when writing raw SQL.
However these slices and those used in indexes
use 0-based indexing to be consistent with Python.
Multidimensional arrays with indexes and slices
PostgreSQL has some rather esoteric behavior when using indexes and slices on multidimensional arrays. It will always work to use indexes to reach down to the final underlying data, but most other slices behave strangely at the database level and cannot be supported in a logical, consistent fashion by Django.
HStoreField
¶
-
class
HStoreField
(**options)[source]¶ A field for storing mappings of strings to strings. The Python data type used is a
dict
.To use this field, you’ll need to:
- Add
'django.contrib.postgres'
in yourINSTALLED_APPS
. - Setup the hstore extension in PostgreSQL.
You’ll see an error like
can't adapt type 'dict'
if you skip the first step, ortype "hstore" does not exist
if you skip the second.- Add
Note
On occasions it may be useful to require or restrict the keys which are
valid for a given field. This can be done using the
KeysValidator
.
Querying HStoreField
¶
In addition to the ability to query by key, there are a number of custom
lookups available for HStoreField
.
We will use the following example model:
from django.contrib.postgres.fields import HStoreField
from django.db import models
class Dog(models.Model):
name = models.CharField(max_length=200)
data = HStoreField()
def __str__(self): # __unicode__ on Python 2
return self.name
Key lookups¶
To query based on a given key, you simply use that key as the lookup name:
>>> Dog.objects.create(name='Rufus', data={'breed': 'labrador'})
>>> Dog.objects.create(name='Meg', data={'breed': 'collie'})
>>> Dog.objects.filter(data__breed='collie')
<QuerySet [<Dog: Meg>]>
You can chain other lookups after key lookups:
>>> Dog.objects.filter(data__breed__contains='l')
<QuerySet [<Dog: Rufus>, <Dog: Meg>]>
If the key you wish to query by clashes with the name of another lookup, you
need to use the hstorefield.contains
lookup instead.
Warning
Since any string could be a key in a hstore value, any lookup other than those listed below will be interpreted as a key lookup. No errors are raised. Be extra careful for typing mistakes, and always check your queries work as you intend.
contains
¶
The contains
lookup is overridden on
HStoreField
. The returned objects are
those where the given dict
of key-value pairs are all contained in the
field. It uses the SQL operator @>
. For example:
>>> Dog.objects.create(name='Rufus', data={'breed': 'labrador', 'owner': 'Bob'})
>>> Dog.objects.create(name='Meg', data={'breed': 'collie', 'owner': 'Bob'})
>>> Dog.objects.create(name='Fred', data={})
>>> Dog.objects.filter(data__contains={'owner': 'Bob'})
<QuerySet [<Dog: Rufus>, <Dog: Meg>]>
>>> Dog.objects.filter(data__contains={'breed': 'collie'})
<QuerySet [<Dog: Meg>]>
contained_by
¶
This is the inverse of the contains
lookup -
the objects returned will be those where the key-value pairs on the object are
a subset of those in the value passed. It uses the SQL operator <@
. For
example:
>>> Dog.objects.create(name='Rufus', data={'breed': 'labrador', 'owner': 'Bob'})
>>> Dog.objects.create(name='Meg', data={'breed': 'collie', 'owner': 'Bob'})
>>> Dog.objects.create(name='Fred', data={})
>>> Dog.objects.filter(data__contained_by={'breed': 'collie', 'owner': 'Bob'})
<QuerySet [<Dog: Meg>, <Dog: Fred>]>
>>> Dog.objects.filter(data__contained_by={'breed': 'collie'})
<QuerySet [<Dog: Fred>]>
has_key
¶
Returns objects where the given key is in the data. Uses the SQL operator
?
. For example:
>>> Dog.objects.create(name='Rufus', data={'breed': 'labrador'})
>>> Dog.objects.create(name='Meg', data={'breed': 'collie', 'owner': 'Bob'})
>>> Dog.objects.filter(data__has_key='owner')
<QuerySet [<Dog: Meg>]>
has_any_keys
¶
Returns objects where any of the given keys are in the data. Uses the SQL
operator ?|
. For example:
>>> Dog.objects.create(name='Rufus', data={'breed': 'labrador'})
>>> Dog.objects.create(name='Meg', data={'owner': 'Bob'})
>>> Dog.objects.create(name='Fred', data={})
>>> Dog.objects.filter(data__has_any_keys=['owner', 'breed'])
<QuerySet [<Dog: Rufus>, <Dog: Meg>]>
has_keys
¶
Returns objects where all of the given keys are in the data. Uses the SQL operator
?&
. For example:
>>> Dog.objects.create(name='Rufus', data={})
>>> Dog.objects.create(name='Meg', data={'breed': 'collie', 'owner': 'Bob'})
>>> Dog.objects.filter(data__has_keys=['breed', 'owner'])
<QuerySet [<Dog: Meg>]>
keys
¶
Returns objects where the array of keys is the given value. Note that the order
is not guaranteed to be reliable, so this transform is mainly useful for using
in conjunction with lookups on
ArrayField
. Uses the SQL function
akeys()
. For example:
>>> Dog.objects.create(name='Rufus', data={'toy': 'bone'})
>>> Dog.objects.create(name='Meg', data={'breed': 'collie', 'owner': 'Bob'})
>>> Dog.objects.filter(data__keys__overlap=['breed', 'toy'])
<QuerySet [<Dog: Rufus>, <Dog: Meg>]>
values
¶
Returns objects where the array of values is the given value. Note that the
order is not guaranteed to be reliable, so this transform is mainly useful for
using in conjunction with lookups on
ArrayField
. Uses the SQL function
avalues()
. For example:
>>> Dog.objects.create(name='Rufus', data={'breed': 'labrador'})
>>> Dog.objects.create(name='Meg', data={'breed': 'collie', 'owner': 'Bob'})
>>> Dog.objects.filter(data__values__contains=['collie'])
<QuerySet [<Dog: Meg>]>
JSONField
¶
-
class
JSONField
(**options)[source]¶ A field for storing JSON encoded data. In Python the data is represented in its Python native format: dictionaries, lists, strings, numbers, booleans and
None
.If you want to store other data types, you’ll need to serialize them first. For example, you might cast a
datetime
to a string. You might also want to convert the string back to adatetime
when you retrieve the data from the database. There are some third-partyJSONField
implementations which do this sort of thing automatically.If you give the field a
default
, ensure it’s a callable such asdict
(for an empty default) or a callable that returns a dict (such as a function). Incorrectly usingdefault={}
creates a mutable default that is shared between all instances ofJSONField
.
Note
PostgreSQL has two native JSON based data types: json
and jsonb
.
The main difference between them is how they are stored and how they can be
queried. PostgreSQL’s json
field is stored as the original string
representation of the JSON and must be decoded on the fly when queried
based on keys. The jsonb
field is stored based on the actual structure
of the JSON which allows indexing. The trade-off is a small additional cost
on writing to the jsonb
field. JSONField
uses jsonb
.
As a result, this field requires PostgreSQL ≥ 9.4 and Psycopg2 ≥ 2.5.4.
Querying JSONField
¶
We will use the following example model:
from django.contrib.postgres.fields import JSONField
from django.db import models
class Dog(models.Model):
name = models.CharField(max_length=200)
data = JSONField()
def __str__(self): # __unicode__ on Python 2
return self.name
Key, index, and path lookups¶
To query based on a given dictionary key, simply use that key as the lookup name:
>>> Dog.objects.create(name='Rufus', data={
... 'breed': 'labrador',
... 'owner': {
... 'name': 'Bob',
... 'other_pets': [{
... 'name': 'Fishy',
... }],
... },
... })
>>> Dog.objects.create(name='Meg', data={'breed': 'collie'})
>>> Dog.objects.filter(data__breed='collie')
<QuerySet [<Dog: Meg>]>
Multiple keys can be chained together to form a path lookup:
>>> Dog.objects.filter(data__owner__name='Bob')
<QuerySet [<Dog: Rufus>]>
If the key is an integer, it will be interpreted as an index lookup in an array:
>>> Dog.objects.filter(data__owner__other_pets__0__name='Fishy')
<QuerySet [<Dog: Rufus>]>
If the key you wish to query by clashes with the name of another lookup, use
the jsonfield.contains
lookup instead.
If only one key or index is used, the SQL operator ->
is used. If multiple
operators are used then the #>
operator is used.
Warning
Since any string could be a key in a JSON object, any lookup other than those listed below will be interpreted as a key lookup. No errors are raised. Be extra careful for typing mistakes, and always check your queries work as you intend.
Containment and key operations¶
JSONField
shares lookups relating to
containment and keys with HStoreField
.
contains
(accepts any JSON rather than just a dictionary of strings)contained_by
(accepts any JSON rather than just a dictionary of strings)has_key
has_any_keys
has_keys
Range Fields¶
There are five range field types, corresponding to the built-in range types in PostgreSQL. These fields are used to store a range of values; for example the start and end timestamps of an event, or the range of ages an activity is suitable for.
All of the range fields translate to psycopg2 Range objects in python, but also accept tuples as input if no bounds
information is necessary. The default is lower bound included, upper bound
excluded; that is, [)
.
IntegerRangeField
¶
-
class
IntegerRangeField
(**options)[source]¶ Stores a range of integers. Based on an
IntegerField
. Represented by anint4range
in the database and aNumericRange
in Python.Regardless of the bounds specified when saving the data, PostgreSQL always returns a range in a canonical form that includes the lower bound and excludes the upper bound; that is
[)
.
BigIntegerRangeField
¶
-
class
BigIntegerRangeField
(**options)[source]¶ Stores a range of large integers. Based on a
BigIntegerField
. Represented by anint8range
in the database and aNumericRange
in Python.Regardless of the bounds specified when saving the data, PostgreSQL always returns a range in a canonical form that includes the lower bound and excludes the upper bound; that is
[)
.
FloatRangeField
¶
-
class
FloatRangeField
(**options)[source]¶ Stores a range of floating point values. Based on a
FloatField
. Represented by anumrange
in the database and aNumericRange
in Python.
DateTimeRangeField
¶
-
class
DateTimeRangeField
(**options)[source]¶ Stores a range of timestamps. Based on a
DateTimeField
. Represented by atztsrange
in the database and aDateTimeTZRange
in Python.
DateRangeField
¶
-
class
DateRangeField
(**options)[source]¶ Stores a range of dates. Based on a
DateField
. Represented by adaterange
in the database and aDateRange
in Python.Regardless of the bounds specified when saving the data, PostgreSQL always returns a range in a canonical form that includes the lower bound and excludes the upper bound; that is
[)
.
Querying Range Fields¶
There are a number of custom lookups and transforms for range fields. They are available on all the above fields, but we will use the following example model:
from django.contrib.postgres.fields import IntegerRangeField
from django.db import models
class Event(models.Model):
name = models.CharField(max_length=200)
ages = IntegerRangeField()
start = models.DateTimeField()
def __str__(self): # __unicode__ on Python 2
return self.name
We will also use the following example objects:
>>> import datetime
>>> from django.utils import timezone
>>> now = timezone.now()
>>> Event.objects.create(name='Soft play', ages=(0, 10), start=now)
>>> Event.objects.create(name='Pub trip', ages=(21, None), start=now - datetime.timedelta(days=1))
and NumericRange
:
>>> from psycopg2.extras import NumericRange
Containment functions¶
As with other PostgreSQL fields, there are three standard containment
operators: contains
, contained_by
and overlap
, using the SQL
operators @>
, <@
, and &&
respectively.
contains
¶
>>> Event.objects.filter(ages__contains=NumericRange(4, 5))
<QuerySet [<Event: Soft play>]>
contained_by
¶
>>> Event.objects.filter(ages__contained_by=NumericRange(0, 15))
<QuerySet [<Event: Soft play>]>
The contained_by
lookup is also available on the non-range field types:
IntegerField
,
BigIntegerField
,
FloatField
, DateField
,
and DateTimeField
. For example:
>>> from psycopg2.extras import DateTimeTZRange
>>> Event.objects.filter(start__contained_by=DateTimeTZRange(
... timezone.now() - datetime.timedelta(hours=1),
... timezone.now() + datetime.timedelta(hours=1),
... )
<QuerySet [<Event: Soft play>]>
overlap
¶
>>> Event.objects.filter(ages__overlap=NumericRange(8, 12))
<QuerySet [<Event: Soft play>]>
Comparison functions¶
Range fields support the standard lookups: lt
, gt
,
lte
and gte
. These are not particularly helpful - they
compare the lower bounds first and then the upper bounds only if necessary.
This is also the strategy used to order by a range field. It is better to use
the specific range comparison operators.
fully_lt
¶
The returned ranges are strictly less than the passed range. In other words, all the points in the returned range are less than all those in the passed range.
>>> Event.objects.filter(ages__fully_lt=NumericRange(11, 15))
<QuerySet [<Event: Soft play>]>
fully_gt
¶
The returned ranges are strictly greater than the passed range. In other words, the all the points in the returned range are greater than all those in the passed range.
>>> Event.objects.filter(ages__fully_gt=NumericRange(11, 15))
<QuerySet [<Event: Pub trip>]>
not_lt
¶
The returned ranges do not contain any points less than the passed range, that is the lower bound of the returned range is at least the lower bound of the passed range.
>>> Event.objects.filter(ages__not_lt=NumericRange(0, 15))
<QuerySet [<Event: Soft play>, <Event: Pub trip>]>
not_gt
¶
The returned ranges do not contain any points greater than the passed range, that is the upper bound of the returned range is at most the upper bound of the passed range.
>>> Event.objects.filter(ages__not_gt=NumericRange(3, 10))
<QuerySet [<Event: Soft play>]>
adjacent_to
¶
The returned ranges share a bound with the passed range.
>>> Event.objects.filter(ages__adjacent_to=NumericRange(10, 21))
<QuerySet [<Event: Soft play>, <Event: Pub trip>]>
Querying using the bounds¶
There are three transforms available for use in queries. You can extract the lower or upper bound, or query based on emptiness.
startswith
¶
Returned objects have the given lower bound. Can be chained to valid lookups for the base field.
>>> Event.objects.filter(ages__startswith=21)
<QuerySet [<Event: Pub trip>]>
endswith
¶
Returned objects have the given upper bound. Can be chained to valid lookups for the base field.
>>> Event.objects.filter(ages__endswith=10)
<QuerySet [<Event: Soft play>]>
isempty
¶
Returned objects are empty ranges. Can be chained to valid lookups for a
BooleanField
.
>>> Event.objects.filter(ages__isempty=True)
<QuerySet []>
Defining your own range types¶
PostgreSQL allows the definition of custom range types. Django’s model and form
field implementations use base classes below, and psycopg2 provides a
register_range()
to allow use of custom range
types.