Optimisasi akses basisdata¶
Django’s database layer provides various ways to help developers get the most out of their databases. This document gathers together links to the relevant documentation, and adds various tips, organized under a number of headings that outline the steps to take when attempting to optimize your database usage.
Profil dahulu¶
As general programming practice, this goes without saying. Find out what queries you are doing and what they are costing you. You may also want to use an external project like django-debug-toolbar, or a tool that monitors your database directly.
Remember that you may be optimizing for speed or memory or both, depending on your requirements. Sometimes optimizing for one will be detrimental to the other, but sometimes they will help each other. Also, work that is done by the database process might not have the same cost (to you) as the same amount of work done in your Python process. It is up to you to decide what your priorities are, where the balance must lie, and profile all of these as required since this will depend on your application and server.
Dengan apapun yang mengikuti, ingat untuk menggambarkan raut muka setelah setiap perubahan untuk memastikan bahwa perubahan adalah menguntungkan, dan keuntungan cukup besar diberikan menurunkan dalam kesiapan kode anda. Semua dari saran-saran dibawah datang dengan surat keberatan yang dalam keadaan anda prinsip umum mungkin tidak berlaku, atau mungkin bahkan dibalikkan.
Gunakan teknik-teknik optimalisasi DB standar¶
...termasuk:
- Indexes. This is a number one priority, after you have determined from
profiling what indexes should be added. Use
Field.db_index
orMeta.index_together
to add these from Django. Consider adding indexes to fields that you frequently query usingfilter()
,exclude()
,order_by()
, etc. as indexes may help to speed up lookups. Note that determining the best indexes is a complex database-dependent topic that will depend on your particular application. The overhead of maintaining an index may outweigh any gains in query speed.
Penggunaan sesuai dari jenis-jenis bidang
We will assume you have done the obvious things above. The rest of this document focuses on how to use Django in such a way that you are not doing unnecessary work. This document also does not address other optimization techniques that apply to all expensive operations, such as general purpose caching.
Memahami QuerySet
¶
Memahami QuerySets adalah vital untuk mendapatkan penampilan baik dengan kode sederhana. Khususnya:
Memahami penilaian QuerySet
¶
Untuk menghindari masalah penampilan, sangatlah penting memahami:
itu QuerySets malas.
ketika mereka dinilai.
bagaimana data ditahan dalam memori.
Memahami atribut tembolok¶
As well as caching of the whole QuerySet
, there is caching of the result of
attributes on ORM objects. In general, attributes that are not callable will be
cached. For example, assuming the example Weblog models:
>>> entry = Entry.objects.get(id=1)
>>> entry.blog # Blog object is retrieved at this point
>>> entry.blog # cached version, no DB access
Tetapi secara umum, atribut callable menyebabkan pencarian DB setiap waktu:
>>> entry = Entry.objects.get(id=1)
>>> entry.authors.all() # query performed
>>> entry.authors.all() # query performed again
Hati-hati ketika membaca kode cetakan - sistem cetakan tidak mengizinkan penggunaan tanda kurun, tetapi akan memanggil callable secara otomatis, memnyembunyikan perbedaan diatas.
Be careful with your own custom properties - it is up to you to implement
caching when required, for example using the
cached_property
decorator.
Gunakan etiket cetakan with
¶
Untuk menggunakan perilaku cache dari QuerySet
, anda mungkin butuh menggunakan etiket cetakan with
.
Gunakan iterator()
¶
Ketika anda mempunyai banyak obyek, perilaku cache dari QuerySet
dapat menyebabkan sejumlah besar memori digunakan. Dalam kasus ini, iterator()
mungkin membantu.
Lakukan pekerjaan basisdata di basisdata daripada di Python¶
Sebagai contoh:
Pada paling tingkat dasar, gunakan filter and exclude melakukan penyaringan di basisdata
Gunakan
F expressions
untukmenyaring berdasarkan pada bidang-bidang lain dalam model sama.
Jika ini tidak cukup membangkitkan SQL yang anda butuhkan:
Gunakan RawSQL
¶
A less portable but more powerful method is the
RawSQL
expression, which allows some SQL
to be explicitly added to the query. If that still isn’t powerful enough:
Gunakan SQL mentah¶
Write your own custom SQL to retrieve data or populate models. Use django.db.connection.queries
to find out what Django
is writing for you and start from there.
Mengambil obyek tersendiri menggukan sebuah unik, kolom indeks¶
There are two reasons to use a column with
unique
or
db_index
when using
get()
to retrieve individual objects.
First, the query will be quicker because of the underlying database index.
Also, the query could run much slower if multiple objects match the lookup;
having a unique constraint on the column guarantees this will never happen.
Jadi menggunakan example Weblog models:
>>> entry = Entry.objects.get(id=10)
akan lebih cepat daripada:
>>> entry = Entry.objects.get(headline="News Item Title")
karena id
diindeks oleh basisdata dan dijamin menjadi unik.
Melakukan berikut berpotensi sangat lambat:
>>> entry = Entry.objects.get(headline__startswith="News")
First of all, headline
is not indexed, which will make the underlying
database fetch slower.
Second, the lookup doesn’t guarantee that only one object will be returned. If the query matches more than one object, it will retrieve and transfer all of them from the database. This penalty could be substantial if hundreds or thousands of records are returned. The penalty will be compounded if the database lives on a separate server, where network overhead and latency also play a factor.
Mengambil semuanya sekaligus jika anda mengetahui anda akan membutuhkannya¶
Hitting the database multiple times for different parts of a single ‘set’ of data that you will need all parts of is, in general, less efficient than retrieving it all in one query. This is particularly important if you have a query that is executed in a loop, and could therefore end up doing many database queries, when only one was needed. So:
Jangan mengambil hal yang tidak anda butuhkan¶
Gunakan QuerySet.values()
dan values_list()
¶
When you just want a dict
or list
of values, and don’t need ORM model
objects, make appropriate usage of
values()
.
These can be useful for replacing model objects in template code - as long as
the dicts you supply have the same attributes as those used in the template,
you are fine.
Gunakan QuerySet.defer()
dan only()
¶
Use defer()
and
only()
if there are database columns
you know that you won’t need (or won’t need in most cases) to avoid loading
them. Note that if you do use them, the ORM will have to go and get them in
a separate query, making this a pessimization if you use it inappropriately.
Also, be aware that there is some (small extra) overhead incurred inside
Django when constructing a model with deferred fields. Don’t be too aggressive
in deferring fields without profiling as the database has to read most of the
non-text, non-VARCHAR data from the disk for a single row in the results, even
if it ends up only using a few columns. The defer()
and only()
methods
are most useful when you can avoid loading a lot of text data or for fields
that might take a lot of processing to convert back to Python. As always,
profile first, then optimize.
Gunakan QuerySet.count()
¶
..jika anda hanya ingin menghitung, daripada melakukan len(queryset)
.
Gunakan QuerySet.exists()
¶
...if you only want to find out if at least one result exists, rather than if
queryset
.
Tetapi:
Jangan berlebihan count()
dan exists()
¶
If you are going to need other data from the QuerySet, just evaluate it.
For example, assuming an Email model that has a body
attribute and a
many-to-many relation to User, the following template code is optimal:
{% if display_inbox %}
{% with emails=user.emails.all %}
{% if emails %}
<p>You have {{ emails|length }} email(s)</p>
{% for email in emails %}
<p>{{ email.body }}</p>
{% endfor %}
{% else %}
<p>No messages today.</p>
{% endif %}
{% endwith %}
{% endif %}
DIa optimal karena:
- Since QuerySets are lazy, this does no database queries if ‘display_inbox’ is False.
- Use of
with
means that we storeuser.emails.all
in a variable for later use, allowing its cache to be re-used. - The line
{% if emails %}
causesQuerySet.__bool__()
to be called, which causes theuser.emails.all()
query to be run on the database, and at the least the first line to be turned into an ORM object. If there aren’t any results, it will return False, otherwise True. - The use of
{{ emails|length }}
callsQuerySet.__len__()
, filling out the rest of the cache without doing another query. - The
for
loop iterates over the already filled cache.
In total, this code does either one or zero database queries. The only
deliberate optimization performed is the use of the with
tag. Using
QuerySet.exists()
or QuerySet.count()
at any point would cause
additional queries.
Gunakan QuerySet.update()
dan delete()
¶
Rather than retrieve a load of objects, set some values, and save them individual, use a bulk SQL UPDATE statement, via QuerySet.update(). Similarly, do bulk deletes where possible.
Note, however, that these bulk update methods cannot call the save()
or
delete()
methods of individual instances, which means that any custom
behavior you have added for these methods will not be executed, including
anything driven from the normal database object signals.
Menggunakan nilai foreign key secara langsung¶
If you only need a foreign key value, use the foreign key value that is already on the object you’ve got, rather than getting the whole related object and taking its primary key. i.e. do:
entry.blog_id
dari pada:
entry.blog.id
Jangan urutkan hasil jika anda tidak peduli¶
Pengurutan tidak bebas; setiap bidang yang diurutkan adalah sebuah operai basisdata harus dilakukan. Jika sebuah model awal mengurutkan (Meta.ordering
) dan anda tidak membutuhkannya, pindahkan dia pada QuerySet
dengan memanggil order_by()
dengan tidak ada parameter.
Menambahkan indeks ke basisdata anda mungkin membantu meningkatkan penampilan pengurutan.
Insert in bulk¶
When creating objects, where possible, use the
bulk_create()
method to reduce the
number of SQL queries. For example:
Entry.objects.bulk_create([
Entry(headline='This is a test'),
Entry(headline='This is only a test'),
])
...is preferable to:
Entry.objects.create(headline='This is a test')
Entry.objects.create(headline='This is only a test')
Catat bahwa ada angka dari caveats to this method
, jadi pastikan dia sesuai untuk kasus digunakan anda.
Ini juga berlaku pada ManyToManyFields
, demikian:
my_band.members.add(me, my_friend)
...is preferable to:
my_band.members.add(me)
my_band.members.add(my_friend)
...where Bands
and Artists
have a many-to-many relationship.