Porting to Python 3¶
Django 1.5 is the first version of Django to support Python 3. The same code runs both on Python 2 (≥ 2.6.5) and Python 3 (≥ 3.2), thanks to the six compatibility layer.
This document is primarily targeted at authors of pluggable applications who want to support both Python 2 and 3. It also describes guidelines that apply to Django’s code.
This document assumes that you are familiar with the changes between Python 2 and Python 3. If you aren’t, read Python’s official porting guide first. Refreshing your knowledge of unicode handling on Python 2 and 3 will help; the Pragmatic Unicode presentation is a good resource.
Django uses the Python 2/3 Compatible Source strategy. Of course, you’re free to choose another strategy for your own code, especially if you don’t need to stay compatible with Python 2. But authors of pluggable applications are encouraged to use the same porting strategy as Django itself.
Writing compatible code is much easier if you target Python ≥ 2.6. Django 1.5 introduces compatibility tools such as django.utils.six, which is a customized version of the six module. For convenience, forwards-compatible aliases were introduced in Django 1.4.2. If your application takes advantage of these tools, it will require Django ≥ 1.4.2.
Obviously, writing compatible source code adds some overhead, and that can cause frustration. Django’s developers have found that attempting to write Python 3 code that’s compatible with Python 2 is much more rewarding than the opposite. Not only does that make your code more future-proof, but Python 3’s advantages (like the saner string handling) start shining quickly. Dealing with Python 2 becomes a backwards compatibility requirement, and we as developers are used to dealing with such constraints.
Porting tools provided by Django are inspired by this philosophy, and it’s reflected throughout this guide.
Switching to unicode literals consists of:
- Adding from __future__ import unicode_literals at the top of your Python modules – it’s best to put it in each and every module, otherwise you’ll keep checking the top of your files to see which mode is in effect;
- Removing the u prefix before unicode strings;
- Adding a b prefix before bytestrings.
Performing these changes systematically guarantees backwards compatibility.
However, Django applications generally don’t need bytestrings, since Django only exposes unicode interfaces to the programmer. Python 3 discourages using bytestrings, except for binary data or byte-oriented interfaces. Python 2 makes bytestrings and unicode strings effectively interchangeable, as long as they only contain ASCII data. Take advantage of this to use unicode strings wherever possible and avoid the b prefixes.
Python 2’s u prefix is a syntax error in Python 3.2 but it will be allowed again in Python 3.3 thanks to PEP 414. Thus, this transformation is optional if you target Python ≥ 3.3. It’s still recommended, per the “write Python 3 code” philosophy.
Django also contains several string-related classes and functions in the django.utils.encoding and django.utils.safestring modules. Their names used the words str, which doesn’t mean the same thing in Python 2 and Python 3, and unicode, which doesn’t exist in Python 3. To avoid ambiguity and confusion, these concepts were renamed bytes and text.
Here are the name changes in django.utils.encoding:
|Old name|New name|
|---|---|
|smart_str|smart_bytes|
|smart_unicode|smart_text|
|force_unicode|force_text|
For backwards compatibility, the old names still work on Python 2. Under Python 3, smart_str is an alias for smart_text.
For forwards compatibility, the new names work as of Django 1.4.2.
django.utils.encoding was deeply refactored in Django 1.5 to provide a more consistent API. Check its documentation for more information.
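Here are the name changes in django.utils.safestring:
|Old name|New name|
|---|---|
|EscapeString|EscapeBytes|
|EscapeUnicode|EscapeText|
|SafeString|SafeBytes|
|SafeUnicode|SafeText|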
For backwards compatibility, the old names still work on Python 2. Under Python 3, EscapeString and SafeString are aliases for EscapeText and SafeText respectively.
For forwards compatibility, the new names work as of Django 1.4.2.
__str__() and __unicode__() methods¶
In Python 2, the print statement and the str() built-in call __str__() to determine the human-readable representation of an object. The unicode() built-in calls __unicode__() if it exists, and otherwise falls back to __str__() and decodes the result with the system encoding. Conversely, the Model base class automatically derives __str__() from __unicode__() by encoding to UTF-8.
In Python 3, there’s simply __str__(), which must return str (text).
(It is also possible to define __bytes__(), but Django applications have little use for that method, because they hardly ever deal with bytes.)
Django provides a simple way to define __str__() and __unicode__() methods that work on Python 2 and 3: define a __str__() method returning text and apply the python_2_unicode_compatible() decorator.
    from __future__ import unicode_literals
    from django.utils.encoding import python_2_unicode_compatible

    @python_2_unicode_compatible
    class MyClass(object):
        def __str__(self):
            return "Instance of my class"
This technique is the best match for Django’s porting philosophy.
For forwards compatibility, this decorator is available as of Django 1.4.2.
Finally, note that __repr__() must return a str on all versions of Python.
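To make the effect of such a decorator concrete, here is a simplified sketch of the idea. The function name `py2_unicode_compatible_sketch` is hypothetical and the code is an illustrative stand-in, not Django’s actual implementation: on Python 2 it moves the text-returning __str__() to __unicode__() and installs a __str__() that encodes to UTF-8; on Python 3 it leaves the class untouched.

```python
from __future__ import unicode_literals

import sys


def py2_unicode_compatible_sketch(klass):
    """Hypothetical stand-in for python_2_unicode_compatible()."""
    if sys.version_info[0] == 2:
        # Python 2: the text-returning method becomes __unicode__() ...
        klass.__unicode__ = klass.__str__
        # ... and __str__() returns the UTF-8 encoded bytes.
        klass.__str__ = lambda self: self.__unicode__().encode("utf-8")
    return klass


@py2_unicode_compatible_sketch
class MyClass(object):
    def __str__(self):
        return "Instance of my class"


text = str(MyClass())
```

In real code, use the decorator shipped with Django rather than a hand-written copy; the sketch only shows why a single __str__() definition is enough.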
In Python 3, dict.keys(), dict.items(), and dict.values() return views instead of the lists they return in Python 2. six provides compatibility functions to work around this change: iterkeys(), iteritems(), and itervalues(). It also contains an undocumented iterlists function that works well for django.utils.datastructures.MultiValueDict and its subclasses.
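As a rough illustration of how a helper like iteritems() can hide the difference between the two Python versions, consider the sketch below. The name `iteritems_sketch` and the sample dictionary are assumptions made for this example; it is a simplified stand-in, not the code of six itself.

```python
import sys

PY3 = sys.version_info[0] == 3


def iteritems_sketch(d):
    """Hypothetical stand-in for six.iteritems()."""
    # Python 3: items() is already lazy; Python 2: use iteritems().
    return iter(d.items()) if PY3 else d.iteritems()


prices = {"apple": 3, "pear": 5}
# Iterate lazily over (key, value) pairs on both Python versions.
total = sum(price for _name, price in iteritems_sketch(prices))
```

Calling code never has to check the interpreter version; only the helper does.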
HttpRequest and HttpResponse objects¶
According to PEP 3333:
- headers are always str objects,
- input and output streams are always bytes objects.
Specifically, HttpResponse.content contains bytes, which may become an issue if you compare it with a str in your tests. The preferred solution is to rely on assertContains() and assertNotContains(). These methods accept a response and a unicode string as arguments.
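The pitfall can be reproduced without Django. In the sketch below, `content` is a hypothetical stand-in for HttpResponse.content (the UTF-8 encoding is an assumption of the example); it shows why a direct comparison with a str fails on Python 3 and how decoding first avoids the problem when assertContains() isn’t an option.

```python
from __future__ import unicode_literals

# Hypothetical stand-in for HttpResponse.content: UTF-8 encoded bytes.
content = "Hello, world".encode("utf-8")

# On Python 3 this is False: bytes never compare equal to text.
# (On Python 2 the ASCII-only comparison succeeds.)
equal_as_is = content == "Hello, world"

# Decoding explicitly gives a text-to-text comparison on both versions.
equal_after_decode = content.decode("utf-8") == "Hello, world"
```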
The following guidelines are enforced in Django’s source code. They’re also recommended for third-party applications that follow the same porting strategy.
In Python 3, all strings are considered Unicode by default. The unicode type from Python 2 is called str in Python 3, and str becomes bytes.
You mustn’t use the u prefix before a unicode string literal because it’s a syntax error in Python 3.2. You must prefix byte strings with b.
In order to enable the same behavior in Python 2, every module must import unicode_literals from __future__:
    from __future__ import unicode_literals

    my_string = "This is a unicode literal"
    my_bytestring = b"This is a bytestring"
If you need a byte string literal under Python 2 and a unicode string literal under Python 3, wrap the literal in the str() builtin, for example str('my string').
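A minimal sketch of this native-string idiom follows; the variable names are chosen for the example. With unicode_literals in effect, a bare literal is text on both versions, while str() converts it to whatever the interpreter calls str.

```python
from __future__ import unicode_literals

text = "my string"         # unicode on Python 2, str on Python 3
native = str("my string")  # str on both: bytes on Python 2, text on Python 3

# The result is always the interpreter's native str type.
is_native = type(native) is str
```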
In Python 3, there aren’t any automatic conversions between str and bytes, and the codecs module became more strict. str.encode() always returns bytes, and bytes.decode() always returns str. As a consequence, the following pattern is sometimes necessary:
value = value.encode('ascii', 'ignore').decode('ascii')
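For instance, this round trip can be used to strip characters that ASCII cannot represent while still ending up with text. The sample value is an assumption made for illustration:

```python
from __future__ import unicode_literals

value = "caf\u00e9 menu"  # contains one non-ASCII character

# encode() yields bytes and silently drops the non-ASCII character;
# decode() turns the remaining bytes back into text.
value = value.encode("ascii", "ignore").decode("ascii")
```

After this runs, value is the text string "caf menu" on both Python 2 and Python 3.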
Be cautious if you have to index bytestrings.
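The reason for caution is that indexing a bytestring yields an integer on Python 3 but a one-character string on Python 2. A small sketch, with illustrative variable names, shows that slicing behaves the same on both versions:

```python
data = b"abc"

first_item = data[0]    # 97 on Python 3, but b"a" on Python 2
first_byte = data[0:1]  # b"a" on both versions
```

Slicing a range of length one is therefore a safer way to extract a single byte in compatible code.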
Use the patterns below to handle magic methods renamed in Python 3.
    from django.utils import six

    class MyIterator(six.Iterator):
        def __iter__(self):
            return self  # implement some logic here

        def __next__(self):
            raise StopIteration  # implement some logic here

    class MyBoolean(object):
        def __bool__(self):
            return True  # implement some logic here

        def __nonzero__(self):  # Python 2 compatibility
            return type(self).__bool__(self)

    class MyDivisible(object):
        def __truediv__(self, other):
            return NotImplemented  # placeholder: implement some logic here

        def __div__(self, other):  # Python 2 compatibility
            return type(self).__truediv__(self, other)

        def __itruediv__(self, other):
            return NotImplemented  # placeholder: implement some logic here

        def __idiv__(self, other):  # Python 2 compatibility
            return type(self).__itruediv__(self, other)
Writing compatible code with six¶
six is the canonical compatibility library for supporting Python 2 and 3 in a single codebase. Read its documentation!
A customized version of six is bundled with Django as of version 1.4.2. You can import it as django.utils.six.
Here are the most common changes required to write compatible code.
The basestring and unicode types were removed in Python 3, and the meaning of str changed. To test these types, use the following idioms:
    isinstance(myvalue, six.string_types)  # replacement for basestring
    isinstance(myvalue, six.text_type)     # replacement for unicode
    isinstance(myvalue, bytes)             # replacement for str
Python ≥ 2.6 provides bytes as an alias for str, so you don’t need six.binary_type.
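To show what these idioms resolve to, the sketch below defines local stand-ins for string_types and text_type and uses them to classify values. The definitions and the `describe` function are assumptions written for this example; in real code, import the names from six instead.

```python
import sys

# Simplified stand-ins for six.string_types and six.text_type.
if sys.version_info[0] == 3:
    string_types = (str,)
    text_type = str
else:
    string_types = (basestring,)  # noqa: F821 (exists on Python 2 only)
    text_type = unicode  # noqa: F821 (exists on Python 2 only)


def describe(value):
    """Classify a value as text, bytes, or something else."""
    if isinstance(value, text_type):
        return "text"
    if isinstance(value, bytes):
        return "bytes"
    return "other"


# Decoding a bytes literal gives text on both versions, without a u prefix.
samples = (b"raw".decode("ascii"), b"raw", 42)
kinds = [describe(sample) for sample in samples]
```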
The long type no longer exists in Python 3. 1L is a syntax error. Use six.integer_types to check whether a value is an integer or a long:
isinstance(myvalue, six.integer_types) # replacement for (int, long)
Import six.moves.xrange wherever you use xrange.
Some modules were renamed in Python 3. The django.utils.six.moves module (based on the six.moves module) provides a compatible location to import them.
The urllib, urllib2 and urlparse modules were reworked in depth and django.utils.six.moves doesn’t handle them. Django explicitly tries both locations, as follows:
    try:
        from urllib.parse import urlparse, urlunparse
    except ImportError:  # Python 2
        from urlparse import urlparse, urlunparse
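Once the import succeeds from either location, the calling code is identical on both versions. A short usage sketch, with an example URL chosen for illustration:

```python
try:
    from urllib.parse import urlparse, urlunparse
except ImportError:  # Python 2
    from urlparse import urlparse, urlunparse

parts = urlparse("http://example.com/path?q=1")

# The parse result is a named tuple, so _replace() returns a modified copy.
secure_url = urlunparse(parts._replace(scheme="https"))
```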
Django’s customized version of six¶
The version of six bundled with Django (django.utils.six) includes a few extras.
- assertRaisesRegex(testcase, *args, **kwargs)¶
This replaces testcase.assertRaisesRegexp on Python 2, and testcase.assertRaisesRegex on Python 3. assertRaisesRegexp still exists in current Python 3 versions, but issues a warning.
In addition to six’s default moves, Django’s version provides thread as _thread and dummy_thread as _dummy_thread.
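As an illustration of the assertRaisesRegex() helper described above, a wrapper of this kind can be sketched in a few lines. The function `assert_raises_regex_sketch` and the test case are hypothetical and written only for this example; this is not the code bundled with Django.

```python
import unittest


def assert_raises_regex_sketch(testcase, *args, **kwargs):
    """Hypothetical stand-in for the bundled assertRaisesRegex() helper."""
    # Prefer the Python 3 name; fall back to the Python 2 spelling.
    method = getattr(testcase, "assertRaisesRegex", None)
    if method is None:
        method = testcase.assertRaisesRegexp
    return method(*args, **kwargs)


class IntParsingTests(unittest.TestCase):
    def test_rejects_words(self):
        with assert_raises_regex_sketch(self, ValueError, "invalid literal"):
            int("not a number")


# Run the single test programmatically and collect the outcome.
result = unittest.TestResult()
IntParsingTests("test_rejects_words").run(result)
```

Passing the test case as the first argument keeps the helper usable both as a plain call and as a context manager, as in the test above.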