Porting to Python 3¶
Django 1.5 is the first version of Django to support Python 3. The same code runs both on Python 2 (≥ 2.6.5) and Python 3 (≥ 3.2), thanks to the six compatibility layer.
This document is primarily targeted at authors of pluggable applications who want to support both Python 2 and 3. It also describes guidelines that apply to Django’s code.
Philosophy¶
This document assumes that you are familiar with the changes between Python 2 and Python 3. If you aren’t, read Python’s official porting guide first. Refreshing your knowledge of unicode handling on Python 2 and 3 will help; the Pragmatic Unicode presentation is a good resource.
Django uses the Python 2/3 Compatible Source strategy. Of course, you’re free to chose another strategy for your own code, especially if you don’t need to stay compatible with Python 2. But authors of pluggable applications are encouraged to use the same porting strategy as Django itself.
Writing compatible code is much easier if you target Python ≥ 2.6. Django 1.5
introduces compatibility tools such as django.utils.six
, which is a
customized version of the six module
. For convenience,
forwards-compatible aliases were introduced in Django 1.4.2. If your
application takes advantage of these tools, it will require Django ≥ 1.4.2.
Obviously, writing compatible source code adds some overhead, and that can cause frustration. Django’s developers have found that attempting to write Python 3 code that’s compatible with Python 2 is much more rewarding than the opposite. Not only does that make your code more future-proof, but Python 3’s advantages (like the saner string handling) start shining quickly. Dealing with Python 2 becomes a backwards compatibility requirement, and we as developers are used to dealing with such constraints.
Porting tools provided by Django are inspired by this philosophy, and it’s reflected throughout this guide.
Porting tips¶
Unicode literals¶
This step consists in:
- Adding
from __future__ import unicode_literals
at the top of your Python modules – it’s best to put it in each and every module, otherwise you’ll keep checking the top of your files to see which mode is in effect; - Removing the
u
prefix before unicode strings; - Adding a
b
prefix before bytestrings.
Performing these changes systematically guarantees backwards compatibility.
However, Django applications generally don’t need bytestrings, since Django
only exposes unicode interfaces to the programmer. Python 3 discourages using
bytestrings, except for binary data or byte-oriented interfaces. Python 2
makes bytestrings and unicode strings effectively interchangeable, as long as
they only contain ASCII data. Take advantage of this to use unicode strings
wherever possible and avoid the b
prefixes.
Note
Python 2’s u
prefix is a syntax error in Python 3.2 but it will be
allowed again in Python 3.3 thanks to PEP 414. Thus, this
transformation is optional if you target Python ≥ 3.3. It’s still
recommended, per the “write Python 3 code” philosophy.
String handling¶
Python 2’s unicode type was renamed str
in Python 3,
str()
was renamed bytes
, and basestring disappeared.
six provides tools to deal with these
changes.
Django also contains several string related classes and functions in the
django.utils.encoding
and django.utils.safestring
modules. Their
names used the words str
, which doesn’t mean the same thing in Python 2
and Python 3, and unicode
, which doesn’t exist in Python 3. In order to
avoid ambiguity and confusion these concepts were renamed bytes
and
text
.
Here are the name changes in django.utils.encoding
:
Old name | New name |
---|---|
smart_str |
smart_bytes |
smart_unicode |
smart_text |
force_unicode |
force_text |
For backwards compatibility, the old names still work on Python 2. Under
Python 3, smart_str
is an alias for smart_text
.
For forwards compatibility, the new names work as of Django 1.4.2.
Note
django.utils.encoding
was deeply refactored in Django 1.5 to
provide a more consistent API. Check its documentation for more
information.
django.utils.safestring
is mostly used via the
mark_safe()
and
mark_for_escaping()
functions, which didn’t
change. In case you’re using the internals, here are the name changes:
Old name | New name |
---|---|
EscapeString |
EscapeBytes |
EscapeUnicode |
EscapeText |
SafeString |
SafeBytes |
SafeUnicode |
SafeText |
For backwards compatibility, the old names still work on Python 2. Under
Python 3, EscapeString
and SafeString
are aliases for EscapeText
and SafeText
respectively.
For forwards compatibility, the new names work as of Django 1.4.2.
__str__()
and ` __unicode__()`_ methods¶
In Python 2, the object model specifies __str__()
and
` __unicode__()`_ methods. If these methods exist, they must return
str
(bytes) and unicode
(text) respectively.
The print
statement and the str
built-in call
__str__()
to determine the human-readable representation of an
object. The unicode
built-in calls ` __unicode__()`_ if it
exists, and otherwise falls back to __str__()
and decodes the
result with the system encoding. Conversely, the
Model
base class automatically derives
__str__()
from ` __unicode__()`_ by encoding to UTF-8.
In Python 3, there’s simply __str__()
, which must return str
(text).
(It is also possible to define __bytes__()
, but Django applications
have little use for that method, because they hardly ever deal with bytes
.)
Django provides a simple way to define __str__()
and
` __unicode__()`_ methods that work on Python 2 and 3: you must
define a __str__()
method returning text and to apply the
python_2_unicode_compatible()
decorator.
On Python 3, the decorator is a no-op. On Python 2, it defines appropriate
` __unicode__()`_ and __str__()
methods (replacing the
original __str__()
method in the process). Here’s an example:
from __future__ import unicode_literals
from django.utils.encoding import python_2_unicode_compatible
@python_2_unicode_compatible
class MyClass(object):
def __str__(self):
return "Instance of my class"
This technique is the best match for Django’s porting philosophy.
For forwards compatibility, this decorator is available as of Django 1.4.2.
Finally, note that __repr__()
must return a str
on all
versions of Python.
dict
and dict
-like classes¶
dict.keys()
, dict.items()
and dict.values()
return lists in
Python 2 and iterators in Python 3. QueryDict
and the
dict
-like classes defined in django.utils.datastructures
behave likewise in Python 3.
six provides compatibility functions to work around this change:
iterkeys()
, iteritems()
, and itervalues()
.
It also contains an undocumented iterlists
function that works well for
django.utils.datastructures.MultiValueDict
and its subclasses.
HttpRequest
and HttpResponse
objects¶
According to PEP 3333:
- headers are always
str
objects, - input and output streams are always
bytes
objects.
Specifically, HttpResponse.content
contains bytes
, which may become an issue if you compare it with a
str
in your tests. The preferred solution is to rely on
assertContains()
and
assertNotContains()
. These methods accept a
response and a unicode string as arguments.
Coding guidelines¶
The following guidelines are enforced in Django’s source code. They’re also recommended for third-party applications that follow the same porting strategy.
Syntax requirements¶
Unicode¶
In Python 3, all strings are considered Unicode by default. The unicode
type from Python 2 is called str
in Python 3, and str
becomes
bytes
.
You mustn’t use the u
prefix before a unicode string literal because it’s
a syntax error in Python 3.2. You must prefix byte strings with b
.
In order to enable the same behavior in Python 2, every module must import
unicode_literals
from __future__
:
from __future__ import unicode_literals
my_string = "This is an unicode literal"
my_bytestring = b"This is a bytestring"
If you need a byte string literal under Python 2 and a unicode string literal
under Python 3, use the str
builtin:
str('my string')
In Python 3, there aren’t any automatic conversions between str
and
bytes
, and the codecs
module became more strict. str.encode()
always returns bytes
, and bytes.decode
always returns str
. As a
consequence, the following pattern is sometimes necessary:
value = value.encode('ascii', 'ignore').decode('ascii')
Be cautious if you have to index bytestrings.
Exceptions¶
When you capture exceptions, use the as
keyword:
try:
...
except MyException as exc:
...
This older syntax was removed in Python 3:
try:
...
except MyException, exc: # Don't do that!
...
The syntax to reraise an exception with a different traceback also changed.
Use six.reraise()
.
Magic methods¶
Use the patterns below to handle magic methods renamed in Python 3.
Iterators¶
class MyIterator(six.Iterator):
def __iter__(self):
return self # implement some logic here
def __next__(self):
raise StopIteration # implement some logic here
Boolean evaluation¶
class MyBoolean(object):
def __bool__(self):
return True # implement some logic here
def __nonzero__(self): # Python 2 compatibility
return type(self).__bool__(self)
Division¶
class MyDivisible(object):
def __truediv__(self, other):
return self / other # implement some logic here
def __div__(self, other): # Python 2 compatibility
return type(self).__truediv__(self, other)
def __itruediv__(self, other):
return self // other # implement some logic here
def __idiv__(self, other): # Python 2 compatibility
return type(self).__itruediv__(self, other)
Special methods are looked up on the class and not on the instance to reflect the behavior of the Python interpreter.
Writing compatible code with six¶
six is the canonical compatibility library for supporting Python 2 and 3 in a single codebase. Read its documentation!
A customized version of six
is bundled with Django
as of version 1.4.2. You can import it as django.utils.six
.
Here are the most common changes required to write compatible code.
String handling¶
The basestring
and unicode
types were removed in Python 3, and the
meaning of str
changed. To test these types, use the following idioms:
isinstance(myvalue, six.string_types) # replacement for basestring
isinstance(myvalue, six.text_type) # replacement for unicode
isinstance(myvalue, bytes) # replacement for str
Python ≥ 2.6 provides bytes
as an alias for str
, so you don’t need
six.binary_type
.
long
¶
The long
type no longer exists in Python 3. 1L
is a syntax error. Use
six.integer_types
check if a value is an integer or a long:
isinstance(myvalue, six.integer_types) # replacement for (int, long)
xrange
¶
If you use xrange
on Python 2, import six.moves.range
and use that
instead. You can also import six.moves.xrange
(it’s equivalent to
six.moves.range
) but the first technique allows you to simply drop the
import when dropping support for Python 2.
Moved modules¶
Some modules were renamed in Python 3. The django.utils.six.moves
module (based on the six.moves module
) provides a
compatible location to import them.
PY2¶
If you need different code in Python 2 and Python 3, check six.PY2
:
if six.PY2:
# compatibility code for Python 2
This is a last resort solution when six
doesn’t provide an appropriate
function.
Django customized version of six¶
The version of six bundled with Django (django.utils.six
) includes a few
customizations for internal use only.