Tuesday, June 29, 2010

From List Comprehensions to Generator Expressions

List comprehensions were added in Python 2.0. This feature originated as a set of patches by Greg Ewing with contributions by Skip Montanaro and Thomas Wouters. (IIRC Tim Peters also strongly endorsed the idea.) Essentially, they are a Pythonic interpretation of a well-known notation for sets used by mathematicians. For example, it is commonly understood that this:
{x | x > 10}

refers to the set of all x such that x > 10. In math, this form implies a universal set that is understood by the reader (for example, the set of all reals, or the set of all integers, depending on the context). In Python, there is no concept of a universal set, and in Python 2.0, there were no sets. (Sets are an interesting story, of which more in a future blog post.)

This and other considerations led to the following notation in Python:
[f(x) for x in S if P(x)]

This produces a list containing the values of the sequence S selected by the predicate P and mapped by the function f. The if-clause is optional, and multiple for-clauses may be present, each with their own optional if-clause, to represent nested loops (the latter feature is rarely used though, since it typically maps a multi-dimensional entity to a one-dimensional list).
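
For example, here is a comprehension with two for-clauses that flattens a list of rows into a single list:
[x for row in [[1, 2], [3, 4]] for x in row]   # evaluates to [1, 2, 3, 4]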

List comprehensions provide an alternative to using the built-in map() and filter() functions. map(f, S) is equivalent to [f(x) for x in S] while filter(P, S) is equivalent to [x for x in S if P(x)]. One would think that list comprehensions have little to recommend themselves over the seemingly more compact map() and filter() notations. However, the picture changes if one looks at a more realistic example. Suppose we want to add 1 to the elements of a list, producing a new list. The list comprehension solution is [x+1 for x in S]. The solution using map() is map(lambda x: x+1, S). The part “lambda x: x+1” is Python’s notation for an anonymous function defined in-line.
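
To make the comparison concrete, here is a quick side-by-side sketch (shown for Python 3, where map() and filter() return iterators, hence the list() calls):
S = [1, 2, 3]
assert [x+1 for x in S] == [2, 3, 4]                        # list comprehension
assert list(map(lambda x: x+1, S)) == [2, 3, 4]             # map() with a lambda
assert [x for x in S if x > 1] == list(filter(lambda x: x > 1, S)) == [2, 3]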

It has been argued that the real problem here is that Python’s lambda notation is too verbose, and that a more concise notation for anonymous functions would make map() more attractive. Personally, I disagree—I find the list comprehension notation much easier to read than the functional notation, especially as the complexity of the expression to be mapped increases. In addition, the list comprehension executes much faster than the solution using map and lambda. This is because calling a lambda function creates a new stack frame while the expression in the list comprehension is evaluated without creating a new stack frame.

Given the success of list comprehensions, and enabled by the invention of generators (of which more in a future episode), Python 2.4 added a similar notation that represents a sequence of results without turning it into a concrete list. The new feature is called a “generator expression”. For example:
sum(x**2 for x in range(1, 11))

This calls the built-in function sum() with a generator expression as its argument; the generator expression yields the squares of the numbers from 1 through 10 inclusive. The sum() function adds up the values in its argument resulting in an answer of 385. The advantage over sum([x**2 for x in range(1, 11)]) should be obvious. The latter creates a list containing all the squares, which is then iterated over once before it is thrown away. For large collections these savings in memory usage are an important consideration.

I should add that the differences between list comprehensions and generator expressions are fairly subtle. For example, in Python 2, this is a valid list comprehension:
[x**2 for x in 1, 2, 3]

However this is not a valid generator expression:
(x**2 for x in 1, 2, 3)

We can fix it by adding parentheses around the "1, 2, 3" part:
(x**2 for x in (1, 2, 3))

In Python 3, you also have to use these parentheses for the list comprehension:
[x**2 for x in (1, 2, 3)]

However, in a "regular" or "explicit" for-loop, you can still omit them:
for x in 1, 2, 3: print(x**2)

Why the differences, and why the changes to a more restrictive list comprehension in Python 3? The factors affecting the design were backwards compatibility, avoiding ambiguity, the desire for equivalence, and evolution of the language. Originally, Python (before it even had a version :-) only had the explicit for-loop. There is no ambiguity here for the part that comes after 'in': it is always followed by a colon. Therefore, I figured that if you wanted to loop over a bunch of known values, you shouldn't be bothered with having to put parentheses around them. This also reminded me of Algol-60, where you can write:
for i := 1, 2, 3 do Statement

except that in Algol-60 you can also replace each expression with a step-until clause, like this:
for i := 1 step 1 until 10, 12 step 2 until 50, 55 step 5 until 100 do Statement

(In retrospect it would have been cool if Python for-loops had the ability to iterate over multiple sequences as well. Alas...)

When we added list comprehensions in Python 2.0, the same reasoning applied: the sequence expression could only be followed by a close bracket ']' or by a 'for' or 'if' keyword. And it was good.

But when we added generator expressions in Python 2.4, we ran into a problem with ambiguity: the parentheses around a generator expression are not technically part of the generator expression syntax. For example, in this call:
sum(x**2 for x in range(10))

the outer parentheses are part of the call to sum(), and a "bare" generator expression occurs as the first argument. So in theory there would be two interpretations for something like this:
sum(x**2 for x in a, b)

This could either be intended as:
sum(x**2 for x in (a, b))

or as:
sum((x**2 for x in a), b)

After a lot of hemming and hawing (IIRC) we decided not to guess in this case, and the generator expression was required to have a single expression (evaluating to an iterable, of course) after its 'in' keyword. But at the time we didn't want to break existing code using the (already hugely popular) list comprehensions.

Then when we were designing Python 3, we decided that we wanted the list comprehension:
[f(x) for x in S if P(x)]

to be fully equivalent to the following expansion using the built-in list() function applied to a generator expression:
list(f(x) for x in S if P(x))

Thus we decided to use the slightly more restrictive syntax of generator expressions for list comprehensions as well.

We also made another change in Python 3, to improve equivalence between list comprehensions and generator expressions. In Python 2, the list comprehension "leaks" the loop control variable into the surrounding scope:
x = 'before'
a = [x for x in 1, 2, 3]
print x # this prints '3', not 'before'

This was an artifact of the original implementation of list comprehensions; it was one of Python's "dirty little secrets" for years. It started out as an intentional compromise to make list comprehensions blindingly fast, and while it was not a common pitfall for beginners, it definitely stung people occasionally. For generator expressions we could not do this. Generator expressions are implemented using generators, whose execution requires a separate execution frame. Thus, generator expressions (especially if they iterate over a short sequence) were less efficient than list comprehensions.

However, in Python 3, we decided to fix the "dirty little secret" of list comprehensions by using the same implementation strategy as for generator expressions. Thus, in Python 3, the above example (after modification to use print(x) :-) will print 'before', proving that the 'x' in the list comprehension temporarily shadows but does not override the 'x' in the surrounding scope.
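
Here is that snippet transcribed to Python 3, with the parentheses that are now required around the literal tuple:
x = 'before'
a = [x for x in (1, 2, 3)]
print(x)  # prints 'before' in Python 3; the Python 2 version above prints 3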

And before you start worrying about list comprehensions becoming slow in Python 3: thanks to the enormous implementation effort that went into Python 3 to speed things up in general, both list comprehensions and generator expressions in Python 3 are actually faster than they were in Python 2! (And there is no longer a speed difference between the two.)

UPDATE: Of course, I forgot to mention that Python 3 also supports set comprehensions and dictionary comprehensions. These are straightforward extensions of the list comprehension idea.
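
Their syntax is the obvious variation on the list form; for example:
squares_set = {x**2 for x in range(10)}     # set comprehension
squares_map = {x: x**2 for x in range(10)}  # dict comprehension mapping x to x**2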

Wednesday, June 23, 2010

Method Resolution Order

In languages that use multiple inheritance, the order in which base classes are searched when looking for a method is often called the Method Resolution Order, or MRO. (In Python this also applies to other attributes.) For languages that support single inheritance only, the MRO is uninteresting; but when multiple inheritance comes into play, the choice of an MRO algorithm can be remarkably subtle. Python has known at least three different MRO algorithms: classic, Python 2.2 new-style, and Python 2.3 new-style (a.k.a. C3). Only the latter survives in Python 3.

Classic classes used a simple MRO scheme: when looking up a method, base classes were searched using a simple depth-first left-to-right scheme. The first matching object found during this search would be returned. For example, consider these classes:
class A:
    def save(self): pass

class B(A): pass

class C:
    def save(self): pass

class D(B, C): pass
If we created an instance x of class D, the classic method resolution order would order the classes as D, B, A, C. Thus, a search for the method x.save() would produce A.save() (and not C.save()). This scheme works fine for simple cases, but has problems that become apparent when one considers more complicated uses of multiple inheritance. One problem concerns method lookup under "diamond inheritance." For example:
class A:
    def save(self): pass

class B(A): pass

class C(A):
    def save(self): pass

class D(B, C): pass
Here, class D inherits from B and C, both of which inherit from class A. Using the classic MRO, methods would be found by searching the classes in the order D, B, A, C, A. Thus, a reference to x.save() will call A.save() as before. However, this is unlikely to be what you want in this case! Since both B and C inherit from A, one can argue that the redefined method C.save() is actually the method that you want to call, since it can be viewed as being "more specialized" than the method in A (in fact, it probably calls A.save() anyways). For instance, if the save() method is being used to save the state of an object, not calling C.save() would break the program since the state of C would be ignored.

Although this kind of multiple inheritance was rare in existing code, new-style classes would make it commonplace. This is because all new-style classes were defined by inheriting from a base class object. Thus, any use of multiple inheritance in new-style classes would always create the diamond relationship described above. For example:
class B(object): pass

class C(object):
    def __setattr__(self, name, value): pass

class D(B, C): pass
Moreover, since object defined a number of methods that are sometimes extended by subtypes (e.g., __setattr__()), the resolution order becomes critical. For example, in the above code, the method C.__setattr__ should apply to instances of class D.

To fix the method resolution order for new-style classes in Python 2.2, I adopted a scheme where the MRO would be pre-computed when a class was defined and stored as an attribute of each class object. The computation of the MRO was officially documented as using a depth-first left-to-right traversal of the classes as before. If any class was duplicated in this search, all but the last occurrence would be deleted from the MRO list. So, for our earlier example, the search order would be D, B, C, A (as opposed to D, B, A, C, A with classic classes).
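
For the curious, here is a small sketch (my reconstruction, not the code that actually shipped) of that documented rule: walk the bases depth-first and left-to-right, then drop all but the last occurrence of each duplicated class:
def documented_mro(cls):
    # Depth-first, left-to-right walk of the bases, duplicates included.
    order = [cls]
    for base in cls.__bases__:
        order.extend(documented_mro(base))
    # Keep only the last occurrence of each class.
    return [c for i, c in enumerate(order) if c not in order[i+1:]]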

In reality, the computation of the MRO was more complex than this. I discovered a few cases where this new MRO algorithm didn't seem to work. Thus, there was a special case to deal with the situation where two base classes occurred in a different order in the inheritance lists of two different derived classes, and both of those classes were inherited by yet another class. For example:
class A(object): pass
class B(object): pass
class X(A, B): pass
class Y(B, A): pass
class Z(X, Y): pass
Using the tentative new MRO algorithm, the MRO for these classes would be Z, X, Y, B, A, object. (Here 'object' is the universal base class.) However, I didn't like the fact that B and A were in reversed order. Thus, the real MRO would interchange their order to produce Z, X, Y, A, B, object. Intuitively, this algorithm tried to preserve the order of classes for bases that appeared first in the search process. For instance, on class Z, the base class X would be checked first because it was first in the inheritance list. Since X inherited from A and B, the MRO algorithm would try to preserve that ordering. This is what I implemented for Python 2.2, but I documented the earlier algorithm (naïvely thinking it didn't matter much).

However, shortly after the introduction of new-style classes in Python 2.2, Samuele Pedroni discovered an inconsistency between the documented MRO algorithm and the results that were actually observed in real code. Moreover, inconsistencies were occurring even in code that did not fall under the special case observed above. After much discussion, it was decided that the MRO adopted for Python 2.2 was broken and that Python should adopt the C3 Linearization algorithm described in the paper "A Monotonic Superclass Linearization for Dylan" (K. Barrett, et al, presented at OOPSLA'96).

Essentially, the main problem in the Python 2.2 MRO algorithm concerned the issue of monotonicity. In a complex inheritance hierarchy, each inheritance relationship defines a simple set of rules concerning the order in which classes should be checked. Specifically, if a class A inherits from class B, then the MRO should obviously check A before B. Likewise, if a class B uses multiple inheritance to inherit from C and D, then B should be checked before C and C should be checked before D.

Within a complex inheritance hierarchy, you want to be able to satisfy all of these possible rules in a way that is monotonic. That is, if you have already determined that class A should be checked before class B, then you should never encounter a situation that requires class B to be checked before class A (otherwise, the result is undefined and the inheritance hierarchy should be rejected). This is where the original MRO got it wrong and where the C3 algorithm comes into play. Basically, the idea behind C3 is that if you write down all of the ordering rules imposed by inheritance relationships in a complex class hierarchy, the algorithm will determine a monotonic ordering of the classes that satisfies all of them. If such an ordering can not be determined, the algorithm will fail.
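
In outline, C3 linearizes a class by merging the linearizations of its bases (plus the list of the bases themselves), repeatedly taking a class that does not appear in the tail of any of the remaining sequences. Here is a compact sketch (again my reconstruction, not the CPython implementation; the function names are made up):
def c3_mro(cls):
    if not cls.__bases__:                       # e.g. object
        return [cls]
    return [cls] + c3_merge([c3_mro(base) for base in cls.__bases__] +
                            [list(cls.__bases__)])

def c3_merge(sequences):
    result = []
    seqs = [list(s) for s in sequences if s]
    while seqs:
        for seq in seqs:
            head = seq[0]
            # A head is acceptable only if it is in no other sequence's tail.
            if not any(head in other[1:] for other in seqs if other is not seq):
                break
        else:
            raise TypeError("inconsistent hierarchy: no linearization exists")
        result.append(head)
        seqs = [[c for c in s if c is not head] for s in seqs]
        seqs = [s for s in seqs if s]
    return result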

Thus, in Python 2.3, we abandoned my home-grown 2.2 MRO algorithm in favor of the academically vetted C3 algorithm. One outcome of this is that Python will now reject any inheritance hierarchy that has an inconsistent ordering of base classes. For instance, in the previous example, there is an ordering conflict between class X and Y. For class X, there is a rule that says class A should be checked before class B. However, for class Y, the rule says that class B should be checked before A. In isolation, this discrepancy is fine, but if X and Y are ever combined together in the same inheritance hierarchy for another class (such as in the definition of class Z), that class will be rejected by the C3 algorithm. This, of course, matches the Zen of Python's "errors should never pass silently" rule.
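
You can see the rejection for yourself: trying to define Z from the earlier example fails (the exact wording of the error varies between versions):
class A(object): pass
class B(object): pass
class X(A, B): pass
class Y(B, A): pass
class Z(X, Y): pass   # raises TypeError: cannot create a consistent MRO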

Monday, June 21, 2010

import antigravity

The antigravity module, referencing the XKCD comic mentioning Python, was added to Python 3 by Skip Montanaro. You can read more about it here, in one of the first spottings that I know of: http://sciyoshi.com/blog/2008/dec/30/import-antigravity/.

But it really originated in Google App Engine! It was a last-minute addition when we launched App Engine on April 7, 2008. A few weeks before launch, when most code was already frozen, the App Engine team at Google decided we wanted an easter egg. After a great many suggestions, some too complex, some too obscure, some too risky, we chose the "antigravity" module. The App Engine version is a little more elaborate than the Python 3 version: it defines a fly() function which randomly does one of two things: with a probability of 10%, it redirects to the XKCD comic; otherwise, it simply renders the text of the comic in HTML (with a link to the comic on the last line). To invoke it in your App Engine app, you'd have to write a little main program, like this:
import antigravity

def main():
    antigravity.fly()

if __name__ == '__main__':
    main()


Update: The Python 3 stdlib version has an easter egg inside the easter egg, if you inspect the source code: it defines a function that purports to implement the XKCD geohashing algorithm.

import this and The Zen of Python

Barry Warsaw posted an interesting blog that tells an obscure part of Python history (the kind I like to keep alive): http://www.wefearchange.org/2010/06/import-this-and-zen-of-python.html

The Inside Story on New-Style Classes

[Warning, this post is long and gets very technical.]

On the surface, new-style classes appear very similar to the original class implementation. However, new-style classes also introduced a number of new concepts:
  • low-level constructors named __new__()
  • descriptors, a generalized way to customize attribute access
  • static methods and class methods
  • properties (computed attributes)
  • decorators (introduced in Python 2.4)
  • slots
  • a new Method Resolution Order (MRO)
In the next few sections, I will try to shine some light on these concepts.

Low-level constructors and __new__()

Traditionally, classes defined an __init__() method which defined how new instances are initialized after their creation. However, in some cases, the author of a class may want to customize how instances are created—for example, if an object was being restored from a persistent database. Old-style classes never really provided a hook for customizing the creation of objects, although there were library modules that allowed certain kinds of objects to be created in non-standard ways (e.g., the "new" module).

New-style classes introduced a new class method __new__() that lets the class author customize how new class instances are created. By overriding __new__() a class author can implement patterns like the Singleton Pattern, return a previously created instance (e.g., from a free list), or return an instance of a different class (e.g., a subclass). However, the use of __new__ has other important applications. For example, in the pickle module, __new__ is used to create instances when unserializing objects. In this case, instances are created, but the __init__ method is not invoked.

Another use of __new__ is to help with the subclassing of immutable types. By the nature of their immutability, these kinds of objects can not be initialized through a standard __init__() method. Instead, any kind of special initialization must be performed as the object is created; for instance, if the class wanted to modify the value being stored in the immutable object, the __new__ method can do this by passing the modified value to the base class __new__ method.
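
As a sketch (my example, not code from the standard library), here is an int subclass whose __new__ clamps the stored value to the range 0..255:
class Clamped(int):
    def __new__(cls, value):
        # int is immutable, so the adjustment has to happen at creation time,
        # by passing the modified value on to the base class __new__.
        return super(Clamped, cls).__new__(cls, max(0, min(255, value)))

print(Clamped(300))   # prints 255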

Descriptors

Descriptors are a generalization of the concept of bound methods, which was central to the implementation of classic classes. In classic classes, when an instance attribute is not found in the instance dictionary, the search continues with the class dictionary, then the dictionaries of its base classes, and so on recursively. When the attribute is found in a class dictionary (as opposed to in the instance dictionary), the interpreter checks if the object found is a Python function object. If so, the returned value is not the object found, but a wrapper object that acts as a currying function. When the wrapper is called, it calls the original function object after inserting the instance in front of the argument list.

For example, consider an instance x of a class C. Now, suppose that one makes a method call x.f(0). This operation looks up the attribute named "f" on x and calls it with an argument of 0. If "f" corresponds to a method defined in the class, the attribute request returns a wrapper function that behaves approximately like the function in this Python pseudocode:
def bound_f(arg):
    return f(x, arg)
When the wrapper is called with an argument of 0, it calls "f" with two arguments: x and 0. This is the fundamental mechanism by which methods in classes obtain their "self" argument.

Another way to access the function object f (without wrapping it) is to ask the class C for an attribute named “f”. This kind of search does not return a wrapper but simply returns the function f. In other words, x.f(0) is equivalent to C.f(x, 0). This is a pretty fundamental equivalence in Python.
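
A tiny demonstration of that equivalence:
class C:
    def f(self, arg):
        return arg * 2

x = C()
assert x.f(0) == C.f(x, 0)   # the bound method x.f fills in x as the first argument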

For classic classes, if the attribute search finds any other kind of object, no wrapper is created and the value found in the class dictionary is returned unchanged. This makes it possible to use class attributes as “default” values for instance variables. For example, in the above example, if class C has an attribute named “a” whose value is 1, and there is no key “a” in x’s instance dictionary, then x.a equals 1. Assignment to x.a will create a key “a” in x’s instance dictionary whose value will shadow the class attribute (by virtue of the attribute search order). Deleting x.a will reveal the shadowed value (1) again.
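
In code:
class C:
    a = 1        # class attribute, used as a default

x = C()
print(x.a)       # 1: found in the class dictionary
x.a = 2          # creates key 'a' in x's instance dictionary
print(x.a)       # 2: the instance attribute shadows the class attribute
del x.a
print(x.a)       # 1: the shadowed class attribute is revealed again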

Unfortunately, some Python developers were discovering the limitations of this scheme. One limitation was that it prevented the creation of “hybrid” classes that had some methods implemented in Python and others in C, because only Python functions were being wrapped in such a way as to provide the method with access to the instance, and this behavior was hard-coded in the language. There was also no obvious way to define different kinds of methods, such as the static member functions familiar to C++ and Java programmers.

To address this issue, Python 2.2 introduced a straightforward generalization of the above wrapping behavior. Instead of hard-coding the behavior that Python function objects are wrapped and other objects aren’t, the wrapping is now left up to the object found by the attribute search (the function f in the above example). If the object found by an attribute search happens to have a special method named __get__, it is considered to be a "descriptor" object. The __get__ method is then called and whatever is returned is used to produce the result of the attribute search. If the object has no __get__ method, it is returned unchanged. To obtain the original behavior (wrapping function objects) without special-casing function objects in the instance attribute lookup code, function objects now have a __get__ method that returns a wrapper as before. However, users are free to define other classes with methods named __get__, and their instances, when found in a class dictionary during an instance attribute lookup, can also wrap themselves in any way they like.
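
Here is a minimal sketch of a user-defined descriptor (my example; any object with a __get__ method will do):
class Constant(object):
    """A descriptor whose __get__ always returns a fixed value."""
    def __init__(self, value):
        self.value = value
    def __get__(self, obj, objtype=None):
        return self.value

class C(object):
    answer = Constant(42)

print(C().answer)   # 42: the descriptor's __get__ produced the result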

In addition to generalizing the concept of attribute lookup, it also made sense to extend this idea for the operations of setting or deleting an attribute. Thus, a similar scheme is used for assignment operations such as x.a = 1 or del x.a. In these cases, if the attribute "a" is found in the instance's class dictionary (not in the instance dictionary), the object stored in the class dictionary is checked to see if it has a __set__ and __delete__ special method respectively. (Remember that __del__ has a completely different meaning already.) Thus, by redefining these methods, a descriptor object can have complete control over what it means to get, set, and delete an attribute. However, it's important to emphasize that this customization only applies when a descriptor instance appears in a class dictionary—not the instance dictionary of an object.

staticmethod, classmethod, and property

Python 2.2 added three predefined classes: classmethod, staticmethod, and property, that utilized the new descriptor machinery. classmethod and staticmethod were simply wrappers for function objects, implementing different __get__ methods to return different kinds of wrappers for calling the underlying function. For instance, the staticmethod wrapper calls the function without modifying the argument list at all. The classmethod wrapper calls the function with the instance's class object set as the first argument instead of the instance itself. Both can be called via an instance or via the class and the arguments will be the same.
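
A quick demonstration (my example, using the wrapping spelling described below, since no special syntax existed yet):
class C(object):
    def nudge(cls, arg):
        return (cls, arg)
    nudge = classmethod(nudge)
    def double(arg):
        return arg * 2
    double = staticmethod(double)

x = C()
assert C.double(3) == x.double(3) == 6        # staticmethod: argument list untouched
assert C.nudge(3) == x.nudge(3) == (C, 3)     # classmethod: the class is passed as 'cls'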

The property class is a wrapper that turns a pair of methods for getting and setting a value into an "attribute." For example, if you have a class like this,
class C(object):
    def set_x(self, value):
        self.__x = value
    def get_x(self):
        return self.__x
a property wrapper could be used to make an attribute "x" that, when accessed, would implicitly call the get_x and set_x methods.

When first introduced, there was no special syntax for using the classmethod, staticmethod, and property descriptors. At the time, it was deemed too controversial to simultaneously introduce a major new feature along with new syntax (which always leads to a heated debate). Thus, to use these features, you would define your class and methods normally, but add extra statements that would "wrap" the methods. For example:
class C:
    def foo(cls, arg):
        ...
    foo = classmethod(foo)
    def bar(arg):
        ...
    bar = staticmethod(bar)
For properties, a similar scheme was used:
class C:
    def set_x(self, value):
        self.__x = value
    def get_x(self):
        return self.__x
    x = property(get_x, set_x)
Decorators

A downside of this approach is that the reader of a class had to read all the way till the end of a method declaration before finding out whether it was a class or static method (or some user-defined variation). In Python 2.4, new syntax was finally introduced, allowing one to write the following instead:
class C:
    @classmethod
    def foo(cls, arg):
        ...
    @staticmethod
    def bar(arg):
        ...
The construct @expression, on a line by itself before a function declaration, is called a decorator. (Not to be confused with descriptor, which refers to a wrapper implementing __get__; see above.) The particular choice of decorator syntax (derived from Java's annotations) was debated endlessly before it was decided by “BDFL pronouncement”. (David Beazley wrote a piece about the history of the term BDFL that I'll publish separately.)

The decorator feature has become one of the more successful language features, and the use of custom decorators has exceeded my wildest expectations. Web frameworks in particular have found lots of uses for them. Based on this success, in Python 2.6, the decorator syntax was extended from function definitions to include class definitions.

Slots

Another enhancement made possible with descriptors was the introduction of the __slots__ attribute on classes. For example, a class could be defined like this:
class C:
    __slots__ = ['x','y']
    ...
The presence of __slots__ does several things. First, it restricts the valid set of attribute names on an object to exactly those names listed. Second, since the attributes are now fixed, it is no longer necessary to store attributes in an instance dictionary, so the __dict__ attribute is removed (unless a base class already has it; it can also be added back by a subclass that doesn't use __slots__). Instead, the attributes can be stored in predetermined locations within an array. Thus, every slot attribute is actually a descriptor object that knows how to set/get each attribute using an array index. Underneath the covers, the implementation of this feature is done entirely in C and is highly efficient.
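
For instance:
class Point(object):
    __slots__ = ['x', 'y']    # only 'x' and 'y' may be set on instances

p = Point()
p.x, p.y = 1, 2               # fine: stored in fixed slots, not in a dict
# p.z = 3                     # would raise AttributeError
# p.__dict__                  # would also raise AttributeError: no instance dict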

Some people mistakenly assume that the intended purpose of __slots__ is to increase code safety (by restricting the attribute names). In reality, my ultimate goal was performance. Not only was __slots__ an interesting application of descriptors, but I also feared that all of the changes in the class system were going to have a negative impact on performance. In particular, in order to make data descriptors work properly, any manipulation of an object's attributes first involved a check of the class dictionary to see if that attribute was, in fact, a data descriptor. If so, the descriptor was used to handle the attribute access instead of manually manipulating the instance dictionary as is normally the case. However, this extra check also meant that an extra lookup would be performed prior to inspecting the dictionary of each instance. Thus the use of __slots__ was a way to optimize the lookup of data attributes—a fallback, if you will, in case people were disappointed with the performance impact of the new class system. This turned out to be unnecessary, but by that time it was of course too late to remove __slots__. Of course, used properly, slots really can increase performance, especially by reducing memory footprint when many small objects are created.

I'll leave the history of Python's Method Resolution Order (MRO) to the next post.

New-style Classes

[After a long hiatus, this blog series is back! I will continue where I left off last year. I'll try to keep the frequency up.]

Earlier, I described how the addition of classes to Python was essentially an afterthought. The implementation chosen was definitely an example of Python's "cut corners" philosophy. However, as Python evolved, various problems with the class implementation became a popular topic of griping from expert Python users.

One problem with the class implementation was that there was no way to subclass built-in types. For example, lists, dictionaries, strings, and other objects were somehow "special" and could not be specialized via subclassing. This limitation seemed rather odd for a language that claimed to be "object oriented."

Another problem was that the whole type system just seemed to be "wrong" with user-defined classes. For example, if you created two objects a and b, statements such as type(a) == type(b) would evaluate as True even if a and b were instances of completely unrelated classes. Needless to say, developers who were familiar with languages such as C++ and Java found this to be rather odd since, in those languages, classes were tightly integrated with the underlying type system.

In Python 2.2, I finally took the time to reimplement classes and "do it right." This change was, by far, the most ambitious rewrite of a major Python subsystem to date and one could certainly accuse me of a certain amount of "second-system syndrome" in this effort. Not only did I address the immediate problem of allowing built-in types to be subclassed, I also added support for true metaclasses, attempted to fix the naïve method resolution order for multiple inheritance, and added a variety of other features. A major influence on this work was the book "Putting Metaclasses to Work" by Ira Forman and Scott Danforth, which provided me with a specific notion of what a metaclass is, different from a similar concept in Smalltalk.

An interesting aspect of the class rewrite was that new-style classes were introduced as a new language feature, not as a strict replacement for old-style classes. In fact, for backwards compatibility, the old class implementation remains the default class creation protocol in Python 2. To create a new-style class, you simply have to subclass an existing new-style class such as object (which is the root for the new-style class hierarchy). For example:
class A(object):
    statements
    ...
The change to new-style classes has been a major success. The new metaclasses have become popular amongst framework writers, and explaining classes has actually become easier since there were fewer exceptions. The fact that backwards compatibility was maintained meant that old code has continued to work as new-style classes have evolved. Finally, although the old-style classes will eventually be removed from the language, users are getting used to writing "class MyClass(object)" to declare a class, which isn't so bad.