Tuesday, January 13, 2009

Python's Design Philosophy

Later blog entries will dive into the gory details of Python's history. However, before I do that, I would like to elaborate on the philosophical guidelines that helped me make decisions while designing and implementing Python.

First of all, Python was originally conceived as a one-person “skunkworks” project – there was no official budget, and I wanted results quickly, in part so that I could convince management to support the project (in which I was fairly successful). This led to a number of timesaving rules:
  • Borrow ideas from elsewhere whenever it makes sense.
  • “Things should be as simple as possible, but no simpler.” (Einstein)
  • Do one thing well (The "UNIX philosophy").
  • Don’t fret too much about performance--plan to optimize later when needed.
  • Don’t fight the environment and go with the flow.
  • Don’t try for perfection because “good enough” is often just that.
  • (Hence) it’s okay to cut corners sometimes, especially if you can do it right later.
Other principles weren’t intended as timesavers. Sometimes they were quite the opposite:
  • The Python implementation should not be tied to a particular platform. It’s okay if some functionality is not always available, but the core should work everywhere.
  • Don’t bother users with details that the machine can handle. (I didn’t always follow this rule, and some of the disastrous consequences are described in later sections.)
  • Support and encourage platform-independent user code, but don’t cut off access to platform capabilities or properties. (This is in sharp contrast to Java.)
  • A large complex system should have multiple levels of extensibility. This maximizes the opportunities for users, sophisticated or not, to help themselves.
  • Errors should not be fatal. That is, user code should be able to recover from error conditions as long as the virtual machine is still functional.
  • At the same time, errors should not pass silently. (These last two items naturally led to the decision to use exceptions throughout the implementation.)
  • A bug in the user’s Python code should not be allowed to lead to undefined behavior of the Python interpreter; a core dump is never the user’s fault.
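The two error-handling rules above (errors should be recoverable, yet never pass silently) can be illustrated with a minimal sketch. It assumes modern Python 3 syntax, and the names `parse_port` and `load_ports` are hypothetical, chosen only for this example; it is not code from the Python implementation.

```python
def parse_port(text):
    """Convert text to a TCP port number; raise ValueError on bad input."""
    port = int(text)  # int() raises ValueError for non-numeric text
    if not 0 < port < 65536:
        # The error is reported loudly rather than returning a dummy value.
        raise ValueError(f"port out of range: {port}")
    return port


def load_ports(candidates):
    """Return (valid ports, error messages); one bad entry does not abort the run."""
    ports, problems = [], []
    for text in candidates:
        try:
            ports.append(parse_port(text))
        except ValueError as exc:
            # User code recovers from the error and records it explicitly.
            problems.append(f"{text!r}: {exc}")
    return ports, problems


ports, problems = load_ports(["80", "http", "70000", "8080"])
print(ports)  # [80, 8080]
for line in problems:
    print(line)
```

The bad entries neither crash the program nor vanish: each one surfaces as an exception that the caller chooses to catch and record.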
Finally, I had various ideas about good programming language design, which were largely imprinted on me by the ABC group where I had my first real experience with language implementation and design. These ideas are the hardest to put into words, as they mostly revolved around subjective concepts like elegance, simplicity and readability.

Although I will discuss more of ABC's influence on Python a little later, I’d like to mention one readability rule specifically: punctuation characters should be used conservatively, in line with their common use in written English or high-school algebra. Exceptions are made when a particular notation is a long-standing tradition in programming languages, such as “x*y” for multiplication, “a[i]” for array subscription, or “x.foo” for attribute selection, but Python does not use “$” to indicate variables, nor “!” to indicate operations with side effects.

Tim Peters, a long-time Python user who eventually became its most prolific and tenacious core developer, attempted to capture my unstated design principles in what he calls the “Zen of Python.” I quote it here in its entirety:
  • Beautiful is better than ugly.
  • Explicit is better than implicit.
  • Simple is better than complex.
  • Complex is better than complicated.
  • Flat is better than nested.
  • Sparse is better than dense.
  • Readability counts.
  • Special cases aren't special enough to break the rules.
  • Although practicality beats purity.
  • Errors should never pass silently.
  • Unless explicitly silenced.
  • In the face of ambiguity, refuse the temptation to guess.
  • There should be one-- and preferably only one --obvious way to do it.
  • Although that way may not be obvious at first unless you're Dutch.
  • Now is better than never.
  • Although never is often better than right now.
  • If the implementation is hard to explain, it's a bad idea.
  • If the implementation is easy to explain, it may be a good idea.
  • Namespaces are one honking great idea -- let's do more of those!
Although my experience with ABC greatly influenced Python, the ABC group had a few design principles that were radically different from Python’s. In many ways, Python is a conscious departure from these:

  • The ABC group strived for perfection. For example, they used tree-based data structure algorithms that were proven to be optimal for asymptotically large collections (but were not so great for small collections).
  • The ABC group wanted to isolate the user, as completely as possible, from the “big, bad world of computers” out there. Not only should there be no limit on the range of numbers, the length of strings, or the size of collections (other than the total memory available), but users should also not be required to deal with files, disks, “saving”, or other programs. ABC should be the only tool they ever needed. This desire also caused the ABC group to create a complete integrated editing environment, unique to ABC. (There was an escape possible from ABC’s environment, for sure, but it was mostly an afterthought, and not accessible directly from the language.)
  • The ABC group assumed that the users had no prior computer experience (or were willing to forget it). Thus, alternative terminology was introduced that was considered more “newbie-friendly” than standard programming terms. For example, procedures were called “how-tos” and variables “locations”.
  • The ABC group designed ABC without an evolutionary path in mind, and without expecting user participation in the design of the language. ABC was created as a closed system, as flawless as its designers could make it. Users were not encouraged to “look under the hood”. Although there was talk of opening up parts of the implementation to advanced users in later stages of the project, this was never realized.
In many ways, the design philosophy I used when creating Python is probably one of the main reasons for its ultimate success. Rather than striving for perfection, early adopters found that Python worked "well enough" for their purposes. As the user-base grew, suggestions for improvement were gradually incorporated into the language. As we will see in later sections, many of these improvements have involved substantial changes and reworking of core parts of the language. Even today, Python continues to evolve.

19 comments:

  1. I remember taking a look at ABC during its brief history and being somewhat confused as its concepts were unlike the Fortran I had used earlier. When I later discovered Python, I found it more to my liking. But I didn't know then how big it would become! It's still my favourite programming language.

  2. Hi Guido,

    I'm not sure if this is a good place to ask questions about the design of Python, but since you are reviewing its history of design, I guess I'd better have a try. The question was originally posted on Google Moderator - Ask a Google engineer.

    You mentioned the "readability rule specifically: punctuation characters should be used conservatively". Then why do we still need colons (:) after if's, def's and for's when we have newlines and indentation to indicate the code block?

    I asked the question while I was marking programming assignments for an introductory Python course. I noticed a common error of missing colons after if's and else's.

    IMHO, colons are very similar to semicolons in the way that both can be used if one wants to compact several lines into one, like:

    if CONDITION: STATEMENT 1; STATEMENT 2; ...

    But just as semicolons are unnecessary if you write multiple statements in separate lines, so should be colons. For example, the above one-liner can be re-written as

    if CONDITION
    STATEMENT 1
    STATEMENT 2

    I think this satisfies your readability rule. However, currently a colon is required even though the if statement stands in its own line, which I think causes problems for many newbies to Python, as most other programming languages do not require colons to end if's and else's. (Confession: I consider myself a senior Python user, but I sometimes make the same mistake, too).

    I'd like to hear your opinion about this. Thank you very much!

  3. Hopefully Guido will correct me if I'm wrong on this. ;)

    The idea is that a colon serves as a visual cue. Although the statements are unambiguous to the parser, the human brain might take significantly longer to decipher the "simpler" form.

    That's just Guido's instinctual opinion though. Proving or disproving it would require proper studies, measuring the response time of human subjects, as well as maybe using MRI to monitor effort of the brain.

  4. Hi rhamphoryncus,

    I initially thought the same as you. But later I found this common error of missing colons especially after if's and else's, which led me to think otherwise.

    If the "visual cue" argument is true, then the semicolon should also live, because it is indeed a visual cue, too. Considering newline and indentation, all if's and else's stand out by themselves even better than most other statements needing a semicolon.

    Also note most other programming languages do not have such "visual cue" after all. Java and C-based languages do not have the concept. Ruby's "then" seems optional. Heck even Perl (the punctuation language? :P ) does not have it! So I take it that most programmers are used to the situation without such "visual cues".

    Thus the "visual cue" idea must be false.

    Meanwhile, I think it is actually mind-taxing to enforce the colon requirement, esp. after if's. I found this one day when I was trying to describe a particular Python script I wrote before. It's like saying "if some condition is true [COLON], do this and this ...". Note how one has to mentally insert the COLON in the process. When you are actually writing the code, it takes a physical (SHIFT-;) as well as a mental break after finishing the CONDITION part to continue with the indented block. I guess newbies to Python must skip those two breaks, so mistakes occur.

    I guess Guido must have something different in mind ...

  5. Hi Riobard,

    I think the Zen of Python can help you here.
    It says "Explicit is better than Implicit". If you omit the colon, the start of the next block of code is implicit, but it is explicit if you have the colon.

    I also find the colon giving a clear visual cue when I read the code that this means start of a dependent block of code in the next line - "Readability counts". The colon provides a clear visual clue to the mental "stop and continue from here for the next dependent block" idea it is used for.

    Just my 2 cents.

  6. "But just as semicolons are unnecessary if you write multiple statements in separate lines, so should be colons."

    I agree with Riobard. Colons tend to disrupt my coding process.

    Another thing that bothers me is the behavior of raw strings:

    Quoting from the Python Reference Manual:

    ----
    Unless an "r" or "R" prefix is present, escape sequences in strings are interpreted according to rules similar to those used by Standard C.
    ----

    So I understand that no matter what I include in a string, it will not be escaped. But later on it reads:

    ----
    r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw string cannot end in a single backslash (since the backslash would escape the following quote character).
    ----

    Uhm? Why would the backslash escape the following quote character? Isn't this supposed to be a "raw" string?

    I would be glad to read the rationale behind this behavior.

    Thanks for your work Guido!

  7. Hi Anand,

    I think you missed my point in the previous post. The sole purpose of newlines and indentation is to make code explicit, which they do a fantastic job.

    But do we need colons to make code block even more explicit? I doubt it. The same reasoning, if it holds, applies very well to semicolons, in the sense that semicolons make statements explicit (and better than implicit).

    So in short, such "visual cue" is already there by starting a newline and indenting to indicate the dependent code block. Extra colons seem to be redundant. They should go just as semicolons should go -- they stand in the way programmers think, interrupting the mental flow, and easy to forget (which leads to the common error I found for 2nd year CS students new to Python).

    Well, unless Guido has something else in his mind ...

  8. Thanks Guido, i love reading the history of my favourite programming language.

  9. spelling mistake: "As we will seen in later sections".

    nice reading!

  10. @Riobard:

    There have been discussions on the dev list about removing the colon that you might like to read. If you want to use semicolons, there is nothing preventing you, except that the community considers them to be visual noise. However, the colon fits in with its usage in English and reads very naturally. It's hard to draw from code written in a beginners class which is more readable- it comes from extensive reading of real world code, and controlled studies.

  11. Hi verte,

    I tried googling "remove colons python" before I asked the original question, and got nothing in the first page. Since you pointed out that there were discussions about this, I tried again. It turned out that I should use "drop colons python" instead to reach the link on python-dev mailing list. So thanks for mentioning it! (otherwise I'll miss it again)

    Here is the link in case other people are interested: http://mail.python.org/pipermail/python-list/2006-November/413127.html

    However, after finishing the thread I am somehow disappointed -- I have the same feeling as the guy, Michael Hobbs, who originally proposed the question. To quote him

    "To clarify my position, I'm not intentionally being contradictory. In fact, when I first posed my question, I asked if anyone had a good reason for why the redundancy should continue to exist. Expecting to get a nice grammatical counter-example, the only viable answer that anyone could come up with is the FAQ answer that it improves readability. Since then, I've been fighting my point that the colon really doesn't improve readability all that much.

    In the end, I have to admit that I really couldn't give a flying frog if the colon is there or not. It's just a colon, after all. I *was* hoping that I could convince someone to honestly think about it and consider if the colon is really that noticeable. But so far, the only response that I've received is that there's that ABC study somewhere and that settles that."

    So basically, the "extensive reading of real world code and controlled studies" seem to be vapor. I strongly doubt it, especially when the majority of programming languages do not suffer from lack of colons.



    I agree with you that semicolons are visual noise. However, I think colons are noise too; they seem to read naturally because we are so used to them that we ignore them automatically--it doesn't change the fact that they are noise. As I said in previous posts, little if anything is improved by the existence of colons; readability of code blocks is contributed primarily by newlines and indentation.

    I also agree that it is unfair to decide the syntax of Python merely based on how beginners feel. However, I've used Python for several years but still make the same mistake sometimes, which leads me to think this is a non-trivial issue. If doable, I think it would be appropriate to do a survey among intermediate or senior Python users to see how often they forget to type colons and the interpreter complains.

    Even if senior guys feel colons are OK, some other people find them awkward, particularly people who switched from other languages, because most other languages do not have colons. Python's No.1 competitor Ruby (I think) seems to be much cleaner in this respect (though other parts of it appear to be a mess to me). If colons were optional, I think there might be one fewer reason to reject Python when they have to choose. Remember Python tries to be an easy language for beginners.

    Given that colons add little to readability, that they are easy to forget, that they require more keystrokes, and that they interrupt the mental process, I think it is pretty harmless to make colons optional. Well, what could we possibly lose? If you like them you can continue to use them, but if you don't like them, it should be possible to avoid them.

  12. I feel the python principle is the principle of playing. It is about making it easy for people/things to play with each other. With this goal, you have to do those things that make python python.
    http://freestone.wordpress.com/2007/12/13/python-principle/

    http://freestone.wordpress.com/2009/01/18/python-design-philosophy-and-the-principle-of-playing/

    http://freestone.wordpress.com/2008/03/20/build-a-better-playground/

  13. i must say that this is really, really cool. please, tell me more!

  14. If you want to discuss colons, please do it in python-ideas@python.org.

  15. @oscar, in your raw string, what would happen if you wanted to have both a single and a double quote, without any means to escape the one which started the string? For me, this is enough of a rationale.
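The raw-string behaviour discussed in this thread can be checked directly. The following is a minimal sketch, assuming CPython 3; the variable names are hypothetical and the `compile()` call is used only so that the invalid literal does not break the example itself.

```python
# In a raw string the backslash is kept literally, but the tokenizer still
# pairs it with the next character while searching for the closing quote.
quoted = r"\""  # valid: two characters, a backslash and a double quote
print(len(quoted))  # 2

# The four-character source text r"\" is therefore an unterminated literal.
# Compiling it from a normal string shows the error without a broken file.
try:
    compile('r"\\"', "<example>", "eval")
except SyntaxError as exc:
    trailing_error = exc
else:
    trailing_error = None
print(type(trailing_error).__name__)  # SyntaxError

# A common workaround: append the final backslash from a non-raw literal.
path = r"C:\temp" + "\\"
print(path)
```

So a raw string can contain the quote that opened it, at the cost that it cannot end in an odd number of backslashes.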

  16. I've learned a lot from your website sir. It helps me doing my report regarding some backgrounds and history on python PL. As a matter of fact, we've been using python in creating simple programs. We are assigned as a group to do research on python, and as of now our end term requirement is to do a simple application using python. We decided to do a calendar that caters setting of reminders, with alarms. The problem is that, we don't know yet how to set reminders on the numbers of the calendar. We have to study more about it. If you have time to answer this sir, well it's our pleasure. :-) tnx!

