Tuesday, January 13, 2009

Introduction and Overview


Python is currently one of the most popular dynamic programming languages, along with Perl, Tcl, PHP, and newcomer Ruby. Although it is often viewed as a "scripting" language, it is really a general purpose programming language along the lines of Lisp or Smalltalk (as are the others, by the way). Today, Python is used for everything from throw-away scripts to large scalable web servers that provide uninterrupted service 24x7. It is used for GUI and database programming, client- and server-side web programming, and application testing. It is used by scientists writing applications for the world's fastest supercomputers and by children first learning to program.
In this blog, I will shine the spotlight on Python's history. In particular, how Python was developed, major influences in its design, mistakes made, lessons learned, and future directions for the language.

Acknowledgment: I am indebted to Dave Beazley for many of the better sentences in this blog. (For more on the origins of this blog, see my other blog.)

A Bird's Eye View of Python

When one is first exposed to Python, they are often struck by way that Python code looks, at least on the surface, similar to code written in other conventional programming languages such as C or Pascal. This is no accident---the syntax of Python borrows heavily from C. For instance, many of Python's keywords (if, else, while, for, etc.) are the same as in C, Python identifiers have the same naming rules as C, and most of the standard operators have the same meaning as C. Of course, Python is obviously not C and one major area where it differs is that instead of using braces for statement grouping, it uses indentation. For example, instead of writing statements in C like this
if (a < b) {
max = b;
} else {
max = a;
Python just dispenses with the braces altogether (along with the trailing semicolons for good measure) and uses the following structure
if a < b:
max = b
max = a
The other major area where Python differs from C-like languages is in its use of dynamic typing. In C, variables must always be explicitly declared and given a specific type such as int or double. This information is then used to perform static compile-time checks of the program as well as for allocating memory locations used for storing the variable’s value. In Python, variables are simply names that refer to objects. Variables do not need to be declared before they are assigned and they can even change type in the middle of a program. Like other dynamic languages, all type-checking is performed at run-time by an interpreter instead of during a separate compilation step.

Python’s primitive built-in data types include Booleans, numbers (machine integers, arbitrary-precision integers, and real and complex floating point numbers), and strings (8-bit and Unicode). These are all immutable types, meaning that values are represented by objects that cannot be modified after their creation. Compound built-in data types include tuples (immutable arrays), lists (resizable arrays) and dictionaries (hash tables).

For organizing programs, Python supports packages (groups of modules and/or packages), modules (related code grouped together in a single source file), classes, methods and functions. For flow control, it provides if/else, while, and a high-level for statement that loops over any “iterable” object. For error handling, Python uses (non-resumable) exceptions. A raise statement throws an exception, and try/except/finally statements specify exception handlers. Built-in operations throw exceptions when error conditions are encountered.

In Python, all objects that can be named are said to be "first class." This means that functions, classes, methods, modules, and all other named objects can be freely passed around, inspected, and placed in various data structures (e.g., lists or dictionaries) at run-time. And speaking of objects, Python also has full support for object-oriented programming including user-defined classes, inheritance, and run-time binding of methods.

Python has an extensive standard library, which is one of the main reasons for its popularity. The standard library has more than 100 modules and is always evolving. Some of these modules include regular expression matching, standard mathematical functions, threads, operating systems interfaces, network programming, standard internet protocols (HTTP,FTP, SMTP, etc.), email handling, XML processing, HTML parsing, and a GUI toolkit (Tcl/Tk).

In addition, there is a very large supply of third-party modules and packages, most of which are also open source. Here one finds web frameworks (too many to list!), more GUI toolkits, efficient numerical libraries (including wrappers for many popular Fortran packages), interfaces to relational databases (Oracle, MySQL, and others), SWIG (a tool for making arbitrary C++ libraries available as Python modules), and much more.

A major appeal of Python (and other dynamic programming languages for that matter) is that seemingly complicated tasks can often be expressed with very little code. As an example, here is a simple Python script that fetches a web page, scans it looking for URL references, and prints the first 10 of those.
# Scan the web looking for references

import re
import urllib

regex = re.compile(r'href="([^"]+)"')

def matcher(url, max=10):
"Print the first several URL references in a given url."
data = urllib.urlopen(url).read()
hits = regex.findall(data)
for hit in hits[:max]:
print urllib.basejoin(url, hit)

This program can easily be modified to make a web crawler, and indeed Scott Hassan has told me that he wrote Google’s first web crawler in Python. Today, Google employs millions of lines of Python code to manage many aspects of its operations, from build automation to ad management (Disclaimer: I am currently a Google employee.)

Underneath the covers, Python is typically implemented using a combination of a bytecode compiler and interpreter. Compilation is implicitly performed as modules are loaded, and several language primitives require the compiler to be available at run-time. Although Python’s de-facto standard implementation is written in C, and available for every imaginable hardware/software platform, several other implementations have become popular. Jython is a version that runs on the JVM and has seamless Java integration. IronPython is a version for the Microsoft .NET platform that has similar integration with other languages running on .NET. PyPy is an optimizing Python compiler/interpreter written in Python (still a research project, being undertaken with EU funding). There’s also Stackless Python, a variant of the C implementation that reduces reliance on the C stack for function/method calls, to allow co-routines, continuations, and microthreads.


  1. Very interesting so far. I'd also be curious to find out what elements of Python's syntax were borrowed from other languages. As you note, "syntax of Python borrows heavily from C." But I know that it was influenced by more than just C. Any chance you could address this in a future post?

  2. Thanks for taking the time to do this. I'm looking forward to it.

  3. Nice, I'm looking forward to reading the whole series.

  4. Great start. I guess this will add more popularity to the Python "lore".

    Perhaps Python should include something like a "past" module, similar to the Zen of Python, where one can see a capsule of the history when it is imported :)

    Btw, will this blog comment only on the past or will it give some indications to the future of Python too ?

  5. It appears the first Google crawler was written in Java:


  6. Waiting for more. Thanks for Python ;)

  7. Guido,

    I'm glad to see you posting this! I learned a LOT about Python when I gave much of this material an editing round last year. It was an honor to work on it--so thanks for the shout out :-).

    For everyone else reading, I'll just say there is a lot of interesting stuff yet to come.

    --Dave B.

  8. @RichB: that was before Google was founded. Apparently Larry never got his question answered, and switched to Python... (True, with a C extension for parsing HTML.)

  9. Guido,

    Thanks for all your work. Where I used to write words, I now write Python code. I look forward to reading more about the genesis of python, and using it as I improve my website.

  10. Guido,

    Thank you for your interest in recalling all your basic works and presenting for us. It is a great opportunity to the new comers and the developers who have been pushed into the python pool as a part time user and becomes a full time developer in python.

  11. Guido,

    Thank you for your awesome article. I translated the current two articles at here to Chinese and put on my blog. Hope you do not mind.

    The URL of my translated this article is http://blog.cnliufeng.com/2009/01/translate-introduction-and-overview/

    Here is another one: http://blog.cnliufeng.com/2009/01/translate-pythons-design-philosophy/

    If you are not pleased for them, please leave a word here and I'll remove them at once.

  12. I wonder what would happen if I try to:

    from __future__ import braces

  13. I traslated this article to spanish and I'll try to translate the others as well.

    Spanish translation: http://www.juanjoconti.com.ar/2009/01/20/la-historia-de-python-introduccion-y-repaso/

  14. @Guido,

    you can link the translations at the bottom of your posts, as Paul Graham does. It's very handy for readers.

  15. Guido, assuming you took computer science 101, why do you call an Array a "list" ? Is it because they are implemented as linked lists ?

  16. I remember seeing Alex Martelli's talk about history of recent (then) Python releases. It would be a great lecture if he found time to put that down on this blog.

  17. Thank u Guido.I am a student and i have been tryin since last few days to implementa search engine in python.And this would definitely be a starting place for me.


Note: Only a member of this blog may post a comment.