With PyPy 4.0 just out in October, it’s worth a look at it in comparison with Python. Note that although PyPy 4 claims compatibility with Python 3.2.5, I’ve kept the benchmark comparisons to Python 2.7.*. The Python 3 on my system is 3.4, which PyPy hasn’t reached yet. PyPy 4 also includes some SIMD vectorization at run-time, which it uses where possible to speed up code.
What is PyPy?
The programming language Python by default uses an interpreter called CPython. The name comes about because the CPython interpreter is written in C; CPython doesn’t convert Python to C. It actually converts Python code to bytecode (like Java) and then interprets that bytecode. There are other Python interpreters out there. Whichever one you use (e.g. IronPython on .NET, or Jython, which actually does compile Python code to Java bytecode), they all run Python.
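To make the bytecode step concrete, here is a minimal sketch using the standard-library dis module. The add function is a made-up example, and the instructions printed differ between Python versions and implementations, so no particular output is shown.

```python
import dis


def add(a, b):
    # A deliberately tiny function so the disassembly stays short.
    return a + b


# Print a human-readable listing of the bytecode compiled for add().
dis.dis(add)

# The raw bytecode lives on the function's code object.
raw = add.__code__.co_code
print("%d bytes of bytecode" % len(raw))
```

The interpreter loop executes these instructions one at a time, which is the step a JIT such as PyPy’s tries to shortcut for hot code.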
Then there is PyPy, an interpreter for Python that is itself written in Python and yet runs Python code faster than CPython, which is written in C. So how does this work and why doesn’t everybody use it?
How much faster than CPython is PyPy?
According to the PyPy benchmarks, it’s around seven times faster. Now normal Python is no slouch, but a speedup of seven times would be pretty amazing. How does PyPy do this and what are its limitations?
PyPy is written in RPython, where the R stands for restricted; you can read about it in their online documentation. RPython places some constraints on Python so that types can be inferred and the code optimised accordingly. Those constraints apply to the code PyPy itself is written in, and the documentation spells out what’s allowed and what isn’t. For your own programs, the ultimate proof is of course whether PyPy works with your code.
An interesting architectural feature of PyPy is object spaces. Their documentation describes an object space as a library with an API, a set of operations for working with Python objects. The Standard Object Space is a complete implementation of the various built-in types and objects of Python. This architecture lets you add features for proxying, extending, changing or otherwise controlling the behaviour of all objects in a running program.
I’d installed Ubuntu 14.04 LTS on my Windows box using the free virtualisation tool VirtualBox and set up both Python 2.7 and PyPy on that for running tests. It’s a pretty fast CPU and, with little else running and 64 GB of RAM, it makes a good test bed. Sure, you can run Python etc. on Windows but for this kind of stuff I prefer to use Linux. The PyPy build, which implements Python 2.7.10, is a fairly recent 4.0.1+dfsg-1~ppa1~ubuntu14.04, Nov 20 2015, 19:34:15.
My first test involved a simple Fibonacci generator. In Python a generator looks like a function, i.e. it’s defined by def, but instead of returning a value it yields it. The difference in execution is that it retains its internal state across calls. For a function to be a generator it just needs the keyword yield; when the next() method is called it runs to the next yield statement, hands back that value and saves all the local variables. There’s a bit more to it but that’s generators.
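As a minimal sketch of that state-keeping behaviour (running_total is a made-up example, not part of the benchmark), the generator below keeps its local variables alive between calls. It uses the built-in next(), which works on both Python 2.7 and Python 3.

```python
def running_total():
    # Local state that survives between calls to next().
    total = 0
    step = 1
    while True:
        total += step
        step += 1
        # Execution pauses here and resumes on the following next() call.
        yield total


gen = running_total()
first = next(gen)   # runs up to the first yield
second = next(gen)  # resumes after the yield with total and step intact
third = next(gen)
print("%d %d %d" % (first, second, third))  # prints "1 3 6"
```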
So I created a simple Fibonacci generator and called it a few hundred times to let it spend a bit of time and I used time.time() to get a start and end time. The code is listed below.
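import time

def fibonacci():
    beforePrevious = 1
    previous = 1
    #count = 0
    while True:
        nextValue = beforePrevious + previous
        beforePrevious = previous
        previous = nextValue
        #count = count +1
        #if count %100==0:
        #    print count,nextValue
        yield nextValue

start_time = time.time()
generator = fibonacci()
for i in xrange(750):
    generator.next()  # advance the generator one step
print "Time = ", time.time() - start_time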
Uncomment the lines with count to see the values. It’s pretty quick executing code but surprisingly the PyPy execution takes three times as long.
A better test
You can’t judge based on a single benchmark so I looked around for a better and more complicated example. A developer by the name of Chad Dotson had done a node.js vs Python vs PyPy comparison a year ago, so I thought I’d try out his code, which solves the classic problem of placing 8 queens on an 8 x 8 chess board so that no queen attacks any other.
The board size is set by a command line parameter (add -d if you want to see the boards visually) but the number of solutions grows exponentially so keep it in the range 8-14 or you’ll be waiting a long time for it to finish. While there are 92 solutions for 8 x 8, it gets 724 on 10 x 10 and 14,200 for 12 x 12. 14 x 14 took 35.718 seconds in PyPy for 365,596 solutions, and I didn’t bother trying that out in Python.
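For readers who haven’t met the problem, here is a minimal backtracking sketch that counts solutions. It is not Chad Dotson’s program; count_queens and its helper names are my own illustration of the general technique.

```python
def count_queens(n):
    """Count the ways to place n non-attacking queens on an n x n board."""
    columns = set()    # columns that already hold a queen
    diag_down = set()  # occupied diagonals, identified by row - col
    diag_up = set()    # occupied anti-diagonals, identified by row + col

    def place(row):
        # One queen per row: try every column in this row.
        if row == n:
            return 1  # all rows filled, so this is one complete solution
        found = 0
        for col in range(n):
            if col in columns or (row - col) in diag_down or (row + col) in diag_up:
                continue  # square is attacked by an earlier queen
            columns.add(col)
            diag_down.add(row - col)
            diag_up.add(row + col)
            found += place(row + 1)
            # Backtrack: lift the queen before trying the next column.
            columns.remove(col)
            diag_down.remove(row - col)
            diag_up.remove(row + col)
        return found

    return place(0)


for size in (4, 6, 8):
    print("%d x %d: %d solutions" % (size, size, count_queens(size)))
```

The 8 x 8 case reports the 92 solutions mentioned above. Because the work is all pure-Python loops and set operations, it is the kind of code a JIT has a chance to speed up.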
For 10 x 10 Python took 0.38809 seconds while PyPy took 0.18674; for 11 x 11 the times were 1.9754 (Python) and 0.35597 (PyPy). For 12 x 12 it was 11.96654 (Python) and 1.26897 (PyPy). These figures show that PyPy works better with longer, more complex calculations, possibly due to PyPy optimising as it goes along. At the other extreme, for 8 x 8, PyPy took 0.10894 seconds but Python was a nippy 0.02092 seconds, about five times faster!
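The ratios behind these figures can be checked with a few lines of Python. The timings are copied from the paragraph above; the timings dictionary and the speedup helper are just names chosen for this sketch.

```python
# Seconds per run as (CPython, PyPy), keyed by board size.
timings = {
    8: (0.02092, 0.10894),
    10: (0.38809, 0.18674),
    11: (1.9754, 0.35597),
    12: (11.96654, 1.26897),
}


def speedup(cpython_seconds, pypy_seconds):
    # Greater than 1 means PyPy was faster; less than 1 means it was slower.
    return cpython_seconds / pypy_seconds


for size in sorted(timings):
    cpython_seconds, pypy_seconds = timings[size]
    ratio = speedup(cpython_seconds, pypy_seconds)
    print("%2d x %-2d  PyPy speedup: %.2fx" % (size, size, ratio))
```

The ratio climbs from roughly 2x at 10 x 10 to over 9x at 12 x 12, and drops below 1 for the 8 x 8 board where the run is too short for PyPy to pay back its start-up cost.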
I tried this program below to read and parse the Linux password file, but again I think it was too simple even when run 500 times in a loop. The PyPy version took about 50% longer to run. Uncomment the print line to see the output.
import time

start_time = time.time()
for i in range(500):
    pswd = open("/etc/passwd", "r")
    for aLine in pswd:
        # Each line holds colon-separated fields.
        fields = aLine.split(":")
        #print fields, fields
    pswd.close()
print "Time = ", time.time() - start_time
Despite a couple of slower-than-Python examples I was very impressed with PyPy. The actual speed gains depend on how much of the code is pure Python, as that is where the best gains are. If it’s calling a C-coded function then don’t expect as much.
You can run PyPy in a browser on the website pypyjs.org. It’s been compiled with Emscripten into asm.js.