42

Note: The title is deliberately provocative (to make you click on it and want to close-vote the question) and I don't want to look preoccupied.

I've been reading and hearing more and more about PyPy. It's like a linear graph.

  • Why is PyPy so special? As far as I know, implementations of dynamic languages written in the language itself aren't such a rare thing, or am I not getting something?

  • Some people even call PyPy "the future" [of Python], or see some sort of deep potential in this implementation. What exactly is the meaning of this?

mXed
sub

5 Answers

47

A good thing to be aware of when talking about the PyPy project is that it aims to provide two deliverables. The first is a JIT compiler generator. Yes, a generator: they are implementing a framework for writing implementations of highly dynamic programming languages, such as Python. The second is the actual test of this framework, the PyPy Python interpreter implementation.

Now, there are multiple answers to why PyPy is so special: the project has been under development since 2004, started as a research project rather than at a company, reimplements Python in Python, implements a JIT compiler in Python, and can translate RPython (Python code with some limitations that allow the framework to translate it to C) to a compiled binary.
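To give a feel for the kind of limitation RPython imposes: as far as I understand it, the translator has to infer a single static type for each variable and return value, so some code that is perfectly fine in ordinary Python would be rejected. The two functions below are hypothetical examples of mine, not taken from PyPy. Both run under a normal Python interpreter; only the first is the sort of thing I would expect the RPython toolchain to accept.

```python
def describe_consistent(n):
    # Every return path yields a str, so a single result type can be inferred.
    if n < 0:
        return "negative"
    return "non-negative"


def describe_mixed(n):
    # Returns an int on one path and a str on the other. This is legal
    # Python, but (assumption) not valid RPython, because the result has
    # no single static type.
    if n < 0:
        return -1
    return "non-negative"


print(describe_consistent(-5))
print(describe_mixed(-5))
```

Nothing here uses the real translator; it only illustrates the style of restriction.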

The current version of PyPy is 99% compatible with CPython version 2.5, and can run Django, Twisted and many other Python programs. There used to be a limitation of not being able to run existing CPython C extensions, but that is also being addressed with the cpyext module in PyPy. C API compatibility is possible and to some extent already implemented. The JIT is also very real; see this pystone comparison.

With CPython:

Python 2.5.5 (r255:77872, Apr 21 2010, 08:44:16) 
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from test import pystone
>>> pystone.main(1000000)
Pystone(1.1) time for 1000000 passes = 12.28
This machine benchmarks at 81433.2 pystones/second

With PyPy:

Python 2.5.2 (75632, Jun 28 2010, 14:03:25)
[PyPy 1.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``A radioactive cat has 18
half-lives.''
>>>> from test import pystone
>>>> pystone.main(1000000)
Pystone(1.1) time for 1000000 passes = 1.50009
This machine benchmarks at 666625 pystones/second

So you can get a roughly 8x speedup (12.28 s versus 1.50 s) just by using PyPy on some calculations!
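As a quick check on the two transcripts above, the ratio can be worked out directly. This is only arithmetic on the figures already shown; the variable names are mine.

```python
# Figures copied from the two pystone transcripts above.
cpython_seconds = 12.28
pypy_seconds = 1.50009
cpython_rate = 81433.2   # pystones/second
pypy_rate = 666625.0     # pystones/second

# The speedup can be read off either the elapsed time or the throughput.
speedup_by_time = cpython_seconds / pypy_seconds
speedup_by_rate = pypy_rate / cpython_rate

print("speedup from elapsed time: %.2fx" % speedup_by_time)
print("speedup from pystones/sec: %.2fx" % speedup_by_rate)
```

Both ratios come out a little above 8, so the two ways of reading the benchmark agree.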

So, as the PyPy project is slowly maturing and offering some advantages, it is attracting more interest from people trying to address speed issues in their code. An alternative to PyPy is Unladen Swallow (a Google project), which aims to speed up the CPython implementation by using LLVM's JIT capabilities, but progress on Unladen Swallow was slowed because the developers needed to deal with bugs in LLVM.

So, to sum it up, I guess PyPy is regarded as the future of Python because it separates the language specification from the VM implementation. Features introduced in, e.g., Stackless Python could then be implemented in PyPy with very little extra effort, because it's just altering the language specification while the shared code stays the same. Less code, fewer bugs, less merging, less effort.

And by writing, for example, a new bash shell implementation in RPython, you could get a JIT compiler for free and speed up many Linux shell scripts without needing any deep JIT knowledge.

user137673
  • 13
    Let me get this straight. A Python interpreter written in Python and running on another Python interpreter written in, say, C can execute other Python programs faster than the one it is itself running on? If so, it seems like you could do that for a few more levels and have something unbelievably fast... – martineau Nov 23 '10 at 13:27
  • 13
    The PyPy Python interpreter isn't fast when you run it on CPython. It's fast when you bootstrap it from another Python interpreter to be able to run *by itself*. So adding more interpreters to the stack won't make anything faster, sorry. :) – Jean-Paul Calderone Jan 19 '11 at 14:27
  • 4
    To clear this up: the PyPy interpreter is an executable. It is written in RPython, but that RPython is compiled by the PyPy translator. You do essentially bootstrap it when you compile the PyPy interpreter (because you'll probably be using CPython to run the translator that compiles the interpreter) but once the interpreter is compiled, it's stand-alone. So just because it's written in Python doesn't mean it has to be run on a Python interpreter. – John Thompson Apr 23 '12 at 18:32
  • @martineau - Yes. Like the film Inception but in reverse. A python in a python in a python, as carefully documented here http://mythologian.net/ouroboros-symbol-of-infinity . – Robino Nov 07 '17 at 09:04
22

Let's see it this way... Suppose you want to implement your own dynamic language, and you want to make it fast. You have two options: the hard way and PyPy.

The hard way means writing your interpreter in C and then implementing a JIT by hand, also in C, using a mix of very complicated techniques such as method JITs, threaded JITs, tracing JITs, polymorphic inline caches, loop-invariant code motion, etc. Spend several years tuning it and, if you persevere and don't give up, you may end up with a fast dynamic language implementation.

Or, you can use the PyPy framework. That means writing your interpreter in Python instead of C (actually, it would be RPython, a more limited subset of Python that can be compiled to C). Once you have written your interpreter, PyPy will automatically generate a JIT for free. And you are basically done.

Sounds good, huh?
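To make "writing your interpreter in Python" concrete, here is a minimal sketch of the kind of dispatch loop you would write. The toy instruction set, the opcode names and the `interpret` function are all invented for this illustration, and the code is ordinary Python rather than verified RPython. A real RPython interpreter would also add a few hints so the JIT generator can find the loop; I'm leaving those out rather than guess at the exact API.

```python
# Toy register-machine interpreter. The opcodes and program encoding are
# hypothetical, made up for this example.
SET_COUNTER, ADD_TO_ACC, DEC_COUNTER, JUMP_IF_COUNTER, HALT = range(5)


def interpret(program):
    """Run a list of (opcode, argument) pairs and return the accumulator."""
    pc = 0
    acc = 0
    counter = 0
    while True:
        # In a real RPython interpreter, a JIT hint marking the top of the
        # dispatch loop would go roughly here (omitted in this sketch).
        opcode, arg = program[pc]
        pc += 1
        if opcode == SET_COUNTER:
            counter = arg
        elif opcode == ADD_TO_ACC:
            acc += counter
        elif opcode == DEC_COUNTER:
            counter -= 1
        elif opcode == JUMP_IF_COUNTER:
            # Jump back while the counter is non-zero.
            if counter != 0:
                pc = arg
        elif opcode == HALT:
            return acc
        else:
            raise ValueError("unknown opcode: %r" % (opcode,))


# Sums 10 + 9 + ... + 1 using a backward jump as the loop.
sum_program = [
    (SET_COUNTER, 10),     # index 0
    (ADD_TO_ACC, 0),       # index 1: loop body starts here
    (DEC_COUNTER, 0),      # index 2
    (JUMP_IF_COUNTER, 1),  # index 3: back to index 1 while counter != 0
    (HALT, 0),             # index 4
]

print(interpret(sum_program))  # prints 55
```

The point is that this loop is all you write by hand; the promise of the framework is that the JIT for the toy language is derived from it rather than written separately.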

Luis
  • +1 for `Suppose you want to implement your own dynamic language, and you want to make it fast. You have two options: the hard way and pypy.` – IronManMark20 Apr 01 '15 at 03:04
11

The cool thing about PyPy (aside from being fast and written in RPython, a subset of the Python language, so basically bootstrapped) is that it can provide an automatically created JIT (just-in-time compiler) for any interpreter you write in RPython: this makes it ideal for quickly implementing your own language and having it be rather fast.

Read more here

Isaac
5

Not to mention that they just recently exceeded the speed of CPython on some benchmarks. See their blog, I think. I can't reach it from here:

http://morepypy.blogspot.com/

gomad
4

Since most of us agree that it's easier to write Python than C, a Python interpreter that's written in Python (well, technically RPython) should be much easier to modify, and with fewer bugs, than CPython.

Xiong Chiamiov