797

I've been hearing a lot about the PyPy project. They claim it is 6.3 times faster than the CPython interpreter on their site.

Whenever we talk about dynamic languages like Python, speed is one of the top issues. To solve this, they say PyPy is 6.3 times faster.

The second issue is parallelism, the infamous Global Interpreter Lock (GIL). For this, PyPy says it can give GIL-less Python.

If PyPy can solve these great challenges, what are its weaknesses that are preventing wider adoption? That is to say, what's preventing someone like me, a typical Python developer, from switching to PyPy right now?

Peter Mortensen
chhantyal
  • 42
    Purged comments because most were things that should either be fleshed out in answers (and in some cases are), or shouldn't be said at all. Also edited to address a couple of the concerns raised regarding the subjectivity of this question. **Please try to answer using facts, and back up assertions with sources if possible!** – Shog9 Sep 23 '13 at 23:14
  • 5
    I've been using Pypy a lot. It tends to work very well. However, while Pypy is quite a bit faster for many CPU-heavy workloads, it's actually slower for the I/O-heavy workloads I've thrown at it. For example, I wrote a deduplicating backup program called backshift. For an initial backup, which does lots of file chunking, pypy is great. But for subsequent backups which are mostly just updating timestamps, CPython is faster. – dstromberg Oct 24 '14 at 20:16

12 Answers

722

NOTE: PyPy is more mature and better supported now than it was in 2013, when this question was asked. Avoid drawing conclusions from out-of-date information.


  1. PyPy, as others have been quick to mention, has tenuous support for C extensions. It has support, but typically at slower-than-Python speeds and it's iffy at best. Hence a lot of modules simply require CPython. Check the list of supported packages, but look at the date that list was updated, because it's not kept in lockstep with actual support, so it's still possible that packages marked unsupported on that list are actually supported.
  2. Python support typically lags a few versions behind, so if you absolutely need the latest features, you may need to wait a while before PyPy supports them.
  3. PyPy sometimes isn't actually faster for "scripts", which a lot of people use Python for. These are the short-running programs that do something simple and small. Because PyPy is a JIT compiler, its main advantages come from long run times and simple types (such as numbers). PyPy's pre-JIT speeds can be bad compared to CPython.
  4. Inertia. Moving to PyPy often requires retooling, which for some people and organizations is simply too much work.

Those are the main reasons that affect me, I'd say.
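The warm-up point in item 3 can be observed directly. The following is a minimal sketch, not a rigorous benchmark: `hot_loop` and `time_rounds` are hypothetical helpers invented for illustration, and the loop body is an arbitrary numeric workload. It times the same function over several rounds so that any difference between early and late rounds becomes visible.

```python
import platform
import time


def hot_loop(n):
    """Simple numeric loop of the kind a JIT tends to handle well."""
    total = 0
    for i in range(n):
        total += i * i % 7
    return total


def time_rounds(rounds=5, n=200_000):
    """Run the same call several times and return seconds per round."""
    timings = []
    for _ in range(rounds):
        start = time.perf_counter()
        hot_loop(n)
        timings.append(time.perf_counter() - start)
    return timings


timings = time_rounds()
print(platform.python_implementation())
for round_no, seconds in enumerate(timings, start=1):
    print(f"round {round_no}: {seconds * 1000:.2f} ms")
```

Running the same file under both interpreters shows whether the early rounds cost more than the later ones. A short script that exits after its first round never gets past that initial phase.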

Mike 'Pomax' Kamermans
Veedrac
  • 19
    Nice that you mention retooling. My web host, for example, has a choice between Python 2.4 and 2.5; and a "major producer of entertainment software" near me is using 2.6 with no plans to upgrade soon. Sometimes it can be a major, costly effort to even discover the cost of a conversion. – Mike Housky Sep 22 '13 at 18:46
  • 23
    PyPy being "as fast as C" is more about generic C than highly optimized multithreaded cache-aware C libraries used for numerics. For numerics, Python is just used to ferry around pointers to big arrays. So PyPy being "as fast as C" means "your pointers+metadata get moved around as fast as C". Not a big deal. Then why bother with Python at all? Go look at the function signatures in cblas and lapacke. – cjordan1 Sep 22 '13 at 20:23
  • @MikeHousky: That's horrible. BOTH those versions are incredibly old. 2.7 is the most recent Python2 version and you really want 2.6+ as a developer to not go mad quickly. – ThiefMaster Sep 22 '13 at 20:35
  • 13
    @cjordan1: I don't get what you're saying. The high level numpy constructs are extremely expressive (`np.sum(M[1:2*n**2:2, :2*n**2] * M[:2*n**2:2, :2*n**2].conjugate(), axis=1)`?) in Python and that makes Python very suitable for the scientific community. Additionally, doing the non-intensive parts in Python and shelling out to C for the smaller intensive loops is a common and usable strategy. – Veedrac Sep 22 '13 at 20:35
  • 27
    @Veedrac That's what I meant. As in "Go look at the function signatures in cblas and lapacke" because they're so long and difficult to use that you'll instantly understand why we use Python to ferry around the pointers and metadata. – cjordan1 Sep 22 '13 at 20:39
  • 1
    @Veedrac Those constructs are powerful, but translating them to the C implementation calls probably doesn't take that much time. I believe what cjordan1 meant to say by "just used to ferry around pointers" was that no *substantial* work is done in Python, which holds true. (Maybe using "mostly" instead of "just" would've been better.) – millimoose Sep 22 '13 at 21:25
  • 1
    @millimoose I'm not sure what you mean by "substantial" but a lot of Numpy/Pandas/etc. *is* in Python, just not the stuff that needs to be fast. Heck, Pandas is even in Cython. – Veedrac Sep 22 '13 at 21:36
  • 1
    PyPy isn't supposed to be (and isn't designed to be) run in non-JITed mode. By analogy, you run your Java on a JVM that supports JIT. – Robert Zaremba Sep 24 '13 at 10:11
  • @RobertZaremba I was talking about how the non-JITed portions of code, aka. when the code path hasn't warmed up, are quite slow relative to CPython (partially because of the overheads of the JIT, although Javascript manages to make-do). – Veedrac Sep 24 '13 at 10:15
  • 'Thirdly, PyPy isn't actually faster for "scripts"'? This is not true. Where did you pick up this misconception? – tommy.carstensen Apr 25 '15 at 20:43
  • 5
    @tommy.carstensen This isn't really a good place to go in depth, but I'll try. **1.** This was a lot more true when I wrote it than it is now. **2.** "Scripts" are oft IO-heavy. PyPy's IO is still often slower than CPython's - it used to be significantly slower. **3.** PyPy used to be slower than CPython at handling strings - now it's often better and rarely worse. **4.** Many "scripts" are just glue code - making the interpreter faster won't improve overall runtimes in that case. **5.** PyPy's warmup times used to be larger - short running scripts rarely managed to produce a lot of hot code. – Veedrac Apr 26 '15 at 08:11
  • 1
    @Veedrac I have a script for processing 20 GB of XML data, where PyPy is 15 times faster than Python. – ostrokach May 14 '16 at 22:22
  • 1
    @ostrokach Are you commenting on a particular claim I made? Do note a *lot* has changed since late 2013. – Veedrac May 14 '16 at 23:35
  • 4
    I remember reading this page a few weeks ago, and the message was that PyPy is not worth it. But then tried it on a script that I would usually have to leave overnight, and found that **PyPy makes a huge difference**. So I wanted to share this finding and encourage others to give it a try. The *PyPy isn't actually faster for "scripts"* is misleading... – ostrokach May 15 '16 at 00:07
  • 2
    @ostrokach I'm hesitant to change an answer significantly after the attention it's gotten (99.8% of votes are positive, so people agree with it rather strongly as phrased), but I did just update the phrasing there a little. – Veedrac May 15 '16 at 00:39
  • "tenuous support for C extensions" - is that still true? I've heard that there were improvements – Martin Thoma Mar 05 '20 at 01:09
  • 1
    In late 2020, **this answer is highly misleading and arguably unfactual.** *No* big ticket blockers inhibiting widespread migration from CPython to PyPy remain. "Tenuous support for C extensions" is no longer the case at all. It should also be noted that the [automated list of PyPy-compatible packages](http://packages.pypy.org) does *not* correspond to reality. Most packages listed as incompatible actually are compatible, including [SciPy](https://stackoverflow.com/a/60289841/2809027), [scikit-learn](https://github.com/scikit-learn/scikit-learn/pull/11010), and Pandas. This needs a full-scale rewrite. – Cecil Curry Dec 19 '20 at 01:38
  • @CecilCurry I can empathize, but the answer has for a while headed with a note warning as much. While I don't want to change the overall gist like you're asking, I am happy to add clarifications and improve the header. I'll add something about the compatible packages list being incomplete. – Veedrac Dec 19 '20 at 07:03
114

That site does not claim PyPy is 6.3 times faster than CPython. To quote:

The geometric average of all benchmarks is 0.16 or 6.3 times faster than CPython

This is a very different statement to the blanket statement you made, and when you understand the difference, you'll understand at least one set of reasons why you can't just say "use PyPy". It might sound like I'm nit-picking, but understanding why these two statements are totally different is vital.

To break that down:

  • The statement they make only applies to the benchmarks they've used. It says absolutely nothing about your program (unless your program is exactly the same as one of their benchmarks).

  • The statement is about an average of a group of benchmarks. There is no claim that running PyPy will give a 6.3 times improvement even for the programs they have tested.

  • There is no claim that PyPy will even run all the programs that CPython runs at all, let alone faster.
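To see why a single averaged figure says little about any one program, consider a small sketch. The numbers in `normalised_times` are made up for illustration and are not PyPy's published results. Each value is a benchmark's run time on PyPy divided by its run time on CPython, so a value below 1.0 means PyPy was faster.

```python
import math

# Hypothetical per-benchmark times, normalised so that CPython = 1.0.
# These are illustrative values, NOT real measurements.
normalised_times = [0.02, 0.10, 0.16, 0.50, 1.60]


def geometric_mean(values):
    """n-th root of the product, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def arithmetic_mean(values):
    """Plain average, shown for comparison."""
    return sum(values) / len(values)


geo = geometric_mean(normalised_times)
arith = arithmetic_mean(normalised_times)

print(f"geometric mean:  {geo:.3f} (about {1 / geo:.1f}x faster)")
print(f"arithmetic mean: {arith:.3f} (about {1 / arith:.1f}x faster)")
print(f"slowest case:    {max(normalised_times):.2f} (slower than CPython)")
```

The two averages give different headline numbers for the same data, and the set still contains a case where the hypothetical PyPy run is slower than CPython. The summary figure hides that.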

spookylukey
  • 20
    Of course there is no claim that PyPy will run all Python code faster. But if you take all pure Python applications, I can bet that a significant majority of them will run much faster (>3x) on PyPy than on CPython. – Robert Zaremba Sep 24 '13 at 10:07
  • 29
    Neither of your first two bullet points make sense. How can you say that benchmarks say "absolutely nothing about your program". It's pretty obvious that benchmarks aren't a perfect indicator of all real applications, but they can definitely be useful as an indicator. Also I don't understand what you find misleading about them reporting the average of a group of benchmarks. They state pretty clearly it's an average. If a programmer doesn't understand what an average is then they have much more serious concerns than language performance. – Sean Geoffrey Pietz Apr 13 '14 at 23:12
  • 1
    Also your claim "That site does not claim PyPy is 6.3 times faster than CPython" seems purely semantic. When he says that, there is a pretty clear (at least to me) implication he is talking about an average. if the benchmarks aren't 6.3 times faster, then how much faster are they? – Sean Geoffrey Pietz Apr 13 '14 at 23:18
  • 9
    @SeanGeoffreyPietz - I wasn't claiming PyPy's site was in any way misleading - they have presented their results accurately. But the original question misquoted them, and was demonstrating that the author didn't understand the importance of the word 'average'. Many of the individual benchmarks are not 6.3 times faster. And if you use a different type of average you get a different value, so "6.3 x faster" is not an adequate summary of "geometric average is 6.3 x faster". "Group A is Z times faster than group B" is too vague to be meaningful. – spookylukey Apr 17 '14 at 20:09
  • 13
    -1: @spookylukey You seem to suggest that the benchmark suite is biased without providing evidence to support the claim. Criticism should always be backed up with evidence! – Evgeni Sergeev Jul 25 '14 at 00:32
  • 9
    @EvgeniSergeev - no, I'm implying that all benchmarks are biased! Not necessarily deliberately, of course. The space of possible useful programs is infinite and incredibly varied, and a set of benchmarks only ever measures the performance on those benchmarks. Asking "how much faster is PyPy than CPython?" is like asking "how much faster is Fred than Joe?", which is what the OP seems to want to know. – spookylukey Jul 25 '14 at 09:58
  • 1
    Here is an example of PyPy being slower than CPython 3: this code took 218 ms on PyPy but 124 ms on Python 3: http://codeforces.com/contest/278/submission/9820817 . This is a rare example; most other code takes much less time on PyPy than on Python. – Mohamed El-Nakeep Feb 14 '15 at 20:57
  • 1
    The benchmarks show the main areas where they have made progress and gained speed over CPython. However, on the same website they acknowledge that PyPy lags behind CPython in speed in 4 cases, namely: CPython C extension modules, missing RPython modules, abuse of itertools, and ctypes. These cases are the focus of their future work. – Mohamed El-Nakeep Feb 14 '15 at 23:29
  • To clarify my answer, I'm talking about the statistics fallacy known as the danger of summary metrics. As an example, in one of my Python libraries, PyPy runs the test suite about 3 times slower than CPython, yet benchmarks on the same library show PyPy to be 2 to 10 faster. This is completely normal for PyPy. It is pointless to try and represent this using a single summary metric - and that's just a single library – spookylukey Aug 27 '22 at 07:05
90

Because PyPy is not 100% compatible, takes 8 GB of RAM to compile, is a moving target, and is highly experimental, whereas CPython is stable, has been the default target for module builders for two decades (including C extensions that don't work on PyPy), and is already widely deployed.

PyPy will likely never be the reference implementation, but it is a good tool to have.

Tritium21
  • 2
    According to http://pypy.org/download.html, PyPy needs 4 GB of RAM to compile (on a 64-bit system), not 8. And there's an option on that page to do it under 3 GB if needed. – knite Oct 29 '15 at 23:11
  • 4
    @knite 1: that's new as of 2015, the documentation has historically read 8 GB. 2: in practice in 2015 you still need at least 8, with 6-7 free. – Tritium21 Oct 29 '15 at 23:28
  • 5
    The memory requirement to compile is not so relevant if you use a [build or distribution](http://pypy.org/download.html). As to "moving target, and highly experimental", can you give a couple of examples of stuff that breaks? Again, if people are using release builds rather than nightly builds or source, don't they have a reasonable expectation of functionality? – smci Apr 22 '17 at 11:55
  • 1
    @smci This is an ancient question based on ancient data, with ancient answers. Consider this question and every answer to be historical for the state of pypy 4 years ago. – Tritium21 Apr 23 '17 at 01:06
  • 2
    @Tritium21: I'm only interested in the current answer. What is it? You might like to edit your answer to say *"As of 2013, comparing pypy vs version 2.x of Python was..."* Also if the "6.3x geometric-average" claim in the question is out-of-date ([as of 4/2017 they claim 7.5x, but even then depends on the benchmarks...](http://speed.pypy.org)), then that needs editing too (version numbers, latest data, etc.) I think the benchmark suite is not very relevant, hardly anyone would run raytracing in a scripting language on a CPU these days. I did find https://pybenchmarks.org – smci Apr 23 '17 at 01:15
  • @smic For a more up to date answer, ask the question again, and try closing this one as a duplicate of it. The question itself is out of date. – Tritium21 Apr 23 '17 at 01:21
41

The second question is easier to answer: you basically can use PyPy as a drop-in replacement if all your code is pure Python. However, many widely used libraries (including some of the standard library) are written in C and compiled as Python extensions. Some of these can be made to work with PyPy, some can't. PyPy provides the same "forward-facing" tool as Python --- that is, it is Python --- but its innards are different, so tools that interface with those innards won't work.
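One rough way to gauge how "pure Python" a program is in practice is to list which of its imported modules are compiled extensions. The sketch below uses only the standard library. The helper name `compiled_extension_modules` is made up for illustration, and the check is a heuristic based on file suffixes, so it will miss modules that are built into the interpreter binary itself.

```python
import importlib.machinery
import platform
import sys


def compiled_extension_modules():
    """Return names of already-imported modules loaded from extension files."""
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    names = []
    for name, module in list(sys.modules.items()):
        path = getattr(module, "__file__", None)
        # Built-in modules have no __file__ and are skipped here.
        if path and path.endswith(suffixes):
            names.append(name)
    return sorted(names)


print("interpreter:", platform.python_implementation())
print("compiled extensions loaded:", compiled_extension_modules())
```

Calling the helper at the end of a program, after all of its imports have run, gives a list of the third-party and standard-library pieces that depend on the interpreter's C-level interface. Those are the ones worth checking first for PyPy compatibility.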

As for the first question, I imagine it is sort of a Catch-22 with the first: PyPy has been evolving rapidly in an effort to improve speed and enhance interoperability with other code. This has made it more experimental than official.

I think it's possible that if PyPy gets into a stable state, it may start getting more widely used. I also think it would be great for Python to move away from its C underpinnings. But it won't happen for a while. PyPy hasn't yet reached the critical mass where it is almost useful enough on its own to do everything you'd want, which would motivate people to fill in the gaps.

BrenBarn
  • 20
    I don't think C is a language that is going anywhere any time soon (I would be willing to say it will not disappear in our lifetime). Until there is another language that will run anywhere, we will have C. (Note: the JVM is written in C. Even Java, the language that "runs everywhere", needs C for its everywhereness.) Otherwise I agree with this post on most of its points. – Tritium21 Sep 22 '13 at 17:35
  • 9
    @Tritium21: Yeah, I'm just editorializing there. I'm fine with C existing, but I think Python's dependence on C is hugely detrimental, and PyPy is a great example of why: now we have the chance to get faster Python, but we're tripped up by years of relying on C. It'd be much better for Python to stand on its own two feet. It's even okay if Python itself is written in C, but the problem is the existence of an extension mechanism that encourages people to extend Python in ways that depend on C. – BrenBarn Sep 22 '13 at 17:39
  • 4
    double edge sword on that - part of what made python so popular is its ability to extend other applications and be extended by other applications. If you take that away, I don't think we would be talking about python. – Tritium21 Sep 22 '13 at 17:42
  • If you take away C support we wouldn't have a lot of C bindings to lower level libraries that only exist because it's easy to do. Stuff like Xlib is useful, and dropping support for that to help JIT'd interpreters is a bit backwards. Let the demand speak for itself, I say. Maybe it'd be different if PyPy's JIT was as fast as modern overfunded Javascript interpreters'. – Veedrac Sep 22 '13 at 17:46
  • 12
    @BrenBarn It is utter folly to claim that Python's dependence on C is detrimental. Without the C-API of Python, most of the really powerful libraries and great interop that Python gained in its formative teenage years (late 90s), including the entire numeric/scientific ecosystem and GUI interfaces, would not have been possible. Look around to get some perspective on the whole universe of usages of Python, before making such blanket statements. – Peter Wang Sep 23 '13 at 03:51
  • 4
    @PeterWang All those libraries can be written in Python, however they wouldn't be as fast as they are. What BrenBarn is saying is that now we have a chance to make python fast enough so that those libs can be written in python but we are refusing to take that chance, because taking it means losing the ability to use the C libraries. I believe that's what he meant by detrimental, not that the existence of C libraries is a bad thing but that the only way to make fast libraries is using C. – vikki Sep 23 '13 at 11:30
  • 1
    Right, basically what vikki said. It's been great for Python that C interfacing is *possible*. What has been bad for Python is that C interfacing was *necessary* for so many things, and what is still bad for Python is that C interfacing is now even more necessary due to accumulated dependence on C-based modules. – BrenBarn Mar 11 '14 at 20:30
  • How difficult would it be to write a JIT compiler for python that supports C extensions? – Sean Geoffrey Pietz Apr 13 '14 at 23:28
  • 1
    @SeanGeoffreyPietz The JIT wouldn't be able to reason about the C part, and thus probably can't safely optimize away a lot of calls. This might make it slower than CPython. Oh, and it would have to expose the same API as CPython, which would probably gum up a lot of things. – leewz Apr 14 '14 at 23:39
  • @leewangzhong I'm not very knowledgable about compilers so please excuse me if this is a dumb question, but how come LuaJIT is able to achieve a tracing JIT compiler that works with a C api if PyPy can't? – Sean Geoffrey Pietz Apr 14 '14 at 23:47
  • @SeanGeoffreyPietz I'm just saying reasons why it wouldn't be as performant as they'd like, and they REALLY like performance. Looks like PyPy DOES have something similar to Lua's C interface, since CFFI is from Lua's FFI. http://cffi.readthedocs.org/en/release-0.8/ – leewz Apr 15 '14 at 00:18
  • @SeanGeoffreyPietz Your question should probably be posted as one, but from what I'm seeing, the issue is that LuaJIT's C API is for Lua interacting with C code, while CPython's C modules are for C messing around with the Python runtime. Lua also had something like CPython C modules, and LuaJIT's C is NOT made with the same perspective as the C modules (I think). – leewz Apr 15 '14 at 00:24
  • More info: http://pypy.readthedocs.org/en/latest/faq.html#do-cpython-extension-modules-work-with-pypy – leewz Apr 15 '14 at 00:26
15

I did a small benchmark on this topic. While many of the other posters have made good points about compatibility, my experience has been that PyPy isn't that much faster for just moving around bits. For many uses of Python, it really only exists to translate bits between two or more services. For example, not many web applications are performing CPU intensive analysis of datasets. Instead, they take some bytes from a client, store them in some sort of database, and later return them to other clients. Sometimes the format of the data is changed.

The BDFL and the CPython developers are a remarkably intelligent group of people and have managed to help CPython perform excellently in such a scenario. Here's a shameless blog plug: http://www.hydrogen18.com/blog/unpickling-buffers.html . I'm using Stackless, which is derived from CPython and retains the full C module interface. I didn't find any advantage to using PyPy in that case.

Eric Urban
  • 1
    PyPy has many, carefully run [benchmarks](http://speed.pypy.org/changes/) (unlike CPython unfortunately, which doesn't really have a user-facing benchmark suite at the moment). Of course for network traffic PyPy can't magically make anything faster. – Julian Sep 22 '13 at 19:23
  • 2
    Julian, it's worth noting that the PyPy folks have been focusing a lot of effort on improving the runtimes of that particular benchmark suite for years now. To some degree it seems that they are "overfitting" their optimizations to this set of benchmarks and, in my experience, aside from purely numerical computations (which are better off in Fortran or C99 anyway), I've never gotten PyPy to be more than ~2X faster than CPython. – Alex Rubinsteyn Sep 22 '13 at 19:32
  • 9
    @AlexRubinsteyn But the view of those working on PyPy has always generally been that if you find a case where PyPy is slower than CPython, and you can turn it into a reasonable benchmark, it has a good chance of being added to the suite. – gsnedders Sep 22 '13 at 22:42
  • 1
    I checked your blog. In your results, the plain-python pair of (pickle, StringIO) shows that pypy is ~6.8x faster over cpython. I think this is a useful result. In your conclusion, you point out (correctly) that pypy code (which is plain python!) is slower than C code (cPickle, cStringIO), not cpython code. – Caleb Hattingh Feb 12 '14 at 21:31
  • 2
    @gsnedders I have offered a benchmark based on [rinohtype](http://www.mos6581.org/rinohtype/) on [multiple](https://mail.python.org/pipermail/pypy-dev/2015-August/013769.html) [occasions](https://bitbucket.org/pypy/pypy/issues/2365/rinohtype-much-slower-on-pypy3#comment-34810420). They have not yet added it to the suite. – Brecht Machiels Apr 07 '17 at 13:16
15

Q: If PyPy can solve these great challenges (speed, memory consumption, parallelism) in comparison to CPython, what are its weaknesses that are preventing wider adoption?

A: First, there is little evidence that the PyPy team can solve the speed problem in general. Long-term evidence shows that PyPy runs certain Python code slower than CPython, and this drawback seems to be rooted very deeply in PyPy.

Secondly, the current version of PyPy consumes much more memory than CPython in a rather large set of cases. So PyPy didn't solve the memory consumption problem yet.

Whether PyPy solves the mentioned great challenges and will in general be faster, less memory hungry, and more friendly to parallelism than CPython is an open question that cannot be solved in the short term. Some people are betting that PyPy will never be able to offer a general solution enabling it to dominate CPython 2.7 and 3.3 in all cases.

If PyPy succeeds in being better than CPython in general, which is questionable, the main weakness affecting its wider adoption will be its compatibility with CPython. There are also issues such as the fact that CPython runs on a wider range of CPUs and OSes, but these are much less important than PyPy's performance and CPython-compatibility goals.


Q: Why can't I do drop in replacement of CPython with PyPy now?

A: PyPy isn't 100% compatible with CPython because it isn't simulating CPython under the hood. Some programs may still depend on CPython's unique features that are absent in PyPy, such as C bindings, C implementations of Python objects and methods, or the incremental nature of CPython's garbage collector.

11

CPython has reference counting and garbage collection, PyPy has garbage collection only.

So objects tend to be deleted earlier and `__del__` is called in a more predictable way in CPython. Some software relies on this behavior and is thus not ready to migrate to PyPy.

Some other software works with both, but uses less memory with CPython, because unused objects are freed earlier. (I don't have any measurements to indicate how significant this is and what other implementation details affect the memory use.)
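Code that depends on the timing of `__del__` can usually be restructured so that cleanup no longer depends on when the object is collected. The following is a minimal sketch using a context manager. `Resource` is a toy class invented for illustration and stands in for a file, socket, or similar handle.

```python
class Resource:
    """Toy resource that records whether it has been released."""

    def __init__(self):
        self.closed = False

    def close(self):
        """Release the resource explicitly."""
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions


with Resource() as res:
    pass  # use the resource here

# Cleanup ran when the block exited, regardless of the garbage collector.
print(res.closed)
```

Because `close()` runs when the `with` block exits, the behavior does not depend on whether the interpreter uses reference counting, so the same code should behave alike on CPython and PyPy.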

pts
  • 20
    It should be stressed that relying on `__del__` being called early or at all is wrong even in CPython. As you put it, it *usually* works and some people take that to mean it's guaranteed. If anything that references the object is caught up in a reference cycle (which is rather easy - did you know that inspecting the current exception in a certain non-contrived way creates a reference cycle?) finalization is delayed indefinitely, until the next cycle GC (which may be **never**). If the object is itself part of a reference cycle, `__del__` will not be called *at all* (prior to Python 3.4). –  Sep 23 '13 at 13:08
  • 4
    Overhead per object is higher in CPython, which matters a LOT once you start creating lots of objects. I believe PyPy does the equivalent of `__slots__` by default, for one thing. –  Jan 03 '14 at 19:55
7

For a lot of projects, there is actually 0% difference between the different Pythons in terms of speed. That is, those projects that are dominated by engineering time and where all Pythons have the same amount of library support.

Stephan Eggermont
  • 1
    If your project is that simple, then obviously it doesn't matter, but the same could be said of any implementation of any language: if all you do is aggregate other libraries' functions via relatively performant ABIs, then it's all irrelevant. –  Jan 03 '14 at 19:54
  • 2
    It doesn't have anything to do with simple. In engineering time the feedback loop is important. Sometimes much more important than run time. – Stephan Eggermont Jan 03 '14 at 22:47
  • 1
    Well, you're speaking very vaguely (engineering time with no reference to what's being engineered, what the constraints are, etc.; feedback loop with no reference to what is being fed back to whom, etc.), so I'm going to bow out of this conversation rather than trade cryptic references. –  Jan 08 '14 at 23:34
  • 1
    Nothing vague here. Take a look at the OODA loop, or PDCA. – Stephan Eggermont Jan 20 '14 at 23:10
  • "A lot of projects" is vague, "difference" is vague, "speed" is vague, "library support" is vague. OODA and PDCA? I think you're just trolling now -- it's both vague in terms of whatever implementation you're talking about (assuming you really are talking about python code), AND they're two examples out of "a lot of projects", even IF you're talking about specific implementations. –  Jan 25 '14 at 09:22
  • Stephan. I have personal experience of PyPy being a LOT faster and a LOT less memory-intensive. My use case is a bit unusual (big data), but the additional "engineering time" that you're talking about was non-existent for me. The choice is no more complicated than running via CPython, or (trivially) installing PyPy and running via that. The latter is a big win. If you're attempting to make an argument against that experience, some solid facts would help your case. The burden of proof is on you, since you're making claims that there's "0% difference". –  Feb 05 '14 at 09:24
  • You are doing big data, that is where execution speed and memory usage matters. – Stephan Eggermont Feb 05 '14 at 10:03
  • 4
    @user Well, any run-once project that takes a month to write and a minute to run will have an overall 0.0% speed-up (1 month + 1 min vs 1 month) from using PyPy, even if PyPy were a thousand times faster. Stephan wasn't claiming that all projects would have a 0% speed-up. – gmatht Mar 24 '15 at 01:17
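The arithmetic behind that last comment can be written out as a short sketch. The function name `overall_speedup` is made up for illustration, and a "month" is assumed to be 30 days. The figures (one month of development, one minute of run time, a thousandfold interpreter speed-up) are the ones from the comment.

```python
def overall_speedup(dev_seconds, run_seconds, interpreter_speedup):
    """Ratio of total project time before and after a faster interpreter."""
    before = dev_seconds + run_seconds
    after = dev_seconds + run_seconds / interpreter_speedup
    return before / after


MONTH = 30 * 24 * 60 * 60  # assumed 30-day month, in seconds
MINUTE = 60

speedup = overall_speedup(MONTH, MINUTE, 1000)
print(f"overall improvement: {(speedup - 1) * 100:.4f}%")
```

When development time is set to zero, the same function returns the full interpreter speed-up. That is the situation the "big data" commenter describes, where run time dominates.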
6

To make this simple: PyPy provides the speed that CPython lacks but sacrifices compatibility. Most people, however, choose Python for its flexibility and its "batteries-included" nature (high compatibility), not for its speed (though speed is still welcome).

Yishen Chen
5

I've found examples where PyPy is slower than Python, but only on Windows.

C:\Users\User>python -m timeit -n10 -s"from sympy import isprime" "isprime(2**521-1);isprime(2**1279-1)"
10 loops, best of 3: 294 msec per loop

C:\Users\User>pypy -m timeit -n10 -s"from sympy import isprime" "isprime(2**521-1);isprime(2**1279-1)"
10 loops, best of 3: 1.33 sec per loop

So, if you are considering PyPy, forget Windows. On Linux, you can achieve impressive speed-ups. Example (list all primes between 1 and 1,000,000):

from sympy import sieve
primes = list(sieve.primerange(1, 10**6))

This runs 10(!) times faster on PyPy than on Python. But not on Windows, where it is only 3x as fast.
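To repeat this kind of comparison without installing SymPy, the same measurement can be sketched with the standard library's `timeit` module and a pure-Python sieve. This is an illustrative stand-in, not the SymPy code timed above. `primes_below` is a hypothetical helper, and the limit of 100,000 is an arbitrary choice to keep the run short.

```python
import platform
import timeit


def primes_below(limit):
    """Sieve of Eratosthenes in pure Python (no third-party imports)."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * limit
    is_prime[0] = is_prime[1] = 0
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Clear every multiple of n, starting at n squared.
            is_prime[n * n::n] = bytearray(len(range(n * n, limit, n)))
    return [n for n in range(limit) if is_prime[n]]


# Each measurement runs 3 calls; keep the best of 3 measurements.
best = min(timeit.repeat(lambda: primes_below(100_000), number=3, repeat=3))
print(f"{platform.python_implementation()}: {best / 3 * 1000:.1f} ms per call")
```

Running the file once under each interpreter on the same machine prints a figure per interpreter. The comparison is only meaningful for this particular workload on that particular platform.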

lifolofi
4

PyPy has had Python 3 support for a while, but according to this HackerNoon post by Anthony Shaw from April 2nd, 2018, PyPy3 is still several times slower than PyPy (Python 2).

For many scientific calculations, particularly matrix calculations, numpy is a better choice (see FAQ: Should I install numpy or numpypy?).

PyPy does not support gmpy2. You can instead make use of gmpy_cffi, though I haven't tested its speed and the project had only one release, in 2014.

For Project Euler problems, I make frequent use of PyPy, and for simple numerical calculations `from __future__ import division` is often sufficient for my purposes, but Python 3 support is still being worked on as of 2018, with your best bet being 64-bit Linux. Windows PyPy3.5 v6.0, the latest as of December 2018, is in beta.

qwr
4

Supported Python Versions

To cite the Zen of Python:

Readability counts.

For example, Python 3.8 introduced the f-string `=` specifier.

There might be other features in Python 3.8+ which are more important to you. PyPy does not support Python 3.8+ at the moment.
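As a quick illustration of that feature, here is a minimal sketch. The variable name `user_count` is arbitrary. The snippet needs an interpreter that implements Python 3.8 or newer; an older one rejects the `=` form with a `SyntaxError` before running anything.

```python
# Requires Python 3.8+ syntax support in the interpreter.
user_count = 42

new_style = f"{user_count=}"              # self-documenting expression (3.8+)
old_style = f"user_count={user_count!r}"  # equivalent spelling before 3.8

print(new_style)
print(new_style == old_style)
```

Both strings come out the same. The newer form simply avoids repeating the variable name, which is the readability gain being referred to.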

Shameless self-advertisement: Killer Features by Python version - if you want to know more things you miss by using older Python versions

Martin Thoma
  • But is **PyPy** faster than **CPython** for the same Python versions - I can understand using python3.7 and 3.8 and getting more benefits, but if I can use `PyPy` on the side for some project, to bypass **GIL** and have faster parallel processing in case of CPU oriented processes – aspiring1 Oct 30 '20 at 07:15
  • As of today 9/OCT/21, PyPy supports (is compatible with) Python 3.7, and the team is now working toward supporting Python 3.8. Ref https://www.pypy.org/posts/2021/04/pypy-v734-release-of-python-27-and-37.html – Ghassan Maslamani Sep 08 '21 at 10:33
  • @aspiring PyPy has a GIL. – DavidW Sep 08 '21 at 13:23