Python and C++ performance comparison

Question

In a lecture I encountered the following problem: given a simple program that computes the sum of a column in a large data set, the performance of a Python and a C++ implementation is compared. The main bottleneck should be reading the data; the computation itself is rather simple. On the first execution, the Python version is about 2 times slower than the C++ version, which makes sense.

Then, on the second execution, the C++ program speeds up from 4 seconds to 1 second because apparently the "first execution is I/O bound, second is CPU bound". This still makes sense, since the file contents were probably cached, which avoids the slow read from disk.

However, the Python implementation did not speed up at all on the second run, despite the warm cache. I know Python is slow, but is it that slow? Does this mean that executing this simple computation in Python is slower than reading about 0.7 GB from disk?

If this is always the case, I'm wondering why the biggest deep learning frameworks I know (PyTorch, TensorFlow) have Python APIs. For real-time object detection, for example, passing the input to the network (reading frames from a video, maybe preprocessing them) and interpreting the output must then be slower than performing the forward propagation itself on a GPU.

Have I misunderstood something? Thank you.

As you say, TensorFlow and PyTorch have Python APIs, but they still run C++ code underneath. Python is used for its simplicity. – relay, Feb 17 '18 at 09:48
True, but isn't it a gigantic drawback to use Python just to feed the data to a model? To give an example, it seems to me that wrapping those frameworks in Python might make loading the data and calling the `predict` function slower than the C++ code that runs internally. – Gerry, Feb 17 '18 at 10:28

score 1 · Accepted Answer · answered Feb 17 '18 at 10:28

That's not easy to answer without implementation details, but in general Python is known to be much less cache-friendly, because you mostly don't have the option of low-level optimization of cache behaviour in Python. However, this isn't always the case. You can probably improve cache friendliness in Python directly, or use C++ code for the critical sections. But always keep in mind that you can simply optimize your code further in C++. So if you have really critical code paths where you want to squeeze out every percent of speed and efficiency, you should use C++. That's why many programs use both: C++ for the raw-performance parts and Python for a nice interface and program structure.
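Without the lecture's code, one way to check whether a run is I/O bound or CPU bound is to time the raw read separately from the parse-and-sum step. The following is only a minimal sketch under assumptions: the CSV layout, the helper names (`write_sample_csv`, `time_raw_read`, `time_column_sum`) and the row count are made up for illustration and are not the lecture's program. Because the file has just been written, it is most likely still in the OS page cache, so this roughly corresponds to the "second run" situation.

```python
import csv
import os
import tempfile
import time


def write_sample_csv(path, rows):
    """Write a two-column CSV (hypothetical layout); column 1 holds the values to sum."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for i in range(rows):
            writer.writerow([i, i * 0.5])


def time_raw_read(path):
    """Read the file as bytes without parsing; approximates the pure I/O cost."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        data = f.read()
    return len(data), time.perf_counter() - start


def time_column_sum(path, column):
    """Read, parse every row and sum one column in an interpreted loop."""
    start = time.perf_counter()
    total = 0.0
    with open(path, newline="") as f:
        for row in csv.reader(f):
            total += float(row[column])
    return total, time.perf_counter() - start


def main():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.csv")
        write_sample_csv(path, 200_000)
        size, read_seconds = time_raw_read(path)
        total, sum_seconds = time_column_sum(path, 1)
        print(f"raw read of {size} bytes: {read_seconds:.4f} s")
        print(f"parse + sum (total={total}): {sum_seconds:.4f} s")


if __name__ == "__main__":
    main()
```

If the second timing is much larger than the first, the per-row interpreter work rather than the disk dominates the run time, and a warm cache cannot help much. The actual numbers depend on the machine and the file, so none are given here.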

Does this mean that deep learning APIs use Python for those parts where simplicity outweighs a tiny performance drawback? – Gerry, Feb 17 '18 at 10:29
Yes, it does. (Also because those parts can be developed much more quickly in Python.) – Ben, Feb 17 '18 at 17:49
