14

Working on different projects I have the choice of selecting different programming languages, as long as the task is done.

I was wondering what the real difference is, in terms of performance, between writing a program in Python, versus doing it in C.

The tasks to be done are pretty varied, e.g. sorting text files, disk access, network access, and text file parsing.

Is there really a noticeable difference between sorting a text file using the same algorithm in C versus Python, for example?

And in your experience, given the power of current CPU's (i7), is it really a noticeable difference? (Consider that it's a program that doesn't bring the system to its knees.)

Tomer Shetah
Pablo Hevia-Koch
  • 1
    http://theunixgeek.blogspot.com/2008/09/c-vs-python-speed.html – Incognito Aug 20 '10 at 18:41
  • 1
    If you are not sure about C, have you ever considered using a decent higher-level compiled language like C++ or Java instead of Python? You know... Python is not the answer for everything. – Andrei Ciobanu Aug 20 '10 at 19:08
  • Thanks Wayne for the spelling correction :) – Pablo Hevia-Koch Aug 20 '10 at 19:43
  • 2
    In terms of speed, Python wins by far for development time. C wins by far for performance/size/memory constraints. Weigh your priorities and pick what fits. – Nick T Aug 20 '10 at 19:48
  • 2
    If the workload is tiny then even a large difference between the language implementations will not be noticeable. If the workload is huge then even a tiny difference between the language implementations will be noticeable. – igouy Aug 21 '10 at 21:17

14 Answers

38

Use Python until you have a performance problem. If you ever have one, figure out what the problem is (often it isn't what you would have guessed up front). Then solve that specific performance problem, which will likely call for an algorithm or data structure change. In the rare case that your problem really needs C, you can write just that portion in C and use it from your Python code.
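
As a rough illustration of "figure out what the problem is", the sketch below profiles a small program with the standard `cProfile` module. The functions `dedupe_with_list`, `dedupe_with_set`, and `main` are hypothetical stand-ins, not anything from the question; the point is only that the profiler, rather than a guess, shows which call dominates.

```python
import cProfile
import io
import pstats


def dedupe_with_list(items):
    # Membership tests on a list scan the list, so this is quadratic overall.
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen


def dedupe_with_set(items):
    # Set membership is a hash lookup; order of first appearance is preserved.
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def main():
    # Hypothetical workload: 20,000 values with 1,000 distinct ones.
    items = [i % 1000 for i in range(20000)]
    return dedupe_with_list(items), dedupe_with_set(items)


profiler = cProfile.Profile()
profiler.enable()
slow_result, fast_result = main()
profiler.disable()

# Collect the five most expensive calls by cumulative time into a string.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

In this sketch the fix is a data structure change (a set instead of a list), which is the kind of change the answer expects to matter more often than a rewrite in C.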

jshen
  • 3
    Look at the compiled Cython language before you write any C. Cython compiles to shared libraries that can be directly imported into Python. – Eike Aug 21 '10 at 11:33
12

C will absolutely crush Python in almost any performance category, but C is far more difficult to write and maintain, and high performance isn't always worth the trade-off of increased development time and difficulty.

You say you're doing things like text file processing, but what you omit is how much text file processing you're doing. If you're processing 10 million files an hour, you might benefit from writing it in C. But if you're processing 100 files an hour, why not use Python? Do you really need to be able to process a text file in 10ms vs 50ms? If you're planning for the future, ask yourself, "Is this something I can just throw more hardware at later?"
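
The arithmetic behind that comparison can be sketched as follows. It uses the 10ms and 50ms figures from above and assumes, purely for illustration, that files are processed one after another and that one core provides 3600 seconds of work per hour; the helper name `cpu_seconds_per_hour` is made up for this example.

```python
def cpu_seconds_per_hour(files_per_hour, ms_per_file):
    # Total processing time demanded by one hour's worth of incoming files.
    return files_per_hour * ms_per_file / 1000.0


for files_per_hour in (100, 10_000_000):
    for ms_per_file in (10, 50):
        busy = cpu_seconds_per_hour(files_per_hour, ms_per_file)
        # A single core offers 3600 seconds of work per hour.
        cores = busy / 3600.0
        print(f"{files_per_hour:>10} files/h at {ms_per_file} ms: "
              f"{busy:>9.0f} s of work, about {cores:.2f} cores")
```

At 100 files an hour, both per-file times add up to a few seconds of work. At 10 million files an hour, the same per-file difference separates roughly 28 busy cores from roughly 139.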

Writing solid code in C is hard. Be sure you can justify that investment in effort.

lazyconfabulator
10

In general, I/O-bound work will depend more on the algorithm than on the language. In this case I would go with Python because it has first-class strings and lots of easy-to-use libraries for manipulating files, etc.

ChaosPandion
  • 3
    +1: Unless it involves many compute-intensive loops, the limiting factors always seem to be OS resources like file systems and process slots and memory. – S.Lott Aug 20 '10 at 18:35
  • @Jacek - Good point. You can also write high performance code in C for portions of your project as @jshen noted in his answer. – ChaosPandion Aug 20 '10 at 18:49
  • "(processed by built-in functions) then Python performance will usually be comparable to that of C" ... Oh please. I get what you are saying, but... oh please. – JustBoo Aug 20 '10 at 19:34
  • @JustBoo: Careful of bashing Python. The list sort in Python is amazingly fast. The built-in functions are -- in many cases -- syntactic sugar over the C library. If Python is used wisely, it isn't inherently slow. Of course, bad choice of algorithm can make any language horribly slow. – S.Lott Aug 20 '10 at 20:33
7

Is there really a noticeable difference between sorting a text file using the same algorithm in C versus Python, for example?

Yes.

The noticeable differences are these:

  1. There's much less Python code.

  2. The Python code is much easier to read.

  3. Python supports really nice unit testing, so the Python code tends to be higher quality.

  4. You can write the Python code more quickly, since there are fewer quirky language features. Having no preprocessor, for example, really saves a lot of hacking around. Very experienced C programmers hardly notice it, but all that #include sandwich stuff and making the .h files correct is remarkably time-consuming.

  5. Python can be easier to package and deploy, since you don't need a big fancy make script to do a build.

S.Lott
  • 1
    2. Opinion. Some people hate the whitespace-based code. 3. Nothing to do with Python, C has unit testing libraries. 4. Header files aren't really time consuming if you have a good editor. 5. No. There is nothing simpler than C for packaging because Makefiles are very simple. – alternative Aug 20 '10 at 18:41
  • Up until I started messing around with the *Quake 3* source code I hadn't realized how much ceremony is involved with complex C projects. *(Previously I had only played around with micro-controllers.)* – ChaosPandion Aug 20 '10 at 18:41
  • I forgot: -1 because the entire answer has nothing to do with performance. – alternative Aug 20 '10 at 18:42
  • Readability DOES have to do with being able to clearly see where a function ends. The whole thing is an opinion. Some people like the brief code - Others like the so called repetition that shows your intent. – alternative Aug 20 '10 at 18:50
  • 5
    @mathepic: performance? The programmer's performance is often the most expensive part of software development. Are you saying that optimizing the programmer's time has no value? – S.Lott Aug 20 '10 at 19:03
  • 3
    @mathepic A Makefile isn't the same as a package. And a simple Makefile rarely gets the job done on all platforms. – Michael Mior Aug 20 '10 at 19:06
  • 8
    This question is not about the performance of the programmer. It is about the performance of the program. Therefore, this answer is off topic and is simply evangelizing Python over C. – alternative Aug 20 '10 at 19:06
  • 2
    @mathepic - You are really borderline trolling here. With some work programmers can be productive in any language. Python simply takes less effort to be productive in. – ChaosPandion Aug 20 '10 at 19:08
  • @mathepic: "question is not about ... It is about ..." Really? What evidence do you have for this assertion? – S.Lott Aug 20 '10 at 19:24
  • "given the power of current CPU's (i7), is it really a noticeable difference". It seems reasonably unlikely to me that the questioner is asking about programmer time, he's asking about runtime. He probably *should* be asking about programmer time, though, so this is a relevant answer even if the questioner doesn't know it yet. IMO though if you're guaranteeing that the Python code will not run noticeably slower than the C code, it would be best to do so explicitly rather than by omission ("the noticeable differences are..."). Despite the rhetorical power of that omission. – Steve Jessop Aug 20 '10 at 19:30
  • @Steve Jessop: They're not asking about IDE performance with that "power of current CPU's (i7)" business? – S.Lott Aug 20 '10 at 19:31
  • 3
    @Steve - Yes, the original question may have been with regards to performance but sometimes we can help the questioner more by shifting the direction of the question to something more relevant. – ChaosPandion Aug 20 '10 at 19:33
  • @S.Lott: Questioner doesn't mention IDEs, that I can see. Are you saying you think they do mean an IDE, or saying that you don't know whether they do or not? I think the balance of probability is that they are not, because people who throw around words like "in terms of performance" and "CPU power", without being precise, are almost invariably thinking about how fast their code will run, not how long it will take them to write it. Assuming that the questioner doesn't have an i7 for a brain, that is. – Steve Jessop Aug 20 '10 at 20:22
  • @Steve Jessop: "... don't know whether they do or not". Correct. "people who throw around words like... are almost invariably thinking about how fast their code will run" Generally true. "sometimes we can help the questioner more by shifting the direction of the question". My point precisely. – S.Lott Aug 20 '10 at 20:30
  • +1 for mentioning the hackiness of C. Every C program contains a massive build script and is usually executed by hundreds of thousands of lines to check if your C compiler and bash crap have certain features. @mathepic "2. Opinion. Some people hate the whitespace-based code." Anyone who thinks Python is less readable than C either A) doesn't know C B) doesn't know Python C) Is just another bad coder who writes bad code. Please, show me a good 1M LOC project in a C based syntax language without indentation, I'd really like to see that. – L̲̳o̲̳̳n̲̳̳g̲̳̳p̲̳o̲̳̳k̲̳̳e̲̳̳ Aug 22 '10 at 00:11
  • @Longpoke: As I said, it's an OPINION. I don't dislike indentation - I dislike syntactic indentation because it means it's impossible to automatically indent correctly. As for the hundreds of thousands of lines, it's automatically generated and not that big. (I haven't even stated that the Python opinion is wrong, yet I'm being insulted by these people for stating that their opinions aren't always the global opinion...) – alternative Aug 22 '10 at 00:34
  • 1
    "impossible to automatically indent correctly"? What does that mean? Impossible? Who -- or what -- is automatically indenting? An IDE? Mine all work perfectly. What are you saying is "impossible" here? – S.Lott Aug 22 '10 at 01:06
  • "I dislike syntactic indentation because it means its impossible to automatically indent correctly." No it isn't, I do it every day. The fact is that Python is easier to read, it doesn't matter if one thinks it looks pretty or not, because it's easier to read and maintain, which is all that matters. Please don't BS me and tell me it's easier to read `int reg_cb(Thing** (*cb)(int**, int, int), int x, int y) {...}` than `def reg_cb(cb, x, y): ...`. – L̲̳o̲̳̳n̲̳̳g̲̳̳p̲̳o̲̳̳k̲̳̳e̲̳̳ Aug 22 '10 at 01:10
  • BTW Ada was invented because C was deemed unfeasibly hard to maintain and ensure safety for large scale mission critical DoD operations. Maybe you should go argue with DoD if you think C is more readable than . – L̲̳o̲̳̳n̲̳̳g̲̳̳p̲̳o̲̳̳k̲̳̳e̲̳̳ Aug 22 '10 at 01:17
4

The first rule of computer performance questions: Your mileage will vary. If small performance differences are important to you, the only way you will get valid information is to test with your configuration, your data, and your benchmark. "Small" here is, say, a factor of two or so.

The second rule of computer performance questions: For most applications, performance doesn't matter -- the easiest way to write the app gives adequate performance, even when the problem scales. If that is the case (and it is usually the case) don't worry about performance.

That said:

  • C compiles down to a machine executable and thus has the potential to execute at least as fast as any other language
  • Python is generally interpreted and thus may take more CPU than a compiled language
  • Very few applications are "CPU bound." I/O (to disk, display, or memory) is not greatly affected by compiled vs interpreted considerations and frequently is a major part of computer time spent on an application
  • Python works at a higher level of abstraction than C, so your development and debugging time may be shorter

My advice: Develop in the language you find the easiest with which to work. Get your program working, then check for adequate performance. If, as usual, performance is adequate, you're done. If not, profile your specific app to find out what is taking longer than expected or tolerable. See if and how you can fix that part of the app, and repeat as necessary.

Yes, sometimes you might need to abandon work and start over to get the performance you need. But having a working (albeit slow) version of the app will be a big help in making progress. When you do reach and conquer that performance goal you'll be answering performance questions in SO rather than asking them.

mpez0
  • Its easier to search a linked list than a dynamically allocated block of memory, but should I use a linked list for a search? – alternative Aug 20 '10 at 18:57
  • @mathepic: You might want to use a tree for a search. Or you might want to use a hashmap. I'm not sure what you're getting at with your comment. – S.Lott Aug 20 '10 at 19:02
  • @mathepic: both C and Python support linked lists and dynamically allocated memory, not to mention other techniques. Since the question asks for C vs Python, I don't get the purpose of your comment, either. – mpez0 Aug 22 '10 at 14:34
  • The purpose of my comment was to show how the easiest thing to program is not always the best way to do it. – alternative Aug 22 '10 at 15:07
4

If your text files that you are sorting and parsing are large, use C. If they aren't, it doesn't matter. You can write poor code in any language though. I have seen simple code in C for calculating areas of triangles run 10x slower than other C code, because of poor memory management, use of structures, pointers, etc.

Your I/O algorithm should be independent of your compute algorithm. If this is the case, then using C for the compute algorithm can be much faster.

Derek
  • -1 This is nonsense. If the file is large, this computation is I/O bound, and thus won't be any faster in C due to context switching and cache coherency issues. If we are talking memory overhead, both C and Python are perfectly capable of reading and processing **chunks** of the file at once. – L̲̳o̲̳̳n̲̳̳g̲̳̳p̲̳o̲̳̳k̲̳̳e̲̳̳ Aug 22 '10 at 00:19
  • +1 to counteract nonsense -1. @Longpoke Just because the I/O is a bottleneck doesn't mean that the processing code can't be a second bottleneck. – alternative Aug 22 '10 at 00:36
  • @mathepic The processing code footprint is asymptotically insignificant compared to the I/O of reading a giant file. Please go learn db4o or something (a database written in Java which is faster than RDBMS which are written in C/C++). Or at least go learn what cache coherency means... As soon as you mess with any decent I/O, all these illusionary advantages that C has over other languages are flat out __destroyed__. – L̲̳o̲̳̳n̲̳̳g̲̳̳p̲̳o̲̳̳k̲̳̳e̲̳̳ Aug 22 '10 at 01:13
  • @Longpoke You are assuming that the processing code is not extremely complex. Yes, the I/O part will perform the same, but the processing code still has to do complex things. – alternative Aug 22 '10 at 15:18
  • Any decently designed software that is working on an out of core dataset is going to have the I/O be separate from the computational portion. Tuning the application to have the appropriate amount of I/O overlapping the computational portion will keep the processor "fed" with data. In flat memory design, unstructured C software, your software will be faster than just about any other language with the exception of FORTRAN. – Derek Aug 23 '10 at 13:59
  • @mathepic: Yes, C may be faster if it does heavy processing on the read data, but that's only really against Python. Comparing C to another statically typed language won't really make much difference because the code is almost exactly the same (unless you do some hacks in C with pointers etc). In any case, you could still have written the entire system in Python and got an unnoticeable speed difference. Perhaps the processing part may be slow (unlikely), in that case you can just write the processing part in C/Ada/FORTRAN/Go/D/etc and the rest in Python or another HLL. – L̲̳o̲̳̳n̲̳̳g̲̳̳p̲̳o̲̳̳k̲̳̳e̲̳̳ Aug 23 '10 at 15:01
3

(Assumption - The question implies that the author is familiar with C but not Python, therefore I will base my answer with that in mind.)

I was wondering what the real difference is, in terms of performance, between writing a program in Python, versus doing it in C.

C will almost certainly be faster unless it is implemented poorly, but the real questions are:

  • What are the development implications (development time, maintenance, etc.) for either implementation?
  • Is the performance benefit significant?

Learning Python can take some time, but there are Python modules that can greatly speed development time. For example, the csv module in Python makes reading and writing csv easy. Also, Python strings, arrays, maps, and other objects make it more flexible than plain C and more elegant, in my opinion, than the equivalent C++. Some things like network access may be much quicker to develop in Python as well.
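
To give a feel for the `csv` module mentioned above, here is a minimal sketch. The sample data and column names are invented for illustration, and an in-memory `io.StringIO` stands in for a real file so the example is self-contained.

```python
import csv
import io

# Hypothetical sample data; in practice this would be an open file object.
raw = io.StringIO('name,language,seconds\n"Doe, Jane",Python,5\nSmith,C,0.05\n')

rows = list(csv.DictReader(raw))
# Quoted fields that contain commas are parsed as a single value.
print(rows[0]["name"])

# Write back only two of the columns; extra keys in each row are ignored.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["name", "seconds"], extrasaction="ignore")
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Handling the quoted comma correctly is the sort of detail that would have to be written by hand, or pulled in from a library, in plain C.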

However, it may take time to learn how to program Python well enough to accomplish your task. Since you are concerned with performance, I suggest trying a simple task, such as sorting a text file, in both C and Python. That will give you a better baseline on both languages in terms of performance, development time, and possibly maintenance.
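
A possible starting point for the Python half of that experiment is sketched below. It assumes the file fits in memory and uses a generated file of random, zero-padded numbers as a hypothetical data set; the function name `sort_text_file` and the file names are made up for this example.

```python
import os
import random
import tempfile
import time


def sort_text_file(src_path, dst_path):
    # Read all lines, sort them with the built-in sort, and write them back out.
    with open(src_path, "r", encoding="utf-8") as src:
        lines = src.readlines()
    lines.sort()
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(lines)
    return len(lines)


# Build a hypothetical test file of random integers, one per line.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "unsorted.txt")
dst_path = os.path.join(workdir, "sorted.txt")
random.seed(0)
with open(src_path, "w", encoding="utf-8") as f:
    for _ in range(200_000):
        f.write(f"{random.randrange(10**9):09d}\n")

start = time.perf_counter()
count = sort_text_file(src_path, dst_path)
elapsed = time.perf_counter() - start
print(f"sorted {count} lines in {elapsed:.3f} s")
```

A C counterpart would read the lines into an array and sort it with `qsort` and `strcmp`; timing both on the same input file, on your own machine, gives the baseline the answer suggests.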

Ryan
  • and make sure that you run your python code past an experienced python developer. C is not the only language with room for poor programming to drastically increase both development time and running time. – aaronasterling Aug 21 '10 at 11:40
1

It really depends a lot on what you're doing and whether the algorithm in question is available in Python via a natively compiled library. If it is, then I believe you'll be looking at performance numbers close enough that Python is most likely your answer -- assuming it's your preferred language. If you must implement the algorithm yourself, then depending on the amount of logic required and the size of your data set, C/C++ may be the better option. It's hard to provide a less nebulous answer without more information.

tanis
1

To get an idea of the raw difference in speed, check out the Computer Language Benchmarks Game.

Then you have to decide whether that difference matters to you.

Personally, I ended up deciding that it did, but most of the time, instead of using C, I ended up using other higher-level languages. I mostly use Scala, but Haskell, C#, and Java each have their advantages too.

Rex Kerr
0

C is definitely faster than Python; the standard Python interpreter (CPython) is itself written in C. C is a middle-level language and hence faster, but there is often not a great difference between C and Python in execution time. It is much easier to write code in Python than in C, and it takes much less time to write code and to learn Python than C. Because it's easy to write, it's also easy to test.

0

Across all programs, it isn't really possible to say whether things will be quicker or slower on average in Python or C.

For the programs that I've implemented in both languages, using similar algorithms, I've seen no improvement (and sometimes a performance degradation) for string- and I/O-heavy code when reimplementing Python code in C. The execution time is dominated by allocation and manipulation of strings (functionality that Python implements very efficiently) and by waiting for I/O operations (which incurs the same overhead in either language), so the extra overhead of Python makes very little difference.

But for programs that do even simple operations on image files, say (images being large enough for processing time to be noticeable compared to I/O), C is enormously quicker. For this sort of task the bulk of the time running the Python code is spent doing Python Stuff, and this dwarfs the time spent on the underlying operations (multiply, add, compare, etc.). When reimplemented as C, the bureaucracy goes away, the computer spends its time doing real honest work, and for that reason the thing runs much quicker.
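
The "Python Stuff" overhead can be made visible without leaving Python. The sketch below inverts a buffer of bytes, a hypothetical stand-in for grayscale image data, once with an explicit per-pixel loop and once with `bytes.translate`, which runs its per-byte loop in C. The function names are invented for this example, and the measured times will vary by machine.

```python
import timeit

# Hypothetical stand-in for image data: about a million grayscale pixels.
pixels = bytes(range(256)) * 4000


def invert_loop(data):
    # Every pixel passes through the interpreter: fetch, subtract, store.
    out = bytearray(len(data))
    for i, value in enumerate(data):
        out[i] = 255 - value
    return bytes(out)


# Lookup table mapping each byte value v to 255 - v.
INVERT_TABLE = bytes(255 - v for v in range(256))


def invert_builtin(data):
    # bytes.translate applies the table to every byte inside the C runtime.
    return data.translate(INVERT_TABLE)


# Both versions must produce identical output before their timings are compared.
assert invert_loop(pixels) == invert_builtin(pixels)

loop_time = timeit.timeit(lambda: invert_loop(pixels), number=3)
builtin_time = timeit.timeit(lambda: invert_builtin(pixels), number=3)
print(f"Python loop: {loop_time:.3f} s, built-in: {builtin_time:.3f} s")
```

The arithmetic per pixel is the same in both versions; the difference in run time is the interpreter overhead described above.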

It's not uncommon for the Python code to run in (say) 5 seconds where the C code runs in (say) 0.05. So that's a 100x speedup -- but in absolute terms, this is not so big a deal. It takes so much less time to write Python code than it does to write C code that your program would have to be run some huge number of times to turn a time profit. I often reimplement in C, for various reasons, but if you don't have this requirement then it's probably not worth bothering. You won't get that part of your life back, and next year computers will be quicker.
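
The "time profit" can be estimated with a one-line calculation. In the sketch below, the 5 and 0.05 second figures are the illustrative ones from above, while the 8 extra hours of development time is purely an assumed number; `runs_to_break_even` is a made-up helper name.

```python
def runs_to_break_even(extra_dev_hours, python_seconds, c_seconds):
    # Number of runs before the saved run time repays the extra development time.
    saved_per_run = python_seconds - c_seconds
    return extra_dev_hours * 3600 / saved_per_run


# 8 extra hours of C development is a hypothetical figure.
print(round(runs_to_break_even(8, 5.0, 0.05)))
```

Under those assumptions the C version needs about 5,800 runs before it has saved more time than it cost to write.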

0

Actually, you can solve most of your tasks efficiently with Python.

You just need to know which tools to use. For text processing there is a brilliant package from the eGenix guys - http://www.egenix.com/products/python/mxBase/mxTextTools/. I was able to create very efficient parsers with it in Python, since all the heavy lifting is done by native code.

The same approach goes for any other problem: if you have performance problems, get a C/C++ library with a Python interface that efficiently implements whatever your bottleneck is.

Daniel Kluev
-1

You will find C is much slower to develop in. Your developers will have to keep track of memory allocation, and use libraries (such as glib) to handle simple things such as dictionaries or lists, which Python has built in.

Moreover, when an error occurs, your C program will typically just crash, which means you'll need to get the error to happen in a debugger. Python would give you a stack trace (typically).

Your code will be bigger, which means it will contain more bugs. So not only will it take longer to write, it will take longer to debug, and will ship with more bugs. This means that customers will notice the bugs more often.

So your developers will spend longer fixing old bugs and thus new features will get done more slowly.

In the meantime, your competitors will be using a sensible programming language and their products will be gaining features and usability rapidly, so yours will look bad. Your customers will leave and you'll go out of business.

MarkR
  • -1. "your C program will typically just crash" you can write memory dumps on crash and debug them later. – SigTerm Aug 22 '10 at 00:34
-1

The extra time needed to write the code in C rather than Python will be far greater than the difference between C and Python execution speed.