17

First: this is not meant to be an ignorant 'which is better' war thread. Rather, I genuinely need help making an architecture decision and an argument to put forward to my boss.

Skipping the details, I would simply love to see the results of anyone who has done performance comparisons of shell vs. [insert interpreted general-purpose programming language here], such as C# or Java...

Surprisingly, I have spent some time searching Google and this site without finding any such data. Has anyone ever done these comparisons for different use cases? For example: hitting a database in X number of loops with different types of SQL queries (Oracle preferred, but MSSQL would do), such as any of the CRUD operations; and, without hitting a database, a plain 50k-iteration loop doing different types of calculations, and things of that nature.

In particular, for right now I need a comparison of hitting an Oracle DB from a shell script vs., let's say, C# (again, any interpreted GPPL would be fine, even the higher-level ones like Python). But I also need to know about standard programming calculations, instructions, etc.

Before you ask 'why not just write a quick test yourself?', the answer is: I've been a Windows developer my whole life and career and have very limited knowledge of shell scripting, not to mention *nix as a whole. So asking the question here of the more experienced guys would be greatly beneficial, not to mention time-saving, as we are in a near-perpetual deadline crunch as it is ;).

mac
dferraro
  • 11
    C# and Java are not interpreted. – cdhowie Dec 20 '10 at 17:00
  • 3
    I think the better question is "Why does it matter to you?" You obviously have a use case in mind, if we knew what it was then we could be more helpful. – Chris Pitman Dec 20 '10 at 17:01
  • Shell and programming languages have very different purposes, and you often just can't express the same thing in both. That's like comparing Emacs Lisp to .Net. You mention SQL queries, but shell just doesn't have any means to do it. It would call some external program for that. – Sergei Tachenov Dec 20 '10 at 17:10
  • 1
    @cdhowie, actually, the older JVMs simply interpreted the Java bytecode, which tends to be slower. So, in that sense, Java is "interpreted." – dferraro Dec 20 '10 at 18:33
  • @Chris - the use case is the business telling us: 'when we push this button in your application (a thin-client .NET WinForms app talking to a *nix server running the .ksh script), it takes 40 minutes to run; we want it brought down to 5 minutes.' And yes, I'm aware of how to handle business as well as management expectations. Sigh, I'd just like to ask a question and get the answer without 'more details', as I provided plenty for my question; everything else doesn't matter (yes, I know how to handle business expectations; yes, I know what a profiler is; yes, I know how the CLR and JVM work). – dferraro Dec 20 '10 at 18:58
  • Just changing the language some process is written in isn't going to lower the run time of something from 40 minutes to 5 minutes. You're going to have to figure out what the process is doing, and which parts take up so much time, to know how to fix this. There may be inefficient algorithms (e.g. O(n^2) where O(n log n) would do), you may be able to speed things up by batching requests, or by using something like map/reduce to parallelize the process. However, without knowing what the process does, there's no way to know what's taking so long, or what to do to improve it. – Nate Dec 20 '10 at 19:14
  • @nate... thanks for the reply. I do understand what you mean. Perhaps I worded my question poorly. I guess what I wanted to ask was, very simply (I'm not looking for help optimizing my process): has anyone ever done comparisons of shell scripts vs. standard general-purpose languages, and compiled data? I've seen this a million times for a gazillion language variations, but I can't find data on shell scripts vs., say, C# or Java... – dferraro Dec 20 '10 at 19:37
  • @dferraro: So by that same sense of Java being interpreted, VMWare is a PC emulator, right? Not a virtualizer/hypervisor? – cdhowie Dec 20 '10 at 20:09

7 Answers

10

Once upon a time, ye olde The Great Computer Language Shootout did include some shell scripts.

So, courtesy of the Internet Archive, from 2004 -

Note the shell scripts didn't have programs for many of the tests.

            Score   Missing-Tests
    Java    20      1
    Perl    16      0
    Python  16      0
    gawk    12      6
    mawk    10      6
    bash    7       12

Note shell scripts can sometimes be small and fast :-)

"Reverse a file"

            CPU (sec)   Mem (KB)    Lines of Code
    bash    0.0670      1464        1
    C gcc   0.0810      4064        59
    Python  0.3869      13160       6
igouy
  • 1
    That's not really measuring the "speed of bash" though ;-) [Ways to reverse a file in Linux](http://www.cyberciti.biz/faq/howto-linux-unix-shell-reverse-file-lines/) -- Nice finding of the archive. –  Dec 21 '10 at 19:02
  • 1
    @pst - If the OP can use that kind-of technique to resolve his problem, I don't think he'll care about whether anyone thinks it's 'not really measuring the "speed of bash"'. – igouy Dec 21 '10 at 19:12
  • thanks, this is all i was looking for, some kind of dumb-loop-that-did-nothing and the time differences (or a link where there were many of these compared). The actual details of why I needed and the bigger picture were far too much to write that anyone would actually care for / read – dferraro Dec 30 '10 at 20:30
  • Why isn't assembly in the shootout? – Pacerier Oct 31 '14 at 15:50
  • 1
    @Pacerier -- "Because I know it will take more time than I choose to donate. [Been there; done that.](http://benchmarksgame.alioth.debian.org/play.html#languagex)" – igouy Nov 16 '14 at 17:30
  • For the file-reversing script, the link is operational, but the source code & logs for at least some of the languages aren't. – trysis Aug 30 '20 at 22:51
8

It is highly dependent on what the script is doing. I've seen poorly written shell scripts sped up by one, two, even three orders of magnitude by making simple changes.

Typically, a shell script is simply some glue logic that runs utilities that are usually compiled C or C++. If that's the case, there may not be much that can be done to speed things up. If the grunt work is being done by a poorly written utility that's compiled, it's just doing a lot of wasted effort really fast.
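As a rough illustration of the kind of "simple change" that can matter, here is a minimal sketch of the same loop written two ways. The function names are made up for this example, and it is not taken from any particular script: the first version starts an external `expr` process on every iteration, the second uses the shell's built-in arithmetic.

```bash
# Slow: every iteration forks two external `expr` processes.
sum_with_expr() {
  local n=$1 i=1 total=0
  while [ "$i" -le "$n" ]; do
    total=$(expr "$total" + "$i")
    i=$(expr "$i" + 1)
  done
  echo "$total"
}

# Faster: the same loop using built-in arithmetic, with no extra processes.
sum_with_builtin() {
  local n=$1 i=1 total=0
  while [ "$i" -le "$n" ]; do
    total=$((total + i))
    i=$((i + 1))
  done
  echo "$total"
}
```

Running `time sum_with_expr 5000` and `time sum_with_builtin 5000` on your own machine should show the gap; the exact ratio depends on how expensive process creation is on that system.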

That said, Python or Perl are going to be much faster than a shell script, but a VM or native code will be faster yet.

Since you can't tell us any details, we can't really provide specific help.

If you want to see a simple demonstration for comparison, try my pure-Bash implementation of hexdump and compare it to the real thing:

    $ time ./bash-hexdump /bin/bash > /dev/null
    real    7m17.577s
    user    7m2.570s
    sys     0m14.745s
    $ time hexdump -C /bin/bash > /dev/null
    real    0m2.459s
    user    0m2.260s
    sys     0m0.176s

One reason the Bash version is slow is that it reads the file character by character, which is necessary to handle null bytes (shells aren't very good at handling binary data). The primary reason, though, is simply how slowly the shell executes script code. For comparison, here is the timing for a Python script I found:

    $ time ./hexdump.py /bin/bash > /dev/null
    real    0m11.694s
    user    0m11.605s
    sys     0m0.040s
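To give a feel for the character-by-character reading mentioned above, here is a minimal bash-specific sketch that only counts bytes on standard input. It is not the hexdump script itself; the function name is hypothetical, and it assumes a single-byte locale such as `LC_ALL=C`.

```bash
# Count the bytes on stdin by reading them one at a time (bash only).
# Assumes a single-byte locale, e.g. run the script with LC_ALL=C.
count_bytes() {
  local byte count=0
  # -d '' makes NUL the delimiter, so a NUL byte arrives as an empty but
  # successful read; -n 1 stops after one character; IFS= keeps whitespace.
  while IFS= read -r -d '' -n 1 byte; do
    count=$((count + 1))
  done
  echo "$count"
}
```

Something like `time count_bytes < /bin/bash` versus `time wc -c < /bin/bash` shows how much one `read` call per byte costs before any formatting work is done.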
Dennis Williamson
3

"I simply just would love to know and find the results of anyone who has done some performance comparisons of..."

The abiding lesson of such comparisons is that the particular details matter - a lot.

Not only the particular details of the task, but (shouldn't we know this as programmers?) the particular details of how the shell script is written.

So can you find someone who understands that shell language and can check that the shell script was written in an efficient way? (Wouldn't it be nice if changing a couple of lines took it from 40 minutes to 5 minutes?)

igouy
2

Just did this very simple benchmark on my system, and the results are as expected.

Add up all integers between 1 and 50,000 and output answer at each step

Bash: 3 seconds

C: 0.5 seconds
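The answer does not show its code, so the following is only a guess at what the Bash side of such a test might look like. The function name `sum_to` is hypothetical.

```bash
# Add up the integers from 1 to the given limit, printing the running total
# at each step.
sum_to() {
  local limit=$1 i=1 total=0
  while [ "$i" -le "$limit" ]; do
    total=$((total + i))
    echo "$total"
    i=$((i + 1))
  done
}
```

It could be timed with `time sum_to 50000 > /dev/null`. Where the output goes (terminal, file, or `/dev/null`) can noticeably change the result, so both languages should be measured the same way.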

1

If you are writing code and you have concerns about the speed of processing, you should be writing code that is either compiled directly to native machine code or compiled for a modern VM.

But... with Moore's Law kicking up processing power every 18 months, I wonder: are the performance requirements really necessary? Even interpreted code runs incredibly fast on most modern systems, and it's only going to get better with time. Do you really need the kind of speed improvements that compiled code would give you?

If the answer is no, then write in whatever makes you happy.

Jonathan B
  • Thanks for the reply, Jon. Heh, my question really stems from a *single* requirement from the business asking us to make a change in a .NET WinForms client application. The change? Make a button that gets pushed frequently 'go from taking 40 minutes to 5 minutes'. The WinForms app doesn't really do anything but send a request to a *nix server, which does the actual processing in a shell script. I have signed an NDA so I can't really get into any specifics, but hopefully that will help. Thanks much. – dferraro Dec 20 '10 at 18:45
  • FYI - this is just a small piece of a much bigger system I described above to Chris - just picture that WinForms client has a whole bunch of tabs and buttons and drop downs filled with parameters and all types of things of such nature, most work being done in the shell scripted server.... – dferraro Dec 20 '10 at 18:47
  • @dferraro : Pushing a button every 5 minutes instead of every 40 minutes really shouldn't make much of a difference in processing. Unless this button is initiating a method that takes longer than 5 minutes to complete on the client end, it shouldn't really make that much of a difference. If it does take longer, I'd recommend multithreading to help. – Alex Dec 20 '10 at 19:22
  • @Alex - "Make a button that gets pushed frequently" NOT "Pushing a button every 5 minutes instead of every 40 minutes". The processing is "taking 40 minutes" BUT the customer would like the processing to only take "5 minutes". – igouy Dec 21 '10 at 18:18
  • 1
    @dferraro: That's an entirely different problem. I would yank out the testing tools and find out where the processing bottleneck lies. Once you find that you can start work on optimization. Changing languages might not help - in fact, the added complexity might make things worse. (One thought - is there SQL involved? I've often found the DB is the source of many bottlenecks...) – Jonathan B Dec 23 '10 at 18:35
1

While this doesn't include "Shell" (aka sh/bash/ksh/powerscript) languages, it is a relatively large list of "language [implementation] performance" -- packed full with generalities and caveats. In any case, someone may enjoy it.

http://benchmarksgame.alioth.debian.org/

igouy
0

As mentioned above, the shell cannot run SQL queries by itself; it has to call an external program to do so. Languages that run on a VM take a little time up front to start the VM, but otherwise the difference should be negligible.
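Because a shell script has to hand its SQL to an external client program, one thing worth checking is whether the script starts that client once per row or once per batch. The sketch below only builds the batch text; the table name, column name, and function name are all hypothetical, and the client you pipe it into is whatever your environment provides.

```bash
# Print one INSERT statement per id given as an argument, then a COMMIT.
# demo_table and its id column are made-up names for illustration.
build_batch() {
  local id
  for id in "$@"; do
    printf 'INSERT INTO demo_table (id) VALUES (%d);\n' "$id"
  done
  printf 'COMMIT;\n'
}
```

The whole batch can then go to a single client process, for example `build_batch 1 2 3 | sqlplus -S "$DB_LOGIN"` (assuming Oracle's `sqlplus` is installed and `DB_LOGIN` holds your connect string), instead of launching the client inside the loop.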

If the real goal is to cut the run time from 40 minutes to 5, then I would try to find out which piece is taking the majority of the time. If the query is what runs longest, switching languages won't help you much.

Again (without much detail in the question) I would start with looking into different components of the system to see which one is the bottleneck.
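One low-tech way to start looking at the components is to wrap each stage of the script in a small timing helper. This is only a sketch: `time_stage` is a hypothetical name, and it assumes `date +%s` (seconds since the epoch) is available, as it is with GNU and BSD `date`.

```bash
# Run a command, then report on stderr how many whole seconds it took.
# Whole-second resolution is enough for a job measured in minutes.
time_stage() {
  local label=$1 start end status
  shift
  start=$(date +%s)
  "$@"
  status=$?
  end=$(date +%s)
  printf '%s: %d s\n' "$label" "$((end - start))" >&2
  return "$status"
}
```

Usage would look like `time_stage extract ./extract.sh` followed by `time_stage load ./load.sh`, where the script names stand in for the stages of your own job. The stage that reports the largest number is the one worth investigating first.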

Nauman