
Usually, bash functions are defined using curly braces to enclose the body:

foo()
{
    ...
}

While working on a shell script that makes extensive use of functions, I ran into a problem: a variable in a called function that has the same name as one in the calling function is in fact the same variable. I then found out that this can be prevented by declaring the function's variables as local: local var=xyz.
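A minimal sketch of the clash described above and of the fix with local. The function and variable names are made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical names; only the `local` keyword comes from the text above.

set_counter_leaky() {
    i=99            # no `local`: this assigns the caller's (global) i
}

set_counter_safe() {
    local i=99      # `local`: this i exists only inside the function
}

i=1
set_counter_leaky
after_leaky=$i
echo "after leaky call: i=$after_leaky"   # prints: after leaky call: i=99

i=1
set_counter_safe
after_safe=$i
echo "after safe call: i=$after_safe"     # prints: after safe call: i=1
```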

Later, I discovered a thread (Defining bash function body using parenthesis instead of braces) which explains that it is just as valid to define a function using parentheses, like this:

foo()
(
    ...
)

The effect is that the function body is executed in a subshell, which gives the function its own variable scope and lets me define variables without local. Since a function-local scope seems to make much more sense, and to be much safer, than all variables being global, I immediately asked myself:

  • Why are braces used by default to enclose the function body instead of parentheses?

However, I also quickly discovered a major downside of executing the function in a subshell: exiting the script from inside a function no longer works, which forces me to pass the return status along the whole call tree (in the case of nested functions). This leads me to a follow-up question:
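To illustrate the downside, here is a small sketch (the function name is hypothetical). The exit inside the parenthesized body ends only the subshell, so the script keeps running and the caller has to act on the status itself:

```shell
#!/usr/bin/env bash
# Hypothetical function name, for illustration only.

fail_in_subshell() (
    echo "inside: about to exit 3"
    exit 3          # ends only the subshell that runs the function body
)

status=0
fail_in_subshell || status=$?
echo "still running; function status was $status"   # prints: still running; function status was 3

# To actually stop the script, the caller must propagate the status, e.g.:
# fail_in_subshell || exit "$?"
```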

  • Are there other major downsides (*) to using parentheses instead of braces (which might explain why braces seem to be preferred)?

(*) I'm aware (from exception-related discussions I've stumbled upon over time) that some would argue that explicitly using the error status is much better than being able to exit from anywhere, but I prefer the latter.

Apparently both styles have their advantages and disadvantages. So I hope some of you more experienced bash users can give me some general guidance:

  • When shall I use curly braces to enclose the function body, and when is it advisable to switch to parentheses?

EDIT: Take-aways from the answers

Thanks for your answers, my head's now a bit clearer with regard to this. So what I take away from the answers is:

  • Stick to the conventional curly braces, if only in order not to confuse potential other users/developers of the script (and even use the braces if the whole body is wrapped in parentheses).

  • The only real disadvantage of the curly braces is that any variable in the parent scope can be changed, although in some situations this might be an advantage. This can easily be circumvented by declaring the variables as local.

  • Using parentheses, on the other hand, might have some serious unwanted effects, such as messing up exits, leading to problems with killing a script, and isolating the variable scope.

flotzilla

4 Answers


Why are braces used by default to enclose the function body instead of parentheses?

The body of a function can be any compound command. This is typically { list; }, but three other forms of compound commands are technically allowed: (list), ((expression)), and [[ expression ]].
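As a rough illustration that the body really can be any compound command in bash, the sketch below defines functions whose bodies are an arithmetic command, a conditional command, a subshell, and (beyond the forms listed above) an if statement. All function names are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical function names; each body is a single compound command.

is_positive() (( $1 > 0 ))        # arithmetic command as the body
is_nonempty() [[ -n $1 ]]         # conditional command as the body
print_root() ( cd / && pwd )      # subshell as the body
parity() if (( $1 % 2 == 0 )); then echo even; else echo odd; fi   # `if` as the body

is_positive 5 && echo "5 is positive"            # prints: 5 is positive
is_nonempty "" || echo "empty string rejected"   # prints: empty string rejected
print_root                                       # prints: /
parity 7                                         # prints: odd
```

Forms like these are legal but rare, so they are probably best reserved for very short helper functions.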

C and languages in the C family like C++, Java, C#, and JavaScript all use curly braces to delimit function bodies. Curly braces are the most natural syntax for programmers familiar with those languages.

Are there other major downsides (*) to using parentheses instead of braces (which might explain why braces seem to be preferred)?

Yes. There are numerous things you can't do from a sub-shell, including:

  • Change global variables. Variable changes will not propagate to the parent shell.
  • Exit the script. An exit statement will exit only the sub-shell.

Starting a sub-shell can also be a serious performance hit. You're launching a new process each time you call the function.
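One way to observe the extra process is to compare $BASHPID (the process ID of the shell actually running the code, available in bash 4 and later) with $$ (which keeps the invoking shell's ID even inside a subshell). This is only a sketch, and the function names are hypothetical:

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical. $BASHPID needs bash 4 or later.

pid_braces() { echo "$BASHPID"; }    # runs in the calling shell
pid_parens() ( echo "$BASHPID" )     # runs in a subshell

main_pid=$BASHPID
printf 'main shell:      $$=%s BASHPID=%s\n' "$$" "$main_pid"
printf 'braces function: BASHPID='; pid_braces   # same as the main shell
printf 'parens function: BASHPID='; pid_parens   # a different process ID
```

Calling the functions through command substitution, as in $(pid_braces), would itself create a subshell, which is why the sketch prints the values directly.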

You might also get weird behavior if your script is killed. The signals the parent and child shells receive will change. It's a subtle effect, but if you have trap handlers or you kill your script, those parts might not work the way you want.

When shall I use curly braces to enclose the function body, and when is it advisable to switch to parentheses?

I would advise you to always use curly braces. If you want an explicit sub-shell, then add a set of parentheses inside the curly braces. Using just parentheses is highly unusual syntax and would confuse many people reading your script.

foo() {
   (
       subshell commands;
   )
}
John Kugelman
  • Though I first considered @kojiro's, I'll accept your answer since you explicitly answered all three questions, plus you've raised two very good points: the weird behaviour if the script is killed (wouldn't have thought of that), and that using parentheses without any braces would be too unusual a syntax. – flotzilla Jan 07 '15 at 09:02
  • But I still see the ability to change global variables out-of-the-box but having to declare local variables explicitly as such as more of a danger than a plus. I'd very much prefer the variables to be local by default, but being able to declare them as global so they could still be accessed. But I'll just get into the habit of always declaring them with `local` to get around this. – flotzilla Jan 07 '15 at 09:05
  • @VaticanViolator I agree, it'd be nicer if they were local by default. It's just one of those things that you learn to accept and deal with. For instance, I name local variables in lowercase and global variables in uppercase. It makes it clear which is which and lessens the risk of accidentally trampling over globals if I forget a `local` declaration. – John Kugelman Jan 07 '15 at 14:29
  • Yes, I also intuitively fell into the habit of naming globals in uppercase some time ago. Well, I guess the difficulties with local/global variables are manageable in reasonably small scripts, and if a script gets much larger, it should be broken into pieces or written in another language anyway. – flotzilla Jan 08 '15 at 09:16
  • **1)** "three other forms of compound commands are technically allowed": not three, there are many more, e.g. `if then fi`, see: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_04 **2)** "new process each time" This may be true, but is a well hidden implementation detail :-) it even sets the same `$$`: http://unix.stackexchange.com/questions/138463/do-parentheses-really-put-the-command-in-a-subshell – Ciro Santilli OurBigBook.com Jul 09 '15 at 11:26

I tend to use a subshell when I want to change directories, but always from the same original directory, and cannot be bothered to use pushd/popd or manage the directories myself.

for d in */; do
    ( cd "$d" && dosomething )
done

This would work just as well with a subshell as the function body, but even if you define the function with curly braces, you can still call it from a subshell.

doit() {
    cd "$1" && dosomething
}
for d in */; do
    ( doit "$d" )
done

Of course, you can still maintain variable scope inside a curly-brace-defined function using declare or local:

myfun() {
    local x=123
}

So I would say, explicitly define your function as a subshell only if not being a subshell is detrimental to the obvious correct behavior of that function.
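As a sketch of a case where the subshell arguably is the correct behavior, the hypothetical function below changes directory as part of its job. Defining it with parentheses keeps the cd from leaking into the caller:

```shell
#!/usr/bin/env bash
# Hypothetical example: pwd_in_dir is an illustrative name, not from the answer.

pwd_in_dir() (
    cd "$1" || exit 1   # affects only the subshell; exit ends just the function
    pwd
)

start_dir=$PWD
tmp_dir=$(mktemp -d)    # scratch directory so the sketch is self-contained
pwd_in_dir "$tmp_dir"
echo "caller is still in: $PWD"
rmdir "$tmp_dir"
```

With curly braces, the same body would leave the caller inside the target directory after the call.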

Trivia: As a side note, consider that bash actually always treats the function as a curly-brace compound command. It just sometimes has parentheses in it:

$ f() ( echo hi )
$ type f
f is a function
f () 
{ 
    ( echo hi )
}
kojiro

It really matters. Since bash functions do not return values, and the variables they use come from the global scope (that is, they can access variables from "outside" their own scope), the usual way to handle the output of a function is to have it store the value in a variable and then read that variable after the call.

When you define a function with (), you are right: it will create a sub-shell. That sub-shell will contain the same values the original had, but won't be able to modify them. So you lose the ability to change global-scope variables.

See an example:

$ cat a.sh
#!/bin/bash

func_braces() { # function with curly braces
    echo "in $FUNCNAME. the value of v=$v"
    v=4
}

func_parentheses() (
    echo "in $FUNCNAME. the value of v=$v"
    v=8
)


v=1
echo "v=$v. Let's start"
func_braces
echo "Value after func_braces is: v=$v"
func_parentheses
echo "Value after func_parentheses is: v=$v"

Let's execute it:

$ ./a.sh
v=1. Let's start
in func_braces. the value of v=1
Value after func_braces is: v=4
in func_parentheses. the value of v=4
Value after func_parentheses is: v=4   # the value did not change in the main shell
fedorqui
  • Technically, functions can return an exit code via `return`, which is passed back to the parent and is accessible via chaining or `$?` – Catskul Jul 26 '16 at 22:26
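A short sketch of what this comment describes, using a hypothetical is_even function: the status set by `return` can be used directly in a condition or chain, or read from `$?` afterwards.

```shell
#!/usr/bin/env bash
# Hypothetical function name, for illustration only.

is_even() {
    (( $1 % 2 == 0 )) && return 0   # status 0 means success ("true")
    return 1                        # any non-zero status means failure
}

if is_even 4; then echo "4 is even"; fi   # prints: 4 is even

status=0
is_even 7 || status=$?                    # capture the non-zero status
echo "is_even 7 returned $status"         # prints: is_even 7 returned 1
```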

Note: Sometimes a braced list does not run in the same process:

a=22; { echo $a; a=46; echo $a; }; echo $a
says 22 46 46

but

a=22; { echo $a; a=46; echo $a; }|cat; echo $a
says 22 46 22

Thanks to fedorqui :)

  • Nothing to do with braces or lists. Try `a=22; a=46|cat; echo $a` or `a=22|cat; { echo $a; a=46; echo $a; }|cat; echo $a`. It's the pipe that is spawning a subshell. – hh skladby Dec 08 '21 at 13:42