291

I run across many shell scripts with variables in all caps, and I've always thought that there is a severe misunderstanding with that. My understanding is that, by convention (and perhaps by necessity long ago), environment variables are in all-caps.

But in modern scripting environments like Bash, I have always preferred the convention of lower-case names for temporary variables, and upper-case ones only for exported (i.e. environment) variables. For example:

#!/usr/bin/env bash
year=$(date +%Y)
echo "It is $year."
export JAVA_HOME="$HOME/java"

That has always been my take on things. Are there any authoritative sources which either agree or disagree with this approach, or is it purely a matter of style?

Gilles Quénot
JasonSmith

9 Answers

374

By convention, environment variables (PAGER, EDITOR, ...) and internal shell variables (SHELL, BASH_VERSION, ...) are capitalized. All other variable names should be lower case.

Remember that variable names are case-sensitive; this convention avoids accidentally overriding environment variables and internal shell variables.

Keeping to this convention, you can rest assured that you don't need to know every environment variable used by UNIX tools or shells in order to avoid overwriting them. If it's your variable, lowercase it. If you export it, uppercase it.
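As a minimal sketch of the kind of accident this convention guards against (the directory and the lowercase name `search_dir` are made up for the demo), compare assigning to `PATH` with assigning to a lowercase name. Each case runs in a subshell, so the calling shell's real `PATH` is left alone:

```shell
#!/usr/bin/env bash
# Each case runs in a command substitution (a subshell), so the caller's
# real PATH is never modified.

# Uppercase name: collides with the variable the shell uses to find commands.
with_uppercase=$(
  PATH="/nonexistent/dir"
  if command -v ls >/dev/null 2>&1; then echo "ls found"; else echo "ls not found"; fi
)

# Lowercase name (hypothetical, chosen for this demo): no collision.
with_lowercase=$(
  search_dir="/nonexistent/dir"
  if command -v ls >/dev/null 2>&1; then echo "ls found"; else echo "ls not found"; fi
)

echo "uppercase PATH:       ${with_uppercase}"
echo "lowercase search_dir: ${with_lowercase}"
```

On a typical system, where `ls` is an external command rather than a function or alias, the first line should report `ls not found` and the second `ls found`.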

codeforester
lhunath
  • 13
    +1. Good point about accidental overwriting. I forgot to mention, but now that you mention it, I think I decided on using lowercase because I read or heard about just that problem. – JasonSmith Mar 23 '09 at 16:23
  • 8
    I thought the main reason for using uppercase variable names was to avoid conflicts with shell commands. We recently had the hostname of one of our servers accidentally changed to '=' because a script used a variable 'hostname'. – ThisSuitIsBlackNot Sep 25 '11 at 18:06
  • 38
    @ThisSuitIsBlackNot Ignoring crappy code, variables are prefixed with a dollar when expanded and used in a place where they cannot be confused with a command name when they're not. Obviously, doing hostname = moo is going to land you in trouble. Not because you're using a lowercased "hostname", but because you're not using the correct assignment syntax. Assignment is done with hostname=moo, no spaces. Assuming correct code, you don't need to worry about variable names conflicting with command names. – lhunath Oct 21 '11 at 08:30
  • 7
    All the text books I've looked at always use upper case for all shell variables. While lower case variable names are permissible, uppercase is the convention. – Brian S. Wilson Nov 14 '16 at 15:00
  • 6
    I didn't know this, and I just lost a couple hrs. over using `USER="username"` in a bash script automating some remote commands over ssh instead of `user="username"`. Ugh! Glad I know now! – Gabriel Staples May 31 '18 at 02:31
  • `SHELL` is not an "internal shell variable". It is a value the user puts in their environment to indicate their preferred shell to other tools, similar to the way EDITOR, VISUAL, and PAGER are used to describe preferences. – William Pursell Jan 14 '20 at 14:44
  • 1
    The "rest assured that you don't need to know every environment variable" thing is overstated. Obviously you **will** need to be careful not to overwrite such variables with your *own* environment variables, if you use all-caps for those. It's still a good point, but "rest assured" is too strong. –  May 04 '20 at 01:24
  • 2
    @BrianS.Wilson, it used to be common that everybody everywhere seemed to write all caps variables in whatever shell script. I've struggled to find a bash standard to suggest what's in this answer, but it's still a convention that I've practiced for years now, and that was after years of creating capital vars; doing it the way that you've seen in conventions in old books. Conventions can change, and sometimes do for good reason. Not to mention, there's no theoretical reason to have uppercase vars that I can think of unless a more experienced/longer-timer can explain why that convention started. – John Pancoast Mar 24 '21 at 17:29
  • And to add to my last answer, I can't find a reason one way or the other about var case in shell/bash scripts. It's almost as-if the built-ins and ENV vars and whatever else were capitalized so people just kind of created their bash vars as capital?? I honestly don't have another reason and would love to have one from someone who knows that history. Regardless, following the convention in this answer has another advantage: it makes my variable cases the same as literally all other languages I code in. That coupled with a bit more safety in regards to conflicts with env/other vars: I like it. – John Pancoast Mar 24 '21 at 17:38
  • 4
    @JohnPancoast, As a former Bell Labs employee, to the best of my knowledge, the convention was to ensure shell variables are easily distinguished from commands. It is still an extremely common practice to use the convention for the same reason. I have yet to see any published book, new or old, that doesn't follow this convention, again for the sake of clarity. Having worked in many companies, I have dealt with shell code that used either format, and sometimes mixed the two. The use of lower case variables always made the scripts more difficult to work with IMHO. – Brian S. Wilson Mar 26 '21 at 01:12
  • @BrianS.Wilson, thanks that makes a lot of sense and well, if you were at Bell in those old days, you've settled the "long-timer" question :) No disrespect meant. And I will _definitely_ agree from my own experience of a couple decades around sys' and dev that capitalized snake case vars are _absolutely_ still a widely used convention for sure. The project I just joined has capitalized snake case, the last one had caps snake case, my own code throughout the years has morphed perhaps, and in the wild I've seen them both. – John Pancoast Mar 27 '21 at 17:16
  • 1
    @BrianS.Wilson So to simplify: the original convention of creating uppercase variables in `shell` scripts was most likely to keep variables from conflicting with commands. This newer convention is to use lowercase variables in `shell` scripts to keep them from conflicting with ENV variables. It's interesting, I've never had ENV var conflicts although I did still adopt this convention (for a couple reasons) but I'm rethinking it. It's an interesting one. Thank you for the enlightening answer. – John Pancoast Mar 27 '21 at 17:52
  • 3
    @Seamus, "shell variables cannot overwrite environment variables" -- this is false. Try running `PATH=foo`; even with no `export`, your next attempt to run `ls` or any other non-builtin command in the same shell will fail, because the shell _automatically_ exports updates to any shell variable that is flagged as exported. That is to say, the export flag follows the variable name, and updates to variables thus-flagged are mirrored to the environment automatically. – Charles Duffy Dec 07 '22 at 21:49
  • @Seamus, more to the point, absent an `ENV` value that points to a file resetting `PATH` to a sane value, `PATH/foo; /bin/sh -c 'ls'` will likewise fail, demonstrating that the update was in fact explicitly exported to the environment and inherited by child processes. – Charles Duffy Dec 07 '22 at 21:54
  • (err, that last `/` should have been an `=`, of course) – Charles Duffy Dec 08 '22 at 02:28
  • See [awk-sleep-and-curl-not-working-in-bash-while-loop-command-not-found](https://stackoverflow.com/questions/75102431/awk-sleep-and-curl-not-working-in-bash-while-loop-command-not-found) for an example of the kind of problems that occur if you don't follow this advice. – Ed Morton Jan 13 '23 at 00:39
  • 1
    Regarding a comment above that "All the text books I've looked at always use upper case for all shell variables. While lower case variable names are permissible, uppercase is the convention." - this person is clearly reading the wrong text books or misunderstanding them and as a result has come to the wrong conclusion. Using all upper case names for non-environment variables has been frequently hurting newcomers for decades (e.g. see [other comment above](https://stackoverflow.com/questions/673055/correct-bash-and-shell-script-variable-capitalization#comment88241495_673940)) - don't do it – Ed Morton Mar 26 '23 at 15:38
  • The idea that you should use upper case if you export a variable and lower case if you don't seems completely brain dead to me. What about variables that are sometimes exported and sometimes not? What about a variable you don't export, but which you expect may (or may not) be incoming in the environment? What if you use lower case everywhere and suddenly decide you want to export it? Now you have to rewrite all the scripts in your system. The system variables are UPPER because that has always been the convention in UNIX. You can throw convention to the wind if you want; I'll stick with tradition. – xpusostomos Apr 28 '23 at 04:16
  • 1
    @EdMorton This question was voluntarily removed by its author. awk-sleep-and-curl-not-working – brendan Aug 31 '23 at 21:42
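The behaviour described in the comments above (a plain assignment to a name that is already flagged as exported is mirrored to the environment) can be checked with a small sketch. `DEMO_SETTING` and `demo_setting` are hypothetical names, and the sketch assumes neither is already present in the calling environment:

```shell
#!/usr/bin/env bash
# DEMO_SETTING and demo_setting are hypothetical names used only for this sketch.

export DEMO_SETTING="original"   # flags the name as exported
DEMO_SETTING="changed"           # plain assignment, no `export` keyword

# The child process sees the updated value, because the export flag
# stays attached to the name.
child_sees_upper=$(sh -c 'printf %s "${DEMO_SETTING-unset}"')

demo_setting="shell only"        # never exported
child_sees_lower=$(sh -c 'printf %s "${demo_setting-unset}"')

echo "child sees DEMO_SETTING=${child_sees_upper}"
echo "child sees demo_setting=${child_sees_lower}"
```

The child should report `changed` for the exported name and `unset` for the lowercase one, which is why reusing a name that someone else exported can leak into child processes without any `export` in your own script.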
64

Any naming conventions followed consistently will always help. Here are a few helpful tips for shell variable naming:

  • Use all caps and underscores for exported variables and constants, especially when they are shared across multiple scripts or processes. Use a common prefix whenever applicable so that related variables stand out and won't clash with Bash internal variables which are all upper case.

    Examples:

    • Exported variables with a common prefix: JOB_HOME JOB_LOG JOB_TEMP JOB_RUN_CONTROL
    • Constants: LOG_DEBUG LOG_INFO LOG_ERROR STATUS_OK STATUS_ERROR STATUS_WARNING
  • Use "snake case" (all lowercase and underscores) for all variables that are scoped to a single script or a block.

    Examples: input_file first_value max_amount num_errors

    Use mixed case when a local variable has some relationship with an environment variable, like: old_IFS old_HOME

  • Use a leading underscore for "private" variables and functions. This is especially relevant if you ever write a shell library where functions within a library file or across files need to share variables, without ever clashing with anything that might be similarly named in the main code.

    Examples: _debug _debug_level _current_log_file

  • Avoid camel case. This will minimize the bugs caused by case typos. Remember, shell variables are case sensitive.

    Examples to avoid: inputArray, thisLooksBAD, numRecordsProcessed, veryInconsistent_style
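Put together, the tips above might look like the following sketch. Every name in it is hypothetical and exists only for illustration, and the field-splitting helper assumes the input contains no glob characters:

```shell
#!/usr/bin/env bash
# All names below are hypothetical and exist only to illustrate the tips above.

# Exported variables: all caps with a common prefix.
export JOB_HOME="/tmp/job_demo"
export JOB_LOG="${JOB_HOME}/job.log"

# Constants: all caps, marked read-only.
readonly STATUS_OK=0
readonly STATUS_ERROR=1

# "Private" helpers: leading underscore.
_debug_level=0
_debug() {
  if [ "$_debug_level" -gt 0 ]; then printf 'debug: %s\n' "$*" >&2; fi
}

# Script-scoped names: snake case; mixed case for a saved copy of IFS.
count_fields() {
  local input_line="$1" old_IFS="$IFS" num_fields
  IFS=,
  set -- $input_line        # unquoted on purpose: split on commas
  num_fields=$#
  IFS="$old_IFS"            # restore the saved value
  _debug "counted $num_fields fields"
  printf '%s\n' "$num_fields"
}

count_fields "alpha,beta,gamma"
```

The call at the end should print `3`. Setting `_debug_level=1` before the call would additionally write a debug line to standard error.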



codeforester
  • 8
    This is *a* convention but it is hardly universally accepted. The rationale against camel case isn't entirely convincing. The recommendation to use SHOUTING for exported variables is mildly controversial. – tripleee Jul 06 '18 at 09:27
  • 11
    I didn't make a claim that it is a commonly followed convention. I have seen that most programmers don't think seriously about following strong conventions in shell scripts and thought of jotting down my thoughts based on what I have been doing. – codeforester Jul 06 '18 at 19:21
23

If shell variables are going to be exported to the environment, it’s worth considering that the POSIX (Issue 7, 2018 edition) Environment Variable Definition specifies:

Environment variable names used by the utilities in the Shell and Utilities volume of POSIX.1-2017 consist solely of uppercase letters, digits, and the underscore ( _ ) from the characters defined in Portable Character Set and do not begin with a digit.

...

The name space of environment variable names containing lowercase letters is reserved for applications. Applications can define any environment variables with names from this name space without modifying the behavior of the standard utilities.
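To make the quoted rule concrete, here is a rough sketch of a checker for the naming pattern it describes (uppercase letters, digits, and underscore, not starting with a digit). `is_utility_style_name` is a hypothetical helper written for this illustration, not part of POSIX or any standard tool. The character classes are spelled out rather than written as ranges so the result does not depend on locale collation:

```shell
#!/usr/bin/env bash
# is_utility_style_name is a hypothetical helper, not part of any standard tool.
# Returns 0 when the name uses only uppercase letters, digits, and underscores
# and does not start with a digit; returns 1 otherwise.
is_utility_style_name() {
  case "$1" in
    '' | [0123456789]*) return 1 ;;
    *[!ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_]*) return 1 ;;
    *) return 0 ;;
  esac
}

for name in JAVA_HOME PATH year old_IFS 2FAST; do
  if is_utility_style_name "$name"; then
    echo "$name: matches the style used by the standard utilities"
  else
    echo "$name: outside that style"
  fi
done
```

Names such as `year` and `old_IFS` fall outside the pattern because they contain lowercase letters, which is the name space the quoted text leaves to applications.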

Anthony Geoghegan
  • 1
    This applies even when one does _not_ intend to export a variable to the environment, because if _someone else_ exported a variable with the same name, then any updates to a like-named shell variable will be mirrored to the environment automatically. – Charles Duffy Dec 07 '22 at 21:52
  • The "Shell and utilities" of Posix is a very restrictive set, that no real Unix is limited to. The idea that there should be an artificial boundary between a particular set of utilities, and the wider range of utilities and applications in the universe, is a silly distinction. – xpusostomos Apr 28 '23 at 04:18
6

I do what you do. I doubt there's an authoritative source, but it seems a fairly widespread de-facto standard.

Draemon
  • 1
    I agree. It's because ALL_CAPS is ugly, but it's good to make ENVIRONMENT VARIABLES stand out by being ugly. – slim Mar 23 '09 at 11:46
  • 3
    I agree with you on coding style, but I definitely disagree that it's widespread! Shell scripts are one of those side languages that people just learn informally, and so I feel like everybody is always saying LOCATION=`cat /tmp/location.txt` – JasonSmith Mar 23 '09 at 16:38
  • @jhs - I've obviously been lucky in the shell scripts I've had to work with! – Draemon Mar 23 '09 at 17:29
  • 5
    *"The name space of environment variable names containing lowercase letters is reserved for applications."* -- [POSIX IEEE Std 1003.1-2008 section 8.1](http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/basedefs/V1_chap08.html) – tripleee Dec 19 '17 at 18:57
5

Actually, the term "environment variables" seems to be of fairly recent coinage. Kernighan and Pike in their classic book "The UNIX Programming Environment", published in 1984, speak only of "shell variables" - there is not even an entry for "environment" in the index!

  • 13
    I think that is an omission of the book. getenv(), setenv() and environ were introduced in UNIX version 7 (1979). http://en.wikipedia.org/wiki/Version_7_Unix – Juliano Mar 23 '09 at 16:05
  • 4
    That book looks to note that upper case variables do have special meaning. – ashawley Mar 23 '09 at 17:25
  • 1
    The functions in UNIX 7th Edition were `getenv()` and `putenv()`; `setenv()` and `unsetenv()` are more recent additions. – Jonathan Leffler Dec 22 '20 at 17:40
1

It's just a very widely held convention, I doubt there's any "authoritative" source for it.

Alnitak
  • 1
    There are authoritative sources quoted in other answers here, including in the POSIX specification (which was published long before this "answer"). – Charles Duffy Dec 07 '22 at 21:53
1

I tend to use ALL_CAPS for both environment and global variables. Of course, in Bash there's no real variable scope, so a good portion of variables are used as globals (mostly settings and state tracking), with relatively few 'locals' (counters, iterators, partly constructed strings, and temporaries).

Javier
  • Yes, I kind of conceptually think of non-exported variables as locals, since Bash is so often forking child processes to do whatever it's tasked with doing. – JasonSmith Mar 23 '09 at 12:19
0

Bash, and most shell script interpreters, distinguish global variables from variables that are local to a function (e.g. declared with typeset, declare, or local), and these should be used as appropriate. As previously commented, "Environment variable names used by the utilities in the Shell and Utilities volume of POSIX.1-2017 consist solely of uppercase letters, digits, and the underscore ( _ ) from the characters defined in Portable Character Set and do not begin with a digit. ... The name space of environment variable names containing lowercase letters is reserved for applications. Applications can define any environment variables with names from this name space without modifying the behavior of the standard utilities." (POSIX IEEE Std 1003.1-2008 section 8.1 )
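A small sketch of the global/local distinction, using made-up names: a variable declared with `local` inside a function shadows the global of the same name only for the duration of that function, while an undeclared assignment updates the global.

```shell
#!/usr/bin/env bash
# Names here are hypothetical and only illustrate function-local scope.
counter=0

bump_local_copy() {
  local counter=10          # shadows the global inside this function only
  counter=$((counter + 1))
}

bump_global() {
  counter=$((counter + 1))  # no declaration, so this updates the global
}

bump_local_copy
after_local=$counter        # the global is untouched
bump_global
after_global=$counter       # the global has been incremented

echo "after bump_local_copy: $after_local"
echo "after bump_global: $after_global"
```

This should print `0` for the first line and `1` for the second.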

Brian S. Wilson
0

Let's be clear on our terminology. Environment variables are those variables set by the Bash environment (e.g. ${0}, ${SHELL}, ${SECONDS}, etc.) and which do not need to be set by the user. User Variables (and Constants) are set by the user either in their .bash_profile, .bashrc, or in a particular script file. User variables can be exported to the environment to become Environment variables; however, unless exported, the scope of a User variable is limited to the current interpreter execution (either the shell environment or the executing shell script environment, i.e. it will not be passed to any child). If an Environment variable is unset, or reset, it will usually lose any special meaning or value.

In my 30+ years writing shell scripts, doing Build and Release and some System Administration, I've seen all of the aforementioned variable styles. Unix allows variable names composed of the majuscule and minuscule characters or any mix of the two sets; Linux adopted this same abomination for some unknown reason, probably portability. POSIX strongly encourages the use of the majuscule character set, as do almost all texts on Bash programming. My conclusion is that this is a convention that is widely adopted and used, but is not strictly required, and you are free to make any poor choice you wish.

That said, there are some conventions that are used because of their utility and because they help programmers efficiently and effectively develop useful and maintainable code. When I write bash code:

  1. I use majuscule characters and the '_' characters for all variable and constant names.

  2. I typeset (AKA define) and initialize all variables (and constants) that are local to scripts and functions, and specify the variable type (integer, read only, exported, array, hash, etc.). No, not everything needs to be global in Bash.

  3. I use '{' and '}' characters around all variables (whether syntactically required or not) to avoid unintentional naming errors, which I have seen in practice, and to make the Variable/Constant stand out.

  4. I always use "#!/usr/bin/env bash" now, and previously always used "#!/usr/bin/bash" on systems where "/usr/bin/env" was not available.

  5. I use "shopt -s extglob # Turn on extended global expressions" in my scripts because this is great to have when I'm doing regular expressions and pattern matching.

  6. I always use "set -o pipefail -o nounset -o errtrace -o functrace" to avoid issues with pipes failing in the middle, fat fingering variable names, and ease of tracing errors and functions. I know of others that often use " shopt -s inherit_errexit nullglob compat" and I can see the utility of these options as well.

  7. All error messages I print out follow a pattern that will let the programmer know where in the code the error was found and reported. echo -e "ERROR [${LINENO}] in ${FUNCNAME[*]}: ..." 1>&2
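As a rough sketch, the habits listed above could be combined as follows. The variable and function names are hypothetical, the check itself is invented for the demo, and only options already named in the list are used:

```shell
#!/usr/bin/env bash
# Hypothetical names throughout; a sketch of the numbered habits above.
set -o pipefail -o nounset -o errtrace -o functrace
shopt -s extglob                  # enables patterns such as +(...)

declare -r MAX_NAME_LENGTH=8      # read-only constant
declare -i CHECK_COUNT=0          # integer
declare -a GOOD_NAMES=("none")    # indexed array

function checkName {
  declare -r NAME="${1}"          # declare inside a function makes it local
  CHECK_COUNT+=1                  # arithmetic add, because of the -i attribute
  if [[ "${NAME}" != +([A-Za-z0-9_]) || "${#NAME}" -gt "${MAX_NAME_LENGTH}" ]]; then
    echo -e "ERROR [${LINENO}] in ${FUNCNAME[*]}: bad name '${NAME}'" 1>&2
    return 1
  fi
  GOOD_NAMES+=("${NAME}")
}

checkName "job_1"
checkName "not valid" || echo "rejected, ${CHECK_COUNT} checks so far"
```

The second call should write an `ERROR [...]` line to standard error and then print `rejected, 2 checks so far`. With `nounset` active, a mistyped name such as `${CHECK_COUNTT}` would abort the script instead of silently expanding to an empty string.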

Consistently using widely accepted conventions and good programming practices can significantly reduce debug time and make your code easily portable and maintainable. For example, Bash doesn't require defining and initializing variables, but doing so can prevent the use of uninitialized values, lets users write better code, and helps detect misspelled variable names.

Having worked on code that uses all minuscule characters for variables and constants, my experience is that this practice makes it very difficult to clearly see where the variable is being used, and makes it very easy to make mistakes.

I use camel case naming in function names (personal preference, not convention). This makes it clear that I am calling a local function which I've created or sourced into the environment.

Lastly, I recommend using the "source" command, in place of the older '.' character when sourcing in code from another file. If nothing else, finding all the places where I'm sourcing something is much easier with this option.

There are a lot of skills I've learned in my career, far more than are relevant to this topic (yes, I've wandered far afield), but Bash is an incredibly useful and ubiquitous programming tool on *nix systems. Learning to write clear and maintainable code by following the common conventions is a mark of professional growth.

Brian S. Wilson
  • `extglob` is not regular expressions. – Paul Hodges Aug 10 '23 at 14:48
  • I didn't say extglob is a regular expression. I said "...this is great to have when I'm doing regular expressions." Turning this option on allows extended regular expressions which are very useful (See also: https://www.linuxjournal.com/content/bash-extended-globbing) – Brian S. Wilson Aug 25 '23 at 14:14
  • Sorry, I have no idea how that changes my point. I apologize if I'm being dense, I do that a lot. I was just making sure readers draw a line - extended globbing gets you a LOT closer to regex capabilities, and are designed to be as similar as possible, but they still aren't quite the same. Just being pedantic for clarity, in case someone missed that detail. – Paul Hodges Aug 25 '23 at 16:53