Bug or feature in Bash test operator [[ ... -eq ... ]]?

Question

Can somebody explain the difference between:

VAR=1xyz && [[ $VAR -eq $VAR ]] 2>/dev/null && echo "Yes, VAR = $VAR is an integer" || echo "No, VAR = $VAR is NOT an integer"
No, VAR = 1xyz is NOT an integer

And:

VAR=xyz1 && [[ $VAR -eq $VAR ]] 2>/dev/null && echo "Yes, VAR = $VAR is an integer" || echo "No, VAR = $VAR is NOT an integer"
Yes, VAR = xyz1 is an integer

Is this a bug or feature in Bash?

If instead of [[ ... ]] I use [ ... ], I am getting the expected result that $VAR is not an integer in both cases.

Can you please elaborate? The only difference is the variable's content, "1xyz" vs. "xyz1", but the exit status of [[ ... -eq ... ]] is DIFFERENT in the two cases! I was expecting NOT an integer in both cases... — user9751447, May 07 '18 at 08:06
@user9751447, if something isn't an integer but *is* a variable name, it gets treated like the name of a variable that might contain an integer. And an empty variable effectively contains the integer 0. Which is to say that `[[ $foo -eq $foo ]]` is not a safe way to check if `foo` is an integer. See [How do I check if a variable is a number in bash?](https://stackoverflow.com/questions/806906/how-do-i-test-if-a-variable-is-a-number-in-bash) for some practices that *do* work. — Charles Duffy, May 07 '18 at 20:07

score 4 · Accepted Answer · edited Jun 20 '20 at 09:12

To understand what's going on here, you need to be clear about two things:

1. The precise meaning of conditional constructs

In most languages, there is some kind of value which can be interpreted as true or false. This might be a boolean datatype, an integer (where 0 is false and everything else is true) or some notion of "truthiness" which is implemented type by type.

But POSIX shells do not have "truthy" and "falsey" values. What they have are commands which might succeed or fail. What "success" and "failure" mean is mostly up to the command itself to determine, but bash itself will classify certain behaviours as failure. For example, if the shell cannot figure out what a command name refers to, it will regard the command as having failed:

$ undefined_command && echo Yes || echo No
undefined_command: command not found
No

Also, if a command is terminated by a signal, such as a Segmentation Fault, the shell will count that as failure:

$ ./segfault && echo Yes || echo No
Segmentation fault (core dumped)
No
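The same behaviour can be reproduced without a crashing program. The following is a minimal sketch, assuming bash on Linux: a child shell kills itself with SIGKILL (signal 9), and bash reports death by signal N as exit status 128+N. The variable name `status` is only for illustration.

```shell
#!/usr/bin/env bash
# The child shell sends SIGKILL (signal 9) to itself, so it dies from a signal.
sh -c 'kill -KILL $$'
status=$?                      # bash reports 128 + signal number
echo "exit status: $status"    # exit status: 137

# A non-zero status is all that && and || look at.
sh -c 'kill -KILL $$' && echo Yes || echo No    # No
```

Bash may also print a "Killed" notice on stderr, just as it printed "Segmentation fault" above.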

But many commands also signal failure even though the error is not fatal. (They do this by setting their status to a non-zero value.) For example, ls returns failure if any of the filename arguments does not exist (even if other ones do):

$ ls no_file exists && echo Yes || echo No
ls: cannot access 'no_file': No such file or directory
-rw-rw-r-- 1 rici rici 0 May  7 13:13 exists
No

As shown, there is usually (though not always) an error message printed to stderr which gives some hint about the cause of the failure. If you want to confuse yourself, you can usually suppress the error message:

$ undefined_command 2>/dev/null && echo Yes || echo No
No
$ ls no_file exists 2>/dev/null && echo Yes || echo No
-rw-rw-r-- 1 rici rici 0 May  7 13:13 exists
No

And that's precisely what you did in the original question. If we don't hide the error message, it becomes more obvious what is going on:

$ VAR=1xyz && [[ $VAR -eq $VAR ]] && echo Yes || echo No
bash: [[: 1xyz: value too great for base (error token is "1xyz")
No
$ VAR=xyz1 && [[ $VAR -eq $VAR ]] && echo Yes || echo No
Yes

In other words, trying to use the string 1xyz as a number (since -eq is numeric equality) produces an error, which is counted as failure. However, the string xyz1 is a valid numeric value. We'll see why that's the case in the next section.

But before we get to that, we need to note that [[ ... ]] is a command (albeit a bash extension), not some exception to the rule that the shell does not have boolean values. Like any other command, [[ can succeed or fail; its documentation indicates that it succeeds if it evaluates its arguments as "true". Although in bash [[ is built into the shell itself (necessarily so, because it requires different argument parsing rules), it is still a command, and it evaluates its arguments itself, just like [.
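As a small illustration (a sketch, assuming bash), `type -t` shows how bash classifies the two commands, and `$?` shows that [[ leaves an ordinary exit status behind like any other command:

```shell
#!/usr/bin/env bash
type -t '[['     # keyword
type -t '['      # builtin

[[ 1 -eq 1 ]]; echo "status: $?"    # status: 0
[[ 1 -eq 2 ]]; echo "status: $?"    # status: 1

# Because [[ ... ]] is a command, it can go anywhere a command can.
if [[ 3 -lt 5 ]]; then echo "3 is less than 5"; fi
```

Bash reports [[ as a keyword rather than a builtin, which is what allows it to parse its arguments differently from [.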

2. The idiosyncrasies of arithmetic evaluation

Arithmetic evaluation takes place in the expansion of $(( ... )) (in any POSIX shell), and in a number of other numeric contexts (in bash and other shells which extend the POSIX standard), including the arithmetic conditional (( ... )) and the arguments to numeric comparison operators inside of [[ ... ]]. In bash, arithmetic evaluation is also used for assignments to variables declared as arithmetic (with declare -i) and for the subscripts to arrays (not associative arrays).

For the purposes of this question, the most important feature of arithmetic evaluation is that an argument can be the name of a shell variable (just the name, without the $). In that case, the value of that variable is converted to an integer, if possible, and used as the argument. Although not required by the POSIX standard, almost all shells will consider an undefined variable or a variable whose value is empty to have the numeric value 0. But if the variable has a non-empty value which cannot be converted to a number, then an error is produced.
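The bare-name rule can be seen directly. This is a minimal sketch, assuming bash; `count` is a hypothetical variable name chosen for the example:

```shell
#!/usr/bin/env bash
unset count
echo $(( count ))        # 0   (unset is treated as 0)

count=
echo $(( count + 1 ))    # 1   (empty is treated as 0)

count=12
echo $(( count * 2 ))    # 24

# A non-empty value that is not a number makes the evaluation fail,
# and the failing [[ ... ]] command takes the || branch.
count=12abc
[[ count -eq 12 ]] || echo "evaluation failed"
```

The last command also prints bash's "value too great for base" message on stderr.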

That's subtly different from the case where the variable name is preceded by a $. If a variable name is preceded by a $, then ordinary parameter substitution will take place as normal, before the arithmetic expression is computed. So, in the case of the second example in the question,

VAR=xyz1 && [[ $VAR -eq $VAR ]] && echo Yes || echo No

the result of parameter expansion will be

[[ xyz1 -eq xyz1 ]]

and since xyz1 is (presumably) not defined, that will be evaluated as though it were comparing 0 with 0, which is true (and therefore the command will succeed). The same result would occur if xyz1 were defined as a numeric string, but not if its value could not be converted to an integer:

$ VAR=xyz1 && xyz1=42 && [[ $VAR -eq $VAR ]] && echo Yes || echo No
Yes
$ VAR=xyz1 && xyz1=42z && [[ $VAR -eq $VAR ]] && echo Yes || echo No
bash: [[: 42z: value too great for base (error token is "42z")
No
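This also accounts for the observation in the question that [ ... ] answers "not an integer" in both cases: [ does not perform arithmetic evaluation on its operands and accepts only literal integers. A short side-by-side sketch, assuming bash and that xyz1 is unset:

```shell
#!/usr/bin/env bash
VAR=xyz1
unset xyz1

# [ wants literal integers, so the string xyz1 is rejected with an error
# ("integer expression expected" on stderr) and the command fails.
[ "$VAR" -eq "$VAR" ] && echo Yes || echo No    # No

# [[ evaluates xyz1 arithmetically: an unset variable, treated as 0.
[[ $VAR -eq $VAR ]] && echo Yes || echo No      # Yes
```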

Bash's numeric evaluation rules are actually quite a bit more complicated (and unsafe if applied to untrusted input). I won't go into all the details, but basically bash will perform arithmetic evaluation on the value of a variable whose name is used as an argument in arithmetic evaluation. In effect, this allows recursive substitution of variable names, but it also allows you to set a variable's value to something more complicated:

$ x=y+7
$ y=35
$ echo $((x))
42
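Since arithmetic evaluation is not a reliable integer test, one common alternative is to match the string against a pattern instead, so that the value is never evaluated at all. The following is only a sketch of that idea, assuming bash; `is_integer` is a hypothetical helper name, not a builtin, and the pattern accepts an optional sign followed by decimal digits:

```shell
#!/usr/bin/env bash
# is_integer: hypothetical helper; succeeds if $1 looks like a decimal integer.
is_integer() {
  [[ $1 =~ ^[+-]?[0-9]+$ ]]
}

for candidate in 42 -7 1xyz xyz1 ''; do
  if is_integer "$candidate"; then
    echo "'$candidate' is an integer"
  else
    echo "'$candidate' is NOT an integer"
  fi
done
```

Note that a value such as 08 passes this pattern but would still be rejected by arithmetic evaluation, which reads a leading 0 as octal.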

Well done rici. Between you and Charles, I knew we would get to the bottom of it. — David C. Rankin, May 07 '18 at 20:35
