141

I'm trying to do something common enough: Parse user input in a shell script. If the user provided a valid integer, the script does one thing, and if not valid, it does something else. Trouble is, I haven't found an easy (and reasonably elegant) way of doing this - I don't want to have to pick it apart char by char.

I know this must be easy but I don't know how. I could do it in a dozen languages, but not BASH!

In my research I found this:

Regular expression to test whether a string consists of a valid real number in base 10

And there's an answer therein that talks about regex, but as far as I know, that's a function available in C (among others). Still, it had what looked like a great answer, so I tried it with grep, but grep didn't know what to do with it. I tried -P, which on my box means to treat it as a Perl regexp: nada. -E didn't work either, and neither did -F.

Just to be clear, I'm trying something like this, looking for any output - from there, I'll hack up the script to take advantage of whatever I get. (IOW, I was expecting that a non-conforming input returns nothing while a valid line gets repeated.)

snafu=$(echo "$2" | grep -E "/^[-+]?(?:\.[0-9]+|(?:0|[1-9][0-9]*)(?:\.[0-9]*)?)$/")
if [ -z "$snafu" ] ;
then
   echo "Not an integer - nothing back from the grep"
else
   echo "Integer."
fi

Would someone please illustrate how this is most easily done?

Frankly, this is a short-coming of TEST, in my opinion. It should have a flag like this

if [ -I "string" ] ;
then
   echo "String is a valid integer."
else
   echo "String is not a valid integer."
fi
codeforester
Richard T
  • 4
    FYI: `[` is old compatible `test`; `[[` is Bash's new thing, with more operations and different quoting rules. If you've already decided to stick with Bash, go for `[[` (it's really much nicer); if you need portability to other shells, avoid `[[` completely. – ephemient Feb 05 '10 at 21:29
  • 3
    http://mywiki.wooledge.org/BashFAQ/054 – hakre Aug 01 '14 at 09:18

11 Answers

210
[[ $var =~ ^-?[0-9]+$ ]]
  • The ^ anchors the match at the beginning of the input string
  • The - is a literal "-"
  • The ? means "0 or 1 of the preceding (-)"
  • The + means "1 or more of the preceding ([0-9])"
  • The $ anchors the match at the end of the input string

So the regex matches an optional - (for the case of negative numbers), followed by one or more decimal digits.
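
As a minimal sketch of how this test might be wrapped for reuse: the function name is_integer and the sample inputs are my own assumptions, not part of the answer, and the pattern also accepts a leading + as suggested in the comments. It needs Bash 3.0 or later.

```bash
#!/usr/bin/env bash
# Hypothetical helper; the name is_integer is an assumption.
is_integer() {
    # Keeping the regex in a variable avoids quoting differences between
    # Bash versions; it must stay unquoted on the right-hand side of =~.
    local re='^[-+]?[0-9]+$'
    [[ $1 =~ $re ]]
}

# Try a few sample inputs, including an empty string.
for candidate in 42 -7 +30 4.2 12a ''; do
    if is_integer "$candidate"; then
        printf '%-6s integer\n' "'$candidate'"
    else
        printf '%-6s not an integer\n' "'$candidate'"
    fi
done
```

The function returns the exit status of the [[ ]] test, so it can be used directly in an if or with && and ||.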

Ian
Ignacio Vazquez-Abrams
  • 3
    Thanks Ignacio, I'll try it in a second. Would you mind explaining it so I can learn a little? I gather it reads, "At the start of the string (^), a minus sign (-) is optional (?), followed by any number of characters between zero and 9, inclusive" ... and what then might the +$ mean? Thanks. – Richard T Feb 05 '10 at 21:09
  • 10
    The `+` means "1 or more of the preceding", and the `$` indicates the end of the input pattern. So the regex matches an optional `-` followed by one or more decimal digits. – Ignacio Vazquez-Abrams Feb 05 '10 at 21:14
  • *grumbles re: the ABS link* – Charles Duffy May 07 '18 at 20:11
  • It's a tangent, but note that when specifying character ranges you can get odd results; for example, `[A-z]` would not only give you `A-Z` and `a-z` but also `\ `, `[`, `]`, `^`, `_`, and `\``. – Doktor J Jul 09 '18 at 18:13
  • Additionally, based on character collation ([see this related question/answer](https://stackoverflow.com/questions/35042202/regex-collating-symbols#answer-35760191)) something like `d[g-i]{2}` could end up not only matching `dig` but also `dish` in the collation suggested by that answer (where the `sh` digraph is considered a single character, collated after `h`). – Doktor J Jul 09 '18 at 18:16
  • Correction: `[[ $var =~ ^[-+]?[0-9]+$ ]]` – Andi Oct 19 '20 at 21:33
77

Wow... there are so many good solutions here!! Of all the solutions above, I agree with @nortally that using the -eq one liner is the coolest.

I am running GNU bash, version 4.1.5 (Debian). I have also checked this on ksh (SunOS 5.10).

Here is my version of checking if $1 is an integer or not:

if [ "$1" -eq "$1" ] 2>/dev/null
then
    echo "$1 is an integer !!"
else
    echo "ERROR: first parameter must be an integer."
    echo "$USAGE"
    exit 1
fi

This approach also accepts negative numbers, which some of the other solutions wrongly reject, and it allows a leading "+" (e.g. +30), which is obviously still an integer.

Results:

$ int_check.sh 123
123 is an integer !!

$ int_check.sh 123+
ERROR: first parameter must be an integer.

$ int_check.sh -123
-123 is an integer !!

$ int_check.sh +30
+30 is an integer !!

$ int_check.sh -123c
ERROR: first parameter must be an integer.

$ int_check.sh 123c
ERROR: first parameter must be an integer.

$ int_check.sh c123
ERROR: first parameter must be an integer.

The solution provided by Ignacio Vazquez-Abrams was also very neat (if you like regex) after it was explained. However, it does not handle positive numbers with the + prefix, but it can easily be fixed as below:

[[ $var =~ ^[-+]?[0-9]+$ ]]
Olivia Stork
Peter Ho
  • Nice! Pretty similar to [this](http://stackoverflow.com/a/2212504/2235132), though. – devnull Oct 01 '13 at 13:47
  • Yes. It is similar. However, I was looking for a one liner solution for the "if" statement. I thought that I don't really need to call a function for this. Also, I can see that the redirection of the stderr to stdout in the function. When I tried, the stderr message "integer expression expected" was displayed which was not desirable for me. – Peter Ho Oct 01 '13 at 14:33
  • @PeterHo Avoiding regular expressions where they are not necessary is always a good idea, because regular expressions are in most cases expensive. This solution can be used as a one-liner with an or clause `test || die invalid`. – ceving Sep 23 '15 at 12:14
  • 3
    There's a notable distinction between your solution and the regex one: the size of the integer is checked against Bash's limits (on my computer it's 64 bits). This limit does not affect the regexp solution. So your solution will fail on numbers strictly greater than 9223372036854775807 on 64-bit computers. – vaab Nov 06 '15 at 02:52
  • Doesn't really work with `-ne`. I don't think it will work with `-le`, `-ge` etc. either. – ADTC Nov 30 '15 at 04:49
  • 5
    As I recently discovered, there are [some caveats](http://stackoverflow.com/a/808740/1858225). – Kyle Strand Jul 19 '16 at 21:52
  • int_check.sh A returns it as integer ??? ./test2.sh a a is an integer !! – Nrj Apr 05 '17 at 15:39
  • This is an elegant solution. Works nicely, as long as I strip decimals. – Klaatu von Schlacker May 21 '17 at 06:37
  • Gives a false positive for `int_check.sh '1 '`. There are commands which won't handle the trailing space, for instance `seq '1 '`. – Socowi Jul 11 '18 at 19:03
  • I was looking to trap volume level from `0` to `100` and this answer is a good starting point. – WinEunuuchs2Unix Dec 31 '20 at 23:16
  • Depends on what you are looking for. Technically "-100" or "+100" are not integers as the chars "-" and "+" are not integers. – James Jun 04 '23 at 15:35
45

Latecomer to the party here. I'm extremely surprised none of the answers mention the simplest, fastest, most portable solution; the case statement.

case ${variable#[-+]} in
  *[!0-9]* | '') echo Not a number ;;
  * ) echo Valid number ;;
esac

The trimming of any sign before the comparison feels like a bit of a hack, but that makes the expression for the case statement so much simpler.
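
A minimal sketch of the same pattern wrapped in a function, so that it can be used in an if. The name is_integer and the use of the first script argument as input are assumptions for illustration; nothing here goes beyond POSIX sh.

```sh
# Hypothetical wrapper (the name is_integer is not from the answer).
is_integer() {
    # Strip one leading sign, then reject empty strings and anything
    # that still contains a non-digit.
    case ${1#[-+]} in
        '' | *[!0-9]*) return 1 ;;
        *)             return 0 ;;
    esac
}

input=${1-}   # first script argument, empty if none was given
if is_integer "$input"; then
    echo "Valid number"
else
    echo "Not a number"
fi
```

Because only one sign character is stripped, inputs such as +-5 or a lone - are still rejected.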

tripleee
  • 5
    I wish I could upvote this once every time I come back to this question because of dupes. It grinds my gears that a simple yet POSIX-compliant solution is buried in the bottom. – Adrian Frühwirth Apr 24 '14 at 09:10
  • 4
    Maybe you should take care of empty strings: ````''|*[!0-9]*)```` – Niklas Peter Dec 13 '15 at 07:43
  • 2
    BTW: Here is this syntax documented: http://tldp.org/LDP/abs/html/string-manipulation.html – Niklas Peter Dec 13 '15 at 07:49
  • I don't particularly condone the ABS; this is obviously also documented in the Bash manual. Anyway, the section you linked to doesn't describe this particular construct, but rather e.g. @Nortally's answer. – tripleee Dec 13 '15 at 08:26
  • @tripleee The linked document describes the construct for removing a string prefix from a variable used in the case line. It is just at the bottom of the page, but there are no anchors, so I could not directly link to it, see section "Substring Removal" – Niklas Peter Dec 13 '15 at 08:47
  • Oh, my bad, I was looking for documentation for the `case` statement or the wildcard syntax it supports. – tripleee Dec 13 '15 at 09:18
  • @NiklasPeter I somehow missed your (obvious!) point before. I have now updated this answer to return "Not a number" for an empty input. Thanks for bringing this up, and sorry for being daft. – tripleee Aug 02 '17 at 06:35
  • For what it's worth, parameter expansions are obviously also documented in the [Bash manual.](https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html) – tripleee May 02 '23 at 05:37
13

I like the solution using the -eq test, because it's basically a one-liner.

My own solution was to use parameter expansion to throw away all the numerals and see if there was anything left. (I'm still using 3.0, haven't used [[ or expr before, but glad to meet them.)

if [ -n "$INPUT_STRING" ] && [ "${INPUT_STRING//[0-9]}" = "" ]; then
  echo "yes, natural number"
else
  echo "no, empty or has non-numeral chars"
fi
JamesThomasMoon
nortally
11

For portability to Bash versions that predate the =~ test (it was introduced in Bash 3.0), use expr.

if expr "$string" : '-\?[0-9]\+$' >/dev/null
then
  echo "String is a valid integer."
else
  echo "String is not a valid integer."
fi

expr STRING : REGEX matches REGEX anchored at the start of STRING, printing the first group (or the length of the match, if there is no group) and returning success or failure. This is basic regex syntax, hence the extra backslashes; note that \? and \+ are GNU extensions to it. -\? means "maybe -", [0-9]\+ means "one or more digits", and $ means "end of string".
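
If the script may run where expr is not the GNU version, a variant restricted to POSIX basic regex syntax could look like the sketch below. The function name is_integer_expr is hypothetical, and the X prefix is a common guard (my addition, not from the answer) so that an input such as - or ( is never taken for an expr operator.

```sh
# Hypothetical helper using only POSIX basic regular expressions.
is_integer_expr() {
    # \{0,1\} is the portable spelling of "zero or one";
    # [0-9][0-9]* is the portable spelling of "one or more digits".
    # expr prints the match length, which is discarded; only the exit
    # status (0 on a non-zero match length) is used.
    expr "X$1" : 'X-\{0,1\}[0-9][0-9]*$' >/dev/null
}

string="-42"
if is_integer_expr "$string"; then
    echo "String is a valid integer."
else
    echo "String is not a valid integer."
fi
```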

Bash also supports extended globs, though I don't recall from which version onwards.

shopt -s extglob
case "$string" in
    @(-|)[0-9]*([0-9]))
        echo "String is a valid integer." ;;
    *)
        echo "String is not a valid integer." ;;
esac

# equivalently, [[ $string = @(-|)[0-9]*([0-9]) ]]

@(-|) means "- or nothing", [0-9] means "digit", and *([0-9]) means "zero or more digits".

ephemient
  • Thank you ephemient, much obliged. I had never seen the =~ syntax before - and still have no idea what it's supposed to mean - approximately equal?! ...I've never been excited to program in BASH but it _is_ necessary some times! – Richard T Feb 05 '10 at 21:18
  • In `awk`, `~` was the "regex match" operator. In Perl (as copied from C), `~` was already used for "bit complement", so they used `=~`. This later notation got copied to several other languages. (Perl 5.10 and Perl 6 like `~~` more, but that has no impact here.) I suppose you could look at it as some sort of approximate equality... – ephemient Feb 05 '10 at 21:21
  • Excellent post AND edit! I really appreciate explaining what it means. I wish I could mark both yours and Ignacio's posts as THE correct answer. -frown- You guys are both great. But as you have double the reputation he does, I'm giving it to Ignacio - hope you understand! -smile- – Richard T Feb 05 '10 at 21:23
  • This also works in shell scripts with `#!/bin/sh` on Ubuntu. – Dohn Joe Jul 17 '23 at 13:43
5

You can strip non-digits and do a comparison. Here's a demo script:

for num in "44" "-44" "44-" "4-4" "a4" "4a" ".4" "4.4" "-4.4" "09"
do
    match=${num//[^[:digit:]]}    # strip non-digits
    match=${match#0*}             # strip one leading zero (not all of them)
    echo -en "$num\t$match\t"
    case $num in
        $match|-$match)    echo "Integer";;
                     *)    echo "Not integer";;
    esac
done

This is what the test output looks like:

44      44      Integer
-44     44      Integer
44-     44      Not integer
4-4     44      Not integer
a4      4       Not integer
4a      4       Not integer
.4      4       Not integer
4.4     44      Not integer
-4.4    44      Not integer
09      9       Not integer
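
If inputs such as 09 should be accepted and then used in arithmetic, one option is to validate first and force base 10 afterwards. The sketch below is Bash-only; the function name to_decimal is hypothetical, and it does not check for values beyond Bash's integer range.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the decimal value of a signed digit string,
# or returns 1 if the input is not made of an optional sign plus digits.
to_decimal() {
    local digits=${1#[-+]} sign=1
    case $digits in
        '' | *[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
    [[ $1 == -* ]] && sign=-1
    # 10# forces base 10, so a leading zero is not read as octal.
    # The sign is applied separately because 10# expects digits only.
    printf '%s\n' "$(( sign * 10#$digits ))"
}

a=$(to_decimal 09) && echo "$(( a + 1 ))"
```

Validating before the arithmetic expansion matters, because arithmetic contexts evaluate whatever text they are given.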
Dennis Williamson
  • Hi Dennis, Thank you for introducing me to the syntax to the right of match= above. I haven't ever noticed that type syntax before. I recognize some of the syntax from tr (a utility I haven't quite mastered, but fumble my way through sometimes); where can I read up on such syntax? (ie, what's this type of thing called?) Thanks. – Richard T Feb 06 '10 at 16:00
  • You can look in the Bash man page in the section called "Parameter Expansion" for information about `${var//string}` and `${var#string}` and in the section called "Pattern Matching" for `[^[:digit:]]` (which is also covered in `man 7 regex`). – Dennis Williamson Feb 06 '10 at 19:33
  • 2
    `match=${match#0*}` does *not* strip leading zeroes, it strips at most one zero. Using expansion this can only be achieved using `extglob` via `match=${match##+(0)}`. – Adrian Frühwirth Apr 24 '14 at 09:15
  • Isn't 9 or 09 an integer ? – Mike Q May 17 '18 at 02:25
  • @MikeQ: `09` is not an integer if you consider an integer to not have leading zeros. The test is whether the input (`09`) equals a sanitized version (`9` - an integer) and it does not. – Dennis Williamson May 17 '18 at 03:46
  • @DennisWilliamson ah I see your point , yours is more strict typing or whatever, just as far as I'm concerned if I can add them as ints then they are ints.... eg $((( $a + $b ))) – Mike Q May 17 '18 at 03:54
  • @MikeQ: `a=09; b=1; echo $((a + b))` gives this error: `bash: 09: value too great for base (error token is "09")` because it thinks it's an octal value because of the leading `0`. – Dennis Williamson May 17 '18 at 17:13
  • @DennisWilliamson I see that now. Regardless, it depends on your needs. Generally I would prefer to handle 09 as an int, so you can use sed to strip the zeros or use `expr 00010 + 0002`, and that works, etc. – Mike Q May 18 '18 at 02:07
  • @MikeQ: You can coerce the number to base 10: `a=09; b=1; echo $((10#$a + b))` – Dennis Williamson May 18 '18 at 02:29
5

Here's yet another take on it (only using the test builtin command and its return code):

function is_int() { test "$1" -eq "$1" 2> /dev/null; }
 
input="-123"
 
if is_int "$input"
then
   echo "Input: ${input}"
   echo "Integer: ${input}"
else
   echo "Not an integer: ${input}"
fi
Benjamin Gruenbaum
hans
  • 1
    It's not necessary to use `$()` with `if`. This works: `if is_int "$input"`. Also, the `$[]` form is deprecated. Use `$(())` instead. Inside either, the dollar sign can be omitted: `echo "Integer: $((input))"` Curly braces aren't necessary anywhere in your script. – Dennis Williamson Sep 05 '13 at 19:13
  • I would have expected this to also handle numbers in Bash's base notation as valid integers (which of course by some definition they are; but it might not agree with yours) but `test` doesn't appear to support this. `[[` does, though. `[[ 16#aa -eq 16#aa ]] && echo integer` prints "integer". – tripleee Jan 13 '17 at 09:33
  • Note that `[[` returns false positives for this method; e.g. `[[ f -eq f ]]` succeeds. So it must use `test` or `[`. – SpinUp __ A Davis May 08 '20 at 21:12
3

For me, the simplest solution was to use the variable inside a (()) expression, like so:

if ((VAR > 0))
then
  echo "$VAR is a positive integer."
fi

Of course, this solution is only valid if a value of zero doesn't make sense for your application. That happened to be true in my case, and this is much simpler than the other solutions.

As pointed out in the comments, this can make you subject to a code execution attack: The (( )) operator evaluates VAR, as stated in the Arithmetic Evaluation section of the bash(1) man page. Therefore, you should not use this technique when the source of the contents of VAR is uncertain (nor should you use ANY other form of variable expansion, of course).
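
One way to keep the "positive integer" check while avoiding evaluation of untrusted text is to reject anything that is not purely digits first, and only then compare with test, which does not evaluate expressions. This is a sketch; the function name is_positive_int is hypothetical.

```sh
# Hypothetical helper: succeeds only for digit strings greater than zero.
is_positive_int() {
    case $1 in
        '' | *[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
    # test treats its operands as decimal integers; values too large
    # for the shell produce an error, which counts as "not valid" here.
    [ "$1" -gt 0 ] 2>/dev/null
}

VAR='a[$(ls)]'
if is_positive_int "$VAR"; then
    echo "$VAR is a positive integer."
else
    echo "Rejected without evaluating it."
fi
```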

Trebor Rude
  • You can even go simpler with `if (( var )); then echo "$var is an int."; fi` – Aaron R. Apr 02 '14 at 22:22
  • 2
    But that will also return true for negative integers, @aaronr, not what the OP was looking for. – Trebor Rude Apr 02 '14 at 22:28
  • 2
    This is dangerous, see: n=1 ; var="n" ; if (( var )); then echo "$var is an int."; fi – jarno Jan 03 '15 at 12:25
  • 2
    This is a very bad idea and subject to arbitrary code execution: try it yourself: `VAR='a[$(ls)]'; if ((VAR > 0)); then echo "$VAR is a positive integer"; fi`. At this point you're glad I didn't enter some evil command instead of `ls`. Because OP mentions _user input_, I do really hope you're not using this with user input in production code! – gniourf_gniourf Jan 05 '15 at 23:36
  • This does not work if the string contains some digits like: `agent007` – brablc Sep 13 '17 at 09:29
1

or with sed:

   test -z "$(echo "2000" | sed 's/[0-9]//g')" && echo "integer" || echo "no integer"
   # integer

   test -z "$(echo "ab12" | sed 's/[0-9]//g')" && echo "integer" || echo "no integer"
   # no integer
knipwim
  • In Bash and some other "Bourne plus" shells you can avoid the command substitution and external command with `test -z "${string//[0-9]/}" && echo "integer" || echo "no integer"` ... though that basically duplicates [Dennis Williamson's answer](https://stackoverflow.com/a/2211068/874188) – tripleee Aug 15 '17 at 04:17
  • Thanks! The only answer which actually works around here! – Evandro Coan Jul 06 '19 at 04:11
  • Silent alternative: `if [[ -n "$(printf "%s" "${2}" | sed s/[0-9]//g)" ]]; then` – Evandro Coan Jul 06 '19 at 04:15
0

Adding to the answer from Ignacio Vazquez-Abrams: this allows a + sign to precede the integer, and it allows any number of zeros after the decimal point. For example, +45.00000000 is considered an integer.
However, $1 must contain a decimal point. 45 is not considered an integer here, but 45.0 is.

if [[ $1 =~ ^-?[0-9]+\.0+$ ]]; then
    echo "yes, this is an integer"
elif [[ $1 =~ ^\+?[0-9]+\.0+$ ]]; then
    echo "yes, this is an integer"
else
    echo "no, this is not an integer"
fi
JustinMT
  • Is there a reason you use two different regular expressions for positive and negative numbers, instead of `^[-+]?[0-9]`...? – tripleee Aug 15 '17 at 04:13
0

For laughs, I quickly worked out a rough set of functions to do this (is_int, is_float, is_strict_string for purely alphabetic strings, and is_string for anything else), though there are more efficient (less code) ways to do it:

#!/bin/bash

function strindex() {
    x="${1%%$2*}"
    if [[ "$x" = "$1" ]] ;then
        true
    else
        if [ "${#x}" -gt 0 ] ;then
            false
        else
            true
        fi
    fi
}

function is_int() {
    if is_empty "${1}" ;then
        false
        return
    fi
    tmp=$(echo "${1}" | sed 's/[^0-9]*//g')
    if [[ $tmp == "${1}" ]] || [[ "-${tmp}" == "${1}" ]] ; then
        #echo "INT (${1}) tmp=$tmp"
        true
    else
        #echo "NOT INT (${1}) tmp=$tmp"
        false
    fi
}

function is_float() {
    if is_empty "${1}" ;then
        false
        return
    fi
    if ! strindex "${1}" "-" ; then
        false
        return
    fi
    tmp=$(echo "${1}" | sed 's/[^a-z. ]*//g')
    if [[ $tmp =~ "." ]] ; then
        #echo "FLOAT  (${1}) tmp=$tmp"
        true
    else
        #echo "NOT FLOAT  (${1}) tmp=$tmp"
        false
    fi
}

function is_strict_string() {
    if is_empty "${1}" ;then
        false
        return
    fi
    if [[ "${1}" =~ ^[A-Za-z]+$ ]]; then
        #echo "STRICT STRING (${1})"
        true
    else
        #echo "NOT STRICT STRING (${1})"
        false
    fi
}

function is_string() {
    if is_empty "${1}" || is_int "${1}" || is_float "${1}" || is_strict_string "${1}" ;then
        false
        return
    fi
    if [ ! -z "${1}" ] ;then
        true
        return
    fi
    false
}
function is_empty() {
    if [ -z "${1// }" ] ;then
        true
    else
        false
    fi
}

Here are some test runs; I defined -44 as an int but not 44-, and so on:

for num in "44" "-44" "44-" "4-4" "a4" "4a" ".4" "4.4" "-4.4" "09" "hello" "h3llo!" "!!" " " "" ; do
    if is_int "$num" ;then
        echo "INT = $num"

    elif is_float "$num" ;then
        echo "FLOAT = $num"

    elif is_string "$num" ; then
        echo "STRING = $num"

    elif is_strict_string "$num" ; then
        echo "STRICT STRING = $num"
    else
        echo "OTHER = $num"
    fi
done

Output:

INT = 44
INT = -44
STRING = 44-
STRING = 4-4
STRING = a4
STRING = 4a
FLOAT = .4
FLOAT = 4.4
FLOAT = -4.4
INT = 09
STRICT STRING = hello
STRING = h3llo!
STRING = !!
OTHER =  
OTHER = 

NOTE: Leading zeros can be interpreted differently in arithmetic (for example as octal), so it is better to strip them if you intend to treat '09' as an int, as I do here (e.g. with expr 09 + 0, or strip them with sed).

Mike Q