33

I have a string that contains a path:

string="toto.titi.12.tata.2.abc.def"

I want to extract only the numbers from this string.

To extract the first number:

tmp="${string#toto.titi.*.}"
num1="${tmp%.tata*}"

To extract the second number:

tmp="${string#toto.titi.*.tata.*.}"
num2="${tmp%.abc.def}"

So extracting each number takes two steps. How can I extract a number in a single step?

Amir
MOHAMED
  • 1
    This question has been sitting around for a while now. If none of the answers provide what you're looking for, then could you update your question to clarify your requirements a little more? – ghoti Jun 03 '16 at 16:58
  • 3
    `echo ${string} | grep -o -E "[0-9]+"` is, I think, the most concise and easiest to understand (almost everyone knows grep). From: https://stackoverflow.com/a/52947167/52074 – Trevor Boyd Smith Oct 28 '20 at 15:51

12 Answers

30

You can use tr to delete all of the non-digit characters, like so:

echo toto.titi.12.tata.2.abc.def | tr -d -c 0-9
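Because the command above deletes every non-digit, the two numbers run together as 122. A minimal sketch of one way to keep them apart, assuming it is acceptable to translate the non-digits to spaces instead of deleting them (the variable names nums, num1 and num2 are only illustrative):

```shell
string="toto.titi.12.tata.2.abc.def"

# -c complements the set (so: every non-digit) and -s squeezes each run of
# translated characters into a single space, which leaves " 12 2 ".
nums=$(printf '%s' "$string" | tr -cs '0-9' ' ')

# Left unquoted on purpose: word splitting drops the padding and
# assigns one number per positional parameter.
set -- $nums
num1=$1
num2=$2

echo "$num1 $num2"
```

Note that set -- overwrites the script's own positional parameters, so this is best done inside a function or after the arguments have been consumed.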
mti2935
  • 3
    The output of this appears to mash all the numbers together, making `122` in your example. How might they be separated? – ghoti Jun 03 '16 at 17:03
  • To store the result in a variable, use `PARAM=$(echo toto.titi.12.tata.2.abc.def | tr -d -c 0-9)`. – Adir Dayan Jul 14 '20 at 15:03
15

To extract all the individual numbers and print one number per line, pipe the input through:

tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed 's/ /\n/g'

Breakdown:

  • Replace all line breaks with spaces: tr '\n' ' '
  • Replace all non-digits with spaces: sed -e 's/[^0-9]/ /g'
  • Remove leading white space: -e 's/^ *//g'
  • Remove trailing white space: -e 's/ *$//g'
  • Squeeze each run of spaces into one space: tr -s ' '
  • Replace the remaining space separators with line breaks: sed 's/ /\n/g'

Example:

echo -e " this 20 is 2sen\nten324ce 2 sort of" | tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed 's/ /\n/g'

Will print out

20
2
324
2
Benjamin W.
cchamberlain
12

Here is a short one:

string="toto.titi.12.tata.2.abc.def"
id=$(echo "$string" | grep -o -E '[0-9]+')

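echo $id   # => output: 12 2

The numbers are printed with a space between them because the unquoted $id is word-split; the variable itself holds one number per line. Hope it helps...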
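Since grep -o prints each match on its own line, a single number can be picked out by line position. A small sketch, assuming a grep that supports -o (GNU and BSD grep both do) and using sed -n only to select the line:

```shell
string="toto.titi.12.tata.2.abc.def"

# grep -o emits one match per line; sed -n 'Np' keeps only line N.
num1=$(printf '%s\n' "$string" | grep -o -E '[0-9]+' | sed -n '1p')
num2=$(printf '%s\n' "$string" | grep -o -E '[0-9]+' | sed -n '2p')

echo "$num1 / $num2"
```

Each assignment is one pipeline, at the cost of spawning grep and sed for every number.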

Adi Azarya
9

Parameter expansion would seem to be the order of the day.

$ string="toto.titi.12.tata.2.abc.def"
$ read num1 num2 <<<${string//[^0-9]/ }
$ echo "$num1 / $num2"
12 / 2

This of course depends on the format of $string. But at least for the example you've provided, it seems to work.

This may be superior to anubhava's awk solution which requires a subshell. I also like chepner's solution, but regular expressions are "heavier" than parameter expansion (though obviously way more precise). (Note that in the expression above, [^0-9] may look like a regex atom, but it is not.)

You can read about this form of parameter expansion in the bash man page. Note that ${string//this/that} (as well as the <<<) is a bashism, and is not compatible with traditional Bourne or POSIX shells.
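When the count of numbers is not known in advance, the same expansion can feed an array instead of fixed variable names. A minimal bash-only sketch, using a made-up input with three numbers (the array name nums is illustrative):

```shell
#!/usr/bin/env bash
string="aa12aa34aa56"

# Replace every non-digit with a space, then let read split the
# result on whitespace into the array "nums".
read -r -a nums <<<"${string//[^0-9]/ }"

echo "count: ${#nums[@]}"
echo "first: ${nums[0]}, last: ${nums[2]}"
```

With only two scalar variables, the trailing numbers would all end up in the last variable; the array keeps each one in its own element.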

ghoti
  • 2
    What exactly do you mean that it depends on the format of `$string`? I can't think of any example that would break it. – PesaThe Aug 09 '18 at 11:18
  • 1
    Heh, this is an old question. :) The only thing I can think of at this point is that if there are additional numbers, say `aa12aa34aa56`, and you only read two variables, the trailing numbers get added to the last variable, separated by spaces. If this was a concern, then a better solution might be to read the string into an array: `read -a nums <<<"${string//[^0-9]/ }"`. – ghoti Aug 09 '18 at 14:28
4

Convert your string to an array like this:

$ str="toto.titi.12.tata.2.abc.def"
$ arr=( ${str//[!0-9]/ } )
$ echo "${arr[@]}"
12 2
Ivan
3

This would be easier to answer if you provided exactly the output you're looking to get. If you mean you want to get just the digits out of the string, and remove everything else, you can do this:

d@AirBox:~$ string="toto.titi.12.tata.2.abc.def"
d@AirBox:~$ echo "${string//[a-z,.]/}"
122

If you clarify a bit I may be able to help more.

drldcsta
  • I updated my question. I want to extract the 12 and then extract the 2, not extract both numbers at the same time. – MOHAMED Jul 26 '13 at 15:45
2

You can also use sed:

echo "toto.titi.12.tata.2.abc.def" | sed 's/[0-9]*//g'

Here, sed replaces

  • any digits (class [0-9])
  • repeated any number of times (*)
  • with nothing (nothing between the second and third /),
  • and g stands for globally.

Output will be (note that this removes the digits and keeps everything else):

toto.titi..tata..abc.def
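The command above deletes the digits rather than keeping them. To go the other way and keep a single number, one option is a capture group. This is only a sketch, assuming a sed that accepts -E for extended regular expressions (GNU and BSD sed both do) and a string that contains at least two numbers:

```shell
string="toto.titi.12.tata.2.abc.def"

# First number: skip leading non-digits, capture the first digit run,
# and discard the rest of the line.
num1=$(printf '%s\n' "$string" | sed -E 's/^[^0-9]*([0-9]+).*$/\1/')

# Second number: skip the first digit run and the non-digits after it,
# then capture the next digit run.
num2=$(printf '%s\n' "$string" | sed -E 's/^[^0-9]*[0-9]+[^0-9]+([0-9]+).*$/\1/')

echo "$num1 / $num2"
```

If the pattern does not match, sed prints the line unchanged, so the result should be validated when the input format is not guaranteed.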
Benjamin W.
jderefinko
1

Use regular expression matching:

string="toto.titi.12.tata.2.abc.def"
[[ $string =~ toto\.titi\.([0-9]+)\.tata\.([0-9]+)\. ]]
# BASH_REMATCH[0] would be "toto.titi.12.tata.2.", the entire match
# Successive elements of the array correspond to the parenthesized
# subexpressions, in left-to-right order. (If there are nested parentheses,
# they are numbered in depth-first order.)
first_number=${BASH_REMATCH[1]}
second_number=${BASH_REMATCH[2]}
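BASH_REMATCH is only meaningful when the match succeeded, so it may be worth testing the exit status of [[ ]] before reading it. A sketch of that check, with the pattern stored in a variable (re is an illustrative name) so that no shell quoting is needed inside the test:

```shell
#!/usr/bin/env bash
string="toto.titi.12.tata.2.abc.def"

# Keeping the regex in a variable avoids quoting problems inside [[ ]].
re='^toto\.titi\.([0-9]+)\.tata\.([0-9]+)\.'

if [[ $string =~ $re ]]; then
    first_number=${BASH_REMATCH[1]}
    second_number=${BASH_REMATCH[2]}
    echo "$first_number / $second_number"
else
    # Report the failure instead of silently using stale values.
    echo "no match: $string" >&2
fi
```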
chepner
1

Using awk:

arr=( $(echo $string | awk -F "." '{print $3, $5}') )
num1=${arr[0]}
num2=${arr[1]}
anubhava
1

Here is yet another way to do this, using cut:

echo $string | cut -d'.' -f3,5 | tr '.' ' '

This gives you the following output: 12 2

Vivek-Ananth
0

Fixing the newline issue on macOS, where the BSD sed does not treat \n in the replacement as a newline:

cat temp.txt | tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed $'s/ /\\\n/g'
Obsidian
0

Assumptions:

  • there is no embedded white space
  • the string of text always has 7 period-delimited strings
  • the string always contains numbers in the 3rd and 5th period-delimited positions

One bash idea that does not require spawning any subprocesses:

$ string="toto.titi.12.tata.2.abc.def"

$ IFS=. read -r x1 x2 num1 x3 num2 rest <<< "${string}"
$ typeset -p num1 num2
declare -- num1="12"
declare -- num2="2"

In a comment, the OP stated that they wish to extract only one number at a time; the same approach can still be used, e.g.:

$ string="toto.titi.12.tata.2.abc.def"

$ IFS=. read -r x1 x2 num1 rest <<< "${string}"
$ typeset -p num1
declare -- num1="12"

$ IFS=. read -r x1 x2 x3 x4 num2 rest <<< "${string}"
$ typeset -p num2
declare -- num2="2"

A variation on anubhava's answer that uses parameter expansion instead of a subprocess call to awk, and still working with the same set of initial assumptions:

$ arr=( ${string//./ } )
$ num1=${arr[2]}
$ num2=${arr[4]}
$ typeset -p num1 num2
declare -- num1="12"
declare -- num2="2"
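For shells without arrays or here-strings, the same field splitting can be approximated with the positional parameters. This is a sketch under the same assumptions listed above; it is not part of the approaches shown so far, and old_ifs is just an illustrative name:

```shell
string="toto.titi.12.tata.2.abc.def"

old_ifs=$IFS    # remember the current field separator
IFS=.           # split on periods only
set -f          # disable globbing while the unquoted expansion is split
set -- $string  # fields become $1, $2, $3, ...
set +f
IFS=$old_ifs    # restore the original separator

num1=$3
num2=$5

echo "$num1 / $num2"
```

Like the read-based version, this spawns no subprocesses, but it replaces the script's positional parameters, so it fits best inside a function.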
markp-fuso