I have a string, read from a file, that contains all kinds of non-ASCII and control characters, like this:
line=^AÀÀ^P^G^P^@^H15552655^@^@E$4c<84>%ÿ~^@^@^Ac<8f>/qu^Q»í&.WÈå
I need to extract just the number '15552655' from it.
What I tried:
line=$(sed -n '1p' < file)
number=$(echo "${line//[!0-9]/}")
or
number=$(echo "$line" | sed 's/[^0-9]*//g')
But both return '155526554', because the stray '4' later in the line (in 'E$4c') is kept as well. So I need a way to extract the substring that is a run of at least 4 consecutive digits (the line is guaranteed to contain a run of at least 4 digits).
Any help is greatly appreciated.
Update 1:
number=$(echo "$line" | sed 's/[^0-9]*\([0-9]\{1,\}\).*$/\1/')
This seems to work for the case above, but it fails if the input looks like this:
line=^AÀÀ^P^4G^P^@^H15552655^@^@E$4c<84>%ÿ~^@^@^Ac<8f>/qu^Q»í&.WÈå
In this case it returns '4', i.e. the first run of digits. I need something that says "give me the longest run of digits" or "give me a run of at least 4 digits".
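To make the two requirements concrete ("a run of at least 4 digits" and "the longest run"), here is a minimal sketch using only tr, grep, head and awk. The function names first_long_run and longest_run are made up for illustration, and LC_ALL=C is an assumption that the data should be treated as raw bytes rather than as text in the current locale.

```shell
# Hypothetical helper: print the first run of at least 4 consecutive digits
# found in the argument.
first_long_run() {
    printf '%s' "$1" |
        LC_ALL=C tr -cs '0-9' '\n' |   # turn every run of non-digits into one newline
        grep -E '^[0-9]{4,}$' |        # keep only the runs that are 4+ digits long
        head -n 1                      # take the first such run
}

# Hypothetical helper: print the longest run of digits in the argument
# (the earliest one wins if several runs share the maximum length).
longest_run() {
    printf '%s' "$1" |
        LC_ALL=C tr -cs '0-9' '\n' |
        awk 'length($0) > length(best) { best = $0 } END { print best }'
}

# Example with a short stray digit before and after the wanted number.
line=$(printf '\001\0204G\020\01015552655E$4c%%~')
number=$(first_long_run "$line")
echo "$number"
```

Since the shell drops NUL bytes when it captures command output into a variable, it may be safer to skip the variable and feed the file straight into the same pipeline, for example: sed -n '1p' file | LC_ALL=C tr -cs '0-9' '\n' | grep -E '^[0-9]{4,}$' | head -n 1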