I need to extract the digits at the end of a string, since they are the indexes of my strings. The indexes may be as large as 2^64, so it is not practical to check only the last character of the string, then the second to last, and so on.
A string may look like asdgaf1_hsg534, i.e. it may contain other digits too, but those are somewhere in the middle and are not adjacent to the index I want to get.
-
Did you try to write something? – Maroun Nov 20 '12 at 11:14
-
Can you post an example input/output? – PearsonArtPhoto Nov 20 '12 at 11:14
-
I have strings as input, and I have to pass the last digits in each string to another function as an argument. So I need to parse these strings and get the last digits. The strings are like the one in the question: "asdgaf1_hsg534", "asdfh23_hsjd12", "dgshg_jhfsd86", etc. For these strings I need to get 534, 12 and 86. – danny Nov 20 '12 at 11:36
-
I just wrote an answer and deleted it. You should really try it yourself; it is good practice for beginners :) – devsnd Nov 20 '12 at 11:49
-
534 is not a digit. It's a number. – alinsoar Nov 20 '12 at 11:51
-
5, 3 and 4 are the last digits in the string – danny Nov 20 '12 at 11:58
-
I suggest you show what code you do have. – Paul Collingwood Nov 20 '12 at 12:09
3 Answers
Here is a method using re.sub:
import re

strings = ['asdgaf1_hsg534', 'asdfh23_hsjd12', 'dgshg_jhfsd86']
for s in strings:
    # Replace the whole string with the captured trailing digits
    print(re.sub(r'.*?([0-9]*)$', r'\1', s))
Output:
534
12
86
Explanation:
The function takes a regular expression, a replacement string, and the string you want to do the replacement on: re.sub(regex, replace, string)
The regex '.*?([0-9]*)$' matches the whole string and captures the number that precedes the end of the string. Parentheses are used to capture the parts of the match we are interested in; \1 refers to the first capture group, \2 to the second, etc.
.*?       # Matches anything (non-greedy)
([0-9]*)  # Followed by zero or more digits (captured)
$         # Followed by the end-of-string anchor
So we are replacing the whole string with just the captured number we are interested in. In Python we should use a raw string for the replacement, r'\1', so that the backslash is not treated as a string escape. If the string doesn't end with digits, an empty string will be returned.
twosixfour = "get_the_numb3r_2_^_64__18446744073709551615"
print(re.sub(r'.*?([0-9]*)$', r'\1', twosixfour))
# Output: 18446744073709551615
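Since the question mentions indexes as large as 2^64, it may help to note that Python integers have arbitrary precision, so the captured text can be passed to int() directly. A minimal sketch, assuming the index is wanted as an integer; the names digits and index are illustrative only:

```python
import re

twosixfour = "get_the_numb3r_2_^_64__18446744073709551615"

# Same substitution as above: keep only the trailing digits.
digits = re.sub(r'.*?([0-9]*)$', r'\1', twosixfour)

# An empty result means the string did not end with digits.
index = int(digits) if digits else None

print(index == 2**64 - 1)
```

Checking for the empty string first avoids the ValueError that int('') would raise.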

A simple regex can detect digits at the end of the string:
'\d+$'
$ matches the end of the string. \d+ matches one or more digits. The + operator is greedy by default, meaning it matches as many digits as possible. So this will match all of the digits at the end of the string.
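One way this pattern might be applied is with re.search, which scans the string for the first position where the pattern matches. This is only a sketch; the helper name trailing_digits is made up for illustration and is not part of the answer:

```python
import re

def trailing_digits(s):
    """Return the run of digits at the end of s, or None if s does not end with a digit."""
    match = re.search(r'\d+$', s)
    return match.group() if match else None

for s in ['asdgaf1_hsg534', 'asdfh23_hsjd12', 'dgshg_jhfsd86', 'test']:
    print(s, '->', trailing_digits(s))
```

Returning None when nothing matches lets the caller tell "no index" apart from a real index.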

-
Thanks for your help. I did it: is_match = re.match(r'(.*)(\D)(\d+)', myString) if is_match: print is_match.group(3) It works – danny Nov 20 '12 at 13:03
If you want to use re.sub and make sure that at least one digit is present at the end of the line, you can use the quantifier + to match 1 or more digits: \d+. That way the line is left unchanged, rather than replaced with an empty string, when it contains no digits or when the digits are not at the end of the line.
^.*?(\d+)$
^      # Start of line
.*?    # Match any char except a newline, as few as possible (non-greedy)
(\d+)  # Capture group 1, match 1+ digits
$      # End of line
Or using a negative lookbehind:
^.*(?<!\d)(\d+)$
^             # Start of line
.*            # Match any char except a newline, as many as possible
(?<!\d)(\d+)  # Assert no digit directly to the left, then capture 1+ digits in group 1
$             # End of line
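A small comparison may show why the lookbehind matters once .* is greedy. This is a sketch only; the variable names are illustrative:

```python
import re

s = 'asdgaf1_hsg534'

# Greedy .* without the lookbehind leaves only the final digit for the group.
greedy_only = re.match(r'^.*(\d+)$', s).group(1)

# The lookbehind forces the group to start at the beginning of the digit run.
with_lookbehind = re.match(r'^.*(?<!\d)(\d+)$', s).group(1)

print(greedy_only)      # 4
print(with_lookbehind)  # 534
```

Without the assertion, .* backtracks just far enough for \d+ to match a single digit.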
When using re.match, you can omit the ^ anchor, and you might also use \A and \Z to assert the start and the end of the string.
import re

strings = ['asdgaf1_hsg534', 'asdfh23_hsjd12', 'dgshg_jhfsd86', 'test']
for s in strings:
    print(re.sub(r".*?(\d+)$", r'\1', s))
Output
534
12
86
test
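The re.match variant mentioned above could look roughly like this. It is a sketch under the assumption that a missing index should be reported as None; TRAILING_DIGITS and trailing_number are hypothetical names:

```python
import re

# re.match only matches at the start of the string, so no leading ^ is needed.
# \Z matches only at the very end of the string.
TRAILING_DIGITS = re.compile(r'.*?(\d+)\Z')

def trailing_number(s):
    """Return the digits at the end of s, or None if s does not end with digits."""
    m = TRAILING_DIGITS.match(s)
    return m.group(1) if m else None

for s in ['asdgaf1_hsg534', 'asdfh23_hsjd12', 'dgshg_jhfsd86', 'test']:
    print(s, '->', trailing_number(s))
```

Unlike the re.sub version, this yields None for 'test' instead of handing the input back unchanged. Because . does not match a newline by default, a multi-line string would need re.DOTALL.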
If there should be a non-digit present before the digits, as in this comment, you could use a negated character class with a single capture group.
^.*[^\d\r\n](\d+)
^           # Start of line
.*          # Match any char except a newline, as many as possible
[^\d\r\n]   # Negated character class, match any char except a digit or a newline
(\d+)       # Capture group 1, match 1+ digits
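A short sketch of how this pattern behaves, including the case where the string consists of digits only. The helper name digits_after_non_digit is invented for the example:

```python
import re

# A non-digit character must directly precede the captured digits.
PATTERN = re.compile(r'^.*[^\d\r\n](\d+)')

def digits_after_non_digit(s):
    """Return the last run of digits that follows a non-digit, or None."""
    m = PATTERN.match(s)
    return m.group(1) if m else None

print(digits_after_non_digit('asdgaf1_hsg534'))  # 534
print(digits_after_non_digit('12345'))           # None: nothing precedes the digits
```

Note that this pattern has no $ anchor, so the captured digits do not have to sit at the very end of the string.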
To get the last digits in the string (not necessarily at the end of the string):
^.*?(\d+)[^\r\n\d]*$
^            # Start of line
.*?          # Match any char except a newline, as few as possible (non-greedy)
(\d+)        # Capture group 1, match 1+ digits
[^\r\n\d]*   # Negated character class, match 0+ times any char except a newline or digit
$            # End of line
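As an illustration, the pattern could be wrapped in a small helper. This is a sketch only; last_digit_run is a hypothetical name:

```python
import re

# Everything after the captured digits must be non-digit characters.
PATTERN = re.compile(r'^.*?(\d+)[^\r\n\d]*$')

def last_digit_run(s):
    """Return the last run of digits anywhere in s, or None if s has no digits."""
    m = PATTERN.match(s)
    return m.group(1) if m else None

print(last_digit_run('asdgaf1_hsg534'))  # 534
print(last_digit_run('abc12_def'))       # 12
print(last_digit_run('no digits here'))  # None
```

Because only non-digits may follow the group, the captured run is always the final one in the line.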
