import re

sequence = 'i have -0.03 dollars in my hand'

m = re.search('(have )(-\w[.]+)( dollars\w+)',sequence)

print m.group(0)
print m.group(1)
print m.group(2)

I'm looking for a way to extract text between two fixed strings. In this case, the format is 'i have ' followed by a negative float, followed by ' dollars\w+'.

How do I use re.search to extract this float? Why don't the groups work this way? I know there's something I can tweak to get it to work with these groups. Any help would be greatly appreciated.

I thought I could use groups with parentheses, but I got an error.

Eric Postpischil
O.rka

3 Answers


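-\w[.]+ does not match -0.03: \w matches a single word character (the 0), and [.]+ matches one or more literal dots, because . inside [...] is taken literally. That leaves the digits 03 after the dot unmatched.

The \w+ after dollars also prevents the pattern from matching the sequence: there is no word character immediately after dollars, only a space.

Use (-?\d+\.\d+) as the pattern: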

import re

sequence = 'i have -0.03 dollars in my hand'

m = re.search(r'(have )(-?\d+\.\d+)( dollars)', sequence)

print(m.group(1))  # captured groups are numbered from 1
print(m.group(2))
print(m.group(3))

BTW, captured group numbers start from 1; group(0) returns the entire matched string.
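The error mentioned in the question is most likely an AttributeError: re.search returns None when the pattern does not match, so m.group(...) fails. A minimal sketch that guards against this and converts the captured text to a float (the variable name amount is only illustrative):

```python
import re

sequence = 'i have -0.03 dollars in my hand'

# Only the number is captured; the fixed text around it is matched but not grouped.
m = re.search(r'have (-?\d+\.\d+) dollars', sequence)

if m is None:
    # re.search returns None on no match, so calling m.group() here
    # would raise AttributeError.
    amount = None
else:
    # group(1) is the first (and only) capturing group.
    amount = float(m.group(1))

print(amount)
```

With the sample sentence this prints -0.03.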

falsetru

Your regex doesn't match for several reasons:

  • it always requires a - (OK in this case, questionable in general).
  • it requires exactly one character before the . (and since that character is \w, it even allows non-digits like A).
  • it requires one or more dots, but allows no more digits after the dots.
  • it requires one or more alphanumerics immediately after dollars.

So it would match "I have -X.... dollarsFOO in my hand" but not "I have 0.10 dollars in my hand".

Also, there is no use in putting fixed texts into capturing parentheses.

m = re.search(r'\bhave (-?\d+\.\d+) dollars\b', sequence)

would make much more sense.
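To check the claims above, here is a small sketch that runs both patterns against the example strings. The names original, fixed, samples and results are just for illustration:

```python
import re

original = r'(have )(-\w[.]+)( dollars\w+)'   # pattern from the question
fixed = r'\bhave (-?\d+\.\d+) dollars\b'      # pattern suggested above

samples = [
    'I have -X.... dollarsFOO in my hand',
    'I have 0.10 dollars in my hand',
    'i have -0.03 dollars in my hand',
]

# For each sample, record (original matches?, fixed matches?).
results = {}
for text in samples:
    results[text] = (
        re.search(original, text) is not None,
        re.search(fixed, text) is not None,
    )
    print(text, results[text])
```

The original pattern matches only the first string, while the fixed one matches the second and third.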

Tim Pietzcker

This question has already been asked in many formulations before. You're looking for a regular expression that will find a number. Since number formats may include decimals, commas, exponents, plus/minus signs, and leading zeros, you'll need a robust regular expression. Fortunately, this regular expression has already been written for you.

See "How to extract a floating number from a string" and "Regular expression to match numbers with or without commas and decimals in text".
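As a rough illustration of what a more general pattern can look like (this is one possible sketch, not the expression from the linked questions, and it does not handle thousands separators), the following accepts an optional sign, a missing leading digit, and an optional exponent. The names NUMBER and text are assumptions for the example:

```python
import re

# Optional sign, then either digits with an optional fraction or a bare
# fraction such as ".5", then an optional exponent such as "e3" or "E-2".
NUMBER = re.compile(r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?')

text = 'i have -0.03 dollars, 1e3 cents and .5 euros'

found = NUMBER.findall(text)          # matched substrings, in order
values = [float(s) for s in found]    # converted to Python floats

print(found)
print(values)
```

For the sample text this finds '-0.03', '1e3' and '.5'.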

IceArdor