
This is my input string:

str = '32 -3.723 +98.6 .357 0.86'

And my regex is:

import re
print(re.findall(r'[+-]?\d*\.?\d*', str))

It returns:

['32', '', '-3.723', '', '+98.6', '', '.357', '', '0.86', '']

What I could not understand is why there are all these empty strings in between.

DjaouadNM
Rakesh kumar

1 Answer


what I could not understand is why all these empty strings come in between

All of the elements of your regex are optional, which means the regex can (and does) match the empty string.

[+-]? - ZERO or one matches
\d*   - ZERO or more matches
\.?   - ZERO or one matches
\d*   - ZERO or more matches
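A quick way to check this yourself is sketched below. The names `pattern`, `empty_match` and `space_match` are mine, not from the question.

```python
import re

pattern = r'[+-]?\d*\.?\d*'

# Every element is optional, so the whole pattern matches the empty string.
empty_match = re.fullmatch(pattern, '')
print(empty_match is not None)    # True

# None of the four elements can consume a space, so at a space the
# only thing the pattern can match is the empty string.
space_match = re.match(pattern, ' ')
print(repr(space_match.group()))  # ''
```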

At each position where it attempts a match, the regex consumes as many characters as it can. For example, here

'32 -3.723 +98.6 .357 0.86'
   ^

the longest match is the empty string, because none of the four elements can consume a space.
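To see where each match lands, you can list the match spans with `re.finditer`. This is only an illustrative sketch; the names `text`, `pattern` and `spans` are assumptions of mine.

```python
import re

text = '32 -3.723 +98.6 .357 0.86'
pattern = r'[+-]?\d*\.?\d*'

# Collect (start, end, matched text) for every match, empty ones included.
spans = [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, text)]
for start, end, value in spans:
    print(start, end, repr(value))

# Positions of the empty matches: each one is a space or the end of the string.
empty_positions = [start for start, end, value in spans if value == '']
print(empty_positions)  # [2, 9, 15, 20, 25]
```

Each empty match sits either on one of the four spaces or at the very end of the input, which accounts for the five empty strings in the output.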

There are several ways to work around this. Rather than trying to shoehorn the regex into not matching empty strings, I personally would filter them out post-matching.
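A minimal sketch of both routes follows. The variable names are mine, and the stricter pattern in option 2 is just one possible alternative, not something taken from the question.

```python
import re

text = '32 -3.723 +98.6 .357 0.86'

# Option 1: keep the original pattern and drop the empty matches afterwards.
matches = re.findall(r'[+-]?\d*\.?\d*', text)
numbers = [m for m in matches if m]
print(numbers)  # ['32', '-3.723', '+98.6', '.357', '0.86']

# Option 2: require at least one digit, so the pattern can never match
# the empty string (digits with optional fraction, or a leading-dot fraction).
strict = re.findall(r'[+-]?(?:\d+\.?\d*|\.\d+)', text)
print(strict)   # ['32', '-3.723', '+98.6', '.357', '0.86']
```

Note that filtering only removes empty strings. If the input could contain a lone `+`, `-` or `.`, the original pattern would still return it as a non-empty match, so the stricter pattern is the safer choice in that case.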

NPE
  • Thank you dear NPE for such a great insight. My mind boggled over this while searching for reasons, but failed. BTW, I am also going to filter them out post-matching. – Rakesh kumar Sep 05 '17 at 13:08