[This is a heavily re-edited version. Please ignore past versions of this question.]
eyquem provided a small Python script that uses a sophisticated regex to identify numbers in a string and sanitize them. Its test results cover over 50 samples, which I won't repeat here.
The question is: can someone adjust that regex, or provide a new one, so that commas are treated more sanely?
In particular, I would like the following four test inputs to produce the associated outputs.
- ' 4,8.3,5 ' -> '4' '8.3' '5'
- ' 44,22,333,888 ' -> '44' '22,333,888' (note that 44,22 is never a single number)
- ' 11,333e22,444 ' -> '11,333e22' '444' (11,333 is accepted in front of e22, but 22,444 is not accepted after it)
- ' 1,999 people found the code "i+=1999;" to be crystal clear in meaning and to likely lead to less than 1999 kilobytes extra memory consumption; however, the gains in 1, 999, and 1999 KB disk space are anything but ideal, especially this being 1999 and us having over $1,999 to work with! ' -> '1,999' '1999' '1999' '1' '999' '1999' '1999' '1,999'
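To make the target concrete, here is a minimal sketch of the comma rule I have in mind. It is not eyquem's regex and has not been run against his 50+ samples. The names `NUMBER`, `find_numbers` and `CASES` are made up for illustration. The assumption is that a comma belongs to a number only when it separates groups of exactly three digits in the integer part, and never in the fraction or the exponent. Signs, leading-dot decimals and the sanitizing step are deliberately left out.

```python
import re

# Hypothetical pattern (an assumption, not eyquem's regex): commas count only
# as thousands separators in the integer part of a number.
NUMBER = re.compile(r"""
    (?<!\d)                            # do not start inside a run of digits
    (?: \d{1,3} (?: ,\d{3} )+ (?!\d)   # 1-3 digits, then comma groups of exactly 3 digits
      | \d+ )                          # or a plain run of digits without commas
    (?: \.\d+ )?                       # optional fractional part
    (?: [eE][+-]?\d+ )?                # optional exponent, no commas allowed
""", re.VERBOSE)


def find_numbers(text):
    """Return every number found in text, in order of appearance."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return NUMBER.findall(text)


# The four inputs from the question, paired with the outputs asked for.
CASES = [
    (' 4,8.3,5 ', ['4', '8.3', '5']),
    (' 44,22,333,888 ', ['44', '22,333,888']),
    (' 11,333e22,444 ', ['11,333e22', '444']),
    (' 1,999 people found the code "i+=1999;" to be crystal clear in meaning '
     'and to likely lead to less than 1999 kilobytes extra memory consumption; '
     'however, the gains in 1, 999, and 1999 KB disk space are anything but '
     'ideal, especially this being 1999 and us having over $1,999 to work with! ',
     ['1,999', '1999', '1999', '1', '999', '1999', '1999', '1,999']),
]

for text, expected in CASES:
    assert find_numbers(text) == expected, (text, find_numbers(text))
```

The comma alternative is tried first and falls back to a plain digit run whenever the grouping is not exact, which is why 44,22 splits but 22,333,888 stays whole. An answer that folds this behaviour into the original regex, keeping its other test results, is what I am really after.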