
Our sanitisation on a salary input replaces all characters other than numerics (0-9) and points (.) with a blank string.

[^0-9.]

This turns a string like "£63,000.00" into "63000.00", which can easily be converted to a float. Now I want to sanitise the case where users have used a point as both the thousands separator and the decimal separator (e.g. "63.000.00"). How would I write a regex that removes all but the LAST point character in a string?
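To make the problem concrete, here is a minimal sketch of the current sanitisation. The question does not name a language, so Python is an assumption, and the function name `strip_non_numeric` is hypothetical. It shows that the existing pattern handles the comma case but leaves both points in place when a point is used as the thousands separator.

```python
import re

# Pattern from the question: anything that is not a digit or a point.
NON_NUMERIC = re.compile(r"[^0-9.]")


def strip_non_numeric(raw: str) -> str:
    """Replace every character other than 0-9 and '.' with an empty string."""
    return NON_NUMERIC.sub("", raw)


print(strip_non_numeric("£63,000.00"))  # comma and currency symbol are removed
print(strip_non_numeric("63.000.00"))   # both points survive, so float() would reject it
```

The second result still contains two points, which is why a plain float conversion fails on it.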

Martin Joiner
  • 1
    regex is the wrong tool for internationalizing numbers. – zzzzBov May 21 '15 at 14:21
  • Are you always guaranteed to have two digits after the decimal? – Mr. Llama May 21 '15 at 14:21
  • No, users input all sorts of funny stuff. But if they put 2 decimals I'm relatively confident that stripping them all but keeping the last one will achieve the desired effect. I can assure you the calculation they get back on the front end will make it quite clear how the number is being interpreted. – Martin Joiner May 21 '15 at 14:23
  • 1
    @zzzzBov is there really a country that legitimately uses the decimal point for BOTH the thousand separator and the decimal separator? I am treating this as a typo handling feature, not internationalisation feature. – Martin Joiner May 21 '15 at 14:26
  • If the input is coming from HTML, you can always [enforce data format there](http://stackoverflow.com/q/14650932/477563). Probably less painful than guessing at their format after the fact. – Mr. Llama May 21 '15 at 14:26
  • 1
    @MartinJoiner, [welcome to i18n](http://en.wikipedia.org/wiki/Decimal_mark#Examples_of_use). – zzzzBov May 21 '15 at 14:27
  • @Mr.Llama Although we leverage many HTML5 features we don't use type="number" because it doesn't support comma-separated thousands. We have found that many users do enter a salary as "50,000" rather than "50000" so restricting their freedom to use commas would needlessly damage user experience. I personally find 4 consecutive zeros very hard to read without the comma. – Martin Joiner May 21 '15 at 17:15
  • @zzzzBov We've dropped support for IE6, we're not going to start supporting an out-dated form of Spanish handwriting. – Martin Joiner May 21 '15 at 17:24
  • @MartinJoiner, you've apparently missed the point of my original statement. If you're looking to internationalize how you handle currency, use a library that has already solved the edge cases for you. A clumsy regex is the wrong tool for the job. – zzzzBov May 21 '15 at 19:20

1 Answer


You can use this regex for the replacement:

\.(?![^.]+$)|[^0-9.]

Replace each match with an empty string.
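As a rough illustration of how the pattern behaves, here is a short sketch. Python is an assumption (the thread does not name a language) and `sanitise_salary` is a hypothetical helper name. The first alternative, `\.(?![^.]+$)`, matches any point that is not followed only by non-point characters up to the end of the string, which is every point except the last one. The second alternative is the original non-numeric filter.

```python
import re

# Regex from the answer: drop every point except the last one,
# plus every character that is not a digit or a point.
KEEP_LAST_POINT = re.compile(r"\.(?![^.]+$)|[^0-9.]")


def sanitise_salary(raw: str) -> str:
    """Return raw with only digits and, at most, its final point kept."""
    return KEEP_LAST_POINT.sub("", raw)


for text in ["£63,000.00", "63.000.00", "1.234.567.89"]:
    cleaned = sanitise_salary(text)
    print(text, "->", cleaned, "->", float(cleaned))
```

One limit to be aware of: an input such as "63.000", where the only point is meant as a thousands separator, keeps that point and parses as 63.0. A trailing point, as in "63000.", is removed, because the lookahead requires at least one character after the point that is kept.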

anubhava