153

If I have the following values:

 $var1 = 'AR3,373.31';

 $var2 = '12.322,11T';

How can I create a new variable and set it to a copy of the data that has any non-numeric characters removed, with the exception of commas and periods? The values above would return the following results:

 $var1_copy = '3,373.31';

 $var2_copy = '12.322,11';
Makyen
user485783
  • For researchers seeking to remove all non-numeric characters from a string (including separators), see [String Sanitization: How to remove all non-numeric characters from a string?](https://stackoverflow.com/q/6936402/2943403) (circa August 2011) – mickmackusa Mar 31 '21 at 22:04

5 Answers

346

You could use preg_replace to strip out every character that is not a digit, a comma, or a period/full stop, as follows:

$testString = '12.322,11T';
echo preg_replace('/[^0-9,.]+/', '', $testString);

The pattern can also be expressed as /[^\d,.]+/
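As a minimal sketch, the same pattern can be wrapped in a small helper and applied to both values from the question. The function name `keepDigitsAndSeparators` is hypothetical and only used for illustration; the pattern itself is the one shown above.

```php
<?php
// Hypothetical helper name; the pattern is the one from the answer above.
function keepDigitsAndSeparators(string $value): string
{
    // Inside [...] the dot is a literal period, not "any character",
    // so it does not need a backslash here.
    return (string) preg_replace('/[^0-9,.]+/', '', $value);
}

$var1_copy = keepDigitsAndSeparators('AR3,373.31');
$var2_copy = keepDigitsAndSeparators('12.322,11T');

echo $var1_copy, PHP_EOL; // 3,373.31
echo $var2_copy, PHP_EOL; // 12.322,11
```

Note that this only strips characters; it does not check that the result is a well-formed number.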

mickmackusa
John Parker
  • 6
    see also the PHP function money_format() ( http://php.net/manual/en/function.money-format.php ) – horatio Feb 09 '11 at 19:28
  • 5
    [^0-9] will match any non-numeric character, so it's not necessary to add the comma and full stop. This is sufficient: echo preg_replace('/[^0-9]/','',$testString); – billrichards Mar 18 '14 at 16:02
  • 7
    @billrichards I don't think that's correct. Remember he wants to retain the comma and full stop, along with the numeric characters, not remove them. – Richt222 Mar 31 '14 at 15:38
  • 2
    @billrichards As the OP stated (emphasis mine), "avoid alphabets or characters ***except comma and dot***". – John Parker Mar 31 '14 at 15:42
  • doesn't that period need a backslash? period means "any character" doesn't it? – Scott Sep 12 '16 at 16:19
  • To clarify what @billrichards wrote, `preg_replace( '/[^0-9]/', '' , $testString );` will remove all non-numeric characters, while the accepted answer will remove all non-number characters, but will also leave all commas and periods in place. – jg314 Jan 17 '17 at 21:57
  • @Scott, you don't have to escape metacharacters inside a character class. See http://www.regular-expressions.info/charclass.html – QuickDanger May 25 '17 at 15:27
68

I'm surprised there's been no mention of filter_var here, given how old this question is...

PHP has a built-in way of doing this using sanitization filters. Specifically, the one to use in this situation is FILTER_SANITIZE_NUMBER_FLOAT with the FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_THOUSAND flags. Like so:

$numeric_filtered = filter_var("AR3,373.31", FILTER_SANITIZE_NUMBER_FLOAT,
    FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_THOUSAND);
echo $numeric_filtered; // Will print "3,373.31"

It might also be worth noting that filter_var can be slightly faster than a regex with PHP's current libraries, although the difference is measured in nanoseconds and rarely matters.
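One behavioural difference from the regex approach is worth a quick sketch: FILTER_SANITIZE_NUMBER_FLOAT also keeps plus and minus signs, whereas a pattern such as `/[^0-9,.]+/` removes them. The helper name `sanitizeWithFilter` and the signed input are assumptions made for illustration.

```php
<?php
// Hypothetical helper name; the filter and flags are the ones from the answer above.
function sanitizeWithFilter(string $value): string
{
    return (string) filter_var(
        $value,
        FILTER_SANITIZE_NUMBER_FLOAT,
        FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_THOUSAND
    );
}

echo sanitizeWithFilter('12.322,11T'), PHP_EOL;               // 12.322,11
echo sanitizeWithFilter('AR-3,373.31'), PHP_EOL;              // -3,373.31 (sign kept)
echo preg_replace('/[^0-9,.]+/', '', 'AR-3,373.31'), PHP_EOL; // 3,373.31 (sign removed)
```

Which behaviour is preferable depends on whether negative amounts can appear in the input.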

Bryan Way
26

Simplest way to truly remove all non-numeric characters:

echo preg_replace('/\D/', '', $string);

\D represents "any character that is not a decimal digit"

http://php.net/manual/en/regexp.reference.escape.php
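A short sketch, using the two values from the question, shows what this pattern does to the separators. The helper name `digitsOnly` is hypothetical.

```php
<?php
// Hypothetical helper name; \D matches anything that is not a decimal digit,
// so commas and periods are removed along with the letters.
function digitsOnly(string $value): string
{
    return (string) preg_replace('/\D/', '', $value);
}

echo digitsOnly('AR3,373.31'), PHP_EOL; // 337331
echo digitsOnly('12.322,11T'), PHP_EOL; // 1232211
```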

mopo922
  • 2
    This will not work for the OP's requirements. This is the correct answer to a different question. – mickmackusa Sep 17 '20 at 22:19
  • @mickmackusa you're right. The question has been heavily edited since I wrote this answer. Still, it appears to be helpful information. – mopo922 Sep 24 '20 at 13:40
  • The requirement to retain commas and dots has been consistent throughout the edit history. I am always disappointed when incorrect answers are highly upvoted because 1. The answers are misinforming researchers or posted on the wrong page and 2. The answerers are receiving "trust points" that they should not be receiving. – mickmackusa Sep 24 '20 at 19:55
  • "The requirement to retain commas and dots has been consistent throughout the edit history" Yes, but it was obscured by the wording of the title until a few months after this answer. – mopo922 Sep 24 '20 at 20:18
  • 3
    Please consider removing this incorrect answer. Incorrect answers detract from correct ones and potentially confuse researchers and waste researchers' time reading inappropriate insights. Here is another example of content that missed being called out: https://stackoverflow.com/a/37500756/2943403 Stack Overflow is a less effective researching tool when the content is provably incorrect. All of these upvotes have been ill-gotten. – mickmackusa Sep 24 '20 at 20:22
  • The question has been edited quite a bit since this answer was posted. People are still finding this useful, so leaving it for now. – mopo922 Mar 31 '21 at 16:16
  • The question has been clear about needing to retain the thousands placeholder and decimal point since the original posting. I am saddened to hear that you think that an answer **which you know is incorrect** should stay on this page. I have posted a comment under the question to link to a page 5 years older than your answer. In doing so, researchers find the same advice without needing incorrect answers on this page. I hope you will see the folly in keeping this inappropriate answer here and show greater care for SO by removing this answer. – mickmackusa Mar 31 '21 at 22:06
5

You could use filter_var to remove all characters except digits, the dot, and the comma.

  • The FILTER_SANITIZE_NUMBER_FLOAT filter removes every character from the string except digits and the plus and minus signs.
  • FILTER_FLAG_ALLOW_FRACTION additionally keeps the fraction separator " . "
  • FILTER_FLAG_ALLOW_THOUSAND additionally keeps the thousands separator " , "

Code

$var1 = '12.322,11T';

echo filter_var($var1, FILTER_SANITIZE_NUMBER_FLOAT, FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_THOUSAND);

Output

12.322,11

To read more, see the PHP manual pages for filter_var() and the sanitize filters.
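To see what each flag contributes, here is a rough sketch that applies the filter to the same input with the flags switched on one at a time. The wrapper name `sanitizeFloat` is hypothetical; the filter and flag constants are PHP's own.

```php
<?php
// Hypothetical wrapper; $flags is passed straight through to filter_var().
function sanitizeFloat(string $value, int $flags = 0): string
{
    return (string) filter_var($value, FILTER_SANITIZE_NUMBER_FLOAT, $flags);
}

$input = '12.322,11T';

echo sanitizeFloat($input), PHP_EOL;                             // 1232211
echo sanitizeFloat($input, FILTER_FLAG_ALLOW_FRACTION), PHP_EOL; // 12.32211
echo sanitizeFloat($input, FILTER_FLAG_ALLOW_THOUSAND), PHP_EOL; // 12322,11
echo sanitizeFloat(
    $input,
    FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_THOUSAND
), PHP_EOL;                                                      // 12.322,11
```

The flags only decide which characters survive; the filter does not interpret "." or "," according to any locale.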

Adeel
1

If the letters are always at the beginning or the end of the string, you can simply use trim; no regex is needed.

$string = trim($string, "a..zA..Z"); // strips leading and trailing letters, both uppercase and lowercase

"AR3,373.31" --> "3,373.31"
"12.322,11T" --> "12.322,11"
"12.322,11"  --> "12.322,11"
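The limits of this approach can be sketched with a couple of extra inputs that are not from the question (they are assumptions chosen for illustration, as is the helper name `trimLetters`): trim stops at the first character that is not in the list, so letters in the middle and non-letter symbols are left alone.

```php
<?php
// Hypothetical helper name; the character list is the one from the answer above.
function trimLetters(string $value): string
{
    return trim($value, 'a..zA..Z');
}

echo trimLetters('AR3,373.31'), PHP_EOL;  // 3,373.31
echo trimLetters('AB12C34D'), PHP_EOL;    // 12C34 (inner letter kept)
echo trimLetters('US$3,373.31'), PHP_EOL; // $3,373.31 (symbol kept)
```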
Andrew
  • Important: This will only remove letters from the string. Other characters, such as spaces, brackets, quotes, etc. are kept inside the result. --> Only use this solution, if you know that your input string only contains letters and numbers, and no other characters! – Philipp Sep 14 '18 at 12:05