118

I need text like "joe ($3,004.50)" to be filtered down to 3004.50, but I am terrible at regex and can't find a suitable solution. Only digits and periods should remain; everything else should be filtered out. I use C# with VS.NET 2008 and .NET Framework 3.5.

Ready Cent

5 Answers

210

This should do it:

string s = "joe ($3,004.50)";
s = Regex.Replace(s, "[^0-9.]", "");
budi
josephj1989
41

The regex is:

[^0-9.]

You can cache the regex:

Regex not_num_period = new Regex("[^0-9.]");

then use:

string result = not_num_period.Replace("joe ($3,004.50)", "");

However, you should keep in mind that some cultures have different conventions for writing monetary amounts, such as: 3.004,50.
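
As a rough illustration of the culture point, here is a minimal sketch that parses an amount once it has already been isolated from the surrounding text. The class and method names (`CultureAmountParser`, `ParseAmount`) are hypothetical. It assumes the caller knows which culture's separators apply, which the regex alone cannot tell you.

```csharp
using System;
using System.Globalization;

public static class CultureAmountParser
{
    // Parses an already-isolated amount such as "3.004,50" or "3,004.50".
    // The format provider decides which character is the decimal separator
    // and which is the group (thousands) separator.
    public static decimal ParseAmount(string amount, IFormatProvider provider)
    {
        // NumberStyles.Number allows group separators and a decimal point.
        return decimal.Parse(amount, NumberStyles.Number, provider);
    }
}
```

For example, `CultureAmountParser.ParseAmount("3.004,50", CultureInfo.GetCultureInfo("de-DE"))` and `CultureAmountParser.ParseAmount("3,004.50", CultureInfo.InvariantCulture)` should both yield 3004.50. Note that a character class for the first case would have to keep commas rather than periods.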

Matthew Flaschen
3

You are dealing with a string, and a string is an IEnumerable<char>, so you can use LINQ:

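var input = "joe ($3,004.50)";
// new string(char[]) also compiles on .NET 3.5; String.Join over an IEnumerable<char> needs .NET 4 or later.
// Note that Char.IsDigit also accepts non-ASCII Unicode digits.
var result = new string(input.Where(c => Char.IsDigit(c) || c == '.').ToArray());

Console.WriteLine(result);   // 3004.50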
w.b
2

For the accepted answer, MatthewGunn raises a valid point: all digits and periods anywhere in the string are condensed together, not just those belonging to the amount. This avoids that:

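string s = "joe.smith ($3,004.50)";
// Capture a run of digits, commas, and periods that starts with a digit
// and does not directly follow a word character, '.', or ','.
Regex r = new Regex(@"(?:^|[^\w.,])(\d[\d,.]+)(?=\W|$)");
Match m = r.Match(s);
string v = null;
if (m.Success) {
  v = m.Groups[1].Value;            // "3,004.50"
  v = Regex.Replace(v, ",", "");    // "3004.50"
}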
mindriot
  • Seems above regex has extra parenthesis. Using `(?:^|[^w.,])(\d[\d,.]+)(?=\W|$)` will also match "h25" in the string "joe.smith25 ($3,004.50)" – Rivka Nov 18 '19 at 18:38
1

The approach of removing offending characters is potentially problematic. What if there's another . in the string somewhere? It won't be removed, though it should!

If you remove everything except digits and periods, the string joe.smith ($3,004.50) becomes the unparseable .3004.50.

IMHO, it is better to match a specific pattern and extract it using a group. A simple approach is to find all contiguous runs of commas, digits, and periods with a regex:

[\d,\.]+

Sample test run:

Pattern understood as:
[\d,\.]+
Enter string to check if matches pattern
>  a2.3 fjdfadfj34  34j3424  2,300 adsfa    
Group 0 match: "2.3"
Group 0 match: "34"
Group 0 match: "34"
Group 0 match: "3424"
Group 0 match: "2,300"

Then, for each match, remove all commas and send the result to the parser. To handle something like 12.323.344, you could add another check that a matching substring contains at most one period.
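
The steps above could be sketched roughly as follows. This is only an illustration: the class and method names (`AmountExtractor`, `Extract`) are hypothetical, and it assumes "," is always a thousands separator and "." the decimal point.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class AmountExtractor
{
    // Contiguous runs of digits, commas, and periods, as in the pattern above.
    private static readonly Regex Candidate = new Regex(@"[\d,.]+");

    public static List<decimal> Extract(string input)
    {
        var results = new List<decimal>();
        foreach (Match m in Candidate.Matches(input))
        {
            // Drop the commas (assumed to be thousands separators).
            string cleaned = m.Value.Replace(",", "");

            // Skip candidates with more than one period, e.g. "12.323.344".
            if (cleaned.IndexOf('.') != cleaned.LastIndexOf('.'))
                continue;

            // TryParse rejects leftovers such as a lone "." from "joe.smith".
            decimal value;
            if (decimal.TryParse(cleaned, NumberStyles.AllowDecimalPoint,
                                 CultureInfo.InvariantCulture, out value))
            {
                results.Add(value);
            }
        }
        return results;
    }
}
```

With this sketch, `AmountExtractor.Extract("joe.smith ($3,004.50)")` should return a single value, 3004.50, because the stray period in "joe.smith" fails to parse and is skipped.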

Matthew Gunn