42

I have a string that comes from a telephone number input. There is a mask, so it always looks like "(123) 456-7890". I'd like to strip the formatting out before saving it to the DB.

How can I do that?

Gabe
Jerrold
  • BTW, Int.Parse is not part of C# – John Saunders Aug 20 '10 at 16:39
  • Err, then why is it documented? http://msdn.microsoft.com/en-us/library/system.int32.parse.aspx – Nicholas M T Elliott Aug 20 '10 at 16:44
  • 4
    John Saunders is simply wrong. – Timwi Aug 20 '10 at 16:52
  • 3
    @Nicholas @0xA3 the point John is making is that "C#" and "the .NET Framework" are different things; a question about how to extract an integer from a string using `System.Int32.TryParse` is not actually a C# question, as it was originally tagged. – AakashM Aug 20 '10 at 16:53
  • `int`(sugar) is a part of `C#` where `Int32`(class) is a part of .Net. – Lasse Espeholt Aug 20 '10 at 16:56
  • 2
    AakashM and John Saunders are both correct. `int.Parse` is part of the BCL, not C#. – Matt Greer Aug 20 '10 at 16:57
  • 2
    @AakashM, @Matt Greer: Surely true, C# is not the BCL. But why the remark? The OP didn't claim so, he is simply asking about a C# solution making use of the BCL. So what? – Dirk Vollmar Aug 20 '10 at 17:18
  • 4
    @0xA3: et. al. C# and .NET are not the same thing. As professionals, we should attempt to be clear on distinctions. This is no more a C# question than it is a VB.NET or F# question. – John Saunders Aug 20 '10 at 17:25
  • But, tagging it with c# surely got a lot more people to take a look at it. – Matt Greer Aug 20 '10 at 17:35
  • 4
    @John - I don't think the OP would appreciate an F# solution. Adding the .net tag is still fine, though. The question is both a `[c#]` and `[.net]` question. – Jon B Aug 20 '10 at 18:57
  • @Jon: why not an F# solution? Will it not be just about identical to the "C#" solution? – John Saunders Aug 20 '10 at 19:22
  • @John - maybe, maybe not. Some of the proposed solutions are more code intensive than others, and would require a certain amount of language familiarity for the OP to translate. I don't know F# myself, but I bet there are some pretty clever solutions in F# that wouldn't translate easily. – Jon B Aug 20 '10 at 19:34

14 Answers

96

One possibility using linq is:

string justDigits = new string(s.Where(c => char.IsDigit(c)).ToArray());

Adding the cleaner/shorter version thanks to craigmoliver

string justDigits = new string(s.Where(char.IsDigit).ToArray());
Matt Greer
  • Very bad answer. This will only remove the first `(` in the OP’s example. – Timwi Aug 20 '10 at 16:45
  • 1
    pair programming over stackoverflow :) – Matt Greer Aug 20 '10 at 16:49
  • God - i love linq. I think ill keep it as a string - the posts under me have me worried about overflows yours did not quite work, so I ended up going with - var number = new string(numberString.Where(c => char.IsNumber(c)).ToArray()); – Jerrold Aug 20 '10 at 16:53
  • I'd be interested to see whether the Linq solution or the Regex solution has better performance characteristics. :-/ – Toby Aug 20 '10 at 16:57
  • 1
    @Toby: The Linq solution seems about four times faster in my benchmark, but even much faster is the for-loop as in the answer of EKS (five times faster than Linq). – Dirk Vollmar Aug 20 '10 at 17:13
  • 3
    +1: Clear and elegant. It's even clearer than the RegEx answer (IMO) because it spells out IsDigit rather than depending on the terse \d. (Ah, LINQ. I'm still trudging along with .NET 2.0 as you can guess by the snippet in my answer.) – Paul Sasik Aug 20 '10 at 17:15
  • Love the elegance in this solution! – Mark LeMoine Aug 20 '10 at 18:56
  • 4
    it can be rewritten as: new string(s.Where(char.IsDigit).ToArray()) – craigmoliver Nov 11 '12 at 22:49
41

You can use a regular expression to remove all non-digit characters:

string phoneNumber = "(123) 456-7890";
phoneNumber = Regex.Replace(phoneNumber, @"[^\d]", "");

Then further on - depending on your requirements - you can either store the number as a string or as an integer. To convert the number to an integer type you will have the following options:

// throws if phoneNumber is null or cannot be parsed
long number = Int64.Parse(phoneNumber, NumberStyles.Integer, CultureInfo.InvariantCulture);

// like Int64.Parse (but uses the current culture), and returns 0 if phoneNumber is null
number = Convert.ToInt64(phoneNumber);

// does not throw; returns true on success and false if the string cannot be parsed
if (Int64.TryParse(phoneNumber, NumberStyles.Integer, 
       CultureInfo.InvariantCulture, out number))
{
    // parse was successful
}
Dirk Vollmar
  • 8
    `[^\d]` should be the same as just `\D`? – Lasse Espeholt Aug 20 '10 at 16:46
  • 3
    Yes, it’s the same. It makes no difference, except that `[^\d]` is more explicit and requires less learning. – Timwi Aug 20 '10 at 16:48
  • 4
    You have to learn about `[^]`. I'd argue `\D` requires less learning. – recursive Aug 20 '10 at 16:51
  • On saving a phone number as an integer: I'm pretty sure this will work with phone numbers as I don't think any area codes start with leading zeros. With other strings, saving "001234567" as an integer would be 1234567 which may not be what you really want. – RandyB Oct 16 '21 at 18:28
14

Since nobody has posted a for loop yet:

long GetPhoneNumber(string PhoneNumberText)
{
    // Returns 0 on error

    StringBuilder TempPhoneNumber = new StringBuilder(PhoneNumberText.Length);
    for (int i=0;i<PhoneNumberText.Length;i++)
    {
        if (!char.IsDigit(PhoneNumberText[i]))
            continue;

        TempPhoneNumber.Append(PhoneNumberText[i]);
    }

    PhoneNumberText = TempPhoneNumber.ToString();
    if (PhoneNumberText.Length == 0)
        return 0;// No point trying to parse nothing

    long PhoneNumber = 0;
    if(!long.TryParse(PhoneNumberText,out PhoneNumber))
        return 0; // Failed to parse string

    return PhoneNumber;
}

used like this:

long phoneNumber = GetPhoneNumber("(123) 456-7890"); 


Update
As per the comments, many countries do have zeros at the beginning of the number. If you need to support that, you have to return a string, not a long. To change my code to do that:

1) Change the function's return type from long to string.
2) Make the function return null instead of 0 on error.
3) On a successful parse, make it return PhoneNumberText.
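
The three changes above could be sketched roughly as follows. This is only an illustration under assumptions: the class name PhoneText and the method name GetPhoneNumberText are made up here, and the long.TryParse call is kept purely as a validity check, so its numeric result is thrown away.

```csharp
using System.Text;

public static class PhoneText
{
    // Hypothetical name; mirrors GetPhoneNumber but returns the digits as a string.
    // Returns null on error.
    public static string GetPhoneNumberText(string phoneNumberText)
    {
        if (phoneNumberText == null)
            return null;

        var digits = new StringBuilder(phoneNumberText.Length);
        foreach (char c in phoneNumberText)
        {
            if (char.IsDigit(c))
                digits.Append(c);
        }

        string result = digits.ToString();
        if (result.Length == 0)
            return null; // nothing to parse

        // Parse only as a validity check; the numeric value is discarded.
        if (!long.TryParse(result, out _))
            return null;

        return result; // the string keeps any leading zeros
    }
}
```

Called as PhoneText.GetPhoneNumberText("(012) 345-6789"), this keeps the leading zero that a long would drop.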

EKS
  • Thanks, fixed. And yes, this should perform well, and should be robust if you're parsing user input. – EKS Aug 20 '10 at 17:19
  • 2
    He's storing a phone number. Why would you *ever* want to store a phone number as an integer? – Gabe Aug 20 '10 at 18:50
  • 1
    I kind of think opposite.. Why would he NOT store it as a number – EKS Aug 20 '10 at 20:02
  • 5
    Perhaps because many countries use phone numbers with leading zeros, and those zeros would be dropped if the phone number was parsed/stored as an actual number. – LukeH Aug 20 '10 at 22:18
  • Good point, I have updated my reply with information in case he needs to support that. – EKS Aug 21 '10 at 13:22
  • Almost a decade passed and this answer still works as expected :) – Shailesh Jun 30 '20 at 19:56
4

You can make it work for that number with the addition of a simple regex replacement, but I'd look out for higher initial digits. For example, (876) 543-2019 will overflow an integer variable.
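
To illustrate the overflow concern, here is a small sketch. The class and method names are hypothetical, and it assumes the non-digits are stripped with a regex first.

```csharp
using System.Text.RegularExpressions;

public static class OverflowDemo
{
    // Remove everything that is not a digit.
    public static string StripNonDigits(string formatted)
    {
        return Regex.Replace(formatted, @"\D", "");
    }

    // True if the stripped digits fit in a 32-bit int (max 2,147,483,647).
    public static bool FitsInInt32(string formatted)
    {
        return int.TryParse(StripNonDigits(formatted), out _);
    }

    // True if the stripped digits fit in a 64-bit long.
    public static bool FitsInInt64(string formatted)
    {
        return long.TryParse(StripNonDigits(formatted), out _);
    }
}
```

With this, FitsInInt32("(123) 456-7890") is true, while FitsInInt32("(876) 543-2019") is false because 8765432019 exceeds int.MaxValue; FitsInInt64 is true for both.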

Joel Coehoorn
4
string digits = Regex.Replace(formatted, @"\D", String.Empty, RegexOptions.Compiled);
Toby
3

Aside from all of the other correct answers, storing phone numbers as integers or otherwise stripping out formatting might be a bad idea.

Here are a couple considerations:

  • Users may provide international phone numbers that don't fit your expectations. See these examples; the usual groupings for standard US numbers wouldn't fit.
  • Users may NEED to provide an extension, e.g. (555) 555-5555 ext#343. The # key is actually on the dialer/phone, but can't be encoded in an integer. Users may also need to supply the * key.
  • Some devices allow you to insert pauses (usually with the character P), which may be necessary for extensions or menu systems, or dialing into certain phone systems (eg, overseas). These also can't be encoded as integers.

[EDIT]

It might be a good idea to store both an integer version and a string version in the database. Also, when storing strings, you could reduce all punctuation to whitespace using one of the methods noted above. A regular expression for this might be:

// (222) 222-2222 ext# 333   ->   222 222 2222 # 333
phoneString = Regex.Replace(phoneString, @"[^\d#*P]", " ");

// (222) 222-2222 ext# 333   ->   2222222222333 (information lost)
phoneNumber = Regex.Replace(phoneString, @"[^\d]", "");

// you could try to avoid losing "ext" strings as in (222) 222-2222 ext.333 thus:
phoneString = Regex.Replace(phoneString, @"ex\w+", "#");
phoneString = Regex.Replace(phoneString, @"[^\d#*P]", " ");
Kimball Robinson
2

Try this:

string s = "(123) 456-7890";
UInt64 i = UInt64.Parse(
    s.Replace("(","")
     .Replace(")","")
     .Replace(" ","")
     .Replace("-",""));

You should be safe with this since the input is masked.

Paul Sasik
  • if the number is `"(543) 231-2322"` you would overflow `int32`. – Lasse Espeholt Aug 20 '10 at 16:44
  • You do realize that he doesn't actually want a number, right? The `int.TryParse` was just his initial guess -- it's a red herring. – Gabe Aug 20 '10 at 18:52
  • @Gabe: I know, it doesn't make sense. Especially since the input is masked and validated in the UI. It's a silly question with mostly silly answers. I think people just had fun with it. For example, the for loop solution garnered six votes... Why? Geek throwback upvoting? Entertainment. It is Friday. – Paul Sasik Aug 20 '10 at 19:29
  • 1
    It's not a silly question at all. He has an input string with formatting and he wants to know how to remove that formatting. The only silly thing is that everybody thinks he wants an integer because he thought that int.TryParse might somehow solve his problem. – Gabe Aug 20 '10 at 19:40
  • You're going to be constantly changing this. What happens when input is "543.231.2322" ? I've seen periods, slashes all kinds of non-digit characters. In fact I've had phone numbers such as "Floor 3". Users. lol – t.durden Feb 26 '18 at 14:31
1

You could use a regular expression or you could loop over each character and use char.IsNumber function.
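
A loop version might look like the sketch below. One caveat: char.IsNumber accepts more than 0-9 (for example fraction characters such as ½), so the first method compares against '0'..'9' directly, and the second is there only to show the difference. The class and method names are made up for illustration.

```csharp
using System.Text;

public static class CharFilter
{
    // Keeps only the ASCII digits '0' through '9'.
    public static string KeepAsciiDigits(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (c >= '0' && c <= '9')
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Keeps every character for which char.IsNumber is true,
    // which includes non-decimal numeric characters such as '\u00BD' (½).
    public static string KeepIsNumber(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (char.IsNumber(c))
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

For masked input like "(123) 456-7890" both methods give the same digits; they only differ when the input can contain other numeric characters.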

Chuck Conway
1

You would be better off using regular expressions. An int by definition is just a number, but you desire the formatting characters to make it a phone number, which is a string.

There are numerous posts about phone number validation, see A comprehensive regex for phone number validation for starters.

Community
JYelton
  • I misread the question, but the same information applies. If you're *receiving* the extra characters, you can use regular expressions to remove them. I would still store the phone number as a string, since leading zeroes which might be a valid part of a phone number would otherwise be removed as an int. – JYelton Aug 20 '10 at 16:50
1

As many answers already mention, you need to strip out the non-digit characters first before trying to parse the number. You can do this using a regular expression.

Regex.Replace("(123) 456-7890", @"\D", String.Empty) // "1234567890"

However, note that the largest positive value int can hold is 2,147,483,647 so any number with an area code greater than 214 would cause an overflow. You're better off using long in this situation.

Leading zeros won't be a problem for North American numbers, as area codes cannot start with a zero or a one.

1

Alternative using Linq:

string phoneNumber = "(403) 259-7898";
var phoneStr = new string(phoneNumber.Where(c => c >= '0' && c <= '9').ToArray());
Holystream
1

This is basically a special case of C#: Removing common invalid characters from a string: improve this algorithm, where your formatting characters, including whitespace, are treated as "bad characters".

Community
Rune FS
0
' You can put this in a module, or inside a form class, in VB.NET
Public Function ClearFormat(ByVal Strinput As String) As String
    Dim hasil As String = ""
    Dim Hrf As Char
    For i = 0 To Strinput.Length - 1
        Hrf = Strinput(i)
        If Char.IsDigit(Hrf) Then
            hasil &= Hrf
        End If
    Next
    ' Return the accumulated digits, not the unmodified input
    Return hasil
End Function
' You can call this function like this:
' Phone = ClearFormat(Phone)
-1
public static string DigitsOnly(this string phoneNumber)
{
    return new string(
        new[]
            {
             // phoneNumber[0],     (
                phoneNumber[1],  // 6
                phoneNumber[2],  // 1
                phoneNumber[3],  // 7
             // phoneNumber[4],     )
             // phoneNumber[5],   
                phoneNumber[6],  // 8
                phoneNumber[7],  // 6
                phoneNumber[8],  // 7
             // phoneNumber[9],     -
                phoneNumber[10], // 5
                phoneNumber[11], // 3
                phoneNumber[12], // 0
                phoneNumber[13]  // 9
            });
}
xofz
  • This is extremely rigid, if the formatting of the phone number changes upstream of this, you will get all kinds of strange results from this approach. – JonesCola Nov 20 '20 at 17:00