0

I have two databases that store phone numbers. The first one stores them with a country code in the format 15555555555 (a US number), and the other can store them in many different formats (ex. (555) 555-5555, 5555555555, 555-555-5555, 555-5555, etc.). When a phone number unsubscribes in one database, I need to unsubscribe all references to it in the other database.

What is the best way to find all instances of phone numbers in the second database that match the number in the first database? I'm using the entity framework. My code right now looks like this:

    using (FusionEntities db = new FusionEntities())
    {

            var communications = db.Communications.Where(x => x.ValueType == 105);

            foreach (var com in communications)
            {
                string sRegexCompare = Regex.Replace(com.Value, "[^0-9]", "");
                if (sMobileNumber.Contains(sRegexCompare) && sRegexCompare.Length > 6)
                {
                    var contact = db.Contacts.Where(x => x.ContactID == com.ContactID).FirstOrDefault();
                    contact.SMSOptOutDate = DateTime.Now;
                }
            }
    }

Right now, my comparison checks to see if the first database contains at least 7 digits from the second database after all non-numeric characters are removed.

Ideally, I want to be able to apply the regex formatting to the point in the code where I get the data from the database. Initially I tried this, but I can't use replace in a LINQ query:

var communications = db.Communications.Where(x => x.ValueType == 105 && sMobileNumber.Contains(Regex.Replace(x.Value, "[^0-9]", "")));

ijb109
  • 942
  • 1
  • 19
  • 30
  • 1
    You might look at this [answer](http://stackoverflow.com/a/123681). – HamZa Apr 15 '14 at 20:33
  • 1
    That's helpful. I'm not so much looking for the best expression at this point as much as the best way to filter on it. Does that make sense? – ijb109 Apr 15 '14 at 20:41
  • Does each number have a canonical representation? (In any case, my first instinct would be to use a computed column.) – user2864740 Apr 15 '14 at 20:45
  • No. The second database can contain anything in the value column. One database even contained the value 'djahdh' when I looked in it today. I have looked at computed columns as seen here: http://stackoverflow.com/questions/15655813/comparing-phone-numbers-sql – ijb109 Apr 15 '14 at 20:49

1 Answers1

1

Comparing phone numbers is a bit beyond the capability of regex by design. As you've discovered there are many ways to represent a phone number with and without things like area codes and formatting. Regex is for pattern matching so as you've found using the regex to strip out all formatting and then comparing strings is doable but putting logic into regex which is not what it's for.

I would suggest the first and biggest thing to do is sort out the representation of phone numbers. Since you have database access you might want to look at creating a new field or table to represent a phone number object. Then put your comparison logic in the model.

Yes it's more work but it keeps the code more understandable going forward and helps cleanup crap data.

Twisted Mentat
  • 432
  • 4
  • 13