0

I want to clean some contact numbers in a sql server database

table name is ronnie_list and column name is "Name Phone#"

numbers can be in format example formats are

T: (985)-124-5601
(985)124-5601
985)-124-5601
985.124.5601
9851245601
985124-5601 EX 1432
985-(124)-5601

I want them in this format 985-124-5601

please let me know how to do that

mb1987
  • 437
  • 8
  • 21
  • 3
    I would strip out all non-digits. After than just leave it alone. Let your front-end apps format the phone numbers in the UI. – Dave Mason Aug 27 '14 at 13:55
  • There isn't a simple answer. If your DB contains international numbers you will need to handle specific cases for each country. Depending on who entered the data, some numbers might begin with a 1. You may want to strip that digit if this is a list of US phone numbers. Also, be sure to code a way to handle extensions and numbers that do not include area codes. Most PhoneNum fields are, technically, arbitrary text fields. I recommend cleaning up what you can, and exporting an exceptions list. Hand that list to your managers and tell them to let data-entry fix the list of outliers. – RLH Aug 27 '14 at 14:02

3 Answers3

3

Create a Function

ALTER FUNCTION dbo.udf_GetNumeric
(@strAlphaNumeric VARCHAR(1000))
RETURNS VARCHAR(1000)
AS
BEGIN
    DECLARE @intAlpha INT
      SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric)
    BEGIN
      WHILE @intAlpha > 0
        BEGIN
            SET @strAlphaNumeric = STUFF(@strAlphaNumeric, @intAlpha, 1, '' )
            SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric )
        END
    END
    RETURN ISNULL(@strAlphaNumeric,0)
END
GO

Test Data

DECLARE @temp TABLE(string NVARCHAR(1000))
INSERT INTO @temp (string)
VALUES 
('T: (985)-124-5601'),
('(985)124-5601'),
('985)-124-5601'),
('985.124.5601'),
('9851245601'),
('985124-5601 EX 1432'),
('985-(124)-5601')

Query

SELECT LEFT(OnlyNumbers,3) + '-' + SUBSTRING(OnlyNumbers,4,3) + '-' + RIGHT(OnlyNumbers, 4)
FROM (
        SELECT LEFT(dbo.udf_GetNumeric(string), 10) OnlyNumbers
        FROM @temp 
      )z

Result

985-124-5601
985-124-5601
985-124-5601
985-124-5601
985-124-5601
985-124-5601
985-124-5601
M.Ali
  • 67,945
  • 13
  • 101
  • 127
1

1. Write REPLACE() around the column untill you remove all your unwanted symbols. like this. REPLACE(REPLACE([COLUMN],')',''),'(','') --Only Brackets are removed here.

(985)124-5601 becomes 985124-5601
985124-5601 EX 1432 becomes 9851245601  1432

2. Once symbols are removed, Remove space as well, you can use REPLACE itself

985124-5601 becomes 9851245601
98512456011432 becomes 98512456011432

3. Take LEFT([COLUMN],10) and Split it to 3 using substring, and put minus symbol in between.

9851245601 becomes 985-124-5601
98512456011432 becomes 985-124-5601

You can do all these in 1 sql script, just fyi. Hope it helps..

Jithin Shaji
  • 5,893
  • 5
  • 25
  • 47
1

I would use the code in this answer to reduce the strings to just numbers, https://stackoverflow.com/a/18625635/2587452.

Then I would use

SELECT 
Phone = CASE WHEN LEN (x)=10 THEN LEFT(x, 3)+'-'+SUBSTRING(x,4,3)+'-'+SUBSTRING(x,7,4)
        ELSE '' END

This will work unless you need to handle those extensions or country codes. In that case you will need to add more WHEN clauses and there still may be edge cases that will need special code.

Community
  • 1
  • 1
Eric Hauenstein
  • 2,557
  • 6
  • 31
  • 41