2

I am trying to pull out the name of a city from a long string in my database. Here is an example of what the data looks like for a few different locations.

"701 MONROE STREET NW RUSSELLVILLE, AL 35653 (34.514971, -87.736372)"
"1825 HOLTVILLE ROAD WETUMPKA, AL 36092 (32.558544, -86.221265)"

I want to create a column for just the Name of the city. My thought was was to take everything Left of the fir comma and right of the following space. I have tried a few different ways to pull this but thing I might be missing something.

SELECT  left(Location, CHARINDEX(',', location)) as city FROM table

This is returning everything left of the first comma.

"701 MONROE STREET NW RUSSELLVILLE,
"1825 HOLTVILLE ROAD WETUMPKA,

But now I want to return everything left of the comma and everything Right of the last space in this string and I am stumped as to how I would pull that information correctly. Any help would be appreciated.

Thanks, Pat

Pat Doyle
  • 384
  • 2
  • 5
  • 16
  • "701 MONROE STREET NW RUSSELLVILLE, AL 35653 (34.514971, -87.736372)" "1825 HOLTVILLE ROAD WETUMPKA, AL 36092 (32.558544, -86.221265)" is this one complete address – Zaynul Abadin Tuhin Aug 20 '18 at 18:17
  • 4
    What about "... ATLANTIC CITY, NJ" ? – nathan_jr Aug 20 '18 at 18:25
  • 4
    This is why normalization is so important... it's often very difficult to reverse out of as @NathanSkerl has illustrated. I'd probably populate a table of all your possibilities of ST, RD, ROAD, LN, LANE, NW, SW, NE, etc... and find the last occurrence of this as the start point. It'll get you closer i'd imagine, assuming they are required. You could check if they exists as well just a a double down. – S3S Aug 20 '18 at 18:26
  • 1
    I ran into this exact same problem a few months back. My solution was to use an Address Cleaning service like [Experian](https://www.edq.com/address-verification/). There are simple too many variations to consider – Code Different Aug 20 '18 at 18:33
  • 4
    This is a harder problem than it seems. What about multiword cities ("New York City"). Or city names with a directional first word ("North Platte")? Or city names with ambiguous words ("Park City", "Junction City")? – Gordon Linoff Aug 20 '18 at 18:37
  • If you don't have a ZIP Code table, get one. Also, take a peek at https://stackoverflow.com/questions/41249742/address-standardization-within-a-database/41249971#41249971 – John Cappelletti Aug 20 '18 at 18:58
  • You need a table of all the multiword cities and an appropriate unique token for each. Replace all the multiword cities with their tokens, run your query, replace them back again. – James Aug 21 '18 at 10:41

2 Answers2

3

Using REVERSE could work with something along the lines of:

SELECT reverse(
            left(
             reverse(
               left(
                 Location, 
                 CHARINDEX(',', location)-1
               )
              ),
              CHARINDEX(' ', reverse(
                 left(
                   Location, 
                   CHARINDEX(',', location)-1
                 )
               )
             )
           )
  )as city FROM table;

Fiddle

Koen
  • 724
  • 4
  • 10
  • This doesn't work if there is a space in the city name. – Sean Lange Aug 20 '18 at 18:36
  • @Sean Lange - I agree, the underlying problem is a lot more complicated to fix. But I see two things in the original question - a functional need (which I have no idea how it could be done - and I will be very happy to learn the solution for) and a technical question regarding how to locate the last space. This is just for that part (and if a functional solution presents itself, this answer will disappear very quickly) – Koen Aug 20 '18 at 18:59
  • I don't think this is a bad answer. I just wanted to make sure to point out there is at least that hole. It would be incredibly difficult (if not impossible) with nothing more than t-sql to make this kind of thing be 100%. The problem is not with your solution, but with the situation that landed the OP in this horrific spot. :) – Sean Lange Aug 20 '18 at 19:04
3

If the Google API mentioned in my comment above is not an option. You can download (or even purchase) a ZIP Code database. The cost is nominal. I would suggest the quarterly updates because ZIP Codes change over time (add/edit/delete)

Example

Declare @YourTable table (id int,addr varchar(250))
Insert Into @YourTable values
(1,'701 MONROE STREET NW RUSSELLVILLE, AL 35653 (34.514971, -87.736372)'),
(2,'1825 HOLTVILLE ROAD WETUMPKA, AL 36092 (32.558544, -86.221265)')

Select A.ID
      ,StreetAddress =left(addr,nullif(charindex(Z.CityName,addr),0)-1)
      ,Z.CityName
      ,Z.StateCode
      ,Z.ZIPCode
 From @YourTable A
 Join [dbo].[OD-Zip] Z 
   on Z.ZipCode = substring(addr,nullif(patindex('%[0-9][0-9][0-9][0-9][0-9]%',addr),0),5)
      and charindex(Z.CityName,addr)>0
      and Z.ZipType='S'
      and Z.CityType='D'

Returns

ID  StreetAddress           CityName        StateCode   ZIPCode
1   701 MONROE STREET NW    Russellville    AL          35653
2   1825 HOLTVILLE ROAD     Wetumpka        AL          36092
John Cappelletti
  • 79,615
  • 7
  • 44
  • 66