5

I have a few million records in a mysql database with the following columns: company, address, url, phone, category

Here is a sample row: Company123 - 123 Candyland St, New York, NY 12345 - http://urltothiscompany.com - 123-456-7890 - Bakery

My question is about the address column. I'd like to split the rows up into separate address, city, state, and zip code columns: 123 Candyland St - New York - NY - 12345

However, some rows don't have a street, only city, state, and zip: New York, NY, 1235

Is there a possible way to do that in mysql? I'm not sure where to begin since some rows don't have the address. Maybe count the characters from the end of the column?

Any help is appreciated. Thank you.

dkeeper09
  • 537
  • 3
  • 11
  • 29
  • I would use your commas as separators and counting in reverse from the end of the string. The first section would be the zip, next the state, the city, and the street if exists. Read this: http://stackoverflow.com/questions/2696884/split-value-from-one-field-to-two – CodeGodie Jan 06 '17 at 01:26
  • my approach would be write a PHP script that grabs that column and use PHP's explode function on ` - ` You could also use the `isnumeric($zip)` to ensure its valid, or at least to different US ffrom Canada – Duane Lortie Jan 06 '17 at 01:37
  • If you are sure that only the street may be absent, then the solution proposed by CodeGodie is the right one. But if other fields too may be absent I suspect you need some sort of validation... – Aerendir Jan 06 '17 at 01:38
  • In your examples: the first one without comma (... NY 12345) the second has comma (..., NY , 1235) this is hard for data cut – SIDU Jan 06 '17 at 02:29

2 Answers2

9

Assume your data is actually looking like following:

addr
=======================
street, City, State ZIP

And here is the SQL:

SELECT addr,
  substr(addr, 1, length(addr) - length(substring_index(addr, ',', -2))) street,
  substring_index(substring_index(addr, ',', -2), ',', 1) city,
  substr(trim(substring_index(addr, ',', -1)),1,2) state,
  substring_index(addr, ' ', -1) zip
FROM tab

enter image description here

OOPs there is an extra comma at street, this is a homework for you to fix :)

SIDU
  • 2,258
  • 1
  • 12
  • 23
3

Quick answer for googlers as me.
Example splitting columns into 3. If you need 2 columns omit new_column_name_2 inside requests.
Works for dimensions like 100x200x300 becomes 100 200 300

  1. Create new columns.
ALTER TABLE `your_table_name` ADD `new_column_name_1` VARCHAR(250) DEFAULT '';
ALTER TABLE `your_table_name` ADD `new_column_name_2` VARCHAR(250) DEFAULT '';
ALTER TABLE `your_table_name` ADD `new_column_name_3` VARCHAR(250) DEFAULT '';
  1. Verify you're selecting correct stuff
SELECT `old_column_name`,
SUBSTRING_INDEX(`old_column_name`, 'x', 1) `new_column_name_1`,
SUBSTRING_INDEX(SUBSTRING_INDEX(`old_column_name`, 'x', 2),'x',-1) `new_column_name_2`, 
SUBSTRING_INDEX(`old_column_name`, 'x', -1) `new_column_name_3`
FROM `your_table_name`;
  1. Populate new columns
UPDATE `your_table_name` SET
`new_column_name_1` = SUBSTRING_INDEX(`old_column_name`, 'x', 1),
`new_column_name_2` = SUBSTRING_INDEX(SUBSTRING_INDEX(`old_column_name`, 'x', 2),'x',-1),
`new_column_name_3` = SUBSTRING_INDEX(`old_column_name`, 'x', -1);
Hebe
  • 661
  • 1
  • 7
  • 13