135

I've got a table field membername which contains both the last name and the first name of users. Is it possible to split those into 2 fields memberfirst, memberlast?

All the records have this format "Firstname Lastname" (without quotes and a space in between).

Rahil Wazir
  • 10,007
  • 11
  • 42
  • 64
tsiger
  • 1,607
  • 2
  • 15
  • 16
  • 7
    "All the records have this format "Firstname Lastname" (without quotes and a space in between)." ... miraculously... Please, **please**, don't forget about people like me when making database decisions. Too often I get websites telling me my last name contains an *illegal* (sic) character... :( – Stijn de Witt Jun 02 '16 at 18:53
  • See also [SQL split values to multiple rows](https://stackoverflow.com/questions/17942508/sql-split-values-to-multiple-rows) – tripleee Jan 31 '22 at 18:44

14 Answers14

238

Unfortunately MySQL does not feature a split string function. However you can create a user defined function for this, such as the one described in the following article:

With that function:

DELIMITER $$

CREATE FUNCTION SPLIT_STR(
  x VARCHAR(255),
  delim VARCHAR(12),
  pos INT
)
RETURNS VARCHAR(255) DETERMINISTIC
BEGIN 
    RETURN REPLACE(SUBSTRING(SUBSTRING_INDEX(x, delim, pos),
       LENGTH(SUBSTRING_INDEX(x, delim, pos -1)) + 1),
       delim, '');
END$$

DELIMITER ;

you would be able to build your query as follows:

SELECT SPLIT_STR(membername, ' ', 1) as memberfirst,
       SPLIT_STR(membername, ' ', 2) as memberlast
FROM   users;

If you prefer not to use a user defined function and you do not mind the query to be a bit more verbose, you can also do the following:

SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(membername, ' ', 1), ' ', -1) as memberfirst,
       SUBSTRING_INDEX(SUBSTRING_INDEX(membername, ' ', 2), ' ', -1) as memberlast
FROM   users;
joshweir
  • 5,427
  • 3
  • 39
  • 59
Daniel Vassallo
  • 337,827
  • 72
  • 505
  • 443
  • still you cannot use the IN as an "array of values" from that split operation? – Miguel Sep 08 '16 at 15:39
  • 3
    Is your use of `LENGTH` multibyte safe? "LENGTH(str): Returns the length of the string str, measured in bytes. A multibyte character counts as multiple bytes. This means that for a string containing five 2-byte characters, LENGTH() returns 10, whereas CHAR_LENGTH() returns 5." – Erk Nov 22 '16 at 18:28
  • This will not work properly when dealing with multibyte/utf8 characters, as @Erk mentioned. Only the simple solution with the two SUBSTRING_INDEX statements works with utf8 / multibyte – Michael Apr 10 '18 at 22:25
  • LENGTH(), LOCATE() or anything that relies on a position count will fail with multibyte characters. – Michael Apr 10 '18 at 22:35
73

SELECT variant (not creating a user defined function):

SELECT IF(
        LOCATE(' ', `membername`) > 0,
        SUBSTRING(`membername`, 1, LOCATE(' ', `membername`) - 1),
        `membername`
    ) AS memberfirst,
    IF(
        LOCATE(' ', `membername`) > 0,
        SUBSTRING(`membername`, LOCATE(' ', `membername`) + 1),
        NULL
    ) AS memberlast
FROM `user`;

This approach also takes care of:

  • membername values without a space: it will add the whole string to memberfirst and sets memberlast to NULL.
  • membername values that have multiple spaces: it will add everything before the first space to memberfirst and the remainder (including additional spaces) to memberlast.

The UPDATE version would be:

UPDATE `user` SET
    `memberfirst` = IF(
        LOCATE(' ', `membername`) > 0,
        SUBSTRING(`membername`, 1, LOCATE(' ', `membername`) - 1),
        `membername`
    ),
    `memberlast` = IF(
        LOCATE(' ', `membername`) > 0,
        SUBSTRING(`membername`, LOCATE(' ', `membername`) + 1),
        NULL
    );
smhg
  • 2,159
  • 19
  • 26
  • Also useful would be to see how to cut off just the last word for last name, and all the non last ones for first name, for example: Mary A. Smith which is the types I have to deal with this in an old db table fix. I'll see if I can figure it out and post the result, if not, if you can post that option too that would make your answer complete. – Lizardx Jan 28 '16 at 04:44
  • how can we cast it to integer since membername is varchar.. let memberfirst be of type int. Will it work if i directly use cast() ? – infinitywarior Feb 12 '19 at 09:35
  • how to make just last word in memberlast, and before last word make to field memberfirst? – Fredrick M T Pardosi Jul 18 '21 at 08:54
33

It seems that existing responses are over complicated or not a strict answer to the particular question.

I think, the simple answer is the following query:

SELECT
    SUBSTRING_INDEX(`membername`, ' ', 1) AS `memberfirst`,
    SUBSTRING_INDEX(`membername`, ' ', -1) AS `memberlast`
;

I think it is not necessary to deal with more-than-two-word names in this particular situation. If you want to do it properly, splitting can be very hard or even impossible in some cases:

  • Johann Sebastian Bach
  • Johann Wolfgang von Goethe
  • Edgar Allan Poe
  • Jakob Ludwig Felix Mendelssohn-Bartholdy
  • Petőfi Sándor
  • Virág Vendelné Farkas Margit
  • 黒澤

In a properly designed database, human names should be stored both in parts and in whole. This is not always possible, of course.

Dávid Horváth
  • 4,050
  • 1
  • 20
  • 34
  • That's fine if you know how many values to expect, but if the field could contain "one" or "one, two" or "one, two, three" etc and you want to split out each to a separate row, you need something more complex. (Agree that this should not happen in properly designed databases, but you know how it is.) – tripleee Jan 31 '22 at 18:39
  • @tripleee True in general, but now we are in a simpler situation, because OP stated: *all the records have this format: "Firstname Lastname"*. – Dávid Horváth Feb 27 '22 at 04:27
20

If your plan is to do this as part of a query, please don't do that (a). Seriously, it's a performance killer. There may be situations where you don't care about performance (such as one-off migration jobs to split the fields allowing better performance in future) but, if you're doing this regularly for anything other than a mickey-mouse database, you're wasting resources.

If you ever find yourself having to process only part of a column in some way, your DB design is flawed. It may well work okay on a home address book or recipe application or any of myriad other small databases but it will not be scalable to "real" systems.

Store the components of the name in separate columns. It's almost invariably a lot faster to join columns together with a simple concatenation (when you need the full name) than it is to split them apart with a character search.

If, for some reason you cannot split the field, at least put in the extra columns and use an insert/update trigger to populate them. While not 3NF, this will guarantee that the data is still consistent and will massively speed up your queries. You could also ensure that the extra columns are lower-cased (and indexed if you're searching on them) at the same time so as to not have to fiddle around with case issues.

And, if you cannot even add the columns and triggers, be aware (and make your client aware, if it's for a client) that it is not scalable.


(a) Of course, if your intent is to use this query to fix the schema so that the names are placed into separate columns in the table rather than the query, I'd consider that to be a valid use. But I reiterate, doing it in the query is not really a good idea.

paxdiablo
  • 854,327
  • 234
  • 1,573
  • 1,953
  • 4
    Sometimes, you have to do it. F.e. I need it in a migration script, so I don't care about performances. – Matthieu Napoli Nov 05 '13 at 11:13
  • 1
    @dfmiller, yes, I did, hence my reasoned and detailed response, and thanks for your interest. If you have a _specific_ issue with something I wrote, point it out and I'll see if it can be improved. Your current comment is pretty much useless in improving the situation, if that was indeed your intent. Or maybe you just _like_ spouting random comments on the net, it's hard to tell :-) I stand by the answer, of course, sub-columnar access is not scalable and is almost always a bad idea, unless it's used for the purpose of actually _fixing_ sub-columnar access. – paxdiablo Sep 12 '14 at 01:47
  • 3
    The question is how to split the single column into 2 and then you respond by saying "Don't to that" and then proceed to explain why they should be split. Your first paragraph sounds like you are arguing in favor or keeping them as one column, but the other paragraphs say the opposite. – dfmiller Sep 12 '14 at 15:13
  • @dfmiller, perhaps I misunderstood the question, I'm not sure now whether the separation was to be done in the query or the table. I've clarified the answer to hopefully make it clearer. – paxdiablo Sep 12 '14 at 17:04
  • Much better. I never considered using a select query except to update the database. That would be a terrible idea. – dfmiller Sep 12 '14 at 17:29
  • Great answer! If I boil it down to one gem, it's this: "If you ever find yourself having to process only part of a column in some way, your DB design is flawed." (Though, I inherited the schema, etc., such is life; using your pre-populated field suggestion.) – HoldOffHunger Sep 10 '18 at 15:56
7

use this

SELECT SUBSTRING_INDEX(SUBSTRING_INDEX( `membername` , ' ', 2 ),' ',1) AS b, 
SUBSTRING_INDEX(SUBSTRING_INDEX( `membername` , ' ', -1 ),' ',2) AS c FROM `users` WHERE `userid`='1'
Karthik
  • 3,221
  • 5
  • 28
  • 38
  • This will grab the first and the last space delimited substring from the field, which does not work in all circumstances. For example, if the name field is "Lilly von Schtupp", then you'll get 'Lilly', 'Schtupp' as the firstname, surname. – John Franklin Jul 02 '12 at 20:32
7

In MySQL this is working this option:

SELECT Substring(nameandsurname, 1, Locate(' ', nameandsurname) - 1) AS 
       firstname, 
       Substring(nameandsurname, Locate(' ', nameandsurname) + 1)    AS lastname 
FROM   emp  
WhatsThePoint
  • 3,395
  • 8
  • 31
  • 53
Irakli Kardava
  • 169
  • 2
  • 3
5

Not exactly answering the question, but faced with the same problem I ended up doing this:

UPDATE people_exit SET last_name = SUBSTRING_INDEX(fullname,' ',-1)
UPDATE people_exit SET middle_name = TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(fullname,last_name,1),' ',-2))
UPDATE people_exit SET middle_name = '' WHERE CHAR_LENGTH(middle_name)>3 
UPDATE people_exit SET first_name = SUBSTRING_INDEX(fullname,concat(middle_name,' ',last_name),1)
UPDATE people_exit SET first_name = middle_name WHERE first_name = ''
UPDATE people_exit SET middle_name = '' WHERE first_name = middle_name
commonpike
  • 10,499
  • 4
  • 65
  • 58
2

The only case where you may want such a function is an UPDATE query which will alter your table to store Firstname and Lastname into separate fields.

Database design must follow certain rules, and Database Normalization is among most important ones

Your Common Sense
  • 156,878
  • 40
  • 214
  • 345
  • 1
    Unnecessary comment as this is exactly what the poster asked for; also inaccurate as there are a million times you might need to split a string for best normalisation. Unsure why or how this ever got voted up. – daticon May 16 '18 at 12:25
  • Using indexes on split fields is about as impossible as making MySQL into a leaf mulcher, but that won't stop people from asking about it. Good answer -- database SHOULD reflect the data, not your leaf mulcher specs. – HoldOffHunger Sep 10 '18 at 16:11
2

I had a column where the first and last name were both were in one column. The first and last name were separated by a comma. The code below worked. There is NO error checking/correction. Just a dumb split. Used phpMyAdmin to execute the SQL statement.

UPDATE tblAuthorList SET AuthorFirst = SUBSTRING_INDEX(AuthorLast,',',-1) , AuthorLast = SUBSTRING_INDEX(AuthorLast,',',1);

13.2.10 UPDATE Syntax

Steve R.
  • 21
  • 1
1

This takes smhg from here and curt's from Last index of a given substring in MySQL and combines them. This is for mysql, all I needed was to get a decent split of name to first_name last_name with the last name a single word, the first name everything before that single word, where the name could be null, 1 word, 2 words, or more than 2 words. Ie: Null; Mary; Mary Smith; Mary A. Smith; Mary Sue Ellen Smith;

So if name is one word or null, last_name is null. If name is > 1 word, last_name is last word, and first_name all words before last word.

Note that I've already trimmed off stuff like Joe Smith Jr. ; Joe Smith Esq. and so on, manually, which was painful, of course, but it was small enough to do that, so you want to make sure to really look at the data in the name field before deciding which method to use.

Note that this also trims the outcome, so you don't end up with spaces in front of or after the names.

I'm just posting this for others who might google their way here looking for what I needed. This works, of course, test it with the select first.

It's a one time thing, so I don't care about efficiency.

SELECT TRIM( 
    IF(
        LOCATE(' ', `name`) > 0,
        LEFT(`name`, LENGTH(`name`) - LOCATE(' ', REVERSE(`name`))),
        `name`
    ) 
) AS first_name,
TRIM( 
    IF(
        LOCATE(' ', `name`) > 0,
        SUBSTRING_INDEX(`name`, ' ', -1) ,
        NULL
    ) 
) AS last_name
FROM `users`;


UPDATE `users` SET
`first_name` = TRIM( 
    IF(
        LOCATE(' ', `name`) > 0,
        LEFT(`name`, LENGTH(`name`) - LOCATE(' ', REVERSE(`name`))),
        `name`
    ) 
),
`last_name` = TRIM( 
    IF(
        LOCATE(' ', `name`) > 0,
        SUBSTRING_INDEX(`name`, ' ', -1) ,
        NULL
    ) 
);
Community
  • 1
  • 1
Lizardx
  • 1,165
  • 10
  • 16
0

Method I used to split first_name into first_name and last_name when the data arrived all in the first_name field. This will put only the last word in the last name field, so "john phillips sousa" will be "john phillips" first name and "sousa" last name. It also avoids overwriting any records that have been fixed already.

set last_name=trim(SUBSTRING_INDEX(first_name, ' ', -1)), first_name=trim(SUBSTRING(first_name,1,length(first_name) - length(SUBSTRING_INDEX(first_name, ' ', -1)))) where list_id='$List_ID' and length(first_name)>0 and length(trim(last_name))=0
TheSatinKnight
  • 696
  • 7
  • 16
0
UPDATE `salary_generation_tbl` SET
    `modified_by` = IF(
        LOCATE('$', `other_salary_string`) > 0,
        SUBSTRING(`other_salary_string`, 1, LOCATE('$', `other_salary_string`) - 1),
        `other_salary_string`
    ),
    `other_salary` = IF(
        LOCATE('$', `other_salary_string`) > 0,
        SUBSTRING(`other_salary_string`, LOCATE('$', `other_salary_string`) + 1),
        NULL
    );
Izuka
  • 2,572
  • 5
  • 29
  • 34
Aniket
  • 11
  • 1
0

In case someone needs to run over a table and split a field:

  1. First we use the function mention above:
CREATE DEFINER=`root`@`localhost` FUNCTION `fn_split_str`($str VARCHAR(800), $delimiter VARCHAR(12), $position INT) RETURNS varchar(800) CHARSET utf8
    DETERMINISTIC
BEGIN 
    RETURN REPLACE(
            SUBSTRING(
                SUBSTRING_INDEX($str, $delimiter, $position),
                LENGTH(
                    SUBSTRING_INDEX($str, $delimiter, $position -1)
                ) + 1
            ),
    $delimiter, '');
END
  1. Second, we run in a while loop on the string until there isn't any results (I've added $id for JOIN clause):
CREATE DEFINER=`root`@`localhost` FUNCTION `fn_split_str_to_rows`($id INT, $str VARCHAR(800), $delimiter VARCHAR(12), $empty_table BIT) RETURNS int(11)
BEGIN

    DECLARE position INT;
    DECLARE val VARCHAR(800);
    SET position = 1;
    
    IF $empty_table THEN
        DROP TEMPORARY TABLE IF EXISTS tmp_rows;    
    END IF;
            
    SET val = fn_split_str($str, ',', position);
            
    CREATE TEMPORARY TABLE IF NOT EXISTS tmp_rows AS (SELECT $id as id, val as val where 1 = 2);
        
    WHILE (val IS NOT NULL and val != '') DO               
        INSERT INTO tmp_rows
        SELECT $id, val;
        
        SET position = position + 1;
        SET val = fn_split_str($str, ',', position);
    END WHILE;
    
    RETURN position - 1;
END
  1. Finally we can use it like that:
DROP TEMPORARY TABLE IF EXISTS tmp_rows;
SELECT  SUM(fn_split_str_to_rows(ID, FieldToSplit, ',', 0))
FROM    MyTable;

SELECT * FROM tmp_rows;

You can use the id to join to other table.

In case you are only splitting one value you can use it like that

SELECT  fn_split_str_to_rows(null, 'AAA,BBB,CCC,DDD,EEE,FFF,GGG', ',', 1);
SELECT * FROM tmp_rows;

We don't need to empty the temporary table, the function will take care of that.

roies
  • 13
  • 3
-3

mysql 5.4 provides a native split function:

SPLIT_STR(<column>, '<delimiter>', <index>)
emfi
  • 649
  • 6
  • 23
  • 3
    Can you provide a link to the documentation. A search of dev.mysql.com comes up dry. Section 12.5 does have a community suggestion in the comments for this function. – DRaehal Jul 02 '13 at 22:21