2

Given the following strings which represent possible lists, how may I get an item at a specified index n

  • 1,2,3,4,5
  • word1 word2 word3
  • pipe|delimited|list

Possible reasons for this functionality are

  • Extraction of specific elements from GROUP_CONCAT output
  • Extraction of specific elements from a SET column output (when cast to string)
  • Extraction of specific elements from a poorly normalised table containing a comma-separated list
  • Use within an iteration procedure to loop over a list and perform an action on each element within
  • Really surprised this wasn't a built-in function of some kind tbh, doesn't seem like *that* outlandish an issue to be trying to solve. Couldn't find a relevant answer, so here's how I got around it. For the record, the issue I was actually trying to solve was to extract component words from phrases prior to further processing. Maybe someone else out there has a better option than my own though :) – Simon at The Access Group Jul 05 '13 at 09:31

3 Answers3

1

There is no native function for this. You can use two SUBSTRING_INDEX functions. And you need to check if that specific index item exists:

SET @string:='1,2,3,4,5';
SET @delimiter:=',';
SET @n:=6;

SELECT
  CASE WHEN
    CHAR_LENGTH(@string)-CHAR_LENGTH(REPLACE(@string, @delimiter, ''))>=
    @n*CHAR_LENGTH(@delimiter)-1
  THEN
    SUBSTRING_INDEX(SUBSTRING_INDEX(@string, @delimiter, @n), @delimiter, -1)
  END;
  • SUBSTRING_INDEX(@string, @delimiter, @n) returns the substring from string @string before @n occurrences of the @delimiter.
  • SUBSTRING_INDEX( ... , @delimiter, -1) returns everything to the right of the final delimiter
  • you need to check if delimiter @n exists. We can substract the length of the string with the delimiter, and the string with the delimiter removed - using REPLACE(@string, @delimiter, '') - and see if it is greater than @n*CHAR_LENGTH(@delimiter)-1
fthiella
  • 48,073
  • 15
  • 90
  • 106
1

Pure SQL way of doing it:-

SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(somefield, '|', 3), '|', -1)
FROM sometable a

If you want to return NULL (or some other fixed value) if there is no 3rd element (for example):-

SELECT IF((LENGTH(somefield) - LENGTH(REPLACE(somefield, '|', '')) + 1) >= 10, SUBSTRING_INDEX(SUBSTRING_INDEX(somefield, '|', 10), '|', -1), NULL)
FROM sometable a
Kickstart
  • 21,403
  • 2
  • 21
  • 33
0

update totally forgot the SUBSTRING_INDEX -1 thing (pointed out by @fthiella and @Kickstart), so updated the examples below

Solved by creating a stored function which is able to be used in-line. The stored function is able to accept an input string, any single-character delimiter, and an index of the desired item to extract.

The stored function definition is as follows

CREATE DEFINER = `user`@`%` 
FUNCTION `getListElement`(
    inString VARCHAR(255) , 
    inDelimiter CHAR(1) , 
    inIndexToReturn TINYINT UNSIGNED
) RETURNS varchar(255) READS SQL DATA DETERMINISTIC SQL SECURITY INVOKER
BEGIN

    -- Takes in as argument a string, and then breaks out the desired string
    DECLARE resultString VARCHAR(255) DEFAULT inString;
    DECLARE numberOfListElements TINYINT UNSIGNED DEFAULT 0;
    DECLARE errorMessage VARCHAR(255) DEFAULT 'Requested index was < 1 which was invalid';

    -- First of all, additional processing only needed for element 2 upwards
    IF inIndexToReturn = 1 THEN

        RETURN SUBSTRING_INDEX( resultString , inDelimiter , inIndexToReturn);

    ELSEIF inIndexToReturn > 1 THEN

        -- Count the number of elements
        -- This will count the missing delimiters based off the replace.  A list of 4 will be missing 3 characters.
        SET numberOfListElements = ( CHAR_LENGTH( resultString ) + 1 ) - CHAR_LENGTH( REPLACE( resultString , inDelimiter , '' ) );

        IF numberOfListElements >= inIndexToReturn THEN

            -- Make sure to only return the last of the elements returend by the first SUBSTRING_INDEX
            RETURN SUBSTRING_INDEX( SUBSTRING_INDEX( inString , inDelimiter , inIndexToReturn ) , inDelimiter , -1 );

        END IF;

        SET errorMessage = CONCAT('List index ',inIndexToReturn,' was requested from a list with ',numberOfListElements,' elements');

    END IF;

    SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = errorMessage;

END

In the examples above, the following could be used to extract specific list elements

  • SELECT getListElement( '1,2,3,4,5' , ',' , 2 ) returns 2
  • SELECT getListElement( REPLACE( 'word1 word2 word3' ,' ', ',' ) , ',' , 3 ) returns word3 (see note below for reason on the REPLACE)
  • SELECT getListElement( 'pipe|delimited|list' , '|' , 1 ) returns pipe

It would also be possible to use this within an iterator to loop over elements in a list. First of all you need to count the items in the list (see How to count items in comma separated list MySQL) however once you have that, it's just a matter of iterating over them as in this fragment from a stored procedure

-- Reinitialise variables
SET @list = '1,2,3,4,5';
SET @delimiter = ',';
SET @listLength = (CHAR_LENGTH( @list ) + 1 ) - CHAR_LENGTH( REPLACE( @list , @delimiter , '' ) );
SET @currentElement = 1;

listLoop: REPEAT

    -- Select the list element and do something with it
    SELECT getListElement( @list , @delimiter , @currentElement );

    -- Increment the current element
    SET @currentElement = @currentElement + 1;

UNTIL @currentElement > @listLength
END REPEAT listLoop;

important space-delimited lists seem to cause issues for this procedure, and so prior to parsing a string into the function I would recommend doing a simple REPLACE to replace the spaces with another suitable single-character separator (i.e. , or | depending on the content in the string)

Community
  • 1
  • 1