39

I can use the MySQL TRIM() function to clean up fields containing leading or trailing whitespace with an UPDATE like so:

UPDATE Foo SET field = TRIM(field);

I would like to see which rows this will affect before running it. I tried the following, but it returns 0 results:

SELECT * FROM Foo WHERE field != TRIM(field);

Seems like this should work but it does not.

Anyone have a solution? Also, curious why this does not work...

Dan Bracuk
Michael

4 Answers

65

As documented under The CHAR and VARCHAR Types:

All MySQL collations are of type PADSPACE. This means that all CHAR and VARCHAR values in MySQL are compared without regard to any trailing spaces.

In the definition of the LIKE operator, the manual states:

In particular, trailing spaces are significant, which is not true for CHAR or VARCHAR comparisons performed with the = operator:
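To see the difference for yourself, a quick check along these lines should do. This is only a sketch: the result of the `=` comparison depends on the collation in effect, and the UCA 9.0.0-based collations that are the default in MySQL 8.0 (such as `utf8mb4_0900_ai_ci`) are NO PAD rather than PAD SPACE. The column aliases are made up for illustration.

```sql
-- Under a PAD SPACE collation, eq_result is 1 because = ignores the trailing space.
-- LIKE treats the trailing space as significant, so like_result is 0.
SELECT 'a' = 'a '    AS eq_result,
       'a' LIKE 'a ' AS like_result;
```

On MySQL 8.0 you can check which behaviour a collation has by looking at the `PAD_ATTRIBUTE` column of `INFORMATION_SCHEMA.COLLATIONS`.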

As mentioned in this answer:

This behavior is specified in SQL-92 and SQL:2008. For the purposes of comparison, the shorter string is padded to the length of the longer string.

From the draft (8.2 <comparison predicate>):

If the length in characters of X is not equal to the length in characters of Y, then the shorter string is effectively replaced, for the purposes of comparison, with a copy of itself that has been extended to the length of the longer string by concatenation on the right of one or more pad characters, where the pad character is chosen based on CS. If CS has the NO PAD characteristic, then the pad character is an implementation-dependent character different from any character in the character set of X and Y that collates less than any string under CS. Otherwise, the pad character is a <space>.

One solution:

SELECT * FROM Foo WHERE CHAR_LENGTH(field) != CHAR_LENGTH(TRIM(field))
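Since the goal is to preview what the UPDATE would change, it may help to show the value before and after trimming side by side. A minimal sketch, assuming the same `Foo` table and `field` column as in the question; the brackets and the `before_trim` / `after_trim` aliases are only there to make the whitespace visible:

```sql
-- Preview the rows that TRIM() would change, with the value wrapped in
-- brackets so leading and trailing spaces are easy to spot.
SELECT
    field,
    CONCAT('[', field, ']')       AS before_trim,
    CONCAT('[', TRIM(field), ']') AS after_trim
FROM Foo
WHERE CHAR_LENGTH(field) != CHAR_LENGTH(TRIM(field));
```

CHAR_LENGTH() counts characters rather than bytes, so the comparison should also behave with multi-byte character sets.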
Community
eggyal
41
SELECT *
FROM 
    `foo`
WHERE 
   (name LIKE ' %')
OR 
   (name LIKE '% ')
user4035
  • CORRECT! As per this well-documented idiosyncrasy in MySQL, `LIKE` respects leading and trailing whitespace, whereas `=` does not: https://bugs.mysql.com/bug.php?id=64772 – Joshua Pinter Jul 24 '17 at 22:59
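If you only need a feel for how many rows are affected, the same `LIKE` patterns can be summed, since MySQL evaluates a boolean expression to 1 or 0. A small sketch reusing the `foo` table and `name` column from the query above; the aliases are made up:

```sql
-- Count rows with a leading space and rows with a trailing space.
-- A row with both is counted in each column; NULL names are ignored by SUM().
SELECT
    SUM(name LIKE ' %') AS leading_space_rows,
    SUM(name LIKE '% ') AS trailing_space_rows
FROM `foo`;
```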
8

Here is an example using a regular expression:

SELECT *
FROM 
    `foo`
WHERE 
   (name REGEXP '(^[[:space:]]|[[:space:]]$)')
aaiezza
EpixRu
  • @PatrickBassut Assuming, by whitespace the OP meant [whitespace](https://en.wikipedia.org/wiki/Whitespace_character#Unicode) rather than just U+0020, this is the only correct answer. – maaartinus Jun 11 '20 at 23:09
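One thing to keep in mind: TRIM(field) without further arguments only removes spaces, so rows matched by the regex because of a tab or newline would not be changed by the UPDATE from the question. Assuming MySQL 8.0 or later, where REGEXP_REPLACE() is available, a preview of a broader cleanup could look roughly like this (the `cleaned` alias is made up):

```sql
-- Preview stripping any leading/trailing whitespace, not just spaces.
-- Requires MySQL 8.0+ for REGEXP_REPLACE().
SELECT
    name,
    REGEXP_REPLACE(name, '^[[:space:]]+|[[:space:]]+$', '') AS cleaned
FROM `foo`
WHERE name REGEXP '^[[:space:]]|[[:space:]]$';
```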
0

Another solution could be to use SUBSTRING() and IN to compare the first and last characters of the string with a list of whitespace characters...

(SUBSTRING(@s,  1, 1) IN (' ', '\t', '\n', '\r') OR SUBSTRING(@s, -1, 1) IN (' ', '\t', '\n', '\r'))

...where @s is any input string. Add additional whitespace characters to the comparison list as needed in your case.

Here's a simple test to demonstrate how that expression behaves with various string inputs:

SET @s_normal = 'x';
SET @s_ws_leading = '\tx';
SET @s_ws_trailing = 'x ';
SET @s_ws_both = '\rx ';

SELECT
    NOT(SUBSTRING(@s_normal,      1, 1) IN (' ', '\t', '\n', '\r') OR SUBSTRING(@s_normal,     -1, 1) IN (' ', '\t', '\n', '\r')) test_normal      #=> 1 (PASS)
  ,    (SUBSTRING(@s_ws_leading,  1, 1) IN (' ', '\t', '\n', '\r') OR SUBSTRING(@s_ws_leading, -1, 1) IN (' ', '\t', '\n', '\r')) test_ws_leading  #=> 1 (PASS)
  ,    (SUBSTRING(@s_ws_trailing, 1, 1) IN (' ', '\t', '\n', '\r') OR SUBSTRING(@s_ws_trailing,-1, 1) IN (' ', '\t', '\n', '\r')) test_ws_trailing #=> 1 (PASS)
  ,    (SUBSTRING(@s_ws_both,     1, 1) IN (' ', '\t', '\n', '\r') OR SUBSTRING(@s_ws_both,    -1, 1) IN (' ', '\t', '\n', '\r')) test_ws_both     #=> 1 (PASS)
;

If this is something you'll be doing a lot, you could also create a function for it:

DROP FUNCTION IF EXISTS has_leading_or_trailing_whitespace;

CREATE FUNCTION has_leading_or_trailing_whitespace(s VARCHAR(2000))
  RETURNS BOOLEAN
  DETERMINISTIC
RETURN (SUBSTRING(s, 1, 1) IN (' ', '\t', '\n', '\r') OR SUBSTRING(s, -1, 1) IN (' ', '\t', '\n', '\r'))
;

# test
SELECT
    NOT(has_leading_or_trailing_whitespace(@s_normal     )) #=> 1 (PASS)
  ,     has_leading_or_trailing_whitespace(@s_ws_leading )  #=> 1 (PASS)
  ,     has_leading_or_trailing_whitespace(@s_ws_trailing)  #=> 1 (PASS)
  ,     has_leading_or_trailing_whitespace(@s_ws_both    )  #=> 1 (PASS)
;
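Applied to the table from the question, the function could be used like this (a sketch, assuming the `Foo` table and `field` column from the question and that the function above has been created):

```sql
-- List the rows whose value starts or ends with one of the listed
-- whitespace characters. The CHAR_LENGTH() check skips empty strings.
SELECT *
FROM Foo
WHERE CHAR_LENGTH(field) > 0
  AND has_leading_or_trailing_whitespace(field);
```

The extra CHAR_LENGTH() condition is there because, under a PAD SPACE collation, an empty string compares equal to `' '`, so empty values may otherwise be flagged as well.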
raven-rock