4

I have some whitespace at the begining of a paragraph in a text field in MySQL.

Using trim($var_text_field) in PHP or TRIM(text_field) in MySQL statements does absolutely nothing. What could this whitespace be and how do I remove it by code?

If I go into the database and backspace it out, it saves properly. It's just not being removed via the trim() functions.

Bob Stein
  • 16,271
  • 10
  • 88
  • 101
kylex
  • 14,178
  • 33
  • 114
  • 175

7 Answers7

13
function UberTrim($s) {
    $s = preg_replace('/\xA0/u', ' ', $s);  // strips UTF-8 NBSP: "\xC2\xA0"
    $s = trim($s);
    return $s;
}

The UTF-8 character encoding for a no-break space, Unicode (U+00A0), is the 2-byte sequence C2 A0. I tried to make use of the second parameter to trim() but that didn't do the trick. Example use:

assert("abc" === UberTrim("  \r\n  \xc2\xa0  abc  \t \xc2\xa0   "));

A MySQL replacement for TRIM(text_field) that also removes UTF no-break spaces, thanks to @RudolfRein's comment:

TRIM(REPLACE(text_field, '\xc2\xa0', ' '))

UTF-8 checklist:

(more checks here)

  1. Make sure your PHP source code editor is in Encode in UTF-8 without BOM (Notepad++) UTF-8 mode without BOM. Or set in the preferences.

  2. Make sure your MySQL client is set for UTF-8 character encoding (more here and here), e.g.

    $pdo = new PDO('mysql:host=...;dbname=...;charset=utf8',$userid,$password); $pdo->exec("SET CHARACTER SET utf8");

  3. Make sure your HTTP server is set for UTF-8, e.g. for Apache:

    AddDefaultCharset UTF-8

  4. Make sure the browser expects UTF-8.

    header('Content-Type: text/html; charset=utf-8');

    or

    <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />

Community
  • 1
  • 1
Bob Stein
  • 16,271
  • 10
  • 88
  • 101
  • Thanks to @tchrist suggestion [here](http://stackoverflow.com/a/7262978/673991) about the 'u' flag in preg_replace(). – Bob Stein Mar 28 '13 at 01:18
  • YESS thank you !!! my sql data has stange spaces in it... I solved thanks to your \xc2\xa0 like this: WHERE REPLACE(rowname, '\xc2\xa0', ' ') LIKE '$searchstring%' – Jaxx0rr Jul 24 '14 at 16:25
7

If the problem is with UTF-8 NBSP, another simple option is:

REPLACE(the_field, UNHEX('C2A0'), ' ')
Gabriel
  • 161
  • 2
  • 3
3

The best solution is a combination of a few things mentioned to you already.

First run ORD() on the string in question. In my case I had to run a reverse first because my problem character was at the end of the string.

ORD(REVERSE([col name])) 

Once you discover the problematic char, run a

REPLACE([col_name], char([char_value_returned]), char(32))

Finally, call a proper

TRIM([col_name])

This will completely eradicate the problem character from all aspects of the string, and trim off the leading (in my case trailing) character.

pim
  • 12,019
  • 6
  • 66
  • 69
2

Try using the MySQL ORD() function on the text_field to check the character code of the left-most character. It can be a non-whitespace characters that appears like whitespace.

Xint0
  • 5,221
  • 2
  • 27
  • 29
  • This outputs 32, what does that mean? – kylex Mar 14 '11 at 17:17
  • 32 is the code of the normal space character, which should be removed by TRIM() MySQL function. Try `ORD(TRIM(text_field))`. – Xint0 Mar 15 '11 at 02:34
  • This helped me, my non-breaking spaces were returning code 49824. Here is what I used to track entries and do a manual cleanup `SELECT * FROM table WHERE (ord(substr(text_field,-1))) > 255 or (ord(substr(text_field,1,1))) > 255'` – Benjamin Feb 11 '13 at 21:08
  • 1
    @Benjamin, 49824 decimal = C2A0 hexadecimal. This is the 2-character UTF-8 code for NO-BREAK SPACE, Unicode (U+00A0). See my answer below. – Bob Stein Mar 28 '13 at 00:45
1

you have to detect these "whitespace" characters first. if it's some HTML entity, like &nbsp; no trimming function would help, of course.

I'd suggest to print it out like this

echo urlenclde($row['field']);

and see what it says

Well as its A0 (or 160 decimal) non-breaking space character, you can convert it to ordinal space first:

<pre><?php
$str = urldecode("%A0")."bla";
var_dump(trim($str));
$str = str_replace(chr(160)," ",$str);
$str = trim($str);
var_dump($str);

and, ta-dam! -

string(4) " bla"
string(3) "bla"
Your Common Sense
  • 156,878
  • 40
  • 214
  • 345
0

Try to check what character each "whitespace" is by writing the charactercode out - It might be a non-visible charactertype that isn't removed by trim. Trim only removes a few characters such as whitespace, tab, newline, CR and NUL but there exist other non-visible characters that might cause this problem.

Milk78
  • 239
  • 3
  • 13
-1

try


str_ireplace(array("\r", "\n", "\t"), $var_text_field

Biraj Pandey
  • 104
  • 5
  • 11