3

I'm connecting to an external API using cfhttp, with the returned data in XML format. I have no control over the API or the format it's returned in.

When the data is returned, I loop through it and do cfquery inserts into my own MySQL database, which has a UTF8 charset.

However, some of the data appears to have unicode characters (it appears it should be the £ (pound) sign, but when I cfdump the XMLParsed data, it's showing as a diamond with a ? inside). I've attached a cropped screenshot showing part of the cfdump showing this;

enter image description here

The problem is the cfquery insert - when it gets to those characters, it's returning this error;

Error Executing Database Query.

Incorrect string value: '\xEF\xBF\xBD10 ...' for column 'voucherTitle' at row 1

I've tried setting the charset in the cfhttp call, but get the same result.

Is there any way I can either encode/decode these, or alternatively trim them out altogether (the data gets edited further down the line anyway, so manually adding the correct symbols isn't a huge issue).

Vikash Pandey
  • 5,407
  • 6
  • 41
  • 42
Lee
  • 989
  • 2
  • 14
  • 30
  • Try including `` on top of your ColdFusion page. – Pankaj Jun 01 '16 at 12:49
  • Same error unfortunately :-( – Lee Jun 01 '16 at 12:52
  • You could try replacing it with its character reference- `£` – TRose Jun 01 '16 at 15:45
  • 2
    (Edit) @Lee - I recall [something similar on another thread](http://stackoverflow.com/questions/26788570/dealing-with-mysql-nativeerror-code-1366-and-sqlstate-hy000-in-coldfusion/26792514#26792514). Do your DSN settings use `characterEncoding=UTF8` or is the INSERT using `CF_SQL_NVARCHAR`? Also, I know you said the db uses UTF-8, but be sure to verify the column charset in `information_schema.columns`. Side note, `` is only needed when the unicode characters are literally embedded in the cfml source. – Leigh Jun 01 '16 at 16:00
  • 1
    @Leigh - never in a million years would I have thought to check the column charset - I always thought it would default to whatever the DB was set to. However, that was indeed the problem - set it to UTF8 and all works a treat! EDIT - please do post that as an answer so I can accept as the answer – Lee Jun 01 '16 at 16:54
  • @Lee - Yeah, while column charset totally makes sense, I remember overlooking it too ;-) Did you also get "MySQL NativeError Code 1366 and SQLState HY000 in coldfusion" too? If so, it is probably better to just mark this as a dupe and link it to the existing answer, since the answer is the same :) – Leigh Jun 01 '16 at 17:53
  • @Leigh - no, I didn't get those errors oddly – Lee Jun 01 '16 at 18:22
  • *always thought it would default to whatever the DB was set to* According to the docs, the final charset and collation [depend on what values were used in the DDL statements](https://dev.mysql.com/doc/refman/5.7/en/charset-table.html). Most likely the IDE used to create the tables uses a different default value for collate and/or charset default than the one you have set at the database level, ie utf8. – Leigh Jun 01 '16 at 20:08

1 Answers1

1

UPDATE: As of MySQL 5.5.3, there is also UTF8mb4 which is often recommended over UTF8.


(From the comments)

I recall something similar on another thread. Double check the collation and character set for that column using the INFORMATION_SCHEMA.COLUMNS view:

 SELECT  *
 FROM    INFORMATION_SCHEMA.COLUMNS
 WHERE   TABLE_NAME = 'YourTableName'

If it is not UTF-8, you can change it using the ALTER TABLE command. Modify the column size M as needed.

 ALTER TABLE YourTableName 
    MODIFY YourColumnName VARCHAR(M) 
    CHARACTER SET utf8;

NB: If the data is important, always make a backup of the table before applying any modifications.

See also: 11.1.15 Character Sets and Collations Supported by MySQL

Community
  • 1
  • 1
Leigh
  • 28,765
  • 10
  • 55
  • 103