0

UPDATE: Here's a Pastebin of the JSON code. If anyone could please tell me why MySQL refuses to read this as a TEXT variable and how to get around it, I would appreciate it: https://pastebin.com/ju8xPPsV

SO I've got a strange one here. Running 8.0.18 on Windows. My environment is all utf8mb4 and utf8mb4_unicode_ci.

The issue is MySQL refuses to read the attached JSON text into a TEXT field, regardless of what CHARACTER SET is defined, always presenting the following error when trying to set the TEXT variable to the code:

ERROR 1366 (HY000): Incorrect string value: '\x96 2019...' for column 'json_text' at row 1

Obviously there's a character in the text MySQL doesn't like, but what character, where, and how do I 'scrub' this text so it will be accepted?

This happens on the line that tries to assign the JSON to a Text value:

set json_text= JSON_EXTRACT((injason), CONCAT('$.', 'in_JSON'));

Here's the complete code:

DROP PROCEDURE IF EXISTS `sp_jason`;
DELIMITER //
CREATE DEFINER=`root`@`localhost` 
PROCEDURE `sp_jason`(
    IN parameterName VARCHAR(10)
)
COMMENT 'schema_version_applied_to: v1.0.0'
master_block:BEGIN
 DECLARE json_text text CHARACTER  SET latin1;
 DECLARE injason JSON;

DECLARE exit handler for SQLEXCEPTION, SQLWARNING, NOT FOUND
 BEGIN
  GET DIAGNOSTICS CONDITION 1 @sqlstate = RETURNED_SQLSTATE, 
   @errno = MYSQL_ERRNO, @text = MESSAGE_TEXT;
  SET @full_error = CONCAT("ERROR ", @errno, " (", @sqlstate, "): ", @text);
  SELECT @full_error;
 END;


 SET @hugeJason='
{
    "in_JSON": {
        "data_value": {
            "name": "Wesley Snell",
            "about": "Over 15 years of R&D development on over 30 products in various categories (automotive, medical, telecom, mobil devices, navigation, security, industrial automation, consumer products, CAD, hospitality solutions, kiosk, point of sale ...",
            "email": "wesley@qualcomm.com",
            "articles": [
                {
                    "linkURL": "https://www.linkedin.com/pulse/invitation-part-state-ai-ml-december-2019-january-2020-kiran-gunnam"
                }
            ],
            "nameLast": "Snell",
            "education": [
                {
                    "degreeName": "Master\'s degree",
                    "schoolName": "University of California, San Diego - Jacobs School of Engineering",
                    "description": "The next step in the wireless revolution is the connection of everyday devices through wireless technology. Embedded wireless technology is today exploding in nearly every market sector from personal electronics and medical devices, to the transportation infrastructure and manufacturing. ...",
                    "fieldOfStudy": "Wireless Embedded Systems",
                    "schoolImageURL": "https://media-exp1.licdn.com/dms/image/C4D0BAQHEc15VMuDaGg/company-logo_100_100/0?e=1589414400&v=beta&t=VA3iTzTB4ZBq-iYp_Fd4FYbCG-4YV7gDL47JCXaMPk4",
                    "activitiesAndSocieties": "Minor study in Digital Hardware Engineering and Business Management",
                    "datesAttendedOrGraduationString": "2017 – 2019"
                }
            ],
            "interests": [
                {
                    "title": "Ford Motor Company",
                    "imageURL": "https://media-exp1.licdn.com/dms/image/C4D0BAQGUxHv2MadZ9w/company-logo_100_100/0?e=1589414400&v=beta&t=t6v_lnhbQYiUjdGSiPBSQ89PpTMges1JsxsBQHtSfSY",
                    "occupation": null,
                    "followerCount": 2254781
                },
                {
                    "title": "Bill Gates",
                    "imageURL": "https://media-exp1.licdn.com/dms/image/C5603AQHv9IK9Ts0dFA/profile-displayphoto-shrink_100_100/0?e=1586390400&v=beta&t=oC0Mo8Qep7-PeP7TSjNEx94AbX4um89htO25XGhJZFE",
                    "occupation": "Co-chair, Bill & Melinda Gates Foundation",
                    "followerCount": 24625634
                }
            ],
            "nameFirst": "Wesley",
            "experience": [
                {
                    "roles": [
                        {
                            "jobType": "full-time",
                            "jobTitle": "Senior Staff Engineer - Machine Learning , Computer Vision & IOT (R&D Division)",
                            "location": null,
                            "currentRole": true,
                            "jobDescription": "Working on machine learning based computer vision based projects on embedded platforms. Heavy use of programming in C. Modern C++, Python and MATLAB. Heavy interaction in hardware, schematic review and inter process communications.",
                            "employmentDuration": {
                                "datesEmployedEnd": "Present",
                                "datesEmployedStart": "2019",
                                "datesEmployedString": "2019 – Present",
                                "employmentDurationYears": 1,
                                "employmentDurationString": "1 yr"
                            }
                        },
                        {
                            "jobType": "full-time",
                            "jobTitle": "Senior Staff Engineer - Android Team & Wireless Communications (Wireless R&D Division)",
                            "location": null,
                            "currentRole": true,
                            "jobDescription": "A year plus of Android software development for mobile applications with coding in Java and c++ with the NDK. ...",
                            "employmentDuration": {
                                "datesEmployedEnd": "Present",
                                "datesEmployedStart": "2019",
                                "datesEmployedString": "",
                                "employmentDurationYears": 1,
                                "employmentDurationString": "1 yr"
                            }
                        },
                        {
                            "jobType": null,
                            "jobTitle": "Senior Staff Engineer - R&D Division (Platform & Tools Lead)",
                            "location": "San Diego",
                            "currentRole": false,
                            "jobDescription": "Working on new Research and Developments projects, for state of the art products. ...",
                            "employmentDuration": {
                                "datesEmployedEnd": "Present",
                                "datesEmployedStart": "Apr 2016",
                                "datesEmployedString": "Apr 2016 – Present",
                                "employmentDurationYears": 3,
                                "employmentDurationMonths": 11,
                                "employmentDurationString": "3 yrs 11 mos"
                            }
                        }
                    ],
                    "currentJob": true,
                    "companyName": "Qualcomm",
                    "employmentDurationYears": 14,
                    "employmentDurationString": "14 yrs 7 mos"
                },
                {
                    "roles": [
                        {
                            "jobType": null,
                            "jobTitle": "President/Founder",
                            "location": "San Diego",
                            "currentRole": false,
                            "jobDescription": "Many projects in the following areas. - Internet Applications (JavaScript/ HTML/ SQL). ...",
                            "employmentDuration": {
                                "datesEmployedEnd": "Feb 2008",
                                "datesEmployedStart": "Jan 1998",
                                "datesEmployedString": "Jan 1998 – Feb 2008",
                                "employmentDurationYears": 10,
                                "employmentDurationString": "10 yrs 2 mos"
                            }
                        }
                    ],
                    "currentJob": false,
                    "companyName": "Creative Logic",
                    "employmentDurationYears": 10,
                    "employmentDurationString": "10 yrs 2 mos"
                },
                {
                    "roles": [
                        {
                            "jobType": null,
                            "jobTitle": "Principal R&D Engine",
                            "location": null,
                            "currentRole": false,
                            "jobDescription": null,
                            "employmentDuration": {
                                "datesEmployedEnd": "Dec 1999",
                                "datesEmployedStart": "Jan 1998",
                                "datesEmployedString": "Jan 1998 – Dec 1999 2 yrs",
                                "employmentDurationYears": 2,
                                "employmentDurationString": "2 yrs"
                            }
                        }
                    ],
                    "currentJob": false,
                    "companyName": "Infogation",
                    "employmentDurationYears": 10,
                    "employmentDurationString": "Jan 1998 – Dec 1999"
                }
            ],
            "nameMiddle": "",
            "profileURL": "https://www.linkedin.com/in/wesleysnell",
            "currentCompany": {
                "name": "123",
                "imageURL": "https://media-exp1.licdn.com/dms/image/C4E0BAQFIzy7XU945xg/company-logo_100_100/0?e=1589414400&v=beta&t=eIPPusNFhCPBHri11dcijhuyiSsYWJUehf_KUOFCvXQ"
            },
            "accomplishments": [
                {
                    "projects": [
                        {
                            "data": "Jun 2017",
                            "title": "Data Scientist Competition winner",
                            "issuer": "DataCamp",
                            "subTitle": "Jun 2017 DataCamp",
                            "description": "Data Scientist with Python Career Track on DataCamp"
                        },
                        {
                            "data": "Jun 2017",
                            "title": "Data Scientist Competition winner",
                            "issuer": "DataCamp",
                            "subTitle": "Jun 2017 DataCamp",
                            "description": "Data Scientist with Python Career Track on DataCamp"
                        }
                    ],
                    "honorAndAwards": [
                        {
                            "data": "Jun 2017",
                            "title": "Data Scientist Competition winner",
                            "issuer": "DataCamp",
                            "subTitle": "Jun 2017 DataCamp",
                            "description": "Data Scientist with Python Career Track on DataCamp"
                        },
                        {
                            "data": "Jun 2017",
                            "title": "Data Scientist Competition winner",
                            "issuer": "DataCamp",
                            "subTitle": "Jun 2017 DataCamp",
                            "description": "Data Scientist with Python Career Track on DataCamp"
                        }
                    ]
                }
            ],
            "currentJobTitle": "Senior Staff Engineer - Machine Learning , Computer Vision & IOT (R&D Division) at Qualcomm",
            "currentLocation": "San Diego, California",
            "profileImageURL": "https://media-exp1.licdn.com/dms/image/C4D03AQGoa2edXG6hWA/profile-displayphoto-shrink_200_200/0?e=1586390400&v=beta&t=f-Rv8pyCxIAuDRfK8g6AxkNHf-t3MGbZRfCvcOFghoY",
            "recommendations": {
                "given": [
                    {
                        "recommender": {
                            "name": "Steve Jobs",
                            "headline": "Founder Apple Inc.",
                            "relation": "October 22, 2010, Steve managed Wesley directly"
                        },
                        "recommendation": "Wes is one of the sharpest engineers I have worked with. His biggest assets are his technical versatility and engineering creativity ..."
                    },
                    {
                        "recommender": {
                            "name": "Bill Gates",
                            "headline": "Founder Windows Inc.",
                            "relation": "October 22, 2015, Bill was senior to Wesley but didn\'t manage"
                        },
                        "recommendation": "Wes is one of the sharpest engineers I have worked with. His biggest assets are his technical versatility and engineering creativity ..."
                    }
                ],
                "received": [
                    {
                        "recommender": {
                            "name": "Steve Jobs",
                            "headline": "Founder Apple Inc.",
                            "relation": "October 22, 2010, Steve managed Wesley directly"
                        },
                        "recommendation": "Wes is one of the sharpest engineers I have worked with. His biggest assets are his technical versatility and engineering creativity ..."
                    },
                    {
                        "recommender": {
                            "name": "Bill Gates",
                            "headline": "Founder Windows Inc.",
                            "relation": "October 22, 2015, Bill was senior to Wesley but didn\'t manage"
                        },
                        "recommendation": "Wes is one of the sharpest engineers I have worked with. His biggest assets are his technical versatility and engineering creativity ..."
                    }
                ],
                "givenCount": 7,
                "receivedCount": 15
            },
            "connectionDegree": 3,
            "numberOfConnections": 500,
            "volunteerExperience": [
                {
                    "cause": "Environment",
                    "companyName": "Pacific Ridge Community",
                    "description": "Community board planning member for everything in the Sorrento Valley Community. ...",
                    "designation": "Board Member (past Treasurer)",
                    "volunteerDuration": "16 yrs 2 mos",
                    "datesVolunteeredString": "Jan 2004 – Present"
                }
            ],
            "connectionDateString": "June 14, 2014",
            "skillsAndEndorsements": {
                "topSkills": [
                    {
                        "skillName": "Embedded Systems",
                        "eliteEndorsers": [
                            {
                                "name": "Sean Liming",
                                "headline": "Owner of AnnaBooks, LLC."
                            },
                            {
                                "name": "Franz Fisher",
                                "headline": "Director of NASA"
                            }
                        ],
                        "endorsementCount": 79,
                        "colleagueEndorsers": [
                            {
                                "name": "Sean Liming",
                                "headline": "Owner of AnnaBooks, LLC."
                            },
                            {
                                "name": "Franz Fisher",
                                "headline": "Director of NASA"
                            }
                        ]
                    },
                    {
                        "skillName": "Wireless",
                        "endorsementCount": 78
                    }
                ],
                "otherSkills": [
                    {
                        "skillName": "Hardware",
                        "endorsementCount": 2
                    },
                    {
                        "skillName": "USB",
                        "endorsementCount": 6
                    }
                ],
                "industrySkills": [
                    {
                        "skillName": "Mobile Devices",
                        "endorsementCount": 50
                    },
                    {
                        "skillName": "Wireless",
                        "endorsementCount": 78
                    }
                ],
                "toolsAndTechnologies": [
                    {
                        "skillName": "C++",
                        "endorsementCount": 15
                    },
                    {
                        "skillName": "Andriod",
                        "endorsementCount": 9
                    }
                ]
            },
            "licensesAndCertifications": [
                {
                    "issuedDate": "Nov 2019",
                    "Issuing authority": "Udemy",
                    "certificationName": "Certified Computer Professional (CCP)",
                    "certificationLinkURL": "https://www.udemy.com/certificate/UC-8U59W6QZ",
                    "credentialIdentifier": "Credential ID UC-8U59W6QZ"
                }
            ]
        },
        "data_state_c": 1
    },
    "in_JSON_value": "data_value",
    "in_sql_insert_1": "data_state_c,",
    "in_sql_insert_2": "1,"
}
';


set injason = cast(@hugeJason as json);
set json_text= JSON_EXTRACT((injason), CONCAT('$.', 'in_JSON'));
select json_text;


END master_block//
DELIMITER ;


call sp_jason(1);

Anyone know what's causing the problem? @Nick! Are you out there? :)

Floobinator
  • 388
  • 2
  • 11

2 Answers2

0

Maybe this answer will help you: MySQL warning: Incorrect string value: '\x96

The hex 96 is presumably the latin1 encoding for an en-dash (–). But you have specified that the CSV file is utf8-encoded (or utf8mb4), this character is incomprehensible to utf8.

Plan A: Change the file. (This is probably not practical.)

Plan B: Tell MySQL that the file is latin1 (as opposed to utf8). Then MySQL will convert it correctly to the utf8-encoding E28093.

"Collation" has to do with sorting and comparing; "Character set" has to do with 'encoding'.

Perhaps the file containing the query is incorrectly encoded.

loverwien
  • 55
  • 5
  • Interesting; how would I change the "text" per your recommendation? As you can see, the TEXT variable we're trying to extract to is already defined as Latin1; wouldn't that have solved the problem if that was the issue? Or do you mean setting the core JSON variable to something other than the default of utf8mb4? How would I do that? Also, as you can see this is not a file but an entered 'string' defined in the stored procedure itself (as a JSON variable). – Floobinator Mar 12 '20 at 16:22
  • Further information, if I set the big JSON string to a TEXT value of Latin1 (e.g. change @hugeJason to var_jason (as text CHARACTER SET latin1) it gives the exact same error, so it's definitely a problem with the character set - but ultimately this makes no sense to me. The whole charset/UTF8 garbage was never, ever an issue with MSSQL. *sigh* – Floobinator Mar 12 '20 at 16:46
  • I would have guessed that you would have to change the encoding of the @hugeJason variable before it was cast to JSON to remove the invalid characters. The documentation also says: JSON text, should be encoded using the ascii, utf8, or utf8mb4 character set. Other character encodings are implicitly coerced to utf8mb4. (https://dev.mysql.com/doc/refman/5.7/en/json.html). Nevertheless I am glad that I could help you! – loverwien Mar 12 '20 at 17:09
  • the problem is even if I declare the TEXT variable as CHARACTER SET utf8mb4, setting that value to the 'big JSON string' (like you see in the above example) results in the error. MySQL simply does not want to 'fix' the characters (or encode) - from what I can tell. I have no idea how to work around this. – Floobinator Mar 12 '20 at 17:15
  • Unfortunately I cannot reproduce the problem, but maybe that will help: SET @hugeJason = SELECT CONVERT(BINARY CONVERT(@hugeJason USING latin1) USING utf8); (https://stackoverflow.com/questions/1476356/detecting-utf8-broken-characters-in-mysql) – loverwien Mar 12 '20 at 17:27
  • I'm starting to wonder if this could be a problem with my IDE, dbForge Studio 2019. Perhaps the IDE is pasting the "wrong" charset in the editor? I went through the settings and can't find what charset to default to anywhere. Since my DBA buddy can't repro this error and you can't repro it (which makes no sense) perhaps that could be the issue? Ultimately I can't even set a variable to the attached Pastebin. It's that simple of an error. – Floobinator Mar 12 '20 at 17:43
0

So, I've solved the problem by pasting the text into https://jsonformatter.curiousconcept.com/, which showed the JSON structure as being fine and in the green, but it stated "Escaped unescaped chracter". Sadly, it didn't tell me where or what character, and what's crazy is the 3 character escapes of ' (e.g. \') were already previously done. So none of this makes any sense, but it's obvious MySQL has a serious problem managing certain types of text, and even trying to define that text as utf8 or utf8mb4 or even latin1 has no impact on an issue like this, which is absurd. Sadly, I learned nothing from this "fix" and still don't have a clue why the error happened and why my buddy (a MySQL DBA) was able to run the code without any problems when the EXACT same code caused errors on my end; but now it's magically fixed when I run it through the json processing site (remember, the issue was setting it to a TEXT value and ultimately had nothing to do with JSON formatting). 95% of every issue I've encountered with MySQL is related to either Character Set or Collation problems. I NEVER had issues like this with MSSQL. I regret ever choosing MySQL.

Floobinator
  • 388
  • 2
  • 11