Raw data:

Account 1234 � Rent

I am running this in MySQL by executing the script manually, and then running the same script from a Python shell:

SELECT
    SUBSTRING_INDEX(SUBSTRING_INDEX(Account,':',-1),' � ',-1) AS Description,
    SUBSTRING_INDEX(SUBSTRING_INDEX(Account,':',-1),' � ',1) AS Acct_Number
FROM table1;

MySQL Output (correct one)

Acct_Number    Description
1234           Rent

Python Output (incorrect one)

Acct_Number    Description
1234           1234 � Rent

Is there a way to get Python to read this bizarre � character? I have successfully used Python to run the script on similar Account data (also using SUBSTRING_INDEX) that contains a - instead of this � character, and it works fine.

In case this character is not showing in this post, here is a link to the character I am referring to: https://apps.timwhitlock.info/unicode/inspect?s=%EF%BF%BD

v.coder
dh411
  • It would be easier for people to help you if you post some code that you tried. What code did you use in python and which variable is showing that special character to you? – Siddardha Nov 11 '17 at 00:47

1 Answer

See "Trouble with UTF-8 characters; what I see is not what I stored" for the causes of the "black diamond" (�).

Also see http://mysql.rjweb.org/doc.php/charcoll#python for tips on using UTF-8 with Python and MySQL.

It may be important to find the hex that was actually stored, to determine whether the problem arose on INSERT or on display.
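As a rough aid for reading such hex output (for example from MySQL's `HEX()` function, as in `SELECT HEX(Account) FROM table1`), the sketch below prints the UTF-8 hex of a few characters that might sit between the account number and the description. The helper name `utf8_hex` and the list of candidate characters are illustrative assumptions, not taken from the question.

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of `text` as uppercase hex, like MySQL's HEX()
    would show for a utf8/utf8mb4 column (hypothetical helper)."""
    return text.encode("utf-8").hex().upper()


# Candidate separators; which one the real data contains is unknown.
candidates = {
    "replacement character U+FFFD": "\ufffd",
    "en dash U+2013": "\u2013",
    "hyphen-minus U+002D": "-",
}

for name, ch in candidates.items():
    print(f"{name}: {utf8_hex(ch)}")
```

If the stored hex shows EFBFBD between the two fields, the replacement character itself is in the table; a different sequence would suggest the data is intact and the damage happens later.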

As you noted, hex EFBFBD represents the black diamond (it is the UTF-8 encoding of U+FFFD, the replacement character), meaning that the problem was caused on the "storing" side of things.
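One way EFBFBD can end up stored is sketched below. It assumes, purely for illustration, that the original separator was an en dash encoded as cp1252 and that the bytes were then decoded as UTF-8 with replacement of invalid sequences; the actual source encoding in the question is not known.

```python
# Assumption: the source text used a cp1252 en dash (single byte 0x96).
raw = "Account 1234 \u2013 Rent".encode("cp1252")

# Treating those bytes as UTF-8 fails on 0x96; with errors="replace"
# the invalid byte becomes U+FFFD, the "black diamond".
decoded = raw.decode("utf-8", errors="replace")

# What would then be written to a UTF-8 column.
stored_hex = decoded.encode("utf-8").hex()

print(raw)
print(decoded)
print(stored_hex)
```

Once the replacement has happened, the original character cannot be recovered from the stored value, so the fix would have to be applied where the data is loaded.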

Rick James