
I have a serious problem with my database populate script: characters are not stored correctly. My code:

def _create_Historial(self):

    datos = [self.DB_HOST, self.DB_USER, self.DB_PASS, self.DB_NAME]

    conn = MySQLdb.connect(*datos)
    cursor = conn.cursor()
    cont = 0

    with open('principal/management/commands/Historial_fichajes_jugadores.csv', 'rv') as csvfile:
        historialReader = csv.reader(csvfile, delimiter=',')
        for row in historialReader:
            if cont == 0:
                cont += 1
            else:
                #unicodedata.normalize('NFKD', unicode(row[4], 'latin1')).encode('ASCII', 'ignore'),
                cursor.execute('''INSERT INTO principal_historial(jugador_id, temporada, fecha, ultimoClub, nuevoClub, valor, coste) VALUES (%s,%s,%s,%s,%s,%s,%s)''',
                               (round(float(row[1]))+1,row[2], self.stringToDate(row[3]), unicode(row[4],'utf-8'), row[5], self.convertValue(row[6]), str(row[7])))

    conn.commit()
    cursor.close()
    conn.close()

The error is the following:

Traceback (most recent call last):
File "/home/tfg/pycharm-2016.3.2/helpers/pycharm/django_manage.py", line 41, in <module>
run_module(manage_file, None, '__main__', True)
File "/usr/lib/python2.7/runpy.py", line 188, in run_module
fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 82, in _run_module_code
mod_name, mod_fname, mod_loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/tfg/TrabajoFinGrado/demoTFG/manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/usr/local/lib/python2.7/dist-packages/django/core/management/__init__.py", line 443, in execute_from_command_line
utility.execute()
File "/usr/local/lib/python2.7/dist-packages/django/core/management/__init__.py", line 382, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 196, in run_from_argv
self.execute(*args, **options.__dict__)
File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 232, in execute
output = self.handle(*args, **options)
File "/home/tfg/TrabajoFinGrado/demoTFG/principal/management/commands/populate_db.py", line 230, in handle
self._create_Historial()
File "/home/tfg/TrabajoFinGrado/demoTFG/principal/management/commands/populate_db.py", line 217, in _create_Historial
(round(float(row[1]))+1,row[2], self.stringToDate(row[3]), unicode(row[4],'utf-8'), row[5], self.convertValue(row[6]), str(row[7])))
File "/usr/local/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 187, in execute
query = query % tuple([db.literal(item) for item in args])
File "/usr/local/lib/python2.7/dist-packages/MySQLdb/connections.py", line 278, in literal
return self.escape(o, self.encoders)
File "/usr/local/lib/python2.7/dist-packages/MySQLdb/connections.py", line 208, in unicode_literal
return db.literal(u.encode(unicode_literal.charset))
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 6-7: ordinal not in range(256)

The characters were shown as follows: Nicolás Otamendi, Gaël Clichy ....

When I print the characters in the Python shell, they are shown correctly.

Sorry for my English :(

kylehide65
  • Done! The solution is here: http://es.stackoverflow.com/questions/51284/codificaci%C3%B3n-mysql-y-python/51349#51349 – kylehide65 Feb 24 '17 at 00:33

1 Answer


Ok, I'll keep this brief.

  1. Convert encoded data (str) to Unicode early in your code. Don't scatter inline .decode()/.encode()/unicode() calls through it.

  2. In Python 2.7, the built-in open() gives you undecoded byte strings (str). Use io.open(filename, encoding='utf-8') instead, which reads the file as text and decodes it from UTF-8 to Unicode.

  3. The Python 2.7 CSV module is not Unicode compatible. You should install https://github.com/ryanhiebert/backports.csv

  4. You need to tell the MySQL driver that you're going to pass Unicode and use UTF-8 for the connection. This is done by adding the following keyword arguments to your MySQLdb.connect() call:

    charset='utf8',
    use_unicode=True
    
  5. Pass Unicode strings to MySQL. Use the u'' prefix on literals to avoid troublesome implicit conversion.

  6. With the changes above, all your CSV data is already Unicode. There's no need to convert it.
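
Points 1 to 3 can be tried without a database. The snippet below is only a sketch: the sample data, the temporary file and the read_rows helper are invented for the illustration, and on Python 3 it falls back to the standard csv module, which already works on text.

```python
import io
import os
import sys
import tempfile

if sys.version_info[0] == 2:
    from backports import csv  # Unicode-aware reader for Python 2.7
else:
    import csv  # the Python 3 csv module already works on text


def read_rows(path):
    """Decode the file once, at the boundary, and return rows of Unicode text."""
    # newline='' lets the csv reader handle line endings itself.
    with io.open(path, 'r', encoding='utf-8', newline='') as csvfile:
        return list(csv.reader(csvfile, delimiter=','))


# Hypothetical sample data, written as UTF-8 bytes like a file on disk.
sample = u'id,jugador\n1,Nicol\xe1s Otamendi\n2,Ga\xebl Clichy\n'
fd, path = tempfile.mkstemp(suffix='.csv')
with os.fdopen(fd, 'wb') as f:
    f.write(sample.encode('utf-8'))

rows = read_rows(path)
os.remove(path)
```

Every cell in rows is Unicode text, so nothing further down needs unicode() or .decode().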

Putting it all together, your code will look like:

import io

import MySQLdb
from backports import csv
datos = [self.DB_HOST, self.DB_USER, self.DB_PASS, self.DB_NAME]

conn = MySQLdb.connect(*datos, charset='utf8', use_unicode=True)
cursor = conn.cursor()
cont = 0

with io.open('principal/management/commands/Historial_fichajes_jugadores.csv', 'r', encoding='utf-8') as csvfile:
    historialReader = csv.reader(csvfile, delimiter=',')
    for row in historialReader:
        if cont == 0:
            cont += 1
        else:
            # The query parameters must be passed as a single tuple.
            cursor.execute(u'''INSERT INTO principal_historial(jugador_id, temporada, fecha, ultimoClub, nuevoClub, valor, coste) VALUES (%s,%s,%s,%s,%s,%s,%s)''',
                           (round(float(row[1]))+1, row[2], self.stringToDate(row[3]), row[4], row[5], self.convertValue(row[6]), row[7]))

conn.commit()
cursor.close()
conn.close()
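
As an optional variation, the cont counter used to skip the header row could be replaced with next(). The following is a minimal sketch that only builds the parameter tuples and never touches the database. The build_params helper, the column names and the sample row are hypothetical, and because stringToDate and convertValue are not shown in the question, those columns are passed through unchanged here.

```python
import io
import sys

if sys.version_info[0] == 2:
    from backports import csv  # Unicode-aware reader for Python 2.7
else:
    import csv


def build_params(csvfile):
    """Yield one parameter tuple per data row, skipping the header line."""
    reader = csv.reader(csvfile, delimiter=',')
    next(reader, None)  # skip the header row instead of counting rows
    for row in reader:
        # int() keeps the id an integer on both Python 2 and Python 3.
        jugador_id = int(round(float(row[1]))) + 1
        # Columns 3 and 6 would go through stringToDate / convertValue
        # in the real command; they are left as raw text in this sketch.
        yield (jugador_id, row[2], row[3], row[4], row[5], row[6], row[7])


# Hypothetical file contents held in memory as Unicode text.
sample = io.StringIO(
    u'n,jugador_id,temporada,fecha,ultimoClub,nuevoClub,valor,coste\n'
    u'0,4.0,15/16,2015-08-20,Club A,Club B,40,44\n'
)
params = list(build_params(sample))
```

A list built this way could also be handed to cursor.executemany() with the same INSERT statement, instead of calling cursor.execute() once per row.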

You may also want to look at https://stackoverflow.com/a/35444608/1554386, which covers what Python 2.7 Unicodes are.
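
To illustrate the difference, and the UnicodeEncodeError in the traceback, here is a small sketch. The traceback shows the driver encoding parameters with Latin-1; the name below is a hypothetical example picked because it contains a character outside Latin-1, since the row that actually failed is not shown in the question.

```python
# A Unicode string with a character that Latin-1 cannot represent.
# (Hypothetical sample; U+0107 is "c with acute".)
name = u'Joveti\u0107'

try:
    name.encode('latin-1')
    latin1_ok = True
except UnicodeEncodeError:
    # Same failure as in the traceback: ordinal not in range(256).
    latin1_ok = False

# UTF-8 can encode any Unicode character, and decoding restores the text.
utf8_bytes = name.encode('utf-8')
round_trip = utf8_bytes.decode('utf-8')
```

Here latin1_ok ends up False while the UTF-8 round trip returns the original text, which is why the connection is switched to charset='utf8'.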

Alastair McCormack