1

I am using pandas.to_sql to write data to an existing MySQL table. The code has been running in a crontab job for weeks without fail.

I started to get the following error: ValueError: MySQL identifier cannot be entirely numeric

My Code:

thisweek.to_sql(name='bs_reporting_weeklymetrics', con = cnx, flavor = 'mysql', if_exists = 'append', index=False)

As you can see the table name is not numeric.

AsAP_Sherb
  • 1,661
  • 1
  • 13
  • 16

1 Answers1

2

This is caused by an update in pandas 0.16.1, where I was using a previous version prior (I think 0.14.XX) EDIT: this will be fixed in pandas 0.16.2

With this update there is new code in the io.sql package for to_sql that is checking the table name and all column names for numeric characters:

def _get_valid_mysql_name(name):
# Filter for unquoted identifiers 
# See http://dev.mysql.com/doc/refman/5.0/en/identifiers.html
uname = _get_unicode_name(name)
if not len(uname):
    raise ValueError("Empty table or column name specified")

basere = r'[0-9,a-z,A-Z$_]'
for c in uname:
    if not re.match(basere, c):
        if not (0x80 < ord(c) < 0xFFFF):
            raise ValueError("Invalid MySQL identifier '%s'" % uname)
if not re.match(r'[^0-9]', uname):
    raise ValueError('MySQL identifier cannot be entirely numeric')

return '`' + uname + '`'

re.match(r'[0-9], uname) returns as None if the uname value is only numeric OR STARTS with a numeric character. I think this is a bug because MySQL supports column names which contain and start with numeric characters (which I had '90DayTrailingAvgRevenue')

You can update the pandas code to use:

if re.match(r'[0-9][0-9]*$', uname):

inplace of that line. This changes the regular expression to look for 1 or more numeric characters and the end of line, so that it qualifies the entire name as numeric and not just the first character. It also switches to a positive match instead of negative, so I removed 'not'

If you don't want to mess with the pandas package, then I suggest renaming your columns to not start with numerics.

joris
  • 133,120
  • 36
  • 247
  • 202
AsAP_Sherb
  • 1,661
  • 1
  • 13
  • 16
  • Can you open an issue for this on github? https://github.com/pydata/pandas/issues – joris Jun 02 '15 at 08:40
  • BTW, using `flavor='mysql'` is deprecated and will be removed in a future version. You will have to start using sqlalchemy to use ``to_sql`` – joris Jun 02 '15 at 08:40
  • Thanks @joris! I had been delaying that conversion. I am trying to convert now to using SQLAlchemy and am unable to write to sql without error (I was able to read_sql using a sqlalchemy engine and raw_connection() ) Would love any guidance on the best way to do so. I am opening another question, couldn't find an answer that addressed this issue and will link it when complete – AsAP_Sherb Jun 03 '15 at 21:31
  • new question on implementing sqlalchemy with to_sql: http://stackoverflow.com/questions/30631325/writing-to-mysql-database-with-pandas-using-sqlalchemy-to-sql – AsAP_Sherb Jun 03 '15 at 21:46
  • By the way, a fix for the above will be in 0.16.2 (bugfix release that will be released in one or two weeks), see the linked issue – joris Jun 03 '15 at 22:06
  • I am using pandas 0.17.1, and I am encountering similar problems with mysql table name starting with numeric characters. I had to comment part of the function def _get_valid_mysql_name(name): in pandas' sql.py to make it work. – tagoma Dec 09 '15 at 09:48