2

Scope: I am creating a data pipeline which requires me to SSH into a linux VM (all my code is there) and then send the data to a MariaDB Server

My python script extracts data from a CRM, saves it into a dataframe and then sends it to the DB.

when I run:

from sqlalchemy import create_engine
f= open("db credentials,txt", "r")  # parse credentials in txt to hide sensitive data
sql_conn = f.read()
conn = create_engine(sql_conn)
df.to_sql(name='test_trx_1day', con=conn, if_exists='replace', index=False)

I get the following error:

sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (3750, "Unable to create or change a table without
a primary key, when the system variable 'sql_require_primary_key' is set. Add a primary key to the table or unset t
his variable to avoid this message. Note that tables without a primary key can cause performance problems in row-ba
sed replication, so please consult your DBA before changing this setting.")

I'm running:
python: 3.7.3
pyCharm IDE
using pymysql as mysql connector \

When I run my code on my machine and the destination is my local MYSQL DB it works!

Note: I am not that familiar with linux env so I am learning as I go along.

Dimitris
  • 41
  • 4
  • @snakecharmerb the table doesn't exist. I am creating it using the .to_sql method – Dimitris Aug 16 '21 at 18:08
  • 1
    Based on [other](https://stackoverflow.com/q/39407254/2144390) [questions](https://stackoverflow.com/q/50469391/2144390) it looks like you'll probably have to first create the table with a primary key and then use `to_sql(… , if_exists="append")`. – Gord Thompson Aug 16 '21 at 18:15
  • 2
    This seems to be a MySQL [setting](https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_sql_require_primary_key) - presumably the setting is different on your development server and the remote. – snakecharmerb Aug 16 '21 at 18:18
  • My experience with .to_sql is that it shouldn't care about primary key, makes me think there is something else triggering the error. Are you using pymysql to connect to MariaDB instead of MariaDB connector? – Ben Dickson Aug 16 '21 at 19:41
  • 2
    @BenDickson - It's true that `.to_sql()` doesn't care about primary keys. It doesn't even create a PK column with `index=True` which would be a (somewhat) reasonable assumption. The problem is that `.to_sql()` is getting SQLAlchemy to emit a CREATE TABLE statement with no primary key defined, which the server is rejecting because it has been configured with `sql_require_primary_key` set to `ON`. – Gord Thompson Aug 16 '21 at 20:09
  • @GordThompson you are correct. Creating the empty table and structure first (with a PK) in mysql cmd solved this issue. Many thanks all! – Dimitris Aug 17 '21 at 02:36

1 Answers1

2

Kudos to @gordthompson for the solution. Writing it here as I can't mark this as solved from the comments section.

  1. Create the empty table with appropriate data types in mysql first.
  2. Create your primary key. (I had an issue where I could not set an existing column as a PK and so created a default INT auto_increment one. I later altered the table to switch PK)
  3. Ran the same df.to_sql and no errors.
Dimitris
  • 41
  • 4