8

When I write Pandas DataFrame to my SQLite database using to_sql method it changes the .schema of my table even if I use if_exists='append'. For example after execution

with sqlite3.connect('my_db.sqlite') as cnx:
    df.to_sql('Resolved', cnx, if_exists='append')

original .schema:

CREATE TABLE `Resolved` (
`Name` TEXT NOT NULL COLLATE NOCASE,
`Count` INTEGER NOT NULL,
`Obs_Date` TEXT NOT NULL,
`Bessel_year` REAL NOT NULL,
`Filter` TEXT NOT NULL,
`Comments` TEXT COLLATE NOCASE
);

changes to:

CREATE TABLE Resolved (
                  [Name] TEXT,
  [Count] INTEGER,
  [Obs_Date] TEXT,
  [Bessel_year] REAL,
  [Filter] TEXT,
  [Comments] TEXT

                  );

How to save the original scheme of my table? I use pandas 0.14.0, Python 2.7.5

emesday
  • 6,078
  • 3
  • 29
  • 46
drastega
  • 1,581
  • 5
  • 30
  • 42
  • I don't think that `to_sql` supports `if_exists` param, it doesn't appear in the doc strings – EdChum Jun 05 '14 at 15:45
  • help(pd.DataFrame.to_sql) – drastega Jun 05 '14 at 15:46
  • sorry my mistake, this could be a bug then, do you have sqlalchemy installed – EdChum Jun 05 '14 at 15:47
  • `cnx`, is this a sqlite connection object or a sqlalchemy engine? + What version of pandas are you using? – joris Jun 05 '14 at 16:19
  • There is possibly a bug that the table name is not escaped. Can you try `pd.io.sql.has_table('Resolved', cnx)` and report the result? – joris Jun 05 '14 at 16:25
  • @joris, the result is `True` – drastega Jun 05 '14 at 16:30
  • @user2579566 no matter, it was something else but related. I posted an answer, can you check if that works? – joris Jun 05 '14 at 16:42
  • @user2579566 Actually I was wrong, I don't know where the bug is, and I also cannot reproduce this. Can you provide some more info? `df.info()`, `pd.show_versions()`, a `cursor.execute("select sql from sqlite_master where type = 'table' and name = 'Resolved';"); cursor.fetchall()` before and after the `to_sql` statement. But maybe we can continue the discussion on the issue I opened: https://github.com/pydata/pandas/issues/7355 – joris Jun 05 '14 at 20:37
  • Sorry @joris! I'm afraid it was my mistake because I also cannot reproduce this. I can remove this question. But anyway it was very helpful for me. I appreciate your help! – drastega Jun 06 '14 at 00:09

1 Answers1

9

Starting from 0.14 (what you are using), the sql functions are refactored to use sqlalchemy to improve the functionality`. See the whatsnew and docs on this.
The raw sqlite3 connection is still supported as a fallback (but that is the only sql flavor that is supported without sqlalchemy).

Using sqlalchemy should solve the issue. For this you can just create a sqlalchemy engine instead of the direct sqlite connection cnx:

engine = sqlalchemy.create_engine('sqlite:///my_db.sqlite')
df.to_sql('Resolved', engine, if_exists='append')

But I filed an issue for the case with the sqlite cnx fallback option: https://github.com/pydata/pandas/issues/7355

joris
  • 133,120
  • 36
  • 247
  • 202
  • When I execute your code I get /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sqlalchemy/dialects/sqlite/base.py:860: SAWarning: Could not instantiate type with reflected arguments [u'4']; using no arguments. coltype = self._resolve_type_affinity(type_) – drastega Jun 05 '14 at 16:55
  • It is a warning I see, but it does also not work? Are the data appended and the scheme not changed? – joris Jun 05 '14 at 18:50
  • Glad it works! But, without using the sqlalchemy engine, you still have the issue? Would you like to further discuss this? (I would like to fix this bug, but I can't reproduce it) See http://github.com/pydata/pandas/issues/7355 – joris Jun 05 '14 at 23:07
  • Now I have no bug without using the sqlalchemy engine. – drastega Jun 06 '14 at 00:39