1

I recently started porting a SQLite database over to PostGreSQL for a Flask site built with SQLAlchemy. I have my schemas in PGSQL and even inserted the data into the database. However, I am unable to run my usual INSERT commands to add information to the database. Normally, I insert new records using SQL Alchemy by leaving the ID column to be NULL and then just setting the other columns. However, that results in the following error:

sqlalchemy.exc.IntegrityError: (psycopg2.IntegrityError) null value in column "id" violates not-null constraint
DETAIL:  Failing row contains (null, 2017-07-24 20:40:37.787393+00, 2017-07-24 20:40:37.787393+00, episode_length_list = [52, 51, 49, 50, 83]

sum_length = 0

for ..., 0, f, 101, 1, 0, 0, , null).
 [SQL: 'INSERT INTO submission (date_created, date_modified, code, status, correct, assignment_id, course_id, user_id, assignment_version, version, url) VALUES (CURRENT_TIMESTAMP, CURRENT_TIMESTAMP, %(code)s, %(status)s, %(correct)s, %(assignment_id)s, %(course_id)s, %(user_id)s, %(assignment_version)s, %(version)s, %(url)s) RETURNING submission.id'] [parameters: {'code': 'episode_length_list = [52, 51, 49, 50, 83]\n\nsum_length = 0\n\nfor episode_length in episode_length_list:\n    pass\n\nsum_length = sum_length + episode_length\n\nprint(sum_length)\n', 'status': 0, 'correct': False, 'assignment_id': 101, 'course_id': None, 'user_id': 1, 'assignment_version': 0, 'version': 0, 'url': ''}]

Here is my SQL Alchemy table declarations:

class Base(Model):
    __abstract__  = True
    @declared_attr
    def __tablename__(cls):
        return cls.__name__.lower()
    def __repr__(self):
        return str(self)

    id =  Column(Integer(), primary_key=True)
    date_created  = Column(DateTime, default=func.current_timestamp())
    date_modified = Column(DateTime, default=func.current_timestamp(),
                                     onupdate=func.current_timestamp())

class Submission(Base):
    code = Column(Text(), default="")
    status = Column(Integer(), default=0)
    correct = Column(Boolean(), default=False)
    assignment_id = Column(Integer(), ForeignKey('assignment.id'))
    course_id = Column(Integer(), ForeignKey('course.id'))
    user_id = Column(Integer(), ForeignKey('user.id'))
    assignment_version = Column(Integer(), default=0)
    version = Column(Integer(), default=0)
    url = Column(Text(), default="")

I created the schema by calling db.create_all() in a script.

Checking the PostGreSQL side, we can see the constructed table:

                                     Table "public.submission"
       Column       |           Type           | Modifiers | Storage  | Stats target | Description
--------------------+--------------------------+-----------+----------+--------------+-------------
 id                 | bigint                   | not null  | plain    |              |
 date_created       | timestamp with time zone |           | plain    |              |
 date_modified      | timestamp with time zone |           | plain    |              |
 code               | text                     |           | extended |              |
 status             | bigint                   |           | plain    |              |
 correct            | boolean                  |           | plain    |              |
 assignment_id      | bigint                   |           | plain    |              |
 user_id            | bigint                   |           | plain    |              |
 assignment_version | bigint                   |           | plain    |              |
 version            | bigint                   |           | plain    |              |
 url                | text                     |           | extended |              |
 course_id          | bigint                   |           | plain    |              |
Indexes:
    "idx_16881_submission_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "submission_course_id_fkey" FOREIGN KEY (course_id) REFERENCES course(id)
    "submission_user_id_fkey" FOREIGN KEY (user_id) REFERENCES "user"(id)
Has OIDs: no

I'm still new to this, but shouldn't there be a sequence?

Any insight or suggestions on what to look for next would be super appreciated.

Austin Cory Bart
  • 2,159
  • 2
  • 19
  • 32
  • 2
    Please provide a verifiable example. A single-column primary key of integer type will have autoincrementing behaviour by default in SQLAlchemy. Using your custom Base, a stripped down version of Submission without foreign keys, and a call to `db.create_all()` the table is created as it should. The name of your primary key index seems fishy as well. The default naming convention would not add an "idx_16881" prefix, I think. – Ilja Everilä Jul 25 '17 at 05:44
  • This was the right advice. I ended up verifying (via logs) that SQLAlchemy WAS generating sequence columns. However, the tool I used to load in the old data was wiping out some of the settings. Removing the tool from the equation solved my problems. Thanks for your advice! – Austin Cory Bart Jul 25 '17 at 06:29
  • 1
    Glad that you found out what's going on :) – Ilja Everilä Jul 25 '17 at 06:40

2 Answers2

6

It is standard SQL that a PRIMARY KEY is UNIQUE and NOT NULL. PostgreSQL enforces the standard, and does not allow you to have any (not a single one) NULL on the table. Other databases allow you to have one NULL, therefore, the different behaviour.

PostgreSQL current documentation on Primary Keys clearly states it:

5.3.4. Primary Keys

A primary key constraint indicates that a column, or group of columns, can be used as a unique identifier for rows in the table. This requires that the values be both unique and not null.

If you want your PRIMARY KEY to be a synthetic (i.e.: not natural) sequence number, you should define it with type BIGSERIAL instead of BIGINT. I don't know the details on how this is achieved using SQLAlchemy, but look at the references.

When you then INSERT into your table, the id should NOT be in the INSERT column list (it should not be set to null, just not be there). I.e.:

This will generate a new id:

INSERT INTO public.submission (code) VALUES ('Some code') ;

will work.

This won't:

INSERT INTO public.submission (id, code) VALUES (NULL, 'Some code') ;

I guess SQLAlchemy should be smart enough to generate the proper SQL INSERT statements, once properly configured.


Reference:

joanolo
  • 6,028
  • 1
  • 29
  • 37
  • Alright, but what do I need to do to properly configure it? The reference doesn't seem to shed any light on what I'm doing wrong. That poster was defining their own sequences, which I don't think I should have to do. Shouldn't it just work? – Austin Cory Bart Jul 25 '17 at 01:25
  • 2
    @AustinCoryBart "That poster was defining their own sequences" in order to explain what happens when you do that. They also show how to do it correctly, i.e. [define a single-column primary key of an integer type, which will have autoincrementing behaviour by default](http://docs.sqlalchemy.org/en/latest/core/metadata.html#sqlalchemy.schema.Column.params.autoincrement). – Ilja Everilä Jul 25 '17 at 05:35
  • To end the mystery, it was my fault. SQLAlchemy WAS doing the right thing, I had simply overwritten it's work via my use of pgloader. I'm an idiot, basically. – Austin Cory Bart Jul 25 '17 at 06:27
  • 2
    @AustinCoryBart: You're a human being like the rest of us. If you think you have an answer (probably your own one), mark it as accepted, so that people can concentrate on the large collection of *unanswered* ones. – joanolo Jul 25 '17 at 06:34
  • I posted a rundown of what happened, but it's not letting me mark it as approved for another 2 days :/ – Austin Cory Bart Jul 25 '17 at 06:37
2

Ultimately, I discovered what went wrong, and it was definitely my fault. The process I used to load the old data into the database (pgloader) was doing more than just loading data - it was somehow overwriting parts of the table definitions! I was able to pg_dump the data out, reset the tables, and then load it back in - everything works as expected. Thanks for sanity checks!

Austin Cory Bart
  • 2,159
  • 2
  • 19
  • 32