0

I'm trying to use the function distinct with SQLAlchemy but it doesn’t seem to work. I prepared a little example where you can see my problem:

#-*- coding: utf-8 -*-
from sqlalchemy import create_engine,Column, Integer
from sqlalchemy.orm import sessionmaker,load_only
from sqlalchemy.ext.declarative import declarative_base

engine = create_engine('sqlite:///:memory:')
Session = sessionmaker(bind=engine)
session = Session()
Base = declarative_base()

class my_class(Base):
    __tablename__ = 'my_table'
    id= Column(Integer, primary_key=True)
    data= Column(Integer)

Base.metadata.create_all(engine)

for i in range(10):
    p=my_class()
    p.id=i
    p.data=55
    session.add(p)
session.commit()

s=session.query(my_class).distinct(my_class.data).options(load_only(my_class.data))

print (s)
for a in s.all():
    print (a.id,a.data)

Executing this I would expect an output like this:

SELECT my_table.data AS my_table_data 
FROM my_table
None 55

But instead I’m getting this:

SELECT DISTINCT my_table.id AS my_table_id, my_table.data AS my_table_data 
FROM my_table
0 55
1 55
2 55
3 55
4 55
5 55
6 55
7 55
8 55
9 55

What I'm doing wrong?

Ilja Everilä
  • 50,538
  • 7
  • 126
  • 127

1 Answers1

3

Deferring the primary key would not make sense, if querying complete ORM entities, because an entity must have an identity so that a unique row can be identified in the database table. So the query includes the primary key though you have your load_only(). If you want the data only, you should query for that specifically:

In [12]: session.query(my_class.data).distinct().all()
2017-06-30 12:31:49,200 INFO sqlalchemy.engine.base.Engine SELECT DISTINCT my_table.data AS my_table_data 
FROM my_table
2017-06-30 12:31:49,200 INFO sqlalchemy.engine.base.Engine ()
Out[12]: [(55)]

There actually was an issue where having load_only() did remove the primary key from the select list, and it was fixed in 0.9.5:

[orm] [bug] Modified the behavior of orm.load_only() such that primary key columns are always added to the list of columns to be “undeferred”; otherwise, the ORM can’t load the row’s identity. Apparently, one can defer the mapped primary keys and the ORM will fail, that hasn’t been changed. But as load_only is essentially saying “defer all but X”, it’s more critical that PK cols not be part of this deferral.

Ilja Everilä
  • 50,538
  • 7
  • 126
  • 127
  • But I need the class, so I can get the JSON later. – Javier Alaminos Maza Jun 30 '17 at 10:07
  • That sounds like you should expand your question to better describe what you're really after, or in other words [an XY problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). But anyhow, the answer still stands: you cannot have kind of "DISTINCT ON" queries and ORM entities, not easily at least. You could map over a query using DISTINCT, but that's a different thing. In the end ORM entities must have an identity. – Ilja Everilä Jun 30 '17 at 10:11
  • So you're not doing anything wrong, just that what you tried to do cannot be done with an ORM. – Ilja Everilä Jun 30 '17 at 10:17
  • Well, I'm creating a webservice in rest, to connect an aplication with the database, so the comunication is always via JSON. I'm puting the ORM betwen the WS and the database, so any change in the future could be easier. – Javier Alaminos Maza Jun 30 '17 at 10:17
  • To reiterate: an ORM entity object instance uniquely maps to a single row in the database table. To accomplish that it requires the primary key, so it cannot be deferred when fetching object instances. You probably have a reason for trying to fetch distinct values for some column(s), but you've yet to describe it. When you say `session.query(my_class)`, you create a query for object instances, unless later `with_entities()` calls etc. modify that. – Ilja Everilä Jun 30 '17 at 10:23