1

I'm experiencing some odd behavior with SQLAlchemy not iterating over all of the results from a query.

For example, I have the following python code:

engine = create_engine(<connection string>)
Session = sessionmaker(bind=engine)
session = Session() 

columns = session.query(Column)

counter = 1
for c in columns
    print(counter)
    counter = counter + 1

print('count: ' + str(columns.count()))

where Column is a class that I've defined and mapped in the usual SQLAlchemy way:

from base import Base
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String, Boolean

class Column(Base):
    __tablename__ = 'COLUMNS'
    __table_args__ = {'schema' : 'INFORMATION_SCHEMA'}

    table_catalog = Column(String)
    table_schema = Column(String)
    table_name = Column(String)
    column_name = Column(String, primary_key=True)
    data_type = Column(String)

From my query, I'm expecting 7034 rows to be returned and this is what the final print statement prints out (for columns.count()), but the for loop only ever gets up to 2951 printing out counter.

If I do anything else with the returned data in the for loop, only the 2951 get processed, not all 7034.

Does anyone know why I'm experiencing this discrepancy, and how can I iterate over all 7034 rows, not just the 2951?

kyle
  • 691
  • 1
  • 7
  • 17
t_warsop
  • 1,170
  • 2
  • 24
  • 38
  • @AokiAhishatsu I have, I can access the data returned by the query without error, it's just the for loop doesn't iterate over all of it – t_warsop Feb 12 '19 at 08:17
  • Can you write your query in SQL? – Aoki Ahishatsu Feb 12 '19 at 08:20
  • @AokiAhishatsu yes, the query is effectively 'select * from table' – t_warsop Feb 12 '19 at 08:21
  • @AokiAhishatsu just tried it, I get the same result – t_warsop Feb 12 '19 at 08:23
  • @AokiAhishatsu print(len(columns)) isn't valid, it gives the error: object of type 'Query' has no len() – t_warsop Feb 12 '19 at 08:28
  • Have you seen these answers: [SELECT * in SQLAlchemy?](https://stackoverflow.com/questions/636548/select-in-sqlalchemy) – Aoki Ahishatsu Feb 12 '19 at 08:34
  • @AokiAhishatsu why are you deleting the previous comments you have made? It means when other people view this question these comments won't make sense to them – t_warsop Feb 12 '19 at 08:36
  • What happens if you do `columns = session.query(Column).all()`? – shmee Feb 12 '19 at 08:37
  • @shmee I've tried that, it doesn't make any difference I still get the same result. (someone else suggested it but then deleted their comment) – t_warsop Feb 12 '19 at 08:38
  • Ah .. hm. Can you show us that mapping as well? Your class name bothers me a bit, as it is also the name of a schema class in SQLAlchemy. Also, you have verified that 2951 elements in `columns` are in fact records from that underlying table, I assume? – shmee Feb 12 '19 at 08:54
  • @shmee I've updated the question with the mapping and yes I have confirmed those 2951 elements contain the expected data – t_warsop Feb 12 '19 at 08:56
  • @shmee also I have renamed the Column class and it still produces the same result – t_warsop Feb 12 '19 at 08:57
  • @the_doc Does your user have access to the schema? And can you try changing the classname? What database is this? – Nihal Sangeeth Feb 12 '19 at 08:58

1 Answers1

2

I've figured out why I wasn't getting the results I was expecting (I did something silly).

The 'column_name' field in the table the Column class maps to ins't unique, therefore picking it as a primary key filtered out only unique values - which since there are duplicates results in less rows being returned than I expected.

I fixed it by updating the definition of the Column mapping to:

from base import Base
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String, Boolean

class Column(Base):
    __tablename__ = 'COLUMNS'
    __table_args__ = {'schema' : 'INFORMATION_SCHEMA'}

    table_catalog = Column(String, primary_key=True)
    table_schema = Column(String, primary_key=True)
    table_name = Column(String, primary_key=True)
    column_name = Column(String, primary_key=True)
    data_type = Column(String)
t_warsop
  • 1,170
  • 2
  • 24
  • 38