I currently populate a personal tennis statistics database by pulling SQL statements from an API and then running them. The statements from the API assume that there are 6 tables:
- Male players
- Female players
- Male tournaments
- Female tournaments
- Male matches
- Female matches
Note: There is no unique primary key for the match tables - they use a composite primary key consisting of player ids, tournament ids and a round id.
As the male and female versions of the tables are identical I'm trying to import the data into single tables for players, tournaments and matches. I can also utilise an auto-generated primary key for further operations and a tour_id
field to distinguish between male and female when loading new data.
For the combined match
table I'd like to use the combination of the original API ids for players and tournaments along with the tour_id
as composite foreign keys. This means I can easily execute the API statement as I'll know what the original id is and whether it is male or female. To that end I've got as far as building by table classes as follows:
class Player(Base):
__tablename__ = "tournament"
__table_args__ = ({"schema": "test"})
id_ = sa.Column(sa.Integer, primary_key=True)
orig_id = sa.Column(sa.Integer)
tour_id = sa.Column(sa.Integer)
match = relationship(
"Match",
primaryjoin=(
f"or_(and_(Player.orig_id == Match.player_orig_id_p1, Player.tour_id == Tournament.tour_id), "
f"and_(Player.orig_id == Match.player_orig_id_p2, Player.tour_id == Tournament.tour_id))"
),
overlaps="fk_p1, fk_p2",
)
class Match(Base):
__tablename__ = "match"
__table_args__ = (
sa.ForeignKeyConstraint(
["tournament_orig_id", "tournament.tour_id"],
["tournament.orig_id", "tournament.tour_id"],
"fk_tournament",
),
sa.ForeignKeyConstraint(
["player_orig_id_p1", "tournament.tour_id"],
["player.orig_id", "player.tour_id"],
"fk_p1",
),
sa.ForeignKeyConstraint(
["player_orig_id_p2", "tournament.tour_id"],
["player.orig_id", "player.tour_id"],
"fk_p2",
),
{"schema": "test"},
)
id_ = sa.Column(sa.Integer, primary_key=True)
tournament_orig_id = sa.Column(sa.Integer)
round_id = sa.Column(sa.Integer)
player_orig_id_p1 = sa.Column(sa.Integer)
player_orig_id_p2 = sa.Column(sa.Integer)
tournament = relationship("Tournament", back_populates="match")
class Tournament(Base):
__tablename__ = "tournament"
__table_args__ = ({"schema": "test"})
id_ = sa.Column(sa.Integer, primary_key=True)
orig_id = sa.Column(sa.Integer)
tour_id = sa.Column(sa.Integer)
match = relationship("Match", back_populates="tournament")
The error I'm currently running into is:
Can't create ForeignKeyConstraint on table 'match': no column named 'tournament.tour_id' is present.
This is perfectly understandable as the docs are clear:
columns – A sequence of local column names.
I'm clearly trying to refer to the tour_id
field in the tournament
table which SA doesn't like. However, I can't figure out a way to create the composite foreign key without adding a tour_id
field to the match
table which wouldn't be in line with a normalised table structure.
Any ideas on possible ways forward?