35

I'm trying to implement a self-referential many-to-many relationship using declarative on SQLAlchemy.

The relationship represents friendship between two users. Online I've found (both in the documentation and Google) how to make a self-referential m2m relationship where somehow the roles are differentiated. This means that in this m2m relationships UserA is, for example, UserB's boss, so he lists him under a 'subordinates' attribute or what have you. In the same way UserB lists UserA under 'superiors'.

This constitutes no problem, because we can declare a backref to the same table in this way:

subordinates = relationship('User', backref='superiors')

So there, of course, the 'superiors' attribute is not explicit within the class.

Anyway, here's my problem: what if I want to backref to the same attribute where I'm calling the backref? Like this:

friends = relationship('User',
                       secondary=friendship, #this is the table that breaks the m2m
                       primaryjoin=id==friendship.c.friend_a_id,
                       secondaryjoin=id==friendship.c.friend_b_id
                       backref=??????
                       )

This makes sense, because if A befriends B the relationship roles are the same, and if I invoke B's friends I should get a list with A in it. This is the problematic code in full:

friendship = Table(
    'friendships', Base.metadata,
    Column('friend_a_id', Integer, ForeignKey('users.id'), primary_key=True),
    Column('friend_b_id', Integer, ForeignKey('users.id'), primary_key=True)
)

class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)

    friends = relationship('User',
                           secondary=friendship,
                           primaryjoin=id==friendship.c.friend_a_id,
                           secondaryjoin=id==friendship.c.friend_b_id,
                           #HELP NEEDED HERE
                           )

Sorry if this is too much text, I just want to be as explicit as I can with this. I can't seem to find any reference material to this on the web.

juliomalegria
  • 24,229
  • 14
  • 73
  • 89
wocoburguesa
  • 738
  • 1
  • 5
  • 7

2 Answers2

30

Here's the UNION approach I hinted at on the mailing list earlier today.

from sqlalchemy import Integer, Table, Column, ForeignKey, \
    create_engine, String, select
from sqlalchemy.orm import Session, relationship
from sqlalchemy.ext.declarative import declarative_base

Base= declarative_base()

friendship = Table(
    'friendships', Base.metadata,
    Column('friend_a_id', Integer, ForeignKey('users.id'), 
                                        primary_key=True),
    Column('friend_b_id', Integer, ForeignKey('users.id'), 
                                        primary_key=True)
)


class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    name = Column(String)

    # this relationship is used for persistence
    friends = relationship("User", secondary=friendship, 
                           primaryjoin=id==friendship.c.friend_a_id,
                           secondaryjoin=id==friendship.c.friend_b_id,
    )

    def __repr__(self):
        return "User(%r)" % self.name

# this relationship is viewonly and selects across the union of all
# friends
friendship_union = select([
                        friendship.c.friend_a_id, 
                        friendship.c.friend_b_id
                        ]).union(
                            select([
                                friendship.c.friend_b_id, 
                                friendship.c.friend_a_id]
                            )
                    ).alias()
User.all_friends = relationship('User',
                       secondary=friendship_union,
                       primaryjoin=User.id==friendship_union.c.friend_a_id,
                       secondaryjoin=User.id==friendship_union.c.friend_b_id,
                       viewonly=True) 

e = create_engine("sqlite://",echo=True)
Base.metadata.create_all(e)
s = Session(e)

u1, u2, u3, u4, u5 = User(name='u1'), User(name='u2'), \
                    User(name='u3'), User(name='u4'), User(name='u5')

u1.friends = [u2, u3]
u4.friends = [u2, u5]
u3.friends.append(u5)
s.add_all([u1, u2, u3, u4, u5])
s.commit()

print u2.all_friends
print u5.all_friends
zzzeek
  • 72,307
  • 23
  • 193
  • 185
  • This seems to be a bit error-prone: you can accidentally append to `all_friends` and you won't get any warning. Any suggestions? – Vladimir Keleshev Nov 20 '13 at 13:16
  • Also this allows for duplicate friendships with swapped ids (like `1, 2` and `2, 1`). You can put a constraint that one id is greater than another, but then you need to keep track of which users can be appended to which users `friends` attribute. – Vladimir Keleshev Nov 20 '13 at 13:53
  • 1
    viewonly=True has no bearing on the behavior of the collection in Python. If you're truly concerned about appends to this collection, you can use collection_cls and apply a list or set type that has the mutation methods overridden to throw a NotImplementedError or similar. – zzzeek Nov 20 '13 at 14:55
  • as far as 1->2 + 2->1, different systems can take different opinions on this. In the example above, it won't cause any "problems" directly because User.all_friends when it populates will de-duplicate User objects based on identity. A real-world "friends" system may want to apply additional data onto each "friend" relationship - User 1 may say (s)he knows User 2 via "work", whereas User 2 might report knowing User 1 via "school", and the system might want to store both of those facts, e.g. this is a directed graph. (cont) – zzzeek Nov 20 '13 at 15:04
  • 1
    If OTOH you want to limit it to one edge between any two User objects, it could be as easy as applying a SQL-level constraint (though this would require a SELECT-per-insert and I might be concerned about performance), and on the Python side you just check the "all_friends" collection upon append using an [append event](http://docs.sqlalchemy.org/en/rel_0_9/orm/events.html?highlight=append#sqlalchemy.orm.events.AttributeEvents.append) – zzzeek Nov 20 '13 at 15:05
  • @zzzeek Can you confirm that this is still the best way or point to the current best way to get this kind of unordered/non-directional self-referential many to many relationship with sqlalchemy? – KobeJohn Sep 26 '16 at 13:11
  • I just got my code working with your first half of the answer, in what case would you need the `all_friends` bit over just `friends`? – Peter May 15 '19 at 17:09
  • if you don't need the all_friends part then you don't need it! – zzzeek May 16 '19 at 18:27
  • @zzzeek Why do you need a separate relationship for persistence vs view-only? Normally 1 relationship works fine for both. – Wise Shepherd Jan 21 '20 at 01:25
15

I needed to solve this same problem and messed about quite a lot with self referential many-to-many relationship wherein I was also subclassing the User class with a Friend class and running into sqlalchemy.orm.exc.FlushError. In the end instead of creating a self referential many-to-many relationship, I created a self referential one-to-many relationship using a join table (or secondary table).

If you think about it, with self referential objects, one-to-many IS many-to-many. It solved the issue of the backref in the original question.

I also have a gisted working example if you want to see it in action. Also it looks like github formats gists containing ipython notebooks now. Neat.

friendship = Table(
    'friendships', Base.metadata,
    Column('user_id', Integer, ForeignKey('users.id'), index=True),
    Column('friend_id', Integer, ForeignKey('users.id')),
    UniqueConstraint('user_id', 'friend_id', name='unique_friendships'))


class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    name = Column(String(255))

    friends = relationship('User',
                           secondary=friendship,
                           primaryjoin=id==friendship.c.user_id,
                           secondaryjoin=id==friendship.c.friend_id)

    def befriend(self, friend):
        if friend not in self.friends:
            self.friends.append(friend)
            friend.friends.append(self)

    def unfriend(self, friend):
        if friend in self.friends:
            self.friends.remove(friend)
            friend.friends.remove(self)

    def __repr__(self):
        return '<User(name=|%s|)>' % self.name
penchant
  • 2,263
  • 3
  • 14
  • 20