Background info:

I'm trying to retrieve images from the people I follow, sorted by most recent first. It's like a Twitter news feed, which shows the latest posts from your friends.

Plans:

Currently there is only one type of item I need to consider: images. In the future I plan to analyse users' behavior and add other images they might like to their feed, etc.

http://www.quora.com/What-are-best-practices-for-building-something-like-a-News-Feed

I personally feel that the "pull" model, or fan-out-on-load, where I pull all the info in real time, would be worse than the push model. Imagine I follow 100 users: I would have to fetch all of their images and sort them by time on every load. (Let me know if I'm wrong, e.g. if reading is 100x better than writing in the push model.)

The current design of the push model I have in mind is as follows:

Table users_feed(ID, User_ID, Image_ID, datetime)

Option 1: Store a list of Image_IDs.

Option 2: Store one Image_ID per row and duplicate rows (more rows with the same User_ID but different Image_IDs).

The plan is to limit the number of rows each user can have in this feed, which means there would always be a maximum of 50 images. If they want more than the 50 images in their news feed, they can't get them. (I might code an alternative that stores more, so they can view more in the future.)

Question 1

When a user that other people follow adds an item to their "collection", I have to push it into each of their followers' feeds. Won't that be a problem for writes? 200 followers = 200 writes?

Question 2

Which method would be better for me, considering that I only have one type of data, which is images (feeds of images)?

Question 3

If I choose to store the feed in advance (push method), how do I actually write it into all my friends' feeds?

Insert xxx into feeds whereIn (array of FriendsID)?
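
One possible shape for such a write, sketched under assumptions: the `follows(follower_id, followee_id)` table is hypothetical (the follower table is not defined above), and SQLite driven from Python is used only so the example is runnable. An `INSERT ... SELECT` creates one feed row per follower in a single statement (it is still one row written per follower), and a follow-up `DELETE` trims each affected feed to the 50-row cap described earlier.

```python
import sqlite3

FEED_CAP = 50  # maximum rows per user, as in the plan above


def make_db():
    """Create an in-memory database with the assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- follows is a hypothetical table: follower_id follows followee_id
        CREATE TABLE follows (follower_id INTEGER, followee_id INTEGER);
        CREATE TABLE users_feed (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id INTEGER, image_id INTEGER, datetime TEXT);
        CREATE INDEX idx_feed_user_time ON users_feed (user_id, datetime);
    """)
    return conn


def push_image(conn, author_id, image_id, posted_at, cap=FEED_CAP):
    """Fan out one image to every follower of author_id, then trim feeds."""
    with conn:  # both statements run in one transaction
        # One statement, but still one inserted row per follower.
        conn.execute("""
            INSERT INTO users_feed (user_id, image_id, datetime)
            SELECT follower_id, ?, ? FROM follows WHERE followee_id = ?
        """, (image_id, posted_at, author_id))
        # Keep only the newest `cap` rows in each affected follower's feed.
        conn.execute("""
            DELETE FROM users_feed
            WHERE user_id IN (SELECT follower_id FROM follows
                              WHERE followee_id = ?)
              AND id NOT IN (
                SELECT f2.id FROM users_feed AS f2
                WHERE f2.user_id = users_feed.user_id
                ORDER BY f2.datetime DESC, f2.id DESC
                LIMIT ?)
        """, (author_id, cap))
```

Reading a feed is then a single indexed lookup on `user_id`; the cost has moved to write time, which is the trade-off Question 1 asks about.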

Any form of advice would be greatly appreciated. Thanks in advance!

Chethan N
CodeGuru

1 Answer

I would recommend the pull method over the push method, for the following reasons:

  • It gives you more freedom for extensibility in the future.

  • Fewer writes (imagine an account with 10M followers: a single post
    would require 10M writes).

  • You can get a user's whole feed with a query similar to:

    SELECT * FROM users_feed AS a WHERE a.user_id IN ( < //select the user_ids of all users followed by the logged-in user// > )

    (Exact syntax is not given, as the table structure for followers is not known.)
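
As a rough illustration of the pull approach, here is a minimal sketch using Python's `sqlite3`. The `images` and `follows` tables, their columns, and the sample rows are assumptions made up for the example, since the real schema is not known; the subquery stands in for the placeholder in the query above.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with an assumed schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical tables: follower_id follows followee_id
        CREATE TABLE follows (follower_id INTEGER, followee_id INTEGER);
        CREATE TABLE images (
            id INTEGER PRIMARY KEY, user_id INTEGER, created_at TEXT);
        CREATE INDEX idx_images_user_time ON images (user_id, created_at);
    """)
    conn.executemany("INSERT INTO follows VALUES (?, ?)", [(1, 2), (1, 3)])
    conn.executemany("INSERT INTO images VALUES (?, ?, ?)", [
        (10, 2, "2015-10-01 09:00:00"),
        (11, 3, "2015-10-02 09:00:00"),
        (12, 4, "2015-10-03 09:00:00"),  # user 4 is not followed by user 1
        (13, 2, "2015-10-04 09:00:00"),
    ])
    return conn


def pull_feed(conn, user_id, limit=50):
    """Build the feed at read time: newest images from followed users."""
    rows = conn.execute("""
        SELECT i.id FROM images AS i
        WHERE i.user_id IN (SELECT f.followee_id FROM follows AS f
                            WHERE f.follower_id = ?)
        ORDER BY i.created_at DESC
        LIMIT ?
    """, (user_id, limit))
    return [row[0] for row in rows]
```

Nothing is written when someone posts; all of the work happens in `pull_feed`, so an index on `(user_id, created_at)` or similar is likely to matter as the data grows.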

Chethan N
  • What would happen if a user follows 5000 users? – Chaudhry Junaid Oct 06 '15 at 12:07
  • If this sort of architecture is used, it can be optimized by first selecting a subset of the followed users, usually by some sort of affinity between the two users, and then loading posts only for that subset. – Chaudhry Junaid Oct 06 '15 at 12:11
  • It should work even if a user follows 5000 users. In this case the number of feed items will be huge. It is recommended to follow some sort of filtering / ranking technique as the user base grows. – Chethan N Oct 12 '15 at 05:36
  • It is more common for a user to have more followers than accounts they follow :) @ChaudhryJunaid – Shahin Ghasemi Apr 29 '20 at 07:54
  • Twitter now mixes the two approaches: the pull method for people with many followers, and the push method for people with an average number of followers. So when someone loads their feed, they get items from famous people via the pull method, plus items from their own bucket, and the two are then aggregated and mixed based on some criteria for pagination. – hossein bakhtiari Jun 28 '20 at 14:02
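
The hybrid idea in the last comment could be sketched roughly as follows. This is an in-memory toy under stated assumptions, not a description of how Twitter implements it: the `HybridFeed` class, the `FANOUT_LIMIT` threshold, and the use of integer timestamps are all hypothetical.

```python
from collections import defaultdict

FANOUT_LIMIT = 2  # hypothetical: authors with more followers are pulled, not pushed


class HybridFeed:
    def __init__(self):
        self.followers = defaultdict(set)  # author -> users following them
        self.following = defaultdict(set)  # user -> authors they follow
        self.posts = defaultdict(list)     # author -> [(timestamp, image_id)]
        self.buckets = defaultdict(list)   # user -> precomputed feed items

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) > FANOUT_LIMIT

    def post(self, author, timestamp, image_id):
        self.posts[author].append((timestamp, image_id))
        # Push: fan out on write only for authors with few followers.
        if not self.is_celebrity(author):
            for user in self.followers[author]:
                self.buckets[user].append((timestamp, image_id))

    def feed(self, user, limit=50):
        # Start from the user's own precomputed bucket ...
        items = set(self.buckets[user])
        # ... then pull from followed celebrities at read time. The set
        # removes duplicates if an author crossed the threshold after
        # some of their posts had already been pushed.
        for author in self.following[user]:
            if self.is_celebrity(author):
                items.update(self.posts[author])
        newest_first = sorted(items, reverse=True)
        return [image_id for _, image_id in newest_first[:limit]]
```

With this split, a post by an account with millions of followers costs one write, while ordinary accounts still get cheap reads from their bucket.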