3

The following is my query:

SELECT *
FROM (
    SELECT f.max, f.min, p.user_id, p.id, p.title, p.rating,
    RANK() OVER (
        PARTITION BY p.user_id
        ORDER BY p.rating DESC, p.id DESC
    ) AS rnk
    FROM posts AS p
    INNER JOIN friends AS f ON (p.user_id = f.friend_id)
    WHERE f.user_id=1
) AS subq
WHERE (subq.rnk <= subq.max)
LIMIT 10

It searches for posts of my friends, sorted by their rating and date. The window function implemented in this query lets me limit the number of rows returned for each friend according to the MAX field on Friends table.

However, I also have a field MIN, which is used to specify the minimum number of posts I want from the query for a given friend. How is that possible?

I also wonder if SQL is the best option for those types of queries? I already tried Neo4j Graph database, and while it seemed as a good solution, I would rather avoid using 2 separate databases.

SQLFiddle

Schema:

CREATE TABLE friends(
    user_id int,
    friend_id int,
    min int,
    max int
);

CREATE TABLE posts(
   id int,
   title varchar(255),
   rating int,
   date date,
   user_id int
);

Suppose we have the following data:

INSERT INTO friends VALUES
  (1,2,1,3)
, (1,3,0,5)
, (1,4,2,10);

INSERT INTO posts VALUES
  (1,  'posts1',  2,  now(), 2)
, (2,  'posts2',  1,  now(), 2)
, (3,  'posts3',  5,  now(), 2)
, (4,  'posts4',  2,  now(), 2)
, (5,  'posts5',  11, now(), 2)
, (6,  'posts6',  7,  now(), 2)
, (7,  'posts7',  3,  now(), 2)
, (8,  'posts8',  4,  now(), 3)
, (9,  'posts9',  1,  now(), 3)
, (10, 'posts10', 0,  now(), 3)
, (11, 'posts11', 7,  now(), 3)
, (12, 'posts12', 3,  now(), 3)
, (13, 'posts13', 2,  now(), 3)
, (14, 'posts14', 4,  now(), 4)
, (15, 'posts15', 9,  now(), 4)
, (16, 'posts16', 0,  now(), 4)
, (17, 'posts17', 3,  now(), 4)
, (18, 'posts18', 2,  now(), 4)
, (19, 'posts19', 1,  now(), 4)
, (20, 'posts20', 2,  now(), 4);

Hence I would like to see (post_id, title, rating, date, friend_id) combinations with the following conditions, if possible:

  1. between 1 and 3 posts from the friend with id=2
  2. between 0 and 5 posts from the friend with id=3
  3. between 2 and 10 posts from the friend with id=4

So basically, if my friend with friend_id=2 posted 1 or more articles, I want at least 2 of them. If he posted more than 3 articles, I want no more than 3.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
nickbusted
  • 1,029
  • 4
  • 18
  • 30
  • 2
    Can you provide sample table schema along with data and a sample output ? – Vivek S. May 02 '15 at 10:47
  • 1
    _However, I also have a field MIN, which is used to specify the minimum number of posts I want from the query for a given friend. How is that possible?_, I dint get actually what you want(_at least for me_) ,What you need to query using _column_ `MIN` – Vivek S. May 02 '15 at 10:55
  • @vivek Thanks, the question is updated now – nickbusted May 02 '15 at 10:57
  • Between 5 and 10 posts? That is if you have less than 5 posts avaiable you don't want to display any? – Jakub Kania May 02 '15 at 10:59
  • @JakubKania no, it means at least 5, if possible – nickbusted May 02 '15 at 11:00
  • Paste executable `create table ` and `insert` statements for both tables into your question. As written, your query can't possibly work--there are syntax errors and references to nonexistent columns. – Mike Sherrill 'Cat Recall' May 02 '15 at 11:35
  • @MikeSherrill'CatRecall' There is SQLFiddle now at http://sqlfiddle.com/#!15/c9f2a/1 – nickbusted May 02 '15 at 11:51
  • Your query returns 3 rows for id=2, 5 rows for id=3, and 7 rows for id=4. Isn't that what you wanted? – Mike Sherrill 'Cat Recall' May 02 '15 at 12:43
  • @MikeSherrill'CatRecall' It was a special case, so I have changed the numbers(`min` and `max`), so now the result is not what I wanted – nickbusted May 02 '15 at 12:47
  • @nickbusted and what is the difference between 5 and at least 5? Is it something like top 10 posts UNION top 5 posts from B which can give us 10-15 records? – Jakub Kania May 02 '15 at 16:56
  • @JakubKania If my friend with `id=2` posted 5 or more posts, I want at least 5 of them. – nickbusted May 02 '15 at 17:07
  • @nickbusted What do you mean by "at least"? Why would you get more that 5 posts if you want 5? What causes you not to get exactly 5? – Jakub Kania May 02 '15 at 17:13
  • @JakubKania I want users to specify the range of the number of articles they want to receive from a given user. Let's say I want to have 2-5 posts from you everyday, if you post that much. If you post only one, it is alright, and I will have the only one post. – nickbusted May 02 '15 at 17:17

3 Answers3

5

Let's say I want to have 2-5 posts from you everyday, if you post that much. If you post only one, it is alright, and I will have the only one post.

Your explanation in the comment still does not add up. Your min number would be noise without effect according to this explanation.

This is not what you wrote, but this would make sense:

Given a maximum of display slots for posts (the outer LIMIT), I want to get min posts from each friend first (if available). If there are free slots after that, fill up with up to max posts per friend.

In the example that would be 1 (min) post from friend 2 with top priority and another 2 (max - min) posts if more slots are still available.

It would be arbitrary which posts make the cut if there are not enough slots for each priority. I went ahead and assumed that the first post from each should be selected first, etc.

The rest is still arbitrary, but can be solved easily if you manage to formulate a requirement.

SELECT *
FROM   friends f
,  LATERAL (
   SELECT *
        , row_number() OVER (ORDER  BY rating DESC NULLS LAST, id DESC) AS rn
   FROM   posts p
   WHERE  user_id = f.friend_id  -- LATERAL reference
   ORDER  BY rating DESC NULLS LAST, date DESC NULLS LAST
   LIMIT  f.max  -- LATERAL reference
   ) p
WHERE  f.user_id = 1
ORDER  BY (p.rn > f.min)  -- minimum posts from each first
        , p.rn
LIMIT  10;  -- arbitrary total maximum

SQL Fiddle.

Notes

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
0

Am I missing something? Can't you just add and subq.rnk >= subq.min to the where clause?

SELECT *
FROM (SELECT f.max, f.min, p.friend_id, p.id, p.title, p.rating
             row_number() over (partition by p.user_id
                                order by p.rating DESC, p.id DESC
                               ) as rnk
      FROM posts p INNER JOIN
           friends f
           ON p.friend_id = f.user_id
      WHERE f.user_id = 1
     )  subq
WHERE subq.rnk <= subq.max and subq.rnk >= subq.min
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Thanks. Unfortunately, it will not work. The `rnk` is a sequence number, so writing `subq.rnk >= subq.min` will exclude the top `subq.min` posts. – nickbusted May 02 '15 at 11:24
0

I think from a neo4j/cypher perspective, this is really what you want to do...

match (u:User {id: 1})-[r:FOLLOWS]->(p:Publisher)
with u, p, r
match p-[:PUBLISHED]-(i:Item)
with u, p, r, i
order by i.name
return u.name, p.name, i.name
skip 5
limit 2

You just would need to parameterize out the min and max and bind them at run time which would involve two queries rather than one but I think this is still an elegant solution. I have tried to include properties in skip and limit before but cypher obviously does not support that (yet). It wants a parameter or an unsigned integer.

match (u:User {id: 1})-[r:FOLLOWS]->(p:Publisher)
with u, p, r
match p-[:PUBLISHED]-(i:Item)
with u, p, r, i
order by i.name
return u.name, p.name, i.name
skip {min}
limit {max}
Dave Bennett
  • 10,996
  • 3
  • 30
  • 41