0

I have a SQLite-table with columns author, book, release_date and seller.

Now I need to find a way to get only the top 10 books of each author with the latest release_date.

If possible, I also need the seller, that appears "most often" in the top 10 latest released books per author. The result should be a simple table with author and seller only.

This problem is really driving me crazy. Is at minimum one part possible in a single SQLite-query???

Carsten
  • 1,612
  • 14
  • 21
  • Possible duplicate of [select top n record from each group sqlite](https://stackoverflow.com/questions/28119176/select-top-n-record-from-each-group-sqlite) – Tim Biegeleisen Aug 05 '19 at 08:38
  • Read the above link, which contains a number of ways to approach your problem. – Tim Biegeleisen Aug 05 '19 at 08:38
  • @TimBiegeleisen SQLIte supports windowing and ranking functions yet none of the answers to the linked question uses them – Panagiotis Kanavos Aug 05 '19 at 08:44
  • @PanagiotisKanavos Yes, a good link, if we assume that the versions of SQLite for which those answers were written do not support window functions. – Tim Biegeleisen Aug 05 '19 at 08:44
  • @PanagiotisKanavos Someone can always add an example using window functions to that question. I might when I have a chance to if nobody beats me to it. – Shawn Aug 05 '19 at 08:54
  • the solution from the link might work, but in my sample, i have database with 10 mio. entries. the below solution via window_function finished in 10 seconds. the solution from the link runs some hours (I never waited till the end). – Carsten Aug 06 '19 at 20:00

1 Answers1

1

SQLite v3.25 which came out in September 2018 added window functions.

You can calculate the rank of a book by date for each author with :

CREATE TABLE books
(
  author varchar(10),
  title varchar(10),
  release date
);
INSERT INTO books VALUES 
('aaa','ta1','2018-01-01'),
('aaa','ta2','2018-02-01'),
('aaa','ta3','2018-03-01'),
('aaa','ta4','2018-05-01'),
('bbb','tb1','2018-05-01'),
('bbb','tb2','2018-06-01')
;

SELECT
    author,
    title,
    release, 
    row_number() OVER (partition by author ORDER BY release desc) AS row_number 
FROM books

The function row_number() OVER (partition by author ORDER BY release desc) AS row_number calculates the row number for each author if the rows are ordered by release date.

This produces :

author  title   release     row_number
aaa     ta4     2018-05-01  1
aaa     ta3     2018-03-01  2
aaa     ta2     2018-02-01  3
aaa     ta1     2018-01-01  4
bbb     tb2     2018-06-01  1
bbb     tb1     2018-05-01  2

Once you have the row number, you can filter the top N items with a simple WHERE row_number <= N, eg for the last 2 books per author :

select * from (
SELECT
    author,
    title,
    release, 
    row_number() OVER (partition by author ORDER BY release desc) AS row_number 
FROM books )
where row_number<=2

This returns :

author  title   release     row_number
aaa    ta4      2018-05-01  1
aaa    ta3      2018-03-01  2
bbb    tb2      2018-06-01  1
bbb    tb1      2018-05-01  2
Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
  • outstanding! many thanks for this huge help. any solution idea for the 2nd challenge to get the most appearing seller in the latest top 10? how can i combine counting and grouping here (in case this is the right direction)? – Carsten Aug 06 '19 at 20:04