2

Using data explorer to create queries:

SELECT P.id, creationdate,tags,owneruserid,answercount
--SELECT DISTINCT TAGNAME ,TAGID
FROM TAGS  AS T
JOIN POSTTAGS AS PT
ON T.ID = PT.TAGID
JOIN POSTS AS P
ON PT.POSTID = P.ID
--WHERE CAST(P.TAGS AS VARCHAR) IN('JAVA')
WHERE PT.TAGID = 3143

How is it possible to add pagination in the query in order to take not only the first 50,000 results, but then run the query again to take the next remaining results?

Brock Adams
  • 90,639
  • 22
  • 233
  • 295
Stiar
  • 45
  • 1
  • 8

1 Answers1

2

There are a few ways to "page" through TSQL results; see:

Here I will use the CTE method as:

  • It uses convenient row numbers to page through results, rather than trying to track less predictable factors such as creationdate.
  • It reportedly performs faster than the OFFSET method.

So, that question's query becomes this SEDE query:

-- StartRow: Starting row for paging
-- EndRow: Ending row for paging (Max 50K rows at a time)
WITH allData AS (
    SELECT
                ROW_NUMBER() OVER (ORDER BY P.creationdate) AS row
                , P.id
                , P.creationdate
                , P.tags
                , P.owneruserid
                , P.answercount
    FROM        Posttags    AS PT
    JOIN        Posts       AS P    ON PT.postid = P.id
    WHERE       PT.tagid    = 3143  -- tag [scala]
)
SELECT      *
FROM        allData
WHERE       row    >= ##StartRow:INT?1##
AND         row    <= ##EndRow:INT?50000##
ORDER BY    row
Brock Adams
  • 90,639
  • 22
  • 233
  • 295
  • Super useful! I tweaked this to make it a bit faster and easier to use by 1) removing the condition `row <= ##EndRow:INT?50000##` and 2) changing the first condition to `row >= 1 + 50000*(##StartRow:INT?1## - 1)` so that my inputs can be 1, 2, 3, ... instead of 1, 50001, 100001, ... – Ben Jan 22 '20 at 21:02