1

In SQL Server, I have this query

SELECT TOP 10 * 
FROM xTable
ORDER BY NEWID()

which basically returns 10 random rows. I also have the top rows parameterized by a number but that's not the problem

I want to do this in a DB2 query, since I have to get data from DB2 using OpenQuery and store in a local temporary table and from it I do the select top x rows.

If I can achieve this directly in DB2 that would be amazing since I have to get over a thousand rows.

Mark Barinstein
  • 11,456
  • 2
  • 8
  • 16
Jackal
  • 3,359
  • 4
  • 33
  • 78
  • Possible duplicate of [db2 select x random rows for a given id](https://stackoverflow.com/questions/40769451/db2-select-x-random-rows-for-a-given-id)? – Thom A Jul 19 '19 at 13:48
  • 1
    Or this dupe: https://stackoverflow.com/questions/19412/how-to-request-a-random-row-in-sql SELECT column, RAND() as IDX FROM table ORDER BY IDX FETCH FIRST 10 ROWS ONLY – smoore4 Jul 19 '19 at 14:30
  • this actually worked – Jackal Jul 19 '19 at 15:07
  • 1
    Possible duplicate of [How to request a random row in SQL?](https://stackoverflow.com/questions/19412/how-to-request-a-random-row-in-sql) – mustaccio Jul 19 '19 at 15:15

2 Answers2

1

Is this DB2 LUW or AS400?

try

SELECT *
  FROM xTable
  ORDER BY ROW_NUMBER()
  FETCH FIRST 10 ROWS ONLY;
  • Note that this **will not be "random" ** - the query is going to iterate over the table in some undefined order (likely the base table itself) and return numbers based on that order. Subsequent runs are _likely, although not guaranteed,_ to return the same rows every time the query is run, which is almost certainly not desired. – Clockwork-Muse Jul 19 '19 at 18:52
1

Ordering is not a good idea, especially for large tables.
We have an ability to use built-in Db2 for LUW feature of subselect - tablesample-clause.

BERNOULLI

BERNOULLI sampling considers each row individually. It includes each row in the sample with probability P/100 (where P is the value of numeric-expression1), and excludes each row with probability 1 - P/100, independently of the other rows. So if the numeric-expression1 evaluated to the value 10, representing a ten percent sample, each row would be included with probability 0.1, and excluded with probability 0.9.

SYSTEM

SYSTEM sampling permits the database manager to determine the most efficient manner in which to perform the sampling. In most cases, SYSTEM sampling applied to a table-name means that each page of table-name is included in the sample with probability P/100, and excluded with probability 1 - P/100. All rows on each page that is included qualify for the sample.

Example:

SELECT * 
FROM MYTABLE TABLESAMPLE SYSTEM (0.1) 
FETCH FIRST 10 ROWS ONLY;

This feature doesn't exist in DB2 for IBM i/iSeries/OS400. Use the order by rand() fetch first 10 rows clause for this platform instead.

Mark Barinstein
  • 11,456
  • 2
  • 8
  • 16
  • I had to order by because one of my requirements is to get different x rows not duplicated everytime i query it, that's how i did in sql at least, but now i'm trying to do this using dynamic sql inside the open query so i can just query those X rows instead getting all data. However this is good to know since none of my coworkers are familiar with DB2 features besides just plain dynamic sql to get raw data – Jackal Jul 20 '19 at 19:52
  • Also just wondering since i can't test it out cause i don't have DB2 installed nor am I at the office right now. Is the randomization applied tafter or before limiting the row size? – Jackal Jul 20 '19 at 22:43
  • 1
    The randomization is applied before limiting of course. – Mark Barinstein Jul 21 '19 at 07:26
  • Interesting, I guess i will try this tomorrow instead rand(), there will be a big difference in performance for sure. Thanks a bunch – Jackal Jul 21 '19 at 19:13
  • I just tried to use this and does not recognize system on the open query `SELECT * FROM OPENQUERY(linkedServer, 'SELECT * FROM myTable TABLESAMPLE SYSTEM (0.1)')` The provider is IBMDA400, not sure if it has any impact – Jackal Jul 22 '19 at 07:31
  • I forgot to add, it seems that it's DB2 AS400, which i searched about and tablesample is for LUW – Jackal Jul 22 '19 at 07:39
  • 1
    The example I provided is not applicable to DB2 for IBM i. I've added the corresponding tag to your question. – Mark Barinstein Jul 22 '19 at 08:58
  • I see, is there anything similar in 400 or just plain use rand() like many examples i have seen? – Jackal Jul 22 '19 at 09:22