0

I am using two tables having one to many mapping (in DB2) .

I need to fetch 20 records at a time using ROW_NUMBER from two table using LEFT JOIN. But due to the one to many mapping, result is not consistent. I might be getting 20 records but those records does not contains 20 unique records of first table

SELECT 
 A.*, 
 B.*, 
 ROW_NUMBER() OVER (ORDER BY A.COLUMN_1 DESC) as rn 
from 
 table1 
LEFT JOIN 
 table2 ON A.COLUMN_3 = B.COLUMN3 
where 
 rn between 1 and 20

Please suggest some solution.

mustaccio
  • 18,234
  • 16
  • 48
  • 57
  • Please show your query. – Ansgar Wiechers Feb 21 '14 at 20:25
  • Use the distinct keyword. –  Feb 21 '14 at 20:27
  • SELECT A.*, B*, ROW_NUMBER() OVER (ORDER BY A.COLUMN_1 DESC) as rn from table1 LEFT JOIN table2 ON A.COLUMN_3 = B.COLUMN3 where rn between 1 and 20 –  Feb 21 '14 at 20:27
  • You make it easier to help you when you format your question (including code) properly. By the way, the query you posted would not run. You may want to post the actual query. – mustaccio Feb 21 '14 at 21:02
  • So you want distinct records from table1, before you apply the row_number? – Andrew Feb 21 '14 at 21:08
  • I want 20 records from table1....it does not matter if I get more than 20 records from table 2 –  Feb 21 '14 at 21:17
  • @IftikharKhan - The `DISTINCT` keyword will have **no** effect here - presumably, the data in the joined tale is also unique, so the rows will _already_ be unique. Personally, I find that `DISTINCT` tends to be overused, and should be restricted to certain situations (in favor of figuring out what's giving you duplicate rows in the first place). – Clockwork-Muse Feb 22 '14 at 12:31
  • @Clockwork-Muse you are right! I misinterpreted the question as I was going by skimpy information in the original post - my bad for making assumptions. –  Feb 22 '14 at 14:49

1 Answers1

1

Sure, this is easy... once you know that you can use subqueries as a table reference:

SELECT <relevant columns from Table1 and Table2>, rn
FROM (SELECT <relevant columns from Table1>,
             ROW_NUMBER() OVER (ORDER BY <relevant columns> DESC) AS rn
      FROM table1) Table1
LEFT JOIN Table2
       ON <relevant equivalent columns>
WHERE rn >= :startOfRange
      AND rn < :startOfRange + :numberOfElements

For production code, never do SELECT * - always explicitly list the columns you want (there are several reasons for this).
Prefer inclusive lower-bound (>=), exclusive upper-bound (<) for (positive) ranges. For everything except integral types, this is required to sanely/cleanly query the values. Do this with integral types both to be consistent, as well as for ease of querying (note that you don't actually need to know which value you "stop" on). Further, the pattern shown is considered the standard when dealing with iterated value constructs.

Note that this query currently has two problems:

  1. You need to list sufficient columns for the ORDER BY to return consistent results. This is best done by using a unique value - you probably want something in an index that the optimizer can use.
  2. Every time you run this query, you (usually) have to order the ENTIRE set of results before you can get whatever slice of them you want (especially for anything after the first page). If your dataset is large, look at the answers to this question for some ideas for performance improvements. The optimizer may be able to cache the results for you, but it's not guaranteed (especially on tables that receive many updates).
Community
  • 1
  • 1
Clockwork-Muse
  • 12,806
  • 6
  • 31
  • 45