0

Right now I'm building a huge db2 query, where I'm selecting a lot of data from a lot of tables (LEFT OUTER JOIN on 10+ tables). Most of the data is nicely selected and working fine, but then most of it is pretty straightforward

But there is one table giving me a bit of a headache. If we are just looking at the sql and result when we are not selecting from this table, it would look a bit like this: (Because of the nature of the data I can't give example from my SQL)

SQL 1:

SELECT Person.Name, Date.NameOfDate, Location.City
FROM Person LEFT OUTER JOIN 
Date ON Person.Id = Date.PersonId LEFT OUTER JOIN 
Location ON Date.LocationId = Location.Id
WHERE Person.IsAlive = True

Result 1:

Name - NameOfDate - City
Peter - Anna - Athen
Peter - Caroline - Washington
Simone - Carl - Athen

Now I have a table (let's call it 'Comment'). This table has some extra information on the cities. There can be several rows for each city. Example:

SQL 2:

SELECT Location.City, Comment.Text, Comment.SortingId, Comment.TypeOfData
FROM Location LEFT OUTER JOIN 
Comment ON Location.Id = Comment.LocationId

Result 2:

City - Text - SortingId - TypeOfData
New York - Do not read this - 1 - 777
Washington - This text is irrelevant - 1 - 555
Washington - Make sure you visit the White House - 2 - 777
Washington - On saturdays there is a market near the docks - 3 - 777
Athen - Bring translator - 1

My problem is I need to retrieve Comment.Text in SQL 1, but only the row with the lowest SortingId where TypeOfData is 777. The result should be something like this:

Name - NameOfDate - City - Text
Peter - Anna - Athen - Bring translator
Peter - Caroline - Washington - Make sure you visit the White House
Simone - Carl - Athen - Bring translator

The closest thing I have managed to make to get this data is the following:

SELECT Person.Name, Date.NameOfDate, Location.City, Comment.Text
FROM Person LEFT OUTER JOIN 
Date ON Person.Id = Date.PersonId LEFT OUTER JOIN 
Location ON Date.LocationId = Location.Id LEFT OUTER JOIN
(SELECT Comment.Text FROM Comment ORDER BY Comment.SortingId FETCH FIRST 1 ROWS ONLY) AS Comment ON Location.Id = Comment.LocationId
WHERE Person.IsAlive = True

But as some might have noticed, this does not give me any result. The subselect will return the New York row, then it will filter on LocationId where it will remove the New York, leaving nothing

Any other idea?

EDIT:

The Comment table does NOT have any unique ID field. You could have two rows where the only difference is the SortingId, but the same SortingId could be used in many rows e.g. SortingId could be 1 in two rows with different LocationId

Adagio_Hpt
  • 21
  • 1
  • 6

5 Answers5

0

You can try Joining Location table inside subquery with comment, something like this

(SELECT Comment.Text FROM Locat LEFT OUTER JOIN 
Comment ON Locat.Id = Comment.LocationId
WHERE Locat.City = Location.City 
ORDER BY Comment.SortingId 
FETCH FIRST 1 ROWS ONLY)
Vishal Gupta
  • 124
  • 6
  • I believe this would not work from what I can see. It doesn't have any correlation with the main query. It's not selecting anything based on the current row of the main query. This would cause the subquery again to always return the first row no matter what row it is working on. In this case the subquery will always return New York. Or am I missing something? – Adagio_Hpt Sep 15 '16 at 11:18
  • Actually, In "WHERE Locat.City = Location.City " of subquery, Location.City is from main query. – Vishal Gupta Sep 15 '16 at 11:41
  • Unfortunately that doesn't work. The subquery doesn't know Location as it is outside of the scope of the subquery – Adagio_Hpt Sep 15 '16 at 11:57
0

Your SELECT subquery produces a table that has 1 row with the lowest SortingId overall. What you presumably want is the 1 row with the lowest SortingId for the city at hand.

Add a WHERE clause to your SELECT subquery that applies this condition. It may be the case (in fact, I think it is very likely to be the case) that this condition cannot be written because your Location.Id is "out of scope" inside that WHERE clause.

If so, remove the ORDER and FETCH clauses from your subquery, and reinstate the required filtering in the "outermost" WHERE in the form of

AND NOT EXISTS (
  SELECT * 
  FROM Comment AS SECONDCOMMENT 
  WHERE cities-the-same AND SECONDCOMMENT.SortingId > Comment.SortingId
  )
Kamil Gosciminski
  • 16,547
  • 8
  • 49
  • 72
Erwin Smout
  • 18,113
  • 4
  • 33
  • 52
  • I'm not sure what do you mean with WHERE cities-the-same ? – Adagio_Hpt Sep 15 '16 at 10:55
  • You are correct with your assumptions. The subquery will only return one row (New York), but as this result does not fit with the LocationId I'm searching for, it will not return anything. Also as you mention, I can't but the condition inside the subquery, as the subquery doesn't know Location. If that was possible it would have been an easy fix – Adagio_Hpt Sep 15 '16 at 11:02
  • cities-the-same as a placeholder for some correct and appropriate SQL syntax that represents that condition. (In the big NOT EXISTS I wrote '>' but it should be '<' actually.) – Erwin Smout Sep 15 '16 at 12:46
0

You could use a DENSE_RANK() analytic function for that. Your sample data seems a bit inconsistent (there's no Comment.TypeOfData value for Athens) but try something along these lines:

SELECT * FROM (
  SELECT 
    l.city, c.text, c.sortingid, c.typeofdata ,
    DENSE_RANK() OVER (PARTITION BY c.locationid ORDER BY c.sortingid) rnk 
  FROM location l
  LEFT JOIN comment c
    ON  l.id = c.locationid 
    AND c.typeofdata = 777
) t  
WHERE rnk = 1

PS. Your question explicitly asks for a subselect, and in this solution there is a subselect, although probably not where you thought it might be.

mustaccio
  • 18,234
  • 16
  • 48
  • 57
0

DB2 as of 9.7 has a way to do this with windowing functions.

Specifically FIRST_VALUE. It works like this:

SELECT Location.ID, Location.City, C.Text, C.SortingID, C.TypeOfData
FROM Location 
LEFT OUTER JOIN (
   SELECT DISTINCT LocationID 
          FIRST_VALUE(Text) OVER (PARTITION BY LocationID  ORDER BY SortingId) as Text,
          FIRST_VALUE(SortingId) OVER (PARTITION BY LocationID  ORDER BY SortingId) as SortingID,
          FIRST_VALUE(TypeOfData) OVER (PARTITION BY LocationID  ORDER BY SortingId) as Typeofdata
   FROM Comment
) AS C ON Location.Id = C.LocationId

This will return 1 row per ID, per city

Hogan
  • 69,564
  • 10
  • 76
  • 117
  • That would actually return multiple rows for a city if there were multiple comments, wouldn't it? They would all contain the same value though. – mustaccio Sep 15 '16 at 19:08
  • I don't think the latest will work either (even if you add the missing `FROM`). You're still returning all rows from `Comment`. – mustaccio Sep 15 '16 at 19:11
  • Ok I added in a distinct :D – Hogan Sep 15 '16 at 19:14
  • You're not handling `TypeOfData`. You can't reference `ID` in the subselect. You now have 4 sorts going on (that's not incorrect but less efficient). May be you can test your query before you post? I'm not saying you _can't_ use `FIRST_VALUE()`, but it seems less convenient to me. – mustaccio Sep 15 '16 at 19:19
  • fixed the field name issue -- but I don't think the number of "sorts" matter to the compiler since they are the same. This should have the same performance as yours since they are functionally the same. I'd have to test to be sure of course.... – Hogan Sep 15 '16 at 20:07
  • @mustaccio -- seems like one of those wonderful style issues, I like this because it does not have the where external to the sub-select. Yours (of course) works the same -- so it really is style unless someone goes and tests and proves one is faster. – Hogan Sep 15 '16 at 20:09
0

try this

   SELECT Person.Name, Date.NameOfDate, Location.City, tmp.Text
   FROM Person 
   LEFT OUTER JOIN Date ON Person.Id = Date.PersonId 
   LEFT OUTER JOIN Location ON Date.LocationId = Location.Id
   LEFT OUTER JOIN LATERAL(
   select Text FROM Comment
   where Location.Id = Comment.LocationId and Comment.TypeOfData=777
   order by Comment.SortingId
   fetch first rows only 
   ) tmp on 1=1
   WHERE Person.IsAlive = True
Esperento57
  • 16,521
  • 3
  • 39
  • 45