How to efficiently retrieve data in one to many relationships

Question

I am running into an issue where I have a need to run a Query which should get some rows from a main table, and have an indicator if the key of the main table exists in a subtable (relation one to many).

The query might be something like this:

select a.index, (select count(1) from second_table b where a.index = b.index) 
from first_table a;

This way I would get the result I want (0 = no depending records in second_table, else there are), but I'm running a subquery for each record I get from the database. I need to get such an indicator for at least three similar tables, and the main query is already some inner join between at least two tables...

My question is if there is some really efficient way to handle this. I have thought of keeping record in a new column the "first_table", but the dbadmin don't allow triggers and keeping track of it by code is too risky.

What would be a nice approach to solve this?

The application of this query will be for two things:

Indicate that at least one row in second_table exists for a given row in first_table. It is to indicate it in a list. If no row in the second table exists, I won't turn on this indicator.
To search for all rows in first_table which have at least one row in second_table, or which don't have rows in the second table.

Another option I just found:

select a.index, b.index 
from first_table a 
left join (select distinct(index) as index from second_table) b on a.index = b.index

This way I will get null for b.index if it doesn' exist (display can finally be adapted, I'm concerned on query performance here).

The final objective of this question is to find a proper design approach for this kind of case. It happens often, a real application culd be a POS system to show all clients and have one icon in the list as an indicator wether the client has open orders.

score 6 · Accepted Answer · answered Jun 14 '13 at 11:57

6

Try using EXISTS, I suppose, for such case it might be better then joining tables. On my oracle db it's giving slightly better execution time then the sample query, but this may be db-specific.

SELECT first_table.ID, CASE WHEN EXISTS (SELECT * FROM second_table WHERE first_table.ID = second_table.ID) THEN 1 ELSE 0 END FROM first_table

answered Jun 14 '13 at 11:57

Mikhail

1,540
2
13
13

Wow, this one seems quite nice, tried it and got nice results on the test environments, will see how it behaves in a productive environment. – Martin Jun 14 '13 at 12:03
Won't this case run a subquery for each row in the parent table? I just want to know what we will be compromising for each case... – Martin Jun 14 '13 at 12:06
Indeed, it's running for every row, but depending on your data, EXIST may work faster, cause it will run till first row, metting condition. So, if you have most of your rows from parent table in the second one, performance will be much better then joining, while degrading with reducing of number of rows in the second table. I think, this link will help to understand the difference: http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:953229842074 – Mikhail Jun 14 '13 at 12:24
If I use exists and integrate this query in some framework that will only retrieve certain records from the resultset (paging), it would be interesting to know if this scheme would only run the nested query on the actually retrieved rows... – Martin Jun 14 '13 at 14:35
1

Changing Exists (select *.... to To Exists (select null..... will make the query slightly more efficient as it won't have to select any columns. – Adam Jun 23 '13 at 14:23

score 2 · Answer 2 · answered Jun 14 '13 at 11:12

2

why not try this one

select a.index,count(b.[table id])  
from first_table a
left join second_table b
    on a.index = b.index
group by a.index

answered Jun 14 '13 at 11:12

Nilesh Nikumbh

302
3
15

Looks nice. Tried it and it does what it should. However I get higher execution times even than with the first sample I posted. I posted a second sample which is equivalent... I will be running execution plans on some different options to see how it behaves on the final environment... – Martin Jun 14 '13 at 11:25

Robert Co · Answer 3 · 2013-06-22T20:05:50.940

1

Or you can avoid join altogether.

WITH comb AS (
SELECT index
     , 'N' as exist_ind
  FROM first_table
UNION ALL
SELECT DISTINCT 
       index
     , 'Y' as exist_ind
  FROM second_table
)
SELECT index
     , MAX(exist_ind) exist_ind
  FROM comb
 GROUP BY index

edited Jun 22 '13 at 20:05

answered Jun 21 '13 at 23:20

Robert Co

1,715
8
14

score 1 · Answer 4 · answered Jun 22 '13 at 14:07

1

Two ideas: one that doesn't involve changing your tables and one that does. First the one that uses your existing tables:

SELECT
  a.index,
  b.index IS NOT NULL,
  c.index IS NOT NULL
FROM
  a_table a
LEFT JOIN
  b_table b ON b.index = a.index
LEFT JOIN
  c_table c ON c.index = a.index
GROUP BY
  a.index, b.index, c.index

Worth noting that this query (and likely any that resemble it) will be greatly helped if b_table.index and c_table.index are either primary keys or are otherwise indexed.

Now the other idea. If you can, instead of inserting a row into b_table or c_table to indicate something about the corresponding row in a_table, indicate it directly on the a_table row. Add exists_in_b_table and exists_in_c_table columns to a_table. Whenever you insert a row into b_table, set a_table.exists_in_b_table = true for the corresponding row in a_table. Deletes are more work since in order to update the a_table row you have to check if there are any rows in b_table other than the one you just deleted with the same index. If deletes are infrequent, though, this could be acceptable.

answered Jun 22 '13 at 14:07

jph

2,181
3
30
55

Wow, thank you, your first suggestion seems quite interesting, will give it a try tomorrow and see how it behaves, I'm thinking it will give me repeated data, but I need to look at it closer to be sure... About changing the tables, I had already thought of that, but it's a PITA to mantain all insertions and deletions as triggers are not allowed. What I'm looking for with this question is for good solution approaches, and this seems like a good idea... – Martin Jun 23 '13 at 18:36
The only way you could get more than one row returned by that query is if the combination of a.index, b.index and c.index is not unique. The join statements constrain the latter two to be either "equal to a.index" or "null". The only way they're non-null is if a.index is found in the table. The only way they're null is if a.index is *not* found in the table. If you want to look for a *particular* a.index then just tack on "where a.index = {some value}". – jph Jun 23 '13 at 19:17
Correction on the above comment: the query is only guaranteed to return one result when you specify a value for a.index. Otherwise it will return a result for each unique value of a.index in a_table. – jph Jun 23 '13 at 19:21

score 0 · Answer 5 · answered Jun 17 '13 at 15:43

0

The application of this query will be for two things:

Indicate that at least one row in second_table exists for a given row in first_table. It is to indicate it in a list.

To search for all rows in first_table which have at least one row in second_table.

Here you go:

SELECT  a.index, 1 as c_check  -- 1: at least one row in second_table exists for a given row in first_table
FROM    first_table a
WHERE   EXISTS
        (
            SELECT  1
            FROM    second_table b
            WHERE   a.index = b.index
        );

answered Jun 17 '13 at 15:43

the_slk

2,172
1
11
10

OK, you're right given my restrictions, will correct my question so it's not confusing. I really need to know all rows from table a that match a certain criteria (not discussed in this post), and be able to indicate whether there is a child in table 2... – Martin Jun 18 '13 at 14:57
@Martin: The answers will be just a blind shots. You would have to post some more information, i.e.: table structure - is it partitioned? Indexes? tables size? Database version - is it EE? And maybe execution plans. Without the details it is almost impossible to answer your question. – the_slk Jun 18 '13 at 15:41

score 0 · Answer 6 · answered Jun 20 '13 at 20:20

I am assuming that you can't change the table definitions, e.g. partitioning the columns.

Now, to get a good performance you need to take into account other tables which are getting joined to your main table.

It all depends on data demographics.

If the other joins will collapse the rows by high factor, you should consider doing a join between your first table and second table. This will allow the optimizer to pick best join order , i.e, first joining with other tables then the resulting rows joined with your second table gaining the performance.
Otherwise, you can take subquery approach (I'll suggest using exists, may be Mikhail's solution).
Also, you may consider creating a temporary table, if you need such queries more than once in same session.

MarmiK · Answer 7 · 2013-06-24T03:49:58.067

0

I am not expert in using case, but will recommend the join...

that works even if you are using three tables or more..

SELECT t1.ID,t2.name, t3.date
FROM  Table1 t1 
LEFT OUTER JOIN Table2 t2 ON t1.ID = t2.ID
LEFT OUTER JOIN Table3 t3 ON t2.ID = t3.ID
--WHERE t1.ID = @ProductID -- this is optional condition, if want specific ID details..

this will help you fetch the data from Normalized(BCNF) tables.. as they always categorize data with type of nature in separate tables..

I hope this will do...

edited Jun 24 '13 at 03:49

answered Jun 21 '13 at 09:59

MarmiK

5,639
6
40
49

you are not expert in joins either..inner join will not solve his problem as he need all ids from table A. – cosmos Jun 21 '13 at 12:33
read the question man..OP needs all the rows from Table 'A' and an indicator whether it exists in table 'B' or not. Doing an inner join will make he lose the ids which are not in table 'B'. BTW, I am not concerned about the optional ST condition rather "Join condition". – cosmos Jun 21 '13 at 13:07
it should be a left outer join. – cosmos Jun 21 '13 at 21:47
Ok I have updated my answer left join or left outer will perform same as per this link http://www.w3schools.com/sql/sql_join_left.asp – MarmiK Jun 24 '13 at 03:48
An Old post on outer keyword.. http://stackoverflow.com/questions/406294/left-join-and-left-outer-join-in-sql-server – MarmiK Jun 24 '13 at 06:17
I understand left outer join and left join are same..your's was inner join which brought my attention..thanks for updating the answer..removing the downvote..!! – cosmos Jun 24 '13 at 08:17
Yey, thank you pointing it out, in ignorance of details we might misguide someone, again thank you for correcting me.. :) – MarmiK Jun 24 '13 at 09:38

How to efficiently retrieve data in one to many relationships

7 Answers7