-1

My boss sent us a query and asked us to replace it with a more efficient version and to achieve following objective:

Get all records from PACKAGE table which has at least 1 record in PASS_PACKAGE_DETAILS table.

Given SQL:

SELECT DISTINCT 
    pckg.*  
FROM 
    PASS_PACKAGE pckg 
JOIN 
    PASS_PACKAGE_DETAILS pckg_dtl ON (pckg.PACKAGE_ID = pckg_dtl.PACKAGE_ID) 
WHERE 
    IS_ACTIVE = 1 
    AND '2015/04/22' BETWEEN DATE_START AND DATE_END 
ORDER BY 
    PACKAGE_NAME

Correct me if I'm wrong, but I believe query above will slow down the performance due to the JOIN method. After reading this, I'm wondering which of my query does really achieved my boss requirement and why.

My SQL :

Attempt #1 - using IN:

SELECT
    pckg.*
FROM 
    PASS_PACKAGE pckg
WHERE 
    IS_ACTIVE = 1
    AND '2015/04/22' BETWEEN DATE_START AND DATE_END 
    AND pckg.PACKAGE_ID IN (SELECT DISTINCT pckg_dtl.PACKAGE_ID 
                            FROM PASS_PACKAGE_DETAILS pckg_dtl)
ORDER BY 
    PACKAGE_NAME

Attempt #2 - using EXISTS:

SELECT 
    pckg.*
FROM 
    PASS_PACKAGE pckg
WHERE 
    IS_ACTIVE = 1
    AND '2015/04/22' BETWEEN DATE_START AND DATE_END 
    AND EXISTS (SELECT pckg_dtl.PACKAGE_ID 
                FROM PASS_PACKAGE_DETAILS pckg_dtl
                WHERE pckg_dtl.package_id = pckg.package_id)
ORDER BY 
    PACKAGE_NAME

Hope to get some valuable information from experts here!

EDIT: I'm using SQL Server Management Studio. My execution time is kinda weird. For IN method, 1st execution will take around 90+ ms, 2nd execution like 200 ms, 3rd execution back to 90+ ms. So I'm not sure which execution time I should refer to. Same goes for the EXISTS method.

Community
  • 1
  • 1
VincentPzc
  • 31
  • 3
  • 1
    Have you looked at the query execution plans for your 3 variants of the query? – Jackson Apr 23 '15 at 03:57
  • 1
    Compare execution plans, and test with data similar to the actual set. There's no need to assume which would be best when it can be measured. – Pablo Romeo Apr 23 '15 at 03:57
  • I'm using SQL management studio. My execution time is kinda weird. For using IN method, 1st execution will me around 90+ ms, 2nd execution got me like 200 ms, 3rd execution back to 90+ ms. So im not sure which execution time I should refer to. – VincentPzc Apr 23 '15 at 04:04
  • 2
    Neither, look at the plan. – Blindy Apr 23 '15 at 04:11
  • Where is your schema definition? Your table and index stats? Your EXPLAIN plans? Maybe you should hire a professional. – symcbean Apr 24 '15 at 21:22

1 Answers1

0

exists and in are equivalent, and they both at best get optimized to a join, and at worst.. well, not.

You spent a few good minutes typing that entire post, I suggest looking at the execution plan instead as a better use of your time. You lost all credibility at "I believe" (even before reading the rest and seeing you're wrong) instead of showing hard data to support each alternative.

Edit: with the note that those queries aren't equivalent. The top one does an extra (useless) sort and filter (distinct), where the other two don't.

Blindy
  • 65,249
  • 10
  • 91
  • 131