1

I have the following table (simplified from my actual code):

CREATE TABLE `policy` (
  `policy_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `policy_number` varchar(25) DEFAULT NULL,
  `effective_date` datetime NOT NULL,
  PRIMARY KEY (`policy_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

I want to be able to either select a single row from the table by passing a policy id:

SELECT * FROM policy WHERE policy_id = pol_id;

Or I want to be able to fetch multiple rows. I decided to use another table to store the rows and then do a join on the that table to my policy table. This eliminates the need to use FIND_IN_SET.

The table:

CREATE TABLE `policy_population_ids` (
  `policy_population_id` int(10) unsigned NOT NULL,
  `policy_id` int(10) unsigned NOT NULL,
  PRIMARY KEY (`policy_id`,`policy_population_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

And the query:

SELECT * FROM policy JOIN policy_population USING (policy_id) WHERE policy_population_id = pp_id;

I currently have a stored procedure that takes in both pol_id and pp_id and based on which one is populated, executes one query or the other. However, this means I'm maintaining double the code (the actual query is joining to several tables and is more complex and some work to maintain) and I'd like to combine them.

The combined query I came up with is:

SELECT 
    policy_id,
    policy_number,
    effective_date
FROM 
    policy AS p
LEFT JOIN 
    policy_population AS pp ON p.policy_id = pp.policy AND pp_id IS NOT NULL 
WHERE 
    (pol_id IS NULL OR policy_id = pol_id) AND
    (pp_id IS NULL OR policy_population_id = pp_id);

This query works, however, is very inefficient when using the population table (using pp_id) because the policy table has millions of rows. Is there any way to make this more efficient? Should I go back to using the two separate queries with an if / else?

UPDATE

Below is sample data and expected results.

policy table:

policy_id   policy_number   effective_date
203         ABC1            12/16/2011 18:42
204         ABC2            12/16/2011 21:16
205         ABC3            12/16/2011 23:11
207         ABC4            12/17/2011 17:06
208         ABC5            12/17/2011 17:17

policy_population_ids table:

policy_population_id    policy_id
23456                   203
23456                   204
23456                   205
23457                   207
23457                   208

If I pass a pol_id of 204, I would get:

204         ABC2            12/16/2011 21:16

If I pass a pp_id of 23456, I would get:

203         ABC1            12/16/2011 18:42
204         ABC2            12/16/2011 21:16
205         ABC3            12/16/2011 23:11

2 Answers2

0

You should have an index on policy_population.policy_id. That way the join will work much faster.

Dave
  • 3,273
  • 1
  • 17
  • 15
  • Apologies. The table does have an index. I just left it off for space. With over a million rows, even with the index, the query is slow. – Jeff_Lipeles Apr 04 '17 at 02:17
0

Build your query as a string and only include the join to the policy_population table if the relevant parameter is passed. Then, execute the string.

DECLARE  @include_policy_population VARCHAR(200)
        ,@sql NVARCHAR(MAX)

IF (@pol_id IS NOT NULL)
    SELECT @include_policy_population = 'INNER JOIN policy_population ON policy_population.policy_id = policy.policy_id'
ELSE
    SELECT @include_policy_population = ''

SELECT @sql = N'SELECT *
                    FROM policy
                    ' + @include_policy_population + '
                    INNER JOIN ....
                    INNER JOIN ....'

EXEC sp_executesql @sql
airrabian
  • 1
  • 1
  • 1