0

I've got read-only access to a MySQL database, and I need to loop through the following query about 9000 times, each time with a different $content_path_id. I'm calling this from within a PERL script that's pulling the '$content_path_id's from a file.

SELECT an.uuid FROM alf_node an WHERE an.id IN 
    (SELECT anp.node_id FROM alf_node_properties anp WHERE anp.long_value IN 
        (SELECT acd.id FROM alf_content_data acd WHERE acd.content_url_id = $content_path_id));

Written this way, it's taking forever to do each query (approximately 1 minute each). I'd really rather not wait 9000+ minutes for this to complete if I don't have to. Is there some way to speed up this query? Maybe via a join? My current SQL skills are embarrassingly rusty...

Ian Crew
  • 117
  • 1
  • 12
  • Why not have all the `$content_path_id` in a single table, join that table with `alf_content_data` (on `content_url_id`), then join the result with `alf_node_properties`, then join the result with `alf_node`? – Nitzan Shaked Sep 01 '13 at 20:45
  • `EXISTS` could be faster than `IN`. See http://stackoverflow.com/questions/14190788/subqueries-with-exists-vs-in-mysql – PM 77-1 Sep 01 '13 at 20:47
  • Thanks Nitzan. Unfortunately, I only have read-only access to the DB, so I can't add a table. – Ian Crew Sep 01 '13 at 21:01

2 Answers2

1

This is an equivalent query using joins. It depends what indexes are defined on the tables how this will perform.

If your Perl interface has the notion of prepared statements, you may be able to save some time by preparing once and executing with 9000 different binds.

You could also possibly save time by building one query with a big acd.content_url_id In ($content_path_id1, $content_path_id2, ...) clause

Select
    an.uuid 
From
    alf_node an
        Inner Join
    alf_node_properties anp
        On an.id = anp.node_id 
        Inner Join
    alf_content_data acd 
        On anp.long_value = acd.id 
Where
    acd.content_url_id = $content_path_id
Laurence
  • 10,896
  • 1
  • 25
  • 34
  • Thanks Laurence: That seems somewhat, but not a whole lot, faster. I wonder if I could make it faster by putting all 9000 queries as "OR"s in the "where" line and executing it as one jumbo query. Would that help? – Ian Crew Sep 01 '13 at 21:30
  • If there's no index on `alf_content_data (content_url_id, acd.id)` then it will almost certainly make it faster. My suggestion with `acd.content_url_id in ($content_path_id1, ..) etc is equivalent to this. – Laurence Sep 01 '13 at 21:41
  • Sorry--I didn't notice the "acd.content_url_id In" bit in your original answer. That did indeed help on a smaller test, so the full-size one is running now. Thanks! – Ian Crew Sep 01 '13 at 22:04
  • The whole, longer, query just completed in under 2 minutes, so I can confirm that it the "acd.content_url_id IN" bit does in fact greatly speed things up. Thanks again Laurence! – Ian Crew Sep 01 '13 at 22:15
0

Try this extension to Laurence's solution which replaces the long list of OR's with an additional JOIN:

Select
    an.uuid 
From alf_node an
Join alf_node_properties anp
    On an.id = anp.node_id 
Join alf_content_data acd 
    On anp.long_value = acd.id 
Join (
    select "id1" as content_path_id union all
    select "id2" as content_path_id union all
    /* you get the idea */
    select "idN" as content_path_id 
) criteria   
    On acd.content_url_id = criteria.content_path_id

I have used SQL Server syntax above but you should be able to translate it readily.

Pieter Geerkens
  • 11,775
  • 2
  • 32
  • 52