1

I'm using PDO. So I have a statement that prepares a query to select a bunch of records. Example:

   //select duplicates
   $stmt = $this->db->prepare('SELECT Name COUNT( * ) AS CNT
            FROM  `Test`
            GROUP BY Name
            HAVING CNT > 1');

   $stmt2 = $this->db->prepare('SELECT * FROM Test2 WHERE TName = ?');

Note that I do need all data selected, not certain columns. The table contains 5k records, sometimes more. I need to select all 5k records and for each of them I need to execute another query to select something else.

   $arr = array();
   while ($row = $stmt->fetch(DB::FETCH_ASSOC)) {
       $stmt2->execute($row['Name']);
       $arr[] = $stmt2->fetchAll(DB::FETCH_ASSOC);
   }

I know I can use joins but for my case joins will not work, because I need to loop through the data and execute stmt2 for each stmt1 row.

When I run this it takes up to 10-15 minutes which I can't allow, I need it to be faster. Can anyone tell me what the problem is? 5k records does not seem that much to loop through.

      STMT1 returns up to 5.5k records

I need data for each duplicate returned:

Giorgi
  • 609
  • 2
  • 15
  • 29
  • What is you second query in execute? – Vyacheslav Voronchuk Nov 08 '12 at 15:18
  • Do you have proper indexes on the tables you're querying? – Michael Irigoyen Nov 08 '12 at 15:19
  • yes i do use indexes and i updated the content of second query – Giorgi Nov 08 '12 at 15:20
  • 2
    "but for my case joins will not work" - I highly doubt it. Based on your code, joins are exactly what you need. And it's slow because on one page load, you're executing **five thousand queries** per page load, since you execute one query per record. That's absurd. – nickb Nov 08 '12 at 15:20
  • but the actual code im using will not be able to join since im selecting the duplicates and for each duplicate i need to run the code – Giorgi Nov 08 '12 at 15:21
  • @Giorgi - Nobody can properly help you if you don't explain your problem clearly and provide the code (and db schema in this case) that you're looking at. – nickb Nov 08 '12 at 15:22
  • This is the so-called [n+1 problem](http://stackoverflow.com/questions/97197/what-is-the-n1-selects-problem), you are running 5.5k + 1 queries on every page request. – Nic Wortel Nov 08 '12 at 15:32
  • Make sure Test2 has an index starting with the TName column. – Scott Saunders Nov 08 '12 at 15:29

2 Answers2

6

Joins won't work? HIGHLY HIGHLY doubtful:

SELECT test2.*, count(test.name) AS cnt
FROM test
LEFT JOIN test2 ON test2.id = test.id
GROUP BY test.name
HAVING (cnt > 1)

another try, given the OP's troubles:

SELECT *
FROM test2
WHERE name IN (
   SELECT DISTINCT name
   FROM test
   GROUP BY name
   HAVING (count(*) > 1)
)
Marc B
  • 356,200
  • 43
  • 426
  • 500
  • i tried this, it works but it does not return records for both of the records, it only returns 1 record when i need both records i mean both duplicates – Giorgi Nov 08 '12 at 15:48
  • still taking long the second one seems to be the solution but its taking really long time i've been waiting for it to return rows for over 5 mins already – Giorgi Nov 08 '12 at 16:19
0

If for some reason you can't use JOINS, create a stored procudure, which will take an id as argument and return the data you need.

You can also use dictionary encoding for your Name, because operations with integers are much more faster.

Vyacheslav Voronchuk
  • 2,403
  • 19
  • 16