3

I have a pretty big table with contains about 3 million records. When running a very simple query, joining this table on a few others (all with indexes and/or primary keys), the query will take about 25 seconds to complete! The value of "Handler_read_next" is about 7 million!

Number of requests to read the next row in key order, incremented if you are querying an index column with a range constraint or if you are doing an index scan.

This problem have only started since this table began to grow big.

Now if I do an "optimize tables" on this table, the query will run in about 0.02 seconds and "Handler_read_next" will have a value of about 1500.

How can the difference be so extreme, and do I really have to setup a scheduled query, optimizing this table once a week or so? Even so, I would like to know the meaning behind this and why mysql behaves like this. Sure, rows are deleted and updated pretty much in this table, but should it get so badly fragmented in only one week that the query goes from 0.02 sec to 25 sec?

Edit: After request, here comes the query in question:

SELECT *
FROM budget_expenses 
  JOIN budget_categories 
    ON  budget_categories.BudgetAreaId = budget_expenses.BudgetAreaId 
    AND budget_categories.BudgetCategoryId = budget_expenses.BudgetCategoryId
  LEFT JOIN budget_types 
    ON  budget_types.BudgetAreaId = budget_expenses.BudgetAreaId 
    AND budget_types.BudgetCategoryId = budget_expenses.BudgetCategoryId 
    AND budget_types.BudgetTypeId = budget_expenses.BudgetTypeId
WHERE budget_expenses.BudgetId = 1 
  AND budget_expenses.ExpenseDate >= '2012-11-25' 
  AND budget_expenses.ExpenseDate <= '2012-12-24' 
  AND budget_expenses.BudgetAreaId = 2 
ORDER BY budget_expenses.ExpenseDate DESC, 
         budget_expenses.ExpenseTime IS NULL ASC, 
         budget_expenses.ExpenseTime DESC

(BudgetAreaId, BudgetCategoryId) is the primary key in budget_categories and (BudgetAreaId, BudgetCategoryId, BudgetTypeId) is the primary key in budget_types. In budget_expenses these 3 keys are indexes and also ExpenseDate has an index. This query returns about 20 rows.

Show create table:

CREATE TABLE `budget_areas` (
  `BudgetAreaId` int(11) NOT NULL,
  `Name` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`BudgetAreaId`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `budget_categories` (
  `BudgetAreaId` int(11) NOT NULL,
  `BudgetCategoryId` int(11) NOT NULL AUTO_INCREMENT,
  `Name` varchar(255) DEFAULT NULL,
  `SortOrder` int(11) DEFAULT NULL,
  PRIMARY KEY (`BudgetAreaId`,`BudgetCategoryId`),
  KEY `BudgetAreaId` (`BudgetAreaId`,`BudgetCategoryId`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1


CREATE TABLE `budget_types` (
  `BudgetAreaId` int(11) NOT NULL,
  `BudgetCategoryId` int(11) NOT NULL,
  `BudgetTypeId` int(11) NOT NULL,
  `Name` varchar(255) DEFAULT NULL,
  `SortId` int(11) DEFAULT NULL,
  PRIMARY KEY (`BudgetAreaId`,`BudgetCategoryId`,`BudgetTypeId`),
  KEY `BudgetAreaId` (`BudgetAreaId`,`BudgetCategoryId`,`BudgetTypeId`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

 CREATE TABLE `budget_expenses` (
  `ExpenseId` int(11) NOT NULL AUTO_INCREMENT,
  `BudgetId` int(11) NOT NULL,
  `TempId` int(11) DEFAULT NULL,
  `BudgetAreaId` int(11) DEFAULT NULL,
  `BudgetCategoryId` int(11) DEFAULT NULL,
  `BudgetTypeId` int(11) DEFAULT NULL,
  `Company` varchar(255) DEFAULT NULL,
  `ImportCompany` varchar(255) DEFAULT NULL,
  `Sum` double(50,2) DEFAULT NULL,
  `ExpenseDate` date DEFAULT NULL,
  `ExpenseTime` time DEFAULT NULL,
  `Inserted` datetime DEFAULT NULL,
  `Changed` datetime DEFAULT NULL,
  `InsertType` int(1) DEFAULT NULL,
  `AccountId` int(11) DEFAULT NULL,
  `BankCardId` int(11) DEFAULT NULL,
  PRIMARY KEY (`ExpenseId`),
  KEY `BudgetId` (`BudgetId`),
  KEY `AccountId` (`AccountId`),
  KEY `Company` (`Company`) USING BTREE,
  KEY `ExpenseDate` (`ExpenseDate`),
  KEY `BudgetAreaId` (`BudgetAreaId`),
  KEY `BudgetCategoryId` (`BudgetCategoryId`),
  KEY `BudgetTypeId` (`BudgetTypeId`),
  CONSTRAINT `budget_expenses_ibfk_1` FOREIGN KEY (`BudgetId`) REFERENCES `budgets`    (`BudgetId`)
) ENGINE=InnoDB AUTO_INCREMENT=3604462 DEFAULT CHARSET=latin1

After I copy pasted this I changed from MyIsam to Innodb on the budget_categories table.

Edit: The change from myisam to innodb didn't make any difference. The query is now very slow, just 12 hours after i optimized the budget_expenses table!

Here is the explain for the query which now takes about 9 seconds:

http://jsfiddle.net/dmVPY/1/

Andreas
  • 3,212
  • 4
  • 25
  • 33
  • 2
    Is the query hitting the query cache the second time? – G-Nugget Dec 11 '12 at 22:39
  • 1
    Are you regularly deleting hundreds of thousands or even millions of rows between optimizations? – drew010 Dec 11 '12 at 22:39
  • @G-Nugget: No, i change a date range each time when testing so the cache will not be hit. – Andreas Dec 11 '12 at 22:45
  • @drew010: No, I would say a few thousand deletes and updates all in all. – Andreas Dec 11 '12 at 22:45
  • Which engine are you using? MyISAM? – Mike Purcell Dec 11 '12 at 23:15
  • 1
    Okay maybe at this point if you could post the query and explain output that would be helpful. I wonder if the range is so large its just forcing a full table scan or near full scan. Of course these will always be fast when the table is small but now that its grown larger in size it is taking longer. – drew010 Dec 11 '12 at 23:41
  • @MikePurcell: When building this DB I used only MyIsam. Since then I've been converting as much tables as possible to Innodb because of different reasons. It seems like I have not done this on two of the tables that is joined upon in this query though! Could joining between myisam and innodb cause these kind of problems you think? – Andreas Dec 12 '12 at 07:50
  • @drew010: I have added the query to the question. I think I'll have to wait with the explain until it's slow again. – Andreas Dec 12 '12 at 07:51
  • You probably need an index on `(BudgetAreaId, BudgetId, ExpenseDate)` but please add the `SHOW CREATE TABLE` output for the 3 tables. – ypercubeᵀᴹ Dec 12 '12 at 07:56
  • @Andreas I've edited your question. Tables cannot have 2 or 3 PKs. I guess you meant that they have composite (compound) PKs (made of 2 or 3 columns). – ypercubeᵀᴹ Dec 12 '12 at 08:02
  • @ypercube: Thank you for the correction :) I have added the show create table output. – Andreas Dec 12 '12 at 08:27
  • You missed the `budget_expenses` table (you added the `areas` table) – ypercubeᵀᴹ Dec 12 '12 at 08:30
  • @ypercube8: There, added show table data for budget_expenses. – Andreas Dec 12 '12 at 08:38
  • The query got slow again =/ Just 10 hours after optimizing the budget_expenses table. And this is after I changed from myisam to innodb on the one table that didn't have innodb. I added the explain to the question, see my edit above.. – Andreas Dec 12 '12 at 09:56
  • How about switching to MSSQL or Oracle? MySQL's performance degrades as data size grows, even with proper indexing. – Mukus Dec 12 '12 at 09:58
  • @ypercube: I completely missed your advice about adding an index. WOW! That made the query even faster and I didn't have to optimize the table first. Could you please write this advice as an answer, and maybe (if you want) explain why just this index combination was needed. I don't quite understand why budgetcategoryid wasn't needed. – Andreas Dec 12 '12 at 10:08
  • 1
    @TejaswiRana: Do you have a link for this claim? (performance degrading with increasing size of tables) – ypercubeᵀᴹ Dec 12 '12 at 10:37
  • @ypercube Benchmarking claims are often biased to start with. So I can't really point to a specific link and say this is true. For example, if it is a Microsoft site, of course it will say MSSQL is better. But here's a link that actually might help a bit in comparing different versions of SQL Server with MyISAM and InnoDB engines for selects, updates, deletes and inserts. http://bloggersnetwork.net/mysql-vs-ms-sql/ Of course it all depends on database structure (fragmentation and indexing included, queries (joins) and volume that was tested too. – Mukus Dec 12 '12 at 11:33
  • 1
    @Tejaswi: Sorry but a test that has `innodb_buffer_pool_size = 8M` in 2012 is not worth read further. My 5-year old mobile phone gives more memory to the applications I run, a database server should be configured with a bit more, don't you think? – ypercubeᵀᴹ Dec 12 '12 at 13:48
  • The reason why I asked if you were using MyIsam engine is because I have heard that performance takes a hit if a table is constantly undergoing heavy changes, (inserts and deletes), which is why an optimize must be executed to regain some of the performance (rebuilding indices etc). I was looking for more concrete answer and came across this post: http://stackoverflow.com/questions/20148/myisam-versus-innodb You should read all of the answers as there are several good responses. – Mike Purcell Dec 12 '12 at 18:09

1 Answers1

1

Ahhh MyISAM....

Try changing the table type (aka 'storage engine') to InnoDB instead.

If you do this, make sure innodb_buffer_pool_size in your my.cnf is a sensible value - the default is too small.

Rimu Atkinson
  • 775
  • 4
  • 15
  • 1
    Apparently I had myisam on budget_categories but innodb on the rest of the tables. I have now changed to innodb on this table also. I guess I have to wait a week before I know it worked so please be patient with me accepting an answer. – Andreas Dec 12 '12 at 08:28
  • No problem. When I encountered this issue on a large site I maintain, I just set up a cron job to optimize the tables once per day. But if you're already using InnoDB then you might as well go all the way down that route. – Rimu Atkinson Dec 12 '12 at 08:46
  • So you mean that this is a known problem and when mixing engines like this, the table get more fragmented and optimization needs to be done regularly? – Andreas Dec 12 '12 at 09:09
  • I'd like to upvote this as you are correct, there are inherit performance issues regarding MyISAM vs InnoDB, but you should cite reason(s) why, so others will learn why MyISAM is a bad choice for most applications. – Mike Purcell Dec 12 '12 at 18:12
  • It's a known problem with MyISAM. I don't know what causes it. – Rimu Atkinson Dec 12 '12 at 19:27