
Is there any way to figure out when a SQL query will finish after it has started? In other words, is there any way to estimate how much time a query will take before executing it? (A rough estimate is fine!)

ALTER IGNORE TABLE mytbl ADD UNIQUE (c);

I ran this query on an InnoDB table that has 35 million records. Column c is VARCHAR(255).

This command has been running for 10 hours and is still running. I don't want to cancel it if I'm close to the end.

I appreciate any help. Thank you!

mmswe
  • http://dev.mysql.com/doc/refman/5.6/en/show-processlist.html – Álvaro González Dec 27 '14 at 19:36
  • 1
    @ÁlvaroG.Vicario It shows me the duration since the query starts. My question is completely different. I don't understand why would you mention it? – mmswe Dec 27 '14 at 19:40
  • 1
    No. There is no way to see that, because the time that query will run depends from many other factors than mysql itself (like free CPU resources, free RAM, etc...). – bksi Dec 27 '14 at 19:47
  • @bksi - In principle there is no reason why this information can't be provided. In SQL Server you can see how many rows have been written to a new index midway through by looking at the DMVs, which is handy for getting an idea of how far through it is. – Martin Smith Dec 27 '14 at 19:54
  • @user4220128 - Weird, not on my servers. I also get a `State` column that (sometimes) gives a clue about what's going on. – Álvaro González Dec 27 '14 at 19:54
  • @Martin Smith Sure, if the query is an INSERT or UPDATE you can figure out how many rows were added/updated, but in this particular case how will the OP find out how much of the index has been built? Share if you know. – bksi Dec 27 '14 at 19:56
  • @bksi - The OP isn't on SQL Server so the methods of use there won't apply here. – Martin Smith Dec 27 '14 at 19:58
  • Maybe it is faster to create a new table with the structure you expect and add the records to it from a scripting language, in multiple steps; for example, insert the records into the new table in chunks of 10,000. – Majid Abdolhosseini Dec 27 '14 at 20:03
  • @Martin Smith That's why I said there is no stable indicator of what percentage of the query has run. The tags include mysql. – bksi Dec 27 '14 at 20:18
  • @bksi - The OP isn't after anything astonishingly precise; a "rough estimate is fine". Your original comment implied this was inherently impossible. It certainly wouldn't be impossible to provide greater visibility into the execution process to allow an informed decision. – Martin Smith Dec 27 '14 at 20:26
  • 1
    Do note that [SQL is turing complete](http://stackoverflow.com/a/7580013/732016), so it [can't be possible](https://en.wikipedia.org/wiki/Halting_problem) to decide whether an *arbitrary* command will halt—or, by extension, *when* it will do so. – wchargin Dec 27 '14 at 23:42

3 Answers

6

Excerpt from a ServerFault solution to a similar question:

The solution was to measure how fast the query was scanning rows in the table scan of the fact table. This is shown by the Handler_read_rnd_next status variable. Here’s an easy way to watch it (innotop is another handy way):

mysqladmin extended -r -i 10 | grep Handler_read_rnd_next
# ignore the first sample of output; with -r, later samples show the per-interval difference
| Handler_read_rnd_next             | 429224      |

So the server was reading roughly 43K rows per second, and there were 150 million rows in the table. A little math later, and you get 3488 seconds to completion, or a little less than an hour. And indeed the query completed in about 55 minutes.
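
You can apply the same arithmetic to the table from the question, sampling the counter twice from a second connection. A minimal sketch in plain SQL; the counter values below are placeholders, and only the table size (35 million rows) comes from the question:

SHOW GLOBAL STATUS LIKE 'Handler_read_rnd_next';
-- suppose it returns 1000000; wait 10 seconds and sample again:
SHOW GLOBAL STATUS LIKE 'Handler_read_rnd_next';
-- suppose it now returns 1430000
-- rows/sec: (1430000 - 1000000) / 10 = 43000
-- ETA:      35000000 rows / 43000 rows/sec ≈ 814 s, roughly 14 minutes
-- (this only covers the scan phase of the ALTER, so treat it as a lower bound)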

3

The big problem here is that the most important factor (by far) in determining how long a query will take has nothing to do with the query itself. Rather, the biggest factor is the load placed on the database while the query is running, especially load from other queries. For complex queries on active systems, this load may change significantly over the course of the query.

What you can do is get an execution plan for your query using the EXPLAIN keyword. This gives you a relative idea, all else being equal, of how much a particular query might cost. Even here, though, MySQL doesn't make it easy: it won't give you a plan cost or an execution-time number you can use directly, the way SQL Server does. You have to infer those numbers from the estimated rows and disk reads.
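
If it helps, here is a minimal sketch against the question's table. In MySQL 5.x EXPLAIN only accepts SELECT statements, so you cannot EXPLAIN the ALTER itself; the SELECT below merely approximates the scan it has to perform:

EXPLAIN SELECT c FROM mytbl;
-- "rows" is the optimizer's estimate of rows examined (all ~35 million here);
-- "type: ALL" means a full table scan. Bigger estimates mean proportionally
-- more work, all else being equal.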

Again, though, even if you generate an estimated cost and then run the query, it could finish in a much longer or shorter time than expected, because the load on the server changed after the estimate was created and while the query was running.


In this case, after 10 hours, it sounds like you might have a blocking/locking issue. If another active (possibly long-running or frequently recurring) query uses this table, try killing it; there's a good chance your query will then finish naturally a few minutes later.
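
A minimal sketch for checking that, using standard MySQL commands; the connection id 12345 below is made up:

SHOW FULL PROCESSLIST;        -- find other long-running statements touching mytbl
SHOW ENGINE INNODB STATUS;    -- the TRANSACTIONS section shows lock waits
KILL 12345;                   -- terminate the blocker by its Id from the process list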

Joel Coehoorn
1

Use a test database and a smaller set of data, as sketched below.
This will yield a very rough estimate, but I believe in your case the time will be non-linear (e.g., exponential) in table size. Perhaps you can rework the command so that c is guaranteed unique beforehand.
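
A rough sketch of that idea, assuming a made-up sample table mytbl_sample holding about 1% of the rows, and keeping in mind the warning above that timings may not scale linearly:

CREATE TABLE mytbl_sample LIKE mytbl;
INSERT INTO mytbl_sample SELECT * FROM mytbl LIMIT 350000;  -- ~1% of 35M rows
-- time the following and extrapolate with caution:
ALTER IGNORE TABLE mytbl_sample ADD UNIQUE (c);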

terary