8

I am using PDO, and the user should have the option of stop the request that he previous triggered.

For example, Now I click generate a report, however after the request I forgot to select one more field that makes the report useless. So I need to cancel the request, and make a new one.

Basically, how can I cancel the MYSQL query that is running?

if (!isset($_POST['cancel_request'])) { 
    //query
}else{
    //the user cancel the query
}

I know that I can use the kill command and the PID process, but this should running through PDO, and I don't know what is the PID.

user2990084
  • 2,699
  • 9
  • 31
  • 46
  • This might help you a bit: http://stackoverflow.com/questions/3573576/get-pdo-connection-id – Alfons Dec 31 '14 at 10:40
  • Does PDO::rollBack not work for your scenario? – Dan Jan 02 '15 at 15:57
  • 3
    Why *must* you kill the query? Why not just let the `SELECT` finish and simply ignore the result? – bishop Jan 02 '15 at 16:09
  • Killing a query assumes that you generate reports asynchroniously, that is you start a query, return its id, and poll the results each n seconds. In this case you should already have some id. Otherwise, stopping the query is not the only problem you have. – newtover Jan 02 '15 at 22:38
  • @bishop, ok, but does not make sense with large queries or multiples queries pending, that have useless data. – user2990084 Jan 02 '15 at 22:46
  • What happens when you press the "Generate report" button? Is a form submitted to a PHP script that generates the report and displays it in the page when completed or is something asynchronous, using AJAX? And how long does the report generation need to complete? – axiac Jan 08 '15 at 16:36
  • @user2990084 why start large queries or multiple queries pending if they are useless in the first place. Why not trying not to start them at all? I mean isn't there a way to see which queries are useful in the first place and eliminate all others before they are even started. This has to do with the logical backbone structure of your application and not the programmatic one. –  Jan 09 '15 at 12:11

4 Answers4

5

The main problem here is to share PID between your async request that generates the report and the script that should stop it.

You can get your PID using:

    $stmt = $dbh->prepare("SELECT CONNECTION_ID()");
    $stmt->execute();
    $pid = $stmt->fetchColumn();

And you can use something like php-shared-memory to create a shared variable between your scripts. If you're not using Composer for your project dependancies, this library has a standalone release (1.5.0, here).

Implementation sample:

<?php

if (!include __DIR__ . '/vendor/autoload.php')
{
    die('You must set up the project dependencies.');
}

use Fuz\Component\SharedMemory\SharedMemory;
use Fuz\Component\SharedMemory\Storage\StorageFile;

// your intializations here

$storage = new StorageFile("/tmp/shared.{$user_id}.sync");
$shared = new SharedMemory($storage);

if (!isset($_POST['cancel_request']))
{
    $stmt = $dbh->prepare("SELECT CONNECTION_ID()");
    $stmt->execute();
    $pid = $stmt->fetchColumn();

    $shared->pid = $pid;

    // your long query here

    $shared->destroyStorage();
}
else
{
    // kills pid
    $pid = $shared->pid;
    if (!is_null($pid))
    {
        $dbh->exec("KILL {$pid}");
    }
}
Alain Tiemblo
  • 36,099
  • 17
  • 121
  • 153
  • What will happen if two users want to generate a report? – newtover Jan 02 '15 at 22:45
  • As you can see in my sample, I use an hypothetical `$user_id` to distinguish shared resources. This mean that the shared object instance will be different for each user. Anyway, I let the developer adapt the code to fit his needs. – Alain Tiemblo Jan 02 '15 at 22:47
  • That's precisely because you are using sessions that your ajax request cannot run in parallel. Request that stops your query is delayed until the session file is released to avoid concurrent access. The behaviour is widely documented, see [this thread](http://stackoverflow.com/a/13651667/731138) for example. – Alain Tiemblo Jan 05 '15 at 11:20
  • ok Alain, I will try with php shared memory. sorry about the delete, but i was testing with sessions, and i don't want to post a comment with incorrect description. – user2990084 Jan 05 '15 at 11:31
  • @AlainTiemblo, the method `$shared->removeStorage();` doesn't exist in the source code of the plugin. I am correct? Should be used `$shared->remove();`? – user2990084 Jan 05 '15 at 12:31
  • Nope it exists, i just coded it 2 days ago. take the master branch. – Alain Tiemblo Jan 05 '15 at 13:28
  • Well, you have destroyStorage not removeStorage. https://github.com/ninsuo/php-shared-memory/commit/f38368f3a95b8605fd4882e24b2471244f63f008 – user2990084 Jan 05 '15 at 14:50
  • What if the client code handles the connection failure, reconnects and continues the processing? (A report generator should do it, in my opinion.) The UI will report it as killed while it will continue to consume processor cycles and stress the database. And there will be no way to stop it other than logging into the MySQL server and manually searching and killing it. – axiac Jan 08 '15 at 17:15
  • When reconnecting, you'll update the `connection_id()` in the shared memory. As it will not be null, why this would set "failed" in the UI? – Alain Tiemblo Jan 08 '15 at 17:18
  • Right, it can update the ID into the shared memory. But it will not be killed, right? Or it can not reconnect and in this case a real connectivity issue will appear as a request from the user to kill the reporting. There is a big risk here: if something starts to go wrong and the connectivity has problems the developer will never notice. – axiac Jan 08 '15 at 17:23
  • Of course my code is a quick & dirty implementation, I did it so the code is simple enough to be understood and used. I agree that such task needs strong validation and security, OP will need to adapt the code to its own environment/implementation, following his specific requirements. – Alain Tiemblo Jan 08 '15 at 19:41
5

As others mention, you can log in to your MySQL server, issue SHOW PROCESSLIST, locate the number for the query you want to abandon, and then issue a KILL number command.

It seems like you want this function to be self-service for your report-designing users, rather than having them call somebody in your net ops and ask them to do it, or teach them to use an administration tool.

I've done this sort of thing before. It takes three parts.

First, you need to arrange to insert a tag into the kind of queries your users might want to abandon. For example, if your user is doing this

SELECT cust_id, SUM(orders), COUNT(orders)
  FROM big_fat_table
 GROUP BY cust_id

you'll need to change the text of the query so it has a tag embedded in a comment, for example like so.

SELECT /*report:user2290084*/ cust_id, SUM(orders), COUNT(orders)
  FROM big_fat_table
 GROUP BY cust_id

Notice this tag has two items in it: report: and a user id. The user id needs to be something that does not relate to the connection upon which the query is run -- that connection will be tied up if you're trying to abandon the query running on it. The tag doesn't affect the execution of the query at all. It just shows up in the server's process table.

Second: your PDO code that runs those queries is going to need good and user friendly exception handling, because having the queries blow up underneath your code will become routine. You're going to need to experiment with that to get this abandon-query feature to be useful to your users.

Third: issue this query when you need to abandon the report query for user2290084

select id from information_schema.processlist where info like '%/*%report:user2290084*/%'

It looks through the process list to find the process id for a query with the appropriate tag. Then issue

kill <<that process id>>

and you've abandoned the query.

In php, it might look like this:

$q = 'select id from information_schema.processlist where info like :tag';
$tag = '%/*report:' . $userid . '*/%'
$stmt = $dbh->prepare($q);
$stmt->bindParam(':tag', $tag);
$stmt->execute();
$pids = $stmt->fetchAll(PDO::FETCH_COLUMN, 0);

foreach ($pid in $pids) {
    $dbh->query('KILL ' . $pid);
}        

Notice there's a loop in this code. It's possible that you will have more than one report query running for this user, or none.

O. Jones
  • 103,626
  • 17
  • 118
  • 172
  • Nice idea using an identifying comment in the query. – axiac Jan 08 '15 at 17:17
  • There is a problem here: either the report generating script re-connects and attempts to complete its job or it aborts when the connection breaks. In the first case the killing script has no effect. In the second case it may be the killing script that did it or it may be a real connectivity issue or a problem on the MySQL server. It's not that easy to find out. The two scripts (the one that generates the report and the one that stops it) need a way to communicate. – axiac Jan 08 '15 at 17:29
  • That's true, @axiac. A fully productized and scaleable system based on this would take quite a bit of wrapper work around the reporting script. I kinda sorta alluded to that in my words about "good and friendly exception handling." – O. Jones Jan 08 '15 at 18:59
0

as you are already running a query and making a second transation to cancel it, you will need to know some information about the running query.

http://dev.mysql.com/doc/refman/5.1/en/show-processlist.html

Show processlist will show all queries, Find the correct one and issue the kill command (in MySQL)

http://dev.mysql.com/doc/refman/5.1/en/kill.html

You will need some info to match against though. The info box shows the query as typed, You could add a comment in with a random number which you store in the session. Then match the session random value to the rows from the result of show processlist this would allow query cache to still be used and match uniquely on what you need to

Though the fact your queries are long enough to require something like this suggests you should look at why the queries are that slow.

exussum
  • 18,275
  • 8
  • 32
  • 65
0

This is a solution we are using on one of our systems. There are two distinct components (the UI and the report processor) that use the database to communicate.

The UI collects the report parameters from the user then creates a new entry in the reports table, with status new. The report generation does not start automatically.

The UI has a listing of all the reports, with their current status (new, waiting, running, killed, completed, failed). When the customer is pleased with the report parameters they select it form the list of reports and presses the "Start report" button that changes the report status from new to waiting.

A cron job that runs on the server every minute picks the first report in the waiting state, change its state to running and starts gathering its data from the database. When it completes it changes the status to completed (or failed if something went wrong) and updates a comment field with the result of the operation (the error message on failure).

The customer sees the current status of all the reports in the UI. If they want to stop a running report they use the "Kill" button and the UI changes the report status in the database to killed.

The report generation involves multiple phases and a lot of requests to the database. Between these phases and even during them, when a phase has many steps, the script checks the status of the report it processes. While the status is running it continues to gather the data and compose the report. If the status changed to killed it knows the user changed their mind; it stops processing, changes the report status to failed, updates the comment (to "killed by the user"), cleans the things up and exits.

This is just a sketch. There are more process statuses and also some code that keeps two instances of the cron job to process the same report and so on, but this model works for us for more than four years without significant issues.

Regarding the running time, small reports need several seconds to complete but bigger reports sometimes runs 2-3 hours. While 5-10 seconds reports could be let to complete and deleted afterwards, killing early a report that is not needed and would otherwise take a couple of hours to complete justifies implementing a complex mechanism.

axiac
  • 68,258
  • 9
  • 99
  • 134