4

Supposing our accounts table has a column called balance, each transaction is logged in the transactions table. Of course, we should verify there is sufficient funds prior to doing any transactions to sell products. Therefore, for performance purposes we should check the balance column of the user, deduct the amount on a successful sale, and update his balance.

However, if the user asynchronously bought 2 products, could not that potentially cause fraud? I've wrote a script that will deduct funds from an account and cloned it to another file. I executed both scripts at the same time and the results were surprising.

Deduct.php

<?php
//database connection...
$amount = 11;
$deducted = 0;
$blocked = 0;

for($i = 0; $i < 5000; $i++){
    $sql = $dbh->prepare('SELECT balance FROM accounts WHERE id = ?');
    $sql->execute(array(1));
    while($u = $sql->fetch()){
        $balance = $u['balance'];
        $deduct = $balance - $amount;
        if($deduct >= 0){
            $sql2 = $dbh->prepare('UPDATE accounts SET balance = ? WHERE id = ?');
            $sql2->execute(array($deduct,1));
            echo $balance . ' -> ' . $deduct . "\n";
            $deducted += $amount;
        } else {
            $blocked++;
        }
    }
}

echo 'Deducted: '.$deducted. "\n";
echo 'Blocked: '.$blocked;

Before running the scripts my balance was 1000000. I've executed two processes of this script with different $amount values.

Here are the results:

As you can see both scripts deducted a total of 125000 and my balance is 879778.00 which is a proof of fraud

  • Any alternative solutions to overcome this? I understand that logging each transaction and calculating the final balance is precise but is also intensive.
MFerguson
  • 1,739
  • 9
  • 17
  • 30
  • 1
    have you tried using transactions? start one before your operations and commit it afterwards, so there won't be outside changes inbetween. – Franz Gleichmann Feb 11 '16 at 13:44
  • Or a semaphore to lock the person's account while a transaction is taking place. – i-CONICA Feb 11 '16 at 13:55
  • Yup, **transactions** is the sane solution here. Your use case is pretty much the most common example on *why* you'd want to use them. You might also want to look into *constraints* and/or *triggers* (to avoid accounts from going to negative balance etc.). – Piskvor left the building Feb 11 '16 at 13:56
  • @i-CONICA: Database transactions are already doing that under the hood, much more efficiently. Semaphore/mutex is actually a low-tech solution here - a reinvented square wheel. – Piskvor left the building Feb 11 '16 at 13:57
  • 1
    "I understand that if logging each transaction and calculate the final balance s precise but is intensive." - **don't worry about optimization**. Seriously, don't. No, really: would you rather have an ultra-fast system that sometimes just drops your money and messes up your account, or a normally-fast system that keeps track of how much money you had at any point and why it changed? If money is on the line, you want to optimize for *safety* first, not for speed; the opposite has very nasty side effects (hordes of lawyers descending upon you etc.). – Piskvor left the building Feb 11 '16 at 14:17
  • @Piskvor +1 for this great explanation – Alessandro Genovese Feb 11 '16 at 14:47
  • @Piskvor Well that's me told. lol Thanks, you're right - transaction is much better. – i-CONICA Feb 11 '16 at 14:52
  • @i-CONICA: I didn't mean to offend you, and I apologize. Your line of reasoning is indeed correct; just that there exist built-in tools that are much better capable of implementing this ("closer to the metal," as it were) with minimum effort on the part of the developer. – Piskvor left the building Feb 11 '16 at 15:05
  • I wasn't offended at all, I just realised that transactions do the necessary locking, but in a much more efficient way. So thanks for pointing it out, it'd have been wrong for the OP to follow my suggestion. – i-CONICA Feb 11 '16 at 15:11

3 Answers3

3

If this is a school homework problem and you just want your script to work, do this:

$sql2 = $dbh->prepare('UPDATE accounts SET balance = balance - ? WHERE id = ?');
$sql2->execute(array($amount,1));

If you are doing anything real, you should be recording all the individual transactions. To speed things up, you can sum the transactions each night and update the account balance, then create a view which includes a query like this to get the current balance:

create or replace view current_accounts as
select a.id as account_id
, max(a.balance) + ifnull(sum(t.amount), 0) as current_balance
from accounts a
left join transactions t on t.account_id = a.id and t.transaction_at > a.updated_at
group by a.id

When inserting each transaction, if you require that the balance never goes negative:

insert into transactions (transaction_at, amount, account_id)
select now(), ?, v.account_id
from current_accounts v
where v.account_id = ?
and v.current_balance + ? >= 0

When binding this, make sure that the amount being deducted is negative if you are withdrawing money and positive if you are depositing money into the account. You will need an index on transactions.transaction_date and accounts.updated_at for this to have any speed advantage.

The nightly update should look like:

drop table accounts_old if exists;
create table accounts_new as
select t.account_id as id
, sum(t.amount) as balance
, max(t.transaction_at) as updated_at
from transactions t
group by t.account_id;
rename table accounts to accounts_old;
rename table accounts_new to accounts;

Also, the primary key in the accounts table should be called account_id and you should have a primary key in the transactions table called transaction_id. Resist the convention to name anything "id" because it will confuse you eventually.

Joseph Bui
  • 1,701
  • 15
  • 22
  • Good points! 1. audit trail - **absolutely essential** where any money is concerned, and 2. accounting-transactions following the usual business rules for bookkeeping. (As for performance, keeping a running total would be a microoptimization, or worse: a hole for race conditions) – Piskvor left the building Feb 11 '16 at 14:14
  • I like the concept but why are you using max? since there will be only 1 value to the balance column for each user – Alessandro Genovese Feb 11 '16 at 14:24
  • 1
    I used max because there is a group by clause. I could have used avg or min, or in MySQL probably nothing at all, but I prefer to be close to ANSI when possible. I was lazy using ifnull(). Change it to coalesce() for better ANSI compliance. – Joseph Bui Feb 11 '16 at 14:56
  • #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'select a.id as account_id , max(a.balance) + ifnull(sum(t.amount), 0) as curr' at line 2 – Alessandro Genovese Feb 11 '16 at 16:41
  • you forgot an `as` in your first line – Alessandro Genovese Feb 11 '16 at 18:37
2

Why would you fetch the balance in one query and then set it in another? Just use:

UPDATE accounts
    SET balance = balance - ?
    WHERE id = ?;

Do the arithmetic in the database.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Why should I do it in the database rather than my method? if there is no performance variation then it's a preference after all. – Alessandro Genovese Feb 11 '16 at 14:27
  • 1
    @AlessandroGenovese: Unless you're using transactions (and you don't seem to use transactions), doing "select ... do something else ... update" risks that another process will change the database while you're doing something else - which is the exact issue that you have seen. So, the problem is not with performance, but with correctness. – Piskvor left the building Feb 11 '16 at 15:02
  • I've used select in this example to provide the main topic we're going to use and Yes Im inserting rows into each transaction. – Alessandro Genovese Feb 11 '16 at 15:14
  • 1
    @AlessandroGenovese . . . There *is* a performance difference between running two queries versus one query. In addition, doing the logic in the application layer is incorrect (due to race conditions). There is basically 0 reason to replicate database functionality at the application layer, especially when this results in inferior, slower code. – Gordon Linoff Feb 11 '16 at 16:01
0

What you're looking for is a database that allows transactions: in other words, either everything inside a transaction works, or it's rolled back, as if it never happened. This is possible with MySQL InnoDB tables (but NOT with MyISAM tables, you may need to convert them).

Depending on the operations you are trying to do, you have various locking options available - e.g. "while I'm working on this row, everyone else trying to read or write has to wait until I'm done" or "everyone has to wait for writing, but is able to read", etc.

See e.g. this for an example: http://coders-view.blogspot.com/2012/03/how-to-use-mysql-transactions-with-php.html

You may also want to check constraints (e.g. avoid debiting more than the account contains); in MySQL, you can use triggers for that purpose: Can a MySQL trigger simulate a CHECK constraint?

Community
  • 1
  • 1
Piskvor left the building
  • 91,498
  • 46
  • 177
  • 222