
What I'm running here is a graphical file manager, akin to OneDrive or OpenCloud or something like that. Files, folders, accounts, and the main server settings are all stored in the database as JSON-encoded objects (yes, I did get rid of columns in favor of JSON). The problem is that if multiple requests use the same object at once, they'll often save back incorrect data, because the requests obviously can't communicate their changes to each other.

For example, when someone starts a download, the script loads the account object of the file's owner, increments its bandwidth counter, and then encodes/saves it back to the DB at the end of the download. But if I have 3 downloads of the same file at once, all 3 requests load the same account object, change the data as they see fit, and save it back without regard to the others that overlap. In this case, the 3 downloads would be counted as 1.

Besides downloads and bandwidth being undercounted, I'm also having a problem with a maintenance function I'm trying to create: it loads the server object and doesn't save it back for potentially several minutes. This obviously won't work while downloads are happening and manipulating the server object all the meanwhile, because their changes will all be overwritten with stale data when the maintenance function finishes.

Basically it's a threading issue. I've looked into PHP APC in the hope that I could make objects persist globally between requests, but that doesn't work, since it just serializes/deserializes the data for each request rather than actually having each request point to an object in memory.

I have absolutely no idea how to fix this without designing a completely new system from scratch... which sucks.

Any ideas on how I should go about this would be awesome.

Thanks!

stingray-11
  • I would need to understand the process a lot more to be able to offer any suggestions. –  Sep 29 '14 at 20:45
  • Objects are stored as JSON. A request loads them, makes changes and saves it back several seconds later. If more than one request happens at a time, request 2 will load the data before request 1 changes it and then when it saves it back, it'll be writing over whatever request 1 did to it. I don't know how better to phrase it. – stingray-11 Sep 29 '14 at 20:49
  • ok, well that's clearly a bad idea - yes i know that's not helpful - but that's all i got. –  Sep 29 '14 at 20:51
  • Bad design... if this was a web shop, you'd be doing the equivalent of running price information through client-side roundtrips. 200" plasma tv for $0.01? Sure! Your clients should be exchanging DELTAS, e.g. "increase download count in record #42 by 1" or "decrease download count by 1". Anything that does "set record to value X" will always be vulnerable to race conditions. – Marc B Sep 29 '14 at 20:51
  • Whatever you 'write back' probably belongs in a table of its own, row by row, not one big blob for everything. –  Sep 29 '14 at 20:52
  • I realized it's bad design which is why I came on here to ask for a better one. I can't think of a way to tell the database to increment values without abandoning the JSON object structure. – stingray-11 Sep 29 '14 at 20:55
  • nuke it (from orbit), start again –  Sep 29 '14 at 20:56
  • Hmmm. Well there goes about 400KB of code. Still not sure what I should replace it with. – stingray-11 Sep 29 '14 at 20:57
  • @user3068322 A standard database structure? Then the ACID guarantees of an SQL DBMS solve this problem for you. – IMSoP Sep 29 '14 at 20:58
  • I'm using MySQL; I don't know how much more standard it gets. If I had the JSON separated out into a bunch of columns, and even updated each cell immediately when it was updated in memory, it would still be vulnerable to race conditions unless it re-queried the database every time it wanted to update something. – stingray-11 Sep 29 '14 at 21:00
  • @user3068322 The problem is that you're using good, ACID-conformant MySQL in a wrong way. You must not ever store JSON in database: http://en.wikipedia.org/wiki/First_normal_form – polkovnikov.ph Sep 29 '14 at 21:01
  • One of the things you write back is the 'downloads' count. Instead, you have a table in which you add a row for each download; when you need the count, you select the count of the rows - no race condition. –  Sep 29 '14 at 21:02
  • So then ... I should separate the JSON out into columns, and every time I want to change anything in the object, have two separate actions to modify the object in memory *and* increment the cell using MySQL? – stingray-11 Sep 29 '14 at 21:03
  • That sounds like a terrible idea; there would be hundreds of thousands of rows for no purpose other than counting them. – stingray-11 Sep 29 '14 at 21:04
  • "Other than", so one use is not enough? –  Sep 29 '14 at 21:05
  • Correct, I would never implement that especially since we're talking about one specific feature out of the dozens I'm trying to fix. – stingray-11 Sep 29 '14 at 21:06
  • It's impossible to fix a dozen things in one S.O. 'question'; this really is too broad for S.O. Ultimately you have gone down the wrong path and need to start again. Sad, but true. –  Sep 29 '14 at 21:22
  • At its simplest: `UPDATE account SET downloads_in_progress = downloads_in_progress + 1 WHERE account_id = ?` has no race condition; MySQL will promise you that. If you want to select the value out before anything else can also update it, read up on database transactions. By using JSON as described you are basically creating your own database application, and for all MySQL's flaws, you're unlikely to create one better in less than a lifetime. – IMSoP Sep 29 '14 at 21:23
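The delta-update approach the last two comments describe can be sketched in PHP as follows. The `accounts` table, its columns, and the use of an in-memory SQLite database are all invented here purely to make the example self-contained; with MySQL you would use the same `UPDATE` through PDO's MySQL driver.

```php
<?php
// Sketch: let the database apply the increment server-side instead of
// doing read-modify-write in PHP. Table/column names are hypothetical.
function recordDownload(PDO $db, int $accountId): void
{
    // Three concurrent downloads produce three increments, not one,
    // because the addition happens inside the database.
    $stmt = $db->prepare(
        'UPDATE accounts SET download_count = download_count + 1
         WHERE account_id = ?'
    );
    $stmt->execute([$accountId]);
}

// Demo against an in-memory SQLite database:
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE accounts (
               account_id INTEGER PRIMARY KEY,
               download_count INTEGER NOT NULL DEFAULT 0)');
$db->exec('INSERT INTO accounts (account_id) VALUES (1)');

recordDownload($db, 1);
recordDownload($db, 1);
recordDownload($db, 1);

echo $db->query('SELECT download_count FROM accounts WHERE account_id = 1')
        ->fetchColumn(); // 3
```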

2 Answers


It's not a threading issue. Your database doesn't conform to any of the standard rules for designing databases, not even the first normal form: every cell must contain only one value. When you store JSON blobs in the DB, you cannot write a single SQL statement that makes a change like this atomic. So, yes, you need to put that code in a trash bin.

In case you really need to get the existing code working, you can use a mutex to synchronize the running PHP scripts. The most common implementation in PHP is a file-based mutex (e.g. `flock`).
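A related row-level alternative to a process mutex, not mentioned in this answer but common practice, is optimistic locking: keep a version column next to the JSON and make every save conditional on the version the request originally read. The `settings` table, its columns, and the in-memory SQLite database below are invented for illustration:

```php
<?php
// Sketch of optimistic locking (a common pattern; names are hypothetical).
// Each row carries a version number; a save succeeds only if nobody
// else saved in between. On failure: reload, reapply changes, retry.
function saveSettings(PDO $db, int $id, string $json, int $seenVersion): bool
{
    $stmt = $db->prepare(
        'UPDATE settings SET data = ?, version = version + 1
         WHERE id = ? AND version = ?'
    );
    $stmt->execute([$json, $id, $seenVersion]);
    return $stmt->rowCount() === 1; // false => someone saved first
}

// Demo against an in-memory SQLite database:
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE settings (id INTEGER PRIMARY KEY,
                                  data TEXT, version INTEGER NOT NULL)');
$db->exec("INSERT INTO settings VALUES (1, '{}', 0)");

var_dump(saveSettings($db, 1, '{"a":1}', 0)); // bool(true)
var_dump(saveSettings($db, 1, '{"b":2}', 0)); // bool(false): stale version
```

This also covers the long-running maintenance function from the question: a save based on minutes-old data simply fails instead of silently overwriting newer state.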

polkovnikov.ph

You can try to use `flock`. I guess you already have a user id before getting the JSON from the DB.

$lockfile = "/tmp/userlocks/$userid.txt";
$fp = fopen($lockfile, "w+");
if ($fp !== false && flock($fp, LOCK_EX | LOCK_NB)) { // non-blocking: fail fast if locked
    // Do your JSON update

    flock($fp, LOCK_UN); // unlock
    fclose($fp);
} else {
    // lock already held by another request
}

What you need to figure out is what to do when the lock is already held: maybe wait 0.5 seconds and try to obtain the lock again, or send a message like "Only one simultaneous download allowed", or ...
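The retry idea above can be sketched as a small wrapper. The function name, lock path (using `sys_get_temp_dir()` so the example is self-contained), and retry parameters are all arbitrary choices, not part of the original answer:

```php
<?php
// Sketch: retry a non-blocking lock a few times before giving up.
// All names and parameters here are illustrative.
function withUserLock(string $userid, callable $critical, int $attempts = 5): bool
{
    $path = sys_get_temp_dir() . "/userlock-{$userid}.txt";
    $fp = fopen($path, "c"); // create if missing, don't truncate
    if ($fp === false) {
        return false;
    }
    for ($i = 0; $i < $attempts; $i++) {
        if (flock($fp, LOCK_EX | LOCK_NB)) { // don't block; fail fast
            try {
                $critical();                 // e.g. the JSON update
            } finally {
                flock($fp, LOCK_UN);
                fclose($fp);
            }
            return true;
        }
        usleep(500000); // wait 0.5 s, then try again
    }
    fclose($fp);
    return false; // still locked: tell the user to retry later
}
```

A caller would pass the JSON update as the closure and show an error message when `withUserLock` returns `false`.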

BojanT