
I have a small node script that gets run in several node child processes, and I do not have access to the parent process.

The goal of the script is pretty simple: return one element at random from an array. However, the element returned cannot be used by any of the other child processes.

The only solutions I can think of involve using redis or a database, and because this is such a tiny script, I would like to avoid that.

Here is an example of what I would like my code to look like:

var accounts = [acc1, acc2, acc3]

function getAccount() {
  var usedAccounts = sharedStore.get('usedAccounts')
  var unusedAccounts = filter(accounts, usedAccounts)
  var account = getRandomAccount(unusedAccounts)
  usedAccounts.push(account)
  sharedStore.set('usedAccounts', usedAccounts)
  return account
}

So far, the solutions I've thought of don't work because the sibling processes initially all get an empty list assigned to usedAccounts.

bcoop713
  • Does it have to be in node? – Alan Jun 24 '16 at 00:06
  • Same machine? If so, a combination of timers, a "lockfile", and a shared file would work. Albeit as kludge. Use the Timer to acquire the lock on the shared-file, edit the shared file, and release the lock. You'll have to guard against your process dying and not releasing the lock. – Alan Jun 24 '16 at 00:42
  • I've never done anything like that before. How exactly would the timer be used? – bcoop713 Jun 24 '16 at 01:25

2 Answers


There are two problems you need to solve:

  1. How to share data between multiple node processes without using the parent process to marshal data between them.
  2. How to ensure that data is consistent across all the shared processes.

How to share data between multiple node processes.

Given your constraint of not wanting to use an external service (like Redis or another database), and that nodejs doesn't have an easy way to use something like shared memory, a possible solution is to use a file shared between all the processes. Each process can read and write to the shared file, and use it to get its userAccount data.

The file could be JSON formatted and look something like this:

[
      {
       "accountName":"bob",
       "accountUsed":false
      },
      {
       "accountName":"alice",
       "accountUsed":true
      }
]

This would just be an array of userAccount objects, each with a flag that indicates whether the account is already in use.

Your app would:

GetAccountData():

  1. Open the file
  2. Read the file into memory
  3. Iterate over the array
  4. Find the first userAccount that is available
  5. Set the accountUsed flag to true
  6. Write the updated array back to the file
  7. Close the file.

Having multiple processes read and write a single resource is a well-understood concurrency problem, known as the Readers-Writers Problem.

How to ensure that data is consistent across all the shared processes.

To ensure data is consistent, you need to ensure that only one process can run the algorithm from above from start to finish at a time.

Operating systems may provide exclusive locking of a file, but nodejs has no native support for that.

A common mechanism would be to use a lockfile, and use its existence to guard access to the datafile above. If a process can't acquire the lock, it should wait a period of time, then attempt to reacquire it.

To acquire the lock:

  1. Check if the lockfile exists.
     - If the lockfile exists, set a timer (setInterval) to try acquiring the lock again.
     - If the lockfile doesn't exist, create the lockfile.
  2. If the lockfile creation fails (because it already exists--a race condition with another process), set a timer (setInterval) to try acquiring the lock again.
  3. If the lockfile creation succeeds:
     - Do GetAccountData();
     - Remove the lockfile.
This solution should work, but it's not without kludge. Using a synchronizing primitive like a lock can cause your application to deadlock. Also, using a timer to periodically acquire the lock is wasteful, and can cause a race condition if lock creation isn't properly checked.

If your app crashes before it removes the lockfile, then you may create a deadlock situation. To guard against that, you might want to put a final unhandled exception handler to remove the lockfile if it was created by the process.

You will also need to make sure you only hold the lock long enough to do your serial work. Holding the lock for longer will cause performance issues and increase the likelihood of a deadlock.

Alan

I would rather let each process have its own flat file that only it writes. Each process can then read all the files written by all the processes, concurrently or otherwise, which obviates the need for a lockfile. You will have to figure out the logic so that each process writes only its own file, but reading all these files together yields the source of truth.