0

The problem is as follows. I have an array of objects like so:

let myObj = [
{'db1':['doc1','doc2','doc3']},
{'db2':['doc4','doc5']},
{'db3':['doc7','doc8','doc9','doc10']}
]

Note that this is a data structure I decided to use for the problem and can be changed if it can improve the overall implementation. The actual db and doc Ids are read from a text file formatted as below.

"db1","doc1"
"db1","doc2"
...

My app will iterate through the db list synchronously. Inside each db iteration, there will be an asynchronous iteration of the document list. Each document will be retrieved, processed and saved back to the db.

So basically at any given instance: one db, but multiple documents.

I have a working implementation of the above like so:

dbIterator: the synchronous outer loop to iterate dbs. The callback passed to docIterator will trigger the next iteration.

const dbIterator = function (x) {
  if (x < myObj.length) {
    let dbObj = myObj[x];
    let dbId = Object.keys(dbObj)[0];
    docIterator(dbId, dbObj[dbId], ()=>merchantIterator(x+1));
  } else {
    logger.info('All dbs processed');
  }
};

docIterator: the asynchronous loop to iterate docs. The callback cb is called after all documents are processed. This is tracked via the docsProcessed and docsToBeProcessed variables

const docIterator = function(dbId, docIds, cb){
  //create connection
  targetConnection = //some config for connection to dbId
  let docsProcessed = 0;
  let docsToBeProcessed = docIds.length;

  //asynchronous iteration of documents
  docIds.forEach((docId)=>{
    getDocument(docId, targetConnection).then((doc)=>{
      //process document
      processDoc(doc, targetConnection).then(()=>{
        //if processing is successful
        if (++docsProcessed >= docsToBeProcessed) {
          cb();
        }
      })
       //if processing fails
      .catch((e) => {
        logger.error('error when processing document');
        if (++docsProcessed >= docsToBeProcessed) {
          cb();
        }
      });

    }).catch((e)=>{
      logger.error('error when retrieving document: ');
      if (++docsProcessed >= docsToBeProcessed) {
        cb();
      }
    });
  });
};

processDoc: used to process and save an individual document. This returns a promise that gets resolved when the document processing is done which in turn increments docsProcessed and conditionally (docsProcessed >= docsToBeProcessed) calls the call back passed into docIterator

const processDoc = function(doc, targetConnection) {

  return new Promise(function(resolve, reject) {
    if(shouldThisDocBeProcessed(doc){
      let updatedDoc = logic(doc);
      targetConnection.insert(updatedDoc, updatedDoc._id,
        function (error, response) {
          if (!error){
            logger.info('updated successfully');
          } else {
            logger.error('error when saving doc');
          }
          resolve();
        }
      );
    } else {
      resolve();
    }
  })
};

This works as expected but for me this implementation is sub-optimal and messy. I'm pretty sure this can be improved upon and most importantly a chance to better understand and implement solutions to synchronous and asynchronous problems.

I'm open to constructive criticism. So how can this be improved?

Priyath Gregory
  • 927
  • 1
  • 11
  • 37
  • A good place for code reviews is https://codereview.stackexchange.com/ . vote up for the time and effort spent formulating this one, – Patrick Artner Dec 09 '17 at 10:08
  • Is it `docs` or `docIds`? – Bergi Dec 09 '17 at 11:13
  • Since you are working with promises: don't use a callback argument in your `docIterator` function. Return a promise instead! Also you don't need to do all that process counting yourself - just use `Promise.all`. – Bergi Dec 09 '17 at 11:14
  • @Bergi Promise.all will not work in this case as I need to know when all promises complete even if some get rejected. As per my understanding, promise.All does not wait if one promise rejects. Also it is `docIds`. My bad, corrected. – Priyath Gregory Dec 09 '17 at 12:31
  • The code you show has a `myObj.length` or undefined because myObj is an object literal and not an array. – HMR Dec 09 '17 at 14:13
  • I don't understand why you added a new question without answering the questions I had in your other question: https://stackoverflow.com/questions/47725420/iterating-javascript-object-synchronously "Do you want to stop processing if one rejects?" If you don't then here is an example that returns a `Fail` object so you still get all results but know which failed and which succeeded: https://stackoverflow.com/a/47716971/1641941. Combine that with throttle and you should be good to go. – HMR Dec 09 '17 at 14:23
  • @fsociety None of the promises in your loop are getting rejected, they all are produced by a `catch()` call that handles all previous errors. That's exactly [how you would use `Promise.all` to wait for all promises to settle](https://stackoverflow.com/q/31424561/1048572). – Bergi Dec 09 '17 at 16:55
  • @Bergi good point. So am i good to go in that case? Or can there still be a rejection in case of any unseen errors that I have not handled. I'm thinking some sort of an exception that would in turn be rejected by the Promise.all – Priyath Gregory Dec 10 '17 at 04:33
  • @fsociety `.catch` handles *all* rejections. Of course the `all`'d promise could still be rejected if the code within the `catch` handler throws (inadvertently or not) – Bergi Dec 10 '17 at 11:43

1 Answers1

0

Maybe something like this?

An example implementation of throttle can be found here.

//this should be available in both modules so you can filter
const Fail = function(details){this.details=details;};
// docIterator(dbId,docIds)
// .then(
//   results =>{
//     const failedResults = results.filter(
//       result => (result&&result.constructor)===Failed
//     );
//     const successfullResults = results.filter(
//       result => (result&&result.constructor)!==Failed
//     );
//   }
// )

const docIterator = function(dbId, docIds){
  //create connection
  // targetConnection = //some config for connection to dbId
  let docsProcessed = 0;
  let docsToBeProcessed = docIds.length;
  //asynchronous iteration of documents
  docIds.map(
    docId =>
      new Promise(
        (resolve,reject) =>
          //if you use throttled you can do:
          // max10(
          //   ([docId,targetConnection])=>
          //     getDocument(docId,targetConnection)
          // )([docId, targetConnection])
          getDocument(docId, targetConnection)
      )
      .then(
        doc =>
          //if this returns nothing then maybe you'd like to return the document
          processDoc(doc, targetConnection)
          .then(
            _ => doc
          )
      )
      .catch(
        err => new fail([err,docId])
      )

  )
};
HMR
  • 37,593
  • 24
  • 91
  • 160