
Apologies if the tags are off; I may have misunderstood my problem and tagged it wrongly, but..

This problem is new to me and I have never run into it before. I have a huge result set from a database (MongoDB, 100,000+ docs), and I need to make an HTTP request for a specific field of every doc.

An example of the docs in the dataset:

{
    _id: 1,
    http: http.request.me
},
{
    // ...plus about 99k more docs like this one
}

As you have probably guessed, I cannot use a plain for loop, because:

  1. if it is async, I will fire a huge number of requests at the API at once and get banned/restricted/whatever
  2. if I make the requests one by one, it takes about 12-23 hours before the loop completes (this is actually what I use right now)

This is what I'm trying to do right now:

  1. there is also another way, and that's why I'm here. I could split my huge array into chunks of, for example, 5/10/100..N items and process the chunks one by one, sending the requests inside each chunk together

    │→await[request_map 0,1,2,3,4]→filled
    │→await[request_map 5..9]→filled
    │→await[request_map n..n+4]→filled
    ↓
    

According to the question "Split array into chunks", I could do this easily. But then I need two for loops: the first one splits the original array into chunks, and the second one sends the async requests for each chunk (of length 5/10/100...N).
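The two-loop approach described above could be sketched roughly like this in plain JavaScript, without RxJS. This is only a minimal sketch: `chunk`, `requestInChunks`, `fakeRequest` and `docs` are hypothetical names, and `fakeRequest` is a stand-in for the real HTTP call, assumed to return a Promise.

```javascript
// Split an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Process the chunks one after another; requests inside a chunk run concurrently.
// `requestFn` is a hypothetical function that takes one doc and returns a Promise.
async function requestInChunks(docs, size, requestFn) {
  const results = [];
  for (const batch of chunk(docs, size)) {
    // Wait for the whole batch to finish before starting the next one.
    const batchResults = await Promise.all(batch.map((doc) => requestFn(doc)));
    results.push(...batchResults);
  }
  return results;
}

// Stand-in for a real HTTP call (assumption): resolves after a short delay.
const fakeRequest = (doc) =>
  new Promise((resolve) => setTimeout(() => resolve(`response for ${doc._id}`), 10));

const docs = Array.from({ length: 12 }, (_, i) => ({ _id: i + 1 }));
requestInChunks(docs, 5, fakeRequest).then((results) => console.log(results.length)); // prints 12
```

Note that `Promise.all` rejects as soon as one request in a batch fails, so a real version would probably need per-request error handling.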

But I have recently heard about the reactive paradigm and RxJS, which could (probably) solve this. Is that right? Which operator should I use? What keywords should I search for to find related problems? (If I google "reactive programming", I get a lot of useless results about React.js instead of what I want.)

So should I stop worrying about all this and just write unoptimized code, or is there an npm module for this, or another, better pattern/solution?

I have probably found an answer here: "RxJS 1 array item into sequence of single items - operator". I'm checking it now, but I would also appreciate any relevant contribution to this question.


RxJS has been truly helpful in this case and is worth looking into. It is an elegant solution for this kind of problem.

AlexZeDim
  • Just to clarify, is the entire array of requests in-memory, or do you need to query a batch of 5 requests from your database each time? – Patrick Roberts Mar 15 '19 at 18:37
  • Yes, the entire array with the necessary URLs is in memory, and I just need to request batches of 5 (n) items from it, one batch after another. As far as I understand, RxJS is a good fit, but I still don't understand how to use it with `http` (dividing the array the way I want is very easy, though) – AlexZeDim Mar 15 '19 at 18:43

2 Answers


Make use of bufferCount and concatMap

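import { range, forkJoin } from 'rxjs';
import { map, bufferCount, concatMap } from 'rxjs/operators';

range(0, 100).pipe(
    // turn each item into an observable for its http call, without executing it yet
    map(res => http(...)),
    // group the pending calls into batches of 5
    bufferCount(5),
    // run the calls of a batch concurrently; start the next batch only after the current one completes
    concatMap(batch => forkJoin(batch))
).subscribe(console.log)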
Fan Cheung
  • For now I'm using slightly different code from a tutorial, and it was non-trivial to understand how `import` works in Node/CommonJS. But I have just checked this one, and right after `bufferCount(5)` I get exactly what I want. I'd also like to add https://rxmarbles.com to the list of things that helped me understand this. – AlexZeDim Mar 16 '19 at 09:35
  • Glad it helped solve the problem – Fan Cheung Mar 16 '19 at 09:40

There's actually an even easier way to do what you want: the mergeMap operator and its optional second argument, which sets the maximum number of concurrent inner Observables:

import { from } from 'rxjs';
import { mergeMap } from 'rxjs/operators';

from([obj1, obj2, obj3, ...]).pipe(
  mergeMap(obj => /* make a request out of `obj` */, 5), // at most 5 requests in flight at any time
).subscribe(result => ...)
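To make the behavior of a concurrency limit more concrete, here is a dependency-free sketch of the same idea in plain JavaScript. It is not how RxJS implements `mergeMap`; `mapWithConcurrency`, `fakeCall` and `sampleDocs` are hypothetical names, and `fakeCall` stands in for a real request that returns a Promise. Unlike fixed batches, a new request starts as soon as any running one finishes, so one slow response does not hold up the rest of its batch.

```javascript
// Run `task` over `items` with at most `limit` calls in flight at any time.
// `task` is a hypothetical function that returns a Promise.
async function mapWithConcurrency(items, limit, task) {
  const results = new Array(items.length);
  let next = 0; // index of the next item that has not been started yet

  // Each worker pulls the next unstarted item as soon as its previous call settles.
  async function worker() {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index], index);
    }
  }

  const workerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // stored by input index, so the order matches `items`
}

// Stand-in for a real HTTP call (assumption): resolves after a short delay.
const fakeCall = (doc) =>
  new Promise((resolve) => setTimeout(() => resolve(doc._id), 10));

const sampleDocs = Array.from({ length: 12 }, (_, i) => ({ _id: i + 1 }));
mapWithConcurrency(sampleDocs, 5, fakeCall).then((results) => console.log(results.length)); // prints 12
```

One difference worth keeping in mind: this sketch returns results in input order, while `mergeMap` emits each result as soon as its inner Observable produces it.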
martin