Is it possible to define a boundary that shouldn't be crossed for the application to scale well regarding task scheduling (over)use?

Questions:

  1. Is there a measurable cost to calling `setTimeout`? Let's say 0.1 ms of CPU time? It is certainly an order of magnitude cheaper than spawning a thread in other environments, but is there any cost at all?
  2. Is it better to avoid using `setTimeout` for micro tasks that take around 1-2 ms?
  3. Is there anything that doesn't cope well with scheduling? For instance, I noticed some sort of IndexedDB starvation for write locks when scheduling Store retrieval and other things.
  4. Can DOM operations be scheduled safely?

I'm asking because I started using Scala.js and Monifu, an Rx implementation that uses scheduling at massive scale. Sometimes one line of code submits around 5 tasks to the event loop's queue, so I'm asking myself: is there anything like a task queue overflow that would slow performance down? This matters especially when running test suites, where hundreds of tasks might be enqueued per second.

This leads to another question: is it possible to list the cases in which one should use a RunNow/Trampoline scheduler and those in which one should use a Queue/Async scheduler with regard to Rx? I wonder about this every time I write something like `obs.buffer(3).last.flatMap{..}`, which itself schedules multiple tasks.

lisak
  • It's certainly better to use `setImmediate` (or a polyfill) instead of `setTimeout` for micro-tasks. `setTimeout` is costly, I did some timing [here](http://stackoverflow.com/q/18826570/1768303). – noseratio Dec 04 '14 at 23:01
  • That's a great benchmark. "The HTML5 specification has gone to the extreme of recommending 250 setTimeout callbacks per second". I think that in certain phases of my application run I might be executing more than that... – lisak Dec 04 '14 at 23:13
  • By design `setTimeout()` has a minimum number of ms before it will be called, so it depends upon whether you want that feature of it or not. I doubt there are queue limits to be worried about as it's a relatively small amount of data to queue an event and many timer implementations share a single system timer among all timers which is possible because they aren't pre-emptive. If you want a micro-task to execute in 1-2ms, then `setTimeout()` is the wrong tool. – jfriend00 Dec 04 '14 at 23:13
  • Since browser JS that can manipulate the DOM is single-threaded and non-preemptive, it is always safe to manipulate the DOM from a timer callback. – jfriend00 Dec 04 '14 at 23:15
  • If you're running hundreds of tasks per second, then you may want more control over the process than JS will ever give you. Perhaps you just want your own task queue that would give you control over execution without any system overhead. All you need for each task is a callback, perhaps some arguments for it and info about when it should run (as soon as its turn comes up or after a certain delay). – jfriend00 Dec 04 '14 at 23:32
  • You may find this article on macro and micro tasks useful: https://github.com/YuzuJS/setImmediate – jfriend00 Dec 04 '14 at 23:50
  • OK, thank you jfriend00. I think that in production there is no way to reach those hundreds of tasks per second, as happens when running the test suite, so I'll wait instead of integrating polyfills. – lisak Dec 05 '14 at 07:44
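
The custom task queue suggested in the comments above could look roughly like the following. This is only a minimal sketch: `createTaskQueue`, `enqueue` and the injectable `schedule` parameter are hypothetical names, not part of any library mentioned in this thread. It drains every task enqueued during one turn with a single timer callback, so the number of `setTimeout` calls no longer grows with the number of tasks.

```javascript
// Hypothetical helper: batches many small tasks into one timer callback.
// `schedule` is injectable so that setImmediate or a polyfill can be swapped in.
function createTaskQueue(schedule = (fn) => setTimeout(fn, 0)) {
  const tasks = [];
  let flushScheduled = false;

  function flush() {
    flushScheduled = false;
    // Take only the tasks present right now; tasks enqueued while
    // flushing are picked up by a later flush.
    const batch = tasks.splice(0, tasks.length);
    for (const { callback, args } of batch) {
      callback(...args);
    }
  }

  return {
    enqueue(callback, ...args) {
      tasks.push({ callback, args });
      if (!flushScheduled) {
        flushScheduled = true;
        schedule(flush); // one timer for the whole batch
      }
    },
    get size() {
      return tasks.length;
    },
  };
}
```

Note that this sketch does no error handling: a task that throws aborts the rest of its batch.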

1 Answer

Some notes about scheduling in Monifu - Monifu tries to collapse asynchronous pipelines, so if the downstream observers are synchronous in nature, then Monifu will avoid sending tasks into the Scheduler. Monifu also does back-pressure, so it controls how many tasks are submitted into the Scheduler, therefore you cannot end up in a situation in which the browser's queue blows up.

For example, something like `Observable.range(0,1000).foldLeft(0)(_+_).map(_ + 10).filter(_ % 2 == 0)` sends only a single task to the scheduler, for starting the initial loop. Beyond that, the whole pipeline is entirely synchronous if the observer is also synchronous, and it should not send any other tasks to the queue. It sends that first task to the queue because it has no idea how large the source will be, and because subscribing to a data-source is usually done in relation to some UI updates that you don't want to block.
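
To make the "single task" idea concrete, here is a plain JavaScript analogy. It is not Monifu code and not how Monifu is implemented: `runPipeline` and its `schedule` parameter are hypothetical, and the range is assumed to exclude its upper bound. The only asynchronous step is the one that starts the loop; folding, mapping and filtering all happen inside that one callback.

```javascript
// Hypothetical analogy of range(0, 1000) -> foldLeft -> map -> filter.
// Only the start of the pipeline is scheduled; every stage after that
// runs synchronously inside the same task.
function runPipeline(onNext, schedule = (fn) => setTimeout(fn, 0)) {
  schedule(() => {
    let sum = 0;
    for (let i = 0; i < 1000; i++) {
      sum += i; // foldLeft(0)(_ + _), assuming an exclusive upper bound
    }
    const mapped = sum + 10; // map(_ + 10)
    if (mapped % 2 === 0) {
      onNext(mapped); // filter(_ % 2 == 0)
    }
  });
}
```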

There are 2 large exceptions:

  1. you're using a data-source that doesn't support back-pressure (like a web-socket connection)
  2. you have a real asynchronous boundary in the receiver (i.e. the observer), which can happen for example when communicating with external services, where the result is a real Future and you don't know when it will complete

Some possible solutions:

  1. if the server communication doesn't support back-pressure, the easiest thing to do is to modify the server to support it - also, normal HTTP requests are naturally back-pressured (i.e. it's as easy as `Observable.interval(3.seconds).flatMap(_ => httpRequest("..."))`)
  2. if that's not an option, Monifu has buffering strategies: you can have an unbounded queue, a queue that triggers a buffer overflow and closes the connection, or buffering that tries to do back-pressure. You can also start dropping new events when the buffer is full, and I'm working on another buffering strategy for dropping older events - all with the purpose of avoiding blown queues
  3. if you're using "merge" on a source of sources that can be unlimited, then don't do that ;-)
  4. if you're doing requests to external services, then try optimizing those - for example, if you want to track the history of events by sending them to a web service, you can group the data and do batched requests
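
The overflow policies described in point 2 can be illustrated with a small bounded buffer. This is a sketch under assumptions, not Monifu's API: `createBoundedBuffer` and the strategy names `'dropNew'`, `'dropOld'` and `'fail'` are made up for illustration, and the back-pressuring variant is left out because it needs cooperation from the producer.

```javascript
// Hypothetical bounded buffer with three overflow policies.
function createBoundedBuffer(capacity, strategy) {
  const items = [];
  return {
    // Returns true if the item was stored, false if it was dropped.
    offer(item) {
      if (items.length < capacity) {
        items.push(item);
        return true;
      }
      switch (strategy) {
        case 'dropNew': // discard the incoming event
          return false;
        case 'dropOld': // evict the oldest event to make room
          items.shift();
          items.push(item);
          return true;
        case 'fail': // signal overflow so the caller can close the connection
          throw new Error('buffer overflow');
        default:
          throw new Error('unknown strategy: ' + strategy);
      }
    },
    // Removes and returns the oldest item, or undefined when empty.
    poll() {
      return items.shift();
    },
    get size() {
      return items.length;
    },
  };
}
```

Whatever the policy, the point is the same: the queue has a hard upper bound, so a fast producer cannot grow it indefinitely.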

BTW - on the issue of browser-side scheduling of tasks, one thing I'm worried about is that Monifu does not break up work efficiently enough. In other words, it probably should break longer synchronous loops into smaller ones, because what's worse than suffering performance issues are latency issues visible in the UI, caused by some loop blocking your UI updates. I would rather have multiple smaller tasks submitted to the Scheduler than one bigger task. In the browser you basically have cooperative multi-tasking: everything is done on the same thread, including UI updates, which means it's a very bad idea to have pieces of work that block this thread for too long.
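
Breaking a long loop into smaller tasks could be sketched as follows. `processInChunks`, its default `chunkSize` of 100 and the injectable `schedule` parameter are assumptions made for illustration, not something Monifu provides. Each chunk runs synchronously, then control goes back to the event loop so that UI updates can run before the next chunk starts.

```javascript
// Hypothetical helper: processes `items` in chunks, yielding to the
// event loop between chunks instead of blocking it with one long loop.
function processInChunks(items, handle, chunkSize = 100,
                         schedule = (fn) => setTimeout(fn, 0)) {
  return new Promise((resolve, reject) => {
    let index = 0;

    function runChunk() {
      try {
        const end = Math.min(index + chunkSize, items.length);
        for (; index < end; index++) {
          handle(items[index], index);
        }
      } catch (err) {
        reject(err);
        return;
      }
      if (index < items.length) {
        schedule(runChunk); // more work left: yield, then continue
      } else {
        resolve();
      }
    }

    schedule(runChunk);
  });
}
```

The trade-off is the one discussed above: more tasks are submitted to the scheduler, but no single task holds the thread for long.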

That said, I'm now in the process of optimizing and paying more attention to the JavaScript runtime. As for `setTimeout`, it is used because it's more standard than `setImmediate`; however, I'll do some work on these aspects.
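
One way to get the benefit of `setImmediate` without giving up the more standard `setTimeout` is to detect it at runtime and fall back. The sketch below is an assumption about how that could be done, not what Monifu does; `pickScheduler` and its `env` parameter are hypothetical names, with `env` defaulting to the global object so that a fake environment can be passed in tests.

```javascript
// Hypothetical helper: prefer setImmediate where the environment (or a
// polyfill) provides it, otherwise fall back to setTimeout(fn, 0).
function pickScheduler(env = globalThis) {
  if (typeof env.setImmediate === 'function') {
    return (fn) => env.setImmediate(fn);
  }
  return (fn) => env.setTimeout(fn, 0);
}
```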

But if you have concrete samples whose performance sucks, please communicate them, as most issues can be fixed.

Cheers,

Alexandru Nedelcu