
I have been reading a lot about fibers, or green threads, or whatever other name we give to userland threads. I started with documentation and tutorials (these are C++ links, but I don't need a specific language):

However, it seems I cannot grasp the essentials of fibers. I know that fibers are a way to multitask cooperatively, but documentation about the interplay between threads and fibers in actual cases is, as far as I can find, scarce.

What are some practical use cases of fibers?

For instance, every doc uses async I/O as an example, but what if I don't have I/O-bound problems? What if my problem is counting words in a huge file? Here I would just split the file among threads; can fibers help somehow? I suppose that CPU-bound computations such as numerical problems (e.g., matrix/vector operations) are not suitable for fibers, but again, I might be completely wrong.

senseiwa
  • If you have a huge number of very small computations, then fibers might be a good candidate, because you don't pay the cost of managing a native thread at the beginning and end of each of those computations. – Theodoros Chatzigiannakis Apr 30 '18 at 07:54
  • `Here I would just split the file among threads` That would make your I/O slower on a mechanical disk, since the head now has to jump around the disk rather than read sequentially. Even on an SSD, this is still I/O bound, since the processing is that much faster than the disk access. – UKMonkey Apr 30 '18 at 09:07
  • Yes, but when counting, the big hurdle I find is reallocations, which usually trump I/O in my experience. It was just an example; I just want to focus on fibers. – senseiwa Apr 30 '18 at 09:20
  • From your first link: "Two fibers on the same kernel thread will not run simultaneously on different processor cores." This would seem to preclude any benefit from splitting the file among fibers. By definition you won't get any benefit from multiple cores. – ttemple Apr 30 '18 at 10:20
  • Seems like the best use cases would be collecting data from I/O and queuing the data for use by consumers. Apparently there is no need to mutex between fibers because only one fiber at a time will run. This would certainly make certain problems easier to solve than using threads and managing the data locking properly between producers and consumers. – ttemple Apr 30 '18 at 10:24
  • @ttemple, re "no need to mutex between fibers..." True, but, "...because only one fiber at a time will run." Not true. The reason why you don't need mutexes is, Boost Fibers implement _cooperative multitasking_: The scheduler can switch to a different fiber only when the running fiber _yields_ control---typically by making a library call that waits for something. A program that uses _preemptive multitasking_ (i.e., threads) _does_ need mutexes even on a host with just one CPU because the scheduler can force a context switch at any time (e.g., when one thread is half done updating some data.) – Solomon Slow May 03 '18 at 11:54
  • From N2024: "Two fibers in the same thread cannot execute simultaneously." Sounds to me like only one fiber can run at a time (on the same thread). The complete quote: "Two fibers in the same thread cannot execute simultaneously. This can greatly simplify sharing data between such fibers: it is impossible for two fibers in the same thread to race each other. Therefore, within the domain of a particular thread, it is not necessary to lock shared data." Perhaps my paraphrase wasn't clear. Sorry. – ttemple May 03 '18 at 14:46
  • In modern times, fibers are useless. You can use idioms like async/await or goroutine-like functions (which the compiler takes care of for you to begin with) rather than mess with fibers. Here's a question: let's say a thread owns two fibers and one fiber has locked a lock. When you jump to the second fiber and it tries to lock the same lock, what should happen? Consider the lock already acquired? Deadlock? – David Haim May 06 '18 at 12:38

2 Answers


what if my problem is counting words in a huge file? ..., can fibers help somehow?

No.

every doc actually uses async I/O as an example

Async I/O is the problem that threads were originally meant to solve, back when multi-CPU systems had not yet escaped from the laboratory. Threads were an alternative way to structure a program that had to wait for input from several different, non-synchronized sources, and had to respond to those inputs in a timely fashion.

Depending on how they were implemented, threads back in those days could be anywhere on a scale from "mostly the same as" to "completely identical with" what we call "green threads" or "fibers" today.

When multi-CPU systems hit the market, threading was seen as a natural and obvious way to exploit the parallel processing capabilities.

Solomon Slow

Fibers are meant to have lower overhead on creation and context switching than OS threads. So in theory, if you have a solution where you have lots of blocking on locks, you may see a performance improvement from fibers because the OS threads on which the fibers run will use more of their allotted runtime. This is because when a fiber blocks on a fiber mutex/lock, the underlying OS thread will invoke a fiber scheduler which will run a different fiber, all without doing an OS-thread context switch. This is the basic idea behind M:N threading models.
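As a rough illustration of that scheduling idea, here is a minimal sketch that uses Python generators as a stand-in for fibers. This is an assumption made for illustration only: generators are stackless coroutines, not true fibers, and the names `FiberLock`, `Scheduler`, `worker`, and `demo` are hypothetical, not part of any library. When a "fiber" asks for a lock that is already held, the scheduler parks it and runs another one, and everything stays on a single OS thread.

```python
import threading
from collections import deque


class FiberLock:
    """Hypothetical lock for generator-based 'fibers'; never blocks the OS thread."""

    def __init__(self):
        self.owner = None
        self.waiters = deque()  # parked (name, fiber) pairs


class Scheduler:
    """Round-robin scheduler that runs every fiber on the calling thread."""

    def __init__(self):
        self.ready = deque()

    def spawn(self, name, fn, *args):
        self.ready.append((name, fn(name, *args)))

    def run(self):
        while self.ready:
            name, fiber = self.ready.popleft()
            try:
                request, lock = next(fiber)  # resume until the next yield
            except StopIteration:
                continue  # fiber finished
            if request == "lock":
                if lock.owner is None:
                    lock.owner = name
                    self.ready.append((name, fiber))
                else:
                    # Lock is taken: park this fiber and run someone else.
                    lock.waiters.append((name, fiber))
            elif request == "unlock":
                lock.owner = None
                if lock.waiters:
                    waiter_name, waiter = lock.waiters.popleft()
                    lock.owner = waiter_name  # hand the lock over directly
                    self.ready.append((waiter_name, waiter))
                self.ready.append((name, fiber))
            else:  # plain cooperative yield
                self.ready.append((name, fiber))


def worker(name, lock, log, threads_seen):
    yield ("lock", lock)
    threads_seen.add(threading.get_ident())
    log.append(f"{name} acquired")
    yield ("yield", None)  # give up the CPU while still holding the lock
    log.append(f"{name} releasing")
    yield ("unlock", lock)


def demo():
    lock, log, threads_seen = FiberLock(), [], set()
    scheduler = Scheduler()
    scheduler.spawn("A", worker, lock, log, threads_seen)
    scheduler.spawn("B", worker, lock, log, threads_seen)
    scheduler.run()
    return log, len(threads_seen)
```

In this sketch, `demo()` runs two workers that contend for one lock. The second worker is parked rather than blocking the thread, and the scheduler resumes it only once the lock is handed over.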

Another case would be if you need to create and destroy threads with great frequency or in large numbers. Because fibers are faster to create and typically more lightweight than OS threads, you can use them in much larger numbers, and for much finer-grained parallelism (in theory).
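To give a feel for "large numbers", the following sketch creates many tiny cooperative tasks and runs them round-robin on one thread. It again assumes Python generators as a stand-in for fibers, and `tiny_task` and `run_all` are hypothetical names. It makes no claim about timing; it only shows that the unit of work can be very small.

```python
from collections import deque


def tiny_task(n):
    """A very small computation split into two cooperative steps."""
    partial = n * n
    yield  # hand control back to the scheduler
    return partial + 1


def run_all(count):
    """Create `count` tasks and run them round-robin until all finish."""
    ready = deque(tiny_task(n) for n in range(count))
    results = []
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task up to its next yield
            ready.append(task)  # not finished yet: queue it again
        except StopIteration as done:
            results.append(done.value)  # the task's return value
    return results
```

Calling `run_all(100_000)` creates one hundred thousand such tasks. Doing the same with one OS thread per task would typically be far more expensive, which is the trade-off described above.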

One practical application is large agent-based simulations using the actor model. With fibers, each agent/actor can run on its own fiber.
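A minimal sketch of what that could look like, assuming Python generators as a stand-in for fibers and a toy ring topology (the `Actor` class and `simulate` function are hypothetical, invented for this example): each actor has a mailbox and its own generator, handles whatever mail it has, and then yields so the other actors can run.

```python
from collections import deque


class Actor:
    def __init__(self, ident):
        self.ident = ident
        self.mailbox = deque()
        self.handled = 0      # number of messages this actor processed
        self.neighbor = None  # next actor in the ring

    def behavior(self):
        """Runs as this actor's own 'fiber': handle mail, then yield."""
        while True:
            while self.mailbox:
                hops_left = self.mailbox.popleft()
                self.handled += 1
                if hops_left > 0:
                    # Forward the message with one hop fewer.
                    self.neighbor.mailbox.append(hops_left - 1)
            yield  # nothing left to do; let other actors run


def simulate(num_actors, hops):
    actors = [Actor(i) for i in range(num_actors)]
    for i, actor in enumerate(actors):
        actor.neighbor = actors[(i + 1) % num_actors]
    fibers = [actor.behavior() for actor in actors]
    actors[0].mailbox.append(hops)  # inject the first message
    # Round-robin over all actors until no mailbox holds a message.
    while any(actor.mailbox for actor in actors):
        for fiber in fibers:
            next(fiber)
    return [actor.handled for actor in actors]
```

Here `simulate(4, 10)` passes one message around a ring of four actors for ten hops and returns how many messages each actor handled. No locking is needed on the mailboxes because only one actor runs at a time and it gives up control only at its `yield`.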

Brandon Kohn