Safe cross platform coroutines

Question

All coroutine implementations I've encountered use assembly or inspect the contents of jmp_buf. The problem with this is that it is inherently not cross platform.

I think the following implementation doesn't go off into undefined behavior or rely on implementation details. But I've never encountered a coroutine written like this.

Is there some inherent flaw in using long jump with threads?
Is there some hidden gotcha in this code?

#include <setjmp.h>
#include <condition_variable>
#include <mutex>
#include <thread>

class Coroutine
{
public:
   Coroutine( void ) :
      m_done( false ),
      m_thread( [&](){ this->start(); } )
   { }

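   ~Coroutine( void )
   {
      {
         // Release the lock before joining; otherwise the waiting thread
         // cannot reacquire the mutex and join() never returns.
         std::lock_guard<std::mutex> lock( m_mutex );
         m_done = true;
      }
      m_condition.notify_one();

      m_thread.join();
   }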

   void start( void )
   {
      if( setjmp( m_resume ) == 0 )
      {
         std::unique_lock<std::mutex> lock( m_mutex );
         m_condition.wait( lock, [&](){ return m_done; } );
      }
      else
      {
         routine();
         longjmp( m_yield, 1 );
      }
   }

   void resume( void )
   {
      if( setjmp( m_yield ) == 0 )
      {
         longjmp( m_resume, 1 );
      }
   }

   void yield( void )
   {
      if( setjmp( m_resume ) == 0 )
      {
         longjmp( m_yield, 1 );
      }
   }

private:
   virtual void routine( void ) = 0;

   jmp_buf m_resume;
   jmp_buf m_yield;

   bool m_done;
   std::mutex m_mutex;
   std::condition_variable m_condition;
   std::thread m_thread;
};

Co-routines were popular in the previous century. Thoroughly outmoded when processors with multiple cores became common. Unless this is for academic interest, do take advantage of threads and avoid the horror of setjmp(). — Hans Passant, Nov 01 '11 at 23:52
I am not interested in coroutines for the sake of concurrency. They have many useful features and poor man's concurrency is not one of them. [Lua example](http://stackoverflow.com/questions/5128375/what-are-lua-coroutines-even-for-why-doesnt-this-code-work-as-i-expect-it/5128495#5128495), and a [wikipedia reference](http://en.wikipedia.org/wiki/Coroutine#Common_uses_of_coroutines) — deft_code, Nov 01 '11 at 23:58
@Hans Passant -- co-routines definitely won't disappear, no matter how many cores processors have, for the simple fact that context switching is a lot faster, you can have two orders of magnitude more co-routines than threads, and order of execution is sometimes important. — Gene Bushuyev, Nov 02 '11 at 00:06
@Hans Passant -- coroutines used to be implemented on top of threads. E.g. in Java. And Windows coroutines, called "fibers", reside on top of threads. So while Gene Bushuyev has a point, there's far more to coroutines than sheer efficiency. It has to do with predictability of the code. Then more time can be spent on functionality and less on chasing esoteric bugs... — Cheers and hth. - Alf, Nov 02 '11 at 00:14
@Gene, the context switches are exactly the problem. It isn't faster, context switching a co-routine is guaranteed to junk the cpu cache. Not a problem with threads running on multiple cores, each cpu has its own cache. — Hans Passant, Nov 02 '11 at 00:45
@Hans Passant -- I don't think I explained it well. Fibers (co-routines) don't incur a kernel context switch; they run in a single thread, and the context switch is like a long jump. My second point is that since they run in a single thread, they are non-preemptive. Not only is there no locking, races, etc., the order of execution of fibers is guaranteed. They are a basic primitive in event-driven simulators, where ordering of events is essential. They can't be substituted with preemptive threads. — Gene Bushuyev, Nov 02 '11 at 01:17
@HansPassant: I think that there is a confusion between concurrency and parallelism. If you take a peek at the "newish" languages, such as Go or Haskell, you will notice that they have been tailored for concurrency and provide "lightweight" threads of execution. They do not intrinsically increase the parallelism of your application (the maximum parallelism you can get is hardware constrained anyway), but do allow you to define thousands of lightweight tasks that evolve concurrently. IMHO coroutines are meant for concurrency, and *might* be amenable to parallelism, but not necessarily. — Matthieu M., Nov 02 '11 at 08:24
Add Erlang, Elixir, Boost Asio + Boost Coroutine, C# 5.0 etc. — sehe, Jan 24 '14 at 08:48
@GeneBushuyev, I would like to add a correction on "there is no locking, races etc..": you still have races and you still have to lock resources if your usage of the resource will span multiple coroutine switches. So you still have the same issues as with normal threading but without thread switch overhead, without arbitrary preemption and without having to worry about context switch happening between two yields. — Martin, Feb 22 '15 at 22:35
Threads as a primitive "multitasking" tool serve only to confuse the otherwise distinct concepts of "parallel" and "concurrent". They make the assumption that the only reason you will ever need concurrency is to manage parallel workloads. It's quite an ignorant assumption, if you ask me. — James M. Lay, Mar 04 '21 at 22:36
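
For comparison with the question's code, here is a minimal sketch of the thread-backed approach mentioned in the comments, with control handed back and forth using only a mutex and a condition variable and no setjmp/longjmp. `ThreadCoroutine`, `Yield` and `interleave` are hypothetical names chosen for illustration, not part of any library, and the sketch assumes the body does not throw.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical class: a coroutine backed by its own thread. Only one side
// (caller or body) runs at a time; the other is parked on the condition.
class ThreadCoroutine
{
public:
   using Yield = std::function<void()>;

   explicit ThreadCoroutine( std::function<void( const Yield& )> body ) :
      m_body( std::move( body ) )
   { }

   ~ThreadCoroutine()
   {
      if( m_thread.joinable() )
      {
         while( resume() ) { }   // simplification: run a started body to its end
         m_thread.join();
      }
   }

   // Runs the body until its next yield or return; false once it has finished.
   bool resume()
   {
      std::unique_lock<std::mutex> lock( m_mutex );
      if( m_finished ) return false;
      m_bodyTurn = true;
      if( !m_thread.joinable() )
         m_thread = std::thread( [this]{ run(); } );   // started lazily
      m_condition.notify_all();
      m_condition.wait( lock, [this]{ return !m_bodyTurn; } );
      return !m_finished;
   }

private:
   void run()
   {
      {
         std::unique_lock<std::mutex> lock( m_mutex );
         m_condition.wait( lock, [this]{ return m_bodyTurn; } );
      }
      m_body( [this]{ yield(); } );   // yield may be called at any call depth
      std::lock_guard<std::mutex> lock( m_mutex );
      m_finished = true;
      m_bodyTurn = false;
      m_condition.notify_all();
   }

   void yield()
   {
      std::unique_lock<std::mutex> lock( m_mutex );
      m_bodyTurn = false;             // hand control back to the caller
      m_condition.notify_all();
      m_condition.wait( lock, [this]{ return m_bodyTurn; } );
   }

   std::function<void( const Yield& )> m_body;
   std::mutex m_mutex;
   std::condition_variable m_condition;
   bool m_bodyTurn = false;
   bool m_finished = false;
   std::thread m_thread;
};

// Usage: the caller and the body take turns appending to the same log.
std::vector<int> interleave()
{
   std::vector<int> log;
   ThreadCoroutine co( [&log]( const ThreadCoroutine::Yield& yield )
   {
      log.push_back( 1 );
      yield();
      log.push_back( 2 );
   } );
   co.resume();           // body runs up to its first yield
   log.push_back( 10 );   // caller runs while the body is parked
   co.resume();           // body runs to completion
   return log;
}
```

Each switch here costs a real thread handoff, so this keeps the deterministic ordering of coroutines but not the cheap context switch discussed above.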

sehe · Accepted Answer · 2013-05-13T08:54:06.533

UPDATE 2013-05-13 These days there is Boost Coroutine (built on Boost Context, which is not implemented on all target platforms yet, but likely to be supported on all major platforms sooner rather than later).

I don't know whether stackless coroutines fit the bill for your intended use, but I suggest you have a look at them here:

Boost Asio: The Proactor Design Pattern: Concurrency Without Threads

Asio also has a co-procedure 'emulation' model based on a single (IIRC) simple preprocessor macro, combined with some cunningly designed template facilities, that comes eerily close to compiler support for stackless co-procedures.

The sample HTTP Server 4 is an example of the technique.

The author of Boost Asio (Kohlhoff) explains the mechanism and the sample on his Blog here: A potted guide to stackless coroutines

Be sure to look for the other posts in that series!
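
To give a feel for the general idea, here is a rough sketch of a stackless coroutine built from a `switch` statement and `__LINE__`. The macros `CO_BEGIN`, `CO_YIELD` and `CO_END`, and the `Counter` and `drain` names, are made up for this illustration; they are not Asio's actual macros, which are more elaborate.

```cpp
#include <vector>

// Hypothetical macros: the resume point is stored as a line number and
// the switch jumps straight back to it on the next call.
#define CO_BEGIN( state ) switch( state ) { case 0:
#define CO_YIELD( state, value ) \
   do { state = __LINE__; return value; case __LINE__:; } while( 0 )
#define CO_END( state ) } state = -1

struct Counter
{
   int state = 0;   // 0 = not started, -1 = finished, otherwise a line number
   int i = 0;       // "locals" must be members: the stack frame is lost on yield

   // Yields 1, 2, 3 on successive calls, then -1 once exhausted.
   int next()
   {
      CO_BEGIN( state );
      for( i = 1; i <= 3; ++i )
      {
         CO_YIELD( state, i );
      }
      CO_END( state );
      return -1;
   }
};

// Usage: pull values until the coroutine reports that it is done.
std::vector<int> drain()
{
   Counter c;
   std::vector<int> out;
   for( int v = c.next(); v != -1; v = c.next() )
      out.push_back( v );
   return out;
}
```

Because every yield is an ordinary `return`, this form can only yield from the coroutine function itself, not from inside a nested call.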

Wow! Thanks for the read. The use of macros and line numbers combined with a `switch` statement is crafty indeed! — Matthieu M., Nov 02 '11 at 08:20

eonil · Answer 2 · 2014-05-24T08:50:27.097

There is a C++ standard proposal for coroutine support, N3708, written by Oliver Kowalke (who is an author of Boost.Coroutine) and Goodspeed.

I suppose this would eventually be the ultimate clean solution (if it happens…). Because C++ compilers give us no support for switching stacks, coroutines currently need a low-level hack (usually assembly, or setjmp/longjmp), which is outside what C++ itself can express. As a result the implementations are fragile, and need help from the compiler to be robust.

For example, it's really hard to choose the stack size of a coroutine context, and if you overflow the stack, your program will be corrupted silently, or crash if you're lucky. Segmented stacks look like they could help here, but again, that needs compiler-level support.

Once it becomes standard, compiler writers will take care of it. Until that day, Boost.Coroutine is the only practical solution in C++ for me.

In C, there's libtask, written by Russ Cox (who is a member of the Go team). libtask works pretty well, but doesn't seem to be maintained anymore.

P.S. If someone knows how to support a standard proposal, please let me know. I really support this proposal.

I'm using protothreads in C++ as well as in C. In fact, the above coroutines are simply repackaged protothreads macros. — Martin, Feb 22 '15 at 22:44

score 4 · Answer 3 · answered Nov 03 '11 at 18:54

There is no generalized cross-platform way of implementing co-routines. Although some implementations can fudge co-routines using setjmp/longjmp, such practices are not standards-compliant. If routine1 uses setjmp() to create jmp_buf1, and then calls routine2() which uses setjmp() to create jmp_buf2, any longjmp() to jmp_buf1 will invalidate jmp_buf2 (if it hasn't been invalidated already).
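
As a rough illustration of that rule, the sketch below shows the one direction that is well-defined: jumping back to a function that called setjmp() and has not yet returned. `g_buf1` is a hypothetical name standing in for jmp_buf1; `routine1` and `routine2` follow the names used above.

```cpp
#include <csetjmp>

static std::jmp_buf g_buf1;   // stands in for jmp_buf1

static void routine2( int depth )
{
   if( depth > 0 )
      routine2( depth - 1 );
   // Well-defined: routine1, which called setjmp on g_buf1, has not returned,
   // so this simply unwinds the routine2 frames above it.
   std::longjmp( g_buf1, 1 );
}

// Returns true when control came back via longjmp.
bool routine1()
{
   if( setjmp( g_buf1 ) == 0 )
   {
      routine2( 3 );
      return false;   // not reached: routine2 never returns normally
   }
   return true;
}

// Once routine1 has returned, g_buf1 refers to a frame that no longer
// exists, and a later longjmp to it has undefined behavior. Likewise, any
// jmp_buf filled in by routine2 is dead after the jump above. A coroutine
// round trip needs exactly such a jump into an abandoned frame, which is
// why setjmp/longjmp alone cannot express it portably.
```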

I've done my share of co-routine implementations on a wide variety of CPUs; I've always used at least some assembly code. It often doesn't take much (e.g. four instructions for a task-switch on the 8x51) but using assembly code can help ensure that a compiler won't apply creative optimizations that would break everything.

answered Nov 03 '11 at 18:54 by supercat

You do not need longjmp. In fact that is bad and consumes memory. You can implement coroutines as implicit state machines by only using macros. It gives both sequentially looking code and almost no memory overhead. – Martin Feb 22 '15 at 22:46
@Martin: I've done state-machine based pseudo-multi-tasking, but I don't call fixed-entry-point state machines "co-routines". For me to consider a system as supporting "real" co-routines, it must be possible to switch execution contexts from within a subroutine call, making it possible to have methods which block. – supercat Feb 22 '15 at 23:07
@supercat: Can you elaborate a little bit on the exact mechanism by which "any longjmp() to jmp_buf1 will invalidate jmp_buf2"? I think you're just alluding to the fact that jumping back into routine1 will pop routine2's frame such that jumping "back" into routine2 afterwards would be nonsensical; but I've managed to confuse myself on the subject. :) If I understand the above scenario correctly, jumping to jmp_buf1 will invalidate jmp_buf2 by the mechanism I described; but jumping to jmp_buf2 will *not* invalidate jmp_buf1. Right? – Quuxplusone Dec 29 '16 at 23:50
@Quuxplusone: The "mechanism" is that the Standard says it imposes no requirements upon what may happen if code tries to use a jmp_buf after control has left the context in which it is created. Whether or not anything bad actually happens will depend upon many factors, some of which may be predictable and others not. – supercat Dec 30 '16 at 00:22

score 2 · Answer 4 · answered Nov 02 '11 at 01:07

I don't believe you can fully implement co-routines with long jump. Co-routines are natively supported in WinAPI, where they are called fibers; see, for example, CreateFiber(). I don't think other operating systems have native co-routine support. If you look at the SystemC library, in which co-routines are a central part, they are implemented in assembly for each supported platform except Windows. The GBL library also uses co-routines for event-driven simulation, based on Windows fibers. It's very easy to introduce hard-to-debug errors when implementing co-routines and event-driven designs, so I suggest using existing libraries, which are already thoroughly tested and offer higher-level abstractions for dealing with this concept.

Coroutines are not really the domain of the operating system. Rather, they are application specific - and live in the userspace. — Martin, Feb 22 '15 at 22:47
