Mutex vs. standard function call

Question

When I have a block of code like this:

    #include <iostream>
    #include <mutex>
    #include <thread>
    using namespace std;

    mutex mtx;

    void hello(){
        mtx.lock();
        for(int i = 0; i < 10; i++){
           cout << "hello";
        }
        mtx.unlock();
    }
    void hi(){
        mtx.lock();
        for(int i = 0; i < 10; i++){
           cout << "hi";
        }
        mtx.unlock();
    }

    int main(){
       thread x(hello);
       thread y(hi);
       x.join();
       y.join();
    }

What is the difference between this and just calling `hello()` and `hi()` directly, like so?

    ...
    int main(){
       hello();
       hi();
    }

Are threads more efficient? The purpose of thread is to run at the same time, right?

Can someone explain why we use mutexes within thread functions? Thank you!

The whole thread code is encapsulated in a locking mechanism that prevents concurrent execution, so in this very specific case threads are not more efficient, since they are forced to execute sequentially. You pay the additional price of instantiating and joining the threads, which you wouldn't by simply calling the functions. – Patrick Trentin Mar 15 '16 at 23:27

score 1 · Answer 1 · answered Mar 15 '16 at 23:23

> The purpose of thread is to run at the same time, right?

Yes, threads are used to perform multiple tasks in parallel, especially on different CPUs.

> Can someone explain why we use mutexes within thread functions?

To serialize multiple threads with each other, such as when they are accessing a shared resource that is not safe to access concurrently and needs to be protected.

answered Mar 15 '16 at 23:23

Remy Lebeau

By shared resource, do you mean an object such as an integer, a char, etc.? – Ricky Mar 15 '16 at 23:35
Anything that the threads share with each other. It could be variables, or hardware resources, or files, etc. – Remy Lebeau Mar 15 '16 at 23:37

score 1 · Answer 2 · answered Mar 15 '16 at 23:25

Threads have at least two advantages over purely serial code.

1. Convenience in separating logically independent sequences of instructions. This is true even on a single core machine. This gives you logical concurrency without necessarily parallelism.
   - Having multiple threads allows either the operating system or a user-level threading library to multiplex multiple logical threads over a smaller number of CPU cores, without the application developer having to worry about other threads and processes.
2. Taking advantage of multiple cores / processors. Threads allow you to scale your execution to the number of CPU cores you have, enabling parallelism.

Your example is a little contrived because the entire thread's execution is locked. Normally, threads perform many actions independently and only take a mutex when accessing a shared resource.

More specifically, in your scenario you would not gain any performance. However, if the entire thread body were not under a mutex, you could potentially gain efficiency. I say potentially because running multiple threads has overheads that may offset any efficiency gain you obtain.
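To illustrate the "only take a mutex when accessing a shared resource" pattern, here is a minimal sketch. The function name `parallel_sum` and the idea of summing a vector are assumptions made for illustration, not part of the question: each thread does its own work without any lock and holds the mutex only for the brief update of the shared total.

```cpp
#include <algorithm>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: each thread sums its own slice with no lock held,
// then takes the mutex only to add its result to the shared total.
long long parallel_sum(const std::vector<int>& data, unsigned thread_count) {
    if (thread_count == 0) thread_count = 1;
    std::mutex total_mtx;   // protects `total` and nothing else
    long long total = 0;    // the shared resource
    std::vector<std::thread> workers;
    const std::size_t chunk = (data.size() + thread_count - 1) / thread_count;

    for (unsigned t = 0; t < thread_count; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        workers.emplace_back([&data, &total, &total_mtx, begin, end] {
            long long local = 0;  // independent work, no lock needed
            for (std::size_t i = begin; i < end; ++i) local += data[i];
            std::lock_guard<std::mutex> lock(total_mtx);  // short critical section
            total += local;
        });
    }
    for (auto& w : workers) w.join();
    return total;
}
```

Calling `parallel_sum(values, 4)` splits the work across four threads. Whether that beats a plain loop depends on the data size and on thread start-up cost, which is the overhead mentioned above.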

edited Mar 17 '16 at 18:47

answered Mar 15 '16 at 23:25

merlin2011

Concurrency and parallelism are related but not interchangeable. The question is about parallelism. E.g. I separate logically independent sequences of instructions by writing functions. It is very convenient. – knivil Mar 15 '16 at 23:30
@knivil, Parallelism is simultaneous execution, while concurrency is logically running threads that are simply interleaved. The difference is described [here](http://stackoverflow.com/questions/1050222/concurrency-vs-parallelism-what-is-the-difference). – merlin2011 Mar 15 '16 at 23:32
Downvoter please correct this answer. I'm interested in learning what I'm missing. – merlin2011 Mar 15 '16 at 23:45
A lot of people mix up threads with "tasks"; introducing logical threads or logical concurrency does not improve the situation. In the end you confuse yourself: locked execution and independent instruction sequences exclude each other. Yes, you mention it. Also, the assumption that you gain efficiency is questionable. – knivil Mar 15 '16 at 23:53
@knivil, I addressed the last point, although I'm not sure how to make the first point more clear, given how much confusion there already is on this topic on the internet. – merlin2011 Mar 16 '16 at 00:39

score 1 · Answer 3 · answered Mar 16 '16 at 16:40

> Are threads more efficient?

No. But see final note (below).

On a single core, threads are much, much less efficient (than function/method calls).

As one example, on my Ubuntu 15.10 (64-bit) system, using g++ v5.2.1:

a) a context switch (from one thread to another) forced by use of std::mutex takes about 12,000 nanoseconds;

b) but invoking 2 simple methods, for instance std::mutex lock() and unlock(), takes < 50 nanoseconds. That is more than two orders of magnitude, so context switch vs. function call is no contest.
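For anyone who wants to reproduce the cheaper of the two numbers, a rough sketch of timing an uncontended lock()/unlock() pair might look like the following. The function name `uncontended_lock_ns` is hypothetical, this is not the author's original test code, and it assumes `iterations` is positive.

```cpp
#include <chrono>
#include <mutex>

// Hypothetical micro-benchmark: average cost, in nanoseconds, of one
// lock()/unlock() pair on a std::mutex that no other thread is using.
double uncontended_lock_ns(int iterations) {
    std::mutex mtx;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        mtx.lock();
        mtx.unlock();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / iterations;
}
```

The result varies with hardware, compiler, and optimization level, so treat it as an order-of-magnitude indication only.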

> The purpose of thread is to run at the same time, right?

Yes ... but this cannot happen on a single-core processor.

And on a multi-core system, context switch time can still dominate.

For example, my Ubuntu system is dual core. The context switch time I reported above was measured with a chain of 10 threads, where each thread simply waits for its input semaphore to be unlock()'d. When a thread's input semaphore is unlocked, the thread gets to run, but its brief activity is simply to 1) increment a count and check a flag, 2) unlock() the next thread, and 3) lock() its own input mutex, i.e. wait again for the previous task's signal. In that test, the thread we know as main starts the thread sequencing with an unlock() of one of the threads, and stops it with a flag that all threads can see.
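A rough sketch of such a thread chain is shown below. It is not the original test: unlocking a `std::mutex` from a thread that does not own it is undefined behavior, so this version swaps in a small hand-written semaphore built on `std::condition_variable`. It also stops after a fixed number of hand-offs instead of having main set the flag. The names `Semaphore`, `RingResult`, and `run_ring` are all hypothetical, and `thread_count` and `target_hops` are assumed to be at least 1.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore (hypothetical helper, not from the answer).
class Semaphore {
public:
    void post() {
        { std::lock_guard<std::mutex> lock(m_); ++count_; }
        cv_.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_ = 0;
};

struct RingResult {
    long hops;          // hand-offs actually performed
    double ns_per_hop;  // average wall-clock time per hand-off
};

// Pass a token around a ring of threads until `target_hops` hand-offs have
// happened, then let the stop flag travel once around the ring.
RingResult run_ring(int thread_count, long target_hops) {
    std::vector<Semaphore> sems(thread_count);
    long hops = 0;      // touched only by the thread holding the token
    bool done = false;  // likewise; the semaphore hand-off orders the accesses
    std::vector<std::thread> threads;

    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([&, i] {
            Semaphore& mine = sems[i];
            Semaphore& next = sems[(i + 1) % thread_count];
            for (;;) {
                mine.wait();                 // block until signalled
                const bool stop = done;      // check the flag
                if (!stop && ++hops >= target_hops) done = true;
                next.post();                 // wake the next thread
                if (stop) return;
            }
        });
    }

    const auto start = std::chrono::steady_clock::now();
    sems[0].post();                          // main starts the sequence
    for (auto& t : threads) t.join();
    const std::chrono::duration<double, std::nano> elapsed =
        std::chrono::steady_clock::now() - start;
    return {hops, elapsed.count() / hops};
}
```

`run_ring(10, 100000)` gives a per-hand-off time that includes the wake-up of the next thread, which is where the context switch cost shows up.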

During this measurement activity (about 3 seconds), the Linux system monitor shows both cores are involved, and reports both cores at about 60% utilization. I expected both cores at 100%; I don't know why they are not.

> Can someone explain why we use mutexes within thread functions? Thank you!

I suppose the most conventional use of a std::mutex is to serialize access to a memory structure (perhaps a shared-access storage or structure). If your application has data accessible by multiple threads, each write access must be serialized to prevent race conditions from corrupting the data. Sometimes, both read and write access need to be serialized. (See the dining philosophers problem.)

In your code, as an example (although I do not know what system you are using), it is possible that std::cout (a shared structure) will 'interleave' text. That is, a thread context switch might happen in the middle of printing a "hello", or even a 'hi'. This behaviour is usually undesired, but might be acceptable.

A number of years ago, I worked with vxWorks and my team learned to use mutexes on access to std::cout to eliminate that interleaving. Such behavior can be distracting, and generally, customers do not like it. (Ultimately, for that app, we did away with the use of the std trio-io (cout, cerr, cin).)
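A minimal sketch of guarding a shared output stream so that each thread's text comes out as one uninterrupted line is given here. The names `out_mtx`, `print_line`, and `print_both` are hypothetical, and a `std::ostringstream` stands in for `std::cout` so the result can be inspected.

```cpp
#include <functional>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>

std::mutex out_mtx;  // hypothetical: one mutex guarding the shared stream

// Write `word` `times` times as a single line while holding the mutex,
// so another thread's output cannot land in the middle of it.
void print_line(std::ostream& out, const std::string& word, int times) {
    std::lock_guard<std::mutex> lock(out_mtx);
    for (int i = 0; i < times; ++i) out << word;
    out << '\n';
}

// Run two printers concurrently against the same stream and return the text.
std::string print_both() {
    std::ostringstream out;  // stand-in for std::cout
    std::thread x(print_line, std::ref(out), "hello", 3);
    std::thread y(print_line, std::ref(out), "hi", 3);
    x.join();
    y.join();
    return out.str();
}
```

The order of the two lines still depends on which thread gets the mutex first. The lock only guarantees that each line stays intact.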

Devices of various kinds also might not function properly if you allow more than 1 thread to attempt operations on them 'simultaneously'. For example, I have written software for a device that required 50 µs or more to complete its reaction to my software's 'poke' before any additional action should be applied to the device. The device simply ignored my code's actions without the wait.

You should also know that there are techniques that do not involve semaphores, but instead use a thread and an IPC to provide serialized (i.e. protected) resource access.

From wikipedia, "In concurrent programming, a monitor is a synchronization construct that allows threads to have both mutual exclusion and the ability to wait (block) for a certain condition to become true."

When the OS provides a suitable IPC, I prefer to use a Hoare monitor. In my interpretation, the monitor is simply a thread that accepts commands over the IPC, and is the only thread to access the shared structure or device. When only 1 thread accesses a structure, NO mutex is needed. All other threads must send a message (via IPC) to request (or perhaps command) a structure change. The monitor thread handles one request at a time, sequentially, as they come out of the IPC.
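The monitor-thread idea can be sketched as follows, under some loud assumptions: the `Mailbox` class is a hypothetical in-process stand-in for an OS-provided IPC (and it uses a mutex internally, unlike a real OS message queue seen from user code), and `kStop` and `run_monitor` are illustrative names. Only the monitor thread ever touches `log`, so `log` itself needs no mutex.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for an OS message queue.
class Mailbox {
public:
    void send(int msg) {
        { std::lock_guard<std::mutex> lock(m_); q_.push_back(msg); }
        cv_.notify_one();
    }
    int receive() {  // blocks until a message is available
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        const int msg = q_.front();
        q_.pop_front();
        return msg;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<int> q_;
};

constexpr int kStop = -1;  // hypothetical sentinel telling the monitor to exit

// Client threads send their id as a request; the monitor thread is the
// only one that modifies `log`, handling one request at a time.
std::vector<int> run_monitor(int clients, int per_client) {
    Mailbox box;
    std::vector<int> log;  // shared structure, accessed by the monitor only
    std::thread monitor([&] {
        for (int msg = box.receive(); msg != kStop; msg = box.receive())
            log.push_back(msg);  // no mutex: single accessor
    });
    std::vector<std::thread> senders;
    for (int c = 0; c < clients; ++c)
        senders.emplace_back([&box, c, per_client] {
            for (int i = 0; i < per_client; ++i) box.send(c);
        });
    for (auto& s : senders) s.join();
    box.send(kStop);  // all requests are queued ahead of the stop message
    monitor.join();
    return log;
}
```

The design trades lock contention on the structure for message traffic: clients never block on the structure itself, only briefly on the queue.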

Definition: collision

In the context of 'thread context switch' and 'mutex semaphores', a 'collision' occurs when a thread must block and wait for access to a resource, because that resource is already 'in use' (i.e. 'occupied'). This is a forced context switch. See also the term "critical section".

When the shared resource is NOT currently in use, there is no collision. The lock() and unlock() cost almost nothing (by comparison to a context switch).

When there is a collision, the context switch slows things down by a 'bunch'. But this 'bunch' might still be acceptable ... consider when 'bunch' is small compared to the duration of the activity inside the critical section.

Final note ... With this new idea of 'collision':

a) Multiple threads can be far less efficient in the face of many collisions.

As an unexpected example, the operator 'new' accesses a thread-shared resource we can call "dynamic memory". In one experience, each thread performed thousands of new calls at start-up. One thread could complete that effort in 0.5 seconds. Four threads, started quickly back-to-back, took 40 seconds to complete the 4 start-ups. Context switches!

b) Multiple threads can be more efficient when you have multiple cores and no or few collisions. Essentially, if the threads seldom interact, they can run (mostly) simultaneously.

With multiple cores and some collisions, thread efficiency can fall anywhere between a) and b).

For instance, my RAM-based "log" mechanism seems to work well with one mutex access per log entry. Generally, I intentionally used minimal logging. When debugging a 'discovered' challenge, I added additional logging (maybe later removed) to determine what was going wrong. Generally, the debugger is better than a general logging technique, but sometimes adding several log entries worked well.

score 0 · Answer 4 · edited Apr 20 '16 at 20:33

Threads theoretically run simultaneously, which means that threads could write to the same memory block at the same time. For example, if you have a global variable int i; and two threads try to write different values to it at the same time, which value ends up in i?

A mutex forces synchronized access to memory. Inside a mutex block (between mutex.lock() and mutex.unlock()) you guarantee synchronized memory access and avoid memory corruption.

When you call mtx.lock(), JUST ONE THREAD KEEPS RUNNING, and any other thread calling lock() on the same mtx stops, waiting for the mtx.unlock() call.
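As a small illustration of protecting a shared integer, here is a sketch with a hypothetical function `count_with_mutex`. Without the lock, the concurrent `++counter` would be a data race and the final value would be unpredictable. With it, every increment is applied.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical demo: several threads increment one shared counter,
// taking the mutex around each increment.
long count_with_mutex(int thread_count, int increments) {
    std::mutex mtx;
    long counter = 0;  // shared by all worker threads
    std::vector<std::thread> workers;
    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments; ++i) {
                std::lock_guard<std::mutex> lock(mtx);  // one thread at a time
                ++counter;
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

`count_with_mutex(4, 10000)` returns 40000 on every run. Threads that never touch `mtx` are unaffected by it and keep running.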

edited Apr 20 '16 at 20:33

Mogsdad

answered Mar 15 '16 at 23:22

Ing. Gerardo Sánchez

When calling `mtx.lock()`, only threads that also call `lock()` on the same `mtx` object will be blocked until `unlock()` is called. Other threads will happily keep running unblocked. – Remy Lebeau Mar 15 '16 at 23:40
