1

I am writing software in C/C++ that makes heavy use of BIAS/Profil, an interval algebra library. In my algorithm, a master divides a domain and feeds parts of it to one or more slave processes. Each of those returns an int status for its part of the domain. The only shared data is read-only, and that's it.

I need to parallelize my code. However, as soon as two (or, I guess, more) slave threads are running and both call functions of this library, it segfaults. What is peculiar about those segfaults is that gdb rarely reports the same faulting line from one build to the next: it depends on the speed of the threads, whether one started earlier, etc. I've tried having the threads yield until they get a go-ahead from the master, and that 'stabilizes' the error. I'm fairly sure it comes from the library's calls to memcpy (following the gdb backtrace, I always end up in a BIAS/Profil function calling memcpy. To be fair, almost all of its functions memcpy to a temporary object before returning the result...). From what I read on the web, it would appear that memcpy() may not be thread-safe, depending on the implementation (especially here). (That seems weird for a function that is only supposed to read the shared data... or maybe, when writing the per-thread data, both threads go for the same memory space?)

To try to address this, I'd like to 'replace' the call to memcpy with a mutex-framed call, at least to test whether the behavior changes (something like mtx.lock(); memcpy(...); mtx.unlock();).
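The mutex-framed call described above could be sketched roughly as follows. This is a minimal illustration, not BIAS/Profil code: `locked_memcpy`, `g_memcpy_mutex`, and `demo_locked_memcpy` are hypothetical names. Note that it only serializes copies that go through the wrapper; memcpy calls already compiled into a prebuilt library are not affected by it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <mutex>

// Hypothetical helper (not part of BIAS/Profil): one global lock shared by
// every caller of the wrapper.
static std::mutex g_memcpy_mutex;

// Copies n bytes from src to dst while holding the global lock.
// Only calls routed through this wrapper are serialized.
void* locked_memcpy(void* dst, const void* src, std::size_t n) {
    std::lock_guard<std::mutex> guard(g_memcpy_mutex);  // released on return
    return std::memcpy(dst, src, n);
}

// Small self-check: the wrapper behaves like memcpy for a single caller.
void demo_locked_memcpy() {
    const char src[] = "interval";
    char dst[sizeof src] = {0};
    void* result = locked_memcpy(dst, src, sizeof src);
    assert(result == dst);
    assert(std::strcmp(dst, "interval") == 0);
}
```

Using `std::lock_guard` instead of explicit `lock()`/`unlock()` keeps the mutex from staying locked if the function is ever left early.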

1st question: I'm not a developer or software engineer at all, and I lack a lot of basic knowledge. I think that, since I use a pre-built BIAS/Profil library, the memcpy being called is the one from the system the library was built on, correct? If so, would it change anything if I built the library from source on my system? (I'm not sure I can build this library, hence the question.)

2nd question: in my string.h, memcpy is declared by: #ifndef __HAVE_ARCH_MEMCPY extern void * memcpy(void *,const void *,__kernel_size_t); #endif and some other string headers (string_64.h, string_32.h) contain a definition of the form: #define memcpy(dst, src, len) __inline_memcpy((dst), (src), (len)) or some more explicit definition, or just a declaration like the one quoted. It's starting to get ugly but, ideally, I'd like to define a preprocessor variable, #define __HAVE_ARCH_MEMCPY 1, and a void * memcpy(void *,const void *,__kernel_size_t) that would do the mutex-framed memcpy using the dismissed memcpy. The idea here is to avoid messing with the library and make it work with 3 lines of code ;)

Any better idea? (it would make my day...)

luneart

3 Answers

2

IMHO you shouldn't concentrate on the memcpy() calls, but on the higher-level functionality.

Also, memcpy() is thread-safe as long as the memory ranges handled by the concurrently running threads don't overlap. In practice, memcpy() is little more than a for(;;) loop (with a lot of optimizations), at least in glibc; that is the reason it is declared the way it is.

If you want to know what your threads will do when they call memcpy() in parallel, picture those for(;;) loops copying memory through long int pointers.
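As a rough illustration of the non-overlap condition, the sketch below has two threads read the same source and write to disjoint halves of one destination buffer, with no lock. The names `parallel_copy` and `demo_parallel_copy` are made up for this example, and it assumes nothing else touches either buffer during the copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <thread>
#include <vector>

// Copies src into a fresh buffer using two threads. Each thread writes only
// its own half of dst and only reads src, so the written ranges never overlap.
std::vector<unsigned char> parallel_copy(const std::vector<unsigned char>& src) {
    std::vector<unsigned char> dst(src.size());
    const std::size_t half = src.size() / 2;
    const std::size_t rest = src.size() - half;

    std::thread first([&] {
        if (half > 0) std::memcpy(dst.data(), src.data(), half);
    });
    std::thread second([&] {
        if (rest > 0) std::memcpy(dst.data() + half, src.data() + half, rest);
    });
    first.join();
    second.join();
    return dst;
}

// Small self-check: the threaded copy must equal the input.
void demo_parallel_copy() {
    std::vector<unsigned char> src(1001);
    for (std::size_t i = 0; i < src.size(); ++i) {
        src[i] = static_cast<unsigned char>(i % 251);
    }
    assert(parallel_copy(src) == src);
}
```

If the two destination ranges overlapped, or if a third thread modified `src` during the copy, the same code would contain a data race.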

peterh
1

Given your observations, and given that the Profil lib is from the last millennium and that the documentation (homepage and Profil2.ps) does not even contain the word "thread", I would assume that the lib is not thread-safe.

1st: No, usually memcpy is part of libc, which is dynamically linked (at least nowadays). On Linux, check with ldd NAMEOFBINARY, which should print a line like libc.so.6 => /lib/i386-linux-gnu/libc.so.6 or similar. If not: rebuild. If yes: rebuilding could help anyway, as there are many other factors.

Besides this, I think memcpy is thread-safe as long as you never write data back (even writing back unmodified data will hurt: https://blogs.oracle.com/dave/entry/memcpy_concurrency_curiosities).

2nd: If it turns out that you have to use a modified memcpy, also think about LD_PRELOAD.

not-a-user
  • Agreeing with you on the old age of Profil, but it was not my choice (and they did release an x64 version ~3 years ago, so maybe a thread-safe one in 10 years? ;) ). 1 => `ldd libBias.a` answers `not a dynamic executable`. Same for the other libs, so I'm going to try rebuilding. The memcpy should only read the global data and write local data... I found the link you're quoting (and quoted it myself ;) ), and he says that memcpy was doing strange stuff even while reading. 2 => I didn't know about that, thanks a lot. Let's say it's a last resort... Thanks a lot, guys, for your responsiveness! – luneart Nov 13 '13 at 17:23
  • The `.a` ending of `libBias.a` says that the lib is statically linked. So `memcpy` is the one from when and where the lib was built and is built into it. In this case you should rebuild. `LD_PRELOAD` does not work in this case. – not-a-user Nov 13 '13 at 17:30
  • Sorry, that is wrong: Still `memcpy` is not contained and will be linked in when you build your final program. (To confirm this: there is no `memcpy.o` if you do `ar t libBias.a`.) – not-a-user Nov 13 '13 at 17:37
  • @ryyker: I had troubles finding sufficient documentation on thread-safety of glibc. However `memcpy` **is** safe in some [implementations](http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0492c/Chdiedfe.html). How can you be sure it is **not** here? – not-a-user Nov 14 '13 at 10:02
0

In general, you must use a critical section, mutex, or some other protection technique to keep multiple threads from accessing non-thread-safe (non-re-entrant) functions simultaneously. Some ANSI C implementations of memcpy() are not thread-safe, some are (safe, not safe).

Writing functions that are thread-safe, and/or writing threaded programs that can safely accommodate non-thread-safe functions, is a substantial topic. It is very doable, but requires reading up on the subject, and much has been written about it. This will at least help you start asking the right questions.

ryyker
  • I do use mutexes, but I can't go into some library and start putting mutexes everywhere. What I could do in theory is put a mutex around every call I make to a library function, but given that I use the library a lot, that would pretty much amount to running it sequentially. Thanks anyway for your response and responsiveness! – luneart Nov 13 '13 at 17:34
  • I am not suggesting changing the library. However, it is important that you protect any non-thread-safe function that is used out of one of these libraries, such as some implementations of memcpy() (or any function that calls a non-re-entrant function), by using one of the techniques I mentioned above. Tokens or thread-safe variables are also options. (You do not modify the library to do these things; rather, they are done in the application that calls the library.) – ryyker Feb 04 '16 at 03:02
  • Actually, calling memcpy() (like most other functions in the standard library) is thread-safe in C++11 (it might not be reentrant, but those are two different things). Of course, if you use memcpy to copy memory to/from a location that is accessed by another thread, then this operation introduces a data race, but that is true for any function that modifies memory. The important part is that it doesn't introduce hidden data races (e.g. by accessing some static buffer). – MikeMB Feb 05 '16 at 09:47
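To make the "hidden static buffer" case and the application-side locking from these comments concrete, here is a small sketch. `legacy_format` is a made-up stand-in for a non-thread-safe library routine (it is not a BIAS/Profil function), and `safe_format`, `g_legacy_mutex`, and `demo_safe_format` are likewise hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for a library routine that is NOT thread-safe:
// every caller shares the same static buffer.
const char* legacy_format(int value) {
    static char buffer[32];
    std::snprintf(buffer, sizeof buffer, "value=%d", value);
    return buffer;
}

static std::mutex g_legacy_mutex;

// Application-side wrapper: the lock is held for the call and for copying
// the result out of the shared buffer; the caller gets a private copy.
std::string safe_format(int value) {
    std::lock_guard<std::mutex> guard(g_legacy_mutex);
    return std::string(legacy_format(value));
}

// Two threads hammer the wrapper; each must always see its own result.
void demo_safe_format() {
    std::string a, b;
    std::thread t1([&] { for (int i = 0; i < 1000; ++i) a = safe_format(1); });
    std::thread t2([&] { for (int i = 0; i < 1000; ++i) b = safe_format(2); });
    t1.join();
    t2.join();
    assert(a == "value=1");
    assert(b == "value=2");
}
```

The library itself is untouched; the cost is that calls through the wrapper run one at a time, which is the serialization concern raised in the first comment.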