4

I am working on a C++ project that uses self-created maps to store data - "map" here meaning a geographical map, i.e. an image, not the container. Different threads read from and write to it. The data of a map is stored in a std::vector of vectors of integers. Its size never changes; only the content of certain pixels is modified, through getter and setter functions.

My problem is the following: sometimes everything works just fine, but more often I get corrupted images, in the sense that the value of a pixel changes sign or becomes completely different from what it should be. Could this be an issue with the threaded read/write access to the pixels, and if so, what should I use instead of the standard vectors? I have tried using a mutex to ensure that only one thread reads or writes to the vector at a time, but these read/write operations happen so often that the application becomes too slow if I lock the vector on every operation.
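
For illustration, a minimal sketch of the setup described above, with a single mutex guarding every access (class and member names are made up for the example, not the actual code):

```cpp
#include <mutex>
#include <vector>

// Hypothetical reconstruction of the map described in the question:
// a fixed-size 2D grid of int "pixels" behind one global mutex.
class GeoMap {
public:
    GeoMap(std::size_t width, std::size_t height)
        : data_(height, std::vector<int>(width, 0)) {}

    int get(std::size_t x, std::size_t y) const {
        std::lock_guard<std::mutex> lock(mutex_);   // one global lock per access
        return data_[y][x];
    }

    void set(std::size_t x, std::size_t y, int value) {
        std::lock_guard<std::mutex> lock(mutex_);   // serializes every write
        data_[y][x] = value;
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::vector<int>> data_;
};
```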

Tamás Szabó
  • 733
  • 6
  • 9
  • You should use a 1D vector. It performs better than a 2D one. – chris Jun 21 '12 at 13:37
  • 1
    Atomic operations could do the trick, seeing how you say "size does not change, only the content of certain pixels through getter and setter functions". Also, try to "partition" the access (i.e. different threads do not modify the exact same pixels at the same time) to minimize cache poisoning. Of course that won't ensure that two adjacent pixels are consistent with each other, but then again locking won't guarantee this either if you modify them in a chaotic way. It _will_ however guarantee that e.g. a value that is incremented and decremented concurrently doesn't get a "weird" result (see the sketch after these comments). – Damon Jun 21 '12 at 13:44
  • @Damon: I believe the memory-model defined in C++11 is meant to ensure that effects like you describe with adjacent pixels will not happen. See [here](http://stackoverflow.com/questions/6319146/). – Björn Pollex Jun 21 '12 at 13:59
  • @BjörnPollex: I don't see how it could do that. Properly partitioned (say, thread 1 runs a kernel on pixels 0-999, thread 2 on 1000-1999, thread 3 on 2000-2999) this does not happen anyway. However, imagine "threads reading and writing", i.e. thread 1 updates (atomically, and correctly) values 20 to 50 while thread 2 reads values 30 and 40 as input values to compute value 200. Even if both values are updated correctly, there is no way of telling if the entire set of data is consistent _without a lock_. How should a compiler do this not knowing about what threads may be running at all? – Damon Jun 21 '12 at 17:29
  • On the other hand, imagine threads 1 and 2 both drawing "hills" on the map in more or less the same region (overlapping), i.e. they're increasing the height value at a pixel according to some distance function. They both use atomic increments, and behold it "just works" (although it's a cache nightmare). In this case, the hardware does all the magic. It really depends a lot on whether you write, read-modify-update, or read-modify-update plus read another location. – Damon Jun 21 '12 at 17:43
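
A minimal sketch of the atomic-per-pixel suggestion from these comments, assuming C++11 std::atomic and a flat 1D buffer (the class and its names are illustrative only):

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Illustrative sketch: one std::atomic<int> per pixel, stored in a flat
// 1D vector. std::atomic is neither copyable nor movable, so the buffer
// size is fixed at construction and never resized.
class AtomicMap {
public:
    AtomicMap(std::size_t width, std::size_t height)
        : width_(width), pixels_(width * height) {}

    int get(std::size_t x, std::size_t y) const {
        return pixels_[y * width_ + x].load(std::memory_order_relaxed);
    }

    void set(std::size_t x, std::size_t y, int value) {
        pixels_[y * width_ + x].store(value, std::memory_order_relaxed);
    }

    // Atomic read-modify-write, e.g. for the "drawing hills" case above.
    void add(std::size_t x, std::size_t y, int delta) {
        pixels_[y * width_ + x].fetch_add(delta, std::memory_order_relaxed);
    }

private:
    std::size_t width_;
    std::vector<std::atomic<int>> pixels_;
};
```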

3 Answers

6

You will need some kind of locking. To prevent that from hurting your performance too badly, you should try to make the scope of the locks as small as possible. For instance, you could lock individual row-vectors, so that writes on different rows would not interfere with each other. What kind of solution is most appropriate for you depends on your access-patterns and platform.
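
A minimal sketch of this per-row locking idea, assuming C++11 std::mutex (class and member names are illustrative, not taken from the question):

```cpp
#include <mutex>
#include <vector>

// Illustrative sketch of per-row locking: one mutex per row, so accesses
// to different rows never contend with each other.
class RowLockedMap {
public:
    RowLockedMap(std::size_t width, std::size_t height)
        : data_(height, std::vector<int>(width, 0)),
          row_mutexes_(height) {}

    int get(std::size_t x, std::size_t y) {
        std::lock_guard<std::mutex> lock(row_mutexes_[y]);  // lock only row y
        return data_[y][x];
    }

    void set(std::size_t x, std::size_t y, int value) {
        std::lock_guard<std::mutex> lock(row_mutexes_[y]);
        data_[y][x] = value;
    }

private:
    std::vector<std::vector<int>> data_;
    std::vector<std::mutex> row_mutexes_;
};
```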

Björn Pollex
  • 75,346
  • 28
  • 201
  • 283
  • Thanks a lot. I will try to lock only a small portion of my vector for writing - would I need to lock it in the reading operator as well? – Tamás Szabó Jun 21 '12 at 14:33
  • That depends on what you want to do with the data. If you only read, you don't need a lock. If you read from a region that is being written to, you need a lock - but then you can use a [Readers-Writer Lock](http://en.wikipedia.org/wiki/Readers–writer_lock). Maybe in your scenario you can even live without a lock if you only have one writer - you might read inconsistent data at times, but depending on what you do with it, that might or might not be a problem. – Björn Pollex Jun 21 '12 at 18:04
0

When you do multithreading, try to isolate the scope of the threads as much as possible.

Consider an example. Imagine you have a wall and you want to get it painted with black and white stripes. To speed up the job, you decide to hire two workers.

Now, you could allocate this job in two ways:
  1. Assign painting the black stripes to one worker and the white stripes to the other.
  2. Divide the wall into two partitions and assign the left partition to one worker and the right partition to the other.

Now which one is likely to yield better performance?

In general, the 2nd one is better, since each worker's working area is well separated and neither has to wait for the other to do his job. The 1st approach is not wrong, of course, but it is possible that one worker is painting in some spot, the second one reaches the same spot, and then has to wait for the first to finish. Now imagine what would happen if you had 10 workers, each painting only one particular color.

Multithreaded programming is similar. If you can partition your problem data well, you can make the threads work effectively.
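
As an illustration of the second approach, a minimal sketch that splits the rows of such a map into contiguous bands, one per thread, using C++11 std::thread (the function names and the per-pixel work are placeholders):

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Illustrative sketch: each thread gets its own contiguous band of rows,
// so no two threads ever touch the same pixels and no locking is needed.
void process_rows(std::vector<std::vector<int>>& map,
                  std::size_t first_row, std::size_t last_row) {
    for (std::size_t y = first_row; y < last_row; ++y)
        for (std::size_t x = 0; x < map[y].size(); ++x)
            map[y][x] += 1;                          // placeholder per-pixel work
}

void process_in_parallel(std::vector<std::vector<int>>& map,
                         std::size_t num_threads) {
    std::vector<std::thread> workers;
    std::size_t rows_per_thread = (map.size() + num_threads - 1) / num_threads;

    for (std::size_t t = 0; t < num_threads; ++t) {
        std::size_t first = t * rows_per_thread;
        std::size_t last  = std::min(first + rows_per_thread, map.size());
        if (first >= last) break;                    // fewer rows than threads
        workers.emplace_back(process_rows, std::ref(map), first, last);
    }
    for (auto& w : workers) w.join();                // wait for every band to finish
}
```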

PermanentGuest
  • 5,213
  • 2
  • 27
  • 36
  • Unfortunately in the case of most of the threads this is not possible - there are always two threads working on the same part of the vector. There are more of these thread groups - those are mostly isolated. – Tamás Szabó Jun 21 '12 at 14:35
0
  1. Use lightweight locks. That is, on Windows use a "CriticalSection". Or write your own user-space locks, e.g. like in TinyThread (http://tinythreadpp.bitsnbites.eu/). These are written in ASM and should have almost zero overhead to actually lock and unlock.

  2. Once you're sure the locks themselves are fast, if things still run slowly it is because you have lock contention, e.g. multiple threads all needing to lock the same resource. For your use case, consider something like a "read/write" mutex. This is a class that has a read mutex and a write mutex. The "readLock" method only locks for a few cycles to increment a reference count on the mutex, and "readUnlock" decrements that count. "writeLock" locks the read mutex and sets a flag that keeps readers from locking it, then locks the write mutex and performs the write operation. This guarantees that only one write operation can happen at a time and that no reads can happen during a write, but simultaneous reads are allowed.
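
For reference, the standard library now provides this pattern directly; a minimal sketch using C++17 std::shared_mutex, which postdates this answer (the class and member names below are illustrative):

```cpp
#include <shared_mutex>
#include <vector>

// Illustrative sketch of the readers-writer pattern described above:
// many concurrent readers, one exclusive writer.
class SharedLockedMap {
public:
    SharedLockedMap(std::size_t width, std::size_t height)
        : data_(height, std::vector<int>(width, 0)) {}

    int get(std::size_t x, std::size_t y) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);  // shared: readers don't block each other
        return data_[y][x];
    }

    void set(std::size_t x, std::size_t y, int value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);  // exclusive: blocks readers and writers
        data_[y][x] = value;
    }

private:
    mutable std::shared_mutex mutex_;
    std::vector<std::vector<int>> data_;
};
```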

Rafael Baptista
  • 11,181
  • 5
  • 39
  • 59
  • Your second option sounds more like a read/write lock, where you can have many simultaneous reads that are mutually exclusive with a single write. http://en.wikipedia.org/wiki/Read/write_lock_pattern – Brady Jun 21 '12 at 17:02
  • Yes. That's what it is. Is it wrong to call it a read/write mutex? All write operations are mutually exclusive. – Rafael Baptista Jun 21 '12 at 17:38
  • I wouldn't say it's wrong, but it's more commonly referred to as a read/write lock. – Brady Jun 21 '12 at 17:53