3

This is what I'm trying to implement:

  • A (singleton) array of fixed size (say 1000 elements)

  • A pool of threads writing smaller (<=100) element blocks to that array in parallel

  • We are guaranteed that total writes by all threads in the pool will write <1000 elements, so we never have to grow the array.

  • The order of writes doesn't matter but they have to be contiguous, e.g Thread1 populates array indexes 0-49, Thread 3 indexes 50-149, Thread 2 indexes 149-200

Is there a thread-safe data structure to achieve this?

Clearly, I would need to synchronize the "index manager" which allocates where in the array indexes a given thread needs to write. But is there a Java data structure for the array itself that can be used for this, without worrying about thread safety?

DVK
  • 126,886
  • 32
  • 213
  • 327
  • 2
    NOTE: based on [this](http://stackoverflow.com/questions/6060682/in-java-is-it-required-to-synchronize-write-access-to-an-array-if-each-thread-w?rq=1), I'm thinking that a regular Array should work, but I'm too much of a Java newbie to be certain – DVK May 11 '15 at 15:12
  • 1
    So, writing to an array index offers the same memory semantics as it does to a non-volatile field. If you're looking for a true happens-before ordering, you would lose it writing to single arrays. – John Vint May 11 '15 at 15:15
  • @JohnVint- don't care about true happens-before in this case. I always (through the app semantics) ensure that I won't read the data before it's written. – DVK May 11 '15 at 15:20
  • 2
    In that case, your concerns about thread-safety should be the same as if you were writing to a non-volatile field. If you decide it's OK to do so then writing to an array should be fine too. – John Vint May 11 '15 at 15:21
  • If threads are always going to write on the fixed indexes then IMO there is no concern of thread safety as such. Please correct me I am wrong. – akhil_mittal May 11 '15 at 15:29

2 Answers2

3

You should be able to use an AtomicReferenceArray. You can safely update indexes or atomically update with compareAndSet (though it appears you wont need that).

Editing to address akhil_mittal's question.

Let's switch the train of thought from updating an array to updating individual fields. If you were to update a field in a class the write will occur without word tearing, it won't be the case that the write will be some bits from one thread and some bits from another thread. The same is true for array indexes.

However, if you were to update a field in a class by multiple threads, the write from one thread may not be immediately visible to another thread. That is because the write may be buffered on a processor cache and eventually flushed to the other processors. The same is true for an array write to a particular index. It will be eventually visible but does not guarantee a happens-before ordering.

do we still need to concern about thread safety

You would need to worry about thread-safety the same way you would need to worry about thread-safety for a non-volatile field. It turns out that DVK may not need to worry about the writes being immediately visible.

The point of this answer is to explain that array writes are not necessarily thread-safe and using an AtomicReferenceArray can protect you from delayed writes.

Community
  • 1
  • 1
John Vint
  • 39,695
  • 7
  • 78
  • 108
  • If all the threads are going to write at fixed contiguous locations, do we still need to concern about thread safety? Please clarify :) – akhil_mittal May 11 '15 at 15:35
  • Could be overkill. The OP only asked about _writing_ to the array. Nothing was said about other threads reading the array at the same time. If the goal is only to use N threads to _initialize_ the array, and if no thread will read the array until the initialization is finished, then the only place where synchronization is needed is at the end, when (and to insure that) all of the initialization workers have finished. An AtomicXxxxxxxArray will synchronize every write, and that will slow things down a bit. – Solomon Slow May 11 '15 at 16:21
  • @jameslarge You're right it very well may be. If you're interested as posting that as an answer I think it would be reasonable. It also appears to be the case it is the correct answer according to the OP's comment. – John Vint May 11 '15 at 16:23
1

Your question has been answered already by others so I'll just add examples:

  1. Adding to an array by different threads is the way parallel sort works.
  2. Creating arrays with the Fork/Join framework does so by the work-threads writing to different parts of the array.

Go ahead and do it, you're fine.

edharned
  • 1,884
  • 1
  • 19
  • 20
  • *Creating arrays with the Fork/Join framework does so by the work-threads writing to different parts of the array.* This works because all thread synchronize on the **joining** aspect of fork/join. Just want to avoid confusion as to why ForkJoin works and why I am suggesting the AtomicReferenceArray. – John Vint May 11 '15 at 19:02