I have two threads that concurrently access the same large file (.txt).

The first thread reads from the file. The second thread writes to the file.

Both threads access the same block, e.g. (start: 0, block size: 10), but with different channel and buffer instances.

Reader:

{
     int BLOCK_SIZE = 10;
     byte[] bytesArr = new byte[BLOCK_SIZE];
     File file = new File("/db.txt");
     RandomAccessFile randomFile = new RandomAccessFile(file, "r");
     FileChannel channel = randomFile.getChannel();
     MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_ONLY, 0, BLOCK_SIZE);
     map.get(bytesArr , 0, BLOCK_SIZE);
     channel.close();
}

Writer:

{
     int BLOCK_SIZE = 10;
     File file = new File("/db.txt");
     RandomAccessFile randomFile = new RandomAccessFile(file, "rw");
     FileChannel channel = randomFile.getChannel();
     MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, 0, BLOCK_SIZE);
     map.put(bytesToWrite);
     channel.close();
}

I know that if both start at the same time, I will get overlapping exceptions. But what I would like to know is: at which point exactly does the overlap happen? I mean, when exactly does the "lock" occur? Example: let's say the writer gets access first. If the reader then tries to access the file, at which point is that possible?

 FileChannel channel = randomFile.getChannel();
 // 1- can reader access here?
 MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, 0, BLOCK_SIZE);
 // 2- can reader access here?
 map.put(bytesToWrite);
 // 3- can reader access here?
 channel.close();
 // 4- can reader access here?

1, 2, 3 or 4?

Number 4 is certain, because the channel has been closed. What about the other points?

Thanks!

Steve C
Rami.Q
  • I see no lock in your code. – Chris K Jul 16 '15 at 12:08
  • Why use multiple threads at all? Some overview of your use case would help us to advise. In general I recommend using only one thread for I/O unless a very specialised situation has arisen. – Chris K Jul 16 '15 at 12:09
  • @ChrisK, I could give you a use case, but are you familiar with JSF ManagedBeans? – Rami.Q Jul 16 '15 at 12:33
  • From my long-distant past, yes. Be gentle though, I bruise easily. – Chris K Jul 16 '15 at 12:34
  • @ChrisK, sorry, what did I say that made you angry? – Rami.Q Jul 16 '15 at 12:39
  • lol, nothing at all, you misread my reply (it was meant tongue in cheek, not angry at all; apologies for the confusion). Please flesh out the problem that you are working to solve, along with its context. – Chris K Jul 16 '15 at 12:41
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/83433/discussion-between-chris-k-and-rami-q). – Chris K Jul 16 '15 at 12:47
  • Well, it is a rather bad approach to perform operations while the file is open, even more so when the file is opened twice and two threads access it at the same time. Also, will this operation happen per user? E.g. if you have 1000 active users, are you going to spawn 2000 threads accessing 1000 files? – AntJavaDev Jul 16 '15 at 13:57

2 Answers

I am summing up a few notes from a chat conversation with the OP. The OP had the mental model (like most of us) that once a thread writes to a data structure, the change is immediately visible to all other threads. In the OP's tests using memory-mapped files, he had confirmed that this appeared to be true on a single-socket Intel CPU.

Unfortunately this is not true in general, and it is an area where Java can and does expose the underlying behaviour of the hardware. Java has been designed to assume that code is single-threaded, and it can thus be optimised as such until it is told otherwise. What that means in practice differs by hardware and by version of HotSpot (and the statistics that HotSpot has collected). This complexity, together with running on a single-socket Intel CPU, invalidated the OP's test.

For further information, the following links will help you gain a deeper understanding of the 'Java Memory Model', and particularly of the fact that synchronized does not just mean 'mutual exclusion'; in hardware terms it is also about 'data visibility' and 'instruction ordering', two topics that single-threaded code takes for granted.

Do not worry if this takes time to sink in, or if you feel overwhelmed at first. We all felt like that at first. Java does an amazing job of hiding this complexity, if and only if you follow this one simple rule: when a thread reads or modifies a shared data structure, it must do so within a synchronized block that uses the same lock as every other thread touching that structure. That applies to both the writing thread and the reading thread. Obviously I am simplifying, but follow that rule and the program will behave predictably. Break it only if you have a very deep understanding of the Java Memory Model, memory barriers and how they relate to different hardware. Even then, concurrency experts avoid breaking that rule if they can; going single-threaded is often much simpler and can be surprisingly fast, and many low-latency systems are designed to be mostly single-threaded for this reason.
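As a minimal sketch of that rule (this is not the OP's code: the class name `SynchronizedBlockAccess`, the temp file and the choice of a single shared mapping are assumptions made for illustration), both the reader and the writer take the same lock before touching the block:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SynchronizedBlockAccess {
    private static final int BLOCK_SIZE = 10;

    private final Object lock = new Object();   // the one lock shared by reader and writer
    private final MappedByteBuffer map;

    SynchronizedBlockAccess(Path path) throws IOException {
        this.map = mapBlock(path);
    }

    // The mapping remains valid after the channel that created it is closed.
    private static MappedByteBuffer mapBlock(Path path) throws IOException {
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, BLOCK_SIZE);
        }
    }

    void write(byte[] bytes) {
        synchronized (lock) {
            map.position(0);
            map.put(bytes, 0, Math.min(bytes.length, BLOCK_SIZE));
        }
    }

    byte[] read() {
        byte[] out = new byte[BLOCK_SIZE];
        synchronized (lock) {
            map.position(0);
            map.get(out, 0, BLOCK_SIZE);
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        Path path = Files.createTempFile("db", ".txt"); // hypothetical stand-in for /db.txt
        SynchronizedBlockAccess block = new SynchronizedBlockAccess(path);
        block.write("AAAAAAAAAA".getBytes(StandardCharsets.US_ASCII));

        Thread writer = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                String s = (i % 2 == 0) ? "BBBBBBBBBB" : "AAAAAAAAAA";
                block.write(s.getBytes(StandardCharsets.US_ASCII));
            }
        });
        writer.start();

        boolean torn = false;
        for (int i = 0; i < 10_000; i++) {
            byte[] b = block.read();
            for (byte x : b) {
                if (x != b[0]) torn = true;   // a mix of 'A' and 'B' would be a torn read
            }
        }
        writer.join();
        System.out.println("torn read observed: " + torn);
    }
}
```

Because every read and every write of the block happens under the same monitor, the reader should only ever see a block that is entirely 'A' or entirely 'B'. Removing either `synchronized` block gives up that guarantee, even if a quick test on one machine happens to pass.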


To directly answer the OP's question: the sample code from the question has no locks in it. No memory barriers, no concurrency controls at all. Thus how the reads and writes will interact is undefined. They may work, they may not. They may work most of the time. Intel x86 has one of the strongest memory models among mainstream CPUs, so running the test cases on a single-socket Intel CPU would miss a lot of complex bugs. Sun was caught out by this too, before Java 5 and JSR 133 came out (read the article on why Double Checked Locking was broken in Java for more detail).
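For contrast, here is a hedged sketch of where an `OverlappingFileLockException` does come from: code that explicitly asks for a `FileLock` on a region that the same JVM already holds a lock on. The class name `OverlapDemo`, the helper `secondLockOverlaps` and the temp file are hypothetical, and the sketch assumes the underlying filesystem supports file locking.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class OverlapDemo {

    // Returns true if a second lock request on bytes 0..9 is rejected as overlapping.
    static boolean secondLockOverlaps(Path path) throws IOException {
        try (FileChannel first = FileChannel.open(path,
                     StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileChannel second = FileChannel.open(path,
                     StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileLock held = first.lock(0, 10, false)) {   // explicit exclusive lock
            try {
                FileLock other = second.tryLock(0, 10, false);
                if (other != null) {
                    other.release();
                }
                return false;
            } catch (OverlappingFileLockException e) {
                // File locks are held on behalf of the whole JVM, so an
                // overlapping request from the same JVM is rejected.
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("db", ".txt"); // hypothetical stand-in for /db.txt
        System.out.println("overlap detected: " + secondLockOverlaps(path));
    }
}
```

Nothing in the question's code calls `lock()` or `tryLock()`, which is why none of the four numbered points can raise this exception. `map()`, `put()` and `get()` never acquire a file lock.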

Chris K
  • Thank you so much for your great help. The links you posted about the Java Memory Model and memory barriers are very helpful. I was asking about concurrency when reading/writing the same block (portion) of bytes from/to MappedByteBuffers, and all the suggestions/answers went in another direction (maybe due to my badly worded question), but this led me to the conclusion that I have to read more about the interaction between the JVM, memory, the system and the hardware. – Rami.Q Jul 17 '15 at 00:14
  • Is this answer really valid? As far as I know, memory-mapped files are special. They are handled by the OS, for instance by mmap on POSIX systems, and they guarantee some special behaviour. For instance, if you change a bit, a page fault is generated and the OS swaps this page from disk to memory and vice versa. – slowjack2k Mar 02 '17 at 09:09

You won't get any locking exceptions from this code, nor will anything block. File locks operate between processes, not between threads. What you need here is synchronization, a semaphore, or a ReadWriteLock. And there's no need to use two channels.
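A rough sketch of the ReadWriteLock variant with a single channel might look like the following. The class `GuardedBlockFile` and its method names are hypothetical, and the sketch swaps the question's mapped buffer for positional channel reads and writes, so that several readers holding the read lock do not share a buffer position.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class GuardedBlockFile implements AutoCloseable {
    private final ReadWriteLock rw = new ReentrantReadWriteLock();
    private final FileChannel channel;   // one channel shared by all threads

    GuardedBlockFile(Path path) throws IOException {
        channel = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
    }

    // Only one writer at a time, and no readers while it runs.
    void writeBlock(long start, byte[] bytes) throws IOException {
        rw.writeLock().lock();
        try {
            ByteBuffer src = ByteBuffer.wrap(bytes);
            long pos = start;
            while (src.hasRemaining()) {
                pos += channel.write(src, pos);
            }
        } finally {
            rw.writeLock().unlock();
        }
    }

    // Many readers may hold the read lock together; a writer excludes them all.
    byte[] readBlock(long start, int size) throws IOException {
        rw.readLock().lock();
        try {
            ByteBuffer dst = ByteBuffer.allocate(size);
            long pos = start;
            while (dst.hasRemaining()) {
                int n = channel.read(dst, pos);
                if (n < 0) {
                    break;               // reached end of file
                }
                pos += n;
            }
            return Arrays.copyOf(dst.array(), dst.position());
        } finally {
            rw.readLock().unlock();
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("db", ".txt"); // hypothetical stand-in for /db.txt
        try (GuardedBlockFile file = new GuardedBlockFile(path)) {
            file.writeBlock(0, "0123456789".getBytes(StandardCharsets.US_ASCII));
            byte[] block = file.readBlock(0, 10);
            System.out.println(new String(block, StandardCharsets.US_ASCII));
        }
    }
}
```

The lock is an ordinary in-process object, so it coordinates threads inside one JVM only. Coordinating with other processes is a separate problem, and that is where `FileLock` applies.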

user207421
  • Thanks for your answer, but could you give me a use case where overlapping occurs? – Rami.Q Jul 16 '15 at 12:27
  • @Rami.Q What do you mean by overlapping? Without any form of memory barrier in your concurrent code, you will have no idea what is going to be visible in the reading thread. It could be the data that was written, it could be the data from before the write, or it could be partially written data and thus in a corrupt state. – Chris K Jul 16 '15 at 12:46