
I have an application in which I am processing a file purely sequentially in Java. The file is opened, read straight through once and then closed.

Currently I am using only a FileChannel. There is an option to memory-map the file. Would there be any advantage to doing this?

Tyler Durden
  • Probably not. Memory mapping helps with random access. Have you tried it and measured a benefit? – zapl Oct 07 '15 at 04:32
  • @zapl I would much rather have an expert opinion on this particular question over empirical testing, because the OS interactions for memory mapping are complex and I am not expert enough to understand the different ramifications. Also, different OS's may work differently. Basically I am hoping for someone who is expert in the area to provide an authoritative answer. – Tyler Durden Oct 07 '15 at 04:34
  • http://stackoverflow.com/questions/45972/mmap-vs-reading-blocks covers some of the more theoretical aspects. Channels should perform the same as a plain read, since you copy into userspace either way. A direct file-to-file copy could leverage DMA and avoid userspace. – zapl Oct 07 '15 at 05:03
  • [The javadoc](http://docs.oracle.com/javase/8/docs/api/java/nio/channels/FileChannel.html#map-java.nio.channels.FileChannel.MapMode-long-long-) says: *For most operating systems, mapping a file into memory is more expensive than reading or writing a few tens of kilobytes of data via the usual read and write methods. From the standpoint of performance it is generally only worth mapping relatively large files into memory.* – JB Nizet Oct 07 '15 at 06:14

1 Answer


This question comes down to how memory access behaves at the hardware level.

There is no advantage to mapping if the file is processed sequentially. Mapping the file also costs some extra memory for its own bookkeeping (variables and so on) compared with plain sequential processing. Memory management relies on a principle called the

Principle of Locality

Programs tend to reuse data and instructions that are close to each other or that they have used recently.

  1. Temporal Locality

    • Recently referenced items are likely to be referenced again in the near future
    • A block tends to be accessed repeatedly
  2. Spatial locality

    • Items with nearby addresses tend to be referenced close together in time
    • Nearby blocks tend to be accessed together

Let me give you an example:

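int sum = 0;                              // accumulator, touched on every iteration
int[] array = new int[10];                // ten ints stored contiguously
for (int i = 0; i < array.length; i++) {  // walk the array front to back
  sum += array[i];
}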

Data

  • Access array elements sequentially – Spatial locality
  • Reference sum each iteration – Temporal locality

Instructions

  • Reference instructions sequentially – Spatial locality
  • Cycle through loop repeatedly – Temporal locality

Back to the question: since the data in the file is laid out in order in memory (like an array), mapping offers no advantage, because a sequential pass already follows the spatial locality pattern described earlier.
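To make the two options from the question concrete, here is a minimal sketch of the same single pass done both ways. The class name `SequentialReadSketch`, the method names, the 64 KiB buffer size, and the byte-summing workload are all assumptions made for illustration; they are not from the question. It only shows that both approaches visit the bytes in the same order, and it says nothing about which is faster on a given OS.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SequentialReadSketch {

    // Reads the file front to back through one reusable buffer and sums its bytes.
    static long sumWithChannel(Path path) throws IOException {
        long sum = 0;
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(64 * 1024); // assumed buffer size
            while (ch.read(buf) != -1) {
                buf.flip();                    // switch the buffer from filling to draining
                while (buf.hasRemaining()) {
                    sum += buf.get() & 0xFF;   // treat each byte as unsigned
                }
                buf.clear();                   // make the buffer ready for the next read
            }
        }
        return sum;
    }

    // Maps the whole file and walks it once, in the same order as above.
    // A single mapping cannot exceed Integer.MAX_VALUE bytes.
    static long sumWithMapping(Path path) throws IOException {
        long sum = 0;
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            while (map.hasRemaining()) {
                sum += map.get() & 0xFF;
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("seq-read", ".bin"); // throwaway demo file
        tmp.toFile().deleteOnExit();
        Files.write(tmp, new byte[] {1, 2, 3, 4, (byte) 250});
        System.out.println(sumWithChannel(tmp));
        System.out.println(sumWithMapping(tmp));
    }
}
```

Both methods return the same sum for the same file. To compare them for a real workload, time each one on a representative file on the target OS.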

Lahiru Jayathilake