83

I am looking for a memory stream implementation in Java. The implementation should be roughly modeled after the .NET memory stream implementation.

Basically, I would like to have a class MemoryStream which has two factory methods:

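 import java.io.InputStream;
 import java.io.OutputStream;

 // Desired API only; the implementations are what I am looking for.
 abstract class MemoryStream {
     abstract MemoryInput createInput();
     abstract MemoryOutput createOutput();
 }

 abstract class MemoryInput extends InputStream {
     abstract long position();
     abstract void seek(long pos);
 }

 abstract class MemoryOutput extends OutputStream {
     abstract long position();
     abstract void seek(long pos);
 }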

Once I have an instance of MemoryStream, I should be able to create input and output streams on it concurrently, and each stream should allow positioning in any direction. The memory stream need not be circular; it should work well for small sizes and grow automatically. It only needs to work within a single process.

Is there any out-of-the-box code available?

Massimiliano Kraus

4 Answers

129

ByteArrayInputStream and ByteArrayOutputStream are what you are looking for.

These are subclasses of the abstract classes InputStream and OutputStream that read from and write to a byte array in memory. For ByteArrayOutputStream, the array grows automatically as you write data to the stream.
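
As a minimal sketch of how the two classes fit together (the class name `ByteArrayRoundTrip` and its helper methods are made up for illustration): data is written into a growing buffer, a snapshot is taken with `toByteArray()`, and the snapshot is read back, using `mark()`/`reset()` to jump backwards.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class ByteArrayRoundTrip {
    // Writes text into a growing in-memory buffer and returns a snapshot copy.
    static byte[] writeAll(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(); // grows as needed
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length); // this overload declares no IOException
        return out.toByteArray();          // a copy; later writes do not show up in it
    }

    // Reads the first n bytes twice, using mark()/reset() to jump back.
    static String readTwice(byte[] data, int n) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        in.mark(0);                        // the argument has no meaning for this class
        byte[] first = new byte[n];
        byte[] second = new byte[n];
        in.read(first, 0, n);
        in.reset();                        // back to the marked position
        in.read(second, 0, n);
        return new String(first, StandardCharsets.UTF_8)
             + new String(second, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = writeAll("hello world");
        System.out.println(readTwice(data, 5));
    }
}
```

Note that the reader works on a copy, so this pattern does not give a reader and a writer a live shared view of the same bytes.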

approxiblue
Jesper
  • `ByteArrayInputStream` supports `mark()` and `reset()` to mark a position in the stream so that you can jump back there later. `ByteArrayOutputStream` doesn't have this. Peter Lawrey's suggestion, using NIO `ByteBuffer`, is probably more useful. – Jesper Dec 08 '11 at 22:12
  • Does this allow positioning in any direction? –  Dec 08 '11 at 22:15
  • Well, these ByteBuffers, I am not yet sure. The problem is a common understanding of the non-functional requirements: frequency, amount of data and type of access. Depending on those, they could be a good or a bad idea. –  Dec 08 '11 at 22:16
  • So the final solution (without random positioning): `ByteArrayOutputStream inMemoryStream = new ByteArrayOutputStream(); /* write into stream */ ByteArrayInputStream inputStream = new ByteArrayInputStream(inMemoryStream.toByteArray()); /* read from the inputStream */` – Ilya Serbis Nov 18 '17 at 10:19
9

Does it need to support the Input and Output Streams? If not, I would just use a ByteBuffer, which allows you to read/write primitive types at random locations (up to 2 GB).

You can share a ByteBuffer between a reader and a writer.

e.g.

// 1 GB of virtual memory outside the heap.
ByteBuffer writer = ByteBuffer.allocateDirect(1024*1024*1024); 
ByteBuffer reader = writer.slice();

You can share memory between threads (e.g. via an Exchanger) and between processes (using memory-mapped files).
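
A minimal sketch of the shared-buffer idea, assuming a small heap buffer instead of the 1 GB direct buffer above (the class `SharedBufferDemo` and its method are made up for illustration). It shows that the two views share bytes but keep independent positions, which is what makes "seeking" possible on each side.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class, not part of any library.
class SharedBufferDemo {
    // Writes a long through one view and reads it back through another view.
    static long roundTrip(long value) {
        ByteBuffer writer = ByteBuffer.allocate(64); // small heap buffer for the demo
        ByteBuffer reader = writer.slice();          // same bytes, independent position
        writer.putLong(8, value);                    // absolute write at offset 8
        reader.position(8);                          // "seek" the reader
        return reader.getLong();                     // relative read from offset 8
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42L));
    }
}
```

ByteBuffer itself does no synchronization, so a reader and a writer on different threads still need their own coordination.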

Peter Lawrey
  • How do I use it for small sizes? Will it automatically grow? –  Dec 08 '11 at 20:01
  • It doesn't grow automatically as such. However if you make direct buffers much larger than you need but you don't use it, the OS doesn't allocate the memory to your process. – Peter Lawrey Dec 08 '11 at 20:05
  • Aha, ok, I didn't know. Is this guaranteed across OS and JVM? –  Dec 08 '11 at 20:07
  • Yeah, Input / Output is mandatory. So that I can plug into an existing application. –  Dec 08 '11 at 20:08
  • Yes, application actually does random access via RandomAccessFile not via the input / output streams. But from the RandomAccessFile it spawns input / output streams. But to make matters not too complicated I posted the above interface spec. –  Dec 08 '11 at 20:13
  • You want an in-memory RandomAccessFile? Can you use the `tmpfs` file system or similar? You wouldn't even need to change your code. – Peter Lawrey Dec 08 '11 at 20:15
  • Actually, the idea is to replace temporary files with memory streams, so that temporary file names are not visible and need not be managed. –  Dec 08 '11 at 20:16
  • You can make them deleteOnExit() and put them in a hidden directory. – Peter Lawrey Dec 08 '11 at 20:22
  • Agreed. Then there is performance, need the memory streams for high frequency small size manipulations. 10-50 bytes, something like a StringBuilder, but instead of insert() a write(). –  Dec 08 '11 at 20:24
  • In that case, using RandomAccessFile can take 1-3 microseconds, whereas accessing memory is relatively fast (0.05-0.5 µs). I would wrap ByteBuffer(s) with the Input/OutputStreams if you have to use those. It's pretty simple. Unfortunately, each ByteBuffer is limited to 2 GB in size (you need a 64-bit JVM to use that much, BTW). – Peter Lawrey Dec 08 '11 at 20:28
  • It should also work on Android devices later. Did not yet check whether they have ByteBuffer. I guess so. –  Dec 08 '11 at 20:31
  • They do, but it doesn't provide as much benefit. In Java, ByteBuffers give you the lowest-level access to memory without using `Unsafe` or JNI. – Peter Lawrey Dec 08 '11 at 20:35
  • Lower than byte[] ? Maybe no GC, or less good GC, and then they are bad for my high frequency 10-50 bytes. –  Dec 08 '11 at 20:37
  • Lower than byte[], it doesn't use the heap (so next to no GC impact) and loading/storing a `long` reduces to a single machine code instruction. I was doing a test today writing/reading 20 million 17 byte updates per second. (with shared memory) How fast do you need it to be? – Peter Lawrey Dec 08 '11 at 20:40
  • On my system, it's about 30x faster than using RandomAccessFile.read()/write(). – Peter Lawrey Dec 08 '11 at 20:41
  • How about fragmentation? If it doesn't use the Java GCed heap, it does use some other heap, I guess. And maybe it is paged, so that 10-50 bytes land in 512-byte pages or so. I am a little skeptical, but maybe I will give it a try. –  Dec 08 '11 at 20:44
  • 30x faster than using RandomAccessFile.read()/write(). --> so we have a lower bound for our memory streams... –  Dec 08 '11 at 20:45
  • I have to keep the data for a week, so it gets paged by the OS. The page size is 4KB on most systems. – Peter Lawrey Dec 08 '11 at 21:26
  • You can write a lot of 50 byte updates in a few TB of disk space. ;) – Peter Lawrey Dec 08 '11 at 21:37
  • Ah, I guess there is a misunderstanding. Each 50 bytes will need a separate memory stream. This memory stream will have 3-4 input/output streams on it for a short time, accessed by multiple threads, and then the memory stream will go away entirely. But in parallel there may be >1000 memory stream objects or so. –  Dec 08 '11 at 22:10
  • The memory streams are the jars and the bytes are the cookies. There will be many many jars having a short lifecycle. Greetings from the Cookie Monster. –  Dec 08 '11 at 22:13
  • Were these all RandomAccessFiles before?? Sounds like you do need to change it. ;) Do the Input and Output streams need to be multi-threaded, or does each thread have its own view of the data? If each stream is 50 bytes and multi-threaded, it appears you should let the GC do the cleanup. You can use a `byte[]` and create a wrapper for each input and output stream. There will be significant overhead and garbage, but nowhere near as bad as using a RandomAccessFile each. BTW: Passing tiny pieces of work between threads is often very inefficient, and using one thread can be much faster. – Peter Lawrey Dec 09 '11 at 07:58
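
The `byte[]`-plus-wrappers idea from the comments above could look roughly like the sketch below. It reuses the class names from the question; everything else (`readAt`, `writeAt`, `checked`, `length()`, the initial capacity of 16) is an assumption made for illustration. Positions are kept as `int`, so this version is limited to under 2 GB, and it synchronizes on every byte, which is simple rather than fast.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

// Hypothetical sketch: a growable in-memory store shared by several streams.
class MemoryStream {
    private byte[] data = new byte[16]; // small initial capacity, grown on demand
    private int length;                 // number of valid bytes

    MemoryInput createInput() { return new MemoryInput(); }
    MemoryOutput createOutput() { return new MemoryOutput(); }

    synchronized int length() { return length; }

    // All access to the array goes through these two synchronized methods.
    private synchronized int readAt(int pos) {
        return pos < length ? (data[pos] & 0xFF) : -1; // -1 signals end of data
    }

    private synchronized void writeAt(int pos, int b) {
        if (pos >= data.length) {
            data = Arrays.copyOf(data, Math.max(data.length * 2, pos + 1));
        }
        data[pos] = (byte) b;
        if (pos >= length) length = pos + 1; // a gap left by seek() stays zero-filled
    }

    private static int checked(long pos) {
        if (pos < 0 || pos > Integer.MAX_VALUE - 8) {
            throw new IllegalArgumentException("position out of range: " + pos);
        }
        return (int) pos;
    }

    class MemoryInput extends InputStream {
        private int pos;
        @Override public int read() {
            int b = readAt(pos);
            if (b >= 0) pos++;
            return b;
        }
        long position() { return pos; }
        void seek(long newPos) { pos = checked(newPos); }
    }

    class MemoryOutput extends OutputStream {
        private int pos;
        @Override public void write(int b) { writeAt(pos++, b); }
        long position() { return pos; }
        void seek(long newPos) { pos = checked(newPos); }
    }

    public static void main(String[] args) {
        MemoryStream ms = new MemoryStream();
        MemoryStream.MemoryOutput out = ms.createOutput();
        MemoryStream.MemoryInput in = ms.createInput();
        for (char c : "hello".toCharArray()) out.write(c);
        out.seek(0);
        out.write('J');                       // overwrite the first byte
        System.out.println((char) in.read()); // the reader sees the overwritten byte
    }
}
```

Each stream object keeps its own position, so one stream should not be shared between threads, while several streams on the same MemoryStream can be. Overriding the bulk `read(byte[], int, int)` and `write(byte[], int, int)` methods would cut the per-byte locking cost.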
8

You can use PipedInputStream and PipedOutputStream like this:

PipedOutputStream outstr = new PipedOutputStream();
PipedInputStream instr = new PipedInputStream(outstr);

That won't directly allow you to seek, but it does allow you to skip as many bytes as you want from the input stream.

Be aware that PipedInputStream has a fixed-size internal buffer (1024 bytes by default; a two-argument constructor lets you pick another size). A write to outstr blocks once that buffer is full and only continues when a reader drains instr, so the two ends should be used from different threads.
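
A minimal sketch of using the pair from two threads, assuming a pipe size of 4096 bytes (the class `PipeDemo` and its method are made up for illustration). The writer closes its end when done, which is what lets the reader see end-of-stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of any library.
class PipeDemo {
    // Sends payload through a pipe from a writer thread; the caller's thread reads it.
    static byte[] transfer(byte[] payload) {
        try {
            PipedOutputStream outstr = new PipedOutputStream();
            PipedInputStream instr = new PipedInputStream(outstr, 4096); // explicit pipe size

            Thread writer = new Thread(() -> {
                try (PipedOutputStream o = outstr) { // closing signals end-of-stream
                    o.write(payload);                // blocks whenever the pipe is full
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            writer.start();

            ByteArrayOutputStream received = new ByteArrayOutputStream();
            byte[] chunk = new byte[256];
            int n;
            while ((n = instr.read(chunk)) != -1) {  // -1 once the writer has closed
                received.write(chunk, 0, n);
            }
            instr.close();
            writer.join();
            return received.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transfer(new byte[10_000]).length);
    }
}
```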

Angelo Fuchs
3

NIO allows you to directly transfer data within kernel memory - I'm not sure if it exactly overlaps with .NET's memory stream. Here's a simple example of mapping an entire file into memory for reading.
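
A minimal sketch of what mapping an entire file into memory for reading can look like with `FileChannel.map` (the class `MapFileDemo`, its methods and the temp-file prefix are made up for illustration; the demo writes its own small temporary file first).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of any library.
class MapFileDemo {
    // Maps the whole file read-only and returns the byte at the given offset.
    static byte byteAt(Path file, int offset) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return map.get(offset); // absolute read, no stream position involved
        }
    }

    // Writes content to a temporary file, then reads one byte back through a mapping.
    static byte roundTrip(byte[] content, int offset) {
        try {
            Path tmp = Files.createTempFile("mapdemo", ".bin");
            tmp.toFile().deleteOnExit(); // best-effort cleanup when the JVM exits
            Files.write(tmp, content);
            return byteAt(tmp, offset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new byte[] {10, 20, 30}, 1));
    }
}
```

A mapping is still backed by a file, so it does not by itself remove the temporary files the question wants to avoid.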

Amir Afghani