4

I need to read a char[] (of size COUNT) from a text file, starting at OFFSET, with a specified Charset. COUNT and OFFSET are in characters, not in bytes. Here is my code:

raf = new RandomAccessFile(filePath, "r");      
if ((mBuffer == null) || (mBuffer.length < count)) {
    mBuffer = new byte[(int)(count/mDecoder.averageCharsPerByte())];
    mByteWrap = ByteBuffer.wrap(mBuffer);
    mCharBuffer = new char[count];
    mCharWrap = CharBuffer.wrap(mCharBuffer);
}
try {
    offset = (int)(offset/mDecoder.averageCharsPerByte());
    count = (int)(count/mDecoder.averageCharsPerByte());
    raf.seek(offset);
    raf.read(mBuffer,0,count);
    mByteWrap.position(0);
    mCharWrap.position(0);
    mDecoder.decode(mByteWrap, mCharWrap, true);
} catch (IOException e) {
    return null;
}
return mCharBuffer;

Is there an easier way (without manually mapping chars to bytes)?

I looked at java.util.Scanner, but it is iterator-style, and I need random-access style.

P.S. The data shouldn't be copied many times.

styanton

2 Answers

4

Use BufferedReader's skip() method. In your case:

BufferedReader reader = new BufferedReader(new FileReader(filePath));
reader.skip(n); // chars to skip
// .. and here you can start reading

And if you want to specify a particular encoding, you can use

InputStream is = new FileInputStream(filePath);
BufferedReader reader = new BufferedReader(new InputStreamReader(is,"UTF-8"));
reader.skip(n); // chars to skip
// .. and here you can start reading
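The skip-then-read approach above can be put together as the following minimal sketch. The class name `CharRangeReader` and the method `readChars` are hypothetical names chosen for illustration, not part of the original answer. It assumes that "characters" means Java `char` values (UTF-16 code units), and it loops because `skip()` and `read()` may process fewer chars than requested. Note that it still decodes the file from the beginning on every call.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical helper, not from the original answer.
class CharRangeReader {

    /**
     * Reads up to count chars starting at char position offset.
     * Returns a shorter array if the file ends first.
     */
    static char[] readChars(String filePath, long offset, int count, Charset charset)
            throws IOException {
        try (Reader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(filePath), charset))) {
            // skip() may skip fewer chars than requested, so loop.
            long remaining = offset;
            while (remaining > 0) {
                long skipped = reader.skip(remaining);
                if (skipped > 0) {
                    remaining -= skipped;
                } else if (reader.read() < 0) {
                    break;          // end of file before reaching offset
                } else {
                    remaining--;    // consumed one char by reading it
                }
            }
            // read() may also return fewer chars than requested, so loop.
            char[] buf = new char[count];
            int filled = 0;
            while (filled < count) {
                int n = reader.read(buf, filled, count - filled);
                if (n < 0) {
                    break;          // end of file
                }
                filled += n;
            }
            return filled == count ? buf : Arrays.copyOf(buf, filled);
        }
    }
}
```

A call such as `CharRangeReader.readChars(filePath, 100, 100, StandardCharsets.UTF_8)` would return the chars at positions 100 to 199, or fewer if the file is shorter.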
matrixanomaly
dash1e
  • 1
    I suppose **BufferedReader** uses the default system charset while reading? How does skip() work (does it just move the file pointer, or does it read n chars)? – styanton Apr 11 '12 at 09:31
  • I added to the answer how you can specify the charset. – dash1e Apr 11 '12 at 09:36
  • And what if I first need characters 100-200, and then 50-100? – styanton Apr 11 '12 at 09:50
  • 2
    With a charset encoding where the character length is not fixed, this is complex, because you have to start from the beginning or from a point where you can be sure that a new char starts. However, you can try using `mark()` and `reset()`, marking position 0 immediately. – dash1e Apr 11 '12 at 09:57
  • I was talking about positioning the reader. I've found void reset(). Thanks, dash1e! – styanton Apr 11 '12 at 09:59
  • It looks like BufferedReader.read(char[] buf) uses additional memory besides the target buffer. Calling it in a loop makes the heap grow, and Dalvik (I'm running on Android) triggers GC_FOR_ALLOC. Later an OutOfMemoryError is thrown. P.S. Before every read I call reader.reset(), which returns to the beginning of the file because I called reader.mark(Integer.MAX_VALUE) there. I think I should mark with a different value. – styanton Apr 24 '12 at 08:05
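The `mark()`/`reset()` idea from the comments can be sketched as below. A marked BufferedReader has to retain everything read since the mark so that `reset()` can replay it, which is a plausible (not confirmed) cause of the heap growth described above. This sketch therefore marks with a bounded limit instead of Integer.MAX_VALUE. The class name `MarkedRangeReader` and the `limit` parameter are hypothetical, and the sketch assumes every requested range ends within `limit` chars of the start.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.Reader;
import java.util.Arrays;

// Hypothetical helper, not from the original thread.
class MarkedRangeReader implements Closeable {
    private final BufferedReader reader;
    private final int limit;

    /** Marks position 0; up to limit chars may be kept in memory. */
    MarkedRangeReader(Reader source, int limit) throws IOException {
        this.reader = new BufferedReader(source);
        this.limit = limit;
        reader.mark(limit);
    }

    /** Reads up to count chars starting at char position offset. */
    char[] read(int offset, int count) throws IOException {
        if (offset < 0 || count < 0 || (long) offset + count > limit) {
            throw new IllegalArgumentException("range exceeds the marked limit");
        }
        reader.reset();                 // back to the marked position 0
        long remaining = offset;
        while (remaining > 0) {         // skip() may skip fewer chars
            long skipped = reader.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (reader.read() < 0) {
                break;                  // end of input
            } else {
                remaining--;
            }
        }
        char[] buf = new char[count];
        int filled = 0;
        while (filled < count) {        // read() may return fewer chars
            int n = reader.read(buf, filled, count - filled);
            if (n < 0) {
                break;
            }
            filled += n;
        }
        return filled == count ? buf : Arrays.copyOf(buf, filled);
    }

    @Override
    public void close() throws IOException {
        reader.close();
    }
}
```

With this, reading characters 100-200 and then 50-100 reuses one open reader. The trade-off is that the reader may buffer up to `limit` chars, so for large files reopening the stream for each range may use less memory.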
0

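You can use read(byte[] b, int off, int len) of BufferedInputStream.

Note that off is the start offset in the destination array b at which the data is written, not a position in the file. To start reading from a given byte position, call skip(long n) on the stream first. This method also works in bytes, not characters.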

http://docs.oracle.com/javase/7/docs/api/java/io/BufferedInputStream.html#read%28byte[],%20int,%20int%29
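A minimal sketch of reading a byte range with BufferedInputStream follows. The class `ByteRangeReader` and the method `readBytes` are hypothetical names used only for illustration. In `read(b, off, len)` the `off` argument indexes into the array `b`; the position in the file is advanced with `skip()`. Because this works on bytes, it matches character positions only for fixed-width single-byte encodings.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, not from the original answer.
class ByteRangeReader {

    /** Reads up to length bytes starting at byte position byteOffset. */
    static byte[] readBytes(String filePath, long byteOffset, int length)
            throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(filePath))) {
            // Advance the stream position; skip() may skip fewer bytes.
            long remaining = byteOffset;
            while (remaining > 0) {
                long skipped = in.skip(remaining);
                if (skipped > 0) {
                    remaining -= skipped;
                } else if (in.read() < 0) {
                    break;          // end of file
                } else {
                    remaining--;
                }
            }
            byte[] buf = new byte[length];
            int filled = 0;
            while (filled < length) {
                // 'filled' is the offset into buf, not into the file.
                int n = in.read(buf, filled, length - filled);
                if (n < 0) {
                    break;          // end of file
                }
                filled += n;
            }
            return filled == length ? buf : Arrays.copyOf(buf, filled);
        }
    }
}
```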

Afshin Moazami
Amit