
I have character offsets into a file and I am looking for a way to read the characters spanning these offsets. I tried using BufferedReader, but the bytes that are returned are from the beginning of the file.

Ideally, I want to seek to a certain position in the file and read a specific number of bytes.

Can anyone suggest an alternative solution?

shyamupa

1 Answer


Fundamentally, this depends on the character encoding.

If you're using UTF-8 and have non-ASCII characters in your text, then a character offset tells you relatively little about the byte offset you'd have to seek to. (File systems are basically about bytes, not characters.)
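To make the mismatch concrete, here is a small sketch that computes how many UTF-8 bytes precede a given character offset. The class name `OffsetDemo`, the helper `utf8ByteOffset` and the sample string are made up for illustration. Note that a Java `char` is a UTF-16 code unit, so "character offset" here means an offset in code units.

```java
import java.nio.charset.StandardCharsets;

class OffsetDemo {
    // Number of bytes that the first charOffset chars of text occupy in UTF-8.
    // (Hypothetical helper; assumes charOffset does not split a surrogate pair.)
    static int utf8ByteOffset(String text, int charOffset) {
        return text.substring(0, charOffset)
                   .getBytes(StandardCharsets.UTF_8)
                   .length;
    }

    public static void main(String[] args) {
        // "naïve café", written with escapes so the source file encoding does not matter.
        String text = "na\u00efve caf\u00e9";
        System.out.println(utf8ByteOffset(text, 2));  // before the 'ï': 2 bytes
        System.out.println(utf8ByteOffset(text, 3));  // after the 'ï': 4 bytes
        System.out.println(utf8ByteOffset(text, 10)); // whole string: 12 bytes
    }
}
```

The byte offset depends on which characters come before the position, so it cannot be derived from the character offset alone.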

However, if you're using a fixed-width encoding, you can simply multiply the character offset by the width of a character (in bytes) and then skip to the right part of the file, using InputStream.skip:

  • Construct a FileInputStream for the relevant file
  • Skip to the right part of it
  • Construct an InputStreamReader using the input stream - make sure you specify the encoding!
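The steps above might look roughly like the following sketch. The class name `FixedWidthSlice`, the method `readChars` and its parameters are illustrative, not part of any library. It assumes the caller knows the encoding and its width; UTF-16BE is used in the demo and is only fixed-width (2 bytes) as long as the text contains no surrogate pairs. Because `InputStream.skip` may skip fewer bytes than requested, the call is wrapped in a loop.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FixedWidthSlice {
    // Reads up to charCount characters starting at charOffset, assuming every
    // character occupies exactly bytesPerChar bytes in the given charset.
    static String readChars(Path file, Charset charset, int bytesPerChar,
                            long charOffset, int charCount) throws IOException {
        try (InputStream in = new FileInputStream(file.toFile())) {
            // Step 2: skip to the byte position of the first wanted character.
            long toSkip = charOffset * bytesPerChar;
            while (toSkip > 0) {
                long skipped = in.skip(toSkip);
                if (skipped <= 0) {
                    throw new IOException("Could not skip to character offset " + charOffset);
                }
                toSkip -= skipped;
            }
            // Step 3: decode from here on, with the encoding stated explicitly.
            Reader reader = new InputStreamReader(in, charset);
            char[] buf = new char[charCount];
            int filled = 0;
            while (filled < charCount) {
                int n = reader.read(buf, filled, charCount - filled);
                if (n < 0) {
                    break; // end of file before charCount characters were read
                }
                filled += n;
            }
            return new String(buf, 0, filled);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("slice", ".txt");
        Files.write(tmp, "hello, world".getBytes(StandardCharsets.UTF_16BE));
        System.out.println(readChars(tmp, StandardCharsets.UTF_16BE, 2, 7, 5)); // prints "world"
        Files.delete(tmp);
    }
}
```

If the file starts with a byte order mark, its length would need to be added to the byte offset; the sketch does not handle that case.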

Again, if you're using a variable-width encoding such as UTF-8, you fundamentally don't get much information from the character offset.
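For a variable-width encoding, one possible fallback (not part of the steps above, and linear in the offset) is to decode from the start of the file and skip characters rather than bytes, using `Reader.skip`. The class name `DecodeAndSkip` and the method `readChars` below are hypothetical, and offsets are again counted in Java `char`s (UTF-16 code units).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class DecodeAndSkip {
    // Decodes the file from the beginning and discards charOffset characters
    // before reading up to charCount characters.
    static String readChars(Path file, Charset charset,
                            long charOffset, int charCount) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, charset)) {
            long toSkip = charOffset;
            while (toSkip > 0) {
                long skipped = reader.skip(toSkip);
                if (skipped <= 0) {
                    break; // end of file reached before the offset
                }
                toSkip -= skipped;
            }
            char[] buf = new char[charCount];
            int filled = 0;
            while (filled < charCount) {
                int n = reader.read(buf, filled, charCount - filled);
                if (n < 0) {
                    break; // end of file
                }
                filled += n;
            }
            return new String(buf, 0, filled);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("utf8", ".txt");
        // "naïve café" encoded as UTF-8; the 'ï' and 'é' take two bytes each.
        Files.write(tmp, "na\u00efve caf\u00e9".getBytes(StandardCharsets.UTF_8));
        System.out.println(readChars(tmp, StandardCharsets.UTF_8, 6, 4)); // prints "café"
        Files.delete(tmp);
    }
}
```

This avoids the byte-offset problem at the cost of decoding everything before the requested position, so it may be too slow for very large files or many lookups.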

Jon Skeet