As characters are written to a `java.io.Writer`, I want to apply a regular expression to replace certain characters.

I don't want to buffer the whole thing in memory: it could be very big, and this is performance-sensitive.

I'm trying a moving `CharBuffer` with a `Pattern`, but the problem is that any sizing or flushing of the `CharBuffer` short of the entire character stream may be incorrect, because a match could straddle the point where I flush.
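To make the failure concrete, here is a minimal sketch using the `[ab]c` regex from this question. The class `ChunkBoundaryDemo` and the helper `replacePerChunk` are hypothetical names for illustration only; the helper stands in for a writer that runs the replacement on each chunk and flushes it immediately.

```java
import java.util.regex.Pattern;

class ChunkBoundaryDemo {
    // Hypothetical helper: applies the replacement to each chunk in isolation,
    // the way a naive flush-per-chunk writer would.
    static String replacePerChunk(Pattern pattern, String replacement, String... chunks) {
        StringBuilder out = new StringBuilder();
        for (String chunk : chunks) {
            out.append(pattern.matcher(chunk).replaceAll(replacement));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("[ab]c");
        // Whole input at once: "ac" is found and replaced.
        System.out.println(p.matcher("xacy").replaceAll("X"));   // prints xXy
        // Same characters split as "xa" + "cy": the match straddles the boundary and is missed.
        System.out.println(replacePerChunk(p, "X", "xa", "cy")); // prints xacy
    }
}
```

The two outputs differ only because of where the chunk boundary fell, which is exactly the case a fixed-size buffer cannot rule out.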

Can a `Pattern` tell me the latest index in a `CharSequence` before which it could never find a match, even with more input? For example, given the regex `[ab]c` and the input `ab`, no match can start at `a` whatever comes next, so it should be safe to flush `a`; but I still need `b`, because the next character might be `c`. I've played with `Matcher.hitEnd()` and `Matcher.requireEnd()` to try to achieve this, but they don't give me enough information to know which part is safe to flush.
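For the `[ab]c` / `ab` example specifically, one way to probe this is to anchor an attempt at each index with `Matcher.region(...)` and `Matcher.lookingAt()`, then check `hitEnd()` for that single attempt. The sketch below is only that: `SafeFlushProbe` and `firstUndecidedIndex` are hypothetical names, the scan is quadratic in the worst case, and it assumes a pattern without lookbehind or anchors (the default region bounds hide text before the region start). It is not a claim that this works for arbitrary patterns.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SafeFlushProbe {
    // Hypothetical helper, not a JDK API. Returns the first index at which a match
    // starts, or at which an attempt ran into the end of the buffered input and so
    // might still succeed with more characters. Indices before it could be flushed
    // under the assumptions stated above.
    static int firstUndecidedIndex(Pattern pattern, CharSequence buffered) {
        Matcher m = pattern.matcher(buffered);
        int length = buffered.length();
        for (int i = 0; i < length; i++) {
            m.region(i, length);          // restrict the attempt to start exactly at i
            if (m.lookingAt() || m.hitEnd()) {
                return i;                 // a match, or a possible partial match, starts here
            }
        }
        return length;                    // nothing buffered can take part in a match
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("[ab]c");
        System.out.println(firstUndecidedIndex(p, "ab")); // prints 1: 'a' is decided, 'b' is not
    }
}
```

Probing each index separately matters here: after a failed `find()` over the whole buffer, `hitEnd()` says nothing about which start position reached the end.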

I'm wondering if, given the power of regular expressions, this is actually a provably unsolvable problem and I just don't know it...

Robert Elliot
  • Theoretically you could implement `FilterWriter` but without knowing the *specifics* of what you're wanting to do I can't really say more at this point – g00se Apr 24 '23 at 09:31
  • This would definitely become much more manageable if you restrict it to some subset of regex (or even use some alternate wildcarding). Do you really need the full power of regex for this? Even just limiting regex to match within a single line would help limit the buffer size (but not if there are "infinite length" lines in the stream). – Joachim Sauer Apr 24 '23 at 09:37
  • [This question is very similar](https://stackoverflow.com/questions/3013669/performing-regex-on-a-stream). – Joachim Sauer Apr 24 '23 at 09:42
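The two suggestions in these comments (a `FilterWriter`, and a restricted subset of regex) can be combined in a rough sketch. Everything here is an assumption for illustration: `BoundedRegexWriter`, its `maxMatchLength` parameter and the `rewrite` helper are hypothetical names, the caller must know an upper bound on match length, the pattern must never match the empty string and must not use lookaround, the replacement is literal text, and the class is not thread-safe.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: holds back only the last (maxMatchLength - 1) characters.
class BoundedRegexWriter extends FilterWriter {
    private final Pattern pattern;
    private final String replacement;   // written literally, no group references
    private final int maxMatchLength;   // caller-supplied upper bound on match length
    private final StringBuilder pending = new StringBuilder();

    BoundedRegexWriter(Writer out, Pattern pattern, String replacement, int maxMatchLength) {
        super(out);
        if (maxMatchLength < 1) throw new IllegalArgumentException("maxMatchLength must be >= 1");
        this.pattern = pattern;
        this.replacement = replacement;
        this.maxMatchLength = maxMatchLength;
    }

    @Override public void write(int c) throws IOException {
        pending.append((char) c);
        drain(false);
    }

    @Override public void write(char[] cbuf, int off, int len) throws IOException {
        pending.append(cbuf, off, len);
        drain(false);
    }

    @Override public void write(String str, int off, int len) throws IOException {
        pending.append(str, off, off + len);
        drain(false);
    }

    @Override public void close() throws IOException {
        drain(true);                     // end of stream: nothing left to wait for
        super.close();
    }

    private void drain(boolean atEnd) throws IOException {
        // A match starting before 'decided' has all maxMatchLength characters buffered.
        int decided = atEnd ? pending.length()
                            : Math.max(0, pending.length() - (maxMatchLength - 1));
        Matcher m = pattern.matcher(pending);
        int pos = 0;
        while (pos < decided && m.find(pos) && m.start() < decided) {
            out.append(pending, pos, m.start());  // text before the match, unchanged
            out.write(replacement);
            pos = m.end();
        }
        if (pos < decided) {
            out.append(pending, pos, decided);    // decided text containing no match
            pos = decided;
        }
        pending.delete(0, pos);                   // keep only the undecided tail
    }

    // Hypothetical convenience wrapper for trying the writer on in-memory chunks.
    static String rewrite(Pattern pattern, String replacement, int maxMatchLength, String... chunks) {
        StringWriter sink = new StringWriter();
        try (Writer w = new BoundedRegexWriter(sink, pattern, replacement, maxMatchLength)) {
            for (String chunk : chunks) w.write(chunk);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }
}
```

With `[ab]c` the bound is 2, so `BoundedRegexWriter.rewrite(Pattern.compile("[ab]c"), "X", 2, "xa", "cy")` yields `xXy` while never holding more than one undecided character between writes. Unbounded constructs such as `a+c` have no such bound, which is where this restriction stops helping.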

0 Answers