4

Good morning,

I am writing a language parser, and am looking for the best structure to use for a rollback cache which currently does the following:

  • When requesting a new character from the stream, the character is added to the cache, in case a rollback is requested.
  • When a rollback is requested, go back to a certain point in the cache so that when another character is requested, it gets it from there instead.
  • When a token is found, remove everything in the rollback cache up to the current position.

So in short, I would love to know which you feel to be the best data structure for:

  • Priority 1: appending characters (codePoints would be a welcome addition)
  • Priority 2: Doing a substring (such as StringBuilder.delete(...)) on the data structure (or clearing completely)
  • Priority 3: Being able to create a string of the cache (e.g. StringBuilder.toString())

I hope to hear from you soon!

Darkzaelus
  • 2,059
  • 1
  • 15
  • 31

2 Answers2

1

If I were you, for such a specialised use and with possible performance and resource constraints, I'd implement my own buffer from primitives. I think it's more trouble adapting existing structures. Of course, if it didn't hurt, I'd try to conform to the well known relevant interfaces, such as CharSequence, Appendable, List, etc.

entonio
  • 2,143
  • 1
  • 17
  • 27
  • to be honest, i'm not sure i'd know where to start! Should I look at the Ropes implementation and make my own workaround? – Darkzaelus May 14 '11 at 11:53
0

I suspect that a combination of a StringBuilder and PushbackReader will give you what you need. Use the StringBuilder to accumulate characters and create token Strings, and the PushbackReader's mark and reset methods to implement rollback.

Alternatively, pre-read the entire input file as a String and then implement the tokenizer by indexing the String and taking substrings.

Stephen C
  • 698,415
  • 94
  • 811
  • 1,216
  • I am currently calling this from within an abstracted BufferedReader which allows me to create the `mark()` and `reset()` methods myself (I needed multiple marks, with a non-finite rollback buffer). I originally had a BufferedReader which read in the code line by line, but this got ousted for two reasons, one that the code could all be on one line, and two because I wanted this rollback class to be extensible and work with the other two languages I have in development. – Darkzaelus May 14 '11 at 10:58