I'm trying to write an interpreter for LOLCODE that reads escaped strings from a file in the form:

VISIBLE "HAI \" WORLD!"

For which I wish to show an output of:

HAI " WORLD!

I have tried to dynamically generate a format string for printf in order to do this, but it seems that escape sequences are only processed at compile time, when a string literal is declared, and not at run time.

In essence, what I am looking for is exactly the opposite of this question: Convert characters in a c string to their escape sequences

Is there any way to go about this?

peteykun
  • `if(str[i] == '\\') { switch(str[++i]) { case 'a': printf("\a"); break; ... } }` Well, this seems to be the easiest way of going about doing things, but probably doesn't deal with all the escape characters. Is there a more elegant way? – peteykun Feb 14 '13 at 10:29

1 Answer

It's a pretty standard scanning exercise. Depending on how close you intend to be to the LOLCODE specification (which I can't seem to reach right now, so this is from memory), you've got a few ways to go.

Write a lexer by hand

It's not as hard as it sounds. You just want to analyze your input one character at a time, while maintaining a bit of context information. In your case, the important context consists of two flags:

  • one to remember you're currently lexing a string. It'll be set when reading the opening " and cleared when reading the closing (unescaped) ".
  • one to remember the previous character was an escape. It'll be set when reading \ and cleared when reading the character after that, no matter what it is.

The general algorithm then looks like this (pseudocode):

loop on: c ← read next character
  if not inString
    if c is '"' then clear buf; set inString
    else [out of scope here]
  else if inEscape then append c to buf; clear inEscape
  else if c is '"' then clear inString; return buf as result
  else if c is '\' then set inEscape
  else append c to buf
You might want to refine the inEscape case should you want to implement \r, \n and the like.
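
One possible refinement is a small lookup that maps the character after the backslash to the character it stands for. The sketch below assumes C-style escapes (\a, \n, \r, \t); that set is borrowed from C for illustration and is not taken from the LOLCODE specification. translate_escape and unescape are hypothetical names.

```c
/* Hypothetical helper: returns the character that an escape stands for.
 * Unknown escapes yield the character itself, so \" gives " and \\ gives \. */
char translate_escape(char c)
{
    switch (c) {
    case 'a': return '\a';
    case 'n': return '\n';
    case 'r': return '\r';
    case 't': return '\t';
    default:  return c;
    }
}

/* Rewrites s in place, replacing each backslash escape by the character it
 * stands for. A lone backslash at the very end is kept as it is. */
void unescape(char *s)
{
    char *out = s;  /* write position, never ahead of the read position */

    for (; *s != '\0'; ++s) {
        if (*s == '\\' && s[1] != '\0')
            *out++ = translate_escape(*++s);
        else
            *out++ = *s;
    }
    *out = '\0';
}
```

Working in place is safe here because the output is never longer than the input. A switch like this is essentially the approach from the comment under the question, just moved into its own function so the scanner stays small.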

Use a lexer generator

The traditional tools here are lex and flex.

Get inspiration

You're not the first one to write a LOLCODE interpreter. There's nothing wrong with peeking at how the others did it. For example, here's the string parsing code from lci.

JB.