1

I need to "translate" Pascal code (not whole programs, just lines like a:=5 or Writeln("a=5?")). In Pascal, = means "is equal", but my program replaces [^=!<>]=[^=!<>] with ==, so in writeln("a=5?") the = is also replaced with ==. How can I avoid replacing text inside quotes? I tried it with the keyword AND: [^\"].*AND.*[^\"] to &&. Is there any way to do this in a single replace per keyword?

I'm writing in Java.

Xazax
  • What tool are you using to do the search and replace? Also, it's not quite clear to me what exactly your requirements are. Please show some examples of the strings you want to match, what their replacements should look like, and which strings you do not want to match. – Tim Pietzcker Nov 06 '10 at 18:38
  • You could split your string into *quoted* and *unquoted* parts and do the replacement just on the *unquoted* parts. – Gumbo Nov 06 '10 at 18:41
  • I believe the question is: *How can I use a bloody callback in Java regular expression so I can match on the generic `[^\"].*(x).*[^\"]` form and have the replacement value put in as appropriate?* –  Nov 06 '10 at 19:30
  • This way I get the string which is not between quotes. I've done it by splitting the string. Thanks, and sorry for my English :( – Xazax Nov 06 '10 at 19:40
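
The split approach from the comments can be sketched roughly as follows. This is only an illustration, not the asker's actual code: the class and method names (`QuoteAwareReplace`, `translate`, `rewriteCode`) are hypothetical, string literals are assumed to be double-quoted with no escapes (as in the question's examples, although real Pascal uses single quotes), and the comparison rule is rewritten with lookarounds so that `:=` is left alone.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteAwareReplace {
    // Assumption: a literal is "..." with no escaped quotes inside.
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*\"");

    // A lone '=' that is not part of ':=', '==', '!=', '<=' or '>='.
    private static final Pattern LONE_EQUALS = Pattern.compile("(?<![:=!<>])=(?!=)");

    // The keyword AND as a whole word (Pascal keywords are case-insensitive).
    private static final Pattern AND =
            Pattern.compile("\\bAND\\b", Pattern.CASE_INSENSITIVE);

    // Copies quoted parts verbatim and rewrites only the text between them.
    static String translate(String line) {
        StringBuilder out = new StringBuilder();
        Matcher m = QUOTED.matcher(line);
        int last = 0;
        while (m.find()) {
            out.append(rewriteCode(line.substring(last, m.start())));
            out.append(m.group()); // quoted text is kept unchanged
            last = m.end();
        }
        // Whatever follows the last literal (or the whole line if none).
        out.append(rewriteCode(line.substring(last)));
        return out.toString();
    }

    // Applies the per-keyword replacements to an unquoted fragment.
    private static String rewriteCode(String code) {
        code = LONE_EQUALS.matcher(code).replaceAll("==");
        return AND.matcher(code).replaceAll("&&");
    }

    public static void main(String[] args) {
        // prints: if a==5 then writeln("a=5?")
        System.out.println(translate("if a=5 then writeln(\"a=5?\")"));
    }
}
```

An unterminated quote is not matched by `QUOTED` here, so the rest of such a line would be treated as code; a real translator would need to decide how to handle that case.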

3 Answers

0

A negative lookbehind will probably do the trick.

The negative lookbehind for quotes would be something like this: (?<!\")

Here's some further reading on how lookbehind matches work: Lookbehinds
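
As a rough check of what such a lookbehind does, here is a minimal sketch (the class name `LookbehindDemo` and its `replace` method are made up for illustration). A lookbehind inspects only the text directly before the match position, so it protects an = that immediately follows a quote, but not one further inside the string.

```java
import java.util.regex.Pattern;

class LookbehindDemo {
    // '=' that is NOT immediately preceded by a double quote.
    static final Pattern NOT_AFTER_QUOTE = Pattern.compile("(?<!\")=");

    static String replace(String line) {
        return NOT_AFTER_QUOTE.matcher(line).replaceAll("==");
    }

    public static void main(String[] args) {
        // '=' directly after the quote is skipped.
        // prints: writeln("=5?")
        System.out.println(replace("writeln(\"=5?\")"));

        // '=' deeper inside the string is still replaced.
        // prints: writeln("a==5?")
        System.out.println(replace("writeln(\"a=5?\")"));
    }
}
```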

Cody Snider
0

You cannot write a regex to parse Pascal, even this simple subset of it. If you simply look behind for a preceding quote, how do you propose to know that it's an opening quote and not a closing quote? Look into a parser generator like ANTLR, or, as a lighter-weight alternative, a parsing expression grammar library like parboiled.

Tom Crockett
  • Just because you can’t use the standard Java `Pattern` class does not mean that **no** programming language offers patterns that are powerful enough for real grammars. Perl, PCRE, and PHP all support recursion of capture groups and other features needed for writing real grammars using patterns. Nobody gives a flip about formalistic textbook definitions of regularity, nor considers capture groups anything but a feature. – tchrist Nov 07 '10 at 03:15
0

I believe the question is: How can I use a bloody callback in Java regular expression so I can match on the generic [^\"].*(x).*[^\"] form and have the replacement value put in as appropriate?

And the answer is -- not very easily using just the standard API. (This one very useful feature is simply missing.)

However, one can execute the regular expression and then use ugly string manipulation on the indexes returned by the Matcher object. A wrapper method that does all this and exposes a reusable interface is about 15 lines.

An actual example can be found here: Java equivalent to PHP's preg_replace_callback
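
A minimal sketch of such a wrapper, assuming Java 8 or later for `Function` and lambdas. The names `CallbackReplace`, `replaceAll` and `translateEquals` are hypothetical, and instead of raw index arithmetic it leans on the standard `Matcher.appendReplacement` and `Matcher.appendTail` methods. The example pattern matches either a whole double-quoted literal or a lone =, and the callback decides what to emit for each match.

```java
import java.util.function.Function;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CallbackReplace {
    // Replaces every match with whatever the callback returns for it.
    static String replaceAll(Pattern pattern, String input,
                             Function<MatchResult, String> callback) {
        Matcher m = pattern.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = callback.apply(m.toMatchResult());
            // quoteReplacement keeps '$' and '\' in the result literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Either a quoted literal (assumed "..." with no escapes) or a lone '='.
    static final Pattern QUOTED_OR_EQUALS =
            Pattern.compile("\"[^\"]*\"|(?<![:=!<>])=(?!=)");

    static String translateEquals(String line) {
        // Quoted matches are returned as-is; a bare '=' becomes '=='.
        return replaceAll(QUOTED_OR_EQUALS, line,
                r -> r.group().startsWith("\"") ? r.group() : "==");
    }

    public static void main(String[] args) {
        // prints: if a==5 then writeln("a=5?")
        System.out.println(translateEquals("if a=5 then writeln(\"a=5?\")"));
    }
}
```

Because the quoted alternative consumes the whole literal, the = inside it is never offered to the second alternative, which gives one pass per keyword. On Java 9 and later, `Matcher.replaceAll` also accepts a `Function<MatchResult, String>`, which may make a hand-written wrapper unnecessary.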

(And pay attention to what others have said about regular expressions being incapable of handling the full grammar of Pascal.)

Community
  • It’s just Java that does not support recursive patterns. Perl, PCRE, and even PHP all do. Once you can define things recursively, you can write real grammars. – tchrist Nov 07 '10 at 03:13
  • @tchrist Perl re's are hardly regular ;-) I don't believe that PCRE can cover (all) CFG though. Perl can cheat that though I believe -- at least with the execution blocks. –  Nov 07 '10 at 07:34
  • @tchrist For instance, validate: `((()(())))` given the productions: `S -> SS`, `S -> (S)`, `S -> ()` with a PCRE or PHP regex? –  Nov 07 '10 at 07:40
  • The PCRE manpage, *pcrepattern*, details how to go about writing recursive patterns in its section on RECURSIVE PATTERNS. I give a solution for matching balanced parens in [this answer](http://stackoverflow.com/questions/4031112/regular-expression-matching/4034386#4034386). I don’t use Perl code execution blocks like `(??{...})` to do so, I use pure pattern syntax, `\\((?:[^()]*+|(?0))*\\)`, which should work fine in PCRE, too. I agree that code inserts feel like cheating. We’ve only had fully recursive patterns without them for about 3 years, so maybe you aren’t aware of these. – tchrist Nov 07 '10 at 11:56