-2

I'm trying to fix this regex. It's meant to match any string of characters that contains no unescaped quotation marks and no unescaped newline characters:

([^"]|\\"|[^\n]|\\n)*

Would anyone mind helping out?

For example, I want to match:

The cow jumped over the \\"moon

but not:

The cow jumped over the "moon

The same applies to newlines.

sepp2k
edd91
  • 3
    Please provide an example string and expected matches. – OrderAndChaos Jul 28 '18 at 08:40
  • I'm not certain what you're asking for, but `([^"\n])*` will match any char except `"` and `\n`; see https://regex101.com/r/OezLhH/1. If you provide the information I asked for, maybe we can work out what you need. – OrderAndChaos Jul 28 '18 at 08:49
  • [`(?:\\"|\\\\n|[^"\\n])+`](https://regex101.com/r/OYhppH/1)? – 41686d6564 stands w. Palestine Jul 28 '18 at 08:51
  • @Sarcoma but I'd like to permit escaped quotation marks and newlines. So for example: The cow jumped over \" the moon. would be permitted, but The cow jumped over "the moon. would not – edd91 Jul 28 '18 at 08:55
  • @edd91 Did you check my regex? I'm assuming by "unescaped newline character" you mean `\n` and by "escaped newline character" you mean `\\n`. Is that right? – 41686d6564 stands w. Palestine Jul 28 '18 at 08:59
  • @edd91 Please add that example to your open post. – Paolo Jul 28 '18 at 09:17
  • Hi @AhmedAbdelhameed, thanks, I did. I'm using these in JLex (documentation: https://www.cs.princeton.edu/~appel/modern/java/JLex/current/manual.html#SECTION2.3) and I'm getting a weird error that each expression must be followed by ?, + or *. Obviously yours is followed by the +, so I suspect JLex doesn't support the ?: notation. – edd91 Jul 28 '18 at 09:19
  • 1
    Is it `The cow jumped over the \\"moon` or `The cow jumped over the \"moon` ? – anubhava Jul 28 '18 at 09:27
  • Shouldn't there be some form of quotes around the whole thing? Also your description should contain how your current regex does not fit your requirements and you should probably mention that you're using JLex in the question, not just the comments - otherwise people are going to keep missing it. – sepp2k Jul 28 '18 at 09:53
  • Thanks @sepp2k I hadn't apprehended at the time of the question it would be important, but I've learnt something now! – edd91 Jul 29 '18 at 03:23

1 Answer

0

You may use this regex:

^[^"\n\\]*(\\.[^"\n\\]*)*$

RegEx Demo
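
As a quick check outside regex101, the pattern can be run against the examples from the question. This is only a sketch: it uses Python's `re` module as a stand-in (JLex may behave differently), the helper name `accepts` is made up for illustration, and it assumes a quote is escaped by a single backslash. It also shows why the pattern from the question accepts everything: `[^\n]` matches a bare `"` and `[^"]` matches a bare newline, so the alternation never rejects a character.

```python
import re

# Pattern from the question: each character only has to satisfy ONE alternative.
ORIGINAL = re.compile(r'([^"]|\\"|[^\n]|\\n)*')
# Pattern from this answer.
FIXED = re.compile(r'^[^"\n\\]*(\\.[^"\n\\]*)*$')


def accepts(pattern, text):
    """Hypothetical helper: True if the pattern consumes the whole string."""
    return pattern.fullmatch(text) is not None


escaped_quote = r'The cow jumped over the \"moon'      # backslash + quote
bare_quote = 'The cow jumped over the "moon'           # unescaped quote
escaped_newline = r'first line\nsecond line'           # backslash + letter n
bare_newline = 'first line\nsecond line'               # real newline character

print(accepts(FIXED, escaped_quote))     # True
print(accepts(FIXED, bare_quote))        # False
print(accepts(FIXED, escaped_newline))   # True
print(accepts(FIXED, bare_newline))      # False

# The question's pattern lets the unescaped characters through.
print(accepts(ORIGINAL, bare_quote))     # True
print(accepts(ORIGINAL, bare_newline))   # True
```

Under the single-backslash assumption, a string containing `\\"` reads as an escaped backslash followed by an unescaped quote, so this pattern rejects it.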

RegEx Details:

  • ^: Start
  • [^"\n\\]*: Match 0 or more of any char except newline, backslash and "
  • (: Start group
    • \\: Match literal backslash
    • .: Match any character (except newline) following a \
    • [^"\n\\]*: Match 0 or more of any char except newline, backslash and "
  • )*: End group. Match 0 or more of this group
  • $: End
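
The breakdown above can also be written into the pattern itself. A minimal sketch, assuming Python's `re.VERBOSE` flag (a Python feature, not something JLex offers); the names `COMPACT`, `DETAILED` and `SAMPLES` are made up for illustration:

```python
import re

# The one-line form from the answer.
COMPACT = re.compile(r'^[^"\n\\]*(\\.[^"\n\\]*)*$')

# The same pattern spread out; re.VERBOSE ignores whitespace outside
# character classes and treats "#" as the start of a comment.
DETAILED = re.compile(r'''
    ^               # start of the input
    [^"\n\\]*       # zero or more chars other than quote, newline, backslash
    (               # start of the group
        \\          #   a literal backslash
        .           #   the escaped character (anything except newline)
        [^"\n\\]*   #   more ordinary characters
    )*              # end of the group, repeated zero or more times
    $               # end of the input
''', re.VERBOSE)

SAMPLES = [
    r'The cow jumped over the \"moon',
    'The cow jumped over the "moon',
    r'first line\nsecond line',
    'first line\nsecond line',
    '',
]

# Both spellings should agree on every sample.
for sample in SAMPLES:
    same = bool(COMPACT.fullmatch(sample)) == bool(DETAILED.fullmatch(sample))
    print(repr(sample), same)
```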
anubhava
  • 1
    The OP mentioned using JLex in comments, so the non-capturing group should just be a group (there are no capturing groups in JLex, so the non-capturing syntax is not needed and not supported). I also don't think OP wants to match whole lines only (at least there's no mention of that), so I wouldn't use anchors. – sepp2k Jul 28 '18 at 10:11
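
Following the comment above, here is a sketch of the same pattern without anchors and with a plain group, which is closer to how a lexer rule consumes input from the current position. Python's `re.match` is only a stand-in here, the helper name `longest_prefix` is hypothetical, and whether JLex accepts the expression verbatim (its quoting and escaping rules differ) is an assumption to check against the JLex manual.

```python
import re

# No ^ or $, and an ordinary group instead of (?:...).
UNANCHORED = re.compile(r'[^"\n\\]*(\\.[^"\n\\]*)*')


def longest_prefix(text):
    """Hypothetical helper: the part of text the pattern consumes from index 0.

    The pattern can match the empty string, so match() never returns None.
    """
    return UNANCHORED.match(text).group(0)


# Matching stops right before the first unescaped quote...
print(repr(longest_prefix('The cow jumped over the "moon')))

# ...and right before the first real newline; escaped quotes are consumed.
print(repr(longest_prefix(r'say \"hi\" now' + '\n' + 'next line')))
```

In a lexer the surrounding rule would then decide what to do with the character that stopped the match, for example treating the unescaped `"` as the end of a string literal.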