
I have a regular expression that matches strings that open with ", close with ", and may contain \".

The regular expression is \"".*[^\\]"\".

I don't understand the purpose of the " that follows the opening \" and of the " that follows [^\\].

Also, this regular expression works when I have a \n inside a string, even though . in flex doesn't match a \n.

For example, I just tested the string "aaaaa\naaa\naaaa", and it matched with no problem.

I wrote a flex regex that matches what I need: \"(([^\\\"])|([\\\"]))*\". I understand how this one works, though.

I also just tested my solutions against an empty string, "". They don't work. I have also tested the patterns from everyone who answered, and they don't work either.

Kostas Dimakis
  • Your new pattern is false for the same reason I explained in my answer (it is unable to match a literal backslash before the closing quote.) – Casimir et Hippolyte Mar 13 '15 at 13:41
  • @CasimiretHippolyte Yes you are quite right it doesn't work. Although that's no reason to down vote my question as well as my answer. – Kostas Dimakis Mar 13 '15 at 20:02

3 Answers


The pattern is a little naive and, in fact, wrong. It doesn't handle escaped quotes correctly because it assumes that the closing quote is the first one that is not preceded by a backslash. That assumption is false.

The closing quote can be preceded by a literal backslash (a backslash that is escaped with another backslash, so the second backslash no longer escapes the quote). For example: "abcde\\" (the content of this string is abcde\).
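A rough way to see the failure, assuming Python's re module as a stand-in for flex. The two tools quote patterns differently, so `NAIVE` below is only a transcription of the idea "quote, anything, one non-backslash character, quote", not the flex source; `NAIVE` and `naive_matches` are names made up for this sketch.

```python
import re

# Transcription of the naive idea (an assumption, not the flex pattern itself):
# opening quote, any characters, one character that is not a backslash, closing quote.
NAIVE = re.compile(r'".*[^\\]"')


def naive_matches(text: str) -> bool:
    """Return True if the whole text is accepted by the naive pattern."""
    return NAIVE.fullmatch(text) is not None


# An ordinary string is accepted.
print(naive_matches(r'"abc"'))       # True

# A string whose content ends with a literal backslash is rejected,
# because the character before the closing quote is a backslash.
print(naive_matches(r'"abcde\\"'))   # False

# The empty string is rejected too: [^\\] needs one character of content.
print(naive_matches('""'))           # False

# The greedy .* also runs across two separate strings on the same line.
print(NAIVE.search('"a" + "b"').group(0))   # "a" + "b"
```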

This is the pattern to deal with all cases:

\"[^"\\]*(?s:\\.[^"\\]*)*\"

or perhaps (I don't know exactly where you need to escape literal quotes in a flex pattern):

\"[^\"\\]*(?s:\\.[^\"\\]*)*\"

Note that the s modifier allows the dot to match newlines inside the non-capturing group.
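The same pattern can be tried outside flex. This is a minimal sketch assuming Python's re module, which accepts the scoped `(?s:...)` syntax; it shows what the pattern accepts as a regular expression in general and says nothing about how a given flex version parses it. `STRING` and `is_string_literal` are hypothetical names.

```python
import re

# Opening quote, a run of ordinary characters, then any number of
# "backslash + any character" pairs each followed by ordinary characters,
# then the closing quote.
STRING = re.compile(r'"[^"\\]*(?s:\\.[^"\\]*)*"')


def is_string_literal(text: str) -> bool:
    """Return True if the whole text is one quoted string."""
    return STRING.fullmatch(text) is not None


print(is_string_literal('""'))             # True: empty string
print(is_string_literal(r'"a\"b"'))        # True: escaped quote inside
print(is_string_literal(r'"abcde\\"'))     # True: literal backslash before the closing quote
print(is_string_literal(r'"abc\"'))        # False: the last quote is escaped, so the string never closes
print(is_string_literal('"a" + "b"'))      # False: two strings, not one
```

Every backslash is consumed together with the character after it, so a quote can only close the string when it is not part of such a pair.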

Casimir et Hippolyte
  • Your first regex causes a flex error. Your second regex doesn't match `"aaa\n\"\"\n\\aaa"` or `"aaa\n\"\"\n\\aaa\\"`. The one that I propose matches `"aaa\n\"\"\n\\aaa"` but not `"aaa\n\"\"\n\\aaa\\"`. Thank you for your answer. – Kostas Dimakis Mar 13 '15 at 20:25

I just figured out everything :P

This \"".*[^\\]"\" works because in flex it means: match something that starts with " and ends with ". Inside these quotes there is another pattern (that's why there are the unexplained " characters I was wondering about in my question) that can be any sequence of characters but CANNOT end with \.

What confused me more was the use of ., because in flex it matches any character except a newline \n. So I mistakenly thought that it wouldn't match a string such as "aaa\naaa".

But in reality it does match, because flex reads the \ first and then the n, as two separate characters.

A TRUE newline would be something like this:

"something
like this"

But C compilers in -ansi mode, for example (I haven't tested other modes), do not let you write a string literal that spans several lines.
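The difference between the two-character sequence \n and a real newline can be checked with a small sketch. It assumes Python's re module, where . also skips newlines by default; `DOT_ONLY`, `escaped` and `real` are names invented for the example.

```python
import re

# A quoted string whose body is matched with the dot only.
DOT_ONLY = re.compile(r'".*"')

escaped = r'"aaa\naaa"'   # backslash followed by the letter n: two ordinary characters
real = '"aaa\naaa"'       # an actual newline character between the two halves

print(len(escaped), len(real))                    # 10 9
print(DOT_ONLY.fullmatch(escaped) is not None)    # True: . matches \ and then n
print(DOT_ONLY.fullmatch(real) is not None)       # False: . does not match a newline
```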

I hope my answer is clear enough. Cheers.

Kostas Dimakis
  • You can [accept your own answer](http://meta.stackexchange.com/questions/16930/is-it-ok-to-answer-your-own-question-and-accept-it) by clicking on the tick. This removes the question from the unanswered queue. – Brian Tompsett - 汤莱恩 Apr 24 '15 at 13:16
  • @BrianTompsett None of the answers is correct, mine included. If you have one, please share it. – Kostas Dimakis Apr 25 '15 at 14:52
  • Thanks for the clarification; however if your answer is not an answer it should have really been a question edit! The words *"I just figured out everything"* & *"I hope my answer is clear enough"* imply you were happy with it and felt it correct. Yes: I can answer it correctly, now you have clarified. – Brian Tompsett - 汤莱恩 Apr 25 '15 at 16:59
  • @BrianTompsett Thank you, you are right, it should have been an edit. Although at the time I thought I had figured it out. Waiting for your answer :) – Kostas Dimakis Apr 25 '15 at 17:04
  • A flex program containing your regular expression does not work (just tested it) - are you wanting one that does work? If you want your expression explained can you edit the Q to include a whole flex program please. I think you are confused by what you have. – Brian Tompsett - 汤莱恩 Apr 25 '15 at 18:44
  • @BrianTompsett I've figured out exactly what my regex does. It doesn't solve the problem, though. I've explained my regex in full detail in my (false) answer. What I would like is a solution that works. No need for explanations, I think. There's a lot of info in all our answers and comments for people to read. – Kostas Dimakis Apr 25 '15 at 19:59
  • I have several solutions lined up. Your problem specification is not unambiguous. Do you want something that matches strings over several lines? – Brian Tompsett - 汤莱恩 Apr 25 '15 at 20:04
  • @BrianTompsett No, not several lines, as that's wrong. – Kostas Dimakis Apr 25 '15 at 20:13
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/76253/discussion-between-brian-tompsett-and-konstantinos). – Brian Tompsett - 汤莱恩 Apr 25 '15 at 20:16

Your pattern does not match "hello" but it matches ""hello"".

If you want to match anything that is in quotes and may contain \", try something like:

/(\"[\na-zA-Z\\"]*\")/gs
Eduardo Ramos