I am trying to remove the nested double quotes from a string in C# using Regex, but have not managed it so far. Below are the sample text and the code I tried:

string html = "<img src=\"imagename=\"g1\"\" alt = \"\">";
string output = string.Empty;
Regex reg = new Regex(@"([^\^,\r\n])""""+(?=[^$,\r\n])", RegexOptions.Multiline); 
output = reg.Replace(html, @"$1");

The code above produces this output:

"<img src="imagename="g1 alt = >"

The output I am actually looking for is:

"<img src="imagename=g1" alt = "">"

Please suggest how to correct the above code.

Jack
  • Be sure you read [this](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags?rq=1) before using regex to manipulate HTML. – Sriram Sakthivel Aug 15 '14 at 09:44
  • The above string can't be parsed as XML precisely because of the nested double quotes, so the suggested thread does not help. – Jack Aug 15 '14 at 09:54

1 Answer

Pattern : \s*"\s*([^ "]+)"\s*(?=[">])|(?<=")("")(?=")

Replacement : $1

Here is a demo, tested at regexstorm.

String literals for use in programs:

@"\s*""\s*([^ ""]+)""\s*(?=["">])|(?<="")("""")(?="")"
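
As a minimal sketch of how that literal could be plugged into `Regex.Replace`, assuming the sample string from the question: the class and method names (`NestedQuoteDemo`, `StripNestedQuotes`) are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class NestedQuoteDemo
{
    // First pattern from the answer, written as a C# verbatim string
    // (each "" inside the literal stands for one double quote).
    const string Pattern = @"\s*""\s*([^ ""]+)""\s*(?=["">])|(?<="")("""")(?="")";

    // Hypothetical helper: replaces each match with capture group 1,
    // i.e. the text that sat between the inner quotes.
    public static string StripNestedQuotes(string html)
    {
        return Regex.Replace(html, Pattern, "$1");
    }

    static void Main()
    {
        string html = "<img src=\"imagename=\"g1\"\" alt = \"\">";
        Console.WriteLine(StripNestedQuotes(html));
        // prints: <img src="imagename=g1" alt = "">
    }
}
```

Only the `"g1"` part matches here: the lookahead `(?=[">])` requires the closing inner quote to be followed by another quote or `>`, which is why the outer quotes and the empty `alt` value are left alone.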

To keep it simpler and more precise, here is a pattern that targets only the src attribute value:

Pattern : (\bsrc="[^ =]+=)"([^ "]+")"

Replacement : $1$2

Here is an online demo, tested at regexstorm.

String literals for use in programs:

@"(\bsrc=""[^ =]+=)""([^ ""]+"")"""

Note: I assume attribute values don't contain any spaces.
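
A short sketch of the src-only pattern in use, which also illustrates the no-spaces assumption from the note. The names `SrcQuoteDemo` and `FixSrc` and the second input string are hypothetical, chosen only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

static class SrcQuoteDemo
{
    // Second pattern from the answer as a verbatim string.
    // Group 1: src="name=   Group 2: the inner value plus its closing quote.
    const string Pattern = @"(\bsrc=""[^ =]+=)""([^ ""]+"")""";

    // Hypothetical helper: drops the quote before the inner value
    // and the duplicated quote after it.
    public static string FixSrc(string html)
    {
        return Regex.Replace(html, Pattern, "$1$2");
    }

    static void Main()
    {
        // Sample from the question.
        Console.WriteLine(FixSrc("<img src=\"imagename=\"g1\"\" alt = \"\">"));
        // prints: <img src="imagename=g1" alt = "">

        // Made-up input with a space inside the value: the pattern does not
        // match, so the string comes back unchanged.
        Console.WriteLine(FixSrc("<img src=\"image name=\"g1\"\" alt=\"\">"));
        // prints: <img src="image name="g1"" alt="">
    }
}
```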

Braj
  • `(=\s*")([^"]*=)"([^"]*")"`, with replacement `$1$2$3` was my refinement of your first regexp, to deal with inner quote containing spaces. This does not handle `=""abc""` but OP gives little indication that one should. – Taemyr Aug 15 '14 at 10:27
  • @user3218114, this gives exactly the output I required, thanks a lot! – Jack Aug 15 '14 at 10:40
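
For completeness, the refinement from Taemyr's comment could be tried the same way. This is only a sketch: `SpacedQuoteDemo` and `FixNested` are hypothetical names, and the second input is an invented example of a value containing spaces.

```csharp
using System;
using System.Text.RegularExpressions;

static class SpacedQuoteDemo
{
    // Pattern from the comment, as a verbatim string. It allows any
    // character except a double quote inside the value, so spaces are fine.
    const string Pattern = @"(=\s*"")([^""]*=)""([^""]*"")""";

    // Hypothetical helper using the replacement given in the comment.
    public static string FixNested(string html)
    {
        return Regex.Replace(html, Pattern, "$1$2$3");
    }

    static void Main()
    {
        Console.WriteLine(FixNested("<img src=\"imagename=\"g1\"\" alt = \"\">"));
        // prints: <img src="imagename=g1" alt = "">

        Console.WriteLine(FixNested("<img src=\"image name=\"g 1\"\" alt=\"\">"));
        // prints: <img src="image name=g 1" alt="">
    }
}
```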