I have a string from which I need to remove some characters that end in backslash doublequote. There are multiple matches. I have it to where it ALMOST works, except I can't get rid of the last backslash double quote (\") in each place that the namespace occurs.

I went to regexpal.com and came up with this regex string that does what I want.

xmlns=*.+be/\\"

But when I put it in C#, the two backslashes make it grab way too much. This code reproduces my issue and shows my progress:

var str = "<Request>  <sender xmlns=\"http://stuff.otherstuff.be/\">    <name>Sender name</name>    </sender>  <addressee xmlns=\"http://some.stuff.be/\"> </addressee>  <networkType xmlns=\"http://yet.more.stuff.be/\">11</networkType></Request>";

str = Regex.Replace(str, @"xmlns=.*?\.be/", "", RegexOptions.IgnoreCase);

I wind up with a string that looks like this. I need to modify the regex a bit so that it also catches the backslash and double quote.

<Request>  
    <sender \">    
         <name>Sender name</name>    
    </sender>  
    <addressee \"> 
    </addressee>  
    <networkType \">11</networkType>
</Request>

I've tried various combinations of multiple backslashes and multiple double quotes but am not getting it.

I have looked at a lot of answers here and elsewhere, and haven't figured it out, so a "has duplicate" isn't really going to help me.

EDIT: At this point in the code all I have is a string that came from a serialized class. I don't really want to load the string into an XMLDocument and do recursive calls like in the possible answer shown. A quick regex replace should get me what I need in one statement.

EDIT: The answer with adding two doublequotes does not help me because it ignores the final backslash that I'm trying to get rid of.

Wai Ha Lee
Brad Boyce
  • Your `str` looks like ` Sender name 11`. There are no ``\"`` literal strings inside. Is that your input? – Wiktor Stribiżew Sep 17 '15 at 22:31
  • my bad. The backslash should be in .be/\" – Brad Boyce Sep 17 '15 at 22:32
  • Try `str2 = Regex.Replace(str2, @"\s*xmlns=.*?\.be/""", "", RegexOptions.IgnoreCase);`. However, I guess there are better ways to remove XML namespaces. – Wiktor Stribiżew Sep 17 '15 at 22:35
  • actually, no - it is correct. The backslash doublequote .be/\" is there in the string 3 times. I see it here in my question and in my code when I run. Not sure why it doesn't show for what you posted – Brad Boyce Sep 17 '15 at 22:36
  • This is coming from a xml serialized class that I don't have control over. If there is a better way, I'd like to know about it. – Brad Boyce Sep 17 '15 at 22:37
  • Isn't [this post](http://stackoverflow.com/questions/987135/how-to-remove-all-namespaces-from-xml-with-c) helpful? If you decide that it is, and decide to use any of the solutions discussed there, just remove this question. – Wiktor Stribiżew Sep 17 '15 at 22:38
  • No, the post isn't helpful, but thank you. – Brad Boyce Sep 18 '15 at 12:24
  • Not fixed. Still there. Not sure why I got downvoted. It's a real issue and I'm being as clear as I can. It needs to be a string replace and it ends in a backslash doublequote. – Brad Boyce Sep 18 '15 at 12:28
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/90023/discussion-between-stribizhev-and-brad-boyce). – Wiktor Stribiżew Sep 18 '15 at 12:30

2 Answers

You need to add the trailing quote like this (in a verbatim string, i.e. the @ syntax, you must write "" to get a single quote character):

str = Regex.Replace(str, @"xmlns=.*?\.be/""", "", RegexOptions.IgnoreCase);

Add a space at the beginning if you want <sender> instead of <sender >:

str = Regex.Replace(str, @" xmlns=.*?\.be/""", "", RegexOptions.IgnoreCase);
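
A likely source of the confusion is that `\"` in a regular C# string literal is only an escape sequence for a quote, so the string held in memory contains no backslash at all, even though a debugger may display it in escaped form. The following is a minimal sketch of that, using a shortened version of the input from the question; the `Program` wrapper and the shortened input are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // In a regular (non-verbatim) literal, \" is an escape for one quote character.
        var str = "<sender xmlns=\"http://stuff.otherstuff.be/\"><name>Sender name</name></sender>";

        // The runtime string therefore holds no backslash.
        Console.WriteLine(str.Contains("\\")); // False

        // In a verbatim literal, "" stands for one quote, so the pattern ends in: \.be/"
        str = Regex.Replace(str, @" xmlns=.*?\.be/""", "", RegexOptions.IgnoreCase);
        Console.WriteLine(str); // <sender><name>Sender name</name></sender>
    }
}
```

If `Contains("\\")` prints `False` for your real input as well, there is no backslash to match and the pattern above is sufficient.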
Matthew Strawbridge
  • Thank you for the answer, but I really do need to match a backslash before the final doublequote. I need to match this in multiple places xmlns=\"http://stuff.otherstuff.be/\" – Brad Boyce Sep 17 '15 at 22:43
  • I don't understand it; the slashes are being wiped out. The string to be matched should end in [period lowercase b lowercase e forwardslash backslash doublequote] – Brad Boyce Sep 17 '15 at 22:44
  • @mathew strawbridge , can you show me a regex string that will match a set of characters that ends in forwardslash, backslash, doublequote? – Brad Boyce Sep 18 '15 at 12:24

Note that to remove XML namespaces you can use the regular C# code described at [How to remove all namespaces from XML with C#?](http://stackoverflow.com/questions/987135/how-to-remove-all-namespaces-from-xml-with-c), but since you say that does not help, here is a solution for your special case.

To handle either kind of slash you may use a character class [/\\], which matches one character that is either / or \. Note that a literal backslash must be doubled in the regex pattern even inside a verbatim string literal, because the backslash is the regex escape character.

The regex will look like

\s*xmlns=[^<]*?\.be[/\\]"

And in C#:

var rx = new Regex(@"\s*xmlns=[^<]*?\.be[/\\]""");
str = rx.Replace(str, string.Empty);

The \s* also consumes the whitespace in front of xmlns, so no stray space is left inside the tag after the replacement.
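
If the text really does contain the two characters backslash and double quote (for example because it was escaped a second time before reaching this code, which is an assumption and not something the question confirms), the pattern has to match the backslash explicitly. A minimal sketch, with a hypothetical `escaped` input built for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Hypothetical input that really contains \ followed by " (verbatim literal: "" is one quote).
        var escaped = @"<sender xmlns=\""http://stuff.otherstuff.be/\""><name>Sender name</name></sender>";
        Console.WriteLine(escaped.Contains("\\")); // True

        // In the pattern, \\ matches one literal backslash and "" is one quote.
        var rx = new Regex(@"\s*xmlns=\\""[^<]*?\.be/\\""", RegexOptions.IgnoreCase);
        Console.WriteLine(rx.Replace(escaped, string.Empty));
        // <sender><name>Sender name</name></sender>
    }
}
```

Here the pattern ends in forward slash, backslash, double quote, which is the sequence asked about in the comments.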

Replacing each match with string.Empty removes every xmlns attribute along with the whitespace in front of it.

Community
Wiktor Stribiżew