Using C#, I download a website's HTML source. I want to replace whatever text appears between

<span class="comment-name" title="

and

">

I am not sure how I am supposed to do this. I have been trying to use Regex.

Chris Schaller
games dl
  • There are many ways to accomplish this; please update with what you have tried. – Trevor Dec 05 '19 at 21:29
  • Always be cautious when trying to [parse HTML with regex](https://stackoverflow.com/a/1732454/4665). In general you may be better off using an HTML parser like Html Agility Pack. Select the node with class **comment-name** and remove the title attribute. – Jon P Dec 05 '19 at 23:56

2 Answers

Pretty simple: just write a function like this:

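string Between(string str, string firstString, string lastString)
{
    // Locate the start marker; return null if it is missing.
    int start = str.IndexOf(firstString);
    if (start < 0) return null;
    int pos1 = start + firstString.Length;
    // Look for the end marker only after the start marker.
    int pos2 = str.IndexOf(lastString, pos1);
    return pos2 < 0 ? null : str.Substring(pos1, pos2 - pos1);
}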

Then call it like this:

string myString = Between(mainString, "title=\"", "\"");
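
Since the question asks to replace the text rather than just read it, the same IndexOf approach can be turned into a replacement. The following is only a minimal sketch: ReplaceBetween and SpanTitleDemo are hypothetical names, not part of any library, and it assumes the markers appear exactly as written in the question and that only the first occurrence needs changing.

```csharp
using System;

public static class SpanTitleDemo
{
    // Hypothetical helper: returns str with the text between the first
    // occurrence of firstString and the next lastString replaced.
    // If either marker is missing, the input is returned unchanged.
    public static string ReplaceBetween(string str, string firstString,
                                        string lastString, string replacement)
    {
        int start = str.IndexOf(firstString, StringComparison.Ordinal);
        if (start < 0) return str;

        int contentStart = start + firstString.Length;
        int contentEnd = str.IndexOf(lastString, contentStart, StringComparison.Ordinal);
        if (contentEnd < 0) return str;

        // Keep everything up to the end of the start marker, insert the
        // replacement, then keep everything from the end marker onwards.
        return str.Substring(0, contentStart) + replacement + str.Substring(contentEnd);
    }

    public static void Main()
    {
        // Assumed sample input, made up for illustration.
        string html = "<p><span class=\"comment-name\" title=\"Alice\">Alice</span></p>";
        string result = ReplaceBetween(
            html, "<span class=\"comment-name\" title=\"", "\">", "hidden");
        Console.WriteLine(result);
    }
}
```

Passing the full opening tag as the start marker, instead of just title=", keeps the replacement from touching title attributes on other elements.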


Software Dev
  • Of course, if the body text contains the string title="Something", then that will also be replaced. – Jon P Dec 05 '19 at 23:56

If the whole tag is constant (always: <span class="comment-name" title="...">), you can use this Regex pattern: (<span class=\"comment-name\" title=\")[^\"]+(\">)

Then replace each match with the first capture group (the opening tag up to and including the title's opening quote), the replacement text, and the second capture group (the closing quote and the end of the tag), like so: $1REPLACE$2 (note: substitute whatever text you need for REPLACE)

This replacement changes: <span class="comment-name" title="..."> to <span class="comment-name" title="REPLACE">

In C#, you can do this in one line:

text = Regex.Replace(text, "(<span class=\"comment-name\" title=\")[^\"]+(\">)", "$1REPLACE$2");
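
For a fuller picture, here is a rough sketch of the same idea wrapped in a small program. The class and method names (CommentTitleRegexDemo, ReplaceTitles) are made up for illustration. It differs from the one-liner in two ways, both my own assumptions rather than requirements: it uses [^\"]* instead of [^\"]+ so that empty titles are also matched, and it uses a MatchEvaluator lambda so that a replacement containing $ or starting with a digit is inserted literally instead of being read as a substitution pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentTitleRegexDemo
{
    // Group 1: opening tag up to and including the title's opening quote.
    // Group 2: closing quote and end of the tag.
    private static readonly Regex TitlePattern =
        new Regex("(<span class=\"comment-name\" title=\")[^\"]*(\">)");

    // Hypothetical helper: replaces the title value of every matching span.
    public static string ReplaceTitles(string html, string replacement)
    {
        // The lambda rebuilds each match around the literal replacement text.
        return TitlePattern.Replace(
            html, m => m.Groups[1].Value + replacement + m.Groups[2].Value);
    }

    public static void Main()
    {
        // Assumed sample input, made up for illustration.
        string html =
            "<span class=\"comment-name\" title=\"Alice\">A</span>" +
            "<span class=\"comment-name\" title=\"\">B</span>";
        Console.WriteLine(ReplaceTitles(html, "REPLACE"));
    }
}
```

As the comments on the question point out, this only holds up while the tag is written exactly this way; if the attribute order or spacing can vary, an HTML parser is the safer choice.
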
dvo