Good Day,

I have some HTML input on which I want to perform a search-and-replace operation.

string html = @"
    <div class=""left bottom-margin"">
    <input id=""0086"" maxlength=""29"" data-src=""200:80"" type=""text""><br />
    <input id=""0087"" maxlength=""38"" data-src=""201:80"" type=""text""><br />
    <input id=""0088"" maxlength=""38"" data-src=""202:80"" type=""text""><br />
</div>";    

// Here we call Regex.Match.
Match match = Regex.Match(html, @"(<input.*id=""0087"".*?>)", RegexOptions.IgnoreCase);

// Here we check the Match instance.
if (match.Success)
{
    // Finally, we get the Group value and display it.
    string key = match.Groups[1].Value;
    Console.WriteLine(key);
} else {
    Console.WriteLine("No Match...");
}

This code works so far, but I want to be able to pass a parameter into the pattern given to Regex.Match. Is this possible? What if I wanted to search for 0086 or 0088 as the id? I have a couple hundred tags like this, and I want to be able to find each HTML tag by providing its id as a parameter.

I understand that the @ makes the string verbatim.

But I've tried doing this:

// string pattern = "(<input.*id=\"\"0087\"\".*?>)";
// string pattern = "(<input.*id=\"\"" + "0087" + "\"\".*?>)";

This doesn't work either. Most of the Regex.Match samples I've seen use the @ verbatim symbol to do the actual matching. Is my understanding of this correct?

Any suggestions?

John Saunders
coson
  • Why do you have pairs of escaped double-quotes in `"(<input.*id=\"\"0087\"\".*?>)"`? This will result in a regex pattern `(<input.*id=""0087"".*?>)` - certainly not what you want... –  Aug 06 '14 at 01:21

2 Answers

You can't supply a parameter to a regular expression. But you could avoid trying to coerce regular expressions into being an HTML parser:

  • If your document contains valid markup, you can load it into an XmlDocument and apply the desired transformations in any of a number of ways:
    • programmatically, using XPath queries,
    • by traversing the document to find the nodes you're interested in,
    • by applying an XSLT transformation, or
    • by using LINQ to XML.
  • Or you could install the HTML Agility Pack via NuGet, load your document into an HtmlDocument, and use its transformation capabilities.
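
As a rough illustration of the first option, here is a minimal LINQ to XML sketch. It assumes the markup is well-formed XML, which means each `<input>` has to be self-closed (the HTML in the question is not, so it would need cleaning up first). The class name `XmlLookupSketch` and the helper `FindInputById` are hypothetical names chosen for this example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class XmlLookupSketch
{
    // Hypothetical helper: returns the first <input> element whose id
    // attribute equals the given value, or null if there is none.
    public static XElement FindInputById(string xhtml, string id)
    {
        XElement root = XElement.Parse(xhtml);
        return root.Descendants("input")
                   .FirstOrDefault(e => (string)e.Attribute("id") == id);
    }

    static void Main()
    {
        // Assumption: well-formed markup, so every <input> is self-closed.
        string xhtml = @"
<div class=""left bottom-margin"">
    <input id=""0086"" maxlength=""29"" data-src=""200:80"" type=""text"" /><br />
    <input id=""0087"" maxlength=""38"" data-src=""201:80"" type=""text"" /><br />
    <input id=""0088"" maxlength=""38"" data-src=""202:80"" type=""text"" /><br />
</div>";

        XElement hit = FindInputById(xhtml, "0087");
        Console.WriteLine(hit != null ? hit.ToString() : "No Match...");
    }
}
```

Because the id is compared as a plain string against a parsed attribute, no escaping of quotes or regex metacharacters is needed here.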

If you're determined to use regular expressions, you can:

  • Build your regular expression on the fly, something like

    Regex ConstructRegex( int id )
    {
      // {0:0000} zero-pads the id to four digits, e.g. 87 -> "0087"
      string pattern = string.Format( @"(<input.*id=""{0:0000}"".*?>)" , id ) ;
      Regex instance = new Regex( pattern ) ;
      return instance ;
    }
    
  • Make your regular expression generic and supply a MatchEvaluator/Func<Match,string> to apply the desired transformations to each match (if required):

    static readonly Regex rx = new Regex( @"(<input.*id=""(?<id>\d\d\d\d)"".*?>)" ) ;
    
    string Transform( string html , Func<Match,string> transform )
    {
      // Regex.Replace expects a MatchEvaluator, so wrap the Func<Match,string>
      string transformed = rx.Replace( html, new MatchEvaluator( transform ) ) ;
      return transformed ;
    }
    

    Which you could use thus:

    string raw    = "some html here" ;
    string cooked = Transform( raw , m => {
        int id = int.Parse( m.Groups["id"].Value ) ;
        string s = m.Value ; // by default, leave the matched tag unchanged
        if ( id == 86 )
        {
          s = apply_some_transformation_here(m.Value) ;
        }
        return s ;
      }) ;
    
Nicholas Carey

How about this:

string pattern = String.Format(@"(<input.*id=""{0}"".*?>)", "0087");

It looks like it works fine for me.

Actually, even this works as well:

string pattern = @"(<input.*id=""" + "0087" + @""".*?>)";
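
Both versions work because a doubled quote (`""`) inside a verbatim string yields a single `"` character, whereas `\"\"` in a regular string yields two, so the attempts in the question were looking for `id=""0087""`. A minimal sketch of how this could be wrapped in a reusable method follows. The names `InputTagFinder`, `BuildPattern`, and `FindInputTag` are hypothetical, and the pattern is deliberately tightened compared to the one above: `[^>]*` replaces `.*` to keep the match inside a single tag, and `Regex.Escape` is added in case an id ever contains regex metacharacters.

```csharp
using System;
using System.Text.RegularExpressions;

static class InputTagFinder
{
    // Hypothetical helper: builds the pattern for one id.
    // In a verbatim string, "" stands for a single " character.
    public static string BuildPattern(string id)
    {
        return @"(<input\b[^>]*\sid=""" + Regex.Escape(id) + @"""[^>]*>)";
    }

    // Hypothetical helper: returns the matching <input ...> tag, or null.
    public static string FindInputTag(string html, string id)
    {
        Match match = Regex.Match(html, BuildPattern(id), RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[1].Value : null;
    }

    static void Main()
    {
        string html = @"
    <div class=""left bottom-margin"">
    <input id=""0086"" maxlength=""29"" data-src=""200:80"" type=""text""><br />
    <input id=""0087"" maxlength=""38"" data-src=""201:80"" type=""text""><br />
    <input id=""0088"" maxlength=""38"" data-src=""202:80"" type=""text""><br />
</div>";

        foreach (string id in new[] { "0086", "0087", "0088" })
        {
            Console.WriteLine(FindInputTag(html, id) ?? "No Match...");
        }

        // The quoting difference behind the failed attempts in the question:
        Console.WriteLine(@"id=""0087""");     // prints id="0087"
        Console.WriteLine("id=\"\"0087\"\"");  // prints id=""0087""
    }
}
```

This is still a regex applied to HTML, so it remains fragile for markup that uses single-quoted or unquoted attribute values.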
rfernandes
  • I guess that's what happens when you haven't had much sleep in one day. So close and yet so far. thanks, mate. – coson Aug 06 '14 at 16:29