
HTML source

<form>
<input type="text" name="a" value="a1fa4" type="hidden"/>
<input type="text" name="b" value="b1fa9" type="hidden"/>
<input type="text" name="c" value="c1fd2" type="hidden"/>
<input type="text" name="d" value="d1fx1" type="hidden"/>
</form>

PHP source

<?php
  preg_match_all('/<input name="(.*?)" value="(.*?)" type="hidden"\/>/i', $form, $input);

  $var = array();

  for($i=0;$i<count($input[1]);$i++){
    $var[$input[1][$i]] = $input[2][$i];
  }
?>

C# source

Match match = Regex.Match(html, "<input name=\"(.*?)\" value=\"(.*?)\" type=\"hidden\"/>", RegexOptions.IgnoreCase );
while (match.Success)
{
    System.Console.WriteLine(" {0} {1} ", match.Value, match.Index);  
}

The PHP code works, but the C# code does not. How can I fix the C# code? Thanks!

keyser
user808186
  • Requests for [just code](http://stuck.include-once.org/#help5) are usually off-topic. Primary site intent is coding approaches, not readymade solutions, nor [tutoring](http://stuck.include-once.org/#help6) per se. – hakre Sep 29 '12 at 08:28

2 Answers


If you want to parse your HTML with a real HTML parser (HtmlAgilityPack) instead of a regex:

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);

var dict =  doc.DocumentNode
       .Descendants("input")
       .ToDictionary(n=>n.Attributes["name"].Value,n=>n.Attributes["value"].Value);
L.B
  • Thanks again, it worked! I added a filter so that only hidden input elements are kept: var dict = doc.DocumentNode.Descendants("input").Where(n => n.Attributes["type"] != null && n.Attributes["type"].Value == "hidden").ToDictionary(n => n.Attributes["name"].Value, n => n.Attributes["value"].Value); – user808186 Sep 29 '12 at 10:07

The problem with your regular expression is that it omits the type="text" attribute that comes first in each input element, so the pattern never matches. The following works:

string html =
    @"<form>
    <input type=""text"" name=""a"" value=""a1fa4"" type=""hidden""/>
    <input type=""text"" name=""b"" value=""b1fa9"" type=""hidden""/>
    <input type=""text"" name=""c"" value=""c1fd2"" type=""hidden""/>
    <input type=""text"" name=""d"" value=""d1fx1"" type=""hidden""/>
    </form>";

foreach(Match match in Regex.Matches(html, 
    "<input type=\"text\" name=\"(.*?)\" value=\"(.*?)\" type=\"hidden\"/>", 
        RegexOptions.IgnoreCase))
{
    // Group 0 is the string matched so get groups 1 and 2.
    System.Console.WriteLine("Name={0} Value={1} ", match.Groups[1].Value, 
        match.Groups[2].Value);
}
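
The while loop from the question can also be kept, but note that as posted it never advances to the next match, so with a matching pattern it would print the first match forever. A minimal sketch of that variant, assuming the same pattern as above and a shortened two-line input (the class name NextMatchDemo is only illustrative):

```csharp
using System;
using System.Text.RegularExpressions;

class NextMatchDemo
{
    static void Main()
    {
        // Shortened sample input in the same shape as the question's form.
        string html =
            "<input type=\"text\" name=\"a\" value=\"a1fa4\" type=\"hidden\"/>\n" +
            "<input type=\"text\" name=\"b\" value=\"b1fa9\" type=\"hidden\"/>";

        Match match = Regex.Match(html,
            "<input type=\"text\" name=\"(.*?)\" value=\"(.*?)\" type=\"hidden\"/>",
            RegexOptions.IgnoreCase);

        while (match.Success)
        {
            Console.WriteLine("Name={0} Value={1}",
                match.Groups[1].Value, match.Groups[2].Value);

            // Without this line the loop never terminates.
            match = match.NextMatch();
        }
    }
}
```

Regex.Matches with foreach, as shown above, avoids the problem because the collection handles the iteration.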

However, as L.B says, use a dedicated HTML parser instead of regular expressions: HTML is not guaranteed to be valid XML, and it may vary in layout, encoding and so on.

If you must use regular expressions, they need to be a lot more flexible. For example, there may be more or different whitespace between attributes and elements.
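
As a rough sketch of what "more flexible" could look like, the example below works in two steps: it first finds each input tag, then reads the attributes inside it in any order, with arbitrary whitespace and either quote style. The sample input, the class name and the variable names are assumptions made for illustration, and this is still not a substitute for a real parser (for instance, it breaks on a ">" inside an attribute value, and unquoted values are ignored).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class FlexibleInputRegex
{
    static void Main()
    {
        // Hypothetical input with irregular spacing, casing, quoting and order.
        string html =
            "<form>\n" +
            "<input  type=\"hidden\"   name=\"a\"\tvalue=\"a1fa4\" />\n" +
            "<INPUT value='b1fa9' name='b' type='hidden'>\n" +
            "</form>";

        var values = new Dictionary<string, string>();

        // Step 1: find every <input ...> tag, regardless of its attributes.
        foreach (Match tag in Regex.Matches(html, @"<input\b[^>]*>",
                                            RegexOptions.IgnoreCase))
        {
            // Step 2: collect name="value" pairs inside the tag.
            // Group 2 captures the opening quote; \2 requires the same closing quote.
            var attrs = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
            foreach (Match attr in Regex.Matches(tag.Value,
                         @"([\w-]+)\s*=\s*([""'])(.*?)\2"))
            {
                // If an attribute is repeated, the last occurrence wins here.
                attrs[attr.Groups[1].Value] = attr.Groups[3].Value;
            }

            if (attrs.TryGetValue("name", out string name) &&
                attrs.TryGetValue("value", out string value))
            {
                values[name] = value;
            }
        }

        foreach (var pair in values)
        {
            Console.WriteLine("Name={0} Value={1}", pair.Key, pair.Value);
        }
    }
}
```

Splitting the work into "find the tag" and "read its attributes" keeps each pattern short and removes the dependency on attribute order, which is what made the original single pattern fail.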

akton