1

I have a string that contains the code of a webpage.

This is an example:

<input type="text" name="x4B07" value="650"
    onchange="this.form.x8000.value=this.name;this.form.submit();"/>
<input type="text" name="x4B08" value="250"
    onchange="this.form.x8000.value=this.name;this.form.submit();"/>

From that string I want to extract the values 650 and 250 (these are not fixed; they change).

How can I do so?

Example of the output I want:

name value
x4b08 254
x4b07 253
x4b06 252
x4b05 251
Toni
Luis

6 Answers

2

If you are confident that the markup will never change (and you have a simple snippet like your example line), a regex can get you those values, for example:

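// Requires: using System.Text.RegularExpressions;
Regex re = new Regex("name=\"(.*?)\" value=\"(.*?)\"");
Match match = re.Match(yourString);
if (match.Success && match.Groups.Count == 3) {
    // Groups[n] is a Group object; .Value gives the captured text
    string name = match.Groups[1].Value;
    string value = match.Groups[2].Value;
}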

Alternatively, you could parse the page content, query the resulting document for the elements, and then extract the values. (For C# HTML parsers, see the question "Looking for C# HTML parser".)

ndtreviv
  • This only returns the first element. How can I return the others, for example as an array or a list? Also, the name is returned as x8000" type="hidden – Luis Jan 11 '11 at 11:04
  • To get multiple name/value pairs from a section of the document, just carry on with while ((match = match.NextMatch()).Success) { name = match.Groups[1].Value; value = match.Groups[2].Value; } (NextMatch() never returns null; check Success instead.) Check it out in the docs: http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.match.aspx and http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.aspx – ndtreviv Jan 11 '11 at 11:20
  • PS: Obviously to create in a list/array or your collection of choice just add the values to that collection as you iterate through. Iteration = Programming 101. – ndtreviv Jan 11 '11 at 11:22
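The two problems raised in the comments above (only the first pair is returned, and the name comes back as x8000" type="hidden) can be sketched as follows. This is only an illustration, not the answerer's code: the class name InputExtractor and the method Extract are hypothetical, and the pattern is changed to [^"]* so that a capture cannot run past a closing quote. With this pattern, an input that has other attributes between name and value is skipped rather than mis-captured.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class InputExtractor
{
    // [^"]* stops at the first closing quote, unlike .*?, which can
    // expand across following attributes until it finds `" value="`.
    private static readonly Regex Pair =
        new Regex("name=\"([^\"]*)\"\\s+value=\"([^\"]*)\"");

    // Returns every name/value pair found in the markup, in document order.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (Match m in Pair.Matches(html))
        {
            pairs.Add(new KeyValuePair<string, string>(
                m.Groups[1].Value, m.Groups[2].Value));
        }
        return pairs;
    }
}
```

Calling InputExtractor.Extract(yourString) on the two inputs from the question should give the pairs x4B07/650 and x4B08/250. It still assumes that value directly follows name, so it is no more robust to markup changes than the original regex.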
0

You can use a regular expression such as value="([0-9]*)" to match the values.

Or you can look for the string "value" using string.IndexOf and then take the characters that follow it, up to the closing quote.

Ilya Kogan
0

This should work for you (assuming that s contains the string you want to parse):

string value = s.Substring(s.IndexOf("value=")+7);
value = value.Substring(0, value.IndexOf("\""));
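
The two lines above return only the first value in the string. As a rough sketch of how the same IndexOf/Substring idea could be repeated to collect every value, assuming each value is written as value="..." with double quotes (the class name ValueScanner and the method AllValues are hypothetical):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class ValueScanner
{
    // Collects the text between value=" and the next double quote,
    // for every occurrence in s.
    public static List<string> AllValues(string s)
    {
        const string marker = "value=\"";
        var values = new List<string>();
        int pos = 0;
        while ((pos = s.IndexOf(marker, pos, StringComparison.Ordinal)) >= 0)
        {
            int start = pos + marker.Length;   // first character of the value
            int end = s.IndexOf('"', start);   // closing quote
            if (end < 0) break;                // unterminated attribute: stop
            values.Add(s.Substring(start, end - start));
            pos = end + 1;                     // continue after this value
        }
        return values;
    }
}
```

Searching for value=" (including the quote) rather than value= keeps the scan from stopping at the .value=this.name text inside the onchange handlers in the question's markup.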
Øyvind Bråthen
  • And what if I want to get both the name and its respective value, e.g. name="x4B07" value="650"? – Luis Jan 11 '11 at 10:45
  • Regex re = new Regex("name=\"(.*?)\" value=\"(.*?)\""); Match match = re.Match(yourString); if(match.Success && match.Groups.Count == 3){ name = match.Groups[1].Value; value = match.Groups[2].Value;} - I'm gonna go ahead and add this to my answer so it's more readable! – ndtreviv Jan 11 '11 at 10:55
  • Then you will need two more lines of code to extract the name as well, using s.IndexOf("name=") + 6 in place of s.IndexOf("value=") + 7 in the code above. – Øyvind Bråthen Jan 11 '11 at 10:56
  • This only returns the first element. How can I return the others, for example as an array or a list? Also, the name is returned as x8000" type="hidden – Luis Jan 11 '11 at 11:04
0

How specific are your examples? Might you also need to extract alphabetic strings of varying length? Will the strings you want to extract always be attribute values?

While the regex/substring approaches work for the examples given, I think they will scale quite badly.

I'd parse the HTML using a parser (see ndtreviv's answer) or possibly with an XML parser (if the HTML is valid XHTML). That way you will get better control and don't have to bleed your eyes out from fidgeting with a bucketload of regex.

Anders Arpi
  • Hi, the page has several names and values, and what I want is something like the example I am about to add to the question. – Luis Jan 11 '11 at 10:56
0

If you have multiple such controls in the string, you can create an XmlDocument from it and iterate through the elements. This works only if the markup is well-formed XML.
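
A minimal sketch of that idea, assuming the string is a well-formed fragment like the one in the question (every input is self-closed and attribute values are quoted). The class name XmlInputReader, the method Read, and the wrapping <root> element are assumptions added for illustration; real-world HTML often is not valid XML, in which case LoadXml throws an XmlException.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class XmlInputReader
{
    // Maps each input's name attribute to its value attribute.
    public static Dictionary<string, string> Read(string fragment)
    {
        var doc = new XmlDocument();
        // An XML document needs a single root element, so wrap the fragment.
        doc.LoadXml("<root>" + fragment + "</root>");

        var result = new Dictionary<string, string>();
        // Select only input elements that carry both attributes.
        foreach (XmlElement input in doc.SelectNodes("//input[@name and @value]"))
        {
            result[input.GetAttribute("name")] = input.GetAttribute("value");
        }
        return result;
    }
}
```

Because the parser reads attributes by name, their order within the tag and any attributes in between do not matter here, which is the main advantage over the regex approach.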

Vinay Pandey
0

I just solved it with this:

// Requires: using System.Collections; using System.IO; using System.Net;
//           using System.Text.RegularExpressions;

// Download the page into a string
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(URL);
string buffer;
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
using (StreamReader sr = new StreamReader(resp.GetResponseStream()))
{
    buffer = sr.ReadToEnd();
}

ArrayList uniqueMatches = new ArrayList();
Regex RE = new Regex("name=\"(.*?)\" value=\"(.*?)\"", RegexOptions.Multiline);
Regex reName = new Regex("name=\"(.*?)\"");
Regex reValue = new Regex("value=\"(.*?)\"");
MatchCollection theMatches = RE.Matches(buffer);

for (int counter = 0; counter < theMatches.Count; counter++)
{
    // Re-match inside the matched text so each capture stops at the first closing quote
    string matched = theMatches[counter].Value;
    Match matchName = reName.Match(matched);
    Match matchValue = reValue.Match(matched);

    // Store each pair as { name, value }
    string[] dados = new string[2];
    dados[0] = matchName.Groups[1].Value;
    dados[1] = matchValue.Groups[1].Value;
    uniqueMatches.Add(dados);
}

Thanks, all, for the help.

Luis