I've inherited code for a website, and this particular function is used to get a description from another website when a part number is given. I've never worked with regular expressions before, so this is a little outside my area, and I would like some help figuring out why it's not working properly.
Ideally, this function works as follows: when a user of the site enters a part number in the appropriate field and presses a button, the standard part description, which is retrieved from a separate site, is displayed to the user. I inspected the element on the third-party site that the regex is trying to match, and it's coded as
<span id="ctl00_BodyContentPlaceHolder_lblDescription">Random Description</span>
Here is the function:
public static string GetPartHpDescription(string url)
{
    // Create a request to the url
    HttpWebRequest request = HttpWebRequest.Create(url) as HttpWebRequest;
    // If the request wasn't an HTTP request (like a file), ignore it
    if (request == null) return null;
    // Use the user's credentials
    request.UseDefaultCredentials = true;
    // Obtain a response from the server, if there was an error, return nothing
    HttpWebResponse response = null;
    try { response = request.GetResponse() as HttpWebResponse; }
    catch (WebException) { return null; }
    // Regular expression for an HTML title
    // string regex = @"(?<=<body.*>)([Description : HP]*)(?=</body>)";
    string regex = "<span [^>]*id=(\"|')ctl00_BodyContentPlaceHolder_lblDescription(\"|')>(.*?)</span>";
    string regex1 = "<span [^>]*id=(\"|')ctl00_BodyContentPlaceHolder_gvGeneral_ctl02_lblpartdesc1(\"|')>(.*?)</span>";
    // Regex re = new Regex(@"<span\s+id=""ctl00_BodyContentPlaceHolder_lblDescription");
    // string regex = @"<span\s+id=""ctl00_BodyContentPlaceHolder_lblDescription"
    // If the correct HTML header exists for HTML text, continue
    if (new List<string>(response.Headers.AllKeys).Contains("Content-Type"))
        if (response.Headers["Content-Type"].StartsWith("text/html"))
        {
            // Download the page
            WebClient web = new WebClient();
            web.UseDefaultCredentials = true;
            string page = web.DownloadString(url);
            // string title = Regex.Match(page, @"<span\s+id=""ctl00_BodyContentPlaceHolder_lblDescription"">.*?</span>", RegexOptions.IgnoreCase).Groups["Title"].Value;
            // Extract the title
            Regex ex = new Regex(regex, RegexOptions.IgnoreCase);
            String data = ex.Match(page).Value.Trim();
            if (data == "")
            {
                Regex ex1 = new Regex(regex1, RegexOptions.IgnoreCase);
                data = ex1.Match(page).Value.Trim();
            }
            return data;
            // return title;
        }
    // Not a valid HTML page
    return null;
}
What's currently happening is that if the part number is not already in the system database (SQL backend), the function doesn't return the part description properly.
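To narrow things down, here is a minimal standalone sketch of how I understand the first pattern to behave against the span shown above. The class name `SpanMatchSketch` and the hard-coded `page` string are assumptions for illustration only; the real page contains far more markup, and this is not part of the inherited code. As far as I can tell, `Match.Value` holds the whole matched element including the tags, while the third capture group holds only the text between them.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpanMatchSketch
{
    public static void Main()
    {
        // Hypothetical sample input: only the span from the third-party page.
        string page = "<span id=\"ctl00_BodyContentPlaceHolder_lblDescription\">Random Description</span>";

        // Same pattern as the first regex in GetPartHpDescription.
        string pattern = "<span [^>]*id=(\"|')ctl00_BodyContentPlaceHolder_lblDescription(\"|')>(.*?)</span>";

        Match match = Regex.Match(page, pattern, RegexOptions.IgnoreCase);

        // Whether the pattern matched at all.
        Console.WriteLine(match.Success);

        // The entire matched text, including the <span> tags.
        Console.WriteLine(match.Value);

        // Groups 1 and 2 capture the quote characters; group 3 is the inner text.
        Console.WriteLine(match.Groups[3].Value);
    }
}
```

Against this sample the pattern does match, so the pattern itself seems able to find the span when it is present in exactly this form. Note that the pattern requires `>` immediately after the closing quote of the `id` value, so a span with additional attributes after `id` would not match.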