I'd like to read the information in this table (it's always the same style) in C#. It's a teacher substitution plan, and I'd like to integrate it into my school timetable.
1
-
Short answer: you might need to strip and parse the table. If it's always constant, it might be a bit faster. – mahlatse Dec 06 '18 at 14:17
-
Sounds like you're looking for something called a "DOM Parser". Perhaps something like HTMLAgilityPack. A Google search should get you started on that. – David Dec 06 '18 at 14:19
3 Answers
4
You can use a third-party library such as HtmlAgilityPack to parse the HTML into a structure that you can then query with LINQ.
Following this StackOverflow post, the parsing becomes fairly simple:
// Requires: using System.Data; using System.Linq; using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(htmlCode);
// Note: SelectNodes returns null when no node matches the XPath
var headers = doc.DocumentNode.SelectNodes("//tr/th");
DataTable table = new DataTable();
foreach (HtmlNode header in headers)
    table.Columns.Add(header.InnerText); // create columns from th
// select rows with td elements
foreach (var row in doc.DocumentNode.SelectNodes("//tr[td]"))
    table.Rows.Add(row.SelectNodes("td").Select(td => td.InnerText).ToArray());
You can create a custom class for your specific table and check the attributes of the table's td elements or headers to know which cell maps to which property, e.g.
var myTableClass = new TableClass();
myTableClass.Name = row[0];
.....
That will make things simpler for you.
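As a rough illustration of the mapping idea above, the following sketch copies each parsed row into a small class. The class name `SubstitutionEntry`, its properties, and the assumed column order are all hypothetical; the real columns depend on the table being read.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class: the property names and column order are assumptions.
public class SubstitutionEntry
{
    public string Period { get; set; }
    public string Teacher { get; set; }
    public string Room { get; set; }
}

public static class RowMapper
{
    // Maps rows of cell text to typed objects.
    // Rows with fewer than three cells are skipped.
    public static List<SubstitutionEntry> Map(IEnumerable<IList<string>> rows)
    {
        return rows
            .Where(r => r.Count >= 3)
            .Select(r => new SubstitutionEntry
            {
                Period = r[0].Trim(),
                Teacher = r[1].Trim(),
                Room = r[2].Trim()
            })
            .ToList();
    }
}
```

A call such as `RowMapper.Map(rows)` would then yield a list that can be filtered with LINQ, for example by teacher name.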

mahlatse
-
But now I get an error. It says that 'headers' doesn't have a value. – Jeff Brogan Dec 06 '18 at 16:57
-
It doesn't look like your table has a th element. Comment out the th part; you can just work with td. – mahlatse Dec 06 '18 at 17:07
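The error arises because HtmlAgilityPack's `SelectNodes` returns null, not an empty collection, when nothing matches. A minimal sketch of the td-only approach suggested in the comment might look like this. The names `TableReader` and `ReadCells`, and the generated column names, are assumptions made for illustration.

```csharp
using System.Data;
using System.Linq;
using HtmlAgilityPack;

public static class TableReader
{
    // Builds a DataTable from td cells only, without relying on th headers.
    public static DataTable ReadCells(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var table = new DataTable();

        // SelectNodes returns null when no node matches, so check first.
        var rows = doc.DocumentNode.SelectNodes("//tr[td]");
        if (rows == null)
            return table;

        // Every selected row has at least one td, so SelectNodes("td")
        // is non-null here. Column names are generated (an assumption).
        int width = rows.Max(r => r.SelectNodes("td").Count);
        for (int i = 1; i <= width; i++)
            table.Columns.Add("Column" + i);

        foreach (HtmlNode row in rows)
        {
            object[] cells = row.SelectNodes("td")
                .Select(td => (object)td.InnerText.Trim())
                .ToArray();
            table.Rows.Add(cells); // shorter rows leave trailing columns empty
        }
        return table;
    }
}
```

Sizing the columns to the widest row avoids the exception that `Rows.Add` throws when a row has more values than the table has columns.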
0
Okay. I found out the best solution:
var web = new HtmlWeb();
var doc = web.Load(url);
// Each inner list holds the cell texts of one table row
var rows = new List<List<string>>();
foreach (HtmlNode table in doc.DocumentNode.SelectNodes("//table"))
{
    foreach (HtmlNode row in table.SelectNodes("tr"))
    {
        var temprow = new List<string>();
        foreach (HtmlNode cell in row.SelectNodes("td"))
        {
            temprow.Add(cell.InnerText);
        }
        rows.Add(temprow);
    }
}

Jeff Brogan
-
I still do not see why my answer was unaccepted if you are using almost the same logic. You opted to use the table element instead of the table data element. – mahlatse Dec 07 '18 at 21:40
0
private DataTable GetHtmlTable (string urlStr, int i) {
DataTable dt = new DataTable();
var web = new HtmlWeb();
var doc = web.Load(urlStr);
HtmlNode table = doc.DocumentNode
.SelectSingleNode(
string.Format(
"//table[{0}]", i
));
// notice the dot: ".//" makes the XPath relative to this table
var headers = table.SelectNodes(".//tr/th");
foreach (HtmlNode header in headers)
dt.Columns.Add(
header.InnerText.Replace(
" ", ""
));
// notice the dot
foreach (var row in table.SelectNodes(".//tr[td]"))
dt.Rows.Add(
row.SelectNodes("td")
.Select(td => td.InnerText.Replace(
" ", ""
)).ToArray()
);
return dt;
}
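The "notice the dot" comments refer to XPath scoping: a path starting with `//` is evaluated from the document root even when called on a child node, while `.//` starts at the node itself. The sketch below illustrates the difference on a made-up two-table document. The class and method names are hypothetical.

```csharp
using System;
using HtmlAgilityPack;

public static class XPathScopeDemo
{
    // Counts tr elements seen from the second table with both path styles.
    public static (int Absolute, int Relative) Count()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(
            "<table><tr><td>a</td></tr></table>" +
            "<table><tr><td>b</td></tr><tr><td>c</td></tr></table>");

        // XPath positions are 1-based, so [2] is the second table.
        HtmlNode second = doc.DocumentNode.SelectSingleNode("//table[2]");

        // "//tr" starts at the document root: rows of every table match.
        int absolute = second.SelectNodes("//tr").Count;   // 3
        // ".//tr" starts at the context node: only its own rows match.
        int relative = second.SelectNodes(".//tr").Count;  // 2

        return (absolute, relative);
    }
}
```

Without the leading dot, the function above would collect headers and rows from every table on the page, not only from the one selected by index.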

sanitizedUser

Wong David