When I view the webpage I see the numerical data I want to retrieve. However, when I view the page source the actual numbers are not there. Instead I see the following:

<td class="idc-td-12 idc-t-numeric"
                                        data-bind="numericText:accumulatedDepreciationDepletion, format: '(0.00 a)'">
                                        0,0.00
                                    </td>

It only shows a format string and a placeholder, not the number. Is there a way to access this bound data (the actual numbers that I see on the webpage)? I am using C# with HtmlAgilityPack.

mn2015
  • It seems the content is injected into the DOM by JavaScript after the page loads. Have a look at this: http://stackoverflow.com/a/13708309/1284637 – Nick De Beer Dec 25 '15 at 17:09
  • @Nick De Beer - it seems like that solution requires Microsoft.VisualStudio.TestTools.UITesting.dll which requires Visual Studio Premium or Ultimate. I am using Community Edition. Is there another way? – mn2015 Dec 25 '15 at 17:51
  • To scrape dynamic content you have to let the page's JavaScript execute, which means the page has to be rendered before it is crawled. Tools like Selenium WebDriver can help extract dynamic content, but I don't think it's possible with HtmlAgilityPack alone, since it only parses the HTML it is given and doesn't render it. The other option is a WinForms app with a WebBrowser control that waits for the page to finish loading and then reads the DOM; however, that requires a UI to be rendered. – Nick De Beer Dec 25 '15 at 20:22
  • @Nick De Beer - Thank you, I will look into your second option and see if I can figure it out. – mn2015 Dec 26 '15 at 03:08
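
The Selenium route mentioned in the comments could look roughly like the sketch below. It is not a tested solution for this particular site. It assumes the Selenium.WebDriver NuGet package and a local Chrome installation; the URL is a placeholder, the class name `RenderedTableScraper` is made up for illustration, and the wait condition (a cell no longer showing the `0,0.00` placeholder from the page source) is a guess about how the page behaves once its bindings are applied.

```csharp
using System;
using System.Linq;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

class RenderedTableScraper
{
    static void Main()
    {
        // Hypothetical URL: replace with the page being scraped.
        const string url = "https://example.com/financials";

        // Selector taken from the class names in the question's snippet.
        By numericCells = By.CssSelector("td.idc-t-numeric");

        var options = new ChromeOptions();
        options.AddArgument("--headless"); // no visible browser window

        using (IWebDriver driver = new ChromeDriver(options))
        {
            driver.Navigate().GoToUrl(url);

            // Assumption: once the page's JavaScript has bound the data, the
            // cells stop showing the "0,0.00" placeholder seen in the source.
            // Until() throws WebDriverTimeoutException if this never happens.
            var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
            wait.Until(d => d.FindElements(numericCells)
                             .Any(c => c.Text.Trim() != "0,0.00"));

            // Print the rendered text of every numeric cell.
            foreach (IWebElement cell in driver.FindElements(numericCells))
            {
                Console.WriteLine(cell.Text.Trim());
            }
        }
    }
}
```

If running a full browser is too heavy, another option is to watch the browser's network tab for the request that delivers the bound values and fetch that endpoint directly, though whether this page exposes one is unknown.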

0 Answers