I would like to get the source code of an entire page, including content that is generated dynamically. I've tried WinINet and curl, but I only get the markup that the server sends, not what is rendered afterwards in the browser.

Example: [screenshot]

As you can see below, the list of people does not appear in the page source.

Page source:

<!DOCTYPE html>
<html>
<head>
    <title>Presto</title>
    <meta charset="utf-8" />
    <meta http-equiv="X-UA-Compatible" content="IE=edge, chrome=1" />
    <meta name="apple-mobile-web-app-capable" content="yes" />
    <meta name="apple-mobile-web-app-status-bar-style" content="black" />
    <meta name="format-detection" content="telephone=no"/>
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />

    <link href="/Content/ie10mobile.css" rel="stylesheet"/>
<link href="/Content/jquery.mobile-1.3.2.css" rel="stylesheet"/>
<link href="/Content/jquery.mobile.structure-1.3.2.css" rel="stylesheet"/>
<link href="/Content/jquery.mobile.theme-1.3.2.css" rel="stylesheet"/>
<link href="/Content/bootstrap.css" rel="stylesheet"/>
<link href="/Content/bootstrap-responsive.css" rel="stylesheet"/>
<link href="/Content/durandal.css" rel="stylesheet"/>
<link href="/Content/toastr.css" rel="stylesheet"/>
<link href="/Content/app.css" rel="stylesheet"/>

    <script type="text/javascript">
        if (navigator.userAgent.match(/IEMobile\/10\.0/)) {
            var msViewportStyle = document.createElement("style");
            var mq = "@-ms-viewport{width:auto!important}";
            msViewportStyle.appendChild(document.createTextNode(mq));
            document.getElementsByTagName("head")[0].appendChild(msViewportStyle);
        }
    </script>
</head>
<body>
    <div id="applicationHost">
        <div class="page-splash"></div>
<div class="page-splash-message">
    Presto
</div>
<div class="progress progress-striped active page-progress-bar">
    <div class="bar" style="width: 100%;"></div>
</div>

    </div>

    <script src="/scripts/jquery-1.9.1.js"></script>
<script src="/scripts/jquery.mobile-1.3.2.js"></script>
<script src="/scripts/knockout-2.2.1.debug.js"></script>
<script src="/scripts/sammy-0.7.4.js"></script>
<script src="/scripts/toastr.js"></script>
<script src="/scripts/Q.js"></script>
<script src="/scripts/breeze.debug.js"></script>
<script src="/scripts/bootstrap.js"></script>
<script src="/scripts/moment.js"></script>

            <script type="text/javascript" src="/App/durandal/amd/require.js" data-main="/App/main"></script>
</body>
</html>
Kitiara
  • Do you have access to the site's server? – doug Aug 18 '20 at 04:35
  • You would need to know how a page works (in terms of its API) in order to do that. It seems you're looking for some magic solution, but it doesn't exist. – john Aug 18 '20 at 04:42
  • Why do you imagine it should be possible? – n. m. could be an AI Aug 18 '20 at 04:42
  • @n.'pronouns'm. Why not? I can see the source of the currently active document in Chrome. – Kitiara Aug 18 '20 at 04:58
  • You could use Wireshark to capture all the communication to and from the server. This will give you everything, but you won't be able to see the code that generates the server responses. – doug Aug 18 '20 at 05:17
  • "Active document source" is called "source" because it is a program that renders your page. But this program is itself an output of another program. You want to read that other program. By what magical means? – n. m. could be an AI Aug 18 '20 at 05:17
  • @doug Wireshark? Really? You can see the same data with curl or wget. – n. m. could be an AI Aug 18 '20 at 05:18
  • @n.'pronouns'm. By tricking the `another program` so that it serves the same output it would serve to an actual browser? Somehow... – Kitiara Aug 18 '20 at 05:27
  • @doug I'm only interested in the server responses as HTML, not the code that generates them, obviously. – Kitiara Aug 18 '20 at 05:31
  • @Kitiara Well then, Wireshark should do the job. It's purely a monitoring utility, and when used with a separate computer tapping into the Ethernet stream, its presence is not detectable. I use it frequently to debug or monitor various devices. – doug Aug 18 '20 at 05:40
  • Please decide what you want. If you want to trick a server to give you data that it doesn't normally give, then it is a criminal offense in many jurisdictions, called unauthorised access to computer systems, or "hacking". There is no universal recipe to do such things. You need to find vulnerabilities and exploit them. If you want to read an http response which the server does normally give, then `curl` or `wget` let you see that. – n. m. could be an AI Aug 18 '20 at 05:40
  • @n.'pronouns'm. I didn't mean `hacking` by tricking. See the example in the post. – Kitiara Aug 18 '20 at 05:42
  • @doug I see, thank you. – Kitiara Aug 18 '20 at 05:48
  • Oh, so you want to see the HTML of the page that is generated by scripts in the browser. https://stackoverflow.com/questions/6868577/seeing-html-source-changes-after-javascript-has-acted-upon-it-in-chrome – n. m. could be an AI Aug 18 '20 at 05:48
  • @n.'pronouns'm. Exactly. – Kitiara Aug 18 '20 at 05:49
  • I would try a headless development and testing browser like PhantomJS. This website uses AJAX to manipulate the DOM. It's difficult to apply the DOM changes by hand. I recommend using a browser that can be scripted. There are also tools that convert dynamic websites to static websites. – Thomas Sablik Aug 18 '20 at 08:08
  • Does this answer your question? [Can you load a web page in c++, including JS and dynamic html and get the rendered DOM string?](https://stackoverflow.com/questions/39340643/can-you-load-a-web-page-in-c-including-js-and-dynamic-html-and-get-the-render) – Thrasher Aug 18 '20 at 10:43
  • @Thrasher Partially. That one uses old IE, and there is no way of switching to Edge with IWebBrowser2. There is a registry trick, but it doesn't actually switch. – Kitiara Aug 19 '20 at 11:25
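
A minimal sketch of the scriptable-browser suggestion from the comments above, assuming a Chromium-based browser (Chrome or Edge) is installed and accepts the `--headless` and `--dump-dom` switches. The helper names `build_dump_dom_command` and `run_and_capture` are hypothetical, and the quoting is deliberately simplistic: inputs containing a double quote are rejected and other shell metacharacters are not escaped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

#ifdef _WIN32
#define POPEN _popen
#define PCLOSE _pclose
#else
#define POPEN popen
#define PCLOSE pclose
#endif

// Build a command line that asks a headless Chromium-based browser to
// print the DOM as it stands after the page's scripts have run.
// Assumption: browser_path points at a Chrome or Edge executable.
std::string build_dump_dom_command(const std::string& browser_path,
                                   const std::string& url) {
    assert(!browser_path.empty() && !url.empty());
    // Sketch-level safety only: refuse inputs we cannot quote trivially.
    if (browser_path.find('"') != std::string::npos ||
        url.find('"') != std::string::npos) {
        throw std::invalid_argument("double quotes are not supported");
    }
    return "\"" + browser_path + "\" --headless --disable-gpu --dump-dom \"" +
           url + "\"";
}

// Run a command through the system shell and return everything it
// writes to standard output.
std::string run_and_capture(const std::string& command) {
    FILE* pipe = POPEN(command.c_str(), "r");
    if (!pipe) {
        throw std::runtime_error("could not start process");
    }
    std::string output;
    char buffer[4096];
    std::size_t n = 0;
    while ((n = std::fread(buffer, 1, sizeof buffer, pipe)) > 0) {
        output.append(buffer, n);
    }
    PCLOSE(pipe);
    return output;
}
```

Calling `run_and_capture(build_dump_dom_command("chrome", "https://example.com/"))` should return the serialized DOM once scripts have run; the executable name `chrome` and the URL are placeholders. Content that arrives well after the page's load event may still be missing.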

1 Answer

I've found two solutions: IWebBrowser2 and WebView2.

  1. IWebBrowser2 uses an older version of IE, and some websites require a newer one. There is no way of switching it to Edge. There is a registry trick, but it doesn't actually change the version of IE, so this option is a bit problematic.

  2. WebView2 uses the latest version of Microsoft Edge and works great. There are several samples around; this is the one I tried: https://github.com/MicrosoftEdge/WebView2Browser

For some reason, WebView2 didn't work with my installed Microsoft Edge version at first, so I installed the Microsoft Edge Canary channel to make it work.
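
One detail worth sketching for the WebView2 route: if the rendered markup is read by running a script such as `document.documentElement.outerHTML` through `ICoreWebView2::ExecuteScript`, the result is handed back as a JSON-encoded string (quoted, with escapes such as `\"`, `\n` and `\u003C`), so it needs decoding before it is usable as HTML. The helper below is a hypothetical, standard-library-only sketch of that decoding step. It assumes the wide-character result has already been converted to UTF-8 (for example with `WideCharToMultiByte`) and it is not part of the WebView2 SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Append one Unicode code point to `out` as UTF-8.
static void append_utf8(std::string& out, unsigned long cp) {
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Parse the four hex digits of a \uXXXX escape starting at s[pos].
static unsigned long parse_hex4(const std::string& s, std::size_t pos) {
    if (pos + 4 > s.size()) throw std::invalid_argument("truncated \\u escape");
    unsigned long value = 0;
    for (std::size_t i = pos; i < pos + 4; ++i) {
        const char c = s[i];
        value <<= 4;
        if (c >= '0' && c <= '9') value |= static_cast<unsigned long>(c - '0');
        else if (c >= 'a' && c <= 'f') value |= static_cast<unsigned long>(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F') value |= static_cast<unsigned long>(c - 'A' + 10);
        else throw std::invalid_argument("bad hex digit in \\u escape");
    }
    return value;
}

// Decode a JSON string literal (UTF-8 input, surrounding quotes included)
// into the text it represents. Throws if the input is not a string literal,
// which is also what happens when the script result was null or a number.
std::string decode_json_string(const std::string& json) {
    if (json.size() < 2 || json.front() != '"' || json.back() != '"') {
        throw std::invalid_argument("not a JSON string literal");
    }
    std::string out;
    const std::size_t end = json.size() - 1;  // index of the closing quote
    for (std::size_t i = 1; i < end; ++i) {
        if (json[i] != '\\') { out += json[i]; continue; }
        if (++i >= end) throw std::invalid_argument("dangling backslash");
        switch (json[i]) {
            case '"':  out += '"';  break;
            case '\\': out += '\\'; break;
            case '/':  out += '/';  break;
            case 'b':  out += '\b'; break;
            case 'f':  out += '\f'; break;
            case 'n':  out += '\n'; break;
            case 'r':  out += '\r'; break;
            case 't':  out += '\t'; break;
            case 'u': {
                unsigned long cp = parse_hex4(json, i + 1);
                i += 4;  // i now sits on the last hex digit
                // Combine a UTF-16 surrogate pair into one code point.
                if (cp >= 0xD800 && cp <= 0xDBFF && i + 6 < end &&
                    json[i + 1] == '\\' && json[i + 2] == 'u') {
                    const unsigned long low = parse_hex4(json, i + 3);
                    if (low >= 0xDC00 && low <= 0xDFFF) {
                        cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                        i += 6;
                    }
                }
                // A surrogate without its partner becomes U+FFFD.
                if (cp >= 0xD800 && cp <= 0xDFFF) cp = 0xFFFD;
                append_utf8(out, cp);
                break;
            }
            default:
                throw std::invalid_argument("unknown escape sequence");
        }
    }
    return out;
}
```

In a real application a JSON library would usually be the better choice; the hand-written decoder is only meant to show what the returned string looks like and why it cannot be saved as HTML unchanged.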

Kitiara