
I am trying to retrieve the HTML source of a website, given its URL. The problem is that my code returns only part of the HTML, not the whole page. The site in question is http://www.4xinvestmentgroup.com. Do you have any idea what the problem might be?

First I tried the following code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Net;
using System.IO;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string url = "http://www.4xinvestmentgroup.com";

            HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
            HttpWebResponse res = (HttpWebResponse)req.GetResponse();

            StreamReader sr = new StreamReader(res.GetResponseStream(), Encoding.GetEncoding(res.CharacterSet));
            Console.WriteLine(sr.ReadToEnd());
            sr.Close();

            Console.WriteLine("Press enter to close...");
            Console.ReadLine();
        }
    }
}

After that I tried the following code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Net;
using System.IO;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string url = "http://www.4xinvestmentgroup.com";

            WebClient client = new WebClient();
            string reply = client.DownloadString(url);

            Console.WriteLine(reply);

            Console.WriteLine("Press enter to close...");
            Console.ReadLine();
        }
    }
}

In both cases the returned HTML is:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/x
html1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="pl" lang="pl">
<head>
      <base href="http://www.4xinvestmentgroup.com/" />
  <meta http-equiv="content-type" content="text/html; charset=utf-8" />
  <meta name="keywords" content="Forex, 4x investment group, 4xinvestmentgroup,
trading, forex analysis, forex market, forex signal provider, Economic Calendar,
trading profit, Exchange Market, Exchange Rates" />
  <meta name="description" content="4x Investment Group, Forex Signal Provider,
Trading services - The Foreign Exchange Market can ensure a Huge Trading Profit.
 " />
  <title>4x Investment Group</title>
  <link href="/index.php?format=feed&amp;type=rss" rel="alternate" type="applica
tion/rss+xml" title="RSS 2.0" />
  <link href="/index.php?format=feed&amp;type=atom" rel="alternate" type="applic
ation/atom+xml" title="Atom 1.0" />
  <link href="/favicon.ico" rel="shortcut icon" type="image/vnd.microsoft.icon"
/>
  <link rel="stylesheet" href="/media/system/css/modal.css" type="text/css" />
  <link rel="stylesheet" href="/templates/gk_finance_business/css/k2.css" type="
text/css" />
  <link rel="stylesheet" href="http://www.4xinvestmentgroup.com/templates/gk_fin
ance_business/css/mobile/handheld.css" type="text/css" />
  <script src="/media/system/js/mootools-core.js" type="text/javascript"></scrip
t>
  <script src="/media/system/js/core.js" type="text/javascript"></script>
  <script src="/media/system/js/mootools-more.js" type="text/javascript"></scrip
t>
  <script src="/media/system/js/modal.js" type="text/javascript"></script>

  <script src="/components/com_k2/js/k2.js" type="text/javascript"></script>
  <script src="/media/system/js/caption.js" type="text/javascript"></script>
  <script src="http://www.4xinvestmentgroup.com/templates/gk_finance_business/js
/mobile/gk.handheld.js" type="text/javascript"></script>


    <meta name="viewport" content="width=device-width, minimum-scale=1.0, maximu
m-scale=1.0" />
    </head>
<body>
    <div id="gkWrap">
        <div id="gkTopWrap">
                                        <h1 id="gkHeader" class="cssLogo">
                   <a href="/./">4x Investment Group</a>
                          </h1>

                                <a href="#" id="gk-btn-switch" ><span>Switch to
desktop</span></a>


                                <a href="http://www.4xinvestmentgroup.com/index.
php?option=com_users&amp;view=login" id="gk-btn-login" ><span>Login</span></a>
                        </div>

        <div id="gkNav">
                <div id="gkNavContent">
                        <select id="gkMenu" onchange="window.location.href=this.
value;">
                        <option  value="/index.php/home-mobile">4x Investment Gr
oup</option><option  value="#">Explore Forex<option  value="/index.php/explore-f
orex-2/benefits-of-trading">&nbsp;&nbsp;&raquo;Benefits of Trading</option><opti
on  value="/index.php/explore-forex-2/risk-statement">&nbsp;&nbsp;&raquo;Risk St
atement</option></option><option  value="#">Forex Tools<option  value="/index.ph
p/forex-tools-2/currency-converter">&nbsp;&nbsp;&raquo;Currency Converter</optio
n></option><option  value="/index.php/4x-investment-group-provider-2">Case Study
 (2)</option><option  value="#">About us<option  value="/index.php/about-us-2/fe
w-words">&nbsp;&nbsp;&raquo;Few words</option><option  value="/index.php/about-u
s-2/contact-form">&nbsp;&nbsp;&raquo;Support</option></option>
        </select>
                </div>
        </div>

        <div id="gkContent">

                <div id="gkMain">

<div id="system-message-container">
</div>

<div class="blog-featured">





</div>


                </div>


                <div id="gkFooter">
                        <p id="gkCopyrights">4x Investment Group Ac 2012. All ri
ghts reserved.</p>

                        <p id="gkOptions">
                                <a href="#gkHeader">Top</a>
                                <a href="javascript:setCookie('gkGavernMobileFin
ance_Business', 'desktop', 365);window.location.reload();">Desktop version</a>
                        </p>
                </div>
        </div>
        </div>

        </body>
</html>

But the original HTML, if you check it in a browser, is much much richer in details.

Tasos K.
user3203275
  • What do you mean by `much much richer in details`? You only download the main page. What about other links such as js/css files, iframes, images etc. – L.B Feb 28 '14 at 15:25

2 Answers


It appears that the request is being redirected to a mobile version of the website. Try setting the user agent string to one used by a desktop browser. For example:

HttpWebRequest:

req.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)";

WebClient:

client.Headers.Add("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
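Putting this together with the code from the question, a minimal sketch could look like the following. It assumes the classic .NET Framework `HttpWebRequest` API used in the question; `PageFetcher`, `CreateDesktopRequest`, `ResolveEncoding` and `Fetch` are hypothetical names made up for this example, and the fallback to UTF-8 when the response reports no usable charset is my own assumption, not something the site requires.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

static class PageFetcher
{
    // Same example user agent string as above; any desktop browser string should do.
    public const string DesktopUserAgent =
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)";

    // Builds the request and sets the User-Agent header. No network traffic happens here.
    public static HttpWebRequest CreateDesktopRequest(string url)
    {
        var req = (HttpWebRequest)WebRequest.Create(url);
        req.UserAgent = DesktopUserAgent;
        return req;
    }

    // Assumption: fall back to UTF-8 if the charset is missing or not recognised.
    public static Encoding ResolveEncoding(string charset)
    {
        if (string.IsNullOrEmpty(charset)) return Encoding.UTF8;
        try { return Encoding.GetEncoding(charset); }
        catch (ArgumentException) { return Encoding.UTF8; }
    }

    // Sends the request and returns the response body as a string.
    public static string Fetch(string url)
    {
        var req = CreateDesktopRequest(url);
        using (var res = (HttpWebResponse)req.GetResponse())
        using (var sr = new StreamReader(res.GetResponseStream(), ResolveEncoding(res.CharacterSet)))
        {
            return sr.ReadToEnd();
        }
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(PageFetcher.Fetch("http://www.4xinvestmentgroup.com"));
    }
}
```

The `using` blocks dispose the response and reader even if reading fails, which the original `sr.Close()` call would not do.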
David Brown

One way to get the complete HTML of this site is to use the WebBrowser control.

Create a Windows Forms application and add a WebBrowser control from the toolbox. In the form's Load event handler, use the following code:

    webBrowser1.Navigate("http://www.4xinvestmentgroup.com");

    while(webBrowser1.ReadyState != WebBrowserReadyState.Complete)
    { 
       // Keep pumping window messages until the document has loaded completely.
       Application.DoEvents();
    }

    string html = webBrowser1.DocumentText;
Irfan TahirKheli
  • *if you find yourself needing to call DoEvents anywhere, think about starting another thread instead, or using asynchronous delegates* – L.B Feb 28 '14 at 16:04
  • No, Worse. Since you are now blocking the UI thread, webBrowser1 will never be able to process messages and `webBrowser1.ReadyState` will never be `Complete`. – L.B Feb 28 '14 at 16:32
  • Thanks for the info. I was not aware of this. Highly appreciated. – Irfan TahirKheli Feb 28 '14 at 17:54
  • @IrfanTahirKheli, read about [busy waiting loop](http://en.wikipedia.org/wiki/Busy_waiting) and [DoEvents implications](http://blogs.msdn.com/b/jfoscoding/archive/2005/08/06/448560.aspx). Instead, wrap `DocumentCompleted` with `TaskCompletionSource`, [example](http://stackoverflow.com/a/21152965/1768303). – noseratio Mar 01 '14 at 01:43
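
The approach suggested in the last comment, wrapping `DocumentCompleted` with `TaskCompletionSource`, could look roughly like the sketch below. It is not the code from the linked example. It assumes .NET 4.5 or later (for `async`/`await`) and a Windows Forms project; `MainForm` and `NavigateAsync` are hypothetical names, and the `ReadyState` check is a common idiom for skipping the `DocumentCompleted` events raised for frames, so treat it as an assumption rather than a guarantee.

```csharp
using System;
using System.Threading.Tasks;
using System.Windows.Forms;

public class MainForm : Form
{
    private readonly WebBrowser webBrowser1 = new WebBrowser { Dock = DockStyle.Fill };

    public MainForm()
    {
        Controls.Add(webBrowser1);

        // The UI thread stays free while waiting, so no DoEvents loop is needed.
        Load += async (s, e) =>
        {
            string html = await NavigateAsync(webBrowser1, "http://www.4xinvestmentgroup.com");
            Text = "Loaded " + html.Length + " characters";
        };
    }

    // Starts navigation and returns a task that completes with DocumentText
    // once the control reports that the document has finished loading.
    public static Task<string> NavigateAsync(WebBrowser browser, string url)
    {
        var tcs = new TaskCompletionSource<string>();
        WebBrowserDocumentCompletedEventHandler handler = null;
        handler = (s, e) =>
        {
            // DocumentCompleted can fire once per frame; wait for the final one.
            if (browser.ReadyState != WebBrowserReadyState.Complete) return;
            browser.DocumentCompleted -= handler;
            tcs.TrySetResult(browser.DocumentText);
        };
        browser.DocumentCompleted += handler;
        browser.Navigate(url);
        return tcs.Task;
    }

    [STAThread]
    static void Main()
    {
        Application.EnableVisualStyles();
        Application.Run(new MainForm());
    }
}
```

Because the handler runs on the UI thread's message loop, the control can process its own messages while the `await` is pending, which is exactly what the blocking variants discussed in the comments prevent.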