How to calculate total size of a page with additional files and scripts

Question

I would like to ask whether it's possible to get the content size of a specific site programmatically in C#. By size I mean the full size of the site, including all images and scripts referenced in the head section, the body, and so on. For example, for the site http://www.google.com I want to get its total size including the logo, the referenced scripts, and so on, as it is presented to the user, not just the main page.

Here is a picture of what I mean:

If we use the IE Developer Tools in IE 9 and start capturing traffic in the network session, then load google.com, it shows all the files loaded (.js, .png, and so on) and the load time in milliseconds.

I tried to do something similar using a WebRequest, but I get only 43 KB instead of the 101 KB that the IE Developer Tools report.

Here is the code:

// Request the page at the URL typed into textBox2.
WebRequest request = WebRequest.Create(textBox2.Text);
request.Credentials = CredentialCache.DefaultCredentials;
// The using blocks close the response, stream, and reader even if an exception is thrown.
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (Stream dataStream = response.GetResponseStream())
using (StreamReader reader = new StreamReader(dataStream))
{
    string responseFromServer = reader.ReadToEnd();
    // Length is the number of characters in the decoded text (ConvertSize is a helper defined elsewhere).
    MessageBox.Show(ConvertSize(responseFromServer.Length) + "  -  " + responseFromServer.Length);
}

How can I get the total size of a site including all images, js and additional files used/referenced in that specific page? Thanks a lot!

I would guess that Google may well deliver different content based on what it thinks you can handle. When I look at the source of the Google homepage (in FF) and just do a character count, it's got just over 100k characters, which is a little higher than IE told you. I would guess that your WebRequest method really is getting 43k of file. Try it again with proper browser impersonation (i.e. setting the user agent, etc.) and see if you get a different-sized file... And of course Google does show you different content logged in compared to not... – Chris, Jul 23 '12 at 14:32
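
A minimal sketch of the impersonation idea from the comment above, assuming the same WebRequest approach as in the question. It sets a browser-like User-Agent header and counts the raw bytes of the response body rather than the characters of the decoded string. The user-agent string and the names PageFetcher, CreateBrowserLikeRequest, and DownloadSizeInBytes are illustrative only and do not come from the thread.

```csharp
using System;
using System.IO;
using System.Net;

public static class PageFetcher
{
    // Illustrative user-agent string (an assumption); use the browser you want to imitate.
    public const string BrowserUserAgent =
        "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)";

    // Builds the request without sending it, so the headers can be inspected first.
    public static HttpWebRequest CreateBrowserLikeRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = BrowserUserAgent;
        request.Credentials = CredentialCache.DefaultCredentials;
        return request;
    }

    // Sends the request and returns the size of the response body in bytes.
    public static long DownloadSizeInBytes(string url)
    {
        HttpWebRequest request = CreateBrowserLikeRequest(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        using (var buffer = new MemoryStream())
        {
            body.CopyTo(buffer);   // copy the raw bytes, no text decoding
            return buffer.Length;
        }
    }
}
```

Calling PageFetcher.DownloadSizeInBytes(textBox2.Text) would then give a byte count for the HTML document alone. The number may still differ from what a browser reports, since browsers usually request compressed responses and this request does not.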

score 0 · Answer 1 · answered Jul 23 '12 at 14:35


Your WebRequest is just getting the HTML. It doesn't parse that HTML or fetch any of the files it references (images, CSS, JavaScript includes, etc.). Controls such as the WebBrowser control allow you to automate a browser, which does load those files.
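
As a rough sketch of what fetching the referenced files by hand could look like (a different route from automating a browser): download the HTML, pull out the URLs in img/script src and link href attributes, download each one, and add up the byte counts. This assumes a regular expression is good enough for the page in question, which it often is not; a real HTML parser would be more reliable. The names PageSizeEstimator, ExtractResourceUrls, and TotalSizeInBytes are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

public static class PageSizeEstimator
{
    // Crude pattern: src="..." on img/script tags, href="..." on link tags.
    private static readonly Regex ResourcePattern = new Regex(
        @"<(?:img|script)\b[^>]*?\bsrc\s*=\s*[""']([^""']+)[""']" +
        @"|<link\b[^>]*?\bhref\s*=\s*[""']([^""']+)[""']",
        RegexOptions.IgnoreCase);

    // Returns the absolute http/https URLs referenced by the HTML, without duplicates.
    public static List<Uri> ExtractResourceUrls(string html, Uri pageUrl)
    {
        var seen = new HashSet<string>();
        var result = new List<Uri>();
        foreach (Match m in ResourcePattern.Matches(html))
        {
            string raw = m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value;
            Uri absolute;
            // Resolve relative URLs against the page URL.
            if (!Uri.TryCreate(pageUrl, raw, out absolute)) continue;
            bool isHttp = absolute.Scheme == Uri.UriSchemeHttp
                       || absolute.Scheme == Uri.UriSchemeHttps;
            if (isHttp && seen.Add(absolute.AbsoluteUri))
                result.Add(absolute);
        }
        return result;
    }

    // Downloads the page plus every extracted resource and returns the total bytes.
    public static long TotalSizeInBytes(string url)
    {
        var pageUrl = new Uri(url);
        using (var client = new WebClient())
        {
            byte[] page = client.DownloadData(pageUrl);
            long total = page.Length;
            // Assumes the page is UTF-8 (or ASCII-compatible), which is enough to find the tags.
            string html = Encoding.UTF8.GetString(page);
            foreach (Uri resource in ExtractResourceUrls(html, pageUrl))
            {
                try { total += client.DownloadData(resource).Length; }
                catch (WebException) { /* skip resources that fail to load */ }
            }
            return total;
        }
    }
}
```

PageSizeEstimator.TotalSizeInBytes("http://www.google.com") would return the sum in bytes. Expect it to differ from the browser's figure: files loaded by scripts at run time or referenced from inside CSS are not found this way, and every link href is counted, not just stylesheets.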


podiluska


Hi, thanks for the answer. Any idea how to do that? – user1493460 Jul 23 '12 at 14:36
That's an explanation, but not an answer to the OP's question. – comecme Jul 23 '12 at 14:52
Some hints here: http://stackoverflow.com/questions/60609/automate-safari-web-browser-using-c-sharp-on-windows – podiluska Jul 23 '12 at 14:53
This is not an answer; don't waste time checking it. – ozziem Aug 23 '22 at 12:51
