In my application I am trying to calculate how much memory HtmlAgilityPack uses when loading documents.

For example, if I load an HtmlDocument into memory with the code below, how can I verify the size of that single document? I believe it will match the amount of data transferred for the HTTP request and its result; am I right?

HtmlDocument doc = await web.LoadFromWebAsync("https://www.somefancywebsite.com/");

What is the memory size of the doc object?

I did not manage to find a similar topic on SO, and I'm sorry if this is something obvious. I suspect that I am supposed to convert the document to some other format whose memory size I could easily verify. Would the following do the job correctly? Or should I use ToArray, or a similar approach?

int memorySize = doc.ToString().Length; // size in bytes??

EDIT: in this article I found that each string takes 20+(n/2)*4 bytes, where n is the number of characters. Maybe I could use this formula?
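As a minimal sketch, the quoted formula could be applied like this. The class and method names are hypothetical, and the code assumes that n/2 means integer division; it only reproduces the arithmetic from the article and does not measure anything at runtime.

```csharp
using System;

// Hypothetical helper: applies the quoted estimate 20 + (n / 2) * 4 bytes.
static class StringSizeEstimate
{
    // Assumption: n / 2 is integer division, so odd lengths round down.
    public static int EstimateBytes(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        int n = s.Length;          // number of UTF-16 code units
        return 20 + (n / 2) * 4;   // fixed overhead plus 4 bytes per pair of chars
    }
}
```

For a five-character string such as "hello", this gives 20 + 2 * 4 = 28 bytes. The result is only as accurate as the formula itself, which depends on the runtime the article describes.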

EDIT: as it is a managed object, I tried the solutions from here and here. Unfortunately, I am getting an exception on the doc object:

System.Runtime.Serialization.SerializationException: Type 'HtmlAgilityPack.HtmlDocument' in Assembly 'HtmlAgilityPack (...)

EDIT: based on this solution, I am converting doc.ToString(), since the document is basically text. This gives me some sort of result, but I am not completely satisfied with it. I am wondering how much I can rely on this solution and whether there are any alternatives.

bakunet

1 Answer

Using the class TestSize<T> from this GitHub repository and calling its sizing method was considered a safe approach according to this post. This method seems to work for most types. However, for finding the byte count of a string, it worked better for me to use

System.Text.Encoding.Unicode.GetByteCount(myString);
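A small sketch of how this byte count could be used follows. The class and method names are hypothetical. Encoding.Unicode is UTF-16, which matches how .NET stores string characters in memory, while the UTF-8 count is shown only for comparison, as a rough guess at how the same text might be sized on the wire; the actual transfer size depends on the server's encoding and compression.

```csharp
using System;
using System.Text;

// Hypothetical helper: byte counts of a string's characters in two encodings.
static class ByteCounts
{
    // UTF-16: 2 bytes per char; excludes the string object's own header.
    public static int Utf16Bytes(string s) => Encoding.Unicode.GetByteCount(s);

    // UTF-8: 1 to 4 bytes per code point; shown for comparison only.
    public static int Utf8Bytes(string s) => Encoding.UTF8.GetByteCount(s);
}
```

For the 14-character string "<html>é</html>", Utf16Bytes returns 28 and Utf8Bytes returns 15. With HtmlAgilityPack, the markup text could be taken from doc.DocumentNode.OuterHtml, but this measures only the text, not the node objects the parser builds around it.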

One thing to note is that the architecture (32-bit/64-bit), as well as the version of C#, may affect the memory sizes.
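To see which architecture a process is actually running under, something like the following sketch could be used. The class and method names are hypothetical; IntPtr.Size and Environment.Is64BitProcess are standard .NET members.

```csharp
using System;

// Hypothetical helper: reports the architecture of the current process.
static class RuntimeInfo
{
    // Size of a native pointer in bytes: 4 in a 32-bit process, 8 in a 64-bit one.
    public static int PointerSizeBytes() => IntPtr.Size;

    // True when the current process runs as 64-bit.
    public static bool Is64Bit() => Environment.Is64BitProcess;
}
```

Object headers and references scale with the pointer size, so the same document would be expected to occupy more memory in a 64-bit process.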

Jamin