40

Can anyone tell me how many bytes the below string will take up?

string abc = "a";
dav_i
Mohit Kumar
  • 3
    It takes 17 bytes in the source. Please make your question more specific: do you mean in memory at runtime, when encoded to byte[], ... – Daniel Mošmondor Oct 19 '10 at 10:35
  • 3
    You will need to clarify some things, for example: This string, where are you going to save it? In memory, as the .NET type "string"? In a file? With which encoding? Why are you interested? If in .NET's memory, each additional string with the same content doesn't necessarily use a lot more memory since the old one can be reused – Onkelborg Oct 19 '10 at 10:35
  • 2
    It takes little enough that you really shouldn't be concerned about it :-) – paxdiablo Oct 19 '10 at 10:36

3 Answers

42

From my article on strings:

In the current implementation at least, strings take up 20+(n/2)*4 bytes (rounding the value of n/2 down), where n is the number of characters in the string. The string type is unusual in that the size of the object itself varies. The only other classes which do this (as far as I know) are arrays. Essentially, a string is a character array in memory, plus the length of the array and the length of the string (in characters). The length of the array isn't always the same as the length in characters, as strings can be "over-allocated" within mscorlib.dll, to make building them up easier. (StringBuilder does this, for instance.) While strings are immutable to the outside world, code within mscorlib can change the contents, so StringBuilder creates a string with a larger internal character array than the current contents requires, then appends to that string until the character array is no longer big enough to cope, at which point it creates a new string with a larger array. The string length member also contains a flag in its top bit to say whether or not the string contains any non-ASCII characters. This allows for extra optimisation in some cases.

I suspect that was written before I had a chance to work with a 64-bit CLR; in 64-bit land each string probably takes up either 4 or 8 more bytes.

EDIT: I wrote up a blog post more recently which includes 64-bit information (and contradicts the above slightly for x86...)
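
As a rough illustration of the two formulas discussed in this answer and its comments, here is a minimal sketch that evaluates both. The class and method names are hypothetical, and the numbers are only the estimates quoted here for 32-bit (x86) CLRs, not measurements:

```csharp
using System;

public static class StringSizeEstimates
{
    // Estimate quoted from the article: 20 + (n/2)*4, with n/2 rounded down
    // (integer division does the rounding).
    public static int ArticleEstimate(int n) => 20 + (n / 2) * 4;

    // Estimate from the later blog post, as given in the comments:
    // 14 + n*2, rounded up to the nearest multiple of 4.
    public static int BlogEstimate(int n) => (14 + n * 2 + 3) / 4 * 4;

    public static void Main()
    {
        foreach (int n in new[] { 0, 1, 2, 3, 10 })
        {
            Console.WriteLine(
                $"n={n}: article={ArticleEstimate(n)}, blog={BlogEstimate(n)}");
        }
    }
}
```

For the one-character string in the question, the first formula gives 20 bytes and the second gives 16. To find out which applies on a given runtime, you would have to measure it.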

Jon Skeet
  • Well, that makes string very unpopular if you want to store a great number of them in memory... – Daniel Mošmondor Oct 19 '10 at 10:41
  • So a 1-character string will take 20 bytes according to your article. And 20 bytes is the object data. Where is the character stored then ? :-) – meze Jan 17 '13 at 15:31
  • @meze: Not sure what you mean by "20 bytes is the object data". Could you clarify? Also see http://msmvps.com/blogs/jon_skeet/archive/2011/04/05/of-memory-and-strings.aspx for more recent information - will add that in. – Jon Skeet Jan 17 '13 at 15:34
  • I assumed that the constant "20" is the size of internal details of a string object that is not related to the string content. I've just found your explanation that `null` takes 2 bytes in a string. My guess is that 20 is actually '18 + 2 for null value', and that's why an empty string and 1-character string will take the same amount of memory. – meze Jan 17 '13 at 15:47
  • @meze: Well, my other blog post has a slightly different formula: 14+n*2, rounded up to the nearest 4 bytes. That would give 16 bytes for a 1-character string, rather than 20. I'm not sure which is accurate, off-hand. – Jon Skeet Jan 17 '13 at 15:53
  • Jon sir, 2 questions. 1)Is the string stored as a char array in memory for c#? 2)How did you get the constant factor 20? – Therii Oct 21 '21 at 19:30
  • @Therii: 1) No, it's not a separate char array. 2) See the blog post for details. – Jon Skeet Oct 21 '21 at 19:40
  • Ty Jon sir. Sir, I have heard the word overhead in many of your answers eg "object overhead - a string takes up about 20 bytes". Can you please explain? – Therii Oct 22 '21 at 03:29
  • @Therii: I've explained as best I can in the blog post written at the end of the answer. – Jon Skeet Oct 22 '21 at 06:04
12

Basically, each string object requires a constant 20 bytes for the object data, and the buffer requires 2 bytes per character. The estimated memory usage of a string, in bytes, is therefore 20 + (2 * Length). So in the CLR this string normally takes 22 bytes.

However, when we pass or send this string elsewhere, or use it in some other way, we do not need that much memory (we never need the 20 bytes of object data). The size then depends on the encoding you choose when you use it.

With the default encoding, a character such as 'a' takes 1 byte.

So the answer is 1 byte for the default encoding.

You can check with this code:

byte[] one = Encoding.Default.GetBytes("a");     // byte array of length 1
byte[] three = Encoding.Default.GetBytes("ABC"); // byte array of length 3
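
What Encoding.Default returns depends on the runtime and platform, so it can be easier to reason about explicitly named encodings. The following is a minimal sketch (the class and method names are hypothetical) that uses GetByteCount to compare the encoded size of the same string under UTF-8, UTF-16 and UTF-32:

```csharp
using System;
using System.Text;

public static class EncodedSizes
{
    // Number of bytes needed to encode s, without any byte order mark.
    public static int Utf8Bytes(string s) => Encoding.UTF8.GetByteCount(s);
    public static int Utf16Bytes(string s) => Encoding.Unicode.GetByteCount(s);
    public static int Utf32Bytes(string s) => Encoding.UTF32.GetByteCount(s);

    public static void Main()
    {
        // "\u00E9" is a single non-ASCII character (e with acute accent).
        foreach (string s in new[] { "a", "ABC", "\u00E9" })
        {
            Console.WriteLine(
                $"\"{s}\": UTF-8={Utf8Bytes(s)}, UTF-16={Utf16Bytes(s)}, UTF-32={Utf32Bytes(s)}");
        }
    }
}
```

For "a" this gives 1, 2 and 4 bytes respectively. A non-ASCII character such as "\u00E9" needs 2 bytes in UTF-8, so one byte per character only holds for ASCII text.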
Tamilmaran
1

If you are asking about the size of the string object, there is no simple answer: without a debugger it is impossible to say exactly what it is, and I am not sure it is possible with a debugger either. string uses pointers internally.

If you are asking about the size of the sequence of chars that it contains, then it is 2 bytes for "a", because strings are stored in UTF-16. All chars in the Basic Multilingual Plane are coded with two bytes.
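
A minimal sketch of the two-bytes-per-char arithmetic, assuming only the character data is counted (no object header, length field or terminator). The class and method names are hypothetical:

```csharp
using System;

public static class Utf16Size
{
    // Bytes occupied by the UTF-16 code units of s; sizeof(char) is 2 in C#.
    public static int CharDataBytes(string s) => s.Length * sizeof(char);

    public static void Main()
    {
        Console.WriteLine(CharDataBytes("a"));          // one BMP character
        Console.WriteLine(CharDataBytes("ab"));         // two BMP characters
        Console.WriteLine(CharDataBytes("\U0001F600")); // one character outside the BMP
    }
}
```

A character outside the Basic Multilingual Plane is stored as a surrogate pair, so it has a Length of 2 and takes 4 bytes.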

Andrey