I'm wondering whether I can find out how long a string is, in bytes,
in C#. Does anyone know?
153

Majid

user705414
-
Check out [this answer](http://stackoverflow.com/questions/472906/net-string-to-byte-array-c-sharp). – Sergey Kalinichenko Jan 03 '12 at 04:02
-
Are you asking how much memory a `string` object occupies, or how many bytes the representation of a string will occupy when written to a file or sent over a network (i.e. encoded)? Those are two completely different questions. majidgeek almost answered the former while diya answered the latter (at least for two common encodings). – Allon Guralnek May 03 '13 at 08:50
-
possible duplicate of [how much bytes will take?](http://stackoverflow.com/questions/3967411/how-much-bytes-will-take) – nawfal Oct 23 '13 at 07:12
-
@AllonGuralnek: Good point. Do you know why diya below didn't suggest using System.Text.Encoding.Unicode.GetByteCount instead? Why the ASCIIEncoding part? – Giorgi Moniava Oct 17 '15 at 19:55
-
@Giorgi: Since `Unicode` is a static property of `System.Text.Encoding`, which is the base class of `ASCIIEncoding`, both statements are actually the same. You can access a static member from subclasses as well (but it's not considered idiomatic). – Allon Guralnek Oct 18 '15 at 04:42
4 Answers
179
You can use an encoding such as ASCII, which takes one byte per character, through the System.Text.Encoding class.
Or try this:
System.Text.ASCIIEncoding.Unicode.GetByteCount(yourString);
System.Text.ASCIIEncoding.ASCII.GetByteCount(yourString);
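As a rough sketch of these calls in use (the sample strings and variable names are made up for illustration), `GetByteCount` reports the encoded size without building a byte array. The static properties are accessed here through `Encoding`, the base class that declares them, and `Encoding.Unicode` is UTF-16 in .NET.

```csharp
using System;
using System.Text;

// Hypothetical sample strings, chosen only for illustration.
string plain = "hello";          // 5 characters, all within ASCII
string accented = "h\u00E9llo";  // "héllo": U+00E9 is outside ASCII

int asciiBytes = Encoding.ASCII.GetByteCount(plain);      // 1 byte per character
int utf16Bytes = Encoding.Unicode.GetByteCount(plain);    // 2 bytes per UTF-16 code unit
int utf8Bytes = Encoding.UTF8.GetByteCount(accented);     // U+00E9 takes 2 bytes in UTF-8
int lossyBytes = Encoding.ASCII.GetByteCount(accented);   // U+00E9 is replaced by '?'

Console.WriteLine($"{asciiBytes} {utf16Bytes} {utf8Bytes} {lossyBytes}"); // 5 10 6 5
```

Note that the ASCII count for the accented string looks plausible only because the default fallback silently substitutes `?` for the character it cannot represent.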

diya
-
Stupid question, but how will we know whether to use the Unicode or ASCII class if the data in the string came from a 3rd party file? – Matthew Lock Feb 24 '14 at 01:11
-
@MatthewLock You should use UTF16 (or majidgeek's `Length * sizeof(Char)`, which should give the same result since each `Char` is UTF16/2-bytes) if you want the same number of bytes as the internal representation of a string. If you actually want the exact amount of memory the entire object takes, rather than just the number of bytes in its internal character array, then you might consider a [more general method](http://stackoverflow.com/questions/1128315/find-size-of-object-instance-in-bytes-in-c-sharp). – Bob Jul 02 '14 at 02:25
122
From MSDN:
A String object is a sequential collection of System.Char objects that represent a string.
So you can use this:
var howManyBytes = yourString.Length * sizeof(Char);
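One way to sanity-check this formula, sketched with a made-up sample string: `Length` counts UTF-16 code units, so `Length * sizeof(char)` should match `Encoding.Unicode.GetByteCount`, while the UTF-8 size of the same string is usually different.

```csharp
using System;
using System.Text;

// Hypothetical sample string ("naïve"), chosen only for illustration.
string sample = "na\u00EFve";

int inMemoryBytes = sample.Length * sizeof(char);        // 5 code units * 2 bytes
int utf16Bytes = Encoding.Unicode.GetByteCount(sample);  // same data encoded as UTF-16
int utf8Bytes = Encoding.UTF8.GetByteCount(sample);      // U+00EF takes 2 bytes in UTF-8

Console.WriteLine($"{inMemoryBytes} {utf16Bytes} {utf8Bytes}"); // 10 10 6
```

The figure covers only the character data. It does not include the overhead of the string object itself, and it is not the size the string would have in a UTF-8 file or response.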

Majid
-
As far as I understand the basics of data structures, it's the most 'pined' choice to compare to. – LoneXcoder Oct 08 '15 at 09:38
-
Don't forget to take into account the size of the length member. int howManyBytes = yourString.Length * sizeof(Char) + sizeof(int); – Zoltan Tirinda Nov 29 '15 at 07:56
-
This should be the correct answer. Also, .Length is the number of bytes the server will receive if you send that same string. This is what I needed. – DaWiseguy Sep 27 '19 at 21:41
-
This answer is misleading. This calculation only tells you how many bytes the string takes in RAM while the application is running, which is probably the least useful information somebody would want. When writing a string to a file or an API response, this calculation will fail badly, because it doesn't take encoding into account. – Robert Synoradzki Aug 02 '23 at 11:22
29
System.Text.ASCIIEncoding.Unicode.GetByteCount(yourString);
Or
System.Text.ASCIIEncoding.ASCII.GetByteCount(yourString);
10
How many bytes a string will take depends on the encoding you choose (or the one that is chosen for you in the background without your knowledge). This sample code shows the difference:
using System;
using System.Text;

class Program
{
    static void Main()
    {
        // Declared as an array: a single Encoding cannot hold the list.
        Encoding[] testedEncodings =
        {
            Encoding.ASCII,   // Note that '🡪' cannot be encoded in ASCII, data loss will occur
            Encoding.Unicode, // This is UTF-16. It is used by .NET to store your strings in RAM when the application is running, but this isn't useful information unless you're trying to manipulate bytes in RAM
            Encoding.UTF8,    // This should always be your choice nowadays
            Encoding.UTF32
        };

        // 'a' followed by U+1F86A, which needs a surrogate pair in UTF-16.
        string text = "a🡪";
        Console.WriteLine($"Tested string: {text}");
        Console.WriteLine($"String length: {text.Length}");
        Console.WriteLine();

        PrintTableHeader("Encoding", "Bytes", "Decoded string");
        foreach (var encoding in testedEncodings)
        {
            // Round-trip the text through each encoding to expose any data loss.
            byte[] bytes = encoding.GetBytes(text);
            string decodedString = encoding.GetString(bytes);
            PrintTableRow(
                encoding.EncodingName,
                $"{bytes.Length} ({string.Join(' ', bytes)})",
                decodedString);
        }
    }

    static void PrintTableHeader(params string[] values)
    {
        PrintTableRow(values);
        Console.WriteLine(new string('-', 60));
    }

    static void PrintTableRow(params string[] values)
    {
        Console.WriteLine("{0,-16} | {1,-24} | {2}", values);
    }
}
Output:
Tested string: a🡪
String length: 3
Encoding         | Bytes                    | Decoded string
------------------------------------------------------------
US-ASCII         | 3 (97 63 63)             | a??
Unicode          | 6 (97 0 62 216 106 220)  | a🡪
Unicode (UTF-8)  | 5 (97 240 159 161 170)   | a🡪
Unicode (UTF-32) | 8 (97 0 0 0 106 248 1 0) | a🡪
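The `a??` row shows ASCII silently replacing what it cannot represent. If you would rather detect that than lose data, one possible approach is an encoding configured with exception fallbacks. This is only a sketch: `CanEncode` is a hypothetical helper written for this example, not a framework method.

```csharp
using System;
using System.Text;

// Strict ASCII: throws on unrepresentable characters instead of substituting '?'.
Encoding strictAscii = Encoding.GetEncoding(
    "us-ascii",
    EncoderFallback.ExceptionFallback,
    DecoderFallback.ExceptionFallback);

// Hypothetical helper: true if every character of 'text' can be encoded.
bool CanEncode(Encoding encoding, string text)
{
    try
    {
        encoding.GetByteCount(text); // the exception fallback fires here
        return true;
    }
    catch (EncoderFallbackException)
    {
        return false;
    }
}

Console.WriteLine(CanEncode(strictAscii, "a"));            // True
Console.WriteLine(CanEncode(strictAscii, "a\U0001F86A"));  // False
```

The default `Encoding.ASCII` instance keeps its replacement behaviour, so the strict variant has to be requested explicitly as above.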

Robert Synoradzki