In C#, if I have 4-5 GB of data in the form of bytes and I convert it into a string, what will be the impact on memory, and how can I manage memory better when holding such a large string in a variable?

Code

public byte[] ExtractMessage(int start, int end)
{
    if (end <= start)
        return null;

    byte[] message = new byte[end - start];
    int remaining = Size - end;

    Array.Copy(Frame, start, message, 0, message.Length);

    // Shift any remaining bytes to front of the buffer
    if (remaining > 0)
        Array.Copy(Frame, end, Frame, 0, remaining);

    Size = remaining;
    ScanPosition = 0;

    return message;
}



byte[] rawMessage = Buffer.ExtractMessage(messageStart, messageEnd);

// Once the bytes are received, I want to create an XML file that will be used for further work

string msg = Encoding.UTF8.GetString(rawMessage);

CreateXMLFile(msg);

public void CreateXMLFile(string msg)
{
    string fileName = "msg.xml";
    if (File.Exists(fileName))
    {
        File.Delete(fileName);
    }
    using (File.Create(fileName)) { }

    TextWriter tw = new StreamWriter(fileName, true);
    tw.Write(msg);
    tw.Close();
}
Gaurav123
  • a **lot** of context is missing here. – The Paramagnetic Croissant Oct 16 '15 at 10:01
  • I just want to know the memory management between bytes and string, so I am not showing any code. Please don't negative votes instead if you have knowledge share it – Gaurav123 Oct 16 '15 at 10:03
  • Here's the first bit of knowledge you need: **It depends on the language.** – Ignacio Vazquez-Abrams Oct 16 '15 at 10:05
  • @TheParamagneticCroissant : are you doing negative voting to my other questions as well ?????? you are looking like a sick buddy man. Just chill :) – Gaurav123 Oct 16 '15 at 10:08
  • @IgnacioVazquez-Abrams : Thanks :) I have updated the language in my question. It's C# – Gaurav123 Oct 16 '15 at 10:09
  • Guarav123, the code sample would show what you currently have, and what you want to change it to. Then we can give you a decent chance of an answer that's relevant. You don't need to show the whole thing. – Russ Clarke Oct 16 '15 at 10:14
  • @RussClarke : Thanks, I have updated my question. Hope you will get an idea – Gaurav123 Oct 16 '15 at 10:20
  • What this code shows is that you have a buffer with text data that you want to save as XML. There's no need to convert it to string first, you can write it directly to disk – Panagiotis Kanavos Oct 16 '15 at 10:25
  • When you have a `byte[]` containing UTF8-encoded text, then you use about 1 byte per character (guessing that you don't use an asian language). When you convert it to a String, those characters now become `Char`s, which are 2 bytes in size. Plus you still have the original byte array. – Hans Kesting Oct 16 '15 at 10:30
  • @HansKesting that's incorrect - UTF8 uses 1-4 bytes per character. Only ASCII characters are encoded to 1 byte only – Panagiotis Kanavos Oct 16 '15 at 11:12
  • @PanagiotisKanavos - I didn't want to confuse the issue, but that is why I said "*about* 1 byte" and mentioned asian languages. I could have included emoji as well. I'm guessing the OP's XML sample will use mostly "ASCII chars" which take 1 byte. – Hans Kesting Oct 16 '15 at 12:13
  • @Gaurav123 I have no idea what your other questions are. I'm not purposefully following you and downvoting your questions. If, however, your questions' quality is typically as low as that of this one, then chances are I have downvoted (and/or will downvote) those too. BTW, there's no need to give me spite/revenge downvotes on my answers. – The Paramagnetic Croissant Oct 17 '15 at 12:19

1 Answer

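.NET strings are stored as UTF-16, which means two bytes per char. UTF-8 uses one to four bytes per character, with a single byte for each ASCII character, so for mostly-ASCII text such as typical XML you'll roughly double the memory usage when converting to a string. The original byte array also stays in memory for as long as something still references it.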
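The size difference can be checked directly. The following is a minimal sketch, not a measurement of real heap usage: it ignores the fixed per-object overhead, and the class and method names (EncodingSizeDemo, Utf8Bytes, Utf16Bytes) are made up for illustration.

```csharp
using System;
using System.Text;

static class EncodingSizeDemo
{
    // Bytes occupied by the characters of a .NET string (UTF-16, 2 bytes per char),
    // ignoring the small fixed object header.
    public static long Utf16Bytes(string s) => (long)s.Length * sizeof(char);

    // Bytes the same text occupies when encoded as UTF-8.
    public static long Utf8Bytes(string s) => Encoding.UTF8.GetByteCount(s);

    static void Main()
    {
        string ascii = "<msg id=\"1\">hello</msg>";      // ASCII only: 1 byte per character in UTF-8
        string mixed = "<msg>h\u00e9llo \u20ac</msg>";   // contains a 2-byte and a 3-byte UTF-8 character

        Console.WriteLine($"ASCII: {Utf8Bytes(ascii)} bytes as UTF-8, {Utf16Bytes(ascii)} bytes as string");
        Console.WriteLine($"Mixed: {Utf8Bytes(mixed)} bytes as UTF-8, {Utf16Bytes(mixed)} bytes as string");
    }
}
```

For ASCII-only input the string figure is exactly twice the UTF-8 figure. For text with many multi-byte characters the ratio is smaller.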

Once you've converted the text to a string, nothing more will happen unless you try to modify it. String objects are immutable, which means that a new copy of the string is created each time you modify it using a method such as Remove().
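A small sketch of that behaviour, assuming a non-empty input string. The helper name RemoveAllocatesNewString is hypothetical and exists only for this illustration.

```csharp
using System;

static class ImmutabilityDemo
{
    // Returns true when Remove() handed back a different, shorter object
    // and left the original string at its original length.
    // Assumes original contains at least one character.
    public static bool RemoveAllocatesNewString(string original)
    {
        int lengthBefore = original.Length;
        string shortened = original.Remove(0, 1); // allocates a new string

        return !ReferenceEquals(original, shortened)
            && original.Length == lengthBefore
            && shortened.Length == lengthBefore - 1;
    }

    static void Main()
    {
        Console.WriteLine(RemoveAllocatesNewString("<msg>hello</msg>"));
    }
}
```

With a multi-gigabyte string, each such call would need room for another string of almost the same size.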

You can read more here: How are strings passed in .NET?

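A byte array, however, is a reference type: passing it to a method or assigning it to another variable copies only the reference, so each change affects all variables holding it. Modifying it in place therefore does not allocate a copy and will not hurt performance or memory consumption.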
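A short sketch of what this means in practice. The names ArrayReferenceDemo and Overwrite are illustrative only.

```csharp
using System;

static class ArrayReferenceDemo
{
    // Overwrites the first byte in place; no copy of the array is made.
    public static void Overwrite(byte[] buffer, byte value)
    {
        buffer[0] = value;
    }

    static void Main()
    {
        byte[] frame = { 1, 2, 3 };
        byte[] alias = frame;       // copies the reference, not the three bytes

        Overwrite(alias, 42);

        // Both variables refer to the same array, so the change is visible through either.
        Console.WriteLine(frame[0]);
        Console.WriteLine(ReferenceEquals(frame, alias));
    }
}
```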

You can get a byte[] from a string by using var buffer = yourEncoding.GetBytes(yourString);. Common encodings are available as static properties: var buffer = Encoding.UTF8.GetBytes(yourString);
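In the question's scenario the conversion may not be needed at all. As suggested in the comments, a buffer that already holds UTF-8 text can be written to disk as-is. The sketch below assumes the bytes are valid UTF-8 XML. RawXmlWriter and WriteMessage are hypothetical names, not part of the original code. Data in the 4-5 GB range would have to be written in several chunks like this anyway, since a single .NET array or string cannot hold that much.

```csharp
using System;
using System.IO;
using System.Text;

static class RawXmlWriter
{
    // Writes count bytes of frame, starting at start, straight to fileName.
    // FileMode.Create truncates an existing file, so no separate
    // Delete/Create step is needed, and no string is ever allocated.
    public static void WriteMessage(string fileName, byte[] frame, int start, int count)
    {
        using (var output = new FileStream(fileName, FileMode.Create, FileAccess.Write))
        {
            output.Write(frame, start, count);
        }
    }

    static void Main()
    {
        // Stand-in for the receive buffer: two junk bytes on each side of the message.
        byte[] frame = Encoding.UTF8.GetBytes("xx<msg>hello</msg>yy");

        WriteMessage("msg.xml", frame, 2, frame.Length - 4);
        Console.WriteLine(new FileInfo("msg.xml").Length);
    }
}
```

Writing from the receive buffer with an offset and a count also avoids the intermediate copy that ExtractMessage makes. For appending several chunks to the same file, FileMode.Append could be used instead of FileMode.Create.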

jgauffin
  • That isn't correct - strings in .NET are Unicode, i.e. 2 bytes per character, while the original buffer contains UTF-8 encoded text, which varies from 1 to 4 bytes. – Panagiotis Kanavos Oct 16 '15 at 11:11
  • @PanagiotisKanavos: true. Didn't check how he stored the array. Will update the answer. – jgauffin Oct 16 '15 at 11:57