
I have a file that contains text data and binary data. This may not be a good idea, but there's nothing I can do about it. I know the end and start positions of the binary data.

What would be the best way to read the binary data between those positions, make a Base64 string out of it, and then write it back to the position it came from?

EDIT: The Base64-encoded string won't be the same length as the binary data, so I might have to pad the Base64 string to the binary data length.

james.garriss
hs2d
  • Your base64 string is guaranteed to be (4/3) bigger than the binary data – Mark Peters Jun 28 '11 at 19:34
  • Hmm, ok. That's a good thing to know. Thanks – hs2d Jun 28 '11 at 19:41
  • Base64 will already be bigger - you can't pad it out to fit... – The Evil Greebo Jun 28 '11 at 20:00
  • I would like to know more about why you want to do this. I suspect that you are asking for a way to implement the wrong solution. – Jeffrey L Whitledge Jun 28 '11 at 20:21
  • @Jeffrey, the reason I need to do this is that we receive a file holding some data. To make it readable for our application, I have to change the binary fields to base64 strings. – hs2d Jun 28 '11 at 20:34
  • Consider using the encoded word format; it's made to do exactly what you are doing. http://tools.ietf.org/html/rfc2047#section-2 – james.garriss Nov 05 '14 at 17:14
  • @james.garriss Can you reveal how you find the start point of the binary data? I am struggling with a similar problem, although I do not need to write. I have a mixed text/binary file and I do not know how to isolate the binary data. There is a nice text header above it. – Dave Ludwig May 22 '20 at 18:20
  • For an encoded word, it's the section marked "encoding". – james.garriss May 26 '20 at 16:18
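
To put numbers on the size comments above: standard padded Base64 emits 4 characters for every 3 input bytes, rounding up, so n bytes become 4 * ceil(n / 3) characters. A minimal sketch follows; the `Base64Size` class and `EncodedLength` helper are made-up names for illustration, not part of .NET.

```csharp
using System;

public static class Base64Size
{
    // Length of the padded Base64 text for a given number of input bytes.
    // (byteCount + 2) / 3 is integer arithmetic for ceil(byteCount / 3).
    public static int EncodedLength(int byteCount)
    {
        return 4 * ((byteCount + 2) / 3);
    }
}
```

For a 50-byte block, `Base64Size.EncodedLength(50)` gives 68, the same as `Convert.ToBase64String(new byte[50]).Length`. Since the encoded text is never shorter than the input (apart from empty input), padding it down to the original length is not possible.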

3 Answers


You'll want to use a FileStream object, and the Read(byte[], int, int) and Write(byte[], int, int) methods.

The point about base64 being bigger than the binary is valid, though: you'll need to read the data beyond the end point of what you want to replace, store it, write your new data to the file, and then write the stored data back out after it.

I trust you're not trying to mod exe files to write viruses here... ;)

The Evil Greebo
  • But what if I have a couple of binary blocks one after another and they all need to be converted to separate base64 strings? Wouldn't that be impossible, because the length would change and my positions wouldn't be accurate anymore? – hs2d Jun 28 '11 at 19:55
  • Like I said, you'd need to first identify the end point of what you're replacing, then capture everything beyond that - THEN you would start writing your output in the new format w/o regard to the prior positions, and finally append the original end data after you finish your changes. The file size, of course, would change - but going from binary to base 64 means you can't avoid that w/o chopping out something. – The Evil Greebo Jun 28 '11 at 20:00
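
One way to sidestep the shifting positions raised in the comments above is to write to a new output instead of patching in place: copy the text spans unchanged, encode each binary span as it is reached, and keep all offsets relative to the original data. The sketch below assumes the block list is sorted and non-overlapping, that each end offset is exclusive, and that the input stream starts at position 0. `BlockEncoder`, `Rewrite`, `CopyBytes`, and `ReadExactly` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class BlockEncoder
{
    // Copies input to output, replacing each [Start, End) byte range with its Base64 text.
    // Offsets refer to the ORIGINAL data, so they stay valid while the output grows.
    public static void Rewrite(Stream input, Stream output, IList<(long Start, long End)> blocks)
    {
        long position = 0;
        foreach (var block in blocks)
        {
            // Text before the block is copied through unchanged.
            CopyBytes(input, output, block.Start - position);

            byte[] binary = ReadExactly(input, (int)(block.End - block.Start));
            byte[] encoded = Encoding.ASCII.GetBytes(Convert.ToBase64String(binary));
            output.Write(encoded, 0, encoded.Length);

            position = block.End;
        }

        // Everything after the last block.
        input.CopyTo(output);
    }

    private static void CopyBytes(Stream input, Stream output, long count)
    {
        byte[] chunk = ReadExactly(input, (int)count);
        output.Write(chunk, 0, chunk.Length);
    }

    // Stream.Read may return fewer bytes than requested, so loop until the count is met.
    private static byte[] ReadExactly(Stream input, int count)
    {
        var buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            int read = input.Read(buffer, total, count - total);
            if (read == 0) throw new EndOfStreamException();
            total += read;
        }
        return buffer;
    }
}
```

With files, the input could come from `File.OpenRead` and the output from `File.Create` on a different path, with the original replaced once the copy succeeds. The sketch buffers each span in memory, which is fine for small files but would need chunked copying for very large ones.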
// Requires: using System; using System.IO; using System.Text;
string path = @"C:\Test Soap";
int binaryStart = 100;
int binaryEnd = 150; // treated as exclusive: the block is bytes 100..149

// Holds the data after the binary block so it can be re-appended after the Base64 text.
byte[] dataTailBuffer = null;

string base64String = null;

// Read the binary block and convert it to a Base64 string.
using (var fileStream = new FileStream(path, FileMode.Open, FileAccess.Read))
using (var reader = new BinaryReader(fileStream))
{
    reader.BaseStream.Seek(binaryStart, SeekOrigin.Begin);

    // ReadBytes keeps reading until the count is met or the stream ends;
    // a single Read call may return fewer bytes than requested.
    byte[] buffer = reader.ReadBytes(binaryEnd - binaryStart);

    base64String = Convert.ToBase64String(buffer);

    // Save whatever follows the block, even if it is only one byte.
    long remaining = reader.BaseStream.Length - reader.BaseStream.Position;
    if (remaining > 0)
    {
        dataTailBuffer = reader.ReadBytes((int)remaining);
    }
}

// Write the Base64 text at the original position, then the saved tail.
using (var fileStream = new FileStream(path, FileMode.Open, FileAccess.Write))
using (var writer = new BinaryWriter(fileStream))
{
    writer.Seek(binaryStart, SeekOrigin.Begin);

    // Write raw ASCII bytes. BinaryWriter.Write(string) would prepend a length prefix.
    writer.Write(Encoding.ASCII.GetBytes(base64String));

    if (dataTailBuffer != null)
    {
        writer.Write(dataTailBuffer);
    }
}
Jalal Said
  • Yes, that gets the reading part sorted. I'm struggling more with writing the base64 string back to the file. – hs2d Jun 28 '11 at 19:37
  • @hs2d: answer updated. please leave a comment if you find any issue. – Jalal Said Jun 28 '11 at 19:57
  • @Jalal: that looks ok, but you forgot the issue that the base64 string is bigger than the original binary data and there is more data after the binary block. – hs2d Jun 28 '11 at 20:01
  • @hs2d: if that's the case, we could save the data after the end position to a buffer and, after writing the base64String, just insert that data again at the end of the file. – Jalal Said Jun 28 '11 at 20:06
  • Was just testing the code and I get an exception on this line: `using (System.IO.Stream fileStream = new FileStream(@"your file path", FileMode.Open | FileMode.Append))` Exception: `Enum value was out of legal range. Parameter name: mode` – hs2d Jun 28 '11 at 20:37
  • `using (System.IO.Stream fileStream = new FileStream(@"your file path", FileMode.Open, FileAccess.ReadWrite))` – Jalal Said Jun 28 '11 at 20:42
  • Now the `BinaryWriter` says that the Stream was not writable. – hs2d Jun 28 '11 at 20:49
  • That's because we use it in the binary reader, so we need to open the file twice: once for reading and once for writing. A solution is to use `using (System.IO.Stream fileStream = new FileStream(@"your file path", FileMode.Open, FileAccess.Read)) //perform the binary reader here` and then `using (System.IO.Stream fileStream = new FileStream(@"your file path", FileMode.Open, FileAccess.Write)) //perform the BinaryWriter here` – Jalal Said Jun 28 '11 at 20:53
  • @hs2d: Sorry for the delay, _Finally_ I got access to my computer with visual studio, and fixed all bugs within the previous -not tested- code. Let me know if you have any problem. – Jalal Said Jun 29 '11 at 01:07
  • @Jalal, I don't know why, but this BinaryWriter adds a D at the beginning of every base64 string. When debugging, there is no D at the beginning of the string in the code, but there is one in the file. – hs2d Jun 29 '11 at 05:17
  • Was testing a little more here, and the writer still adds some strange characters to the beginning of the base64 string. Why could that be? Here's an example, in the text file: http://imageshack.us/photo/my-images/30/strangeh.png/ And in Visual Studio: http://imageshack.us/photo/my-images/94/strange2w.png/ – hs2d Jun 29 '11 at 05:59
  • OK, found the answer myself: http://stackoverflow.com/questions/1488486/why-does-binarywriter-prepend-gibberish-to-the-start-of-a-stream-how-do-you-avoi – hs2d Jun 29 '11 at 06:04
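
The stray leading character reported in the comments above comes from `BinaryWriter.Write(string)`, which writes a length prefix before the text. A 50-byte block encodes to 68 Base64 characters, and byte 68 is the ASCII code for `D`, which would explain the symptom if the block was 50 bytes long. The sketch below contrasts the two overloads; `PrefixDemo`, `WriteAsString`, and `WriteAsBytes` are illustrative names only.

```csharp
using System;
using System.IO;
using System.Text;

public static class PrefixDemo
{
    // Bytes produced by BinaryWriter.Write(string): a 7-bit-encoded length, then the text.
    public static byte[] WriteAsString(string text)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BinaryWriter(stream))
            {
                writer.Write(text);
            }
            return stream.ToArray();
        }
    }

    // Bytes produced by writing the text as a raw byte array: no prefix is added.
    public static byte[] WriteAsBytes(string text)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BinaryWriter(stream))
            {
                writer.Write(Encoding.ASCII.GetBytes(text));
            }
            return stream.ToArray();
        }
    }
}
```

For `Convert.ToBase64String(new byte[50])`, `WriteAsString` yields 69 bytes starting with `D`, while `WriteAsBytes` yields exactly the 68 characters of the encoded text.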

Clearly, writing out base-64 in the place of binary data cannot work, since the base-64 will be longer. So the question is, what do you need to do this for?

I will speculate that you have inherited this terrible binary file format, and you would like to use a text editor to edit the textual portions of this binary file. If that is the case, then perhaps a more robust round-trip conversion (binary to text and back to binary) is what you need.

I recommend using base-64 for the binary portions, but the rest of the file should be wrapped up in XML, or some other format that would be easy to parse and interpret. XML is good, because the parsers for it are already available in the system.

<mydoc>
    <t>Original text</t>
    <b fieldId="1">base-64 binary</b>
    <t>Hello, world!</t>
    <b fieldId="2">928h982hr98h2984hf</b>
</mydoc>

This file can be easily created from your specification, and it can be easily edited in any text editor. Then the file can be converted back into the original format. If any text intrudes into the binary fields, then it can be truncated. Likewise, text that is too short could be padded with spaces.
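
A rough sketch of that round trip, using `System.Xml.Linq`, might look like the following. It assumes the file has already been split into ordered text and binary segments, and that the text portions are plain printable ASCII (an assumption, since the original format is not specified). The `MixedFileXml` class and its methods are hypothetical, and the truncation and space-padding of edited text described above is left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class MixedFileXml
{
    // Builds <mydoc> from ordered segments; binary segments become Base64 <b> elements.
    public static XElement ToXml(IEnumerable<(bool IsBinary, byte[] Data)> segments)
    {
        var root = new XElement("mydoc");
        int fieldId = 0;
        foreach (var segment in segments)
        {
            if (segment.IsBinary)
            {
                root.Add(new XElement("b",
                    new XAttribute("fieldId", ++fieldId),
                    Convert.ToBase64String(segment.Data)));
            }
            else
            {
                // Assumption: text segments are printable ASCII.
                root.Add(new XElement("t", Encoding.ASCII.GetString(segment.Data)));
            }
        }
        return root;
    }

    // Reverses ToXml: decodes each element and concatenates the bytes in document order.
    public static byte[] FromXml(XElement root)
    {
        using (var output = new MemoryStream())
        {
            foreach (XElement element in root.Elements())
            {
                byte[] data = element.Name.LocalName == "b"
                    ? Convert.FromBase64String(element.Value)
                    : Encoding.ASCII.GetBytes(element.Value);
                output.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }
}
```

Calling `ToXml(...).ToString()` gives text that can be edited by hand, and `FromXml(XElement.Parse(editedText))` turns it back into bytes. A real converter would also need to enforce the fixed field lengths of the original format.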

Jeffrey L Whitledge
  • Actually, the software I'm trying to convert the binary to base64 for does exactly what you are suggesting. (: – hs2d Jun 28 '11 at 20:45