4

I am querying the table (one of the columns is a VARBINARY(MAX)) which returns some records.

Then I save that as .dat.csv then I parse through that .dat file and get that varbinary value into a string by splitting the file based on commas. Now I need to convert this varbinary to byte array. How can I do that?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Pinu
  • 7,310
  • 16
  • 53
  • 77
  • 1
    you want convert the string containing varbinary in a byte array true? – Roberto Conte Rosito Feb 04 '11 at 16:58
  • 1
    I'm likely missing something, but why the intermediate, to CSV step? If you've got a string and you want byte[], you can just run var encoding=new System.Text.UTF8Encoding(); var bytes = encoding.GetBytes(str); – Sean Feb 04 '11 at 17:02
  • @Sean - I tried that it's giving me corrupt file – Pinu Feb 04 '11 at 17:10
  • then what encoding is the file being written in? I just grabbed the code I had, which was UTF8. If you can skip the CSV step I would (or write it binary vs. encoded). Writing binary files into a text format is a bad practice -- especially one sensitive to quotes, commas and line breaks. It might be base64 encoded in the CSV, so you might have to decode the base64 encoding. – Sean Feb 04 '11 at 17:13

2 Answers2

6

Good question. Technically, you can do this by first converting to a character array, then converting to bytes. However, strings in .NET are Unicode-encoded by default (instead of ASCII), so it gets tricky.

If at all possible, you should try to pull the varbinary out of the file as a byte array, using the FileStream you're reading from instead of the StreamReader which performs encoding conversions to and from the file encoding type.

The problem with this byte-to-string-to-byte babelfishing is that certain bytecodes have special meaning in each Unicode encoding, giving information to the decoder about the number of bytes it should pull to decode the next character. When converting between various Unicode encodings and the .NET-native UTF-8 encoding for strings, bytes will be gained, lost, and changed. When it's a string, no biggie; the encoding information stays with the string. When it's binary data, the encoding and decoding can garble it unless it's done in a very specific way.

The only way this will work flawlessly is if you write the file out using ASCII encoding, then read it back in as such, which will cause each individual byte to be treated as a single character. You can then simply convert each char back to a byte, and the more significant byte of the UInt16 behind the scenes of the Syetem.Char, which is just zero-padding for the byte fed in to that char, will be discarded.

var reader = new StreamReader(new FileStream("test.csv"), Encoding.ASCII);
var varBinaryString = reader.Read(<wherever the varbinary is in the file/line>);

var byteArray = varBinaryString.ToCharArray().Select(c=>(byte)c).ToArray();

Technically, you could pull it in using any Unicode encoding as well, but you need to know a lot of specifics about how you wrote out those bytes and how the reader is reading them back in, so that you can perform the correct encoding and expansion (or deflation) as necessary to get the original bytestream.

EDIT: The .NET 2.0 version - no Linq:

StreamReader reader = new StreamReader(new FileStream("test.csv"), Encoding.ASCII);
string varBinaryString = reader.Read(<wherever the varbinary is in the file/line>);

char[] charArray = varBinaryString.ToCharArray();
byte[] byteArray = new byte[charArray.Length];

for(int i=0; i< charArray.Length; i++)
{
    byteArray[i] = (byte)charArray[i];
}
Chris Mack
  • 5,148
  • 2
  • 12
  • 29
KeithS
  • 70,210
  • 21
  • 112
  • 164
  • What's confusing to you? The first line is setting up a StreamReader which will read from your file and will assume that it was written in ASCII encoding. The second like is a placeholder for however you would use the StreamReader to actually get the varbinary out of the file. The third one is what you really should pay attention to, but just be aware that it ONLY works for a string consisting only of ASCII-compliant character codes. – KeithS Feb 04 '11 at 18:46
  • my C# version does not support var – Pinu Feb 04 '11 at 18:54
  • Ah, then just replace it with the concrete type; in the first line you'd declare a StreamReader, then in the second a string, then in the third a byte[]. `var` just tells the compiler to infer the variable will be of the type returned by the initialization expression, so if you end up changing it you don't have to go change the declared type too. However, if you can't use var, you probably also can't use Linq; it was introduced in C# 3.0 with Framework 3.0-3.5. – KeithS Feb 04 '11 at 22:44
  • @KeithS: Yeah, can you please give me equivalent code for c# 2.0 – Pinu Feb 07 '11 at 15:34
  • Thanks Keith , but when i write the byte array from the above code to the filestream. It does not open the file as it is not a supported file type.when i open that using text editor i still see the binary data. – Pinu Feb 07 '11 at 16:17
  • Well, when you tell the computer it's text (by looking at binary data in a text editor), it IS text. "Not a supported file type" can mean many things. Are you saving the file with the correct extension, for which Windows has a default program registered (or has it registered as an executable)? Did the CSV file you're pulling in get written with ASCII encoding, and not Unicode? These are the problems inherent in storing mixed text and binary data. – KeithS Feb 07 '11 at 16:39
  • trying to save the result byteArray to the database again as varbinary will result in wrong value -nothing like the original- https://imgur.com/329cGde they were images and i read them from a text file as ascii string and i make sure it's the same value as in the database before converting – Ahmed Mohammed Apr 12 '21 at 22:23
4

I have founded more simple solution for the case when you have only one or serveral values in SQL resultset and don't want to convert it to *.csv and so on.

The case when copy-pasting the SQL-varbinary-values is acceptable.

using System;
using System.Text;

namespace TestProj
{
    class Program
    {
        static void Main(string[] args)
        {
            //the value from database (Ctrl+C & Ctrl+V)
            string varbinaryStr = "0x7B2255";  

            //remove '0x' prefix AND trailing spaces
            var no_0x_varbinary_str = varbinaryStr.Replace("0x", "").Trim();

            //the bytes
            byte[] bytes = StrToByteArray(no_0x_varbinary_str);
            
            //now you can work with the bytes as you need
            //in my case there was an UTF8-string that have been saved in SQL as varbinary
            //so I am just decode it like this
            var string_from_database = Encoding.UTF8.GetString(bytes);
            
            Console.ReadKey();
        }
        
        static byte[] StrToByteArray(string str)
        {
            Dictionary<string, byte> hexindex = new Dictionary<string, byte>();
            for (int i = 0; i <= 255; i++)
                hexindex.Add(i.ToString("X2"), (byte)i);

            List<byte> hexres = new List<byte>();
            for (int i = 0; i < str.Length; i += 2)
                hexres.Add(hexindex[str.Substring(i, 2)]);

            return hexres.ToArray();
        }
    }
}

The example of string to byte[] convertion that have been used in StrToByteArray I found in this question where you can find other refference to the discussion of the fastest convertion way (with examples and tests).

Kamerton
  • 315
  • 3
  • 9