
I have a binary file that I'd like to open, read and understand; but I've never tried to work with binary information before.

Various questions (including "Using structs in C# to read data" and "How to read a binary file using c#?") helped me to open and read the file, but I have no idea how to interpret the information I've extracted so far.

One approach I got some hopeful data out of was this:

var sb = new StringBuilder();
using (BinaryReader reader = new BinaryReader(File.Open(filename, FileMode.Open, FileAccess.Read)))
{
    // Read the first 100 four-byte integers and record each with its index
    for (int i = 0; i < 100; i++)
    {
        int iValue = reader.ReadInt32();
        sb.AppendFormat("{1}={2}{0}", Environment.NewLine, i, iValue);
    }
}

Returns something like this:

0=374014592
1=671183229
2=558694987
3=-1018526206
4=1414798970
5=650
6=4718677
7=44
8=0
9=7077888
10=7864460

But this isn't what I was expecting, nor do I even know what it means. Have I successfully determined that the file contains a bunch of numbers, or am I looking at just one interpretation of the data (similar to how using the wrong encoding returns different characters for the same input)?

Do I have any hope or should I stop entirely?

Adrian K
  • Yes, you're looking at the interpretation of the data which treats it as a list of integers. You need to know what the format of the file is if you want to know what the "correct" interpretation is. – Blorgbeard Aug 10 '15 at 01:06
  • I would try reading it in as a list of bytes instead of Int32s and then apply ASCII decoding to them. Or more simply, open the file with Notepad. This will not get you all the way there, but it is possible there is some actual text in there that might give you a clue as to the structure of the rest of it. – WDS Aug 10 '15 at 01:50

2 Answers


You have to already know how the binary file is structured in order to be able to read and interpret the file properly.

For example, if you write to a binary file an int, a double, a boolean and a string, like this:

int i = 25;
double d = 3.14157;
bool b = true;
string s = "I am happy";

using (var bw = new BinaryWriter(new FileStream("mydata", FileMode.Create)))
{
    bw.Write(i);
    bw.Write(d);
    bw.Write(b);
    bw.Write(s);
}

then you must later read back the data values using the same types, in exactly the same order:

using (var br = new BinaryReader(new FileStream("mydata", FileMode.Open)))
{
    i = br.ReadInt32();
    Console.WriteLine("Integer data: {0}", i);
    d = br.ReadDouble();
    Console.WriteLine("Double data: {0}", d);
    b = br.ReadBoolean();
    Console.WriteLine("Boolean data: {0}", b);
    s = br.ReadString();
    Console.WriteLine("String data: {0}", s);
}

http://www.tutorialspoint.com/csharp/csharp_binary_files.htm

Here is what you would need to know to be able to successfully read a .WAV file (a binary file format that holds sound information). WAV files are one of the simpler binary formats:

Diagram of the .WAV file format

http://soundfile.sapp.org/doc/WaveFormat/
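
To make "knowing the structure" concrete, here is a minimal sketch that reads the canonical 44-byte PCM header described at the link above. The names WavHeaderSketch, WavHeader, Read and Expect are made up for this illustration, and the code assumes the simplest layout only: a 16-byte "fmt " chunk immediately followed by the "data" chunk. Real files can carry extra chunks that this does not handle.

```csharp
using System;
using System.IO;
using System.Text;

public static class WavHeaderSketch
{
    // Hypothetical container for the header fields this sketch reads.
    public sealed class WavHeader
    {
        public ushort AudioFormat;   // 1 means uncompressed PCM
        public ushort NumChannels;
        public uint SampleRate;
        public uint ByteRate;
        public ushort BlockAlign;
        public ushort BitsPerSample;
        public uint DataSize;        // number of bytes of sample data
    }

    // Reads the fields in exactly the order and with the types the format defines.
    public static WavHeader Read(Stream stream)
    {
        using (var br = new BinaryReader(stream, Encoding.ASCII, leaveOpen: true))
        {
            Expect(br, "RIFF");
            br.ReadUInt32();                     // ChunkSize (not needed here)
            Expect(br, "WAVE");
            Expect(br, "fmt ");
            uint fmtSize = br.ReadUInt32();      // Subchunk1Size
            if (fmtSize != 16)
                throw new InvalidDataException("Only the plain 16-byte PCM fmt chunk is handled.");

            var h = new WavHeader();
            h.AudioFormat = br.ReadUInt16();
            h.NumChannels = br.ReadUInt16();
            h.SampleRate = br.ReadUInt32();
            h.ByteRate = br.ReadUInt32();
            h.BlockAlign = br.ReadUInt16();
            h.BitsPerSample = br.ReadUInt16();

            Expect(br, "data");
            h.DataSize = br.ReadUInt32();        // Subchunk2Size
            return h;
        }
    }

    // Reads a four-character ASCII tag and fails loudly if it is not the expected one.
    private static void Expect(BinaryReader br, string tag)
    {
        string found = Encoding.ASCII.GetString(br.ReadBytes(4));
        if (found != tag)
            throw new InvalidDataException("Expected '" + tag + "' but found '" + found + "'.");
    }
}
```

Calling WavHeaderSketch.Read(File.OpenRead(path)) on a simple PCM file should fill in the fields. The point is that every ReadUInt16/ReadUInt32 call is dictated by the format description, not guessed from the bytes.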

Robert Harvey
  • Ok, so without knowing the structure I'm basically wasting my time? I.e. there's no way of being able to work the file structure out? – Adrian K Aug 10 '15 at 01:12
  • Do you have *any* idea what the file contains? Without some idea of what the file's content is, I don't see how it would be useful to you in any meaningful programming context. – Robert Harvey Aug 10 '15 at 01:12
  • I'm expecting key-value pairs. It's for a spare-time hobby project (not nefarious, honest!) so no real drama. It's more of a learning opportunity than anything else. – Adrian K Aug 10 '15 at 01:14
  • Is that all the information you have? Are the key and the value of some fixed-length type, or are they variable-length? Is there field length information in the file? – Robert Harvey Aug 10 '15 at 01:16
  • Thanks for the valuable help so far, it's helped me understand some of the other answers I was reading. I'll do some extra digging on the file format. I know others in the (gaming) community have parsed it, but all their implementations are in Python and other languages. – Adrian K Aug 10 '15 at 01:24

By definition a binary file is just a series of bits. Whether you interpret those bits as numbers, characters or something else depends entirely upon what was written into the file in the first place.

In general there's no way to tell what was written into the file by looking at the file contents. Of course if you interpret the bits as characters and get readable text then there's a good chance that text is what was written into the file. But a file containing only text typically wouldn't be described as a binary file.

By calling ReadInt32 you are assuming that the contents of your file are a series of four-byte integers. But what if eight-byte integers or floats or an enumeration or something else was written to your file? What if your file doesn't contain a multiple of four bytes?
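
As a small sketch of that point, the code below takes two of the Int32 values from the question's output (4718677 and 44), turns them back into the eight bytes that produced them, and then reads those same bytes in other ways. The class and method names (SameBytesSketch, ToBytes, Describe) are invented for this example, and nothing here says which reading, if any, is the right one for the asker's file.

```csharp
using System;
using System.IO;
using System.Text;

public static class SameBytesSketch
{
    // Serialises two Int32 values into the eight bytes that ReadInt32 would have
    // consumed to produce them (BinaryWriter and BinaryReader both use little-endian order).
    public static byte[] ToBytes(int first, int second)
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms))
        {
            bw.Write(first);
            bw.Write(second);
            bw.Flush();
            return ms.ToArray();
        }
    }

    // Shows several interpretations of the same eight bytes.
    // Assumes the array holds at least eight bytes.
    public static string Describe(byte[] bytes)
    {
        var sb = new StringBuilder();
        sb.AppendLine("hex    : " + BitConverter.ToString(bytes));

        using (var br = new BinaryReader(new MemoryStream(bytes)))
        {
            // One eight-byte integer instead of two four-byte ones
            sb.AppendLine("Int64  : " + br.ReadInt64());
        }

        using (var br = new BinaryReader(new MemoryStream(bytes)))
        {
            // Four two-byte unsigned integers
            sb.AppendLine("UInt16 : " + br.ReadUInt16() + ", " + br.ReadUInt16()
                          + ", " + br.ReadUInt16() + ", " + br.ReadUInt16());
        }

        // The same bytes decoded as UTF-16 (little-endian) text
        sb.AppendLine("UTF-16 : " + Encoding.Unicode.GetString(bytes));
        return sb.ToString();
    }
}
```

Console.Write(SameBytesSketch.Describe(SameBytesSketch.ToBytes(4718677, 44))) reports the bytes as 55-00-48-00-2C-00-00-00 and the UInt16 view as 85, 72, 44, 0, which are the character codes for 'U', 'H' and ','. That hints that this region might hold UTF-16 text, but it is only a hint; the file format's documentation is what settles it.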

You might consider changing your loop to use ReadByte rather than ReadInt32 so it might look something like this...

byte bValue = reader.ReadByte();
sb.AppendFormat("{1}=0x{2:X2}{0}", Environment.NewLine, i, bValue);

so you treat the file as a sequence of bytes and write the data out in hex rather than as a decimal number.
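
Taking that a step further, here is a rough sketch of a hex dump that prints an offset, the bytes in hex, and the printable ASCII characters side by side, similar to what a hex editor shows. HexDumpSketch and Dump are hypothetical names, and the defaults (256 bytes, 16 per line) are arbitrary choices for the example.

```csharp
using System;
using System.IO;
using System.Text;

public static class HexDumpSketch
{
    // Formats up to maxBytes from the stream as "offset  hex bytes  ascii" lines.
    public static string Dump(Stream stream, int maxBytes = 256, int bytesPerLine = 16)
    {
        var sb = new StringBuilder();
        var buffer = new byte[bytesPerLine];
        int offset = 0;

        while (offset < maxBytes)
        {
            int want = Math.Min(bytesPerLine, maxBytes - offset);
            int read = stream.Read(buffer, 0, want);
            if (read <= 0) break;                       // end of stream

            sb.AppendFormat("{0:X8}  ", offset);

            // Hex column, padded so the text column lines up on short rows
            for (int i = 0; i < bytesPerLine; i++)
                sb.Append(i < read ? buffer[i].ToString("X2") + " " : "   ");

            sb.Append(' ');

            // Text column: printable ASCII as-is, everything else as '.'
            for (int i = 0; i < read; i++)
            {
                byte b = buffer[i];
                sb.Append(b >= 0x20 && b < 0x7F ? (char)b : '.');
            }

            sb.AppendLine();
            offset += read;
        }
        return sb.ToString();
    }
}
```

For a file on disk, something like Console.Write(HexDumpSketch.Dump(File.OpenRead(filename))) would print the first 256 bytes. Runs of readable characters in the right-hand column are often a useful clue to where strings or field names sit in the file.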

Another approach might be to find a good hex editor and use that to inspect the file contents rather than writing your own code (at least to start with).

There is a simple hex editor built into Visual Studio (assuming that's what you are using). Go to File | Open | Open File. In the Open File dialog, select your binary file, click the drop-down arrow to the right of the Open button, choose Open With, and then select Binary Editor.

What you'll see is the contents of the file shown as hex and characters. Not great but quick.

Frank Boyne