Adam is pointing in exactly the right direction.
If you want to find out how to identify almost any file, look at the database behind the file command on a UNIX, Linux, or Mac OS X machine.
The file command uses a database of “magic numbers” (those initial bytes Adam listed) to detect a file's type. Running man file will tell you where to find the database on your machine, e.g. /usr/share/file/magic. Running man magic will tell you its format.
You can write your own detection code based on what you see in the database, use pre-packaged libraries (e.g. python-magic), or, if you're really adventurous, implement a .NET version of libmagic. I couldn't find one, and hope another member can point one out.
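As a rough sketch of the first option (writing your own detection code), the Python below compares a file's leading bytes against a small hand-built signature table. The table and the names SIGNATURES, sniff_type and sniff_file are made up for illustration and are not part of file or libmagic; only PNG and GIF are covered, so a real detector would need many more entries.

```python
# Hypothetical signature table: (offset, magic bytes, description).
# Only PNG and GIF are listed; extend it from the magic database as needed.
SIGNATURES = [
    (0, b"\x89PNG\r\n\x1a\n", "PNG image data"),
    (0, b"GIF87a", "GIF image data, version 87a"),
    (0, b"GIF89a", "GIF image data, version 89a"),
]


def sniff_type(data: bytes) -> str:
    """Return the description of the first matching signature, else 'data'."""
    for offset, magic, description in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return description
    return "data"


def sniff_file(path: str) -> str:
    """Read only the first few bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_type(handle.read(16))
```

Calling sniff_file on a path reads just 16 bytes, which is enough for the signatures in this table; longer or offset signatures would need a larger read.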
In case you don't have a UNIX machine handy, the database looks like this:
# PNG [Portable Network Graphics, or "PNG's Not GIF"] images
# (Greg Roelofs, newt@uchicago.edu)
# (Albert Cahalan, acahalan@cs.uml.edu)
#
# 137 P N G \r \n ^Z \n [4-byte length] H E A D [HEAD data] [HEAD crc] ...
#
0 string \x89PNG PNG image data,
>4 belong !0x0d0a1a0a CORRUPTED,
>4 belong 0x0d0a1a0a
>>16 belong x %ld x
>>20 belong x %ld,
>>24 byte x %d-bit
>>25 byte 0 grayscale,
>>25 byte 2 \b/color RGB,
>>25 byte 3 colormap,
>>25 byte 4 gray+alpha,
>>25 byte 6 \b/color RGBA,
#>>26 byte 0 deflate/32K,
>>28 byte 0 non-interlaced
>>28 byte 1 interlaced
1 string PNG PNG image data, CORRUPTED
# GIF
0 string GIF8 GIF image data
>4 string 7a \b, version 8%s,
>4 string 9a \b, version 8%s,
>6 leshort >0 %hd x
>8 leshort >0 %hd
#>10 byte &0x80 color mapped,
#>10 byte&0x07 =0x00 2 colors
#>10 byte&0x07 =0x01 4 colors
#>10 byte&0x07 =0x02 8 colors
#>10 byte&0x07 =0x03 16 colors
#>10 byte&0x07 =0x04 32 colors
#>10 byte&0x07 =0x05 64 colors
#>10 byte&0x07 =0x06 128 colors
#>10 byte&0x07 =0x07 256 colors
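To show how those rules read, here is a hedged Python translation of the uncommented PNG and GIF entries above. It is a hand-written approximation, not how libmagic works internally: the function name describe and the lookup tables are my own, inputs too short to hold all the fields are treated as corrupted (PNG) or unknown (GIF), and the ">0" tests on the GIF dimensions are skipped. In the rules, "belong" is a big-endian 32-bit integer, "leshort" is a little-endian 16-bit integer, and each extra ">" means the line is only tried when the line above it matched.

```python
import struct

# Meaning of the color-type byte at offset 25, as listed in the PNG rules.
# The "/" entries mirror the rules that use \b to join onto "N-bit".
PNG_COLOR = {0: " grayscale", 2: "/color RGB", 3: " colormap",
             4: " gray+alpha", 6: "/color RGBA"}
PNG_INTERLACE = {0: "non-interlaced", 1: "interlaced"}


def describe(data: bytes) -> str:
    """Describe PNG or GIF data by applying the quoted rules by hand."""
    if data[:4] == b"\x89PNG":
        # ">4 belong": big-endian 32-bit integer at offset 4.
        if len(data) < 29 or struct.unpack_from(">I", data, 4)[0] != 0x0D0A1A0A:
            return "PNG image data, CORRUPTED"
        # ">>16 belong" and ">>20 belong": width and height.
        width, height = struct.unpack_from(">II", data, 16)
        # Bytes 24..28: bit depth, color type, two skipped bytes, interlace.
        depth, color, _, _, interlace = struct.unpack_from("5B", data, 24)
        parts = [f"PNG image data, {width} x {height}",
                 f"{depth}-bit{PNG_COLOR.get(color, '')}"]
        if interlace in PNG_INTERLACE:
            parts.append(PNG_INTERLACE[interlace])
        return ", ".join(parts)
    if data[:4] == b"GIF8" and len(data) >= 10:
        text = "GIF image data"
        # ">4 string 7a" / ">4 string 9a": version suffix.
        if data[4:6] in (b"7a", b"9a"):
            text += ", version 8" + data[4:6].decode("ascii")
        # ">6 leshort" and ">8 leshort": little-endian width and height.
        width, height = struct.unpack_from("<HH", data, 6)
        return f"{text}, {width} x {height}"
    return "data"
```

Passing the first 32 or so bytes of a file to describe is enough, since every field used here sits within the first 29 bytes.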
Good luck!