44

I have a byte array filled from an uploaded file. In another part of the code, I need to determine the type of that file from the byte[] so I can send the correct Content-Type to the browser.

Thanks!!

André Miranda

11 Answers

24

As mentioned, MIME magic is the only way to do this. Many platforms provide up-to-date and robust MIME magic files and code to do this efficiently. Without third-party code, the only way to do this in .NET is to call FindMimeFromData from urlmon.dll. Here's how:

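// Requires: using System; using System.Runtime.InteropServices;

public static int MimeSampleSize = 256;

public static string DefaultMimeType = "application/octet-stream";

// FindMimeFromData takes wide (UTF-16) strings and pointer-sized arguments,
// so use LPWStr and IntPtr rather than LPStr and uint.
[DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true)]
private static extern int FindMimeFromData(
    IntPtr pBC,
    [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
    [MarshalAs(UnmanagedType.LPArray)] byte[] pBuffer,
    int cbSize,
    [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
    int dwMimeFlags,
    out IntPtr ppwzMimeOut,
    int dwReserved
);

public static string GetMimeFromBytes(byte[] data) {
    if (data == null || data.Length == 0) return DefaultMimeType;
    try {
        // Never claim more bytes than the buffer actually holds.
        int sampleSize = Math.Min(data.Length, MimeSampleSize);
        int hr = FindMimeFromData(IntPtr.Zero, null, data, sampleSize, null, 0, out IntPtr mimePointer, 0);
        if (hr != 0 || mimePointer == IntPtr.Zero) return DefaultMimeType;

        var mime = Marshal.PtrToStringUni(mimePointer);
        Marshal.FreeCoTaskMem(mimePointer);

        return mime ?? DefaultMimeType;
    }
    catch {
        return DefaultMimeType;
    }
}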

This uses the Internet Explorer MIME detector. This is the same code used by IE to send a MIME type along with uploaded files. You can see the list of MIME types supported by urlmon.dll. One thing to watch out for is image/pjpeg and image/x-png which are non-standard. In my code I replace these with image/jpeg and image/png.
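The replacement of the two non-standard names mentioned above could be sketched as follows. This is only an illustration: the class name `MimeTypeNormalizer` and the method `Normalize` are hypothetical and not part of the original answer, and the table covers just the two aliases named here.

```csharp
using System;
using System.Collections.Generic;

public static class MimeTypeNormalizer
{
    // Non-standard names reported by urlmon.dll, mapped to their standard equivalents.
    // Only the two aliases mentioned in the answer are listed.
    private static readonly Dictionary<string, string> Aliases =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "image/pjpeg", "image/jpeg" },
            { "image/x-png", "image/png" },
        };

    // Returns the standard name for a known alias, or the input unchanged.
    public static string Normalize(string mimeType)
    {
        if (string.IsNullOrEmpty(mimeType)) return mimeType;
        return Aliases.TryGetValue(mimeType, out var standard) ? standard : mimeType;
    }
}
```

You would call it on whatever the detector returns, for example `MimeTypeNormalizer.Normalize(GetMimeFromBytes(data))`.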

mroach
  • Your extern method declaration seems to be wrong. Someone wrote about it here: http://webandlife.blogspot.com/2012/11/google-is-your-alcoholic-friend.html – SandRock Jul 21 '13 at 22:56
  • 4
    Funny how his code before refactoring is exactly same as after refactoring. Doesn't bode well from someone who's pointing out mistakes at others' but apparently cannot handle copy/paste on his own. Kinda dents his credibility doesn't it? :) – Mrchief Aug 01 '14 at 14:59
  • @Mrchief: It's not the same. The first difference I found was the change from `uint` to `IntPtr`, which makes sense, because the post was specifically about matching C and C# data types. – Ben Voigt Dec 08 '17 at 19:26
11

If you know it's a System.Drawing.Image, you can do:

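// Requires: using System.Drawing; using System.Drawing.Imaging; using System.IO; using System.Linq;
public static string GetMimeTypeFromImageByteArray(byte[] byteArray)
{
   // Image.FromStream throws ArgumentException if the bytes are not a valid image.
   using (MemoryStream stream = new MemoryStream(byteArray))
   using (Image image = Image.FromStream(stream))
   {
       // Match the decoded image's format GUID against the installed encoders.
       return ImageCodecInfo.GetImageEncoders().First(codec => codec.FormatID == image.RawFormat.Guid).MimeType;
   }
}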
Uwe Keim
yazanpro
10

Not sure, but maybe you should look into magic numbers.

Update: Reading about it, I don't think it's very reliable though.
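For illustration, a magic-number check might look like the sketch below. The class `MagicNumberSniffer` and its signature table are hypothetical and cover only a handful of well-known formats; a match is still a guess, and ZIP-based containers such as .docx share the plain ZIP signature.

```csharp
using System;

public static class MagicNumberSniffer
{
    // (leading bytes, MIME type) pairs; a small illustrative subset only.
    private static readonly (byte[] Signature, string MimeType)[] Signatures =
    {
        (new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }, "image/png"),
        (new byte[] { 0xFF, 0xD8, 0xFF }, "image/jpeg"),
        (new byte[] { 0x47, 0x49, 0x46, 0x38 }, "image/gif"),             // "GIF8"
        (new byte[] { 0x25, 0x50, 0x44, 0x46, 0x2D }, "application/pdf"), // "%PDF-"
        (new byte[] { 0x50, 0x4B, 0x03, 0x04 }, "application/zip"),       // also .docx, .xlsx, .jar
    };

    // Returns the MIME type of the first matching signature, or the fallback.
    public static string Sniff(byte[] data, string fallback = "application/octet-stream")
    {
        if (data == null) return fallback;
        foreach (var (signature, mimeType) in Signatures)
        {
            if (StartsWith(data, signature)) return mimeType;
        }
        return fallback;
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```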

Carles Company
  • 2
    `FindMimeFromData` doesn't even detect something as basic as `audio/mp3`, so magic numbers are the only option if you're detecting something outside those 26 types. Can you elaborate on why you think they are unreliable? – Mrchief Aug 01 '14 at 15:28
8

You can't know it from the byte stream, but you can store the MIME type when you initially populate the byte[].

  • 3
    In general, you can't. However, you can use heuristics to check for magic numbers and guess the content type with a good probability (as the `file` command in UNIX does). You can check its source. – Mehrdad Afshari Oct 31 '09 at 16:35
  • You can fake it with System.Net.Mail's ContentType, by casting your uploaded file to an Attachment (not hard to do), or you can try the URLMON.DLL hack from this question: http://stackoverflow.com/questions/58510/in-c-how-can-you-find-the-mime-type-of-a-file-based-on-the-file-signature-not-th –  Oct 31 '09 at 22:50
7

Short answer: you can't

Longer answer: Usually, programs use the file extension to know what type of file they're dealing with. If you don't have that extension, you can only make guesses. For instance, you could look at the first few bytes and check whether you recognize a well-known header (an XML declaration, for instance, or a bitmap or JPEG header). But that will always be a guess in the end: without some metadata or information about the content, an array of bytes is just meaningless.

Thomas Levesque
  • 1
    A good example may be all the file types that wrap zip/cab files (e.g., .docx). Presumably, if I'm able to simply change the extension and open the file with another program, then the 'magic numbers' for the underlying file bytes would be the same, thus leading to ambiguity. – JoeBrockhaus Dec 03 '14 at 21:49
7

If you know the file name's extension, System.Web.MimeMapping may do the trick:

MimeMapping.GetMimeMapping(fileDisplayNameWithExtension)

I used it in an MVC action like this:

return File(fileDataByteArray, MimeMapping.GetMimeMapping(fileDisplayNameWithExtension), fileDisplayNameWithExtension);
Yasser Sobhdel
4

Reminds me of back in the day we, er um "some people" used to share 50MB rar files on the early free image hosting sites, by just adding the .gif extension to the .rar filename.

Clearly, if you are public facing and you are expecting a certain file type, and you have to be sure it is that file type, then you can't just trust the extension.

On the other hand, if your app has no reason to distrust the uploaded extension and/or MIME type, then just capture those when the file is uploaded, as suggested in the answers you received from @RossFabricant and @RandolphPotter. Create a type that holds the byte[] as well as the original extension or MIME type, and pass that around.
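Such a type could be sketched roughly like this. The name `UploadedFile` and its members are hypothetical, chosen only for illustration; how you obtain the file name and content type depends on your upload framework.

```csharp
using System;
using System.IO;

// Hypothetical carrier type: keeps the bytes together with the metadata
// captured at upload time, so the content type never has to be guessed later.
public sealed class UploadedFile
{
    public UploadedFile(byte[] content, string fileName, string contentType)
    {
        Content = content ?? throw new ArgumentNullException(nameof(content));
        FileName = fileName ?? string.Empty;
        // Fall back to a generic binary type when the client sent nothing usable.
        ContentType = string.IsNullOrWhiteSpace(contentType)
            ? "application/octet-stream"
            : contentType;
    }

    public byte[] Content { get; }
    public string FileName { get; }
    public string ContentType { get; }

    // Extension of the original file name, including the leading dot.
    public string Extension => Path.GetExtension(FileName);
}
```

The rendering code can then read `file.ContentType` directly instead of inspecting `file.Content`.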

If you need to verify that the file is actually a certain expected type like a valid .jpeg, or .png you can try to interpret the file as those types and see if it opens successfully. (System.Drawing.Imaging.ImageFormat)

If you are trying to classify the file only from the binary contents, and it could be any format in the whole wide world, that is really a tough, open-ended problem and there is no 100% reliable way to do it. You could invoke TrID against it, and there are likely similar forensics tools used by law enforcement investigators if you can find (and afford) them.

If you don't have to do it the hard way, don't.

DanO
1

You don't want to do it that way. Call Path.GetExtension when the file is uploaded, and pass the extension around with the byte[].

RossFabricant
0

If you have a limited number of expected file types you want to support, magic numbers can be the way to go.

A simple way to check is to just open example files with a text/hex editor, and study the leading bytes to see if there is something there you can use to differentiate/discard files from the supported set.
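If no hex editor is at hand, the leading bytes can also be dumped from code. The helper below is a hypothetical sketch (`HexPreview.LeadingBytes` is not an existing API) built on `BitConverter.ToString`:

```csharp
using System;

public static class HexPreview
{
    // Formats the first `count` bytes as space-separated uppercase hex pairs.
    public static string LeadingBytes(byte[] data, int count = 8)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        int length = Math.Min(Math.Max(count, 0), data.Length);
        if (length == 0) return string.Empty;
        // BitConverter.ToString yields "89-50-4E-47"; swap the hyphens for spaces.
        return BitConverter.ToString(data, 0, length).Replace('-', ' ');
    }
}
```

Running it over a few sample files of each supported type shows which leading bytes stay constant and can serve as a signature.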

If, on the other hand, you are looking to recognize any arbitrary file type, yeah, as everyone has stated already, tough.

Oskuro
0

Using the RawFormat.Guid property of System.Drawing.Image, you can detect the MIME type of images.

However, I am not sure how to detect other file types.

http://www.java2s.com/Code/CSharp/Network/GetImageMimeType.htm

UPDATE: You may want to take a look at this post:

Using .NET, how can you find the mime type of a file based on the file signature not the extension

Community
0

I got an AccessViolationException when using the code from the other answers, so I solved my problem with this code:

[DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = false)]
private static extern int FindMimeFromData(IntPtr pBc,
    [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
    [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1, SizeParamIndex = 3)]
    byte[] pBuffer,
    int cbSize,
    [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
    int dwMimeFlags,
    out IntPtr ppwzMimeOut,
    int dwReserved
);

/**
 * This function will detect mime type from provided byte array
 * and if it fails, it will return default mime type
 */
private static string GetMimeFromBytes(byte[] dataBytes, string defaultMimeType)
{
    if (dataBytes == null) throw new ArgumentNullException(nameof(dataBytes));

    // Start from the default so a failed detection never returns an empty string.
    var mimeType = defaultMimeType;

    try
    {
        var ret = FindMimeFromData(IntPtr.Zero, null, dataBytes, dataBytes.Length, null, 0, out var outPtr, 0);
        if (ret == 0 && outPtr != IntPtr.Zero)
        {
            var detected = Marshal.PtrToStringUni(outPtr);
            Marshal.FreeCoTaskMem(outPtr);
            if (!string.IsNullOrEmpty(detected)) mimeType = detected;
        }
    }
    catch
    {
        mimeType = defaultMimeType;
    }

    return mimeType;
}

How to call it:

string ContentType = GetMimeFromBytes(byteArray, "image/jpeg");

Hope this helps!

Homayoon Ahmadi