I'm handling uploads of different file types on the server side.
I have implemented an action that returns the file format by comparing the leading bytes of the file with the known byte sequences of specific file formats.
While searching, I found this answer, which helped me a lot.
So I implemented my action like this:
// Needs: using System; using System.Linq;
private static MediaFormat GetFormat(byte[] bytes, string fileName = null)
{
    // Nothing to inspect, so the format cannot be determined.
    if (bytes == null)
        return MediaFormat.unknown;

    // These are the leading byte sequences of my file formats.
    byte[] jpeg = new byte[] { 255, 216, 255, 224 };
    byte[] jpeg2 = new byte[] { 255, 216, 255, 225 };
    byte[] png = new byte[] { 137, 80, 78, 71 };
    byte[] doc = new byte[] { 208, 207, 17, 224, 161, 177, 26, 225 };
    // .docx is a ZIP container, so it shares the ZIP signature.
    byte[] docx_zip = new byte[] { 80, 75, 3, 4 };
    byte[] pdf = new byte[] { 37, 80, 68, 70, 45, 49, 46 };

    if (jpeg.SequenceEqual(bytes.Take(jpeg.Length)))
        return MediaFormat.jpg;
    if (jpeg2.SequenceEqual(bytes.Take(jpeg2.Length)))
        return MediaFormat.jpg;
    if (png.SequenceEqual(bytes.Take(png.Length)))
        return MediaFormat.png;
    if (doc.SequenceEqual(bytes.Take(doc.Length)))
        return MediaFormat.doc;
    if (docx_zip.SequenceEqual(bytes.Take(docx_zip.Length)))
    {
        // Same signature for both, so fall back to the file extension.
        // EndsWith avoids matching names that merely contain ".zip", such as "my.zip.docx".
        if (!string.IsNullOrEmpty(fileName) && fileName.EndsWith(".zip", StringComparison.OrdinalIgnoreCase))
            return MediaFormat.zip;
        return MediaFormat.docx;
    }
    if (pdf.SequenceEqual(bytes.Take(pdf.Length)))
        return MediaFormat.pdf;

    return MediaFormat.unknown;
}
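Since I expect to add more signatures once I find them, the same comparison could probably be driven by a table instead of one `if` per format. This is only a rough sketch: the `SignatureSketch` class, the `Signatures` table and the `Detect` method are names I made up for illustration, the `MediaFormat` enum is repeated here with the members I assume it has, and the .zip/.docx file-name check is left out for brevity.

```csharp
using System;
using System.Linq;

// Assumed shape of the enum used by GetFormat; repeated so the sketch compiles on its own.
public enum MediaFormat { unknown, jpg, png, doc, docx, zip, pdf }

public static class SignatureSketch
{
    // Hypothetical lookup table: each entry pairs a leading byte sequence with a format.
    private static readonly (byte[] Magic, MediaFormat Format)[] Signatures =
    {
        (new byte[] { 255, 216, 255, 224 }, MediaFormat.jpg),
        (new byte[] { 255, 216, 255, 225 }, MediaFormat.jpg),
        (new byte[] { 137, 80, 78, 71 }, MediaFormat.png),
        (new byte[] { 208, 207, 17, 224, 161, 177, 26, 225 }, MediaFormat.doc),
        // ZIP container signature; treated as docx here because the file-name check is omitted.
        (new byte[] { 80, 75, 3, 4 }, MediaFormat.docx),
        (new byte[] { 37, 80, 68, 70, 45, 49, 46 }, MediaFormat.pdf),
    };

    // Returns the format of the first table entry whose bytes match the start of the input.
    public static MediaFormat Detect(byte[] bytes)
    {
        if (bytes == null)
            return MediaFormat.unknown;

        foreach (var (magic, format) in Signatures)
        {
            if (magic.SequenceEqual(bytes.Take(magic.Length)))
                return format;
        }
        return MediaFormat.unknown;
    }
}
```

With a table like this, supporting a new format would only mean adding one more entry, which is why I am looking for the missing byte sequences.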
The author of the answer I found (and linked in this question) points to a site that lists byte sequences for other formats, but unfortunately that site now returns a 404, so I couldn't find all the formats I need: PowerPoint (.ppt, .pptx), Excel (.xls, .xlsx), CSV (.csv) and, if it is even possible, plain text (.txt).
Could anyone tell me what the correct byte sequences are, or where I can find them? Thanks so much!