After googling, I found that magic numbers can be used to identify the content type of a file.

In my program, I would like to validate the file content type on the server side.

My client-side code:

<form action="/Home/Index" method="post" enctype="multipart/form-data">
    <input type="file" id="inputFile" value="" onchange="readFileContent(this)" />
    <input type="submit" value="Submit" />
</form>

function readFileContent(input) {
    if (input.files && input.files[0]) {
        var file = input.files[0];

        var xhr = new XMLHttpRequest();
        xhr.open('POST', '/Home/CheckFileType', true);
        // The raw file is the request body, so it is not multipart/form-data.
        xhr.setRequestHeader('Content-Type', 'application/octet-stream');
        xhr.setRequestHeader('X-File-Name', file.name);
        xhr.setRequestHeader('X-File-Type', file.type);
        xhr.setRequestHeader('X-File-Size', file.size);
        // Register the handler before sending the request.
        xhr.onreadystatechange = function () {
            if (xhr.readyState === 4 && xhr.status === 200) {
                alert(xhr.responseText);
            }
        };
        // send() accepts a File directly, so no FileReader is needed.
        xhr.send(file);
    }
}

And this is my server-side code:

[HttpPost]
public JsonResult CheckFileType()
{
    // Content type as claimed by the browser; it has not been verified.
    string fileType = Request.Headers["X-File-Type"];

    // Stream.Read may return fewer bytes than requested,
    // so copy the whole request body into memory instead.
    byte[] buffer;
    using (var ms = new System.IO.MemoryStream())
    {
        Request.InputStream.CopyTo(ms);
        buffer = ms.ToArray();
    }

    object result = new { status = "finished" };
    return Json(result, JsonRequestBehavior.AllowGet);
}

What is the magic number for a plain-text or .txt file?

Yeasin Abedin

1 Answer

Magic numbers, in the sense discussed here, are often used to indicate what kind of data a binary file contains. A program that parses a file can look at the magic number and then know what to do with the rest of the file. For example, the magic number for all Java .class files (not source files) is 0xCAFEBABE. At runtime, when the class loader loads a class, it looks at the first 4 bytes, and if they aren't 0xCAFEBABE, it will not treat the file as a valid Java class file. If you were defining your own file type for software that you or others would write, you could define your own magic number or numbers. Software that creates files of your type would be responsible for writing the appropriate magic number to the file, and software that reads the files could use that magic number to help decide what to do.
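As a rough illustration of the check described above, here is a minimal JavaScript sketch that compares the leading bytes of a buffer against a magic number. The helper name `startsWithMagic` and the sample byte arrays are made up for this example; the same comparison could be done on the `buffer` array in the C# action.

```javascript
// Hypothetical helper: true when `bytes` begins with every byte listed in `magic`.
function startsWithMagic(bytes, magic) {
  if (bytes.length < magic.length) return false;
  return magic.every((b, i) => bytes[i] === b);
}

// Magic number of Java .class files (0xCAFEBABE), one array entry per byte.
const JAVA_CLASS_MAGIC = [0xCA, 0xFE, 0xBA, 0xBE];

// Made-up sample inputs: the start of a class file, and some plain text.
const classFileStart = new Uint8Array([0xCA, 0xFE, 0xBA, 0xBE, 0x00, 0x00, 0x00, 0x34]);
const textFileStart = new TextEncoder().encode("hello world");

console.log(startsWithMagic(classFileStart, JAVA_CLASS_MAGIC)); // true
console.log(startsWithMagic(textFileStart, JAVA_CLASS_MAGIC));  // false
```

A match only says the file claims to be that format; it does not prove the rest of the file is valid.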

Magic numbers do not make sense for plain text files. If you write a magic number to the file, it is no longer a plain text file; it is a file that follows your own format, which might be a magic number followed by a bunch of plain text. If that is what you want, then do it. I don't know what your app is doing, but conceivably that might make sense, as long as you know the files will always be read and written by your own software (or other software that is compliant with your magic number expectations).
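To make the "magic number followed by plain text" idea concrete, here is a small JavaScript sketch of such a custom format. Everything in it is an assumption for illustration: the 4-byte magic `MYTX`, the UTF-8 encoding of the body, and the function names `encodeWithMagic` and `decodeWithMagic`.

```javascript
// Hypothetical 4-byte magic number for a made-up format: ASCII "MYTX".
const MY_MAGIC = new Uint8Array([0x4D, 0x59, 0x54, 0x58]);

// Writer side: prepend the magic number to the UTF-8 bytes of the text.
function encodeWithMagic(text) {
  const body = new TextEncoder().encode(text);
  const out = new Uint8Array(MY_MAGIC.length + body.length);
  out.set(MY_MAGIC, 0);
  out.set(body, MY_MAGIC.length);
  return out;
}

// Reader side: return the text if the magic number is present, otherwise null.
function decodeWithMagic(bytes) {
  const hasMagic =
    bytes.length >= MY_MAGIC.length &&
    MY_MAGIC.every((b, i) => bytes[i] === b);
  if (!hasMagic) return null;
  return new TextDecoder().decode(bytes.subarray(MY_MAGIC.length));
}

console.log(decodeWithMagic(encodeWithMagic("hello")));                  // "hello"
console.log(decodeWithMagic(new TextEncoder().encode("just a .txt")));   // null
```

A file produced this way opens as text with four stray characters at the front, which is exactly why it is no longer a plain text file.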

I hope that helps.

Jeff Scott Brown
  • What if someone rename an executable file to make it looks like text file and try to upload it, what would be the proper way to detect real text file from fake ones? – Mehdi Dehghani Feb 02 '22 at 16:24
  • "What if someone rename an executable file to make it looks like text file and try to upload it, what would be the proper way to detect real text file from fake ones?" - It depends on what qualifies as a "fake one". For example, if anything that doesn't begin with 0xCAFEBABE counts as fake, then read those initial bytes and verify they are as expected. – Jeff Scott Brown Feb 02 '22 at 17:40
  • No, let's say there is some executable file called `my-file.blah`, then user rename it to `my-file.txt` and upload it via file uploader in my application's form. if txt file is not recognizable via magic number, how can I check if the file is real text file, not some other file renamed to .txt? – Mehdi Dehghani Feb 02 '22 at 17:46
  • @MehdiDehghani The file you are describing isn't a text file. Just renaming the extension to `.txt` doesn't change the contents of the file. If you are given a file whose name ends in `.txt` and you want to know if the contents of that file contain a valid JVM class declaration, one piece of validating that would be to read the first 4 bytes and verify that they are 0xCAFEBABE. – Jeff Scott Brown Feb 02 '22 at 17:57
  • "how can I check if the file is real text file, not some other file renamed to .txt?" - It depends on what your definition of "real" is. If "real" means that it begins with a certain magic number, read the contents of the file and see if that number is there. – Jeff Scott Brown Feb 02 '22 at 17:57
  • I'm sorry, I should mention it before, I'm not talking about JVM or any other specific file type here, based on `Magic numbers do not make sense for plain text files.` I'm wondering how can I detect if some file (it could be any file but plain text file) renamed to be looks like plain text file. e.g: somefile.someextension - renamed to -> somefile.txt – Mehdi Dehghani Feb 02 '22 at 18:04
  • "I'm wondering how can I detect if some file (it could be any file but plain text file) renamed to be looks like plain text file. e.g: somefile.someextension - renamed to -> somefile.txt" - You need something that defines the realness or fakeness of the file you are receiving. If the expected file format contains a magic number, look for that. If the expected file format has an expected size, look for that. You have the entire file to inspect. Do whatever it takes to validate that it contains expected input. Checking the extension is probably insufficient. – Jeff Scott Brown Feb 02 '22 at 18:55