Is there any way to find the actual format of a file? For example, somebody could manually rename a .csv file to .xlsx (or to .pdf) on the desktop. The file extension is now .xlsx, but the content is still CSV. Is there any way to detect this kind of malicious file in C#? The code below gets the file extension, but it cannot tell whether the extension was changed manually:

Path.GetExtension(Be.FileName).ToLower().Contains(".csv")  

I want to check both the content type and the file extension. An intruder may change the file extension to one that we accept while the content belongs to a different format, and the code above only checks the extension in the file name, not the actual format of the content.

Does CsvHelper have a property for detecting this kind of file?

Can anyone help with this?

  • What you've got does find "the exact extension" - but that's very different from "the format of the actual content of the file." There's no guaranteed relationship between a filename and its content. – Jon Skeet Dec 01 '17 at 11:24
  • You could invent some kind of heuristic file checker ... but the sheer number of cases makes that impractical. You know what you "want" as a file: try to parse it, catch errors and chide the user. Crap in - crap out. – Patrick Artner Dec 01 '17 at 11:26
  • Also, bear in mind that two files may contain *exactly the same* content but the intended *interpretation* is different. E.g. the issue that content-types in HTTP tries to deal with. – Damien_The_Unbeliever Dec 01 '17 at 11:29
  • Possible duplicate of [How to determine file type?](https://stackoverflow.com/questions/4177922/how-to-determine-file-type) – mjwills Dec 01 '17 at 11:39
  • I want to check both the content type and the file extension, since an intruder may change the file extension to one that we accept while the content belongs to a different format. The code above only checks the extension in the file name, not the actual format of the content, right? – Sai Krishnan Harish Dec 01 '17 at 11:57
  • Did you read all of the suggestions in the link? – mjwills Dec 01 '17 at 12:02
  • Quoting @JonSkeet: _there's no guaranteed relationship between a filename and its content_. I read (quite a long time ago) about a way to create a file that is a valid BMP and valid HTML at the same time (with the bitmap content and the HTML content completely unrelated). Name the file .bmp, double-click it, and the OS will open Paint or whatever the default editor is, showing an image. Name it .html and it will open the browser, showing the HTML. Long story short, you can maybe write some heuristic parser, but be aware that you cannot be sure it will work correctly 100% of the time. – Gian Paolo Dec 01 '17 at 12:36
  • @SaiKrishnanHarish, usually one uses backups to restore or revert files to a previous state. IMO, you should concentrate on being able to prevent an attack, but most of all, be able to understand that it happened in the first place. – r41n Dec 01 '17 at 13:21

1 Answer

CSV is a pretty simple file type to check for. Write a file parser and run it before you accept the file as input to your program.

The parser verifies the integrity of the file as a CSV file. It does that by checking for the field separator within each line and the newline separator at the end of each line.

Only if the file is both correctly named and correctly formatted as a CSV do you proceed.
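The check described above could be sketched roughly as follows. This is a minimal illustration, not a full CSV parser: `CsvSniffer`, `LooksLikeCsv` and the list of rejected signatures are hypothetical names and choices, the content is assumed to be UTF-8 text, and quoted fields that contain the separator or a line break are not handled. It also rejects content that starts with the PDF or ZIP signature, since .xlsx files are ZIP archives.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class CsvSniffer
{
    // Leading bytes of two common binary formats: PDF ("%PDF-") and
    // ZIP ("PK\x03\x04"), the container format used by .xlsx files.
    private static readonly byte[][] BinarySignatures =
    {
        new byte[] { 0x25, 0x50, 0x44, 0x46, 0x2D },
        new byte[] { 0x50, 0x4B, 0x03, 0x04 },
    };

    // Hypothetical helper: true only if the name ends in .csv AND the
    // content looks like plain separated text.
    public static bool LooksLikeCsv(string fileName, byte[] content, char separator = ',')
    {
        // 1. The file name must claim to be a CSV file.
        if (!string.Equals(Path.GetExtension(fileName), ".csv", StringComparison.OrdinalIgnoreCase))
            return false;

        // 2. Reject content that starts with a known binary signature.
        if (BinarySignatures.Any(sig =>
                content.Length >= sig.Length && content.Take(sig.Length).SequenceEqual(sig)))
            return false;

        // 3. Reject NUL bytes, which do not occur in UTF-8 text files
        //    (assumption: the input is not UTF-16 encoded).
        if (Array.IndexOf(content, (byte)0) >= 0)
            return false;

        // 4. Every non-empty line must contain the same number of
        //    separators, and at least one.
        string text = Encoding.UTF8.GetString(content);
        string[] lines = text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        if (lines.Length == 0)
            return false;

        int expected = lines[0].Count(c => c == separator);
        return expected > 0 && lines.All(line => line.Count(c => c == separator) == expected);
    }
}
```

A caller would read the uploaded bytes once and run something like `CsvSniffer.LooksLikeCsv(Be.FileName, bytes)` before handing the data to a real CSV reader. As the comments point out, this stays a heuristic: a file can pass it and still not be what the sender claims.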

A second approach is more complex. If you control the entire filesystem, you could place watchers on the files you care about, so that any change to them is reported to your program. This costs a lot of resources and might still miss changes at times.
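In .NET, this kind of change notification is usually done with `System.IO.FileSystemWatcher`. The sketch below is one possible shape for it, assuming the files live in a single directory; `RenameMonitor`, `ExtensionChanged` and `Watch` are hypothetical names introduced for illustration. It only sees renames that happen while the process is running, and the watcher can drop events under heavy load, which matches the caveat above.

```csharp
using System;
using System.IO;

public static class RenameMonitor
{
    // True when a rename changed the file's extension (case-insensitive).
    public static bool ExtensionChanged(string oldPath, string newPath) =>
        !string.Equals(Path.GetExtension(oldPath), Path.GetExtension(newPath),
                       StringComparison.OrdinalIgnoreCase);

    // Starts watching a directory and reports extension-changing renames
    // through the callback. Dispose the returned watcher to stop watching.
    public static FileSystemWatcher Watch(string directory, Action<string, string> onExtensionChanged)
    {
        var watcher = new FileSystemWatcher(directory)
        {
            // Only file name changes (create, delete, rename) are of interest.
            NotifyFilter = NotifyFilters.FileName,
        };

        watcher.Renamed += (sender, e) =>
        {
            if (ExtensionChanged(e.OldFullPath, e.FullPath))
                onExtensionChanged(e.OldFullPath, e.FullPath);
        };

        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

A caller might keep the watcher alive for the lifetime of the application, for example `var w = RenameMonitor.Watch(uploadFolder, (from, to) => Console.WriteLine($"{from} -> {to}"));`, where `uploadFolder` is a placeholder for the monitored directory.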

demoncrate