1

How can I check if a file that is in a directory is a CSV file without checking its extension?

  • 1
    You could check several (or all) rows of data in the file to confirm that there are the same number of commas in each row. Not that that is any guarantee. – MikeC Mar 15 '16 at 17:41
  • 3
    Not with 100% efficiency. The best way I can think of is to see if it follows the format of commas between words or quoted (") sections. Then test that each new line has the same number of sections – Michael Coxon Mar 15 '16 at 17:41
  • There are some MIME type detection tools, but I don't know how reliable they are or how many file types they detect. [This link](http://stackoverflow.com/questions/58510/using-net-how-can-you-find-the-mime-type-of-a-file-based-on-the-file-signature) may help. – gender_madness Mar 15 '16 at 17:47
  • 1
    @MikeC A compliant CSV file isn't necessarily going to have the same number of commas in all lines, nor is a non-CSV file necessarily going to lack the same number of commas in each line. – Servy Mar 15 '16 at 17:58
  • @Servy Which is why I said "Not that that is any guarantee." in my comment. In fact Michael Coxon effectively made the same suggestion a few seconds after I did. – MikeC Mar 15 '16 at 21:01
  • Technically speaking, wouldn't a file that contains a single row/column value (e.g. the word "StackOverflow") be a valid CSV file? It's tough to make any decisions without making some minimum assumptions about what you might be expecting in a file. Does it need to contain a header? More than one column? So many rows? At least one comma? You'd probably be better off spending time attempting to parse whatever you have, reacting appropriately when a file fails the parsing. – Cᴏʀʏ Mar 16 '16 at 02:23

1 Answer

8

If your program's design does not require a particular file extension, only a particular file format (CSV in your case), then the only reliable way to check whether a given file is acceptable is to validate your data structure after you have populated it from the file.

Roughly, the basic flow may look like this:

1) Select file

2) Read the file

  • a) An exception occurs somewhere = not a valid CSV-formatted file
  • b) All is ok

3) Validate the data structure(s) populated from the file

  • a) Some mandatory fields are not initialized = not a valid CSV file
  • b) All is ok
Tigran