36

Are there any good sites or services for validating the consistency of a CSV file?

Something like the W3C validator, but for CSV?

Thanks!

Scherbius.com
  • 3,396
  • 4
  • 24
  • 44

7 Answers

22

I recently came across Google Refine (now OpenRefine). It's not a service for validating CSV files but a tool you download and run locally; it does, however, provide a lot of tools for working with data and detecting anomalies.

As mentioned in a reply, "CSV" has become an ill-defined term, principally because people don't follow the One True Way when using delimiter-separated data:

http://www.catb.org/~esr/writings/taoup/html/ch05s02.html

EDIT/UPDATE (2016-08-09):
CSV is currently becoming a well-defined term through the work of the W3C CSV Working Group.

rd3n
  • 4,440
  • 1
  • 33
  • 45
Adrian
  • 2,244
  • 18
  • 20
  • Getting a good definition of CSV is all very well - but given the majority of CSV data will continue to be i) Excel and ii) horrible old code ... I don't hold out a lot of hope that it will really make a difference. I would tend to advocate people use the One True Way linked above if they are creating new tabular data export routines... – Adrian Aug 09 '16 at 20:52
13

The Open Data Institute is developing a CSV validation service that will allow users to check the structure of their data as well as validate it against a simple schema.

The service is still very much in alpha but can be found here:

http://csvlint.io/

The code for the application and the underlying library are both open source:

https://github.com/theodi/csvlint

https://github.com/theodi/csvlint.rb

The README in the library provides a summary of the errors and warnings that can be generated. The following types of error can be reported:

  • :wrong_content_type -- content type is not text/csv
  • :ragged_rows -- row has a different number of columns than the first row in the file
  • :blank_rows -- completely empty row, e.g. blank line or a line where all column values are empty
  • :invalid_encoding -- encoding error when parsing row, e.g. because of invalid characters
  • :not_found -- HTTP 404 error when retrieving the data
  • :quoting -- problem with quoting, e.g. missing or stray quote, unclosed quoted field
  • :whitespace -- a quoted column has leading or trailing whitespace

The following types of warning can be reported:

  • :no_encoding -- the Content-Type header returned in the HTTP request does not have a charset parameter
  • :encoding -- the character set is not UTF-8
  • :no_content_type -- file is being served without a Content-Type header
  • :excel -- no Content-Type header and the file extension is .xls
  • :check_options -- CSV file appears to contain only a single column
  • :inconsistent_values -- inconsistent values in the same column. Reported if fewer than 90% of values appear to have the same data type (either numeric or alphanumeric including punctuation)
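
To make a few of these checks concrete, here is a minimal Python sketch of three of them (blank rows, ragged rows, and inconsistent values) run against local text. It is not the csvlint code and does not use its API: the function `lint_csv`, its helper, the finding format, and the choice to treat the first non-blank row as a header are all assumptions made for illustration. Only the check names and the 90% threshold come from the lists above.

```python
import csv
import io
from collections import Counter


def _is_numeric(value):
    """Return True if the cell parses as a number (illustrative helper)."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def lint_csv(text, delimiter=",", threshold=0.9):
    """Return a list of (kind, location) findings for CSV text.

    Hypothetical sketch, not csvlint itself. Row and column numbers are
    1-based; rows are parsed records, not physical lines.
    """
    findings = []
    expected_width = None
    rows = []
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    for row_number, row in enumerate(reader, start=1):
        # Blank row: an empty line, or a row whose cells are all empty.
        if all(cell.strip() == "" for cell in row):
            findings.append(("blank_rows", f"row {row_number}"))
            continue
        # Ragged row: width differs from the first non-blank row.
        if expected_width is None:
            expected_width = len(row)
        elif len(row) != expected_width:
            findings.append(("ragged_rows", f"row {row_number}"))
            continue
        rows.append(row)

    # Assumption: the first kept row is a header, so it is skipped here.
    for col in range(expected_width or 0):
        values = [row[col] for row in rows[1:] if row[col].strip()]
        if not values:
            continue
        kinds = Counter("numeric" if _is_numeric(v) else "text" for v in values)
        top_share = kinds.most_common(1)[0][1] / len(values)
        if top_share < threshold:
            findings.append(("inconsistent_values", f"column {col + 1}"))
    return findings
```

Calling `lint_csv(open("data.csv", newline="").read())` on a local file returns a list such as `[("ragged_rows", "row 4")]`. The HTTP-related checks (content type, charset, 404) are left out because they depend on how the file is served rather than on its contents.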
ldodds
  • 249
  • 2
  • 4
2

The National Archives developed a CSV Schema Language and CSV Validator, software written in Java. It's open source.

Milos
  • 192
  • 3
  • 11
1

To validate a CSV file I use the Rainbow CSV extension in Visual Studio Code, and I also open the CSV file in Excel.

mruanova
  • 6,351
  • 6
  • 37
  • 55
1

There is a great way to validate your CSV file. I am referring to this article, where the whole process is explained in the tiniest detail.

The validation process has two steps. First, you post the file to the API. Once your file is accepted, the API returns a polling endpoint that contains the results of the validation process. There is a 10 MB limit per file.

monkrus
  • 1,470
  • 24
  • 23
0

CSV Lint at csvlint.com (not .io :) is a service we're building to solve this problem. It checks CSV files cell by cell against user-defined validation rules / schemas.

We spent a lot of time tweaking the UI so that users can easily create complex validation rules / schemas that meet their business needs without writing a single line of code.

Our offline validation feature lets users see the results in real time, even when validating multiple large files (millions of rows or more), and most importantly it fully protects user data privacy.

Joe
  • 279
  • 1
  • 4
  • 15
0

Toolkit Bay CSV Validator & Linter: online and easy to use; set the delimiter and go.

Flatfile CSV validator: online demo with automatic delimiter detection; upload and go.

Akira Yamamoto
  • 4,685
  • 4
  • 42
  • 43