109

How is a CSV file built in general? With commas or semicolons? Any advice on which one to use?

Lance Roberts
  • 22,383
  • 32
  • 112
  • 130
membersound
  • 81,582
  • 193
  • 585
  • 1,120
  • 7
    See [Comma-separated values](http://en.wikipedia.org/wiki/Comma-separated_values) (Wikipedia). – Jesper Apr 13 '12 at 12:42
  • 7
    CSV -> Comma Seperated Values – ChadNC Apr 13 '12 at 13:02
  • @ChadNC I thought of that too when selecting the delimiter that my code would use. Easiest solution was to use `;`, which is automatically recognized both in Ubuntu and Windows, without me having to dictate any specific setting(s). – gsamaras Dec 11 '18 at 13:07

10 Answers10

94

In Windows it is dependent on the "Regional and Language Options" customize screen where you find a List separator. This is the char Windows applications expect to be the CSV separator.

Of course this only has effect in Windows applications, for example Excel will not automatically split data into columns if the file is not using the above mentioned separator. All applications that use Windows regional settings will have this behavior.

If you are writing a program for Windows that will require importing the CSV in other applications and you know that the list separator set for your target machines is ,, then go for it, otherwise I prefer ; since it causes less problems with decimal points, digit grouping and does not appear in much text.

Marcos Dimitrio
  • 6,651
  • 5
  • 38
  • 62
Sibster
  • 3,081
  • 21
  • 18
  • 41
    This change will also affect Opening of CSV files in Excel. If you want to make more compatible for windows CSV file, put `sep=;` on top of the CSV file. – papo Dec 30 '15 at 03:23
  • 17
    > "i prefer ; since it causes less problems with decimal points". This is the wrong solution. If your separator is a comma and a cell's value also contains a comma, you must enclose the value into double quotes. For example: 111,222,"33,5",444,"55,98",666 – Elmue Jul 27 '16 at 20:41
  • 1
    It is the case for Mac Numbers app as well. I was pulling my hair out because I couldn't make a csv that Photoshop recognized. I changed region option back to US english and it all worked fine again – Tibidabo May 11 '17 at 08:30
  • 35
    That regional thing is the most stupid thing it could ever be made. How can format of the csv be different across regions, this means my colleagues cannot open or save csv for guys with a pc in another region .. oucch – Miguel Jun 01 '18 at 20:04
  • If you're programming in `C` or `C++` under Windows you can use the following API to retrieve system-wide, or user delimiter: `GetLocaleInfo(lcid, LOCALE_SLIST, buff, _countof(buff));` – ahmd0 Nov 03 '18 at 03:21
73

CSV is a standard format, outlined in RFC 4180 (in 2005), so there IS no lack of a standard. https://www.ietf.org/rfc/rfc4180.txt

And even before that, the C in CSV has always stood for Comma, not for semiColon :(

It's a pity Microsoft keeps ignoring that and is still sticking to the monstrosity they turned it into decades ago (yes, I admit, that was before the RFC was created).

  • One record per line, unless a newline occurs within quoted text (see below).
  • COMMA as column separator. Never a semicolon.
  • PERIOD as decimal point in numbers. Never a comma.
  • Text containing commas, periods and/or newlines enclosed in "double quotation marks".
  • Only if text is enclosed in double quotation marks, such quotations marks in the text escaped by doubling. These examples represent the same three fields:

    1,"this text contains ""quotation marks""",3

    1,this text contains "quotation marks",3

The standard does not cover date and time values, personally I try to stick to ISO 8601 format to avoid day/month/year -- month/day/year confusion.

Luc VdV
  • 1,088
  • 9
  • 14
  • 29
    The monstruosity would have never appeared if semicolon was chosen in first place. The comma is not only used for decimals or thousands separator, but also very often in text, unlike the semicolon. Semicolon would have been a better choice because it's much more rare... – AFract Jun 27 '16 at 09:04
  • 19
    When I said monstrosity, I meant that Microsoft made it language dependent. Excel opens CSV files and treats them as spreadsheets, and can save spreadsheets in CSV format, but a CSV file saved by Excel in the US (for example) will not work if someone in France or Belgium tries to open it, and vice versa. Whatever standard is OK for me, as long as there's a standard that works for everybody. – Luc VdV Jun 28 '16 at 10:21
  • 1
    As far as I see, the "PERIOD as decimal point in numers" and "text containing periods must be enclosed in double quotation marks" ist not part of the the RFC4180. But I agree, that this is a good recommendation. – Fried Aug 30 '16 at 08:22
  • "PERIOD as decimal point in numbers. Never a comma." Nonsense and against most non-US standards. Also an additional problem is that Yankees use also comma as thousands separator (quite silly) when the rest of the World do not need stinking thousand separators at all. That said, who ever thought back in the day that comma (instead of semicolon) should be a viable data delimiter for CSV was just insane. Now we just need to follow those insane standards and then escape everything using double quotes. – Risto Välimäki Jun 27 '18 at 06:43
  • 6
    *"CSV is a standard format, outlined in RFC 4180 (in 2005), so there IS no lack of a standard."* Unfortunately, RFC 4180 is a request for comments, NOT a standard. It says right at the top -- *"does not specify an Internet standard of any kind."* Later, it says that RFC 4180 *"documents the format that seems to be followed by most implementations."* It appears that this was created by a small private company, not associated with any standard's body. It's an excellent starting point, but there are unfortunately lots of CSV files out there in the wild that don't follow this "standard." – Jim Rea Jan 04 '19 at 21:50
  • 11
    *ALL* internet standards are called RFC. – Luc VdV Jan 05 '19 at 22:10
  • 2
    But when they are standard, they do not state they are not intended to be normative. Instead, they have a number of normative sections within, which is not the case of RFC 4180. – Frédéric Sep 11 '19 at 12:28
  • 1
    I agree with Luc, the Excel approach makes CSV file interchange impossible among countries using different decimal/thousand delimiters, and is not easy to overcome. In addition I would say that a file where values are not separated by commas shall not be called CSV, to avoid confusion. In Excel I tried File -> Options -> Advanced -> Use system delimiters = unchecked, and then: Decimal Delimiter = (dot), and Thousand delimiter = (comma), but after doing that, saving a file as CSV still used semicolon delimiters. Very annoying. – Giorgio Barchiesi May 14 '20 at 09:53
  • The whole argument to use a semicolon because it appears less in data is crazy. Any separate you use you still need to use escaping for when the separator appears in the data. Because the error occurs less doesn't make it a better solution. – AnthonyT Jun 30 '22 at 06:53
  • The pure comma approach is makes interchange impossible among countries as long as there is no consensus on the decimal comma/dot (and the thousand separator) topic (quite a huge international standard topic, we do not need to tell them what we think, they already know it). The problem could be solved using tab-separator but then we have the problem if ctrl-chars are banned. Believe me we have to live with what we got and have to handle it right. I think this article should be about how to handle rather than what we think about it. I think MS tried to do the best of it as it is. – Jan Bergström Nov 06 '22 at 04:55
  • This topic has many sides and Excel can't take the UTF-BOM and the sep= at the same time and the BOM is more important. On top of that Excel will not accept decimal dots in a computer set to decimal commas and the topic is actually much larger than imaginable. Cross border excel/CSV-file transfers is a user topic and the important thing doing an application is to do it right where the user are (the general OS settings) and make the BOM and not the sep=. Read more https://superuser.com/questions/407082/easiest-way-to-open-csv-with-commas-in-excel – Jan Bergström Nov 06 '22 at 17:58
61

I'd say stick to comma as it's widely recognized and understood. Be sure to quote your values and escape your quotes though.

ID,NAME,AGE
"23434","Norris, Chuck","24"
"34343","Bond, James ""master""","57"
adarshr
  • 61,315
  • 23
  • 138
  • 167
  • 6
    +1 for covering a lot of csv scenarios with a very short example – Niels Brinch Sep 03 '12 at 12:53
  • 1
    Comma separating on its own is good, but you get problems when you get fields that have commas in them. Here is a good reference i found [on the manual](http://www.php.net/manual/en/function.fputcsv.php#56827) – Ibu May 16 '13 at 18:35
  • 4
    WRONG! In my machine if I write a CSV file using commas as separators and open it in Excel, I get the whole line to occupy only one column! It must be some Windows regional setting! – sergiol Jan 04 '17 at 14:56
  • 3
    @sergiol I'm not sure how that invalidates what they said. – AMC Apr 09 '20 at 01:02
  • Seems only the USA and UK use comma delimiters, the rest of the world prefers semicolon, so I wouldn't suggest this answer at all. The better solution is to specify the separator at the first line of the file using `sep=;` like Excel supports. – MasterHD Jul 12 '23 at 09:19
49

Also relevant, but specially to excel, look at this answer and this other one that suggests, inserting a line at the beginning of the CSV with

"sep=,"

To inform excel which separator to expect

Rafareino
  • 2,515
  • 1
  • 19
  • 26
11

1.> Change File format to .CSV (semicolon delimited)

To achieve the desired result we need to temporary change the delimiter setting in the Excel Options:

Move to File -> Options -> Advanced -> Editing Section

Uncheck the “Use system separators” setting and put a comma in the “Decimal Separator” field.

Now save the file in the .CSV format and it will be saved in the semicolon delimited format.

Anil_M
  • 10,893
  • 6
  • 47
  • 74
7

Initially it was to be a comma, however as the comma is often used as a decimal point it wouldnt be such good separator, hence others like the semicolon, mostly country dependant

http://en.wikipedia.org/wiki/Comma-separated_values#Lack_of_a_standard

  • That section of the Wikipedia article disappeared. You may want to re-link to https://en.wikipedia.org/wiki/Comma-separated_values#Basic_rules, This has a passage on locales: > "_Adjacent fields must be separated by a single comma. However, "CSV" formats vary greatly in this choice of separator character. In particular, in locales where the comma is used as a decimal separator, semicolon, TAB, or other characters are used instead."_ – Jules Kerssemakers May 28 '19 at 10:26
  • Nice overview on available "standards" is given at https://w3c.github.io/csvw/syntax/#standards. – koppor Apr 15 '23 at 21:41
4

CSV is a Comma Seperated File. Generally the delimiter is a comma, but I have seen many other characters used as delimiters. They are just not as frequently used.

As for advising you on what to use, we need to know your application. Is the file specific to your application/program, or does this need to work with other programs?

Youssef G.
  • 617
  • 4
  • 10
3

To change comma to semicolon as the default Excel separator for CSV - go to Region -> Additional Settings -> Numbers tab -> List separator and type ; instead of the default ,

Michael
  • 39
  • 1
  • 1
    If your regional settings are set with `,` as decimal separator and you export a file with columns separated by `,` when you open it In Excel, text will appear in ONLY ONE column. In Excel 2013, select the first column, go to DATA -> Text To Columns -> Choose `Delimited`, click `Next` then choose only`Comma` on `Delimiters` and you will see changes immediately in the `Data preview` and finally click `Finish` – sergiol Jan 04 '17 at 15:15
2

Well to just to have some saying about semicolon. In lot of country, comma is what use for decimal not period. Mostly EU colonies, which consist of half of the world, another half follow UK standard (how the hell UK so big O_O) so in turn make using comma for database that include number create much of the headache because Excel refuse to recognize it as delimiter.

Like wise in my country, Viet Nam, follow France's standard, our partner HongKong use UK standard so comma make CSV unusable, and we use \t or ; instead for international use, but it still not "standard" per the document of CSV.

-3

best way will be to save it in a text file with csv extension:

Sub ExportToCSV()
Dim i, j As Integer
Dim Name  As String

Dim pathfile As String

Dim fs As Object
    Dim stream As Object

    Set fs = CreateObject("Scripting.FileSystemObject")
On Error GoTo fileexists

i = 15
Name = Format(Now(), "ddmmyyHHmmss")
pathfile = "D:\1\" & Name & ".csv"

Set stream = fs.CreateTextFile(pathfile, False, True)

fileexists:

If Err.Number = 58 Then
    MsgBox "File already Exists"
    'Your code here
    Return
End If
On Error GoTo 0

j = 1
Do Until IsEmpty(ThisWorkbook.ActiveSheet.Cells(i, 1).Value)

    stream.WriteLine (ThisWorkbook.Worksheets(1).Cells(i, 1).Value & ";" & Replace(ThisWorkbook.Worksheets(1).Cells(i, 6).Value, ".", ","))

    j = j + 1
    i = i + 1
Loop


stream.Close

End Sub
hering
  • 1,956
  • 4
  • 28
  • 43
AzizD
  • 103
  • 10