12

Trying to convert a .tsv to a .csv. This:

import csv

# read tab-delimited file
with open('DataS1_interactome.tsv','rb') as fin:
    cr = csv.reader(fin, delimiter='\t')
    filecontents = [line for line in cr]

# write comma-delimited file (comma is the default delimiter)
with open('interactome.csv','wb') as fou:
    cw = csv.writer(fou, quotechar='', quoting=csv.QUOTE_NONE)
    cw.writerows(filecontents)

Gives me this error:

  File "tsv2csv.py", line 11, in <module>
    cw.writerows(filecontents)
_csv.Error: need to escape, but no escapechar set
hannah
  • 1
    Why don't you simply do a find and replace from tab to comma on the file content ? `fileContent = re.sub("(?ism)\t", ",", fileContent ) ` – Pedro Lobito Apr 20 '15 at 22:12
  • 1
    @PedroLobito Damn, ninja'd! Also, You might need to change `quotechar=''` to `quotechar='"'`. – The name's Bob. MS Bob. Apr 20 '15 at 22:13
  • I would refer you to this answer http://stackoverflow.com/questions/2535255/fastest-way-convert-tab-delimited-file-to-csv-in-linux Does this help? – dparadis28 Apr 20 '15 at 22:15
  • 3
    @PedroLobito There might be commas in the tsv. – Don Roby Apr 20 '15 at 22:17
  • @hannah, which escape character did you choose? – Pedro Lobito Apr 20 '15 at 22:49
  • @hannah according to [rfc4180](https://tools.ietf.org/html/rfc4180), "Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes." The correct solution for your problem would be to enclose the existing TSV values in double-quotes. – Pedro Lobito Apr 21 '15 at 12:07

5 Answers

11
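import pandas as pd

tsv_file = 'name.tsv'
# read_table parses delimited text; sep='\t' makes the tab delimiter explicit
csv_table = pd.read_table(tsv_file, sep='\t')
# index=False leaves the DataFrame's row index out of the output
csv_table.to_csv('new_name.csv', index=False)

The code above converts the .tsv file to a .csv file.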

Terminator17
  • bash alias `alias tsv2csv="python -c 'import sys,pandas; assert len(sys.argv)==3; pandas.read_table(sys.argv[1], sep=\"\\t\",quoting=3).to_csv(sys.argv[2], index=False)'"`. Usage: `tsv2csv inp.tsv out.csv` – Thamme Gowda Jun 05 '23 at 02:38
7

While writing the CSV file, the writer hits a field containing a character it must escape (with QUOTE_NONE, for example a comma inside the data), but no escape character has been defined.

Dialect.escapechar

A one-character string used by the writer to escape the delimiter if quoting is set to QUOTE_NONE and the quotechar if doublequote is False. On reading, the escapechar removes any special meaning from the following character. It defaults to None, which disables escaping.

Source: https://docs.python.org/2/library/csv.html#csv.Dialect.escapechar

Example code:

# write comma-delimited file (comma is the default delimiter)
with open('interactome.csv','wb') as fou:
    cw = csv.writer(fou, quotechar='', quoting=csv.QUOTE_NONE, escapechar='\\')
    cw.writerows(filecontents)
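
For Python 3, the same idea can be sketched in a self-contained form. This is only an illustration: the sample rows are made up, and `quotechar=None` is used instead of `''` because, as far as I can tell, recent Python 3 releases reject an empty string there.

```python
import csv
import io

# Made-up sample rows; the first one has a comma inside a field.
rows = [["P12345", "kinase, putative"], ["Q67890", "transporter"]]

buf = io.StringIO()
# With QUOTE_NONE the writer never adds quotes, so a comma inside a field
# has to be escaped; escapechar tells it which character to put in front.
writer = csv.writer(buf, quoting=csv.QUOTE_NONE, quotechar=None, escapechar="\\")
writer.writerows(rows)
escaped_csv = buf.getvalue()

# Reading back with the same settings restores the original fields.
reader = csv.reader(io.StringIO(escaped_csv), quoting=csv.QUOTE_NONE, escapechar="\\")
parsed = list(reader)
```

Keep in mind that whatever reads the file later has to know about the backslash convention too; many CSV consumers expect quoted fields instead.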
Erik S
  • What would be the correct escape character on a .csv file ? – Pedro Lobito Apr 20 '15 at 22:49
  • Generally, you would use a backslash as your escape character. I've updated my answer with the correct escapechar. – Erik S Apr 21 '15 at 11:59
  • By the way, it's a double backslash because it actually is an escape character in Python as well; `\\` means 'The character \'. Otherwise, it would treat the apostrophe after it as a character, instead of a token. – Erik S Apr 21 '15 at 12:00
  • 1
    @eric-dolor according to rfc4180, "Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes." Based on this, both of our answers are incorrect. The correct solution for this problem would be to enclose the existing TSV values in double-quotes. – Pedro Lobito Apr 21 '15 at 12:03
  • That would not be enough, however, since a double quote inside double quotes would still need to be escaped. This only means you should add " as the quotechar. – Erik S Apr 21 '15 at 13:49
  • 1
    @eric-dolor No, your assumption is wrong, on a CSV file double-quotes inside double-quotes don't need to be escaped. Source: [rfc4180](https://tools.ietf.org/html/rfc4180) Page 2 -> #7 – Pedro Lobito Apr 21 '15 at 13:53
1

TSV is a file type where fields are separated by tab. If you want to convert a TSV into a CSV (comma separated value) you just need to do a find and replace from TAB to COMMA.

Update:
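As don-roby pointed out, "There might be commas in the tsv". To handle that, we use a regex to backslash-escape the special characters (comma, double quote, single quote) before converting. Note that rfc4180 itself calls for enclosing such fields in double quotes rather than backslash-escaping them.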

i.e.:

import re

# read the whole TSV file into memory
with open('tsv.tsv', 'r') as tsv:
    fileContent = tsv.read()

# backslash-escape the special characters (" ' ,) already present in the data
fileContent = re.sub("""(?ism)(,|"|')""", r"\\\1", fileContent)
# convert the tab delimiters to commas
fileContent = re.sub("\t", ",", fileContent)

with open("csv.csv", "w") as csv_file:
    csv_file.write(fileContent)
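
For comparison, the quoting rule from rfc4180 quoted in the comments (enclose fields containing commas, double quotes, or line breaks in double quotes, and double any embedded double quote) can be sketched by hand. This is a rough illustration, not a full parser: `quote_field` and `tsv_line_to_csv` are hypothetical helper names, and the sketch assumes the TSV has no quoting of its own and no tabs or newlines inside fields.

```python
def quote_field(field):
    """Quote one field along the lines of RFC 4180."""
    # Only fields containing a comma, a double quote, or a line break need quotes.
    if any(ch in field for ch in ',"\r\n'):
        # Embedded double quotes are doubled, then the field is wrapped in quotes.
        return '"' + field.replace('"', '""') + '"'
    return field


def tsv_line_to_csv(line):
    """Convert one tab-separated line (without its trailing newline) to CSV."""
    return ",".join(quote_field(f) for f in line.split("\t"))


# Made-up sample line with a comma in one field and quotes in another.
converted = tsv_line_to_csv('P12345\tkinase, putative\tsaid "hi"')
print(converted)
```

In practice the `csv` module already applies this rule with its default settings, so a hand-written version is mostly useful for seeing what the rule does.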
Pedro Lobito
  • While this would of course work as well, using parsers is a great exercise for novice programmers! It's a good chance to get to use some different tools than you would normally use in scripts. – Erik S Apr 21 '15 at 10:05
  • @ErikDolor Thank you and good luck with your new career. Can you please answer the question I've posted on your answer? (What would be the correct escape character on a .csv file ?) tks :) – Pedro Lobito Apr 21 '15 at 10:14
0
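import csv

# Python 3: open both files in text mode with newline='', as the csv docs recommend
with open('sample.txt', newline='') as fin, open('sample.csv', 'w', newline='') as fout:
    tabin = csv.reader(fin, dialect=csv.excel_tab)   # tab-delimited input
    commaout = csv.writer(fout, dialect=csv.excel)   # comma-delimited output
    for row in tabin:
        commaout.writerow(row)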
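
Some context on why this works: `csv.excel_tab` reads tab-delimited rows, and `csv.excel` writes them comma-delimited, quoting any field that contains a comma or a double quote. A small in-memory sketch (the sample data is made up) that also reads the result back to check that nothing was lost:

```python
import csv
import io

# Made-up TSV content: one field holds a comma, another holds double quotes.
tsv_text = 'id\tnote\n1\tcontains, a comma\n2\tsaid "hi"\n'

out = io.StringIO()
reader = csv.reader(io.StringIO(tsv_text), dialect=csv.excel_tab)
writer = csv.writer(out, dialect=csv.excel)
writer.writerows(reader)  # the writer accepts any iterable of rows
csv_text = out.getvalue()

# Round trip: parsing the CSV should give back the original fields.
round_trip = list(csv.reader(io.StringIO(csv_text)))
```

Because the writer decides when quoting is needed, no escapechar has to be configured for this approach.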
  • While this code may answer the question, providing additional context regarding why and/or how this code answers the question improves its long-term value. – Donald Duck Jun 22 '17 at 00:23
0
import pandas as pd
file_path = "/DataS1_interactome.tsv"
# read the tab-separated file, then write it back out comma-separated
df = pd.read_csv(file_path, sep="\t")
df.to_csv("DataS1_interactome.csv", index=False)
thepunitsingh