
Using the question "Pandas writing dataframe to CSV file" as a model, I wrote the following code to make a csv file:

df.to_csv('/Users/Lab/Desktop/filteredwithheading.txt', sep='\s+', header=True)

But it returns the following error:

TypeError: "delimiter" must be an 1-character string

I have looked up the documentation for this here http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html but I can't figure out what I am missing, or what that error means. I also tried using (sep='\s') in the code, but got the same error.

Mohamed Ali JAMAOUI
Julia
  • the code creates the file, but then does not write data in it. – Julia Jan 08 '14 at 19:44
  • If you would like the `sep` to be a space use `sep=' '`. – mechanical_meat Jan 08 '14 at 19:46
  • Just to expand on bernie's comment, `\s+` is a regex that matches 1 or more whitespace characters. It's useful for *reading* CSVs that use a variable number of spaces as a separator. You want to *write* your csv with single spaces separating. – TomAugspurger Jan 08 '14 at 19:51
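
The read/write distinction from the comments above can be sketched as follows. This is only an illustration: the sample data and column names are made up, and it assumes a reasonably recent pandas.

```python
import io

import pandas as pd

# Hypothetical sample: columns separated by a variable number of spaces.
raw = "name   score\nalice  1\nbob       2\n"

# Reading: read_csv accepts a regex separator such as \s+.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Writing: to_csv needs one literal character, here a single space.
# With no path given, to_csv returns the CSV text as a string.
text = df.to_csv(sep=" ", index=False)
print(text)
```

The same `sep=" "` argument applies when a file path is passed as the first argument to `to_csv`.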

2 Answers


As mentioned in the issue discussion (here), this is not considered a pandas issue but rather a compatibility issue between Python's csv module and Python 2.x.
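
To see that the one-character restriction comes from the csv module rather than from pandas, here is a minimal sketch using only the standard library. It is written for Python 3, where the exact wording of the error message may differ from the one quoted in the question, and `try_delimiter` is a hypothetical helper name used only for this illustration.

```python
import csv
import io


def try_delimiter(delim):
    """Return True if csv.writer accepts `delim`, False if it raises TypeError."""
    try:
        csv.writer(io.StringIO(), delimiter=delim)
        return True
    except TypeError:
        return False


# A single space is a valid one-character delimiter.
print(try_delimiter(" "))
# r"\s+" is a three-character string here, not a regex, so it is rejected.
print(try_delimiter(r"\s+"))
```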

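The workaround is to wrap the separator in str(...). With from __future__ import unicode_literals in effect, ',' is a unicode object, and the Python 2 csv module only accepts a byte str as the delimiter; str(',') converts it back. Here is how you can reproduce the problem and then solve it: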

from __future__ import unicode_literals
import pandas as pd 
df = pd.DataFrame([['a', 'A'], ['b', 'B']])
df.to_csv(sep=',')

This will raise the following error:

TypeError ....              
----> 1 df.to_csv(sep=',')
TypeError: "delimiter" must be an 1-character string

The following, however, produces the expected result:

from __future__ import unicode_literals
import pandas as pd 
df = pd.DataFrame([['a', 'A'], ['b', 'B']])
df.to_csv(sep=str(','))

Output:

',0,1\n0,a,A\n1,b,B\n'

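In your case, the separator must also be a single character. '\s+' is a three-character string, and wrapping it in str() does not change that, so use a single space instead:

df.to_csv('/Users/Lab/Desktop/filteredwithheading.txt', sep=str(' '), header=True)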
Mohamed Ali JAMAOUI

Note that although the solution to this error was to use a single-character string instead of a regex, pandas also raises this error when from __future__ import unicode_literals is in effect, even with valid unicode characters. As of 2015-11-16, release 0.16.2, this error is still a known bug in pandas:
"to_csv chokes if not passed sep as a string, even when encoding is set to unicode" #6035

For example, where df is a pandas DataFrame:

# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import pandas as pd

df.to_csv(pdb_seq_fp, sep='\t', encoding='utf-8')

TypeError: "delimiter" must be an 1-character string

Using a byte literal for the separator (b'\t'), with the source encoding declared as -*- coding: utf-8 -*- (utf-8 is the default with Python 3), will resolve this in pandas 0.16.2. I haven't tested with previous versions or 0.17.0.

# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import pandas as pd

df.to_csv(pdb_seq_fp, sep=b'\t', encoding='utf-8')

(Note that with versions 0.13.0 - ???, it was necessary to use from pandas.compat import u; but by 0.16.2 the byte literal is the way to go.)

Community
Michelle Welcks