
I am trying to read a CSV file using pandas.read_csv. I am confused, because the code works when I paste the CSV contents directly into the script as a string:

import pandas as pd
from six.moves import cStringIO as StringIO

Companies="""
Top,       Equipment,  Users, Neither 
Company 1,       0,     0,  43
Company 2,       0,     0,  32
Company 3,       1,     3,  20
Company 4,       9,     3,  9
Company 5,       8,      7, 3
Company 6,       2,     7,  8
Company 7,       5,     2,  1
Company 8,       1,     4,  1
Company 9,       5,     1,  0
Company 10,      1,     1,  3
Company 11,      2,     2,  0
Company 12,      0,     1,  1
Company 13,      2,     0,  0
Company 14,      1,     0,  0
Company 15,      1,     0,  0
Company 16,      0,     1,  0
"""

Using:

df = pd.read_csv(StringIO(Companies),
                 skiprows=1,
                 skipinitialspace=True,
                 engine='python')

^^ The above works!

However, when I try to read the same data from a separate CSV file, I keep getting errors.

I tried:

df = pd.read_csv(StringIO('MYDATA.csv', nrows=17, skiprows=1,skipinitialspace=True, delimiter=','))

and got the error TypeError: StringIO() takes no keyword arguments. Originally I got the error TypeError: Must be Convertible to a buffer, not DataFrame, but I can't remember how I got rid of that one.

I looked up the StringIO documentation and other sites, including https://newcircle.com/bookshelf/python_fundamentals_tutorial/working_with_files, but I'm still stuck!

smci
jenryb
  • As to which package you import StringIO from, since 2015 you can use `pandas.compat`, which gives you Python 2/3 independence without using `six`. See [Should we use pandas.compat.StringIO or Python 2/3 StringIO?](https://stackoverflow.com/questions/50283292/should-we-use-pandas-compat-stringio-or-python-2-3-stringio) – smci May 13 '18 at 10:05

1 Answer


You closed the parentheses in the wrong location:

df = pd.read_csv(StringIO('MYDATA.csv', nrows=17, skiprows=1,skipinitialspace=True, delimiter=','))
#                        ^            ^ not closed here

Move the closing parenthesis so that it closes the StringIO() call, and leave the keyword arguments for the pd.read_csv() call:

df = pd.read_csv(StringIO('MYDATA.csv'), nrows=17, skiprows=1, skipinitialspace=True, delimiter=',')

Note that StringIO('MYDATA.csv') creates an in-memory file whose contents are the literal text MYDATA.csv; it does not open a file with that filename. If you want to open a file on your filesystem named MYDATA.csv, leave off the StringIO call and pass the filename directly:

df = pd.read_csv('MYDATA.csv', nrows=17, skiprows=1, skipinitialspace=True, delimiter=',')
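
To make the point about StringIO concrete, here is a small sketch (not part of the original code) that inspects what the in-memory buffer holds. It uses io.StringIO from the Python 3 standard library rather than six, which is an assumption about the environment. No file named MYDATA.csv needs to exist for it to run:

```python
from io import StringIO

import pandas as pd

# The buffer contains the characters of the string itself, not file contents.
buf = StringIO('MYDATA.csv')
contents = buf.getvalue()
print(repr(contents))

# Parsing that buffer treats the single line "MYDATA.csv" as the header row,
# so the result has one column with that name and no data rows.
df = pd.read_csv(StringIO('MYDATA.csv'))
print(list(df.columns), len(df))
```

The filename ends up as a column header because pandas only ever sees the text inside the buffer; nothing is read from disk.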
Martijn Pieters
  • Oh duh! I feel silly. Fixed that, but now I'm getting a ValueError: No columns to parse from file. I tried messing around and adding an encoding = 'utf-16' but that might not be the code I'm looking for, would you know anything about how to fix this? – jenryb Jun 09 '15 at 16:21
  • @jenryb: are you now actually opening the file or still using `StringIO()`? – Martijn Pieters Jun 09 '15 at 16:23
  • Still using StringIO(). Should I open the file first? That would make sense since the original was available inside the code. So would I type: with open('MYDATA.csv', 'rb') as f? Then call f in the StringIO? – jenryb Jun 09 '15 at 16:30
  • @jenryb: you don't need to use `StringIO()` *at all*. You only need to use that when you have the data in a Python string. See my answer; the last version uses no `StringIO` and passes the filename directly to the `pd.read_csv()` call; the first argument can also be a filename. – Martijn Pieters Jun 09 '15 at 16:34
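
A minimal sketch of the two cases discussed in these comments, assuming Python 3 and a small made-up sample in the same layout as the question's data. The temporary directory and the variable names are illustrative only. It shows that StringIO is needed only when the CSV text is already in a Python string, and that a file on disk is read by passing its path:

```python
import os
import tempfile
from io import StringIO

import pandas as pd

# Hypothetical sample data, shaped like the table in the question.
csv_text = "Top,Equipment,Users,Neither\nCompany 1,0,0,43\nCompany 2,0,0,32\n"

# Case 1: the data is already in a Python string, so wrap it in StringIO.
df_from_string = pd.read_csv(StringIO(csv_text))

# Case 2: the data lives in a file, so pass the path straight to read_csv.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "MYDATA.csv")
    with open(path, "w") as f:
        f.write(csv_text)
    df_from_file = pd.read_csv(path)

# Both routes should produce the same DataFrame.
print(df_from_string.equals(df_from_file))
```

One thing to check when switching from the string to a real file: in the question, skiprows=1 skips the blank first line of the triple-quoted string. If the file's header is on its first line, keeping skiprows=1 would skip the header instead. With StringIO('MYDATA.csv') the same option skips the only line in the buffer, which would plausibly explain the "No columns to parse from file" error.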