Python csv string to array

Question

Anyone know of a simple library or function to parse a csv encoded string and turn it into an array or dictionary?

I don't think I want the built in csv module because in all the examples I've seen that takes filepaths, not strings.

score 356 · Accepted Answer · edited Sep 19 '19 at 14:56

You can convert a string to a file object using io.StringIO and then pass that to the csv module:

from io import StringIO
import csv

scsv = """text,with,Polish,non-Latin,letters
1,2,3,4,5,6
a,b,c,d,e,f
gęś,zółty,wąż,idzie,wąską,dróżką,
"""

f = StringIO(scsv)
reader = csv.reader(f, delimiter=',')
for row in reader:
    print('\t'.join(row))

A simpler version passes the lines from split() on newlines straight to csv.reader:

reader = csv.reader(scsv.split('\n'), delimiter=',')
for row in reader:
    print('\t'.join(row))

Alternatively, you can split() the string into lines on \n and then split() each line into values yourself. That approach does not handle quoted fields, so using the csv module is preferred.
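To make the quoting caveat concrete, here is a minimal sketch that compares a plain split(',') with csv.reader on a single line. The sample line is made up for illustration; its second field contains both a comma and a doubled quote.

```python
import csv

# Made-up sample line: the quoted field holds a comma and an escaped quote ("").
line = 'widget,"3.5"" disk, boxed",7'

naive = line.split(',')            # also splits inside the quoted field
parsed = next(csv.reader([line]))  # applies the CSV quoting rules

print(naive)   # four pieces, with stray quote characters left in
print(parsed)  # three fields
```

The split() result has one piece too many and still carries the quote characters, while csv.reader returns the three intended fields.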

On Python 2 you have to import StringIO as

from StringIO import StringIO

instead.
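Since the question also mentions a dictionary, here is a small sketch using csv.DictReader on the same kind of in-memory file object. The sample data and column names are invented for illustration, and it assumes the first line of the string is a header row.

```python
import csv
from io import StringIO

# Illustrative data; the first line is used as the header row.
text = "name,qty\napple,3\npear,5\n"

records = list(csv.DictReader(StringIO(text)))
for record in records:
    print(record["name"], record["qty"])
```

Every value comes back as a string, so numeric columns still need an explicit conversion such as int(record["qty"]).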

edited Sep 19 '19 at 14:56

Boris Verkhovskiy


answered Jul 22 '10 at 05:18

Michał Niklas


8

the split method wouldn't work if his csv file contained strings which contained commas – Carson Myers Jul 22 '10 at 05:21
3

or quoted strings as values (with or without commas) – adamk Jul 22 '10 at 05:32
30

Python 3 now uses io.StringIO, so import io and use io.StringIO. (Hopefully this saves Python 3 users a little time.) – JStrahl Jul 20 '12 at 10:08
4

Instead of `.split('\n')`, you can use `.splitlines()`. – Denilson Sá Maia Sep 24 '14 at 23:06
This only works, I believe, for ascii only csv strings – leo Jul 14 '17 at 21:38
1

No, it works very well with Polish letters with ogonki :-) – Michał Niklas Jul 18 '17 at 05:25
Yes, beware of quoting rules/escapes if using just plain `split()`s, especially, two double quotes is really just one. cf. https://tools.ietf.org/html/rfc4180 – flow2k Jan 28 '19 at 20:46

adamk · Answer 2 · 2010-07-22T05:30:53.047

84

Simple - the csv module works with lists, too:

>>> a=["1,2,3","4,5,6"]  # or a = "1,2,3\n4,5,6".split('\n')
>>> import csv
>>> x = csv.reader(a)
>>> list(x)
[['1', '2', '3'], ['4', '5', '6']]

edited Jul 22 '10 at 05:30

answered Jul 22 '10 at 05:20

adamk


5

Good to know, but keep in mind that `.split('\n')` will do odd things if your fields contain newlines. – Inaimathi Apr 15 '13 at 14:52
1

@Inaimathi, If it's csv, the newlines inside should be escaped. – John La Rooy Dec 15 '15 at 20:55
2

Newlines don't need to be escaped if the field is quoted. – Jonathan Stray Jan 31 '17 at 17:36
1

This functionality is not well documented. Thank you. – cowlinator Apr 09 '19 at 20:59
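The point about newlines inside quoted fields can be checked with a short sketch. The sample record is made up; the comparison is between splitting the string on '\n' before parsing and handing csv.reader a StringIO object that keeps the line endings.

```python
import csv
from io import StringIO

# Made-up record whose middle field contains a line break inside quotes.
text = 'a,"first\nsecond",b\n'

# Splitting on '\n' first throws away the newline that belonged to the field.
via_split = list(csv.reader(text.split('\n')))

# A file-like object keeps the line endings, so the field survives intact.
via_stringio = list(csv.reader(StringIO(text, newline='')))

print(via_split)
print(via_stringio)
```

In both cases the reader joins the two physical lines into one record, but only the StringIO variant preserves the line break inside the field.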

score 28 · Answer 3 · answered Feb 23 '17 at 01:11

The official documentation for csv.reader() (https://docs.python.org/2/library/csv.html) is helpful here. It says:

file objects and list objects are both suitable

import csv

text = """1,2,3
a,b,c
d,e,f"""

lines = text.splitlines()
reader = csv.reader(lines, delimiter=',')
for row in reader:
    print('\t'.join(row))

score 15 · Answer 4 · answered Mar 22 '17 at 08:23

Per the documentation:

And while the module doesn’t directly support parsing strings, it can easily be done:

import csv
for row in csv.reader(['one,two,three']):
    print(row)

Just turn your string into a single element list.

Importing StringIO seems a bit excessive to me when this example is explicitly in the docs.

score 9 · Answer 5 · answered Jul 22 '10 at 19:05


As others have already pointed out, Python includes a module to read and write CSV files. In Python 2 it works pretty well as long as the input characters stay within ASCII limits; processing other encodings takes more work. (In Python 3 the csv module works on Unicode str objects directly.)

The Python 2 documentation for the csv module includes a recipe that extends csv.reader: it uses the same interface but can handle other encodings and returns unicode strings. Just copy and paste the code from the documentation. After that, you can process a CSV file like this:

# Python 2 only: UnicodeReader is the recipe copied from the Python 2 csv documentation.
with open("some.csv", "rb") as csvFile:
    for row in UnicodeReader(csvFile, encoding="iso-8859-15"):
        print row

answered Jul 22 '10 at 19:05

roskakori


Make sure the Unicode file does not have a BOM (Byte Order Marker) – Pierre Oct 13 '14 at 14:04
1

Concerning BOM: Python should detect and skip official BOMs in UTF-32, UTF-16 etc. To skip the unofficial Microsoft BOM for UTF-8, use `'utf-8-sig'` as codec instead of `'utf-8'`. – roskakori Dec 07 '14 at 07:00
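For Python 3, the BOM advice can be sketched as follows: decode the bytes with 'utf-8-sig' first, then parse the resulting string. The byte string below is invented sample data.

```python
import csv
from io import StringIO

# Made-up bytes: UTF-8 text preceded by the UTF-8 BOM (EF BB BF).
raw = b'\xef\xbb\xbfname,city\r\nZo\xc3\xab,Gda\xc5\x84sk\r\n'

text = raw.decode('utf-8-sig')  # 'utf-8-sig' drops the BOM if one is present
rows = list(csv.reader(StringIO(text, newline='')))

print(rows)
```

Decoding with plain 'utf-8' instead would leave the BOM character glued to the first field of the header row.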

nvd · Answer 6 · 2021-10-19T14:13:50.427

7

Not a generic CSV parser but usable for simple strings with commas.

>>> a = "1,2"
>>> a
'1,2'
>>> b = a.split(",")
>>> b
['1', '2']

To parse a CSV file:

with open("file.csv", "r") as f:  # the file name must be a quoted string
    lines = f.read().split("\n")  # "\r\n" if needed

for line in lines:
    if line != "":  # add other needed checks to skip titles
        cols = line.split(",")
        print(cols)

edited Oct 19 '21 at 14:13

answered Apr 13 '14 at 21:43

nvd


'Simple is better than complex!' – Abdelouahab Dec 06 '14 at 02:18
12

-1 The issue with this solution is that it doesn't take "string escaping" into account, i.e. `3, "4,5,6", 6` should be treated as three fields instead of five. – Zz'Rot Feb 09 '16 at 04:16
Simple but only works in some specific cases, this is not generic CSV parsing code – Christophe Roussy May 03 '16 at 11:47
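A short sketch of that input run through csv.reader, written with its closing quote in place. Because the quoted field follows a space, the sketch passes skipinitialspace=True; without it the reader does not see the quote at the start of a field and treats it as ordinary text.

```python
import csv

line = '3, "4,5,6", 6'  # quoted field preceded by a space

default_fields = next(csv.reader([line]))
trimmed_fields = next(csv.reader([line], skipinitialspace=True))

print(len(default_fields))  # the quotes are not recognised, so the commas split
print(trimmed_fields)
```

With skipinitialspace=True the line parses into the three intended fields.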

score 3 · Answer 7 · answered Apr 13 '14 at 22:12

https://docs.python.org/2/library/csv.html?highlight=csv#csv.reader

csvfile can be any object which supports the iterator protocol and returns a string each time its next() method is called

Thus, a StringIO.StringIO(), str.splitlines() or even a generator are all good.
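As a sketch of the generator case: any iterable that yields one line per step can feed csv.reader. The helper name nonblank_lines and the sample text are made up for illustration.

```python
import csv

def nonblank_lines(text):
    """Yield the lines of text one at a time, skipping blank ones."""
    for line in text.splitlines():
        if line.strip():
            yield line

rows = list(csv.reader(nonblank_lines("1,2,3\n\n4,5,6\n")))
print(rows)
```

This is handy when the lines need filtering or come from a source that is not a file, since the reader only pulls them as it iterates.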

dsgou · Answer 8 · 2016-12-05T12:45:10.053

2

Use this to have a csv loaded into a list

import csv

# myfile is a placeholder for the path to a tab-delimited file
with open(myfile, 'r', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter='\t')
    my_list = list(reader)
print(my_list)
# [['1st_line', '0'],
#  ['2nd_line', '0']]

edited Dec 05 '16 at 12:45

answered Oct 12 '15 at 11:10

dsgou


score 1 · Answer 9 · answered Dec 03 '14 at 13:46

Here's an alternative solution:

>>> import pyexcel as pe
>>> text="""1,2,3
... a,b,c
... d,e,f"""
>>> s = pe.load_from_memory('csv', text)
>>> s
Sheet Name: csv
+---+---+---+
| 1 | 2 | 3 |
+---+---+---+
| a | b | c |
+---+---+---+
| d | e | f |
+---+---+---+
>>> s.to_array()
[[u'1', u'2', u'3'], [u'a', u'b', u'c'], [u'd', u'e', u'f']]

Here's the documentation

Andrei · Answer 10 · 2022-10-17T17:45:59.277

For anyone still looking for a reliable way of converting a standard CSV str to a list[str] as well as in reverse, here are two functions I put together from some of the answers in this and other SO threads:

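import csv
from io import StringIO


def to_line(row: list[str]) -> str:
    # Serialize one row to a CSV line, dropping only the writer's line terminator.
    with StringIO() as line:
        csv.writer(line).writerow(row)
        return line.getvalue().rstrip('\r\n')


def from_line(line: str) -> list[str]:
    # Parse a single CSV line back into its fields.
    return next(csv.reader([line]))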
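The same idea can be extended to several rows at once. The following is a sketch rather than part of the answer above: the function names rows_to_csv and csv_to_rows and the sample rows are invented, and the StringIO objects use newline='' so embedded line breaks pass through unchanged.

```python
import csv
from io import StringIO

def rows_to_csv(rows):
    """Serialize a list of rows into one CSV-formatted string."""
    buf = StringIO(newline='')
    csv.writer(buf).writerows(rows)
    return buf.getvalue()

def csv_to_rows(text):
    """Parse a CSV-formatted string back into a list of rows."""
    return list(csv.reader(StringIO(text, newline='')))

# Made-up rows covering a comma, a quote, and a line break inside fields.
rows = [['id', 'note'], ['1', 'has, comma'], ['2', 'say "hi"'], ['3', 'two\nlines']]
restored = csv_to_rows(rows_to_csv(rows))
print(restored == rows)
```

Round-tripping awkward fields like these is a quick way to check that nothing is lost in either direction.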

score 0 · Answer 11 · answered Mar 13 '23 at 11:50


For csv files:

import pandas as pd

data = blob.download_as_text()  # blob: an existing object defined elsewhere

pd.DataFrame(i.split(",") for i in data.split("\n"))  # naive split, ignores quoting

answered Mar 13 '23 at 11:50

Amine Dev

