Read each entire Column of CSV file using python (preferably by help of pandas )

Question

I have some data in Microsoft excel that I save them as CSV file for ease of use. the data Structure is like this:

MS Excel format:

L1                      
0   1   0   0   0   1   1
0   0   1   0   0   1   0
0   0   0   1   0   0   1
0   0   0   0   1   0   0
1   1   1   1   1   1   1
1   1   1   1   1   1   1
1   1   1   1   1   1   1
1   1   1   1   1   1   1

CSV format

L1,,,,,,,,,,,,,,
0,1,0,0,0,1,1,
0,0,1,0,0,1,0,
0,0,0,1,0,0,1,
0,0,0,0,1,0,0,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,

As you see only the first column has label now I want to read the CSV file (or it's easier the excel file) to get each column and do some bit manipulation operation on them. How can I acheive this? I have read something about pandas But I can't find anything useful in order to fetch each coloumn

@aIKid I followed http://stackoverflow.com/questions/16503560/read-specific-columns-from-csv-file-with-python-csv, but it needs to know all the column's label which I don't have — Am1rr3zA, Nov 06 '13 at 15:49
@alexis each column for example I want the result of anding or XORing column 1 with column 3. — Am1rr3zA, Nov 06 '13 at 16:13

score 2 · Accepted Answer · answered Nov 06 '13 at 17:16

Given the .csv file temp.csv

L1x,,,,,,,
0,1,0,0,0,1,1,
0,0,1,0,0,1,0,
0,0,0,1,0,0,1,
0,0,0,0,1,0,0,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,

read it in as follows:

import pandas
a = pandas.read_csv('temp.csv', names = ["c%d" % i for i in range(8)], skiprows = 1)
a

Output:

   c0  c1  c2  c3  c4  c5  c6  c7
0   0   1   0   0   0   1   1 NaN
1   0   0   1   0   0   1   0 NaN
2   0   0   0   1   0   0   1 NaN
3   0   0   0   0   1   0   0 NaN
4   1   1   1   1   1   1   1 NaN
5   1   1   1   1   1   1   1 NaN
6   1   1   1   1   1   1   1 NaN
7   1   1   1   1   1   1   1 NaN

The 'NaN's in the last column come from the pesky trailing commas. The 8 in the range needs to match the number of columns. To access the columns in a use either

a.c3

or

a[c3]

both of which result in

The cool thing about pandas is that if you want to XOR two columns you can very simply.

a.c0^a.c2

Output

how can I do other bitwise operation like NOT, NAND, XNOR and so on, I mean in base 2 (causeI know I can do something like this a.c0& ~a.c2 or ~(a.c0^a.c2)) but it didn't work correctlhy — Am1rr3zA, Nov 11 '13 at 20:14
Bitwise operators return integers. NOT, NAND, XNOR etc. are not implemented as such since the not portion requires a known fixed bit length. However, you can combine the bitwise operators with modulo 2 operator. So for NOT (~a.c0)%2 works, (~(a.c1&a.c2))%2 and ((~a.c1)|(~a.c2))%2 are two of the many logical equivalents of NAND, (~(a.c1^a.c2))%2 will do XNOR, etc. Just watch out for operator precedence. See Section 5.4.1 of [Python Build-in Types Documentation](http://docs.python.org/2/library/stdtypes.html) as a reference. — Chris H., Nov 18 '13 at 02:49

dawg · Answer 2 · 2013-11-06T19:40:02.573

Assume I have:

enter image description here

Which you can save into a CSV file that looks like so:

L1,,,
L2,0,10,20
L3,1,11,21
L4,2,12,22
L5,3,13,23
L6,4,14,24
L7,5,15,25
L8,6,16,26
L9,7,17,27
L10,8,18,28

To get just any col, use CSV reader and transpose with zip:

import csv

with open('test.csv', 'rU') as fin:
    reader=csv.reader(fin)
    data=list(reader)

print 'data:', data
# data: [['L1', '', '', ''], ['L2', '0', '10', '20'], ['L3', '1', '11', '21'], ['L4', '2', '12', '22'], ['L5', '3', '13', '23'], ['L6', '4', '14', '24'], ['L7', '5', '15', '25'], ['L8', '6', '16', '26'], ['L9', '7', '17', '27'], ['L10', '8', '18', '28']]

Notice the data is a list of rows. You can transpose that List of Lists using zip to get a list of columns:

trans=zip(*data)
print 'trans:',trans
# trans: [('L1', 'L2', 'L3', 'L4', 'L5', 'L6', 'L7', 'L8', 'L9', 'L10'), ('', '0', '1', '2', '3', '4', '5', '6', '7', '8'), ('', '10', '11', '12', '13', '14', '15', '16', '17', '18'), ('', '20', '21', '22', '23', '24', '25', '26', '27', '28')]

Then just index to get a specific column:

print trans[0]
# ('L1', 'L2', 'L3', 'L4', 'L5', 'L6', 'L7', 'L8', 'L9', 'L10')

Of course if you want to do arithmetic on the cells, you will need to convert the string to ints or floats as appropriate.

score 0 · Answer 3 · answered Nov 06 '13 at 16:00

0

import pandas as pd
pd.read_excel("foo.xls", "Sheet 1",
              names=["c%d" % i for i in range(7)])

Output:

   c0  c1  c2  c3  c4  c5  c6
0   0   1   0   0   0   1   1
1   0   0   1   0   0   1   0
2   0   0   0   1   0   0   1
3   0   0   0   0   1   0   0
4   1   1   1   1   1   1   1
5   1   1   1   1   1   1   1
6   1   1   1   1   1   1   1
7   1   1   1   1   1   1   1

answered Nov 06 '13 at 16:00

Matt

17,290
7
57
71

I want to get each column, for example read whole xls or csv in A and by a[0] I get 0 0 0 0 1 1 1 1 and by a[1] get 1 0 0 0 1 1 1 1 an d so on – Am1rr3zA Nov 06 '13 at 16:10
tryp `a.c1` or `a['c1']`. – Matt Nov 06 '13 at 16:13
raise XLRDError('No sheet named <%r>' % sheet_name) xlrd.biffh.XLRDError: No sheet named <'aaa'>, and I have a shit name aaa – Am1rr3zA Nov 06 '13 at 16:18

score 0 · Answer 4 · answered Nov 06 '13 at 16:01

Sample Code returns column as array.:

input = """L1,,,,,,,,,,,,,,
0,1,0,0,0,1,1,
0,0,1,0,0,1,0,
0,0,0,1,0,0,1,
0,0,0,0,1,0,0,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
"""

def getColumn(data,column_number):

    dump_array=[]
    lines=data.split("\n")
    for line in lines:
        tmp_cell = line.split(",")
        dump_array.append(tmp_cell[3])
    return dump_array

#for ex. get column 3
getColumn(3,input)

This may give an idea to manuplate your grid...

Note: I dont have an interpreter for testing code now, so sorry if there is typo...

Read each entire Column of CSV file using python (preferably by help of pandas )

4 Answers4