How to combine two CSV data files into one file using Python

Question

I have two CSV data files.

The first:

v1,v2,v3,....v100
-0.6662942866484324,-1.0799718232204516,1.843649258216222,....1.0950462520122528
0.7452152929104426,-0.6032845087431591,0.7041161138126079,....-0.41362931908053513

The second:

c1,c2,c3,c4,c5
4,1,0,0,1
14,2,2,0,13

When I combine them using my code, the result looks like this:

v1,v2,v3..v100,c1,c2,c3,c4,c5
0.0,1.0,2,...0,0,0,1,0,0

My code is:

import pandas as pd
vector = pd.read_csv('../data/vector_data.csv',encoding = "ISO-8859-1")
cluster= pd.read_csv('../data/data_cluster.csv',encoding = "ISO-8859-1")
data=vector.merge(cluster, left_on='v1', right_on='c1')
export_csv = data.to_csv (r'../data/merge_label.csv',index=False)

The result should look like this:

v1,v2,v3..v100,c1,c2,c3,c4,c5
-0.6662942866484324,-1.0799718232204516,1.843649258216222,....1.0950462520122528,4,1,0,0,1

Please help me.

What code did you try? – Quang Hoang, Jul 19 '19 at 19:43
I've edited it, look again. – Muhammad Rusli, Jul 19 '19 at 19:49

score 1 · Answer 1 · answered Jul 19 '19 at 19:56

Pandas is not needed for this:

with open('third.csv', 'w') as fh:
    for f, s in zip(*map(open, ['first.csv', 'second.csv'])):
        fh.write(f.rstrip('\n') + ',' + s)
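A slightly more defensive variant of the same idea is sketched below. The function name `join_csv_rows` and its parameters are hypothetical, introduced only for illustration. It assumes both files have the same number of lines and that line N of one file belongs with line N of the other. It also closes all three files explicitly and strips Windows-style line endings.

```python
def join_csv_rows(first_path, second_path, out_path):
    """Append each line of second_path to the matching line of first_path.

    Assumption: both files have the same number of lines, header included.
    zip() stops at the shorter file, so extra lines would be dropped silently.
    """
    with open(first_path, newline="") as first, \
         open(second_path, newline="") as second, \
         open(out_path, "w", newline="") as out:
        for left, right in zip(first, second):
            # Strip both \n and \r\n endings before joining the two halves.
            out.write(left.rstrip("\r\n") + "," + right.rstrip("\r\n") + "\n")
```

Calling `join_csv_rows('first.csv', 'second.csv', 'third.csv')` would then write the combined rows, with the two header lines joined in the same way as the data lines.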

answered Jul 19 '19 at 19:56

piRSquared


Why wouldn't you want to use pandas in this use-case? – Umar.H Jul 19 '19 at 20:51
If the application is just to merge csv files, importing pandas might be too much overhead. Imagine this in a command-line tool that needs to be run many times. It just isn't necessary to load a heavy-hitting data analytics library to do something this simple. – piRSquared Jul 19 '19 at 20:53
Thanks as always, you are a great teacher. – Umar.H Jul 19 '19 at 20:54
Now I know what you meant :) – anky Jul 20 '19 at 04:05

score 0 · Answer 2 · answered Jul 19 '19 at 19:49

Try updating to this:

data=vector.merge(cluster, left_on='v1', right_on='c1', how='outer')

The default is how='inner', so it looks like the only value that v1 and c1 have in common may be 0, which would produce the single row you are seeing.
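To see how the two settings differ, here is a small sketch on made-up frames. The values and the two-column shape are assumptions for illustration, not the asker's data, and c1 is stored as floats here only to keep the key dtypes identical. In both cases `merge` pairs rows by matching values in v1 and c1, not by row position.

```python
import pandas as pd

# Hypothetical toy data standing in for vector_data.csv and data_cluster.csv.
vector = pd.DataFrame({"v1": [0.0, 0.75], "v2": [1.0, -0.6]})
cluster = pd.DataFrame({"c1": [0.0, 14.0], "c2": [1, 2]})

# Default how="inner": keep only rows where a v1 value equals a c1 value.
inner = vector.merge(cluster, left_on="v1", right_on="c1")

# how="outer": also keep unmatched rows from both sides, padded with NaN.
outer = vector.merge(cluster, left_on="v1", right_on="c1", how="outer")

print(len(inner), len(outer))
```

With these toy values only the key 0.0 appears on both sides, so the inner merge keeps one row, while the outer merge also keeps the unmatched row from each frame.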

answered Jul 19 '19 at 19:49

Connor John


score 0 · Answer 3 · answered Jul 19 '19 at 19:50

Can you try the following code and see if it works:

data=pd.concat([vector,cluster],axis=1)
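As a minimal sketch of what this call does, the example below uses two small made-up frames (the values are assumptions, shortened from the question's sample rows). With axis=1, `pd.concat` places the frames side by side and aligns rows on the index, so with the default 0, 1, 2, ... index from `read_csv`, row N of one frame lands next to row N of the other.

```python
import pandas as pd

# Hypothetical toy data; the real files have 100 and 5 columns respectively.
vector = pd.DataFrame({"v1": [-0.67, 0.75], "v2": [-1.08, -0.60]})
cluster = pd.DataFrame({"c1": [4, 14], "c2": [1, 2]})

# axis=1 concatenates column-wise, aligning rows on the shared index.
data = pd.concat([vector, cluster], axis=1)

# index=False keeps the row index out of the CSV, as in the question's code.
csv_text = data.to_csv(index=False)
print(csv_text)
```

If the two frames had different indexes (for example after filtering one of them), calling `reset_index(drop=True)` on each before concatenating would restore positional alignment.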

answered Jul 19 '19 at 19:50

Krishna Rao

