
Hello, I have searched Google extensively but still can't find anything that fits my needs.

I found this: Combining two csv files using pandas

But it's not doing what I want.

Code

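import pandas as pd

df1 = pd.read_csv("a.csv")
df2 = pd.read_csv("b.csv")

# DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent
out = pd.concat([df1, df2])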

with open('main.csv', 'w', encoding='utf-8') as f:
    out.to_csv(f, index=False)

a.csv

col1    col2    col3
a        b        c
d        e        f

b.csv

col1    col2    col3
g        h        i
j        k        l

main.csv (it seems to output nicely)

col1    col2    col3
a        b        c
d        e        f
g        h        i
j        k        l

However, when I remove data from a.csv or b.csv and run the script again, that data also disappears from main.csv.

Example: a.csv (Removed a,b,c)

col1    col2    col3
d        e        f

b.csv

col1    col2    col3
g        h        i
j        k        l

main.csv

col1    col2    col3

d        e        f
g        h        i
j        k        l

It seems to leave a gap and remove the data if I remove some data from either CSV. Basically, a.csv and b.csv are always changing, and I want to combine the two without altering the data that main.csv already has. I would also like main.csv not to contain duplicate rows.

BakaDesu
  • how are you removing `a,b,c`? – anky Mar 02 '19 at 06:16
  • I experimented by manually removing those rows in Excel. It seems that when I remove something, append the two CSV files, and output the result into the third CSV file, that data is removed and a gap is left. – BakaDesu Mar 02 '19 at 06:18
  • You need to write a merge function that merges two CSV files. First merge main.csv with a.csv, then merge the result with b.csv. To implement the merge, read the first CSV into a list of tuples, and when you read the second file, check each row for existence in that list. That will avoid duplicates. – jay.w Mar 02 '19 at 06:27
  • Thank you!! This is exactly what I am looking for! – BakaDesu Mar 02 '19 at 06:33
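
The merge described in jay.w's comment could be sketched roughly as follows, using only the standard `csv` module. This is a minimal sketch, not a definitive implementation: the function name `merge_csv` is hypothetical, it assumes all files share the same header row, and it uses a set of tuples (rather than a plain list) for the existence check. Rows already in main.csv are read first, so removing a row from a.csv or b.csv later does not remove it from main.csv.

```python
import csv
import os


def merge_csv(main_path, source_paths):
    """Merge rows from source_paths into main_path, skipping duplicates.

    Hypothetical helper: assumes every file has the same header row.
    """
    header = None
    rows = []      # unique rows, in the order they were first seen
    seen = set()   # tuples of rows already collected

    # Read main.csv first (if it exists) so its existing rows are preserved.
    paths = [main_path] if os.path.exists(main_path) else []
    paths += list(source_paths)

    for path in paths:
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            file_header = next(reader, None)
            if header is None:
                header = file_header
            for row in reader:
                # Skip blank lines and rows whose fields are all empty
                # (the "gap" left after deleting a row in a spreadsheet).
                if not any(field.strip() for field in row):
                    continue
                key = tuple(row)
                if key not in seen:
                    seen.add(key)
                    rows.append(row)

    with open(main_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if header is not None:
            writer.writerow(header)
        writer.writerows(rows)
    return rows
```

Calling `merge_csv("main.csv", ["a.csv", "b.csv"])` after each change to a.csv or b.csv would then only ever add new rows to main.csv.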

1 Answer


You have to open the file in append mode instead of write mode. The following should work:

with open('main.csv', 'a', encoding='utf-8') as f:
    # note: to_csv writes the header row again on every call unless header=False is passed
    out.to_csv(f, index=False)

To remove duplicates, refer to the question below:

Removing duplicate rows from a csv file using a python script
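
As an alternative to appending and de-duplicating in two separate steps, both requirements can be handled in pandas in one pass. The following is a minimal sketch under stated assumptions: the function name `update_main` is hypothetical, all files are assumed to share the same columns, and main.csv is rewritten in full on each run rather than opened in append mode. Existing rows of main.csv are read back in first, so rows removed from a.csv or b.csv stay in main.csv.

```python
import os

import pandas as pd


def update_main(main_path, source_paths):
    """Combine main.csv with the source files, keeping each row only once.

    Hypothetical helper: assumes all files have the same columns.
    """
    frames = []
    # Load the existing main.csv first so its rows are never lost.
    if os.path.exists(main_path):
        frames.append(pd.read_csv(main_path))
    frames.extend(pd.read_csv(path) for path in source_paths)

    out = (
        pd.concat(frames, ignore_index=True)
        .dropna(how="all")     # drop rows that are entirely empty (the "gap")
        .drop_duplicates()     # keep the first occurrence of each row
    )
    out.to_csv(main_path, index=False, encoding="utf-8")
    return out


if __name__ == "__main__":
    update_main("main.csv", ["a.csv", "b.csv"])
```

Because the whole file is rewritten, the header is written exactly once, which avoids the repeated header rows that append mode would otherwise produce.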