-1

I have two CSV's, the first (CSV1) contains data about files in a folder, such as name, path, size and also has a unqiue ID (DocID).

In the second CSV (CSV2) I have additional information about these files, such as Sensitivity, Retention Period, File Owner, etch. This CSV also has a DocID field.

I want to iterate through each item in CSV1 and append to the end of each row, the additional fields from CSV2 where the DocID fields match.

I imagine this is a simple problem, with a simple solution, and I apologies for such a n00b question and appreciate any guidance on this. I have Python and Pandas on my machine.

jm2583
  • 15
  • 5
  • You will want to post your own attempt at a solution and use this to ask a much more specific question. – DanielM Jul 13 '19 at 14:29

3 Answers3

0

May be this can help,

df1 = pd.read_csv("file1.csv")
df2 = pd.read_csv("file2.csv")
merged = df1.merge(df2, on="DOCID",how="outer").fillna("")
merged.to_csv("merged.csv", index=False)
  • Thanks, pd.merge is a nifty little function worked a charm. Have tested this on a small data set and seems to do exactly what I need. Thanks for the advice. Negated the index=False the first time, realised the importance of it after, thanks again. – jm2583 Jul 13 '19 at 22:51
  • @jm2583 Hey, could you please upvote my answer then. – chinmay gandi Jul 23 '19 at 09:25
0
import pandas as pd
df1 = pd.read_csv(<name of file 1>)
df2 = pd.read_csv(<name of file 2>)
result = pd.merge(df1, df2, on='<name of the id column>')
Yash Thenuan
  • 591
  • 5
  • 17
  • 1
    Please add some explanation to your answer. Doing this your answer will be more and more helpful for future users – DaFois Jul 13 '19 at 16:54
  • Thanks, pd.merge is a nifty little function worked a charm. Have tested this on a small data set and seems to do exactly what I need. Thanks for the advice. – jm2583 Jul 13 '19 at 22:51
0

Pandas has a merge function which more information can be found here https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html. It sounds like you want a one-to-one merge. Your code will look like this. result = pd.merge(left, right, on=['key1', 'key2']) with left being CSV1 and right be CSV2 Key being DocID

  • Thanks, yes, this merge function appears to have done the trick, thanks for taking the time to respond. – jm2583 Jul 13 '19 at 22:51