0

I have multiple files in a directory. I want to merge them in a way such that the rows are merged together, and file names are kept as column headers. For example, file1 looks like

ENSG1     12
ENSG2     13
ENSG3     14

file2 looks like

ENSG1     13
ENSG2     14
ENSG4     15

I'm looking forward to an output like

          file1      file2
ENSG1      12         13
ENSG2      13         14
ENSG3      14         0/na
ENSG4      0/na       15

Do you have any idea how to do this? Thank you for your time!

Jiaxin He
  • 1
  • 2
  • You need to perform an outer join. `result = file1.merge(file2, on='col1', how='outer')` Then you can rename your columns. Also, you can find more resources [here](https://stackoverflow.com/questions/53645882/pandas-merging-101). – Anurag Reddy Aug 06 '20 at 19:25

1 Answers1

0

Here's a way to do that using concat:

dfs = []
for f in ["file1", "file2"]: # iterate the relevant files here
    df = pd.read_csv(f, header=None, sep = "\s+", index_col=0)
    df.columns = [f]
    dfs.append(df)
res = pd.concat(dfs, axis=1)

The output it:

       file1  file2
ENSG1   12.0   13.0
ENSG2   13.0   14.0
ENSG3   14.0    NaN
ENSG4    NaN   15.0
Roy2012
  • 11,755
  • 2
  • 22
  • 35