
Hi all, I have two dataframes (BLAST output). The second one carries more information per row, but it has fewer rows, because that extra information is missing for some hits. I would still like to keep those rows from the first dataframe. Here is an example.

Let's say the first df is the one with all rows:

seq_id1     seq_id2     other_columns
seq1_A      seq2_B      something
seq2_A      seq3_B      something
seq4_A      seq9_B      something
seq9_A      seq9_B      something
seq10_A     seq8_B      something

and the other one, with the additional column, is:

seq_id1     seq_id2     other_columns aditionnal_info_columns
seq1_A      seq2_B      something      kingdom1
seq4_A      seq9_B      something      kingdom2

What I would like to get is a dataframe like this:

seq_id1     seq_id2     other_columns  aditionnal_info_columns
seq1_A      seq2_B      something       kingdom1
seq2_A      seq3_B      something       NA
seq4_A      seq9_B      something       kingdom2
seq9_A      seq9_B      something       NA
seq10_A     seq8_B      something       NA

Is it clear? Thanks for your help :)

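I tried:

import pandas as pd

# Smaller table with the additional taxonomy information
Tax_id = pd.read_csv("0042_HYposoter_tax_best-hit.csv", header=0)
# Full table with all rows (tab-separated)
data_grpd_max = pd.read_csv("data_grpd_max_tax_0035.txt", sep='\t')

# Left merge on all columns the two tables have in common
data = pd.merge(data_grpd_max, Tax_id, how='left')

data.to_csv("data_grpd_max_tax_0042_new.txt", sep='\t')

But the result is not what I would like.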
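
For reference, here is a minimal sketch of the result I am after, built on toy copies of the example tables above rather than my real files. The names `df_all`, `df_tax`, `keys` and `extra` are made up for illustration. It assumes that the pair (seq_id1, seq_id2) is enough to identify a row. One possible reason my attempt misbehaves (an assumption, not verified) is that `pd.merge` without `on=` joins on every shared column, so any difference in the other shared columns prevents a match.

```python
import pandas as pd

# Toy stand-ins for the two BLAST tables in the example (not the real files).
df_all = pd.DataFrame({
    "seq_id1": ["seq1_A", "seq2_A", "seq4_A", "seq9_A", "seq10_A"],
    "seq_id2": ["seq2_B", "seq3_B", "seq9_B", "seq9_B", "seq8_B"],
    "other_columns": ["something"] * 5,
})
df_tax = pd.DataFrame({
    "seq_id1": ["seq1_A", "seq4_A"],
    "seq_id2": ["seq2_B", "seq9_B"],
    "other_columns": ["something"] * 2,
    "aditionnal_info_columns": ["kingdom1", "kingdom2"],
})

# Assumption: these two columns together identify a row.
keys = ["seq_id1", "seq_id2"]

# Keep only the keys plus the extra column from the smaller table, so the
# shared "other_columns" is neither used as a join key nor duplicated.
extra = df_tax[keys + ["aditionnal_info_columns"]].drop_duplicates(subset=keys)

# Left merge: every row of df_all is kept; rows without a match get NaN.
merged = df_all.merge(extra, on=keys, how="left")
print(merged)
```

When writing the result out, `merged.to_csv(path, sep="\t", index=False, na_rep="NA")` should print the missing values as NA instead of empty fields.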

Grendel
