Hi all, I have two dataframes (BLAST output). The second one has more columns than the first, but fewer rows,
because the additional information is missing for some hits. I would like to keep those rows from the first dataframe anyway.
Here is an example.
Let's say the first df is the one with all rows:
seq_id1 seq_id2 other_columns
seq1_A seq2_B something
seq2_A seq3_B something
seq4_A seq9_B something
seq9_A seq9_B something
seq10_A seq8_B something
and the second one, with the additional information:
seq_id1 seq_id2 other_columns aditionnal_info_columns
seq1_A seq2_B something kingdom1
seq4_A seq9_B something kingdom2
What I would like to get is a dataframe like this:
seq_id1 seq_id2 other_columns aditionnal_info_columns
seq1_A seq2_B something kingdom1
seq2_A seq3_B something NA
seq4_A seq9_B something kingdom2
seq9_A seq9_B something NA
seq10_A seq8_B something NA
Is it clear? Thanks for your help :)
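I tried:
import pandas as pd
# table with the additional taxonomy information (fewer rows)
Tax_id = pd.read_csv("0042_HYposoter_tax_best-hit.csv", header=0)
# table with all rows
data_grpd_max = pd.read_csv("data_grpd_max_tax_0035.txt", sep='\t')
# left merge on all columns the two tables have in common
data = pd.merge(data_grpd_max, Tax_id, how='left')
data.to_csv("data_grpd_max_tax_0042_new.txt", sep='\t')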
But it does not give the result I want.
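
For reference, here is a minimal sketch that reproduces the example tables above with a left merge restricted to the two ID columns. It assumes that seq_id1 and seq_id2 together identify a row and that only the extra column needs to come from the second dataframe. The names df_all, df_tax and keys are placeholders for illustration, not my real files.

```python
import pandas as pd

# Toy version of the first dataframe (all rows); placeholder name.
df_all = pd.DataFrame({
    "seq_id1": ["seq1_A", "seq2_A", "seq4_A", "seq9_A", "seq10_A"],
    "seq_id2": ["seq2_B", "seq3_B", "seq9_B", "seq9_B", "seq8_B"],
    "other_columns": ["something"] * 5,
})

# Toy version of the second dataframe (fewer rows, one extra column).
df_tax = pd.DataFrame({
    "seq_id1": ["seq1_A", "seq4_A"],
    "seq_id2": ["seq2_B", "seq9_B"],
    "other_columns": ["something"] * 2,
    "aditionnal_info_columns": ["kingdom1", "kingdom2"],
})

# Assumption: these two columns identify a row in both tables.
keys = ["seq_id1", "seq_id2"]

# Keep every row of df_all and pull in only the extra column from df_tax.
# Rows without a match get NaN in the extra column.
merged = df_all.merge(
    df_tax[keys + ["aditionnal_info_columns"]],
    on=keys,
    how="left",
)
print(merged)
```

Without on=..., pd.merge joins on every column the two tables share, so a row can fail to match if any of the shared non-ID columns differs between the files. I am not sure that is what happens with my data, but it is one thing the sketch avoids.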