given an empty dataframe with assigned column names :
colnames = ('ACCT', 'CTAT', 'AAAT', 'ATCG')*3
df = pd.DataFrame(columns=colnames)
I want to loop over dataframes which have the below structure: (giving 2 for demostration)
sample_df = pd.DataFrame()
sample_df['tetran'] = colnames
sample_df['Frequency'] = (423, 512, 25, 123,632,124,614,73,14,75,311,155)
conids = ("cl1_42", "cl1_41", "cl2_31")
rep_conids = [val for val in conids for _ in range(4)]
sample_df['contig_id'] = rep_conids
sample_df_2 = pd.DataFrame()
sample_df_2['tetran'] = colnames
sample_df_2['Frequency'] = (724, 132, 4, 102,423,402,616,734,153,751,31,55)
conids_2 = ("se1_51", "se1_21", "se2_53")
rep_conids_2 = [val for val in conids_2 for _ in range(4)]
sample_df_2['contig_id'] = rep_conids_2
The objective is:
- Add each 'Frequency' value from the 'sample_df's to the corresponding 'tetraN' value of the 'df' and add a new column to be the sample_df['contig_id']
There are multiple 'sample_df' dataframes , so this is the idea of the desired output:
index | ACCT | CTAT | AAAT | ATCG |
---|---|---|---|---|
cl1_42 | 423 | 512 | 25 | 123 |
cl1_41 | 632 | 124 | 614 | 73 |
cl2_31 | 14 | 75 | 311 | 155 |
se1_51 | 724 | 132 | 4 | 102 |
se1_21 | 423 | 402 | 616 | 734 |
se2_53 | 153 | 751 | 31 | 55 |
I know how to do this in R but I need this to be done in python so I cannot add here what I tried due it is in R.
Thanks for your time :)