I have generated a txt file from a stacked dataframe, where the output looks like this:
Name Rank F R
Sample1 0 CGGGGT GGGTTC
Sample1 1 GCTGC GCTGCGT
Sample1 2 ACGTG AGCTGA
Sample1 3 CGATCG AGCTAGC
Sample1 4 CGTCAG GGCTTT
Sample2 0 AGTCAG GTCAG
Sample2 1 CGATCA GCATGCA
Sample2 2 ACTAG GCATGCA
Sample2 3 ACTAGCA ACACCA
Sample2 4 ACTGTCG CCCAAAT
Sample3 0 GGCAT TTACTA
Sample3 1 GTCATG GCTTTA
Sample3 2 GTCAG TCGTAGC
Sample3 3 GCATGCA GCATGCA
Sample3 4 GTCAG AATCTC
Each sample shows up 5 times with a rank from 0 to 4, 0 being the best and 4 being the worst. I then pass the txt file to my next function, which calculates a frequency for each row, so the updated table has a Frequency column:
Name Rank F R Frequency
Sample1 0 CGGGGT GGGTTC 5
Sample1 1 GCTGC GCTGCGT 8
Sample1 2 ACGTG AGCTGA 2
Sample1 3 CGATCG AGCTAGC 1
Sample1 4 CGTCAG GGCTTT 2
Sample2 0 AGTCAG GTCAG 10
Sample2 1 CGATCA GCATGCA 5
Sample2 2 ACTAG GCATGCA 3
Sample2 3 ACTAGCA ACACCA 4
Sample2 4 ACTGTCG CCCAAAT 1
Sample3 0 GGCAT TTACTA 0
Sample3 1 GTCATG GCTTTA 0
Sample3 2 GTCAG TCGTAGC 2
Sample3 3 GCATGCA GCATGCA 3
Sample3 4 GTCAG AATCTC 4
I would like to drop the rank and sort each Name group by frequency from lowest to highest. This would be fairly straightforward for me, except I want to keep the samples grouped together.
I tried the following:
df = df.drop('Rank', axis=1)
df.groupby('Name').sort_values('Frequency')
But I get an error:
AttributeError: Cannot access callable attribute 'sort_values' of 'DataFrameGroupBy' objects, try using the 'apply' method
I want the resultant DF to look like:
Name F R Frequency
Sample1 CGATCG AGCTAGC 1
Sample1 ACGTG AGCTGA 2
Sample1 CGTCAG GGCTTT 2
Sample1 CGGGGT GGGTTC 5
Sample1 GCTGC GCTGCGT 8
Sample2 ACTGTCG CCCAAAT 1
Sample2 ACTAG GCATGCA 3
Sample2 ACTAGCA ACACCA 4
Sample2 CGATCA GCATGCA 5
Sample2 AGTCAG GTCAG 10
Sample3 GGCAT TTACTA 0
Sample3 GTCATG GCTTTA 0
Sample3 GTCAG TCGTAGC 2
Sample3 GCATGCA GCATGCA 3
Sample3 GTCAG AATCTC 4
Thanks in advance.
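One possible approach, sketched under a few assumptions: a groupby object has no sort_values method, which is what the AttributeError is reporting, but sorting the plain DataFrame on Name first and Frequency second should keep each sample's rows together while ordering them by frequency. The DataFrame below is rebuilt by hand from the table in the question rather than read from the txt file. Adding Rank as a third sort key before dropping it is my own choice for breaking ties, not something the question specifies; it happens to match the tie order in the desired output.

```python
import pandas as pd

# Rebuilt by hand from the table in the question (assumption: the real
# data comes from the txt file and has these same columns).
df = pd.DataFrame({
    "Name": ["Sample1"] * 5 + ["Sample2"] * 5 + ["Sample3"] * 5,
    "Rank": [0, 1, 2, 3, 4] * 3,
    "F": ["CGGGGT", "GCTGC", "ACGTG", "CGATCG", "CGTCAG",
          "AGTCAG", "CGATCA", "ACTAG", "ACTAGCA", "ACTGTCG",
          "GGCAT", "GTCATG", "GTCAG", "GCATGCA", "GTCAG"],
    "R": ["GGGTTC", "GCTGCGT", "AGCTGA", "AGCTAGC", "GGCTTT",
          "GTCAG", "GCATGCA", "GCATGCA", "ACACCA", "CCCAAAT",
          "TTACTA", "GCTTTA", "TCGTAGC", "GCATGCA", "AATCTC"],
    "Frequency": [5, 8, 2, 1, 2, 10, 5, 3, 4, 1, 0, 0, 2, 3, 4],
})

result = (
    # Name first keeps each sample's rows contiguous; Frequency orders
    # rows within a sample; Rank only breaks ties (an assumption).
    df.sort_values(["Name", "Frequency", "Rank"])
      .drop(columns="Rank")
      .reset_index(drop=True)
)

print(result.to_string(index=False))
```

If the groups need to stay in their original order of appearance rather than alphabetical order of Name, this sketch would need adjusting; with Sample1, Sample2, Sample3 the two orders coincide.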