I've got a data frame with 34 columns and 12,964 rows. Two of these columns are Gene.Name and Mutation_Frequency. For example:
Gene.Name | Mutation_Frequency |
---|---|
CTLA4 | 0 |
TP53 | 4 |
CTLA4 | 2 |
CTLA4 | 2 |
TP53 | 4 |
TP53 | 6 |
I now want to create a new column called "Highest_Mutation_Frequency" that holds, for every row, the highest mutation frequency observed for that row's Gene.Name, like this:
Gene.Name | Mutation_Frequency | Highest_Mutation_Frequency |
---|---|---|
CTLA4 | 0 | 2 |
TP53 | 4 | 6 |
CTLA4 | 2 | 2 |
CTLA4 | 2 | 2 |
TP53 | 4 | 6 |
TP53 | 6 | 6 |
I realize I could probably use the max() function, but I'm not sure how to apply it per gene. As always, any help is appreciated!
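For reference, here is a minimal sketch of one way this might be done in base R, using ave() to apply max() within each Gene.Name group. It only uses the toy data from the tables above, and the data frame name `df` is an assumption, not the name of my real object.

```r
# Toy data copied from the example table; `df` is an assumed name.
df <- data.frame(
  Gene.Name = c("CTLA4", "TP53", "CTLA4", "CTLA4", "TP53", "TP53"),
  Mutation_Frequency = c(0, 4, 2, 2, 4, 6)
)

# ave() applies FUN within each group and returns a vector with one value
# per original row, so it can be assigned directly as a new column.
df$Highest_Mutation_Frequency <- ave(
  df$Mutation_Frequency,
  df$Gene.Name,
  FUN = max
)

print(df$Highest_Mutation_Frequency)
# 2 6 2 2 6 6
```

If Mutation_Frequency contained missing values, max() would return NA for the affected genes unless something like `FUN = function(x) max(x, na.rm = TRUE)` were used instead.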
Edit: Although this is quite similar to another question, "Select the row with the maximum value in each group", this question also involves producing unique rows and placing them in another data frame.