Find the maximum value for a specific value in a column?

Question

I've got a data frame with 34 columns and 12,964 rows, two of these columns are Gene.Name and Mutation_Frequency. For example:

Gene.Name	Mutation_Frequency
CTLA4	0
TP53	4
CTLA4	2
CTLA4	2
TP53	4
TP53	6

I now want to create a column called "Highest_Mutation_Frequency" which tells me the highest mutation frequency for the Gene.Name, and puts it in a new column, like this:

Gene.Name	Mutation_Frequency	Highest_Mutation_Frequency
CTLA4	0	2
TP53	4	6
CTLA4	2	2
CTLA4	2	2
TP53	0	6
TP53	6	6

I realize I could probably use the max() command, but I'm not sure how to implement this. As always, any help is appreciated!

Edit: Although this is quite similar to another question: Select the row with the maximum value in each group this question also involves producing unique rows and placing them in another data frame.

Try looking at [this post](https://stackoverflow.com/questions/25314336/extract-the-maximum-value-within-each-group-in-a-dataframe) — Ben, Aug 07 '21 at 16:55

score 5 · Answer 1 · answered Aug 07 '21 at 16:57

5

You could use:

library(dplyr)

data %>%
  group_by(Gene.Name) %>%
  mutate(Highest_Mutation_Frequency = max(Mutation_Frequency))

answered Aug 07 '21 at 16:57

Martin Gal

16,640
5
21
39

score 5 · Accepted Answer · answered Aug 07 '21 at 17:52

5

An option with base R

aggregate(cbind(Highest_Mutation_Frequency = Mutation_Frequency) ~ Gene.Name, data, FUN = max)

answered Aug 07 '21 at 17:52

akrun

874,273
37
540
662

score 3 · Answer 3 · answered Aug 07 '21 at 20:38

3

Another base R option using unique + ave

unique(
  transform(
    df,
    Highest_Mutation_Frequency = ave(Mutation_Frequency,Gene.Name,FUN = max)
  )[c("Gene.Name","Highest_Mutation_Frequency")]
)

answered Aug 07 '21 at 20:38

ThomasIsCoding

96,636
9
24
81

Find the maximum value for a specific value in a column?

3 Answers3