How to create new field that takes max value given group_by in R?

Question

I have a table that looks like so:

PARTY_ID | PARTYNUM | WEIGHTED_CONF | CONF_SCORE
1        | ABC      | HIGH          | 3
1        | ABC      | HIGH          | 3
1        | ABC      | MEDIUM        | 2
2        | DEF      | LOW           | 1
2        | DEF      | MEDIUM        | 2
2        | DEF      | HIGH          | 3
3        | GHI      | PERFECT       | 4
3        | GHI      | HIGH          | 3
3        | GHI      | HIGH          | 3

I would like to create a new field that takes the highest 'CONF_SCORE' by each 'PARTYNUM' group.

Desired output

PARTY_ID | PARTYNUM | WEIGHTED_CONF | CONF_SCORE | MAX
1        | ABC      | HIGH          | 3          | 3
1        | ABC      | HIGH          | 3          | 3
1        | ABC      | MEDIUM        | 2          | 3
2        | DEF      | LOW           | 1          | 3
2        | DEF      | MEDIUM        | 2          | 3
2        | DEF      | HIGH          | 3          | 3
3        | GHI      | PERFECT       | 4          | 4
3        | GHI      | HIGH          | 3          | 4
3        | GHI      | HIGH          | 3          | 4

I tried this, but my output returns `-Inf`:

new_dataset_final <- new_dataset1 %>%
group_by(PARTYNUM) %>%
  mutate(MAX = max(as.numeric(new_dataset$Conf_Score)))

"Almost never" should you use `new_dataset$` *inside* of a dplyr-verb: when you do that, you ignore the `group_by` grouping completely. Remove that and it returns what you want. — r2evans, Sep 22 '21 at 16:32
Also, btw, `Conf_Score` and `CONF_SCORE` are not the same. `max(as.numeric(NAME_OF_NONEXISTENT_COLUMN))` is the same as `max(as.numeric(NULL))` which usually does two things: ***it warns you*** with "no non-missing arguments", and it returns `-Inf`. In general, do not ignore warnings, they likely are telling you at least where (if not how) you are munging your data. Either that, or it fails with `object 'Conf_Score' not found`. Either way, they're different. — r2evans, Sep 22 '21 at 16:36
Thank you - appreciate the clear and helpful standards I should be using. — Dinho, Sep 22 '21 at 16:38
FYI, a good discussion of summarizing by group: https://stackoverflow.com/q/11562656/3358272. It isn't the "gospel", so to speak, but it has a lot of good examples, including dplyr-based code, some good examples to imitate as you become more comfortable with dplyr in general and grouping ops specifically. — r2evans, Sep 22 '21 at 16:40
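
To illustrate the `-Inf` described in the comments above, here is a minimal base R sketch. The data frame `df` is a hypothetical stand-in, not the asker's data; it only assumes a column named `CONF_SCORE` in upper case.

```r
# Hypothetical stand-in data frame with an upper-case column name
df <- data.frame(CONF_SCORE = c(3, 2, 1))

# Column names are case-sensitive, so `$` finds nothing and returns NULL
missing_col <- df$Conf_Score
is.null(missing_col)

# as.numeric(NULL) is an empty numeric vector; max() of it warns
# ("no non-missing arguments to max") and returns -Inf.
# The warning is suppressed here only to keep the output tidy.
result <- suppressWarnings(max(as.numeric(missing_col)))
result
```

Running `max(as.numeric(df$Conf_Score))` without `suppressWarnings()` shows the warning itself, which is usually the first hint that a column name is misspelled.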

Dubukay · Accepted Answer · 2021-09-22T16:37:19.567

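As r2evans mentions, using the `$` notation to reference `new_dataset` inside `mutate()` pulls the whole ungrouped column, so the `group_by` grouping is ignored. Refer to the column by its bare name instead (and note that the column is `CONF_SCORE`, not `Conf_Score`). This should work: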

new_dataset_final <- new_dataset1 %>%
  group_by(PARTYNUM) %>%
  mutate(MAX = max(as.numeric(CONF_SCORE)))
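
For anyone who wants to run this end to end, here is a self-contained sketch. It assumes `new_dataset1` matches the table in the question; the `data.frame()` construction and the trailing `ungroup()` are additions for illustration, not part of the original answer.

```r
library(dplyr)

# Rebuild the question's table (assumed to match the asker's data)
new_dataset1 <- data.frame(
  PARTY_ID      = rep(1:3, each = 3),
  PARTYNUM      = rep(c("ABC", "DEF", "GHI"), each = 3),
  WEIGHTED_CONF = c("HIGH", "HIGH", "MEDIUM",
                    "LOW", "MEDIUM", "HIGH",
                    "PERFECT", "HIGH", "HIGH"),
  CONF_SCORE    = c(3, 3, 2, 1, 2, 3, 4, 3, 3)
)

new_dataset_final <- new_dataset1 %>%
  group_by(PARTYNUM) %>%
  # Bare column name, so max() is evaluated within each PARTYNUM group
  mutate(MAX = max(as.numeric(CONF_SCORE))) %>%
  # Drop the grouping so later verbs are not silently applied per group
  ungroup()

new_dataset_final$MAX
```

The `MAX` column should come out as 3 for `ABC` and `DEF` and 4 for `GHI`, matching the desired output in the question.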

edited Sep 22 '21 at 16:37

answered Sep 22 '21 at 16:34

Dubukay

@r2evans corrected, nice catch – Dubukay Sep 22 '21 at 16:38
Thank you! Output is corrected. – Dinho Sep 22 '21 at 16:39

score 1 · Answer 2 · answered Sep 22 '21 at 17:39

In base R, we can get the maximum per group (one row per `PARTYNUM`) with aggregate:

aggregate(CONF_SCORE ~ PARTYNUM, data = new_dataset1, FUN = max)

Or, to add the group maximum as a new column while keeping every row, use ave:

new_dataset1$MAX <- with(new_dataset1, ave(CONF_SCORE, PARTYNUM, FUN = max))
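
The difference between the two base R calls is easier to see side by side. This is a sketch that assumes `new_dataset1` matches the table in the question; the name `by_group` is hypothetical.

```r
# Rebuild the question's table (assumed to match the asker's data)
new_dataset1 <- data.frame(
  PARTY_ID      = rep(1:3, each = 3),
  PARTYNUM      = rep(c("ABC", "DEF", "GHI"), each = 3),
  WEIGHTED_CONF = c("HIGH", "HIGH", "MEDIUM",
                    "LOW", "MEDIUM", "HIGH",
                    "PERFECT", "HIGH", "HIGH"),
  CONF_SCORE    = c(3, 3, 2, 1, 2, 3, 4, 3, 3)
)

# aggregate() collapses the data to one row per group
by_group <- aggregate(CONF_SCORE ~ PARTYNUM, data = new_dataset1, FUN = max)
nrow(by_group)

# ave() returns a vector as long as the input, with each group's
# maximum repeated on every row of that group
new_dataset1$MAX <- with(new_dataset1, ave(CONF_SCORE, PARTYNUM, FUN = max))
new_dataset1$MAX
```

Note that `FUN` has to be passed by name in `ave()`, because unnamed arguments after the first are treated as grouping variables.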

akrun
