R: Counting the number of unique values for each ID

Question

I am trying to create a frequency table returning the number of unique diagnoses per ID but some IDs were assigned the same diagnosis code multiple times. Here is how my data looks like:

EncounterID  Diagnosis
    1            a
    1            a
    1            c
    2            a
    2            b
    3            e
    3            e 
    3            e

Here is what I would like my frequency table to look like:

EncounterID  Diagnosis
    1            2
    2            2
    3            1

I am relatively new to R, I figured the first part of the code. Here is the code I have tried, but it is not returning the number of unique diagnoses.

Num_Diag_Per_Enct <- data.table(table(diagn$ENCOUNTER_ID))

score 0 · Accepted Answer · answered Nov 17 '20 at 00:42

Try this. You can use group_by() and summarise() from dplyr or aggregate() from base R. Here the code:

library(dplyr)
#Code
newdf <- df %>% group_by(EncounterID) %>% summarise(Diagnosis=n_distinct(Diagnosis))

Output:

# A tibble: 3 x 2
  EncounterID Diagnosis
        <int>     <int>
1           1         2
2           2         2
3           3         1

Or the base R option:

#Code 2
newdf <- aggregate(Diagnosis~EncounterID,data=df,function(x) length(unique(x)))

Output:

  EncounterID Diagnosis
1           1         2
2           2         2
3           3         1

Some data used:

#Data
df <- structure(list(EncounterID = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L
), Diagnosis = c("a", "a", "c", "a", "b", "e", "e", "e")), class = "data.frame", row.names = c(NA, 
-8L))

I decided to go with code 2 and got the result I was expecting. Thanks for the reply! — Ems, Nov 17 '20 at 00:55

R: Counting the number of unique values for each ID

1 Answers1