This is my first post here. I have a large dataset and I am trying to remove duplicate rows based on the value of one variable (ERRaw). When I use the following code, the resulting dataset also drops some cases that had no duplicates in the original, and I don't understand why. I need to keep all singleton cases and remove only the duplicates. Please help!
new_data <- data_with_dups %>%
  group_by(StudentID, District) %>%
  distinct(StudentID, ERRaw, .keep_all = TRUE) %>%
  top_n(1, ERRaw)
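To make the problem reproducible, here is a minimal sketch with made-up data (the four rows below are hypothetical, not taken from the real dataset). It assumes one possible cause: top_n() ranks rows by ERRaw, and a row whose ERRaw is NA gets no rank, so it is filtered out even when it is the only row for that student. Whether the real data contains NA values in ERRaw is an assumption that would need checking. The second pipeline shows one way to keep a single row per StudentID and District that sorts NA values last instead of dropping them.

```r
library(dplyr)

# Hypothetical toy data: student 1 is duplicated, students 2 and 3 are singletons,
# and student 2 has a missing ERRaw (an assumption about the real data).
data_with_dups <- tibble(
  StudentID = c(1, 1, 2, 3),
  District  = c("A", "A", "A", "B"),
  ERRaw     = c(10, 12, NA, 7)
)

# Original approach: the NA singleton (student 2) disappears,
# because top_n() cannot rank a missing ERRaw.
old <- data_with_dups %>%
  group_by(StudentID, District) %>%
  distinct(StudentID, ERRaw, .keep_all = TRUE) %>%
  top_n(1, ERRaw)

# Alternative sketch: sort so the highest ERRaw comes first (arrange() puts NA last),
# then keep the first row for each StudentID/District combination.
new_data <- data_with_dups %>%
  arrange(desc(ERRaw)) %>%
  distinct(StudentID, District, .keep_all = TRUE)

nrow(old)       # 2 rows: student 2 was dropped
nrow(new_data)  # 3 rows: every student kept once
```

Running sum(is.na(data_with_dups$ERRaw)) on the real data would show whether missing values are actually what removes the singletons.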
Thank you!