I have a data frame that looks like this:
     Chrom       Pos Ref Alt sample_id cluster_id cellular_prevalence
1    chr11  70176412   C   G  SRC125_1          0              0.5389
8    chr12  10370686   G   A  SRC125_1          0              0.5389
15   chr12  40892074   T   A  SRC125_1          0              0.5389
22   chr12  53663629   G   T  SRC125_1          0              0.5389
29   chr13 103387098   C   T  SRC125_1          0              0.5389
36   chr13  24334244   G   T  SRC125_1          0              0.5389
....
     Chrom       Pos Ref Alt sample_id cluster_id cellular_prevalence
1086 chr3   12531337   G   C  SRC125_1          6              0.2675
1093 chr3   12531455   G   C  SRC125_1          6              0.2675
1100 chr3   12531462   G   A  SRC125_1          6              0.2675
1107 chr5  178460018   T   A  SRC125_1          6              0.2675
1114 chr5  180048230   C   T  SRC125_1          6              0.2675
The distinct cluster IDs (8 clusters in total):
unique(my_data$cluster_id)
0 1 2 3 4 5 6 7
I want to remove every cluster that has only one mutation per sample_id, and then renumber the remaining clusters so that the IDs stay consecutive. For example, in my dataset cluster 2 has only one mutation per sample_id. After removing it, I want to rename the clusters that follow: cluster 3 becomes cluster 2, cluster 4 becomes cluster 3, cluster 5 becomes cluster 4, and so on.
How can I do it in R?
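To make the goal concrete, here is a minimal base R sketch of the result I am after. It uses a small made-up data frame in place of my_data (only the sample_id and cluster_id columns, with invented values). It also assumes that "one mutation" means one row per sample_id/cluster_id combination, and that the renumbering is done across the whole data frame rather than separately for each sample.

```r
# Toy stand-in for my_data; the values are invented for illustration only
toy <- data.frame(
  sample_id  = rep("SRC125_1", 7),
  cluster_id = c(0, 0, 1, 1, 2, 3, 3),
  stringsAsFactors = FALSE
)

# Number of mutations (rows) in each sample_id / cluster_id combination,
# repeated on every row of that combination
n_mut <- ave(seq_len(nrow(toy)), toy$sample_id, toy$cluster_id, FUN = length)

# Drop clusters that contain a single mutation within a sample
filtered <- toy[n_mut > 1, ]

# Renumber the remaining cluster IDs consecutively from 0, keeping their order
# (assumption: renumbering is global, not per sample)
old_ids <- sort(unique(filtered$cluster_id))
filtered$cluster_id <- match(filtered$cluster_id, old_ids) - 1

print(filtered)
```

In this toy example cluster 2 has a single row, so it is dropped and the old cluster 3 becomes cluster 2. If the renumbering should instead restart within each sample_id, the match() step would need to be applied per sample.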