Say I have a data frame, df, with three columns:
  colours   individual value
1   white individual 1   0.4
2   white individual 1   0.7
3   black individual 2   1.1
4   black individual 3   0.5
Sometimes the same person shows up multiple times for the same colour but different values. I would like to write some code that would delete all of the instances in which this happens.
***EDIT: There are many more rows than 4 - millions - I don't think the current solutions work.
I would like to count how many times the string I am currently on in my for loop comes up, and then delete those rows from the data.frame. In the example above, I would like to get rid of individual 1, so that df keeps only the other two rows.
So far my approach was this:
Get a list of all the colours
Get a list of all the individuals
Write two for loops.
colours <- unique(df$colours)
ind <- unique(df$individual)
for (i in ind) {
  for (c in colours) {
    # something here. Probably an if, asking whether the person I'm on in the loop
    # is found with the colour I am on more than once; if so, get rid of them
  }
}
My expected output is this:
colours   individual value
  black individual 2   1.1
  black individual 3   0.5
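The counting idea described above can probably be expressed without explicit loops. The following is only a minimal sketch in base R, assuming the column names from the example; `pair_count` and `result` are names made up for illustration. It uses `ave()` to attach to every row the number of rows that share its colour/individual pair, then keeps the pairs that occur once. I have not timed it on millions of rows.

```r
# Example data copied from the question
df <- data.frame(
  colours    = c("white", "white", "black", "black"),
  individual = c("individual 1", "individual 1", "individual 2", "individual 3"),
  value      = c(0.4, 0.7, 1.1, 0.5)
)

# For every row, count how many rows share its colour/individual pair.
# ave() applies FUN within each group and returns a vector aligned with the rows.
pair_count <- ave(seq_len(nrow(df)), df$colours, df$individual, FUN = length)

# Keep only the rows whose colour/individual pair occurs exactly once
# (here: the original rows 3 and 4)
result <- df[pair_count == 1, ]
```

Because `pair_count` lines up with the rows of df, it could also be kept as an extra column if the counts themselves are of interest.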
Source data
df <- data.frame(colours = c("white", "white", "black", "black"),
                 individual = c("individual 1", "individual 1", "individual 2", "individual 3"),
                 value = c(0.4, 0.7, 1.1, 0.5))
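With that data, one vectorised alternative to the two nested loops might look like the sketch below. It is an assumption on my part that this fits the real data: it relies on base R's `duplicated()`, called once from the front and once with `fromLast = TRUE`, so that every occurrence of a repeated colour/individual pair is flagged rather than only the second and later ones. The names `key`, `is_repeated` and `result` are illustrative only.

```r
# Example data copied from the question
df <- data.frame(
  colours    = c("white", "white", "black", "black"),
  individual = c("individual 1", "individual 1", "individual 2", "individual 3"),
  value      = c(0.4, 0.7, 1.1, 0.5)
)

# Only these two columns define what counts as "the same person with the same colour"
key <- df[, c("colours", "individual")]

# duplicated() alone skips the first occurrence of each repeated pair;
# combining it with fromLast = TRUE marks the first occurrence as well
is_repeated <- duplicated(key) | duplicated(key, fromLast = TRUE)

# Drop every row that belongs to a repeated pair
# (here: the original rows 3 and 4 remain)
result <- df[!is_repeated, ]
```

This makes a single pass over the rows instead of one pass per individual and colour, which is why it may scale better than the loops, though that would need checking on the full data.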