This is a follow-up to one of my previous questions. The link is here. That question asked how to identify differing values among rows with the same id for a single column. Now suppose I have a dataset called df_dup with multiple columns to check. In other words, I would like to do the same thing for all columns instead of one specific column.
The simulated dataset is here:
hhid<-c("hh001","hh001","hh001","hh002","hh002","hh002","hh003","hh003", "hh004","hh004")
gender<-c("m","m","m","m","m","m","f","f","f","f")
age<-c(12,22,12,11,11,11,18,23,9,9)
waterID<-c("W1","W2","W1","W7","W7","W7","W9","W10","W19","W19")
df_dup<-data.frame(hhid, gender, age, waterID)
hhid gender age waterID
hh001 m 12 W1
hh001 m 22 W2
hh001 m 12 W1
hh002 m 11 W7
hh002 m 11 W7
hh002 m 11 W7
hh003 f 18 W9
hh003 f 23 W10
hh004 f 9 W19
hh004 f 9 W19
...
In this dataset there are two types of duplicates. One is complete duplicates, such as hh002 and hh004: their rows are identical in all columns.
The other is partial duplicates, such as hh001 and hh003: their rows share the same hhid but differ in some columns (here, age and waterID).
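The distinction can be checked by counting the distinct rows within each hhid. Below is a minimal base R sketch that rebuilds the sample data so it runs on its own; the name n_unique_rows is only illustrative.

```r
# Rebuild the sample data so the snippet is self-contained
df_dup <- data.frame(
  hhid    = c("hh001","hh001","hh001","hh002","hh002","hh002","hh003","hh003","hh004","hh004"),
  gender  = c("m","m","m","m","m","m","f","f","f","f"),
  age     = c(12,22,12,11,11,11,18,23,9,9),
  waterID = c("W1","W2","W1","W7","W7","W7","W9","W10","W19","W19")
)

# Split the rows by household and count the distinct rows in each piece
# (n_unique_rows is an illustrative name, not part of the original data)
n_unique_rows <- sapply(split(df_dup, df_dup$hhid), function(g) nrow(unique(g)))
print(n_unique_rows)
```

A count of 1 means every row for that hhid is identical (hh002, hh004); a count above 1 means the rows differ in at least one column (hh001, hh003).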
For this dataset, I want to do the following with R code:
Create a new column (say, duptype) to mark the type of duplicate: if a row belongs to a complete duplicate, then duptype = "cdup"; if it belongs to a partial duplicate, then duptype = "pdup". Later I can then decide which type of duplicate rows to keep in the dataset and which to filter out.
I only have a very rough idea of how to do this, based on the answers to my previous question.
First, I may need to extract all the column names and save them as a vector:
z<-colnames(df_dup)
Then I use a loop to check, within each hhid, whether the values in all columns are the same. If so, mark cdup in the duptype column; otherwise, mark pdup.
for(i in z){
  dfnew<-df_dup%>%
    group_by(hhid)%>%
    mutate(duptype=if_else(any(i!=lag(i), pdup, cdup))))%>%
}
But the code I wrote is clearly far from producing the result I expect. I would appreciate it if anyone could help me achieve this. Thanks a lot!
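For reference, one possible way to get the intended duptype column without looping over column names is sketched below. It assumes dplyr is installed. The names dup_types, n_unique_rows and df_marked are hypothetical helpers, and it also assumes every hhid occurs more than once, as in the sample (a household with a single row would be labelled cdup too).

```r
library(dplyr)

# Rebuild the sample data so the snippet is self-contained
df_dup <- data.frame(
  hhid    = c("hh001","hh001","hh001","hh002","hh002","hh002","hh003","hh003","hh004","hh004"),
  gender  = c("m","m","m","m","m","m","f","f","f","f"),
  age     = c(12,22,12,11,11,11,18,23,9,9),
  waterID = c("W1","W2","W1","W7","W7","W7","W9","W10","W19","W19")
)

# Build one label per hhid (dup_types is a hypothetical helper table)
dup_types <- df_dup %>%
  distinct() %>%                           # drop exact repeats of whole rows
  count(hhid, name = "n_unique_rows") %>%  # distinct rows left per household
  mutate(duptype = if_else(n_unique_rows == 1, "cdup", "pdup")) %>%
  select(hhid, duptype)

# Attach the label back to every original row; row order is preserved
df_marked <- left_join(df_dup, dup_types, by = "hhid")
print(df_marked)
```

With the label in place, something like filter(df_marked, duptype == "pdup") would keep only the partial duplicates, and distinct() on the cdup rows would collapse the complete ones to a single row each.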