Comparing values in a data frame based on a column and if values are different concatenating them and deleting the other row in R

Question

I have a dataset which looks something like this:

https://i.stack.imgur.com/5W0HU.png

I want to compare all columns based on 'Agentdepartment'. If values are same, then pick it up as it is, if different then concatenate both the values via a '/', and then delete the lower row. In case of NA, I don't want them to be coalesced,i.e if one columns has Null and other non null, keep the non null value.If both are having NULL then keep NULL. So I want a single row for every department. This is how I want my final dataset to look like:

https://i.stack.imgur.com/8B819.png

I found a solution here: aggregate(.~Agentdepartment, df, function(x) paste0(unique(x), collapse = "/"), na.action = na.pass). But it is aggregating Null values too.

Can anyone please suggest.

Please provide sample data in a copy&paste-able format (e.g. using `dput`) not as an image. Meanwhile, you can check [here](https://stackoverflow.com/questions/28752805/use-dplyr-to-concatenate-a-column) — A. Suliman, Sep 02 '18 at 13:56

score 0 · Answer 1 · answered Sep 03 '18 at 07:00

There should be a more efficient way, but this works.

Agent <- read.csv("Agent.csv")
Agent
  AgentDepartment AgentId SignDate Experience
1               A     101 30-01-20          1
2               A     102 30-01-20          2
3               B     103 25-06-20          1
4               B     103 20-05-20          4
5               C     104 19-08-20          5
6               C     105     <NA>          5
7               D      NA     <NA>          6

Agent$SignDate <- as.Date(Agent$SignDate)


Agent_fixed <- Agent %>% group_by(AgentDepartment) %>% summarise( AgentId  = toString(unique(AgentId)),
                                                   SignDate = toString(na.omit(unique(SignDate))),
                                                   Experience = toString(na.omit(unique(Experience))))
Agent_fixed$SignDate[Agent_fixed$SignDate == ""] <- NA
Agent_fixed$AgentId <- gsub(", ", "/", Agent_fixed$AgentId)
Agent_fixed$SignDate <- gsub(", ", "/", Agent_fixed$SignDate)
Agent_fixed$Experience <- gsub(", ", "/", Agent_fixed$Experience)

Agent_fixed
  AgentDepartment AgentId SignDate          Experience   
1 A               101/102 30-01-20          1/2       
2 B               103     25-06-20/20-05-20 1/4       
3 C               104/105 19-08-20          5         
4 D               NA      NA                6

Comparing values in a data frame based on a column and if values are different concatenating them and deleting the other row in R

1 Answers1