I have a data frame with several columns. One of them is of character type, and I want to know whether any element occurs at least twice in that column. If so, I want to get the element and its number of occurrences.

For instance, c("Test","Hi","Hello","Hi") should give me ("Hi", 2)

At first I tried count, but since it doesn't work for characters, I tried to find a solution with str_count.

I tried the following solution, which works:

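library(stringr)  # provides str_count()

test <- c("Test","Hi","Hello","Hi")
res <- c()
for (i in unique(test)) {
  n <- sum(str_count(test, i))  # total matches of i across the whole vector
  if (n >= 2) {
    res <- cbind(res, c(i, n))  # one column per repeated value: (value, count)
  }
}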

res
     [,1]
[1,] "Hi"
[2,] "2" 

However, the data I actually need to run this on is quite large. Since this solution rescans the whole vector for every unique value, it is far from optimal, and I'm unhappy with the execution time.

Do you have any advice on how to improve this code, or a different approach to try?
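
One possible direction, sketched here under the assumption that exact whole-string matches are what is wanted: base R's table() tallies every distinct value in a single pass, so the explicit loop can be dropped. Note that str_count treats its pattern as a regular expression and counts matches inside longer strings, so the two approaches can differ on data where one value is a substring of another. The names counts, dupes, and res below are illustrative only.

```r
# Sample data from the question; for a data frame, pass the column instead,
# e.g. table(df$col), where df and col are hypothetical names.
test <- c("Test", "Hi", "Hello", "Hi")

# table() counts the occurrences of each distinct value in one pass.
counts <- table(test)

# Keep only the values that occur at least twice.
dupes <- counts[counts >= 2]

# One row per repeated value, with its count stored as an integer.
res <- data.frame(
  element = names(dupes),
  n = as.integer(dupes),
  stringsAsFactors = FALSE
)
print(res)
```

For the sample vector this should give a single row, Hi with a count of 2. Unlike the cbind version, the count stays numeric instead of being coerced to a string.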

MBB
