How can I efficiently extract repeated elements in a Ruby array?

Question

I have an array like [1,1,1,2,4,6,3,3] and I would like to get the list of repeated elements, in this case [1,3]. I wrote this:

my_array.select{|obj|my_array.count(obj)>1}.uniq

But it is tragically inefficient (o(n²)). Do you have a better idea? If possible concise.

Thanks

score 9 · Accepted Answer · answered Apr 24 '09 at 18:49

9

Inspired by Ilya Haykinson's answer:

def repeated(array)
  counts = Hash.new(0)
  array.each{|val|counts[val]+=1}
  counts.reject{|val,count|count==1}.keys
end

answered Apr 24 '09 at 18:49

MiniQuark

46,633
36
147
183

Yeah, I think that's cleaner than mine. Just for fun, here's that method all on one line, assuming availability of the "tap" method from Ruby >= 1.8.7. array.inject(Hash.new(0)){|counts,val|counts.tap{|c|c[val]+=1}}.reject{|val,count|count==1}.keys I think yours is more readable, though. :) – Greg Campbell Apr 24 '09 at 19:16
1

I really, really like this solution, and I like it because it is the most readable/understandable one among all the O(n) solutions. Here's a one-liner modification, just for fun: `array.inject(Hash.new(0)) { |h, i| h[i] += 1; h }.reject { |v, c| c == 1 }.keys` – Marek Příhoda Dec 10 '11 at 20:14
Thanks! Amazing... I was suffering with `detect`, `find_all`, etc – rapcal Dec 18 '15 at 00:48
This is ok but anyone who thinks it's the best answer needs to get familiar with `Set.new`. It uses a hash under hood and is great when you need the O(1) hash key access but with the simplicity of an array. Plus it aides readability as all the logic shrinks to the beautifully obvious `dups.add(val) if seen_already.include?(val)` – Adamantish Jan 10 '19 at 11:59

score 6 · Answer 2 · answered Apr 24 '09 at 18:09

Using Ruby's Set library:

require 'set'

ary = [1,1,1,2,4,6,3,3]
dups = Set.new
test_set = Set.new
ary.each {|val| dups.add(val) unless test_set.add?(val)}
dups.to_a # [1, 3]

I believe this should be O(n), because Set#add and Set#add? are constant-time operations, as far as I know.

score 4 · Answer 3 · edited Apr 24 '09 at 18:37

4

How about something like this? It will run in O(n).

a = [1,1,1,2,4,6,3,3]
b = {}
a.each { |v| if b.has_key? v then b[v] = b[v]+1 else b[v]=1 end }
b.reject { |k,v| if v > 1 then false else true end }.keys

edited Apr 24 '09 at 18:37

MiniQuark

46,633
36
147
183

answered Apr 24 '09 at 18:34

Ilya Haykinson

569
4
7

2

I like the idea. You could beautify the last line like this: b.reject{|k,v| v==1}.keys – MiniQuark Apr 24 '09 at 18:40
3

Also, you could use b=Hash.new(0), and then you would have a simpler 3rd line: a.each{|v|b[v]+=1} – MiniQuark Apr 24 '09 at 18:45

tokland · Answer 4 · 2011-12-11T22:18:57.500

3

A O(n) solution (change << x to + [x] and update to merge to make it purely functional):

rs = xs.inject([[], {}]) do |(out, seen), x| 
  [(seen[x] == 1 ? (out << x) : out), seen.update(x => (seen[x] || 0)+1)]
end[0]

A much simpler yet less space-efficient approach:

rs = xs.group_by { |x| x }.select { |y, ys| ys.size > 1 }.keys

The same idea avoiding the intermediate hash using a "list-comprehension":

rs = xs.group_by { |x| x }.map { |y, ys| y if ys.size > 1 }.compact

edited Dec 11 '11 at 22:18

answered Dec 10 '11 at 19:00

tokland

66,169
13
144
170

1

There's a problem with this solution. See `xs = [1,1,1]`. – Jan Dec 10 '11 at 19:10
Wouldn't group_by be a better fit? – Andrew Grimm Dec 11 '11 at 09:44
@Andrew. I thought there was already a solution using group_by, but it seems it was in the other question. I'll add it. Now that Ruby has ordered hashes we can preserve the order of the original enumerable. However, it's less space efficient than a custom solution. – tokland Dec 11 '11 at 10:04

score 1 · Answer 5 · edited Jul 19 '13 at 19:37

1

Using inject

[1,1,1,2,4,6,3,3].inject({}){ |ele, n| ele[n] = nil; ele }.keys 
# => [1, 2, 4, 6, 3]

EXPLANATION:

ele hash it's initialled to {}, each iteration a key with the number n and nil value is added to the ele hash. At the end ele is returned as:

{1=>nil, 2=>nil, 4=>nil, 6=>nil, 3=>nil}

We only want the keys, so .keys ends the job.

edited Jul 19 '13 at 19:37

Diosney

10,520
15
66
111

answered Jul 19 '13 at 18:59

ivanxuu

842
9
10

Thanks but I only wanted *repeated* elements, as indicated in the example. – MiniQuark Jul 26 '13 at 19:42

marekj · Answer 6 · 2009-12-18T06:20:15.500

I was thinking of counting how many times a unique element appears in array. It may be really inefficient just like the original suggestion but it was fun looking at the problem. I didn't do any benchmarks on larger arrays so this is just an excercise.

a = [1,1,1,2,4,6,3,3]

dupes = []
a.uniq.each do |u|
  c = a.find_all {|e| e == u}.size
  dupes << [u, c] unless c == 1
end

puts dupes.inspect

# dupes = [[1, 3], [3, 2]]
# 1 appears 3 times
# 3 appears twice


# to extract just the elment a bit cleaner
dupes = a.uniq.select do |u|
  a.find_all {|e| e == u}.size != 1
end
puts dupes.inspect
# returns [1,3]

score 0 · Answer 7 · answered Dec 18 '09 at 06:45

This will work if the duplicated entries are always consecutive, as in your example; otherwise you would have to sort first. each_cons examines a rolling window of the specified size.

require 'set'

my_array = [1,1,1,2,4,6,3,3]
dups = Set.new
my_array.each_cons(2) {|a,b| dups.add(a) if (a == b)}
p dups.to_a

score 0 · Answer 8 · answered Apr 24 '09 at 17:45

Some ideas: you'd have to figure out the correct library data structures:

1 Sort the array O(nlogn), then run through the array

2 Create a set, search for the current array element in the set and if not found, insert and proceed for all the elements -- O(nlogn) again.

How can I efficiently extract repeated elements in a Ruby array?

8 Answers8

EXPLANATION:

Linked