6

I have an array like [1,1,1,2,4,6,3,3] and I would like to get the list of repeated elements, in this case [1,3]. I wrote this:

my_array.select{|obj|my_array.count(obj)>1}.uniq

But it is tragically inefficient (o(n²)). Do you have a better idea? If possible concise.

Thanks

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
MiniQuark
  • 46,633
  • 36
  • 147
  • 183

8 Answers8

9

Inspired by Ilya Haykinson's answer:

def repeated(array)
  counts = Hash.new(0)
  array.each{|val|counts[val]+=1}
  counts.reject{|val,count|count==1}.keys
end
MiniQuark
  • 46,633
  • 36
  • 147
  • 183
  • Yeah, I think that's cleaner than mine. Just for fun, here's that method all on one line, assuming availability of the "tap" method from Ruby >= 1.8.7. array.inject(Hash.new(0)){|counts,val|counts.tap{|c|c[val]+=1}}.reject{|val,count|count==1}.keys I think yours is more readable, though. :) – Greg Campbell Apr 24 '09 at 19:16
  • 1
    I really, really like this solution, and I like it because it is the most readable/understandable one among all the O(n) solutions. Here's a one-liner modification, just for fun: `array.inject(Hash.new(0)) { |h, i| h[i] += 1; h }.reject { |v, c| c == 1 }.keys` – Marek Příhoda Dec 10 '11 at 20:14
  • Thanks! Amazing... I was suffering with `detect`, `find_all`, etc – rapcal Dec 18 '15 at 00:48
  • This is ok but anyone who thinks it's the best answer needs to get familiar with `Set.new`. It uses a hash under hood and is great when you need the O(1) hash key access but with the simplicity of an array. Plus it aides readability as all the logic shrinks to the beautifully obvious `dups.add(val) if seen_already.include?(val)` – Adamantish Jan 10 '19 at 11:59
6

Using Ruby's Set library:

require 'set'

ary = [1,1,1,2,4,6,3,3]
dups = Set.new
test_set = Set.new
ary.each {|val| dups.add(val) unless test_set.add?(val)}
dups.to_a # [1, 3]

I believe this should be O(n), because Set#add and Set#add? are constant-time operations, as far as I know.

Greg Campbell
  • 15,182
  • 3
  • 44
  • 45
4

How about something like this? It will run in O(n).

a = [1,1,1,2,4,6,3,3]
b = {}
a.each { |v| if b.has_key? v then b[v] = b[v]+1 else b[v]=1 end }
b.reject { |k,v| if v > 1 then false else true end }.keys
MiniQuark
  • 46,633
  • 36
  • 147
  • 183
Ilya Haykinson
  • 569
  • 4
  • 7
3

A O(n) solution (change << x to + [x] and update to merge to make it purely functional):

rs = xs.inject([[], {}]) do |(out, seen), x| 
  [(seen[x] == 1 ? (out << x) : out), seen.update(x => (seen[x] || 0)+1)]
end[0]

A much simpler yet less space-efficient approach:

rs = xs.group_by { |x| x }.select { |y, ys| ys.size > 1 }.keys

The same idea avoiding the intermediate hash using a "list-comprehension":

rs = xs.group_by { |x| x }.map { |y, ys| y if ys.size > 1 }.compact
tokland
  • 66,169
  • 13
  • 144
  • 170
  • 1
    There's a problem with this solution. See `xs = [1,1,1]`. – Jan Dec 10 '11 at 19:10
  • Wouldn't group_by be a better fit? – Andrew Grimm Dec 11 '11 at 09:44
  • @Andrew. I thought there was already a solution using group_by, but it seems it was in the other question. I'll add it. Now that Ruby has ordered hashes we can preserve the order of the original enumerable. However, it's less space efficient than a custom solution. – tokland Dec 11 '11 at 10:04
1

Using inject

[1,1,1,2,4,6,3,3].inject({}){ |ele, n| ele[n] = nil; ele }.keys 
# => [1, 2, 4, 6, 3] 

EXPLANATION:

ele hash it's initialled to {}, each iteration a key with the number n and nil value is added to the ele hash. At the end ele is returned as:

{1=>nil, 2=>nil, 4=>nil, 6=>nil, 3=>nil}

We only want the keys, so .keys ends the job.

Diosney
  • 10,520
  • 15
  • 66
  • 111
ivanxuu
  • 842
  • 9
  • 10
0

I was thinking of counting how many times a unique element appears in array. It may be really inefficient just like the original suggestion but it was fun looking at the problem. I didn't do any benchmarks on larger arrays so this is just an excercise.

a = [1,1,1,2,4,6,3,3]

dupes = []
a.uniq.each do |u|
  c = a.find_all {|e| e == u}.size
  dupes << [u, c] unless c == 1
end

puts dupes.inspect

# dupes = [[1, 3], [3, 2]]
# 1 appears 3 times
# 3 appears twice


# to extract just the elment a bit cleaner
dupes = a.uniq.select do |u|
  a.find_all {|e| e == u}.size != 1
end
puts dupes.inspect
# returns [1,3]
marekj
  • 1,235
  • 2
  • 10
  • 11
0

This will work if the duplicated entries are always consecutive, as in your example; otherwise you would have to sort first. each_cons examines a rolling window of the specified size.

require 'set'

my_array = [1,1,1,2,4,6,3,3]
dups = Set.new
my_array.each_cons(2) {|a,b| dups.add(a) if (a == b)}
p dups.to_a
Justin Love
  • 4,397
  • 25
  • 36
0

Some ideas: you'd have to figure out the correct library data structures:

1 Sort the array O(nlogn), then run through the array

2 Create a set, search for the current array element in the set and if not found, insert and proceed for all the elements -- O(nlogn) again.

dirkgently
  • 108,024
  • 16
  • 131
  • 187