13

I'm just starting with Ruby, and I find the following to be a violation of the "principle of least surprise": according to the documentation, uniq! "removes duplicate elements from self. Returns nil if no changes are made (that is, no duplicates are found)."

Can anybody explain this behavior, which seems completely counter-intuitive to me? It means that rather than being able to write a single line of code by appending .uniq! to the end of the first line below, I instead have to write the following two lines:

  hooks = IO.read(wt_hooks_impl_file).scan(/wt_rt_00\w{2}/)
  hooks = hooks.uniq

Or am I missing something, a better way?

EDIT:

I understand that uniq! modifies its operand. Here's the problem illustrated better I hope:

  hooks = IO.read(wt_hooks_impl_file).scan(/wt_rt_00\w{2}/)
  puts hooks.length #50
  puts hooks.uniq!.length #undefined method `length' for nil:NilClass

I contend that the way uniq! works makes it completely senseless and useless. Sure, in my case, as pointed out, I could just append .uniq to the first line. However, later in the same program I am pushing elements onto another array inside a loop. Then, after the loop, I'd like to "de-dupe" the array, but I dare not write 'hooks_tested.uniq!' because it could return nil; instead I must write hooks_tested = hooks_tested.uniq

Indeed, I contend this is a particularly egregious mis-feature, in that it is a well-known principle that a method that returns an array should always return at least an empty array rather than nil.

dsolimano
Dexygen
  • 3
    The POLS has a specific meaning with Ruby: "Does not surprise Matz" (Ruby's creator). – Wayne Conrad Jan 20 '10 at 15:22
  • Useless for the (incorrect) purpose to which you're trying to put it, agreed. Handy if you're trying to do something that - for example - tests the result of `uniq!`. That would probably be what Matz had in mind. ;-) – Mike Woodhouse Jan 20 '10 at 17:26
  • 1
    I agree. This is very PHP-esque (random, incongruous exceptional cases). All other cases, including `uniq`, return the original one. When `uniq` finds no duplicates, it returns the array itself. Seriously, wtf. – ahnbizcad Sep 07 '16 at 21:45
  • @MikeWoodhouse the method for that purpose should be `uniq?` not `uniq!` – ahnbizcad Sep 07 '16 at 21:46

5 Answers

11

This is because uniq! modifies self, and if uniq! always returned the array, you wouldn't be able to tell whether a change actually occurred in the original object. Returning nil when nothing was removed lets you test for that:

var = %w(green green yellow)
if var.uniq!
  # the array contained duplicate entries
else
  # nothing changed
end
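
To make the two possible return values concrete, here is a small sketch. The array names are made up for illustration only; the behavior shown is what the documentation quoted in the question describes.

```ruby
# Hypothetical sample arrays, one with duplicates and one without.
with_dupes = %w(green green yellow)
no_dupes   = %w(green yellow)

changed   = with_dupes.uniq!  # duplicates removed in place
unchanged = no_dupes.uniq!    # nothing to remove

p changed.equal?(with_dupes)  # => true (uniq! handed back the receiver itself)
p with_dupes                  # => ["green", "yellow"]
p unchanged                   # => nil
p no_dupes                    # => ["green", "yellow"] (left untouched)
```

Either way the receiver ends up de-duplicated; only the return value differs.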

In your code you can simply write

hooks = IO.read(wt_hooks_impl_file).scan(/wt_rt_00\w{2}/)
hooks.uniq!
# here hooks is already changed

If you need to return the value of hooks, perhaps because it's the last statement of the method, just do

def method
  hooks = IO.read(wt_hooks_impl_file).scan(/wt_rt_00\w{2}/)
  hooks.uniq
end

or otherwise

def method
  hooks = IO.read(wt_hooks_impl_file).scan(/wt_rt_00\w{2}/)
  hooks.uniq!
  hooks
end
Simone Carletti
  • The return values from "exclamation" methods are often arbitrary like this because they modify the object in-place. What's unfortunate about this case is some of the semantical mess it creates, as illustrated by your example: if uniq! then...actually not unique. – tadman Jan 20 '10 at 15:53
5

The exclamation point on uniq! indicates that it modifies the array instead of returning a new one. You should do this:

hooks = IO.read(wt_hooks_impl_file).scan(/wt_rt_00\w{2}/).uniq

or this

hooks = IO.read(wt_hooks_impl_file).scan(/wt_rt_00\w{2}/)
hooks.uniq!
puts hooks.length
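
As a rough check that the second form is safe even when nothing is removed, here is a sketch that uses a made-up string in place of the file contents (the `source` value is an assumption, deliberately free of duplicates):

```ruby
# Hypothetical stand-in for IO.read(wt_hooks_impl_file); no duplicates on purpose.
source = "wt_rt_00ab wt_rt_00cd wt_rt_00ef"

hooks  = source.scan(/wt_rt_00\w{2}/)
result = hooks.uniq!   # nil here, because there was nothing to remove

p result               # => nil
puts hooks.length      # prints 3; hooks itself is still an array
```

The nil only matters if you call something on the return value of uniq!; calling it as a statement of its own and then using hooks avoids the problem.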
Mike Woodhouse
mckeed
  • 1
    Typically, if you use the bang-form of a method it will be the only method acting on that particular object (i.e., not in a chain). This is in part to eliminate ambiguous statements such as `hooks = hooks.uniq!.sort` or `hooks.uniq!.sort!` (where you can have what amounts to multiple assignments to the same variable in the same expression) or assignments to intermediate, temporary values such as `hooks.uniq.sort!`. `Array#uniq!` also returns `nil` instead of the unchanged array when nothing was removed, so you can do things like `if (hooks.uniq!)` to modify the array and perform a special action if anything was changed. – bta Jan 20 '10 at 17:27
  • Your second suggestion results in "undefined method `length' for nil:NilClass" if hooks contains no duplicates -- this is precisely the unexpected behavior I am trying to draw attention to, but apparently it is falling on illogical -- er, deaf -- ears – Dexygen Jan 20 '10 at 17:37
  • Okay, sorry. If the second example doesn't work, there is something else going on here. What version of Ruby are you using? – mckeed Jan 20 '10 at 18:01
3

Since Ruby 1.9, Object#tap is available:

hooks = IO.read(wt_hooks_impl_file).scan(/wt_rt_00\w{2}/).tap do |hooks|
  hooks.uniq!
end
puts hooks.length

And perhaps more succinctly (h/t @Aetherus):

hooks = IO.read(wt_hooks_impl_file).scan(/wt_rt_00\w{2}/).tap(&:uniq!)
puts hooks.length
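
A self-contained sketch of why this works: tap always returns its receiver and ignores whatever the block returns, so the nil from uniq! never reaches the assignment. The sample string and arrays below are made up for illustration.

```ruby
# Hypothetical stand-in for IO.read(wt_hooks_impl_file), with one duplicate.
source = "wt_rt_00ab wt_rt_00ab wt_rt_00cd"

hooks = source.scan(/wt_rt_00\w{2}/).tap(&:uniq!)
p hooks            # => ["wt_rt_00ab", "wt_rt_00cd"]

# No duplicates: uniq! returns nil inside the block, tap still returns the array.
already_unique = %w(a b).tap(&:uniq!)
p already_unique   # => ["a", "b"]
```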
mwp
2

You can append uniq (no exclamation mark at the end) to the end of the first line.

Or, if you insist on using uniq!, use

(hooks = IO.read(wt_hooks_impl_file).scan(/wt_rt_00\w{2}/)).uniq!
Mladen Jablanović
2

This is not an answer to why, but rather, a workaround.

Since uniq doesn't return nil, I use uniq and assign the result to a new variable instead of using the bang version.

original = [1,2,3,4]
new = original.uniq

#=> new is [1,2,3,4]
#=> ... rather than nil

Having a new variable is a small price to pay. It sure as hell beats doing if checks, with repeated complex calls to uniq! and uniq and checking for nil.
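
For comparison, a short sketch of what the non-bang version does with and without duplicates (the variable names here are only illustrative):

```ruby
# Hypothetical sample data.
original = [1, 2, 2, 3]
deduped  = original.uniq

p deduped                    # => [1, 2, 3]
p original                   # => [1, 2, 2, 3] (the receiver is not modified)

already_unique = [1, 2, 3]
copy = already_unique.uniq

p copy                       # => [1, 2, 3]
p copy.equal?(already_unique)  # => false (a new array, never nil)
```

The cost is one extra array; in exchange the return value is always an array, so it is safe to chain.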

ahnbizcad