Let's say I have a large array, @stuff, and a $thing, and I want to know if $thing is in @stuff. What's the best way to do that in Perl 6? And with “best” I mean: idiomatic, readable, performant; not necessarily in that order.

There are actually two separate cases. One is where you have to do a lot of checks for different $things, the other is where you only do this once or a few times.

Let's look at the first case first. I think I know the (or a) right answer.

my $set-of-stuff = set @stuff;
for @whatever -> $thing {
    do-something-with($thing) if $thing ∈ $set-of-stuff;
}

You can actually skip the first line and simply say ... if $thing ∈ @stuff, but that will almost certainly perform much worse, since a new set is built from @stuff on every check.
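To make the first case concrete, here is a minimal runnable sketch. The sample data and the @found array are made up for illustration; only the pattern of building the set once comes from the code above.

```raku
# Hypothetical sample data, for illustration only.
my @stuff    = <apple banana cherry>;
my @whatever = <banana durian apple>;

# Build the Set once, outside the loop, so each check is a plain lookup.
my $set-of-stuff = set @stuff;

my @found;
for @whatever -> $thing {
    @found.push($thing) if $thing ∈ $set-of-stuff;
}
say @found;    # [banana apple]
```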

Now the second case: I only have one $thing to check. The above solution works, of course, but creating the set just to check it once seems like a lot of overhead. The shortcut

do-something-with($thing) if $thing ∈ @stuff;

makes a little more sense here, since we only call it once. But still, we have to create a set for one use.

A bit more traditional is:

do-something-with($thing) if @stuff.grep($thing);

Or potentially faster:

do-something-with($thing) if @stuff.first($thing);

But this seems less idiomatic, and certainly the second one is less readable than $thing ∈ @stuff.
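One caveat with the `.first` form, sketched below with made-up data: `.first` returns the matching element itself, so a match that happens to be falsy (such as 0) looks like "not found" in a boolean test. Asking for the index with `:k` and testing `.defined` is one way around that, as far as I can tell.

```raku
my @stuff = 0, 1, 2, 3;   # hypothetical data

# .first returns the element that matched, here the falsy value 0.
say so @stuff.first(0);             # False, although 0 is in @stuff

# With :k the index comes back instead (Nil when nothing matches).
say @stuff.first(0, :k).defined;    # True
say @stuff.first(42, :k).defined;   # False

# .grep returns a Seq, which is true once it yields any element.
say so @stuff.grep(0);              # True
```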

I don't think there's a smart match solution, right? Certainly this doesn't work:

do-something-with($thing) if $thing ~~ @stuff;

Any thoughts?

mscha

1 Answer

Depends on what your definition of "best" or "smart" is.

If you're talking about performance, I'm pretty sure

@stuff.first($thing)

is the fastest.

An idiomatic alternative, close to the above solution, would be:

$thing ~~ any @stuff

which has the potential of better wallclock performance due to auto-threading.
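A small sketch of the junction form, with made-up data. It only shows the result of the test and says nothing about performance; `so` is used to force a plain Bool.

```raku
my @stuff = <apple banana cherry>;   # hypothetical data
my $thing = 'banana';

# Smartmatch against a junction of all elements.
say so $thing ~~ any(@stuff);      # True
say so 'durian' ~~ any(@stuff);    # False

# An explicit comparison operator works too; the resulting
# Junction collapses to a Bool in boolean context.
say so $thing eq any(@stuff);      # True
```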

Using sets makes the code look closer to formal logic, but it won't make things faster, because the set needs to be created first (unless maybe it could be created at compile time).

Not sure there is a "best" answer to this one.

Elizabeth Mattijsen
  • Thanks, I should have thought of `any`. However, in my experience, junctions in current Rakudo versions are very slow, so that's probably not a good option (yet). – mscha Jan 20 '17 at 14:43
  • And what's “best”? Well, that depends on the situation, I guess. Ideally, best performance, idiomaticity (sic?) and readability, but in practice you have to compromise. You may sacrifice some performance for readability, for instance, but not a lot. – mscha Jan 20 '17 at 14:46
  • Actually `?@stuff.grep($thing)` should be just as fast as `@stuff.first($thing)` ( because it stops after it finds something that matches ) – Brad Gilbert Jan 20 '17 at 16:02
  • The `.grep` still has to set up an iterator and a Seq with it, to only have it torn down by the `.Bool` at the first pull. Whereas a `.first` only ever returns a scalar value (which could be `Nil`). One could argue the optimizer should be smart enough to see the `.grep.Bool` combo and replace it by a `.first.Bool` combo, but we're not there just yet. – Elizabeth Mattijsen Jan 20 '17 at 17:34
  • "But only if you're there's only one $thing in @stuff." Surely `first` will work fine even if there are two or more. (If I'm right then I think you can delete that caveat.) – raiph Jan 20 '17 at 18:11
  • Hi, could you please extend this to __get indices and/or number of elements__? It is a FAQ, as is [the same on strings](https://stackoverflow.com/questions/60853431). – Tinmarino Mar 25 '20 at 16:50
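
On the follow-up about indices and counts: a minimal sketch, assuming the `:k` adverb on `.first` and `.grep` is what is wanted. The data is made up for illustration.

```raku
my @stuff = <a b a c a>;   # hypothetical data

say @stuff.first('a', :k);     # 0        index of the first match
say @stuff.grep('a', :k);      # (0 2 4)  indices of all matches
say @stuff.grep('a').elems;    # 3        number of matches
```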