I have the following struct:

Name = Struct.new(:first_name, :last_name) do
  def greeting
    "Hello #{first_name}!"
  end
end

I am adding these objects to an array like this:

full_names << Name.new(first_name, last_name)

Now, I'd like to find the N most common first names.

Mark Locklear
  • I feel something like `full_names.uniq.map(&:first_name) { |n| full_names.count(n) }.max` is a first step, but that gives me a block/arg error. – Mark Locklear Oct 23 '19 at 15:33

5 Answers

You can use group_by to build a hash mapping each first name to an array of the Name objects that share it, transform_values to replace each array with its size, and then max_by to extract the n entries with the largest counts.

Name = Struct.new(:first_name, :last_name) do
  def greeting
    "Hello #{first_name}!"
  end
end

full_names = [
  Name.new("a", "b"),
  Name.new("b", "c"),
  Name.new("d", "c"),
  Name.new("c", "d"),
  Name.new("d", "c"),
  Name.new("b", "b"),
  Name.new("b", "e")
]
n = 2

p full_names
  .group_by(&:first_name)
  .transform_values(&:size)
  .max_by(n, &:last)

Output:

[["b", 3], ["d", 2]]

If you only want the first names and not the counts, append .map(&:first) to the chain.
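
To see what each stage of the chain produces, here is a small sketch that keeps the intermediate values. It uses a cut-down Name struct and the same sample data as above; the variable names grouped, counts, and top are only illustrative.

```ruby
Name = Struct.new(:first_name, :last_name)

full_names = [
  Name.new("a", "b"), Name.new("b", "c"), Name.new("d", "c"),
  Name.new("c", "d"), Name.new("d", "c"), Name.new("b", "b"),
  Name.new("b", "e")
]

# Step 1: first name => array of the Name objects that share it
grouped = full_names.group_by(&:first_name)
grouped.keys   #=> ["a", "b", "d", "c"]

# Step 2: replace each array with its length
counts = grouped.transform_values(&:size)
counts         #=> {"a"=>1, "b"=>3, "d"=>2, "c"=>1}

# Step 3: the two [name, count] pairs with the largest counts
top = counts.max_by(2, &:last)
top            #=> [["b", 3], ["d", 2]]
```

Keeping the counts hash around is handy if you also need the totals for names outside the top n.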

ggorlen
  • Nice work ggorlen! I'm adding a solution I came up with in the meantime below. – Mark Locklear Oct 23 '19 at 16:02
  • I personally don't like the `[0...n]` syntax after a block and would rather do `.first(n)` or `.take(n)` the line below, this is just personal preference though. – 3limin4t0r Oct 23 '19 at 16:20
  • Yeah, that's certainly valid. Without doubt `.first` is more semantically meaningful. I'm coming from Python where slices are pretty much everywhere, so seeing the `[]`s tends to click more instantly for me than parsing an English word in my brain, but I think your suggestion is more idiomatic in Ruby. – ggorlen Oct 23 '19 at 16:29
  • As an aside, `max_by(n, &:last)` can replace `.sort_by{|k, v| -v}.first(n)` in the first example. `Enumerable#max_by` accepts an argument for the maximum number of results. – engineersmnky Oct 23 '19 at 17:51
  • @engineersmnky Thanks--whenever I answer a Ruby question, there's always some better function folks tell me about! – ggorlen Oct 23 '19 at 18:59

full_names = [
  Name.new("Bob", "Feller"),
  Name.new("Hank", "Jones"),
  Name.new("Annie", "Oakley"),
  Name.new("Cher", ""),
  Name.new("Annie", "Hall"),
  Name.new("Melba", "Toast"),
  Name.new("Bob", "Dylan"),
  Name.new("Hank", "Williams"),
  Name.new("Bob", "Marley")
]

nbr_most_common = 3

full_names.each_with_object(Hash.new(0)) { |i,h| h[i[:first_name]] += 1 }.
           max_by(nbr_most_common, &:last).
           map(&:first)
  #=> ["Bob", "Hank", "Annie"]

If you also wish to display the frequencies, replace the final map(&:first) with to_h:

full_names.each_with_object(Hash.new(0)) { |i,h| h[i[:first_name]] += 1 }.
           max_by(nbr_most_common, &:last).
           to_h
   #=> {"Bob"=>3, "Hank"=>2, "Annie"=>2} 

See the version of Hash::new that creates a default value (here zero) and Enumerable#max_by.
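
The counting idiom above relies on the hash's default value. A minimal sketch of that behaviour, using a plain array of strings as stand-in data (the variable names are illustrative only):

```ruby
first_names = %w[Bob Hank Annie Bob]

counts = Hash.new(0)   # missing keys read as 0 instead of nil
counts["Cher"]         #=> 0 (reading does not add the key)
counts.key?("Cher")    #=> false

# counts[name] += 1 expands to counts[name] = counts[name] + 1,
# so the first occurrence of a name starts from the default 0
first_names.each { |name| counts[name] += 1 }
counts                 #=> {"Bob"=>2, "Hank"=>1, "Annie"=>1}
```

With a plain {} the first increment would raise NoMethodError, because nil + 1 is undefined.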

Cary Swoveland

You can use Enumerable#tally, available in Ruby 2.7 and later:

Name = Struct.new(:first_name, :last_name) do
  def greeting
    "Hello #{first_name}!"
  end
end

full_names = [
  Name.new("Yui", "Yoko"),
  Name.new("Bob", "Feller"),
  Name.new("Hank", "Jones"),
  Name.new("Annie", "Oakley"),
  Name.new("Cher", ""),
  Name.new("Annie", "Hall"),
  Name.new("Melba", "Toast"),
  Name.new("Bob", "Dylan"),
  Name.new("Hank", "Williams"),
  Name.new("Bob", "Marley")
]

full_names.map(&:first_name).tally.max_by(3, &:last) 
#=> [["Bob", 3], ["Annie", 2], ["Hank", 2]]
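
Note that "Annie" and "Hank" are tied here, and as far as I can tell the documentation for max_by does not promise any particular order among tied elements. If a reproducible order matters, one option is to sort on a composite key instead. A sketch, assuming ties should be broken alphabetically; the names counts and top are illustrative, and the counting step deliberately avoids tally so it also runs on older Rubies:

```ruby
first_names = %w[Yui Bob Hank Annie Cher Annie Melba Bob Hank Bob]

# Count occurrences without relying on tally (which needs Ruby 2.7+)
counts = first_names.each_with_object(Hash.new(0)) { |name, h| h[name] += 1 }

# Sort by descending count, then by name, so ties come out alphabetically
top = counts.sort_by { |name, count| [-count, name] }.first(3)
top  #=> [["Bob", 3], ["Annie", 2], ["Hank", 2]]
```
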
cavin kwon

Using the same example:

Name = Struct.new(:first_name, :last_name) do
  def greeting
    "Hello #{first_name}!"
  end
end

full_names = [
  Name.new("a", "b"),
  Name.new("b", "c"),
  Name.new("d", "c"),
  Name.new("c", "d"),
  Name.new("d", "c"),
  Name.new("b", "b"),
  Name.new("b", "e")
]

We can use inject to create a Hash with first_name => frequency:

first_names = full_names.map(&:first_name)
first_names #["a", "b", "d", "c", "d", "b", "b"]
hash_of_frequency = first_names.inject(Hash.new(0)) { |h,v| h[v] += 1; h }
hash_of_frequency # {"a"=>1, "b"=>3, "d"=>2, "c"=>1}

Now we sort by descending count and take the first number_desired entries, where number_desired is however many names you want:

hash_of_frequency.sort_by{|k, v| -v}.first(number_desired)

The result is the same as in the other answers; I simply prefer inject.
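
The trailing `; h` in the inject block matters: inject feeds the block's return value back in as the accumulator, and `h[v] += 1` returns the new count rather than the hash. A short sketch of the difference, with each_with_object shown as an alternative that passes the same object to every iteration (variable names are illustrative):

```ruby
first_names = %w[a b d c d b b]

# inject: the block's return value becomes the next accumulator,
# so the hash has to be returned explicitly
with_inject = first_names.inject(Hash.new(0)) { |h, v| h[v] += 1; h }

# each_with_object: the same hash is passed to every iteration,
# whatever the block returns (note the reversed block arguments)
with_ewo = first_names.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }

with_inject == with_ewo  #=> true
with_inject              #=> {"a"=>1, "b"=>3, "d"=>2, "c"=>1}
```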

Dimitrius Lachi

Here is a solution I came up with, based on Sophie's answer HERE.

First, this gives me a hash of first names and the number of occurrences of each:

freq_of_first_names = full_names.map(&:first_name).inject(Hash.new(0)) { |h,v| h[v] += 1; h }
 # => {"Haley"=>122, "Auer"=>119, "Lakin"=>96, "Macejkovic"=>98, ...}

Now I can sort this with sort_by, which orders the pairs by ascending count, and then call last with the number of records I want returned (in this case 10).

freq_of_first_names.sort_by {|_key, value| value}.last(10)
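
One thing to keep in mind is that, because sort_by sorts in ascending order, last(n) returns the most common names with the largest count at the end. A sketch with made-up counts (the hash contents and variable names here are invented for illustration, not taken from the real data):

```ruby
freq = { "Haley" => 5, "Auer" => 3, "Lakin" => 9, "Macejkovic" => 1 }

ascending = freq.sort_by { |_key, value| value }.last(2)
ascending                 #=> [["Haley", 5], ["Lakin", 9]]

# Reverse it to list the most common name first
descending = ascending.reverse
descending                #=> [["Lakin", 9], ["Haley", 5]]

# max_by with a count gives the descending order directly
freq.max_by(2, &:last)    #=> [["Lakin", 9], ["Haley", 5]]
```
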
Mark Locklear
  • As an aside, `max_by(10, &:last)` can replace `.sort_by {|_key, value| value}.last(10)`. `Enumerable#max_by` accepts an argument for the maximum number of results. – engineersmnky Oct 23 '19 at 17:53