0

i a m a begginner in ruby please i need directions in how to get the program to return a list containing "inlets"

Question: Given a word and a list of possible anagrams, select the correct sublist.

Given "listen" and a list of candidates like "enlists" "google" "inlets" "banana" the program should return a list containing "inlets".

This is what i have been able to do

puts 'Enter word'
word_input = gets.chomp
puts 'Enter anagram_list'
potential_anagrams = gets.chomp
potential_anagrams.each do |anagram|
end

this is how the program should behave assuming my word_input was "hello" but i do not know how to get this working.

is_anagram? word: 'hello', anagrams: ['helo', 'elloh', 'heelo', 'llohe']
# => 'correct anagrams are: elloh, llohe'

Would really appreciate ideas.

Holger Just
  • 52,918
  • 14
  • 115
  • 123
  • 1
    Your starting point is the proper task definition. You need to find anagrams but _what anagram is_? Define it "in terms of calculations" and you'll get 80% of the solution. Why "elloh" is an anagram for "hello" but "eloh" or "ellloh" aren't? How they are different? – Konstantin Strukov Dec 14 '22 at 11:25
  • hello" - elloh because the characters in hello are same in elloh . it is just been rearranged. – Precious Dec 14 '22 at 14:00
  • 1
    Turn each string into a `Set` of characters. Then compare for Set equality. See [here](https://ruby-doc.org/stdlib-2.6.4/libdoc/set/rdoc/index.html) how to work with a `Set` in Ruby. – user1934428 Dec 14 '22 at 14:22
  • 1
    Hint: `s.chars.sort.join('')` – tadman Dec 14 '22 at 14:30
  • 3
    @user1934428: bear in mind that `"helllllo".chars.to_set == "hello".chars.to_s` => `true` – Chris Dec 14 '22 at 15:24
  • Please note the ruby convention that [methods with names ending in '?' should return a boolean](https://stackoverflow.com/questions/1345843/what-does-the-question-mark-at-the-end-of-a-method-name-mean-in-ruby). – pjs Dec 14 '22 at 22:54
  • @Chris : Good point. So we need to simulate a multiset of characters. I think the cheapest way to simulat it is to sort the characters in each string alphabetically (i.e. `s.chars.sort.join` for a String `s`, as tadman suggested. – user1934428 Dec 15 '22 at 12:13

3 Answers3

1

As hinted in comments, a string can be easily converted to an array of its characters.

irb(main):005:0> "hello".chars
=> ["h", "e", "l", "l", "o"]
irb(main):006:0> "lleho".chars
=> ["l", "l", "e", "h", "o"]

Arrays can be easily sorted.

irb(main):007:0> ["h", "e", "l", "l", "o"].sort
=> ["e", "h", "l", "l", "o"]
irb(main):008:0> ["l", "l", "e", "h", "o"].sort
=> ["e", "h", "l", "l", "o"]

And arrays can be compared.

irb(main):009:0> ["e", "h", "l", "l", "o"] == ["e", "h", "l",
l", "o"]
=> true

Put all of this together and you should be able to determine if one word is an anagram of another. You can then pair this with #select to find the anagrams in an array. Something like:

def is_anagram?(word, words)
  words.select do |w|
    ...
  end
end
Chris
  • 26,361
  • 5
  • 21
  • 42
  • what if you need to compare a string to an array with multiple strings in it? – Precious Dec 14 '22 at 15:59
  • That's what the `...` is for in the last bit of code. `#select` will return an array of any of the strings in `words` for which that expression is `true`. – Chris Dec 14 '22 at 16:12
1

Here's how I would do it.

Method

def anagrams(list, word)
  ltr_freq = word.each_char.tally
  list.select { |w| w.size == word.size && w.each_char.tally == ltr_freq }
end

Example

list = ['helo', 'elloh', 'heelo', 'llohe']
anagrams(list, 'hello')
  #=> ["elloh", "llohe"]

Computational complexity

For practical purposes, the computational complexity of computing w.each_char.tally for any word w of length n can be regarded as O(n). That's because hash key lookups are nearly constant-time. It follows that the computational complexity of determining whether a word of length n is an anagram of another word of the same length can be regarded as O(n).

This compares with methods that sort the letters of a word, which have a computational complexity of O(n*log(n)), n being the word length.

Explanation

See Enumerable#tally. Note that w.each_char.tally is not computed when w.size == word.size is false.

Now let's add some puts statements to see what is happening.

def anagrams(list, word)
  ltr_freq = word.each_char.tally
  puts "ltr_freq = #{ltr_freq}"
  list.select do |w|
    puts "\nw = '#{w}'"
    if w.size != word.size
      puts "words differ in length"
      false
    else
      puts "w.each_char.tally = #{w.each_char.tally}"
      if w.each_char.tally == ltr_freq
        puts "character frequencies are the same"
        true
      else
        puts "character frequencies differ"
        false
      end
    end
  end
end
anagrams(list, 'hello')
ltr_freq = {"h"=>1, "e"=>1, "l"=>2, "o"=>1}

w = 'helo'
words differ in length

w = 'elloh'
w.each_char.tally = {"e"=>1, "l"=>2, "o"=>1, "h"=>1}
character frequencies are the same

w = 'heelo'
w.each_char.tally = {"h"=>1, "e"=>2, "l"=>1, "o"=>1}
character frequencies differ

w = 'llohe'
w.each_char.tally = {"l"=>2, "o"=>1, "h"=>1, "e"=>1}
character frequencies are the same
  #=>["elloh", "llohe"]

Possible improvement

A potential weakness of the expression

w.each_char.tally == ltr_freq

is that the frequencies of all unique letters in the word w must be determined before a conclusion is reached, even if, for example, the first letter of w does not appear in the word word. We can remedy that as follows.

def anagrams(list, word)
  ltr_freq = word.each_char.tally
  list.select do |w|
    next false unless w.size == word.size
    ltr_freqs_match?(w, ltr_freq.dup)
  end
end
def ltr_freqs_match?(w, ltr_freq)
  w.each_char do |c|
    return false unless ltr_freq.key?(c)
    ltr_freq[c] -= 1
    ltr_freq.delete(c) if ltr_freq[c].zero?
  end
  true
end

One would have to test this variant against the original version of anagrams above to determine which tends to be fastest. This variant has the advantage that it terminates (short-circuits) the comparison as soon as is found that the cumulative count of a given character in the word w is greater than the total count of the same letter in word. At the same time, tally is written in C so it may still be faster.

Cary Swoveland
  • 106,649
  • 6
  • 63
  • 100
1

I took the tack of creating a class in pursuit of re-usability. This is overkill for a one-off usage, but allows you to build up a set of known words and then poll it as many times as you like for anagrams of multiple candidate words.

This solution is built on hashes and sets, using the sorted characters of a word as the index to a set of words sharing the same letters. Hashing is O(1), Sets are O(1), and if we view words as having a bounded length the calculation of the key is also O(1), yielding an overall complexity of a constant time per word.

I've commented the code, but if anything is unclear feel free to ask.

require 'set'

class AnagramChecker
  def initialize
    # A hash whose default value is an empty set object
    @known_words = Hash.new { |h, k| h[k] = Set.new }
  end

  # Create a key value from a string by breaking it into individual
  # characters, sorting, and rejoining, so all strings which are
  # anagrams of each other will have the same key.
  def key(word)
    word.chars.sort.join
  end

  # Add individual words to the class by generating their key value
  # and adding the word to the set. Using a set guarantees no
  # duplicates of the words, since set contents are unique.
  def add_word(word)
    word_key = key(word)
    @known_words[word_key] << word
    # return the word's key to avoid duplicate work in find_anagrams
    word_key
  end

  def find_anagrams(word)
    word_key = add_word(word)     # add the target word to the known_words
    @known_words[word_key].to_a   # return all anagramatic words as an array
  end

  def inspect
    p @known_words
  end
end

Producing a library of known words looks like this:

ac = AnagramChecker.new
["enlists", "google", "inlets", "banana"].each { |word| ac.add_word word }

ac.inspect   # {"eilnsst"=>#<Set: {"enlists"}>, "eggloo"=>#<Set: {"google"}>, "eilnst"=>#<Set: {"inlets"}>, "aaabnn"=>#<Set: {"banana"}>}

Using it looks like:

p ac.find_anagrams("listen")  # ["inlets", "listen"]
p ac.find_anagrams("google")  # ["google"]

If you don't want the target word to be included in the output, adjust find_anagrams accordingly.

pjs
  • 18,696
  • 4
  • 27
  • 56