
I am trying to write a method that replaces words in a string with hyperlinks, using keywords from a hash. For example, given the string:

my_string = "France france USA usa ENGLAND england ENGland"

Here is my hash:

my_hash = {"england" => "https://google.com"}

And there is the loop:

occurrences = {}
my_string.gsub!(/\w+/) do |match|
  key = my_hash[match.downcase]
  count = occurrences.store(key, occurrences.fetch(key, 0).next)

  count > 2 ? match : "<a href = #{key}>#{match}</a>"
end

The output of this loop is:

 <a href = >France</a> <a href = >france</a> USA usa <a href = https://google.com>ENGLAND</a> <a href = https://google.com>england</a> ENGland

Expected output:

France france USA usa <a href = https://google.com>ENGLAND</a> <a href = https://google.com>england</a> ENGland

The problem is that my loop always wraps the first two words of the string in an <a href> tag, whether or not they are in the hash (as with 'France'). It should behave as it does for 'England': the first two occurrences become hyperlinks and the third does not.

P.S. An additional question: is there any way to leave hyperlinks that already exist in the string untouched? For example, if the string already contained an 'England' hyperlink with a different href.

2 Answers

my_string = "France france USA usa ENGLAND england ENGland"
my_hash = {"england"=>"https://google.com"}
my_string.split
         .chunk(&:downcase)
         .flat_map do |country,a|
            a.flat_map.with_index do |s,i|
              if i < 2 && my_hash.key?(country)    
                "<a href = #{my_hash[country]}>#{s}</a>"
              else
                s    
              end
            end
          end.join(' ')
  #=> "France france USA usa <a href = https://google.com>ENGLAND</a> <a href = https://google.com>england</a> ENGland"

See Enumerable#chunk and Enumerable#flat_map.

Note that

enum0 = my_string.split.chunk(&:downcase)
  #=> #<Enumerator: #<Enumerator::Generator:0x00007ff90c13bc28>:each>

The values generated by this enumerator can be seen by converting it to an array.

enum0.to_a
  #=> [["france", ["France", "france"]], ["usa", ["USA", "usa"]],
  #    ["england", ["ENGLAND", "england", "ENGland"]]]

Then

enum1 = enum0.flat_map
  #=> #<Enumerator: #<Enumerator: #<Enumerator::Generator:0x00007ff90c113e58>:each>:flat_map>

The initial value generated by enum1 and assigned to the two block variables is as follows.

country, a = enum1.next
  #=> ["france", ["France", "france"]] 
country
  #=> "france"
a
  #=> ["France", "france"]
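
One caveat worth noting: Enumerable#chunk groups only consecutive elements that share a key, so the approach above relies on equal words being adjacent, as they are in the sample string. A minimal sketch with a made-up input (the `words` string is an assumption, not taken from the question) shows what happens otherwise:

```ruby
# Hypothetical input: "england" appears twice, but not next to each other.
words = "england France ENGLAND".split

# chunk starts a new group whenever the downcased key changes.
groups = words.chunk(&:downcase).to_a
groups.each { |key, run| puts "#{key}: #{run.inspect}" }
```

Here "england" ends up in two separate groups, so each run would be linked up to twice on its own. If the words may be scattered through the string, counting occurrences in a hash keyed by the downcased word is probably a safer way to apply the limit.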
Cary Swoveland
  • Thank you very much! This is a very interesting solution. I have never used flat_map before, so I will need to learn more about it. Also, do you have any idea how we can ignore punctuation marks attached to the words? – RealOne0912 Sep 16 '21 at 09:20

It isn't 100% clear to me from your question what the desired output is, but if you only want to replace words that match a key in your hash, simply add an if (or a next) after your hash lookup. Also, the variable key actually held the looked-up value, so I renamed it to value, and the occurrences hash now counts by the downcased word (the key) rather than by the value. This seems more in line with what you want.

occurrences = {}
my_string.gsub!(/\w+/) do |match|
  key = match.downcase
  value = my_hash[key]
  next match unless value

  count = occurrences.store(key, occurrences.fetch(key, 0).next)

  count > 2 ? match : "<a href = #{value}>#{match}</a>"
end
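
As for the P.S. about existing hyperlinks, one possible approach (a sketch under assumptions, not a robust HTML solution) is to let the regex match a whole `<a ...>...</a>` element before it tries `\w+`, and to return such matches unchanged. The method name `link_keywords`, the constant `LINK_OR_WORD`, the `limit` parameter, and the example.com input are all made up for illustration:

```ruby
# Sketch only: a regex is not an HTML parser, so nested or malformed
# markup is outside what this handles.
LINK_OR_WORD = %r{<a\b[^>]*>.*?</a>|\w+}im

# Hypothetical helper: links at most `limit` occurrences of each keyword.
def link_keywords(text, links, limit: 2)
  seen = Hash.new(0)
  text.gsub(LINK_OR_WORD) do |match|
    next match if match.start_with?("<")   # existing hyperlink: keep untouched

    key = match.downcase
    url = links[key]
    next match unless url                  # word is not in the hash

    seen[key] += 1
    seen[key] > limit ? match : "<a href = #{url}>#{match}</a>"
  end
end

my_hash = { "england" => "https://google.com" }
puts link_keywords("France france USA usa ENGLAND england ENGland", my_hash)
puts link_keywords('<a href="https://example.com">England</a> england', my_hash)
```

In this sketch an existing link does not count toward the limit of two. Because gsub only rewrites the parts matched by the regex, punctuation next to a word (for example the comma in "England,") stays where it is.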
melcher
  • Thank you for your answer, but it's not quite what I mean. I added the expected output. In your example, the output contains only the word that matches a keyword from the hash, but I want the whole string to stay the same, except that a word matching a keyword from the hash is replaced by a hyperlink. This should apply only to the first two occurrences of each matched keyword (in my example England appears 3 times but is replaced twice, and the same would go for any other word if the hash were bigger). My problem is that the first two words of the string are ALWAYS turned into hyperlinks, whether or not they are in the hash. – RealOne0912 Sep 15 '21 at 04:40
  • There was a bug: `next` needs to be given the match, i.e. `next match`. – melcher Sep 16 '21 at 20:38