This might be a bit more 'elegant'. Whether it's more or less efficient than your solution, I don't know.
puts "Give input text:"
original_text = gets.chomp
puts "Give redacted word:"
redacted = gets.chomp
redacted_words = redacted.split
print(
  redacted_words.inject(original_text) do |text, redacted_word|
    text.gsub(/\b#{redacted_word}\b/, 'REDACTED')
  end
)
So what's going on here?
- I'm using `String#split` without an argument, because `' '` (split on whitespace) is the default anyway.
- With `Array#inject`, the following block (starting at `do` and ending at `end`) is executed for each element in the array; in this case, our list of forbidden words.
- In each round, the second argument to the block will be the respective element from the array.
- The first argument to the block will be the block's return value from the previous round. For the first round, the argument to the `inject` function (in our case `original_text`) will be used.
- The block's return value from the last round will be used as the return value of the `inject` function.
- In the block, I replace all occurrences of the currently handled redacted word in the text. `String#gsub` performs a global substitution.
- As the pattern to be substituted, I use a regexp literal (`/.../`). Except it's not really a literal, as I'm using string interpolation (`#{...}`) to get the currently handled redacted word into it.
- In the regexp, I'm surrounding the word to be redacted with `\b` word boundary matchers. They match the boundary between a word character (a letter, digit, or underscore) and a non-word character (or vice versa), without matching any of the characters themselves. (They match the zero-length 'position' between the characters.) If a string starts or ends with a word character, `\b` will also match the start or end of the string, respectively, so that we can use it to match whole words.
- The result of `inject` (which is the result of the last execution of the block, i.e., the text when all the substitutions have taken place) is passed as an argument to `print`, which will output the now redacted text.
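To make the rounds of `inject` described above visible, here is a small sketch that records the text after each round. The input strings are made up for illustration and replace the two `gets.chomp` calls; the `rounds` array is only there for tracing and is not part of the solution.

```ruby
# Hypothetical fixed inputs standing in for the interactive gets.chomp calls.
original_text = 'A fnord is a fnord, not a fnordling.'
redacted_words = 'ford fnord foo'.split # => ["ford", "fnord", "foo"]

# In each round, `text` is the previous round's return value (original_text
# in the first round) and `redacted_word` is the current array element.
rounds = []
result = redacted_words.inject(original_text) do |text, redacted_word|
  replaced = text.gsub(/\b#{redacted_word}\b/, 'REDACTED')
  rounds << [redacted_word, replaced] # tracing only
  replaced                            # becomes `text` in the next round
end

rounds.each { |word, text| puts "#{word.ljust(6)} -> #{text}" }
puts result
# => A REDACTED is a REDACTED, not a fnordling.
```

Only the `fnord` round changes anything. Thanks to `\b`, "fnord," is redacted (the comma is not a word character, so there is a boundary before it), while "fnordling" is left alone.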
Note that, unlike your solution, mine will not consider punctuation as part of adjacent words.
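Also note that my solution is vulnerable to regex injection: the redacted words are interpolated into the pattern unescaped, so any regex metacharacters in them are interpreted as such.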
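One possible way to close the injection hole is to pass each word through `Regexp.escape` before interpolating it. This is only a sketch: the `redact` helper name is made up for illustration and is not part of the solution above.

```ruby
# Hypothetical helper: same inject/gsub approach, but each word is escaped
# so that metacharacters such as "." or "+" are matched literally.
def redact(text, words)
  words.inject(text) do |current, word|
    current.gsub(/\b#{Regexp.escape(word)}\b/, 'REDACTED')
  end
end

puts redact('A fnord is a fnord.', ['fnord'])
# => A REDACTED is a REDACTED.

puts redact('a+b and aab', ['a+b'])
# => REDACTED and aab

puts redact('A fnord is a fnord.', ['fnord.'])
# => A fnord is a fnord.
```

The last call shows a remaining limitation: with escaping, `fnord.` no longer matches "fnord " by accident, but it does not match the final "fnord." either, because there is no word boundary between the trailing "." and the end of the string. Redacted words that end in punctuation would need different boundary handling.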
Example 1:
Give input text:
A fnord is a fnord.
Give redacted word:
ford fnord foo
My output:
A REDACTED is a REDACTED.
Your output:
A REDACTED is a fnord.
Example 2:
Give input text:
A fnord is a fnord.
Give redacted word:
fnord.
My output:
A REDACTEDis a fnord.
(Note how the `.` was interpreted to match any character.)
Your output:
A fnord is a REDACTED.