I am trying to write a gsub expression that replaces a hyphen (-) with an en dash (–) wherever the hyphen is preceded by a digit. I want to show date periods as 1978 – 1980 rather than 1978-1980, which is how they appear in my data source.
Hyphens and en dashes look very similar to me, so I want to be specific and use the Unicode code point for the en dash, which is U+2013, while the hyphen is U+002D.
As a test I would like to convert:
"america-the-beautiful. 1760-about 1780"
to
"america-the-beautiful. 1760 – about 1780"
with test_string = "america-the-beautiful. 1760-about 1780"
I've confirmed that the regex is correctly identifying only the hyphens preceded by a number and that gsub replaces with a placeholder for the endash.
test_string.gsub(/(\d)-/, '\1 endash_placeholder ')
=> "america-the-beautiful. 1760 endash_placeholder about 1780"
I am struggling to remove both the hyphen and the endash_placeholder and use the actual unicode character.
I've used a number of SO questions to get further with this, including "Ruby Output Unicode Character".
In irb I can return the unicode character for endash with puts "\u{2013}"
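Since the two characters are hard to tell apart by eye, here is a minimal sketch of how the code points can be checked in irb. The variable names `hyphen` and `en_dash` are only illustrative.

```ruby
hyphen  = "-"         # HYPHEN-MINUS, U+002D
en_dash = "\u{2013}"  # EN DASH, U+2013 (double quotes are needed for the \u escape)

# String#ord returns the code point of the first character as an Integer.
puts format("U+%04X", hyphen.ord)
puts format("U+%04X", en_dash.ord)

# The two characters are distinct strings even though they look alike.
puts hyphen == en_dash
```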
I've tried amending my gsub expression to test_string.gsub(/(\d)-/, '\1 \u{2013} ')
=> "america-the-beautiful. 1760 \\u{2013} about 1780"
I've also tried putting the replacement string in double quotes:
test_string.gsub(/(\d)-/, "\1 \u{2013} ")
=> "america-the-beautiful. 176\u0001 – about 1780"
What am I missing in order to use the specific unicode character code in the gsub expression?
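For reference, a sketch of what appears to be going on in the two attempts. In single quotes, `\u{2013}` is not interpreted, so gsub receives those literal characters. In double quotes, `\u{2013}` does become the en dash, but `\1` is read as an octal escape (the character U+0001) before gsub ever sees it. Assuming that reading is right, the variants below each keep the backreference intact while still producing the en dash. The variable names are only illustrative.

```ruby
test_string = "america-the-beautiful. 1760-about 1780"

# Double quotes so that \u{2013} becomes the en dash; the backslash of the
# backreference is doubled so gsub still receives the two characters \1.
with_backref = test_string.gsub(/(\d)-/, "\\1 \u{2013} ")

# Block form: the captured digit is read from $1, so no backreference
# escaping is involved at all.
with_block = test_string.gsub(/(\d)-/) { "#{$1} \u{2013} " }

# Lookbehind: the digit is asserted but not consumed, so only the hyphen
# itself is replaced and no capture group is needed.
with_lookbehind = test_string.gsub(/(?<=\d)-/, " \u{2013} ")

puts with_backref
puts with_block
puts with_lookbehind
```

All three should leave the hyphens in "america-the-beautiful" untouched, since those are preceded by letters rather than digits.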