I'm using Ruby 2.4 with Rails 5. I have the following string:

2.4.0 :005 > tr = doc.search("//td").first.text
 => "\r\n  \xA0PLACE \r\n  " 

The string is tagged as UTF-8, although this is just an example. I'm not guaranteed that all data will be UTF-8 encoded.

2.4.0 :007 > tr = doc.search("//td").first.text.encoding
 => #<Encoding:UTF-8> 

But when I apply the "strip" method to it, the call raises an error:

2.4.0 :006 > tr = doc.search("//td").first.text.strip
ArgumentError: invalid byte sequence in UTF-8

This is just an example, but how can I override the strip method so it doesn't die when it encounters a character it doesn't like? I don't want to change the string in question. I'm happy to write my own "strip" method, but I want to be able to keep the above exactly as I have it and just not have the error thrown.

Dave

1 Answer

The best fix is to make sure your data is correctly encoded. The byte \xA0 on its own is not valid UTF-8, which is why strip raises.
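As a rough sketch of what fixing the encoding could look like: a lone \xA0 is a non-breaking space in ISO-8859-1, so one plausible repair is to transcode from that encoding. The source encoding is an assumption here, since the question does not say where the bytes come from.

```ruby
# Same bytes as in the question. Treating them as ISO-8859-1 is an
# assumption for this example; use whatever encoding the source really has.
raw = "\r\n  \xA0PLACE \r\n  "

# Transcode to UTF-8, reading the bytes as ISO-8859-1.
fixed = raw.encode(Encoding::UTF_8, Encoding::ISO_8859_1)

fixed.valid_encoding?   # => true; 0xA0 became U+00A0 (non-breaking space)
stripped = fixed.strip  # no ArgumentError on a valid string
```

Note that strip only removes ASCII whitespace and null, so the non-breaking space itself stays at the front of the result.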

Alternative

You could use:

"\r\n  \xA0PLACE \r\n  ".force_encoding('binary').strip

It returns a binary (ASCII-8BIT) string:

"\xA0PLACE"

Handle with care!!!!

If you feel adventurous, you could just override String#strip:

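### Proof of concept
### HANDLE WITH CARE!!!
class String
  # Keep the built-in implementation reachable under another name.
  alias_method :original_strip, :strip

  def strip
    # force_encoding changes the receiver in place, so work on a copy
    # to leave the original string (and its encoding) untouched.
    dup.force_encoding('binary').original_strip
  end
end

p "\r\n  \xA0PLACE \r\n  ".strip
#=> "\xA0PLACE"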

A refinement might limit the damage.

For a Rails project, you'd just need to put the class String code above in an initializer, i.e. a file under config/initializers.
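The refinement idea mentioned above could look roughly like the sketch below. The module name LenientStrip is made up for this example. Unlike the global patch, it only takes effect in files that opt in with `using`, so gems keep the built-in strip.

```ruby
# LenientStrip is a hypothetical name chosen for this sketch.
module LenientStrip
  refine String do
    def strip
      # b returns a binary (ASCII-8BIT) copy, so the receiver is not modified.
      # lstrip.rstrip is used instead of strip so that this refined method
      # cannot end up calling itself.
      b.lstrip.rstrip
    end
  end
end

# Activate the refinement for the rest of this file only.
using LenientStrip

original = "\r\n  \xA0PLACE \r\n  "
result = original.strip  # no ArgumentError; result is a binary string
```

As with the force_encoding approach, the result comes back as binary rather than UTF-8, while the original string keeps its encoding.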

Eric Duminil
  • Thanks but how do I override the strip function so that this is automatically included? Although this works for my specific example, it would mean I would have to go through my entire application and replace "strip" with "force_encoding('binary').strip," which I'm trying to avoid. – Dave Jan 10 '17 at 16:52
  • You don't want to touch the strip method. It is used everywhere in many gems. – Eric Duminil Jan 10 '17 at 16:57
  • I do. If other stuff breaks, that is my own damn fault, but when I said "how can I override the strip method" in my question, I genuinely mean it, crazy as it may seem. – Dave Jan 10 '17 at 17:29
  • Ahah! :D Roger that. – Eric Duminil Jan 10 '17 at 17:39
  • Hey thanks so much for your updated answer. This is awesome! Just one question -- where do I place the "class String" file? Would I put that in my Rails installation or is there a project-specific place I can stick the file (please say a project-specific place)? – Dave Jan 12 '17 at 19:39
  • In Ruby, every class can be re-opened at any time. So you really can put this code wherever you want, as long as it is executed before you try to use this modified method. See update – Eric Duminil Jan 12 '17 at 19:49