0

I am relatively new to Ruby and am trying to strip signatures from emails sent into my Rails application, based on this answer: Strip signatures and replies from emails. I know there are a number of ways to strip strings and match characters, but for some reason none of them match the newlines in the email body.

For instance, to try and pinpoint a signature beginning with '-- \n', I have:

email_body.include? '-- \n'

Which evaluates to false even though the email contains this in the signature. Furthermore, when I try to truncate the string, based on: http://api.rubyonrails.org/classes/ActionView/Helpers/TextHelper.html#method-i-truncate:

truncate(email_body, :separator => '--')

I get 'undefined method truncate'. It seems like something along these lines should work, but so far it does not.

What is the proper way to remove all text after the signature delimiter in the string? Any help is appreciated.

Drew

2 Answers

3

This will give you the text after the signature delimiter.

email = "content -- \n sig"
email[/(?<=--\s\n).*/] #=> " sig"

This works too, but I like the regex.

email.split("-- \n").last
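The question asks for the text before the delimiter rather than the signature itself, so the same idea can be turned around. Below is a minimal sketch, assuming the delimiter is exactly `"-- \n"`; the helper name `strip_signature` is hypothetical and not part of Ruby or Rails.

```ruby
# Hypothetical helper: returns everything before the first "-- \n" delimiter.
# If the delimiter is absent, the body is returned unchanged.
def strip_signature(body, delimiter = "-- \n")
  head, _separator, _signature = body.partition(delimiter)
  head
end

email = "content\n-- \nsig line 1\nsig line 2"
strip_signature(email)           # => "content\n"
strip_signature("no signature")  # => "no signature"
```

`String#partition` splits at the first occurrence only, so a later `"-- \n"` inside the signature does not matter. Note also that `.` does not match a newline in a Ruby regex unless the `m` flag is given, so the regex above returns only the first line of a multi-line signature.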

Important Note: In your code above, you are using ' when you should be using ". This has to do with how each kind of string literal interprets escape sequences. See this answer which explains the difference.

Just remember "-- \n" != '-- \n'
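A short illustration of that difference, using a made-up `body` string as the email text:

```ruby
single = '-- \n'   # 5 characters: dash, dash, space, backslash, "n"
double = "-- \n"   # 4 characters: dash, dash, space, newline

single.length      # => 5
double.length      # => 4
single == double   # => false

# Illustrative email body; the real one may use different line endings.
body = "Hello\n-- \nDrew"
body.include?(single)  # => false
body.include?(double)  # => true
```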

Dan Grahn
2

Are you sure you want to check

email_body.include? '-- \n'

That is a rare case. I suspect you want

email_body.include? "-- \n"

As for truncate, you need to use it on the string object or in the context of TextHelper.

sawa
  • Just for clarification: @sawa is saying that `'\n'` represents the literal characters *backslash* + n, whereas `"\n"` with double quotes represents the *newline* character. That is because escape sequences are not interpreted in single-quoted strings, whereas they are in double-quoted strings. The only escapes recognized in single-quoted strings are `\'` (otherwise one could not write the single quote character `'`) and `\\`. – Patrick Oscity Sep 03 '13 at 13:47
  • This makes sense. The single quotes were a slight oversight on my part, as even when I used double quotes, email_body.include? "-- \n" was returning false, while email_body.include? "--" returned true. It's as if the newlines were not recognized in the string passed in. – Drew Sep 03 '13 at 13:55
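
One possible cause of the behaviour described in the last comment, offered only as a guess since the raw email is not shown: mail bodies often use CRLF line endings, in which case the delimiter line is `"-- \r\n"`, so `"-- \n"` does not match while `"--"` does. A sketch that tolerates both line endings follows; `strip_signature_any_eol` is a hypothetical name.

```ruby
# Hypothetical helper: cuts the body at the first line that is exactly "-- ",
# whether that line ends with "\n" or "\r\n".
def strip_signature_any_eol(body)
  signature_line = /^-- \r?\n/
  head, _separator, _signature = body.partition(signature_line)
  head
end

crlf_body = "Hi there\r\n-- \r\nDrew"
crlf_body.include?("-- \n")          # => false
strip_signature_any_eol(crlf_body)   # => "Hi there\r\n"
```

Inspecting the body with `email_body.inspect` shows which line endings are actually present.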