56

How should I convert a post title to a slug in Ruby?

The title can have any characters, but I only want the slug to allow [a-z0-9-_] (Should it allow any other characters?).

So basically:

  • downcase all letters
  • convert spaces to hyphens
  • delete extraneous characters
ma11hew28

5 Answers

108

Is this Rails? (works in Sinatra)

string.parameterize

That's it. For even more sophisticated slugging, see ActsAsUrl. It can do the following:

"rock & roll".to_url => "rock-and-roll"
"$12 worth of Ruby power".to_url => "12-dollars-worth-of-ruby-power"
"10% off if you act now".to_url => "10-percent-off-if-you-act-now"
"kick it en Français".to_url => "kick-it-en-francais"
"rock it Español style".to_url => "rock-it-espanol-style"
"tell your readers 你好".to_url => "tell-your-readers-ni-hao"
thumbtackthief
Mark Thomas
  • It's not Rails, but it looks like that gem will work with plain Ruby as well. Thanks! I like how it converts & to and, but I want it to convert / and . to -. It converts them to slash and dot, respectively. Also, in this case, to keep things simple, I'd rather not require extra gems. So, I updated my solution to `slug = title.strip.downcase.gsub(/(&|&)/, ' and ').gsub(/[\s\.\/\\]/, '-').gsub(/[^\w-]/, '').gsub(/[-_]{2,}/, '-').gsub(/^[-_]/, '').gsub(/[-_]$/, '')`. – ma11hew28 Nov 30 '10 at 18:00
  • Does it produce ASCII slugs, as [pandoc does](https://pandoc.org/MANUAL.html#extension-ascii_identifiers)? – somenxavier Oct 11 '22 at 20:43
  • 1
    @somenxavier `parameterize` does just that. It comes from [ActiveSupport::Inflector](https://apidock.com/rails/ActiveSupport/Inflector/parameterize). If you view the source, you'll see the comment "`# Replace accented chars with their ASCII equivalents`" – Mark Thomas Nov 03 '22 at 18:15
89
slug = title.downcase.strip.gsub(' ', '-').gsub(/[^\w-]/, '')

downcase makes it lowercase. strip removes any leading or trailing whitespace. The first gsub replaces spaces with hyphens. The second gsub removes every character that is not alphanumeric, a hyphen, or an underscore (note that the removed set is very close to \W, except that the hyphen is kept as well, which is why it's spelled out here).
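
As a minimal sketch of how the one-liner behaves, here it is wrapped in a method and run on a few inputs. The method name `simple_slug` and the sample titles are made up for illustration. Note the doubled hyphen in the second result: when a character between two spaces is deleted, both hyphens stay behind.

```ruby
# Hypothetical helper wrapping the one-liner above (the name is illustrative).
def simple_slug(title)
  title.downcase.strip.gsub(' ', '-').gsub(/[^\w-]/, '')
end

puts simple_slug('  Hello World  ')      # hello-world
puts simple_slug('Ruby: Tips & Tricks!') # ruby-tips--tricks (the "&" is dropped, both hyphens remain)
puts simple_slug('snake_case stays')     # snake_case-stays (underscores are kept by \w)
```

Whether the doubled hyphen matters depends on your URLs; collapsing runs of hyphens takes an extra step.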

Ben Lee
  • Your character class could be expressed more concisely as `/[^\w-]/`. – Daniel Vandersluis Nov 29 '10 at 21:57
  • 2
    Thanks, Ben. I added some more complexity to account for . \ / and to remove multiple -'s in a row and remove them from the end: `slug = title.strip.downcase.gsub(/[\s\.\/\\]/, '-').gsub(/[^\w-]/, '').gsub(/[-_]{2,}/, '-').gsub(/^[-_]/, '').gsub(/[-_]$/, '')`. I stopped after realizing it's pretty darn complicated to get it perfect. Also, `tr` is faster than `gsub`, so it's better to do: `tr(' ', '-')` than `gsub(' ', '-')`. – ma11hew28 Nov 29 '10 at 23:36
  • 1
    @MattDiPasquale. There is a Ruby method called String#squeeze that will convert all sequences of two or more of the passed character to one. So you could write the above as `slug = title.downcase.gsub('/[\s.\/_]/, ' ').squeeze(' ').strip.gsub(/[^\w-]/, '').tr(' ', '-')`. This first turns all whitespace, `.`, `/`, and '_' to spaces. Then it squeezes spaces (all sequences of 2 or more spaces become a single one), then it strips spaces (removes leading and trailing spaces), then it converts the remaining spaces back to dashes. – Ben Lee Nov 30 '10 at 04:12
  • 1
    As far as speed of `gsub` processing versus `tr`, you're really just talking processor cycles -- nanoseconds, really. Unless you are creating hundreds of thousands of posts per second, that speed difference will make absolutely no difference. What you should take into account is personal style and clarity of code. In this case, `tr` may still be better, but for those two reasons, not because it's faster. – Ben Lee Nov 30 '10 at 04:13
  • Oh, and this is the answer for plain Ruby. If you are using Rails, you can just do `slug = title.parameterize` as Mark Thomas pointed out. Even if you are not using Rails, you can get the same support from the Active Support gem by doing: `require 'active_support'; $KCODE = 'UTF8';` – Ben Lee Nov 30 '10 at 04:19
  • 1
    Oh and I just noticed that in my comment a few above, I put the things in the wrong order. It should have been: `slug = title.downcase.gsub('/[\s.\/_]/, ' ').squeeze(' ').strip.tr(' ', '-').gsub(/[^\w-]/, '')` – Ben Lee Nov 30 '10 at 04:21
  • @Ben Lee, Thanks for your recommendation to use String#squeeze. However, the refactoring doesn't work exactly as I have it. E.g., it returns `"hello---world"` when `title = "hello - world"`, but I want it to return `"hello-world"`. Also, delete the single quote after the first opening parenthesis. – ma11hew28 Nov 30 '10 at 17:16
  • @Matt, I didn't actually test the code I posted, so there's probably a bug or two in there, but you get the idea, right? You can play around with `gsub`, `squeeze`, and `tr` to get the desired result. Or you can stick with what you already had that worked =). – Ben Lee Nov 30 '10 at 19:10
  • @BenLee I think you should put the `gsub(/[^\w-]/, '')` part before you strip the string. That way this expression will handle strings like `' ? ? hello ? ? world ? ?'` correctly – Niyaz Jun 20 '12 at 07:27
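
Pulling the suggestions from this comment thread together, one possible untested-in-production sketch is below. The method name `squeezed_slug` is made up, and treating hyphens and underscores as separators is an assumption chosen so that `"hello - world"` collapses to a single hyphen. Disallowed characters are deleted before `squeeze` and `strip`, as Niyaz suggests.

```ruby
# Hypothetical helper combining gsub, squeeze, strip and tr (the name is illustrative).
def squeezed_slug(title)
  title.downcase
       .gsub(/[\s.\/_\-]/, ' ') # treat whitespace, ".", "/", "_" and "-" as separators
       .gsub(/[^a-z0-9 ]/, '')  # delete everything that is not a letter, digit or space
       .squeeze(' ')            # collapse runs of spaces into one
       .strip                   # drop leading and trailing spaces
       .tr(' ', '-')            # turn the remaining spaces into hyphens
end

puts squeezed_slug('hello - world')            # hello-world
puts squeezed_slug(' ? ? hello ? ? world ? ?') # hello-world
puts squeezed_slug('Rock & Roll / v1.0')       # rock-roll-v1-0
```

Unlike the original one-liner, this version never emits underscores, since they are converted to hyphens along with the other separators.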
7

to_slug is a great Rails plugin that handles pretty much everything, including funky characters, but its implementation is very simple. Chuck it onto String and you'll be sorted. Here's the source condensed down:

String.class_eval do
  def to_slug
    value = self.mb_chars.normalize(:kd).gsub(/[^\x00-\x7F]/n, '').to_s
    value.gsub!(/[']+/, '')
    value.gsub!(/\W+/, ' ')
    value.strip!
    value.downcase!
    value.gsub!(' ', '-')
    value
  end
end
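
If you would rather not depend on ActiveSupport for `mb_chars`, roughly the same steps can be sketched with Ruby's built-in `String#unicode_normalize` (available since Ruby 2.2). The method name `ascii_slug` is made up for illustration, and this is a plain method instead of a `String` extension.

```ruby
# Hypothetical plain-Ruby variant of the to_slug steps above (the name is illustrative).
def ascii_slug(title)
  value = title.unicode_normalize(:nfkd)  # decompose accented letters into base letter + combining mark
  value = value.gsub(/[^\x00-\x7F]/, '')  # drop everything outside ASCII, including the combining marks
  value = value.gsub(/'+/, '')            # remove apostrophes so "don't" becomes "dont"
  value = value.gsub(/\W+/, ' ')          # turn every other run of non-word characters into one space
  value.strip.downcase.tr(' ', '-')       # trim, lowercase, and join words with hyphens
end

puts ascii_slug("kick it en Fran\u00E7ais") # kick-it-en-francais
puts ascii_slug("Don't Stop!")              # dont-stop
```

Because each step returns a new string, there is no `gsub!` returning `nil` to worry about.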
Yarin
Jamie Rumbelow
  • 4
    @JamieRumbelow- Your sample code had an error. You need to explicitly return `value`, because .gsub! returns nil when no substitutions are performed (e.g. `"test".to_slug` would return nil). I fixed the code for you. – Yarin Aug 21 '13 at 00:24
3

I've used this gem. It's simple but helpful.

https://rubygems.org/gems/string_helpers

0

I like FriendlyId, the self-professed "Swiss Army Bulldozer" of creating slugs. https://github.com/norman/friendly_id

Aaron Sumner