71

You know, like myblog.com/posts/donald-e-knuth.

Should I do this with the built in parameterize method?

What about a plugin? I could imagine a plugin being nice for handling duplicate slugs, etc. Here are some popular GitHub plugins -- does anyone have any experience with them?

  1. http://github.com/rsl/stringex/tree/master
  2. http://github.com/norman/friendly_id/tree/master

Basically it seems like slugs are a totally solved problem, and I don't want to reinvent the wheel.

Tom Lehman

12 Answers

245

In Rails you can use #parameterize

For example:

> "Foo bar`s".parameterize 
=> "foo-bar-s"
Mikey
grosser
54

The best way to generate slugs is to use the Unidecode gem. It has by far the largest transliteration database available. It even has transliterations for Chinese characters, and it covers all European languages (including local dialects). That makes slug creation very robust.

For example, consider these:

>> "Iñtërnâtiônàlizætiøn".to_slug
=> "internationalizaetion"

>> "中文測試".to_slug
=> "zhong-wen-ce-shi"

I use it in my version of the String#to_slug method in my ruby_extensions plugin. See ruby_extensions.rb for the to_slug method.

Paweł Gościcki
  • 4
    I think even better is the stringex gem http://github.com/rsl/stringex/tree/master because it uses this inside of it and also adds other useful things. – Rytis Lukoševičius Oct 04 '10 at 12:02
  • 3
    Do you really want to convert everything to ascii? Aren't unicode urls better for SEO? – Paul McMahon Oct 07 '10 at 11:25
  • I'm not entirely convinced about Unicode URLs. I'm not sure the time has come for them. Yet. Although many disagree. – Paweł Gościcki Oct 07 '10 at 17:46
  • The problem is that, say if I was to write Japanese, but since Japanese borrows from Chinese all the transliterations will be based on Chinese – Bill Mar 11 '12 at 08:35
  • @Bill that's very much true. Transliteration is by no means a be-all and end-all solution. You can read about its flaws in a similar gem: https://github.com/norman/unidecoder#readme – Paweł Gościcki Mar 12 '12 at 08:59
  • `4 years later` @PaulMcMahon I use https://github.com/norman/babosa it doesn't convert to Unicode. – Yana Agun Siswanto Jul 29 '14 at 23:50
38

I use the following, which:

  • translates & --> "and" and @ --> "at"
  • doesn't insert an underscore in place of an apostrophe, so "foo's" --> "foos"
  • doesn't produce double underscores
  • doesn't create a slug that begins or ends with an underscore

  def to_slug
    # strip leading/trailing whitespace (returns a copy, so self is untouched)
    ret = self.strip

    # blow away apostrophes
    ret.gsub!(/['`]/, "")

    # @ --> at, and & --> and
    ret.gsub!(/\s*@\s*/, " at ")
    ret.gsub!(/\s*&\s*/, " and ")

    # replace everything except letters, digits, periods and hyphens with an underscore
    ret.gsub!(/\s*[^A-Za-z0-9.\-]\s*/, "_")

    # collapse runs of underscores into a single one
    ret.gsub!(/_+/, "_")

    # strip off leading/trailing underscores and periods
    ret.gsub!(/\A[_.]+|[_.]+\z/, "")

    ret
  end

so, for example:


>> s = "mom & dad @home!"
=> "mom & dad @home!"
>> s.to_slug
=> "mom_and_dad_at_home"
klochner
  • but I find the result dissatisfying. For my example above, you would get mom-dad-home. I like to try preserving symbol meanings. – klochner Aug 19 '09 at 20:48
  • 22
    According to Google, dashes are better than underscores: http://www.google.com/support/webmasters/bin/answer.py?answer=76329 `Consider using punctuation in your URLs. The URL http://www.example.com/green-dress.html is much more useful to us than http://www.example.com/greendress.html. We recommend that you use hyphens (-) instead of underscores (_) in your URLs.` – Rei Miyasaka Aug 06 '10 at 07:35
  • 1
    Interesting, they recommend hyphens but don't say why. – klochner Aug 10 '10 at 00:02
  • 21
    hyperlinks are very often underlined. If you use underscores, it's often difficult to differentiate between "a link" and "a_link". Hyphens don't have that issue. – kikito Jul 05 '11 at 20:40
  • 2
    +1 on the cool trick of changing @+&. You can drop the non-alpha (et al.) removal by using `.parameterize` though. – Soup Jul 11 '12 at 16:41
  • Anywhere you would use a space or hyphen in English, you should use a hyphen in a URI. – NARKOZ Sep 17 '12 at 07:46
9

Here is what I use:

class User < ActiveRecord::Base
  before_create :make_slug

  private

  def make_slug
    # 0-9 rather than 1-9, so zeros in the name are kept
    self.slug = name.downcase.gsub(/[^a-z0-9]+/, '-').chomp('-')
  end
end

Pretty self-explanatory. The only problem is that if the same slug already exists, it won't become name-01 or something like that.

Example:

".downcase.gsub(/[^a-z0-9]+/, '-').chomp('-')".downcase.gsub(/[^a-z0-9]+/, '-').chomp('-')

Outputs: -downcase-gsub-a-z0-9-chomp (chomp only removes a trailing hyphen, so the leading one stays)
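The duplicate problem mentioned above could be handled along these lines. This is only a sketch: `unique_slug` is a hypothetical helper, and the `taken` set stands in for whatever lookup you have (in a Rails model that would presumably be a query such as `exists?(slug: candidate)`, which is an assumption here, not part of the answer's code).

```ruby
require 'set'

# Hypothetical helper: returns `base` if it is free, otherwise the first
# free candidate among base-2, base-3, ...
def unique_slug(base, taken)
  return base unless taken.include?(base)

  n = 2
  n += 1 while taken.include?("#{base}-#{n}")
  "#{base}-#{n}"
end

taken = Set.new(%w[donald-e-knuth donald-e-knuth-2])
puts unique_slug('donald-e-knuth', taken) # donald-e-knuth-3
puts unique_slug('alan-kay', taken)       # alan-kay
```

Against a real database, two requests can still pick the same candidate at the same time, so a unique index on the slug column is worth having as a backstop.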

Garrett
6

The Unidecoder gem hasn't been updated since 2007.

I'd recommend the stringex gem, which includes the functionality of the Unidecoder gem.

https://github.com/rsl/stringex

Looking at its source code, it seems to repackage the Unidecoder source code and add new functionality.

Tilo
6

The main issue for my apps has been the apostrophes - rarely do you want the -s sitting out there on its own.

class String

  def to_slug
    self.gsub(/['`]/, "").parameterize
  end

end
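
To see the effect without loading Rails, here is a small sketch. `rough_parameterize` is a hypothetical, simplified stand-in for ActiveSupport's `parameterize` (it does no transliteration and treats underscores as separators), so take it as an approximation only.

```ruby
# Hypothetical, simplified stand-in for String#parameterize:
# lowercases, turns runs of non-alphanumerics into "-", trims hyphens.
def rough_parameterize(str)
  str.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
end

title = "Knuth's Art of Programming"

# Without stripping, the apostrophe becomes a separator
puts rough_parameterize(title)                  # knuth-s-art-of-programming

# Stripping apostrophes first keeps the possessive attached
puts rough_parameterize(title.gsub(/['`]/, '')) # knuths-art-of-programming
```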
Mark Swardstrom
6

I modified it a bit to create dashes instead of underscores, if anyone is interested:

def to_slug(param=self.slug)
  # strip the string
  ret = param.strip

  # blow away apostrophes
  ret.gsub!(/['`]/, "")

  # @ --> at, and & --> and
  ret.gsub!(/\s*@\s*/, " at ")
  ret.gsub!(/\s*&\s*/, " and ")

  # replace everything except letters, digits and periods (underscores included) with a dash
  ret.gsub!(/\s*[^A-Za-z0-9.]\s*/, "-")

  # collapse any remaining runs of dashes/underscores
  ret.gsub!(/[-_]{2,}/, "-")

  # convert repeated dashes to a single dash
  ret.gsub!(/-+/, "-")

  # strip off leading/trailing dashes and periods
  ret.gsub!(/\A[-.]+|[-.]+\z/, "")

  ret
end
Victor S
3

We use to_slug http://github.com/ludo/to_slug/tree/master. Does everything we need it to do (escaping 'funky characters'). Hope this helps.

EDIT: Seems to be breaking my link, sorry about that.

theIV
3

Recently I had the same dilemma.

Since, like you, I don't want to reinvent the wheel, I chose friendly_id following the comparison on The Ruby Toolbox: Rails Permalinks & Slugs.

I based my decision on:

  • number of GitHub watchers
  • number of GitHub forks
  • date of the last commit
  • number of downloads

Hope this helps in making the decision.

Marius Butuc
2

I found the Unidecode gem to be much too heavyweight, loading nearly 200 YAML files, for what I needed. I knew iconv had some support for the basic translations, and while it isn't perfect, it's built in and fairly lightweight. This is what I came up with:

require 'iconv' # unless you're in Rails or already have it loaded (a separate gem since Ruby 2.0)
def slugify(text)
  # Downcase a copy so the caller's string isn't modified
  text = text.downcase
  text = Iconv.conv('ASCII//TRANSLIT//IGNORE', 'UTF-8', text)

  # Replace whitespace characters with hyphens, avoiding duplication
  text.gsub!(/\s+/, '-')

  # Remove anything that isn't alphanumeric or a hyphen
  text.gsub!(/[^a-z0-9-]+/, '')

  # Chomp a trailing hyphen
  text.chomp('-')
end

Obviously you should probably add it as an instance method on any objects you'll be running it on, but for clarity, I didn't.

coreyward
  • I loved this solution. Thanks! – kikito Jul 05 '11 at 20:53
  • just made some improvements for your code: require 'iconv' require 'active_support/core_ext/string' String.class_eval do def slugify Iconv.conv('ASCII//TRANSLIT//IGNORE', 'UTF8', self.mb_chars.downcase).parameterize end end – Dmytro Jul 12 '14 at 21:52
  • @Dmitry Yeah, “improvements”, like adding a method to every instance of String or subclass thereof. You should go ahead and include ActiveModel in there while you're at it, because non-primitives are scary! – coreyward Jul 12 '14 at 22:15
  • @coreyward it is not important what class I used String class or utils, I added mb_chars call to make possible using not just english language, and replaced your regexs with parameterize, because it worked better for me. – Dmytro Jul 13 '14 at 06:11
  • Hmm. `mb_chars` shouldn't be necessary because Iconv is stripping anything that isn't ASCII, and ActiveSupport's `parameterize` is almost identical to what I have here. What input are you working with? – coreyward Jul 13 '14 at 23:44
  • @coreyward mb_chars is necessary to make downcase work with non-English chars; iconv doesn't transliterate uppercase non-English chars correctly. So after some time experimenting with iconv I moved to the stringex gem. It produces much more useful strings. – Dmytro Jul 17 '14 at 19:17
  • @Dmitry My use case did not entail handling more than common, non-english characters like é, thus I wanted to avoid the weight of Unidecode. If you're handling characters that are outside the capabilities of Iconv you might just want to use Unidecode instead of re-inventing the wheel, though. – coreyward Jul 17 '14 at 19:59
  • @coreyward your solution is fine for many cases, Iconv works fast and almost with all languages, but I wanted more accurate results. – Dmytro Jul 17 '14 at 20:31
0

With Rails 3, I've created an initializer, slug.rb, in which I've put the following code:

class String
  def to_slug
    ActiveSupport::Inflector.transliterate(self.downcase).gsub(/[^a-zA-Z0-9]+/, '-').gsub(/-{2,}/, '-').gsub(/\A-|-\z/, '')
  end
end

Then I can use it anywhere in the code, since it is defined for every string.

The transliterate call transforms things like é, á, ô into e, a, o. As I am developing a site in Portuguese, that matters.
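
For readers without ActiveSupport at hand, the accent-stripping step can be roughly imitated in plain Ruby. This is a different technique from `transliterate`: it decomposes characters with Unicode NFD normalization and drops the combining marks. `strip_accents` and `plain_slug` are hypothetical names, and letters with no decomposition (æ, ø, ß and so on) are simply removed rather than transliterated.

```ruby
# Hypothetical helper (not part of Rails): approximates transliteration by
# decomposing accented letters (NFD) and dropping the combining marks.
def strip_accents(str)
  str.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
end

def plain_slug(str)
  strip_accents(str.downcase)
    .gsub(/[^a-z0-9]+/, '-') # runs of anything else become one hyphen
    .gsub(/\A-+|-+\z/, '')   # trim leading/trailing hyphens
end

puts plain_slug("Programação é ótima") # programacao-e-otima
```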

Thiago Ganzarolli
  • 2
    FYI: There is already [String#parameterize](http://api.rubyonrails.org/classes/ActiveSupport/Inflector.html#method-i-parameterize) included in ActiveSupport – haraldmartin Sep 06 '12 at 13:40
-2

I know this question is fairly old, but I see some relatively new answers.

Saving the slug in the database is problematic: you store redundant information that is already there. If you think about it, there is no reason to save the slug. The slug should be logic, not data.

I wrote a post following this reasoning, and I hope it is of some help.

http://blog.ereslibre.es/?p=343
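
One common way to treat the slug as logic, which may or may not be what the linked post describes, is to derive it from the title on every call and put the record id in front of it. A minimal plain-Ruby sketch, where `Post` is a hypothetical stand-in for a model (in Rails the `to_param` method is what the URL helpers would call):

```ruby
# Hypothetical plain-Ruby stand-in for a model class.
Post = Struct.new(:id, :title) do
  # Derive the slug from the title every time instead of storing it.
  def slug
    title.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
  end

  # "42-donald-e-knuth": the id does the lookup, the slug is decoration.
  def to_param
    "#{id}-#{slug}"
  end
end

post = Post.new(42, 'Donald E. Knuth')
puts post.to_param      # 42-donald-e-knuth
puts post.to_param.to_i # 42 (String#to_i stops at the first non-digit)
```

The trade-off is that the URL carries the id, so you get /posts/42-donald-e-knuth rather than the bare /posts/donald-e-knuth from the question.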

ereslibre