I am after a pure bash solution to "slugify" a variable, ideally one that is not as ugly as mine.

By "slugify" I mean: lowercased, shortened to 63 bytes, and with everything except 0-9 and a-z replaced with -, with no leading or trailing -. The result should be a string suitable for use in URL hostnames and domain names. The input is most likely a series of words with undesired characters scattered throughout, such as:

'Effrafax_mUKwT'uP7(Garkbit<\1}@NJ"RJ"Hactar*S;-H%x.?oLazlarl(=Zss@c9?qick.:?BZarquonelW{x>g@'k'

Of which a slug would look like: 'effrafax-mukwt-up7-garkbit-1-njrjhactar-s-h-x-olazlarl-zss-c9-q'

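shopt -s extglob  # needed for the +(...) pattern below

slugify () {
  local next
  next=${1//+([^A-Za-z0-9])/-}  # each run of unwanted characters becomes one -
  next=${next:0:63}
  next=${next,,}
  next=${next#-}
  next=${next%-}
  echo "$next"
}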

Also, why doesn't ${next//^-|-$} strip the leading and trailing '-'? Any other suggestions?

Tom

2 Answers

I'm using this function in my bash profile:

slugify () {
    echo "$1" | iconv -t ascii//TRANSLIT | sed -r 's/[~^]+//g' | sed -r 's/[^a-zA-Z0-9]+/-/g' | sed -r 's/^-+|-+$//g' | tr A-Z a-z
}

Based on: https://gist.github.com/oneohthree/f528c7ae1e701ad990e6

okdewit
    Works great! Add `-c` to `iconv` to silently discard characters that cannot be converted. https://ss64.com/bash/iconv.html – Illya Moskvin Dec 23 '18 at 11:31

An OS X and Linux compatible variant of the answer above:

slugify () {
    echo "$1" | iconv -c -t ascii//TRANSLIT | sed -E 's/[~^]+//g' | sed -E 's/[^a-zA-Z0-9]+/-/g' | sed -E 's/^-+|-+$//g' | tr A-Z a-z
}
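
Since the question asks for pure bash, here is a minimal sketch that avoids external tools. It is not taken from either answer: the name slugify_pure is hypothetical, and it assumes bash 4 or later (for ${var,,}) with extglob enabled. It also bears on the ${next//^-|-$} question: parameter expansion matches glob patterns, not regular expressions, so ^ and $ are not anchors there. The ${var#pattern} and ${var%pattern} forms do the anchoring at the start and end instead. Unlike the iconv pipelines, this sketch does not transliterate accented characters; it simply replaces them with -.

```shell
shopt -s extglob   # enables the +(...) pattern used below

# slugify_pure is a hypothetical name; assumes bash 4+ for ${var,,}.
slugify_pure () {
    local LC_ALL=C             # assumption: byte-wise matching, so a-z means ASCII only
    local s=${1,,}             # lowercase
    s=${s//+([^a-z0-9])/-}     # each run of other characters becomes a single -
    s=${s#-}                   # drop a leading -
    s=${s:0:63}                # keep at most 63 bytes
    s=${s%-}                   # drop a trailing -, including one exposed by the cut
    printf '%s\n' "$s"
}
```

For example, slugify_pure '  Hello, World! ' prints hello-world.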
D. Naumov