9

I want to convert any title, e.g. of a blog entry, to a user-friendly URL. I used rawurlencode() to do that, but it gives me a lot of strange percent-encoded strings like %C3%96.

The algorithm should handle German characters like Ö, Ä, etc. I want to make a URL from the title and be able to get the title back by decoding the URL.

I tried some of this code: http://pastebin.com/L1SwESBn, which is provided in some other questions, but it seems to work only one way.

E.g. HÖRZU.de -> hoerzu-de -> HÖRZU.de

Any ideas?

Jon
DarkLeafyGreen

4 Answers

8

You want to create slugs, but from experience I can tell you that the decoding possibilities are limited. For example, "Foo - Bar" will become "foo-bar", so how can you possibly know that it wasn't "foo bar" or "foo-bar" all along?

Or how about characters that you don't want in your slug and that have no representation, like " ` "? So you can either use a 1-to-1 conversion like rawurlencode(), or you can create a slug. Here is an example function, but as I said, no reliable decoding is possible. That is in its nature, since you have to throw away information.

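function sanitizeStringForUrl($string){
    // Decode HTML entities first, so that e.g. &Ouml; becomes Ö before lowercasing
    $string = html_entity_decode($string, ENT_QUOTES, 'UTF-8');
    // strtolower() does not reliably lowercase Ö, Ä, Ü; mb_strtolower() (mbstring) does
    $string = mb_strtolower($string, 'UTF-8');
    // Transliterate German umlauts and ß
    $string = str_replace(array('ä','ü','ö','ß'),array('ae','ue','oe','ss'),$string);
    // Drop everything except ASCII letters, digits, underscores and whitespace
    $string = preg_replace('#[^a-z0-9_\s]#','',$string);
    // Collapse whitespace runs into single hyphens
    $string = preg_replace('#\s+#','-',trim($string));
    return $string;
}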
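
To make the trade-off concrete, here is a minimal sketch contrasting the two routes. It assumes PHP 7+ with the mbstring extension and a UTF-8 source file. `demoSlug()` is a hypothetical, simplified helper written only for this illustration (it is not the function above, and it turns punctuation into hyphens instead of dropping it).

```php
<?php
// Route 1: 1-to-1 conversion. Ugly in the URL, but fully reversible.
$title   = 'HÖRZU.de';
$encoded = rawurlencode($title);       // "H%C3%96RZU.de"
$decoded = rawurldecode($encoded);     // "HÖRZU.de" again

// Route 2: a slug. Readable, but information is thrown away.
// demoSlug() is a hypothetical helper for this example only.
function demoSlug(string $title): string
{
    $s = mb_strtolower($title, 'UTF-8');
    $s = str_replace(['ä', 'ö', 'ü', 'ß'], ['ae', 'oe', 'ue', 'ss'], $s);
    // Every run of characters that is not a-z or 0-9 becomes one hyphen
    $s = preg_replace('#[^a-z0-9]+#', '-', $s);
    return trim($s, '-');
}

echo $encoded, "\n";                   // H%C3%96RZU.de
echo demoSlug($title), "\n";           // hoerzu-de

// Three different titles collapse to the same slug, so no decoder can
// tell which one was the original.
var_dump(demoSlug('Foo - Bar') === demoSlug('foo bar'));   // bool(true)
var_dump(demoSlug('foo bar') === demoSlug('foo-bar'));     // bool(true)
```

The round trip in route 1 works because every byte is preserved; route 2 cannot be inverted because the mapping is many-to-one.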
Hannes
2
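function url_title($str, $separator = 'dash', $lowercase = FALSE)
{
    // Pick the word separator: '-' for 'dash', '_' otherwise
    $replace = ($separator == 'dash') ? '-' : '_';

    // Regex => replacement pairs, applied in order (case-insensitively)
    $trans = array(
        '&\#\d+?;'        => '',        // numeric HTML entities
        '&\S+?;'          => '',        // named HTML entities
        '\s+'             => $replace,  // whitespace runs become the separator
        '[^a-z0-9\-\._]'  => '',        // drop everything else; note that Ö, Ä, ß are removed, not transliterated
        $replace.'+'      => $replace,  // collapse repeated separators
        $replace.'$'      => '',        // strip a trailing separator
        '^'.$replace      => '',        // strip a leading separator
        '\.+$'            => ''         // strip trailing dots
    );

    $str = strip_tags($str);

    foreach ($trans as $key => $val)
    {
        $str = preg_replace("#".$key."#i", $val, $str);
    }

    if ($lowercase === TRUE)
    {
        $str = strtolower($str);
    }

    return trim(stripslashes($str));
}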
Dr. Dan
1

The most elegant way, I think, is to use Behat\Transliterator\Transliterator.

You need to extend this class with a class of your own because it is abstract, something like this:

<?php
use Behat\Transliterator\Transliterator;

class Urlizer extends Transliterator
{
}

And then, just use it:

$text = "Master Ápiu";
$urlizer = new Urlizer();
$slug = $urlizer->transliterate($text, "-");
echo $slug; // master-apiu

Of course, you should add the package to your Composer dependencies as well:

composer require behat/transliterator

More info here https://github.com/Behat/Transliterator

Paulo Victor
0

There is no reliable way to 'decode' the slug back to its original form. The best solution here would be to store the slug together with its original title in the database.
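
A minimal sketch of that idea, assuming PHP 7.4+ with the mbstring extension. `SlugStore` and its methods are hypothetical names invented for this example, and the array stands in for a real database table with `slug` and `title` columns. The numeric suffix for colliding slugs is one possible design choice, not the only one.

```php
<?php
// Hypothetical in-memory stand-in for a table with columns (slug, title).
final class SlugStore
{
    /** @var array<string, string> slug => original title */
    private array $titlesBySlug = [];

    // Stores the title and returns the slug under which it can be found.
    public function add(string $title): string
    {
        $base = self::slugify($title);
        $slug = $base;
        // If a different title already owns this slug, append -2, -3, ...
        for ($i = 2; isset($this->titlesBySlug[$slug]) && $this->titlesBySlug[$slug] !== $title; $i++) {
            $slug = $base . '-' . $i;
        }
        $this->titlesBySlug[$slug] = $title;
        return $slug;
    }

    // "Decoding" is a lookup, not a computation.
    public function titleFor(string $slug): ?string
    {
        return $this->titlesBySlug[$slug] ?? null;
    }

    // Simplified slug rule for this example only.
    private static function slugify(string $title): string
    {
        $s = mb_strtolower($title, 'UTF-8');
        $s = str_replace(['ä', 'ö', 'ü', 'ß'], ['ae', 'oe', 'ue', 'ss'], $s);
        $s = preg_replace('#[^a-z0-9]+#', '-', $s);
        return trim($s, '-');
    }
}

$store = new SlugStore();
$a = $store->add('HÖRZU.de');    // "hoerzu-de"
$b = $store->add('Foo - Bar');   // "foo-bar"
$c = $store->add('foo bar');     // "foo-bar-2", because "foo-bar" is taken

echo $store->titleFor($a), "\n"; // HÖRZU.de
echo $store->titleFor($c), "\n"; // foo bar
```

In a real application the lookup would be a query on an indexed, unique `slug` column, and the URL would carry only the slug.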