31

I have problems with removing special characters. I want to remove all special characters except "( ) / . % - &", because I'm setting that string as a title.

I edited the original code (see below):

preg_replace('/[^a-zA-Z0-9_ -%][().][\/]/s', '', $String);

But this does not remove special characters such as "’s", "“", and "â€", among others.

Original code (this works, but it also removes these characters: "( ) / . % - &"):

preg_replace('/[^a-zA-Z0-9_ -]/s', '', $String);
tonyjmnz
user453089
  • 2
    These „special characters“ seem to be encoded character sequences of a multi-byte character encoding like UTF-8. `’` is the result when the character `’` (U+2019) is encoded in UTF-8 (0xE28099) and interpreted with a single-byte character encoding like [Windows-1252](http://en.wikipedia.org/wiki/Windows-1252). – Gumbo May 20 '11 at 14:24
  • 1
    I was actually looking for this: `preg_replace('/[^a-zA-Z0-9_ -]/s', '', $String);`. Thank you! – hitautodestruct Jul 23 '13 at 13:57
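
The encoding mix-up described in the comment above can be reproduced with a short sketch. It assumes the mbstring extension is available and PHP 7 or later (for the `\u{...}` escape); the variable names are illustrative only.

```php
<?php
// U+2019 RIGHT SINGLE QUOTATION MARK, stored as UTF-8.
$quote = "\u{2019}";

// Its UTF-8 representation is the three bytes E2 80 99.
$bytes = bin2hex($quote);

// Treat those same three bytes as Windows-1252 text and convert to UTF-8.
// Each byte becomes its own character: E2 => â, 80 => €, 99 => ™.
$garbled = mb_convert_encoding($quote, 'UTF-8', 'Windows-1252');

echo $bytes, "\n";   // e28099
echo $garbled, "\n"; // ’
```

If the title contains sequences like this, the input was probably decoded with the wrong character set somewhere upstream, and fixing that is usually preferable to stripping the bytes afterwards.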

8 Answers

56

Your dot is matching all characters. Escape it (and the other special characters), like this:

preg_replace('/[^a-zA-Z0-9_ %\[\]\.\(\)%&-]/s', '', $String);
Luke Sneeringer
  • 4
    You don’t need to escape the `[`, `.`, `(`, and `)` inside a character class. – Gumbo May 20 '11 at 14:20
  • -1 Your explanation is wrong, but your regex does (accidentally) work because you put the `-` at the end of your character class, and additional escaping usually does not hurt. @user453089's problem is the part `_ -%`, where he created a range from space to `%`. I also don't understand why it worked at all, because he created 3 character classes in a row. – stema Oct 11 '12 at 07:31
  • 2
    Actually, I moved the hyphen to the end of the character class on purpose, but you're right that I didn't call it out. – Luke Sneeringer Dec 13 '12 at 17:05
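
The two points raised in these comments (the accidental range ` -%` and the three character classes in a row) can be checked with a small sketch. The sample strings are made up for illustration.

```php
<?php
$input = 'a!b"c#d$e-f';

// Hyphen between two characters: " -%" is a range from space (0x20) to "%" (0x25),
// so ! " # $ count as allowed, while the hyphen itself is not allowed and is removed.
$asRange = preg_replace('/[^a-z -%]/', '', $input);

// Hyphen at the end of the class: it is a literal "-", so it is kept,
// and ! " # $ are removed.
$asLiteral = preg_replace('/[^a-z %-]/', '', $input);

// Three classes in a row match only a three-character sequence:
// one disallowed character, then one of ( ) ., then a slash.
$threeClasses = preg_replace('/[^a-zA-Z0-9_ -%][().][\/]/s', '', 'x*(/y and a*b');

echo $asRange, "\n";      // a!b"c#d$ef
echo $asLiteral, "\n";    // abcde-f
echo $threeClasses, "\n"; // xy and a*b
```

In the last call, the lone `*` in `a*b` survives because it is not followed by a bracket or dot and then a slash.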
15
preg_replace('#[^\w()/.%\-&]#',"",$string);
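
A quick check of what this pattern keeps, as a sketch with a made-up sample title. Note that the class contains no space, so spaces are stripped too; the second pattern is a hypothetical variant (not from the answer) with a space added to the allowed list.

```php
<?php
$string = 'Tom & Jerry (50%) - part 1/2!';

// Pattern from the answer: \w covers letters, digits and underscore.
$noSpaces = preg_replace('#[^\w()/.%\-&]#', '', $string);

// Variant with a space added to the class (an assumption about what the asker wants).
$withSpaces = preg_replace('#[^\w ()/.%\-&]#', '', $string);

echo $noSpaces, "\n";   // Tom&Jerry(50%)-part1/2
echo $withSpaces, "\n"; // Tom & Jerry (50%) - part 1/2
```

Using `#` as the delimiter means the slash inside the class does not need escaping.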
Thomas Hupkens
6

You want `str_replace`, because performance-wise it's much cheaper and still fits your needs!

$title = str_replace( array( '\'', '"', ',' , ';', '<', '>' ), ' ', $rawtitle);

(Unless this is all about security and SQL injection; in that case, I'd rather go with a POSITIVE list of ALLOWED characters... even better, stick with tested, proven routines.)

Btw, since the OP talked about title-setting: I wouldn't replace special chars with nothing, but with a space. A superfluous space is less of a problem than two words glued together...
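
A minimal sketch combining the two suggestions above: a positive list of allowed characters, with everything else replaced by a space rather than removed. It uses `preg_replace` instead of `str_replace`, and `clean_title` is a hypothetical helper name, not something from the question. PHP 7 or later is assumed for the type declarations.

```php
<?php
// Hypothetical helper, for illustration only.
function clean_title(string $raw): string
{
    // Replace every run of characters outside the allowed list with one space.
    // The hyphen sits at the end of the class so it is taken literally.
    $spaced = preg_replace('/[^a-zA-Z0-9_ ()\/.%&-]+/', ' ', $raw);

    // Collapse repeated spaces and trim the ends.
    return trim(preg_replace('/ {2,}/', ' ', $spaced));
}

echo clean_title('Rock*n*Roll (50% off) - A&B/C.'), "\n";
// Rock n Roll (50% off) - A&B/C.
```

Replacing with a space keeps "Rock", "n" and "Roll" as separate words instead of gluing them into "RocknRoll".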

Frank N
5

Good try! I think you just have to make a few small changes:

  • Escape the square brackets ([ and ]) inside the character class (which is itself delimited by [ and ])
  • Escape the escape character (\) itself
  • Plus there's a quirk where - is special: if it's between two characters, it means a range, but if it's at the beginning or the end, it means the literal - character.

You'll want something like this:

preg_replace('/[^a-zA-Z0-9_%\[().\]\\/-]/s', '', $String);

See http://docs.activestate.com/activeperl/5.10/lib/pods/perlrecharclass.html#special_characters_inside_a_bracketed_character_class if you want to read up further on this topic.
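
To see how the escaping in this pattern plays out, here is a small sketch; the sample input is made up. In a single-quoted PHP string, `\\` collapses to one backslash before the regex engine ever sees the pattern, so `\\/` arrives as `\/` (an escaped delimiter).

```php
<?php
$pattern = '/[^a-zA-Z0-9_%\[().\]\\/-]/s';

// Both literals produce the same string, because PHP leaves a lone
// backslash untouched unless it precedes a quote or another backslash.
$same = ($pattern === '/[^a-zA-Z0-9_%\[().\]\/-]/s');

// Brackets, slash and hyphen are kept; the space and "*" are not in the class.
$out = preg_replace($pattern, '', 'a[b]/c-d e*f');

var_dump($same);  // bool(true)
echo $out, "\n";  // a[b]/c-def
```

Note that this pattern has no space in its allowed list, so spaces are removed as well.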

Anonymoose
2
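<?php
// Sample input containing punctuation and HTML tags
$string = '`~!@#$%^&^&*()_+{}[]|\/;:"< >,.?-<h1>You .</h1><p> text</p>'."'";
// Remove HTML tags first
$string = strip_tags($string);
// Keep only letters, digits, whitespace, dots and hyphens
$string = preg_replace('/[^A-Za-z0-9\s.-]/', '', $string);
// Finally drop the remaining hyphens and dots as well
echo $string = str_replace(array('-', '.'), '', $string);
?>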
0
mysqli_set_charset($con,"utf8");
$title = ' LEVEL – EXTENDED'; 
$newtitle = preg_replace('/[^(\x20-\x7F)]*/','', $title);     
echo $newtitle;

Result :  LEVEL EXTENDED

Many strange characters can be avoided by calling `mysqli_set_charset` right after the MySQL connection code. In cases where strange characters such as â€ still appear, you can remove them with the `preg_replace` pattern above.
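
A self-contained sketch of the byte-range idea, without the database connection. The pattern here is a simplified variant, not the answer's exact one: it drops the parentheses and the `*` (which are not needed inside or after the class) and stops at `\x7E` so the DEL control character is removed too. PHP 7 or later is assumed for the `\u{...}` escape.

```php
<?php
// A title containing an en dash (U+2013), which is three bytes in UTF-8.
$title = "LEVEL \u{2013} EXTENDED";

// Without the /u modifier the pattern works byte by byte, so every byte
// outside printable ASCII (0x20 to 0x7E) is removed.
$ascii = preg_replace('/[^\x20-\x7E]+/', '', $title);

var_dump($ascii); // string(15) "LEVEL  EXTENDED" (two spaces remain where the dash was)
```

The leftover double space can be collapsed in a second step if the title should not contain it.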

vs97
Solomon Suraj
0
preg_replace('/[^a-zA-Z0-9_ \-()\/%-&]/s', '', $String);
Alex
-1

See example.

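/**
 * nv_get_plaintext()
 * Converts an HTML string to plain text, optionally keeping image and link URLs.
 *
 * @param string $string     HTML input
 * @param bool   $keep_image Keep the src and alt text of <img> tags
 * @param bool   $keep_link  Keep the href and link text of <a> tags
 * @return string
 */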
function nv_get_plaintext( $string, $keep_image = false, $keep_link = false )
{
    // Get image tags
    if( $keep_image )
    {
        if( preg_match_all( "/\<img[^\>]*src=\"([^\"]*)\"[^\>]*\>/is", $string, $match ) )
        {
            foreach( $match[0] as $key => $_m )
            {
                $textimg = '';
                if( strpos( $match[1][$key], 'data:image/png;base64' ) === false )
                {
                    $textimg = " " . $match[1][$key];
                }
                if( preg_match_all( "/\<img[^\>]*alt=\"([^\"]+)\"[^\>]*\>/is", $_m, $m_alt ) )
                {
                    $textimg .= " " . $m_alt[1][0];
                }
                $string = str_replace( $_m, $textimg, $string );
            }
        }
    }

    // Get link tags
    if( $keep_link )
    {
        if( preg_match_all( "/\<a[^\>]*href=\"([^\"]+)\"[^\>]*\>(.*)\<\/a\>/isU", $string, $match ) )
        {
            foreach( $match[0] as $key => $_m )
            {
                $string = str_replace( $_m, $match[1][$key] . " " . $match[2][$key], $string );
            }
        }
    }

    $string = str_replace( ' ', ' ', strip_tags( $string ) );
    return preg_replace( '/[ ]+/', ' ', $string );
}
spenibus
binkute