
I am using the following regex to check input for anything other than the allowed characters (a-zA-z0-9, a dot, a comma, a dash, a single quote, and the br tag):

<?php

$regex_char_appearance = '/([A-Za-z0-9 \-\.\,\']|(<br>))/';

?>

I have been trying to create a regex to clean user input. I just can't get it to work, so I tried different things like:

<?php

    $regex_char_appearance = '/(?!<br>)([^A-Za-z0-9 \-\.\,\'])/';

    $regex_char_appearance = '/([^A-Za-z0-9 \-\.\,\']|[^(<br>)])/';

    // remove anything other than alphanumerics and the allowed characters
    $post_char_appearance = preg_replace( $regex_char_appearance , '' , $post_char_appearance);

?>

So the goal is to use preg_replace to remove anything other than a-zA-z0-9, a dot, a comma, a dash, a single quote, and the br tag before output.

Can someone help me put a regex together that works?

Joe Boss

1 Answer


You can match your disallowed characters with [^A-Za-z0-9.,'-] (a negated character class matching any character but those defined in the class).

To keep br tags intact, match and capture it with a pair of unescaped parentheses (a grouping construct (...)) and restore with a backreference $1:

$regex_char_appearance = '~(<br\s*/?>)|[^A-Za-z0-9.,\'-]~';
$post_char_appearance = preg_replace($regex_char_appearance, '$1' , $post_char_appearance);

See the regex demo
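As a minimal sketch of how the capture-and-restore approach behaves, the snippet below wraps the pattern in a helper. The function name `clean_char_appearance` and the sample input are made up for illustration; only the pattern and the `preg_replace` call come from the answer. Note that the single quote inside the character class has to be escaped because the pattern sits in a single-quoted PHP string.

```php
<?php
// Hypothetical helper name, used only for this illustration.
function clean_char_appearance(string $input): string
{
    // Alternative 1 captures a <br>, <br/> or <br /> tag into group 1.
    // Alternative 2 matches any single character outside the allowed set.
    $regex = '~(<br\s*/?>)|[^A-Za-z0-9.,\'-]~';

    // '$1' re-inserts the captured tag. When alternative 2 matched,
    // group 1 is empty, so the disallowed character is simply dropped.
    return preg_replace($regex, '$1', $input);
}

echo clean_char_appearance("Hello,<br>it's 5-4 & done!"), "\n";
// prints: Hello,<br>it's5-4done
```

The spaces disappear in this output because the class as written does not list a space. If spaces should survive, as in the question's first pattern, a space can be added inside the brackets.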

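Note that [A-z] matches more than just the ASCII letters: the range also covers the characters [, \, ], ^, _ and ` that sit between Z and a in the ASCII table. See more on this in Why is this regex allowing a caret?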
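A quick way to see the difference, as a small sketch using only `preg_match` (the caret is an arbitrary pick from the extra characters in that range):

```php
<?php
// [A-z] spans ASCII codes 65 to 122, which includes ^ (code 94).
var_dump(preg_match('/^[A-z]$/', '^'));     // int(1): the caret is accepted

// [A-Za-z] lists the two letter ranges separately, so ^ is rejected.
var_dump(preg_match('/^[A-Za-z]$/', '^'));  // int(0)
```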

Community
Wiktor Stribiżew