73

I have installed PHP 8.1 and I started testing my old project. I have used the filter FILTER_SANITIZE_STRING like so:

$username = filter_input(INPUT_POST, 'username', FILTER_SANITIZE_STRING);

Now I get this error:

Deprecated: Constant FILTER_SANITIZE_STRING is deprecated

The same happens when I use FILTER_SANITIZE_STRIPPED:

Deprecated: Constant FILTER_SANITIZE_STRIPPED is deprecated

What can I replace it with?

Dharman

5 Answers

76

This filter had an unclear purpose. It's difficult to say what exactly it was meant to accomplish or when it should be used. It was also confused with the default string filter, due to its name, when in reality the default string filter is called FILTER_UNSAFE_RAW. The PHP community decided that the usage of this filter should not be supported anymore.

The behaviour of this filter was very unintuitive. It removed everything between < and the end of the string or until the next >. It also removed all NUL bytes. Finally, it encoded ' and " into their HTML entities.

If you want to replace it, you have a couple of options:

  1. Use the default string filter FILTER_UNSAFE_RAW that doesn't do any filtering. This should be used if you had no idea about the behaviour of FILTER_SANITIZE_STRING and you just want to use a default filter that will give you the string value.

  2. If you used this filter to protect against XSS vulnerabilities, then replace its usage with htmlspecialchars(). Don't call this function on the input data. To protect against XSS you need to encode the output!

  3. If you knew exactly what that filter does and you want to create a polyfill, you can do that easily with regex.

    function filter_string_polyfill(string $string): string
    {
        $str = preg_replace('/\x00|<[^>]*>?/', '', $string);
        return str_replace(["'", '"'], ['&#39;', '&#34;'], $str);
    }
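
To make the behaviour described above concrete, here is a small sketch that runs the polyfill on a few sample strings. The function body is copied from this answer so the snippet runs on its own; the sample inputs are made up for illustration.

```php
<?php
declare(strict_types=1);

// Copied from the answer so this snippet is self-contained.
function filter_string_polyfill(string $string): string
{
    // Remove NUL bytes and everything from "<" up to the next ">" (or end of string).
    $str = preg_replace('/\x00|<[^>]*>?/', '', $string);
    // Encode single and double quotes as numeric HTML entities.
    return str_replace(["'", '"'], ['&#39;', '&#34;'], $str);
}

// Illustrative inputs (assumptions, not taken from the question).
$samples = [
    'Hello <b>world</b>',       // complete tags are removed
    'a < b and more text',      // an unclosed "<" swallows the rest of the string
    "O'Reilly said \"hi\"",     // quotes are encoded
    "nul\0byte",                // NUL bytes are removed
];

foreach ($samples as $sample) {
    var_dump(filter_string_polyfill($sample));
}
```

The second sample shows why the filter was considered unintuitive: a lone `<` in ordinary text silently discards everything after it.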
    

Don’t try to sanitize input. Escape output.
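
As a rough sketch of what "escape output" can look like in practice: read the value unmodified, and encode it only at the point where it is written into HTML. The helper name `e()` and the sample value are assumptions made for this example; in a real request the value would come from `filter_input(INPUT_POST, 'username', FILTER_UNSAFE_RAW)`.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (name chosen for this example): encode a value for an HTML context.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Stand-in for request data. FILTER_UNSAFE_RAW returns the string unchanged.
$username = filter_var('<script>alert(1)</script>', FILTER_UNSAFE_RAW);

// Store or process $username as plain text; encode only when printing it into HTML.
echo '<p>Hello, ' . e($username) . '</p>', PHP_EOL;
```

The stored value stays intact, so it can later be written to other contexts (JSON, CSV, a database) with whatever formatting each context needs.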

Dharman
  • 4
    Note `htmlspecialchars` might not be a one-to-one replacement. `FILTER_SANITIZE_STRING` strips tags. `strip_tags` would be useful in that case. – Louis Charette Jan 01 '22 at 22:11
  • 1
    @LouisCharette If you need one to one replacement, you can use the polyfill I created. `strip_tags` is also not one to one replacement. – Dharman Jan 01 '22 at 22:12
  • 2
    I understand the need to encode the output. However, the decision not to filter the input and to use FILTER_UNSAFE_RAW is, in my opinion, a very bad one. Nobody wants to have a total mess in the database. I am definitely missing FILTER_SANITIZE_STRING a lot. – Ondrej Jul 14 '22 at 13:41
  • 2
    @Dharman, how about `FILTER_SANITIZE_SPECIAL_CHARS` as a replacement for `FILTER_SANITIZE_STRING`? Alternatively, looping through the [values](https://stackoverflow.com/questions/3794465/filtering-form-inputs/3794543#3794543). – Motivated Oct 23 '22 at 22:34
  • @motivated It's not an exact replacement because it doesn't remove data as FILTER_SANITIZE_STRING did. But it's ok to use when printing the data out to HTML. Just make sure you are not using this on input values like $_POST or $_GET – Dharman Oct 24 '22 at 08:57
  • @Dharman, do you mean avoiding `FILTER_SANITIZE_SPECIAL_CHARS` for `$_POST` and/or `$_GET`? What filter should be used with `filter_var`? – Motivated Oct 24 '22 at 09:28
  • @Motivated Did you read the article linked at the end of my post? You should never use HTML formatting on the input data. HTML formatting should only be used when outputting to HTML. All `FILTER_SANITIZE*` filters are pretty useless and if I were you I would avoid them like fire. If you want to protect your site against XSS, use `htmlspecialchars` for HTML. But of course, you need to know where you output the data and format it accordingly. There's no magic sanitize solutions to protect you from evil. – Dharman Oct 24 '22 at 09:38
  • @Dharman, I'm confused. Do you mean that I should never sanitize `$_POST` and `$_GET` and instead use `htmlspecialchars` when outputting the results? – Motivated Oct 26 '22 at 21:07
  • 1
    @Motivated Yes, that's exactly what I said. Validate input, format output, and never sanitize (remove parts of data) anything. – Dharman Oct 26 '22 at 21:08
  • @Dharman, if we are never sanitizing for example `$_GET`, doesn't that potentially introduce a vector to be exploited? As an example, it's suggested [here](https://wordpress.stackexchange.com/questions/64967/how-to-properly-validate-data-from-get-or-request-using-wordpress-functions) – Motivated Oct 30 '22 at 07:44
  • @motivated No, if you are using other proper security measures. You are being very vague in your question. There are lots of attack vectors and they have a proper security fix too. Sanitizing isn't one of them – Dharman Oct 30 '22 at 08:40
  • Even though the old `FILTER_SANITIZE_STRING` is a bit weird, it does something resembling `strip_tags`, which was very good to avoid stored XSS. The "Don't sanitize input, escape output" mention is foolish, we don't want to store XSS. Therefore: 1) Sanitize input; 2) Validate input and 3) Escape output, always in that order. – royarisse Jul 07 '23 at 19:40
  • @royarisse It's foolish to try to sanitize input, as you can't avoid all XSS this way. XSS happens when you output. Everything before then is just text. – Dharman Jul 08 '23 at 10:38
48

The closest constant you can use instead, if you intend to convert your variable into a safe HTML string, is FILTER_SANITIZE_FULL_SPECIAL_CHARS.
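
A minimal sketch of how this filter behaves, using a made-up sample string. Unlike FILTER_SANITIZE_STRING it does not remove tags; it encodes the special characters so the markup is displayed as text.

```php
<?php
declare(strict_types=1);

// Illustrative input (an assumption for this example).
$input = "<b>O'Reilly</b>";

// Tags are kept but HTML-encoded; quotes are encoded as well.
$encoded = filter_var($input, FILTER_SANITIZE_FULL_SPECIAL_CHARS);

var_dump($encoded);
```

Because the result is HTML-encoded, it is suited to printing into an HTML page rather than to storing as the canonical value.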

Chouettou
  • This worked for me, so just to clarify for everyone else (because there is a difference): From W3: "The FILTER_SANITIZE_STRING filter removes tags and remove(s) or encode(s) special characters from a string." versus "The FILTER_SANITIZE_SPECIAL_CHARS filter HTML-escapes special characters." – Design.Garden May 31 '23 at 20:15
8

According to the documentation, you should replace it with htmlspecialchars().

FILTER_SANITIZE_STRING

Strip tags and HTML-encode double and single quotes, optionally strip or encode special characters. Encoding quotes can be disabled by setting FILTER_FLAG_NO_ENCODE_QUOTES. (Deprecated as of PHP 8.1.0, use htmlspecialchars() instead.)

FILTER_SANITIZE_STRIPPED

Alias of "string" filter. (Deprecated as of PHP 8.1.0, use htmlspecialchars() instead.)

francisco
-1

This is how I replaced the FILTER_SANITIZE_STRING constant. This way you can also use flags.

I had some PHP backend tests, and they work fine with this method. Improvements are appreciated.

/**
 * @param string $value
 * @param array $flags
 * @return string
 */
private static function sanitizeFilterString($value, array $flags): string
{
    $noQuotes = in_array(FILTER_FLAG_NO_ENCODE_QUOTES, $flags);
    $options = ($noQuotes ? ENT_NOQUOTES : ENT_QUOTES) | ENT_SUBSTITUTE;
    $optionsDecode = ($noQuotes ? ENT_QUOTES : ENT_NOQUOTES) | ENT_SUBSTITUTE;

    // Strip the tags
    $value = strip_tags($value);

    // Run the replacement for FILTER_SANITIZE_STRING
    $value = htmlspecialchars($value, $options);

    // htmlspecialchars() emits &quot; and &#039;, whereas FILTER_SANITIZE_STRING emitted the entity numbers &#34; and &#39;, so map them over
    // https://stackoverflow.com/questions/64083440/use-php-htmlentities-to-convert-special-characters-to-their-entity-number-rather
    $value = str_replace(["&quot;", "&#039;"], ["&#34;", "&#39;"], $value);

    // Decode all entities
    return html_entity_decode($value, $optionsDecode);
}
th3_sh0w3r
-7

It can easily be replaced with this:

FILTER_UNSAFE_RAW
Jodyshop
  • 1
    It CAN be replaced with that, but the behavior is different. If the FILTER_SANITIZE_STRING was being relied upon to protect against malicious input, switching as you recommend will remove that protection. – thelr May 01 '23 at 13:33