
The PHP documentation page for htmlspecialchars says:

> The default is ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401.

My only knowledge of | in programming documentation comes from this MDN explanation of its use in CSS docs. That does not seem to apply to the PHP documentation, since $flags can be omitted entirely when calling htmlspecialchars().

In PHP's htmlspecialchars, I noticed that when no flag is passed, it does not convert single quotes into &#039;. This certainly implies that ENT_QUOTES is not the default value.

So, what does | mean in the PHP documentation, and what is the default value of $flags in htmlspecialchars?

user31782
  • It's an OR, of course. – RiggsFolly Feb 17 '22 at 15:11
  • @NicoHaase It does partially, however, I don't know the inbuilt values of these flags. What is the value of `ENT_QUOTES` - 0,1 or 101? And what would be the resultant of `ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401` in empty `htmlspecialchars($str)`? And in `htmlspecialchars($string, ENT_COMPAT,'ISO-8859-1', true);` would the other two defaults are already present? – user31782 Feb 17 '22 at 15:23
  • Why do you care about the values of the single constants? Why not dump them, each single one, to check which value they contain? Also, please add all clarification to your question by editing it – Nico Haase Feb 17 '22 at 15:34
  • @NicoHaase This is new info to me to grasp. But my original question is still valid, _The default is ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401_ -- what does this mean? Give me some time to add more details, I need to read the linked post first. – user31782 Feb 17 '22 at 15:36
  • So, what's your question then? What do you mean by "as $flags can be left blank"? That's basic PHP stuff to have default values for optional arguments – Nico Haase Feb 17 '22 at 15:42
  • @NicoHaase Consider [this](https://developer.mozilla.org/en-US/docs/Web/CSS/background-position#formal_syntax) example of CSS documentation. In CSS you cannot have `.someclass { background-position: ;}`, because you can't leave the value empty. `|` in CSS docs means `OR` with at least one value being mentioned, but in PHP you can call `htmlspecialchars()` without mentioning any flag. – user31782 Feb 17 '22 at 15:51
  • @user31782 I would suggest not trying to compare procedural languages like PHP and JS against specialist languages like CSS. Trying to draw parallels between them is just going to get you more confused. Instead, have a look for some introductory PHP tutorials if you're not familiar; or browse the PHP manual - for instance, [default function arguments are discussed in this section](https://www.php.net/manual/en/functions.arguments.php#functions.arguments.default). (The mention of "C++-style" is not particularly helpful, but the rest of the section is hopefully helpful.) – IMSoP Feb 17 '22 at 16:07

1 Answer


It is a bitwise OR.

It combines a set of options (internally expressed as numbers) into a single value that means "All of these options combined".


Let's look at how that works.

Consider that you might have flags:

FLAG_A = 1
FLAG_B = 2
FLAG_C = 4

In binary that would be

FLAG_A = 001
FLAG_B = 010
FLAG_C = 100

So if you had

FLAG_A | FLAG_C

You'd get 101, and testing that value against FLAG_A or FLAG_C with a bitwise AND (&) gives a non-zero result, while testing it against FLAG_B gives 0.

In decimal it is represented as 5, but the point of flags like these is to store a combination of yes/no options in a compact form.

Here's a practical example in JS (this kind of bitwise logic is foundational to computer programming so works the same in most languages).

const FLAG_A = 0b001;
const FLAG_B = 0b010;
const FLAG_C = 0b100;

const ACTIVE_FLAGS = FLAG_A | FLAG_C;

console.log(Boolean(FLAG_A & ACTIVE_FLAGS)); // true: FLAG_A is set
console.log(Boolean(FLAG_B & ACTIVE_FLAGS)); // false: FLAG_B is not set
console.log(Boolean(FLAG_C & ACTIVE_FLAGS)); // true: FLAG_C is set
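
To tie this back to the decimal and binary forms mentioned earlier, here is a minimal sketch. The `hasFlag` helper is a hypothetical name introduced for illustration, not a built-in.

```javascript
const FLAG_A = 0b001;
const FLAG_B = 0b010;
const FLAG_C = 0b100;

// Hypothetical helper: true when every bit of `flag` is also set in `flags`.
function hasFlag(flags, flag) {
  return (flags & flag) === flag;
}

const combined = FLAG_A | FLAG_C;

console.log(combined);                  // 5 (decimal)
console.log(combined.toString(2));      // 101 (binary)
console.log(hasFlag(combined, FLAG_A)); // true
console.log(hasFlag(combined, FLAG_B)); // false
```

Comparing with `=== flag` rather than converting to a Boolean also behaves sensibly if a flag ever spans more than one bit.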

Re your comment:

> however, I don't know the inbuilt values of these flags. What is the value of ENT_QUOTES - 0,1 or 101? And what would be the resultant of ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401 in empty htmlspecialchars($str)?

The actual values don't matter. You can consider them internal to PHP. You only need to worry about the constants.

ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401 is the default (as of PHP 8.1, according to the documentation), so those three options are all turned on when you omit $flags.

The documentation tells you that ENT_QUOTES "Will convert both double and single quotes", so you know that is how the function works by default (along with whatever it says about the other two options that are turned on).

> And in htmlspecialchars($string, ENT_COMPAT,'ISO-8859-1', true); would the other two defaults are already present?

No. If you pass your own value for $flags, it replaces the default entirely; it is not merged with it.

If you say ENT_COMPAT then that turns ENT_COMPAT on and everything else off.
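
The same behaviour can be sketched in JS with a default parameter. Everything here is an assumption made for illustration: `escapeText`, the `OPT_*` names, and their numeric values are hypothetical and are not PHP's real constants or implementation.

```javascript
// Illustrative flag values only; these are NOT PHP's real constants.
const OPT_DOUBLE_QUOTES = 0b001;
const OPT_SINGLE_QUOTES = 0b010;
const OPT_SUBSTITUTE = 0b100; // accepted but ignored in this sketch

const DEFAULT_FLAGS = OPT_DOUBLE_QUOTES | OPT_SINGLE_QUOTES | OPT_SUBSTITUTE;

// Hypothetical escaper: the default is used only when `flags` is omitted.
function escapeText(text, flags = DEFAULT_FLAGS) {
  let out = text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  if (flags & OPT_DOUBLE_QUOTES) out = out.replace(/"/g, "&quot;");
  if (flags & OPT_SINGLE_QUOTES) out = out.replace(/'/g, "&#039;");
  return out;
}

// No flags passed: all defaults apply, so both quote styles are converted.
console.log(escapeText(`'a' "b"`)); // &#039;a&#039; &quot;b&quot;

// One flag passed: it replaces the defaults, so single quotes are left alone.
console.log(escapeText(`'a' "b"`, OPT_DOUBLE_QUOTES)); // 'a' &quot;b&quot;
```

To change one option while keeping the others, you would OR the ones you want together explicitly, for example `OPT_DOUBLE_QUOTES | OPT_SUBSTITUTE`.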

Quentin
  • *ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401 is the default, so those three options are all turned on.* -- No, `ENT_QUOTES` is not turned on by default. – user31782 Feb 17 '22 at 15:29
  • @user31782 — The [documentation](https://www.php.net/manual/en/function.htmlspecialchars.php) disagrees with you. – Quentin Feb 17 '22 at 15:30
  • I have a running example of it. That's why I went down the rabbit hole of this topic. I have an `href` getting a dynamic string from mysqli-php through `htmlspecialchars`, and it doesn't convert single quotes to HTML entities. Try: `$str = "'"; var_dump(htmlspecialchars($str, ENT_QUOTES));` and `$str = "'"; var_dump(htmlspecialchars($str));` You get different results. – user31782 Feb 17 '22 at 15:31
  • @user31782: You're running an out-of-date version of PHP. The documentation says it was changed to include `ENT_QUOTES` in version 8.1. – Quentin Feb 17 '22 at 15:35