
I understand why the mb_ functions are useful, but I'm not sure whether there's any reason to keep using the old plain string functions. As a programming teacher, I wonder if I should just skip those in favor of their multibyte versions.

Capi Etheriel
  • At least you can show difference between one byte ASCII symbols and multi-byte characters. – Viacheslav Kondratiuk Nov 19 '13 at 13:58
  • Yeah, when I teach Python we usually bump into encoding issues and talk about string representation and stuff. But this is more of a high-level hands-on PHP class... – Capi Etheriel Nov 19 '13 at 14:03
  • Since PHP doesn't have any "high level" concept of strings beyond *byte arrays*, it's all the more pertinent to discuss these topics in PHP. In, say, JavaScript you can mostly ignore the topic of encodings until you try to use very high code points; in Python 3 you can AFAIK mostly ignore encodings if you properly set up a Unicode sandwich. In PHP, though, you're always working with low-level bytes. – deceze Nov 19 '13 at 14:07

1 Answer


Not all string operations are reimplemented as mb_ functions; for instance, there's no mb_ equivalent of str_replace. None is really needed, since str_replace works just fine on strings of any encoding as long as all of its arguments are in the same encoding.
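To make the str_replace point concrete, here is a minimal sketch assuming UTF-8 data. The sample word and variable names are made up for illustration, and the non-ASCII bytes are written as escapes so the result does not depend on how the source file is saved.

```php
<?php
// "Käse" encoded as UTF-8: the "ä" is the two bytes C3 A4.
$subject = "K\xC3\xA4se";

// Search string is also UTF-8, so the bytes line up and the replace works.
$consistent = str_replace("\xC3\xA4", "ae", $subject);

// Search string is "ä" in ISO-8859-1 (the single byte E4): nothing matches,
// because that byte never occurs in the UTF-8 subject.
$mismatched = str_replace("\xE4", "ae", $subject);

var_dump($consistent === "Kaese");   // bool(true)
var_dump($mismatched === $subject);  // bool(true)
```

The second call fails silently, which is why all arguments need to be in the same encoding.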

So, you cannot simply ignore all "old plain string functions" altogether. You need the mb_ functions when you're doing something that requires encoding- and character-awareness, such as counting or slicing characters. For other purposes, you don't necessarily need them.
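As a rough illustration of where character-awareness matters, the sketch below compares byte-based and character-based length and slicing. It assumes UTF-8 input and that the mbstring extension is loaded; the sample word and variable names are invented for the example.

```php
<?php
// Assumes the mbstring extension is available.
$word = "na\xC3\xAFve"; // "naïve" in UTF-8: 5 characters, 6 bytes

$byteLength = strlen($word);             // counts bytes
$charLength = mb_strlen($word, 'UTF-8'); // counts characters

$byteSlice = substr($word, 0, 3);             // first 3 bytes: cuts the "ï" in half
$charSlice = mb_substr($word, 0, 3, 'UTF-8'); // first 3 characters: "naï"

var_dump($byteLength);                            // int(6)
var_dump($charLength);                            // int(5)
var_dump(mb_check_encoding($byteSlice, 'UTF-8')); // bool(false): no longer valid UTF-8
var_dump($charSlice === "na\xC3\xAF");            // bool(true)
```

The plain functions are not wrong here; they simply answer a question about bytes, while the mb_ versions answer a question about characters.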

The "old plain string functions" are also helpful if you're explicitly trying to work with bytes rather than characters. For example, you can use substr to test for the presence of a UTF-8 BOM (byte order mark):

// true when $str begins with the three bytes of the UTF-8 BOM
$hasBom = substr($str, 0, 3) === "\xEF\xBB\xBF";
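Building on that byte-level check, a small helper might strip the BOM before further processing. This is only a sketch: the function name strip_utf8_bom is hypothetical, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of PHP or of the answer above):
// returns $str without a leading UTF-8 BOM, or unchanged if there is none.
function strip_utf8_bom(string $str): string
{
    // substr() counts bytes, which is what we want: the BOM is exactly 3 bytes.
    if (substr($str, 0, 3) === "\xEF\xBB\xBF") {
        return substr($str, 3);
    }
    return $str;
}

var_dump(strip_utf8_bom("\xEF\xBB\xBFhello") === "hello"); // bool(true)
var_dump(strip_utf8_bom("hello") === "hello");             // bool(true)
```

A character-aware version would have to ask for one character rather than three bytes, since in UTF-8 the BOM is the single code point U+FEFF.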
deceze