How can I prevent characters like the following from appearing on my website?

Ả̴̢̦̙̬̲̯̖̲̟̟̬̲̻̣̩͕͍̦͍̮̠̤͇̿́̾͋́̾̎̔̐̓̾̐̉͒̅͛̈́̀̇͋͋̔̕͘͝͝͝

They are really annoying. Ḧ̶̡̡̢͙͚̝̖͙͓̝̘̯̜̗͙̩͎̻̥̩͈͈͈̘̰͇̞͇͇̦̼̺̙̲͔́̿͌̀̅͊̌́͂̋̃̽̔̀̇̎̈̆́̽̇͂͘͘͜͝͝A̸̡̧̲̦͕̦̦̘̫͍̺͙̫͉̠͆̈́̅̚ͅͅḦ̴̪̱̠̦̜̩͒̃͌̎̇͌̒̍̒̇̾̀͑̂̆̉̓͌͘̚̚̕͜͝ͅA̶̻͐̔̍̃͆̆̓̿͋͊̽͝

Black
  • If you have them on your site, someone put them there. You need input sanitization if you allow someone to add content to your site. – mplungjan Sep 15 '19 at 13:13
  • I don't have them on my site; I noticed them on another website. How do I sanitize input against them? – Black Sep 15 '19 at 13:14
  • This might also be worth looking at: https://stackoverflow.com/questions/10414864/whats-up-with-these-unicode-combining-characters-and-how-can-we-filter-them – Nick Parsons Sep 15 '19 at 13:16
  • _How can I prevent unicode characters like this Ả̴̢̦̙̬̲̯̖̲̟̟̬̲̻̣̩͕͍̦͍̮̠̤͇̿́̾͋́̾̎̔̐̓̾̐̉͒̅͛̈́̀̇͋͋̔̕͘͝͝͝ on **my** site_ – mplungjan Sep 15 '19 at 13:27

1 Answer

Replace every Unicode character outside your desired range(s). The PHP snippet below keeps only ASCII by stripping control characters and all non-ASCII bytes:

$annoying_string = 'Ả̴̢̦̙̬̲̯̖̲̟̟̬̲̻̣̩͕͍̦͍̮̠̤͇̿́̾͋́̾̎̔̐̓̾̐̉͒̅͛̈́̀̇͋͋̔̕͘͝͝͝Ả̴̢̦̙̬̲̯̖̲̟̟̬̲̻̣̩͕͍̦͍̮̠̤͇̿́̾͋́̾̎̔̐̓̾̐̉͒̅͛̈́̀̇͋͋̔̕͘͝͝͝Ả̴̢̦̙̬̲̯̖̲̟̟̬̲̻̣̩͕͍̦͍̮̠̿́̾͋́̾̎̔̐̓̾̐̉͒̅͛̈́̀̇͋͋̔̕͘͝͝͝foobar̤͇';

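// Without the `u` modifier the pattern matches raw bytes, so this removes ASCII
// control characters (tabs and newlines included) and every byte of any
// multi-byte UTF-8 character, legitimate accented letters included.
$cleaned_string = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $annoying_string);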

echo $cleaned_string; // AAAfoobar
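
If stripping every non-ASCII byte is too aggressive for your site (it also removes letters such as é or 日), one possible refinement is to target only the stacked combining marks. The following is a minimal sketch, not part of the original answer: the function name `limit_combining_marks` and the default limit of two marks per base character are assumptions chosen for illustration. It relies on PCRE's `\p{M}` Unicode property and the `u` modifier.

```php
<?php
/**
 * Hypothetical helper (not from the original answer): keep at most
 * $maxMarks combining marks after each base character.
 */
function limit_combining_marks(string $text, int $maxMarks = 2): string
{
    $maxMarks = max(0, $maxMarks);

    // \p{M} matches any Unicode combining mark. The `u` modifier makes PCRE
    // read the pattern and the subject as UTF-8 rather than as raw bytes.
    $pattern = '/(\p{M}{' . $maxMarks . '})\p{M}+/u';

    $result = preg_replace($pattern, '$1', $text);
    if ($result === null) {
        // preg_replace() returns null on error, e.g. malformed UTF-8 input.
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }
    return $result;
}

// Illustrative input: one base letter carrying ten stacked acute accents.
$zalgo = 'A' . str_repeat("\u{0301}", 10) . 'b';
echo limit_combining_marks($zalgo, 0), PHP_EOL;      // strips every mark
echo limit_combining_marks("cafe\u{0301}"), PHP_EOL; // a single accent survives
```

A limit of 0 removes all combining marks, which also breaks decomposed accents and scripts that depend on them, so a small non-zero limit is probably the safer default for multilingual input.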
Gavin