
I want to sanitize a $string using the following whitelist:

It includes a-z, A-Z, 0-9 and some common characters found in posts: []=+-¿?¡!<>$%^&*'"()/#@*,.:;_|
It also includes Spanish accented letters such as á, é, í, ó, ú and ÁÉÍÓÚ.

WHITE LIST

abcdefghijklmnñopqrstuvwxyzñáéíóúABCDEFGHIJKLMNÑOPQRSTUVWXYZÁÉÍÓÚ0123456789[]=+-¿?¡!<>$%^&*'"()/#@*,.:;_|

I want to sanitize this string

 $string="//abcdefghijklmnñopqrstuvwxyzñáéíóúABCDEFGHIJKLMNÑOPQRSTUVWXYZÁÉÍÓÚ0123456789[]=+-¿?¡!<>$%^&*'()/#@*,.:;_| |||||||||| ] ¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶¸¹º»¼½ mmmmm onload onclick='' [ ? / < ~ # ` ! @ $ % ^ & * ( ) + = } | :  ; ' , > { space !#$%&'()*+,-./:;<=>?@[\]^_`{|}~ <html>sdsd</html> ** *`` `` ´´ {} {}[] ````... ;;,,´'¡'!!!!¿?ña ñaña ÑA á é´´ è ´ 8i ó ú à à` à è`ì`ò ù &  > < ksks < wksdsd '' \" \' <script>alert('hi')</script>";

I tried this regex, but it doesn't work:

//$regex = '/[^\w\[\]\=\+\-\¿\?\¡\!\<\>\$\%\^\&\*\'\"\(\)\/\#\@\*\,\.\/\:\;\_\|]/i';
//preg_replace($regex, '', $string);

Does anyone have a clue how to sanitize this string according to the whitelist?

M. Eriksson
Dan
  • Hit and run @downvoter. You could teach how to sanitize a string with Spanish characters instead. – Dan Aug 08 '18 at 06:02
  • Possible duplicate of [What's the best method for sanitizing user input with PHP?](https://stackoverflow.com/questions/129677/whats-the-best-method-for-sanitizing-user-input-with-php) – kallosz Aug 08 '18 at 06:17
  • I just checked and there are no Spanish accents in it. Why do you consider it a duplicate? – Dan Aug 08 '18 at 06:23
  • You can add it. – kallosz Aug 08 '18 at 06:50

1 Answer


If you know your whitelist characters, use the whitelist in the regex rather than enumerating a blacklist. The blacklist could be very large, especially if the encoding is something like UTF-8 or UTF-16.

There are many ways to do this. One is to write a regex that matches runs of the allowed characters (including spaces and newlines) and compose a new string from the matches.

Also take care that some of the characters are reserved regex characters and need to be escaped, such as [, ? and +.
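To avoid escaping each reserved character by hand, the pattern can be built from the plain whitelist string with preg_quote(). The following is a minimal sketch, not a definitive implementation: the helper name buildWhitelistPattern is hypothetical, the file is assumed to be saved as UTF-8, and the trailing space in the whitelist is an assumption (the whitelist in the question does not list it explicitly).

```php
<?php
// Assumption: this file is saved as UTF-8, so the accented literals are UTF-8 bytes.
// Whitelist from the question, plus a trailing space (the space is an assumption).
$whitelist = 'abcdefghijklmnñopqrstuvwxyzáéíóúABCDEFGHIJKLMNÑOPQRSTUVWXYZÁÉÍÓÚ'
    . '0123456789[]=+-¿?¡!<>$%^&*\'"()/#@,.:;_| ';

// Hypothetical helper: turns a plain whitelist string into a negated character class.
function buildWhitelistPattern(string $whitelist): string
{
    // preg_quote() escapes regex metacharacters; '/' is passed so the delimiter is escaped too.
    return '/[^' . preg_quote($whitelist, '/') . ']/u';
}

// Remove every character that is not in the whitelist.
$clean = preg_replace(buildWhitelistPattern($whitelist), '', 'ñandú~{ok}¢');
echo $clean, PHP_EOL; // prints "ñandúok"
```

The /u modifier matters here: without it PCRE works byte by byte, so multibyte letters such as ñ or á would not be treated as single characters.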

You could test a regex like:

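$string = "Your test string";
// The /u modifier makes PCRE treat pattern and subject as UTF-8, so accented letters match as whole characters.
$pattern = '/[a-zA-Z0-9\[\]=+\-¿?¡!<>$%^&*\'"\sñÑáéíóúÁÉÍÓÚ]+/u';
preg_match_all($pattern, $string, $matches);
// $matches[0] holds every full match; join them to build the sanitized string.
$newString = implode('', $matches[0]);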

This is only a simple example of how to apply the whitelist with a regex.
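An equivalent way to apply the same idea is to delete everything outside the whitelist with preg_replace() and a negated character class. This is a sketch under stated assumptions: the function name sanitizeWithWhitelist is hypothetical, the character set mirrors the whitelist from the question, and allowing whitespace via \s is an assumption.

```php
<?php
// Hypothetical function name; the character set mirrors the whitelist from the
// question, plus whitespace (\s), which is an assumption.
function sanitizeWithWhitelist(string $input): string
{
    // [^...] matches every character NOT in the whitelist; /u enables UTF-8 mode.
    $pattern = '/[^a-zA-Z0-9ñÑáéíóúÁÉÍÓÚ\[\]=+\-¿?¡!<>$%^&*\'"()\/#@,.:;_|\s]/u';
    $clean = preg_replace($pattern, '', $input);
    // preg_replace() returns null on error, e.g. when the input is not valid UTF-8.
    return $clean ?? '';
}

echo sanitizeWithWhitelist('¡Hola, señor!{~}¢£'), PHP_EOL; // prints "¡Hola, señor!"
```

Note that because the whitelist itself contains <, >, ' and ", markup such as <script>alert('hi')</script> passes through unchanged. Escaping on output, for example with htmlspecialchars(), is a separate step.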

Dubas