I am trying to build a function (unless one already exists; I was not able to find one) that makes a string safe for:

  • being saved in a MySQL database → mysqli_real_escape_string
  • being saved in a serialized array in a MySQL database (I had issues when unserialize failed)

As for output, the string should:

  • not interfere with HTML → utf8_encode(htmlentities($source, ENT_QUOTES | ENT_HTML401, 'UTF-8'));
  • not interfere with being used as a query in a URL, thus encoding '&' and '%'

Please give me any advice you have on how to improve secure encoding.
I am also not sure whether the functions given above are the best ones to use.

I also had issues with non-printable characters and tried the approach from "PHP: How to remove all non printable characters in a string?":
$s = preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F\x80-\x9F]/u', '', $s);

EDIT
Because this question is so broad, I want to narrow it down: how do I clean a string that is an element of an array which is stored in a database with serialize()?

For instance, unserialize failed after I had put a string containing a newline (\n or \r) into a string element of an array that had been serialized successfully...
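One possible cause of such a failure (an assumption, not something confirmed in this question) is that the serialized bytes were changed between serialize() and unserialize(). The format stores the byte length of every string, so any later change to the content, such as a line-ending conversion, makes the stored length wrong. A minimal sketch with made-up sample data:

```php
<?php
// Hypothetical sample data, only for illustration.
$data = ['text' => "first\r\nsecond"];

// The string is stored with its byte length: s:13:"first\r\nsecond"
$blob = serialize($data);

// Unchanged bytes round-trip fine, newlines included.
var_dump(unserialize($blob) === $data);

// Assumed scenario: something on the way to or from the database
// rewrites "\r\n" to "\n". The string is now 12 bytes, but the
// serialized data still declares 13.
$altered = str_replace("\r\n", "\n", $blob);

// unserialize() returns false (and raises a notice or warning,
// suppressed here with @) when a declared length does not match.
$result = @unserialize($altered);
var_dump($result === false);
```

If this is what happened, the newline itself is not the problem; the modification of the already serialized string is.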

EDIT_2
The reason why I tried encoding HTML entities before saving to the DB with mysqli_real_escape_string() is that the data has changed when the object is loaded back from the DB. For example, a user wants to store the string test'test, which mysqli_real_escape_string() encodes to test\'test, and when loaded from the DB it is still test\'test, which is neither what the user wants nor what they sent. I would appreciate a solution for this. Mine was to encode the HTML entities first, after which mysqli_real_escape_string() had no effect because the quotes had already been HTML-encoded.

  • This is a very wrong-headed idea. Database escaping is by its very nature a completely separate concept than output escaping. Escape content *when you're emitting it* to a thing that needs escaping, never any earlier. Please read the answer given in the dupe candidate. The question isn't an exact dupe, but the *answer* totally is. – Charles Apr 21 '14 at 19:44
  • If you're getting *escaped* data from the database, please make sure that your PHP environment doesn't have the *poisonous*, **deprecated** (5.3), ***removed*** (5.4) "magic quotes" feature enabled. See [`get_magic_quotes_runtime`](http://php.net/get_magic_quotes_runtime) – Charles Apr 22 '14 at 02:53
  • +1 what Charles said. There is no “global escape”, every escaping scheme is necessarily context specific. For HTML-escaping you should just use `htmlspecialchars` alone, at the point of inserting into the HTML page, *not* anywhere near the database. There is no need for `utf8_encode` here. When you insert `mysql_real_escape_string`ed content into a string literal in an SQL query, that is decoded at the database end before being used, so you will not find `\'` inside the database unless there is another problem. Better: use parameterised queries (in mysqli or PDO), avoiding escaping issues. – bobince Apr 22 '14 at 14:59
  • A byte string produced by `serialize()` may not fit in a database text column because it is arbitrary binary and not text (for example, your text collation might be UTF-8 and the output of `serialize()` may not be a valid UTF-8 byte sequence). You could use a BINARY column or a text-safe encoding like JSON, which would also avoid a bunch of security problems with `unserialize()` (essentially PHP serialized data acts somewhat like active code and so it's not a good idea to be executing it out of the database). Better: normalise the database schema so you can store list items in separate rows. – bobince Apr 22 '14 at 15:06
  • I revised my DB schema, and yes, utf8_encode is not necessary on user input, as it is already UTF-8 because a corresponding HTTP header was sent (and I added the encoding as a meta tag in the head); I only needed utf8_encode when writing data such as the user agent or the requested URI. All fine now. –  Apr 24 '14 at 16:47
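The advice in the comments above (escape for each context at the point of output, never before storing) could look roughly like the following. This is only a sketch: the helper names html_out and url_query_value are hypothetical, and the database side is left out because a parameterised query in mysqli or PDO takes the raw value without any escaping call.

```php
<?php
// Hypothetical helper: escape a raw string for insertion into HTML.
function html_out(string $s): string
{
    return htmlspecialchars($s, ENT_QUOTES | ENT_HTML401, 'UTF-8');
}

// Hypothetical helper: escape a raw string for use as a URL query value.
function url_query_value(string $s): string
{
    return rawurlencode($s);
}

// The raw value, exactly as the user sent it and as it would be stored.
$raw = "test'test & 100% <b>";

// URL context first, then HTML context for the attribute that holds the URL.
// '/search?q=' is a made-up path for illustration.
$link = '/search?q=' . url_query_value($raw);

echo '<a href="' . html_out($link) . '">' . html_out($raw) . '</a>', "\n";
```

The stored value stays test'test throughout; only the copy being written into a page or a URL is encoded.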

1 Answer

Off the top of my head, I think you should try json_encode and json_decode.

nazim
  • Do I understand it correctly: json_encode() will always preserve associative arrays, while for json_decode() you need to set the corresponding parameter to true? Do these functions **definitely** work with all kinds of strings (like those containing line breaks or special characters like }, which is a control character in JSON)? –  Apr 21 '14 at 20:43
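A quick way to check this is a round trip with such strings. The sketch below uses made-up sample data; it assumes the input is valid UTF-8, since json_encode() returns false otherwise.

```php
<?php
// Hypothetical sample data containing line breaks, braces and quotes.
$data = [
    'note'  => "line one\nline two\r\n",
    'brace' => 'a } inside { a string',
    'quote' => "test'test \"quoted\"",
];

// json_encode() escapes line breaks and double quotes inside strings;
// braces inside a string value need no special treatment.
$json = json_encode($data);
if ($json === false) {
    // Happens, for example, when a string is not valid UTF-8.
    throw new RuntimeException(json_last_error_msg());
}

// With true as the second argument, JSON objects come back as
// associative arrays; without it they come back as stdClass objects.
$asArray  = json_decode($json, true);
$asObject = json_decode($json);

var_dump($asArray === $data);
var_dump($asObject instanceof stdClass);
```

The resulting JSON is plain text, so it can be passed to a parameterised query like any other string.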