
I am trying to populate a textarea defined like this:

$this->addElement('textarea', 'body', array(
  'label' => $translate->translate('Contents:'),
  'cols' => '80',
  'rows' => '24',
  'required' => true
));

from a database record. The record is a BLOB containing HTML text of pages, in UTF-8.

$form->populate(array(
  // ...
  'body' =>
    str_replace("\\n", "\n",
      html_entity_decode(
        $page['body']
      )
    ),
  // ...
));

Unfortunately, when the text is longer than 2934 bytes, the field is not populated at all. I tried setting maxlength by adding

  'maxlength' => '4096',

but it seems to have no effect.

Now, from what I could find on the web, textarea limits should be much larger than 2934 bytes, closer to 30-60 KB. Other than splitting the field into two separate form elements, how can I fix this problem?

Update: It seems that the culprit was the character "ß", which is encoded in the database as two characters, "Ã�". The first occurrence of that character is at position 2934 in the text, so the second byte of the two-byte representation somehow prevents the field from receiving the text at all.
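To illustrate why a single multibyte character can make the whole value disappear, here is a minimal sketch. It assumes (the thread does not confirm this) that the value ends up going through `htmlspecialchars()` with a UTF-8 charset before it is rendered into the textarea. The sample string and variable names are made up for the illustration.

```php
<?php
// "ß" (U+00DF) occupies two bytes in UTF-8: 0xC3 0x9F.
$eszett = "\xC3\x9F";
$text = "Stra" . $eszett . "e";

// strlen() and substr() count bytes, not characters.
$byteLength = strlen($text);
// This cut keeps 0xC3 but drops 0x9F, leaving a broken sequence.
$cut = substr($text, 0, 5);

// preg_match() with the /u modifier returns false for malformed UTF-8.
$wholeIsValid = preg_match('//u', $text) === 1;
$cutIsValid = preg_match('//u', $cut) === 1;

// Without ENT_SUBSTITUTE or ENT_IGNORE, htmlspecialchars() returns an
// empty string when the input is not valid in the given charset.
$escapedWhole = htmlspecialchars($text, ENT_COMPAT, 'UTF-8');
$escapedCut = htmlspecialchars($cut, ENT_COMPAT, 'UTF-8');
```

If the escaping step behaves like this, one invalid byte sequence anywhere in the text is enough to produce an empty field, which would match the all-or-nothing symptom described above.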

Vilinkameni
  • Are you sure the problem is not getting the blob out of the database? Seems more likely the issue would be there and not with getting the form element to take that much data. – ficuscr Mar 11 '13 at 16:33
  • No, I am positive the text is being retrieved from the database, for two reasons: 1) I tried plainly `echo`ing it, with success, 2) if I do `substr($page['body'], 1, 2934)` instead, the form element is properly populated. With `substr($page['body'], 1, 2935)` it already isn't. – Vilinkameni Mar 11 '13 at 18:10
  • Spitting out HTML into the textarea... If you look at the page source, is everything there? You might need to look at using a WYSIWYG editor or running it through `htmlentities()`. See: http://stackoverflow.com/questions/3777297/php-html-form-textarea-containing-html – ficuscr Mar 11 '13 at 20:01
  • Does it work without the html_entity_decode and/or the str_replace? – Tim Fountain Mar 11 '13 at 20:41
  • @ficuscr: No, the field doesn't receive any text. I think I found the cause of the problem, I updated the question. What would be the proper way of handling UTF-8 encoded text in BLOBs with ZF? I assume I need to encode the text with `utf8_encode` and decode it with `utf8_decode`? @Tim: Those two functions didn't affect the outcome. – Vilinkameni Mar 11 '13 at 21:02
  • Oddly enough, my `resources.db.params.charset` is set to `"utf8"`. – Vilinkameni Mar 11 '13 at 21:09
  • Again, putting the database aside, the only real factor for UTF-8 here might be using [multibyte safe string methods](http://php.net/manual/en/ref.mbstring.php). Are you calling any string functions on the data that are not `mb` methods? – ficuscr Mar 11 '13 at 21:44
  • Only `html_entity_decode` and that `str_replace`. The text is supplied to a form field after applying those two functions and written to a database by applying a reverse `str_replace` and `htmlentities`, then `insert()` or `update()` from a model class. – Vilinkameni Mar 11 '13 at 21:50
  • Default encoding on `html_entity_decode` is UTF-8 (since PHP 5.4; it was ISO-8859-1 before that). As for the `str_replace`... "Using a singlebyte string function on a multibyte string can cause an unexpected results", ref: http://stackoverflow.com/questions/3786003/str-replace-on-multibyte-strings-dangerous Perhaps change to a `preg_replace`; pretty sure there is no need to ever touch `mb_ereg_replace`. – ficuscr Mar 11 '13 at 21:57
  • Thanks, it works perfectly now, with `preg_replace`. – Vilinkameni Mar 13 '13 at 15:56

1 Answer


As stated in the update to the question, the culprit was the character "ß", which was encoded in the database as two characters, "Ã�". The first occurrence of that character was at position 2934 in the text, so the second byte of the two-byte representation somehow prevented the field from receiving the text at all.

As ficuscr pointed out in his comment, what corrupted the text was the call to the str_replace function. After replacing it with a call to preg_replace, the text was saved in the database properly and the field received the full text.
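The exact `preg_replace` call is not shown in the thread, so the following is only a sketch of what the read side could look like. The helper name `decodeBodyForForm`, the pattern, and the sample string are assumptions for illustration. It also passes the charset to `html_entity_decode` explicitly, so the result does not depend on the PHP version's default.

```php
<?php
// Hypothetical helper; name and pattern are assumptions, not the asker's code.
function decodeBodyForForm($stored)
{
    // State the charset explicitly instead of relying on the default.
    $decoded = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');

    // The regex \\n matches a literal backslash followed by "n".
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    return preg_replace('/\\\\n/u', "\n", $decoded);
}

// Stored form: entity-encoded text with a literal backslash-n sequence.
$stored = "Stra&szlig;e\\nHaus";
$body = decodeBodyForForm($stored);
```

The result can then be passed as the `'body'` value to `$form->populate()`. One side effect of the `/u` modifier is that `preg_replace` returns `null` when the subject is not valid UTF-8, so a corrupted record shows up as an explicit failure rather than as silently mangled text.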

Vilinkameni