I recently migrated a website from a dedicated Ubuntu server running Apache 2 to a dedicated Debian 6 server running nginx.

This website uses CakePHP 2.0 with ichikaway's MongoDB plugin (and therefore MongoDB).

Since I changed servers, I get a strange notice when I try to save a "tags" subdocument containing special characters like "français" or "èéï".

It works in other controllers / models / collections (for example, when I save a new comment with special characters).

I have already forced nginx to use UTF-8, all my pages have the meta charset set to UTF-8, and all the .php (and .ctp) scripts are encoded in UTF-8. I also tried applying utf8_encode() and even utf8_encode(utf8_decode()) (yeah, that's bad...) to the string, but got the same error.

Here is the notice (note that the document isn't saved):

Notice (1024): non-utf8 string: fran��ais [APP/Plugin/Mongodb/Model/Datasource/MongodbSource.php, line 715]

And the context:

MongodbSource::update() - APP/Plugin/Mongodb/Model/Datasource/MongodbSource.php, line 715
Model::save() - CORE/Cake/Model/Model.php, line 1614
FiltersController::edit() - APP/Plugin/Administration/Controller/FiltersController.php, line 137
ReflectionMethod::invokeArgs() - [internal], line ??
Controller::invokeAction() - CORE/Cake/Controller/Controller.php, line 473
Dispatcher::_invoke() - CORE/Cake/Routing/Dispatcher.php, line 107
Dispatcher::dispatch() - CORE/Cake/Routing/Dispatcher.php, line 89
[main] - APP/webroot/index.php, line 96

I pray to the "Stack Overflow God" to save me; I really don't know where else to look to get it working like before :(

Thank you for reading.

Mush

3 Answers

OK, I finally found it!

I was using strtolower(), and it was this function that was breaking the encoding.

So I replaced it with mb_strtolower(), forcing UTF-8, and it works well again.
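
For anyone hitting the same notice, here is a minimal sketch of the fix. strtolower() works byte by byte, so depending on the PHP version and the server locale it can alter individual bytes of a multibyte UTF-8 character and leave an invalid sequence behind. The helper name `normalize_tag` is hypothetical (it is not part of CakePHP or the plugin), and the sketch assumes the mbstring extension is loaded and the script file is saved as UTF-8.

```php
<?php
// Hypothetical helper: lower-case a tag without corrupting multibyte characters.
// Assumes $tag is already valid UTF-8 and that the mbstring extension is loaded.
function normalize_tag($tag)
{
    // Passing the encoding explicitly avoids relying on the server default.
    return mb_strtolower($tag, 'UTF-8');
}

$tag = normalize_tag('FRANÇAIS');

// mb_check_encoding() reports whether the result is still valid UTF-8,
// which is what the MongoDB driver expects.
var_dump(mb_check_encoding($tag, 'UTF-8'));
echo $tag, "\n";
```

Running mb_check_encoding() on the value just before Model::save() is also a quick way to find which step in the code produces the broken string.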

Mush

utf8_encode() only handles ISO-8859-1 input data, so you may need to look into iconv for handling other character sets. A challenge here may be detecting the character set of the incoming data (I've faced this before with responses from Facebook's API), but this question should offer a few possibilities.
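
As a rough sketch of the iconv approach: the helper below converts a string only when it is not already valid UTF-8, which avoids double-encoding text that is fine as it is. The name `to_utf8` and the ISO-8859-1 fallback are assumptions for illustration; if the incoming data uses another character set, the fallback has to change accordingly.

```php
<?php
// Hypothetical helper: return $value as UTF-8, converting only when needed.
// ASSUMPTION: anything that is not valid UTF-8 is ISO-8859-1; change
// $fallback if the data comes from another character set.
function to_utf8($value, $fallback = 'ISO-8859-1')
{
    if (!is_string($value) || mb_check_encoding($value, 'UTF-8')) {
        return $value; // non-strings and valid UTF-8 pass through untouched
    }
    return iconv($fallback, 'UTF-8', $value);
}

$latin1 = "fran\xE7ais";     // "français" encoded as ISO-8859-1
$utf8   = "fran\xC3\xA7ais"; // "français" encoded as UTF-8

var_dump(to_utf8($latin1) === $utf8); // the Latin-1 bytes are converted
var_dump(to_utf8($utf8) === $utf8);   // valid UTF-8 is left alone
```

The validity check is a heuristic rather than real charset detection: it only tells you the bytes are not UTF-8, not what they actually are.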

Off-hand, what version of MongoDB and the PECL driver are you using? I'm on MongoDB 2.1 and 1.2.11dev of the PECL driver and was able to do this in PHP without issue:

$m = new Mongo();
$m->test->foo->insert(array('fran��ais' => 'français'));

I was also able to view the document via the Mongo shell:

> db.foo.find()
{ "_id" : ObjectId("4fe9d924e84df1844f000002"), "fran��ais" : "français" }

I realize the BSON spec requires UTF-8, but Mongo didn't complain in this case. I'm curious if older versions are more strict about this.

jmikola

I solved that problem by adding these lines:

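// Re-encode every string value before it is sent to MongoDB.
// Note: utf8_encode() assumes ISO-8859-1 input, so strings that are
// already valid UTF-8 will be double-encoded.
foreach ($values as $i => $value) {
    if (is_string($value)) {
        $values[$i] = utf8_encode($value);
    }
}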

in ../Plugin/Mongodb/Model/Datasource/MongodbSource.php, right after

if (!$this->isConnected()) {
    return false;
}

in both the update() and create() functions.