I use the mongocxx driver on Debian 9 and Debian 10 to insert data into MongoDB 6.0. I then use mongosh 1.6 to monitor operations with db.currentOp, and I get the error BSONError: Invalid UTF-8 string in BSON document.

I can switch on profiling and look at the operation profiles with db.system.profile.find. I see that each profile contains, besides the user data, driver description data like this:

'$client': { driver: { name: 'Н\x07\x18_\x7F', version: 'kafkaclient' }, ... platform: 'networks/lib/x86_64-linux-gnu`G\b\x18_\x7F',

A driver name like this does NOT lead to a BSON error. But sometimes I see driver description data like this:

'$client': { driver: { name: 'Н\x07��\x7F', version: 'kafkaclient' }, ... platform: 'networks/lib/x86_64-linux-gnu`G\b��\x7F'

The �� symbols cause the BSON error for both db.system.profile.find and db.currentOp. For db.system.profile.find I can pass enableUtf8Validation = false and avoid the error, but db.currentOp does not accept enableUtf8Validation, and I need db.currentOp for monitoring operations. I have the following questions:

  1. With the same OS, mongocxx driver, MongoDB, and mongosh, the driver name data can differ from session to session between the C++ client and MongoDB. Why?
  2. Can I still disable UTF-8 validation for db.currentOp?
  3. Why can mongocxx + MongoDB + Debian produce invalid UTF-8 in BSON? Which is the most likely cause: the mongocxx driver, Debian, or MongoDB?

1 Answer

Well, some investigation has led me to the following result.

  1. In fact, we have a three-level architecture: mongosh (https://www.mongodb.com/docs/mongodb-shell/) => Node.js driver (https://github.com/mongodb/node-mongodb-native) => MongoDB.

  2. mongosh and MongoDB do not know about enableUtf8Validation, but the Node.js driver does. The command db.system.profile.find({},{},{enableUtf8Validation: true}) from mongosh is passed to the Node.js driver. The driver sends a plain db.system.profile.find() to MongoDB, gets the result (any data), and then validates it or not, depending on enableUtf8Validation. So the Node.js driver is the source of the error BSONError: Invalid UTF-8 string in BSON document, not MongoDB and not mongosh.

  3. How to fix it for db.currentOp? The Node.js driver does not let us pass enableUtf8Validation with this command. But it can process options in the MongoDB connection string (https://www.mongodb.com/docs/manual/reference/connection-string/), and it accepts enableUtf8Validation as a connection option (https://www.mongodb.com/docs/drivers/node/current/fundamentals/utf8-validation/). So this URI format works for me: mongosh mongodb://user:pass@host:port/db?enableUtf8Validation=false

  4. The question of why the bad data was produced by the mongocxx driver (or by MongoDB) is still open. But enableUtf8Validation=false in the connection string helps to bypass it for monitoring MongoDB operations.
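
To illustrate the difference between validating and not validating on the client side, here is a minimal sketch in plain Node.js. It is not the driver's actual code: it uses the standard TextDecoder as a stand-in, and the byte values and the helper names decodeStrict and decodeLenient are made up for illustration. The point is only that the same bytes either raise an error or come back with replacement characters (�), depending on whether the decoder is strict.

```javascript
// Hypothetical sample: "ab", then two bytes (0xFF, 0xFE) that can never
// appear in valid UTF-8, then 0x7F. These values are an assumption chosen
// to resemble the garbled driver name, not data taken from a real server.
const bytes = new Uint8Array([0x61, 0x62, 0xff, 0xfe, 0x7f]);

// Strict decoding: invalid UTF-8 raises a TypeError, similar in spirit
// to what happens when UTF-8 validation is enabled.
function decodeStrict(buf) {
  return new TextDecoder('utf-8', { fatal: true }).decode(buf);
}

// Lenient decoding: each invalid byte is replaced by U+FFFD and no error
// is raised, similar in spirit to running with validation disabled.
function decodeLenient(buf) {
  return new TextDecoder('utf-8').decode(buf);
}

try {
  decodeStrict(bytes);
} catch (err) {
  console.log('strict decode failed:', err.name);
}

console.log('lenient decode:', JSON.stringify(decodeLenient(bytes)));
```

With validation off the data is still broken, it is just no longer fatal to read, which is why this only works around the monitoring problem.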