1

I have a MongoDB StatefulSet running on OpenShift v3.11. The PersistentVolume is using NFSv4.

On our environment, I set up so that the directory in the NFS server are owned by nfsnobody:nfsnobody. SELinux has also been set to Permissive. All the inner directories and files are also granted with chmod ug+rwx,o-rwx.

This is done so that at runtime, when the Pod accesses to the shared path using the user with group root (gid=0), due to NFS by default squashes user and group root to nfsnobody, the Pod will be able to read and write to the shared path.

$> ls -halZ /srv/share/openshift/mongo/
drwxrwx---. nfsnobody nfsnobody unconfined_u:object_r:default_t:s0 data

This set up has been working for months. But then it starts to fail.

However, when I deploy the Pod, it fails to start with the following error:

021-01-26T16:12:48.163+0000 W STORAGE  [initandlisten] Detected unclean shutdown - /var/lib/mongodb/data/mongod.lock is not empty.
2021-01-26T16:12:48.163+0000 I STORAGE  [initandlisten] Detected data files in /var/lib/mongodb/data created by the 'wiredTiger' storage engine, so setting theactive storage engine to 'wiredTiger'.
2021-01-26T16:12:48.163+0000 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
2021-01-26T16:12:48.164+0000 I STORAGE  [initandlisten] wiredtiger_open config:create,cache_size=31220M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
2021-01-26T16:12:48.688+0000 E STORAGE  [initandlisten] WiredTiger error (1) [1611677568:688148][457:0x7f9b59cc1ca8], file:WiredTiger.wt, connection: __posix_open_file, 715: /var/lib/mongodb/data/WiredTiger.wt: handle-open: open: Operationnot permitted Raw: [1611677568:688148][457:0x7f9b59cc1ca8], file:WiredTiger.wt,connection: __posix_open_file, 715: /var/lib/mongodb/data/WiredTiger.wt: handle-open: open: Operation not permitted
2021-01-26T16:12:48.708+0000 E STORAGE  [initandlisten] WiredTiger error (1) [1611677568:708810][457:0x7f9b59cc1ca8], file:WiredTiger.wt, connection: __posix_open_file, 715: /var/lib/mongodb/data/WiredTiger.wt: handle-open: open: Operationnot permitted Raw: [1611677568:708810][457:0x7f9b59cc1ca8], file:WiredTiger.wt,connection: __posix_open_file, 715: /var/lib/mongodb/data/WiredTiger.wt: handle-open: open: Operation not permitted
2021-01-26T16:12:48.728+0000 E STORAGE  [initandlisten] WiredTiger error (1) [1611677568:728860][457:0x7f9b59cc1ca8], file:WiredTiger.wt, connection: __posix_open_file, 715: /var/lib/mongodb/data/WiredTiger.wt: handle-open: open: Operationnot permitted Raw: [1611677568:728860][457:0x7f9b59cc1ca8], file:WiredTiger.wt,connection: __posix_open_file, 715: /var/lib/mongodb/data/WiredTiger.wt: handle-open: open: Operation not permitted
2021-01-26T16:12:48.744+0000 W STORAGE  [initandlisten] Failed to start up WiredTiger under any compatibility version.
2021-01-26T16:12:48.744+0000 F STORAGE  [initandlisten] Reason: 1: Operation not permitted
2021-01-26T16:12:48.744+0000 F -        [initandlisten] Fatal Assertion 28595 at src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp 638
2021-01-26T16:12:48.744+0000 F -        [initandlisten]

At first glance, one could say "it's probably the mongod process does not have the permission to read the file". However, when I run in debug mode to access to the Terminal, I can totally access to the path /var/lib/mongo/data.

$> id
id=1000230000 gid=0(root) groups=0(root),1000230000
$> cd /var/lib/mongodb/data

/var/lib/mongodb/data$> echo "This is a test" >new_file
/var/lib/mongodb/data$> rm new_file
/var/lib/mongodb/data$> cat WiredTiger.wt | wc -l
23
/var/lib/mongodb/data$> mongod --dbpath $(pwd)
....failed...

The above commands show that I can read the /var/lib/mongod/data/WiredTiger.wt to count the lines but the mongod process cannot.

Only if I do

# 1000230000 is the random UID and GID granted by OpenShift for the Pod.
$> chown -R 1000230000:nfsnobody /srv/share/openshift/mongo/

...the Pod is able to read the files.

Is there anything else I should check to resolve this problem?

UPDATE:

  • The MongoDB version is 4.0.5.
  • Add more log which could pinpoint where the error happened. The wiredtiger_kv_engine.cpp.
Genzer
  • 2,921
  • 2
  • 25
  • 38
  • Check permissions for .lock files – Alex Blex Jan 26 '21 at 17:36
  • possible question duplication: https://stackoverflow.com/questions/51200115/chown-changing-ownership-of-data-db-operation-not-permitted – R2D2 Jan 26 '21 at 18:57
  • 1
    @Alex Blex the `mongo.lock` has the same permission as `WiredTiger.wt`. Even I delete the `mongo.lock` does not change the outcome, the `mongo.lock` will be created again. – Genzer Jan 26 '21 at 20:48
  • @R2D2: I checked the other answer and it did not provide solutions that could resolve mine. – Genzer Jan 26 '21 at 20:51
  • Fair enough, after closer look the error comes from here https://github.com/wiredtiger/wiredtiger/blob/9b32813d625d3dbdf0fd83a7eb4ce10fda0d18f3/src/os_posix/os_fs.c#L662 for the "WiredTiger.wt" itself. – Alex Blex Jan 27 '21 at 00:02
  • Hi @AlexBlex, thanks a lot for the link! I managed to track down the [wiredtiger_kv_engine.cpp](https://github.com/mongodb/mongo/blob/r4.0.5/src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp#L638) which I believe the correct impl. But I'm a little bit struggled to explore the code base. – Genzer Jan 27 '21 at 02:30
  • There is nothing interesting there tbh. It's on the mongo side - it calls wiredtiger_open() 3 times in a row for different version assuming that if there was an error it is because of format of the file. It's why there are 3 consecutive "Operation not permitted" in the logs. The error message itself comes from the `open` syscall. It must be something with openshift permissions. Security layers at https://docs.openshift.com/enterprise/3.1/install_config/persistent_storage/pod_security_context.html look quite complicated. – Alex Blex Jan 27 '21 at 09:44

1 Answers1

1

By reading the MongoDB source code at the tag r4.0.5, I can now understand why I got the error.

Thanks to @Alex Blex for hinting the source code!

Summary

When mongod attempts to read the WiredTiger.wt (or any other files), it tries to not update the file's last access time (st_time in the inode). This reason for doing this is to increase performance. Under the hood, it uses system call open() with a flag O_NOATIME.

According to open() man page:

This flag can be employed only if one of the following conditions is true:

  • The effective UID of the process matches the owner UID of the file.

  • The calling process has the CAP_FOWNER capability in its user namespace and the owner UID of the file has a mapping in the namespace.

The call fails with the error

EPERM  The O_NOATIME flag was specified, but the effective user
       ID of the caller did not match the owner of the file and
       the caller was not privileged.

In my case, the file is being owned by the nfsnobody, not the current UID, hence the error. That explains only by doing chown $UID:nfsnobody the problem goes away.

Some further details

The error comes from posix/os_fs.c when it tries to open a file. At line 693, the flag NO_ATIME is set if the __posix_open_file is called with WT_FS_OPEN_FILE_TYPE_DATA.

Genzer
  • 2,921
  • 2
  • 25
  • 38
  • [NFS Volume Security](https://docs.openshift.com/container-platform/3.11/install_config/persistent_storage/persistent_storage_nfs.html#nfs-volume-security) may be of help, and the use of `supplementalGroups` in particular... – apisim Jan 27 '21 at 23:29