
I seem to have run into an unclean shutdown after a power failure that I can't recover from. I've tried running mongod --repair within my controller container, but it doesn't seem to help. Any suggestions? I don't want to just blow away my unifi_mongo container, since I'm not sure whether I'd lose all my configs.

As a somewhat related question, should I be enabling journaling somehow in this config even though I'm on a 32-bit Raspbian Lite OS? I'm not sure how I'd do that, but maybe it would prevent these sorts of issues in the future?

docker logs -f unifi_mongo

2021-03-06T18:35:51.917+0000 I STORAGE  [initandlisten] exception in initAndListen: 12596 old lock file, terminating
2021-03-06T18:35:51.917+0000 I CONTROL  [initandlisten] dbexit:  rc: 100
2021-03-06T18:36:44.913+0000 I CONTROL  [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 32-bit host=a282e3addaec
2021-03-06T18:36:44.913+0000 I CONTROL  [initandlisten] db version v3.0.14
2021-03-06T18:36:44.913+0000 I CONTROL  [initandlisten] git version: 08352afcca24bfc145240a0fac9d28b978ab77f3
2021-03-06T18:36:44.914+0000 I CONTROL  [initandlisten] build info: Linux raspberrypi 4.9.41-v7+ #1023 SMP Tue Aug 8 16:00:15 BST 2017 armv7l BOOST_LIB_VERSION=1_49
2021-03-06T18:36:44.914+0000 I CONTROL  [initandlisten] allocator: system
2021-03-06T18:36:44.914+0000 I CONTROL  [initandlisten] options: { storage: { journal: { enabled: true } } }
2021-03-06T18:36:44.935+0000 W -        [initandlisten] Detected unclean shutdown - /data/db/mongod.lock is not empty.
2021-03-06T18:36:44.972+0000 I STORAGE  [initandlisten] **************
old lock file: /data/db/mongod.lock.  probably means unclean shutdown,
but there are no journal files to recover.
this is likely human error or filesystem corruption.
please make sure that your journal directory is mounted.
found 3 dbs.
see: http://dochub.mongodb.org/core/repair for more information

docker-compose.yml:

version: '2.3'
services:
 mongo:
   #   image: mongo
   image: andresvidal/rpi3-mongodb3
   container_name: ${COMPOSE_PROJECT_NAME}_mongo
   networks:
     - unifi
   restart: always
   volumes:
     - db:/data/db
     - dbcfg:/data/configdb
 controller:
   image: "jacobalberty/unifi:${TAG:-latest}"
   container_name: ${COMPOSE_PROJECT_NAME}_controller
   depends_on:
     - mongo
   init: true
   networks:
     - unifi
   restart: always
   privileged: true
   volumes:
     - dir:/unifi
     - data:/unifi/data
     - log:/unifi/log
     - cert:/unifi/cert
     - init:/unifi/init.d
     - run:/var/run/unifi
     # Mount local folder for backups and autobackups
     - ./backup:/unifi/data/backup
   user: unifi
   sysctls:
     net.ipv4.ip_unprivileged_port_start: 0
   environment:
     DB_URI: mongodb://mongo/unifi
     STATDB_URI: mongodb://mongo/unifi_stat
     DB_NAME: unifi
     TZ: America/Toronto
   ports:
     - "3478:3478/udp" # STUN
     - "1900:1900/udp"
     - "6789:6789/tcp" # Speed test
     - "8080:8080/tcp" # Device/ controller comm.
     - "8443:8443/tcp" # Controller GUI/API as seen in a web browser
     - "8880:8880/tcp" # HTTP portal redirection
     - "8843:8843/tcp" # HTTPS portal redirection
     - "10001:10001/udp" # AP discovery
 logs:
   image: bash
   container_name: ${COMPOSE_PROJECT_NAME}_logs
   depends_on:
     - controller
   command: bash -c 'tail -F /unifi/log/*.log'
   restart: always
   volumes:
     - log:/unifi/log


volumes:
 db:
 dbcfg:
 data:
 log:
 cert:
 init:
 dir:
 run:

networks:
 unifi:

I tried blowing away the lock and re-running "docker-compose up -d" but it didn't solve the problem.

unifi@me:/unifi/data/db$ ls
local  local.0  local.ns  storage.bson  version

Output of docker ps:

docker ps
CONTAINER ID   IMAGE                       COMMAND                  CREATED       STATUS                            PORTS                                                                                                                                                                                              NAMES
aaaaaaaaaaaa   jacobalberty/unifi:latest   "/usr/local/bin/dock…"   4 weeks ago   Up 3 days (unhealthy)             0.0.0.0:1900->1900/udp, 0.0.0.0:6789->6789/tcp, 0.0.0.0:8080->8080/tcp, 0.0.0.0:8443->8443/tcp, 0.0.0.0:8843->8843/tcp, 0.0.0.0:3478->3478/udp, 0.0.0.0:10001->10001/udp, 0.0.0.0:8880->8880/tcp   unifi_controller
bbbbbbbbbbbb   andresvidal/rpi3-mongodb3   "/docker-entrypoint.…"   4 weeks ago   Restarting (100) 13 seconds ago                                                                                                                                                                                                      unifi_mongo

Do I run mongod --repair inside the mongo container? How do I do that if it keeps restarting?

Thanks

Edit:

I tried setting an entrypoint in the docker-compose.yml to run mongod --repair instead of the normal mongo startup, but I got this backtrace:

2021-03-06T18:52:46.599+0000 I INDEX    [initandlisten]          building index using bulk method
2021-03-06T18:52:46.649+0000 I -        [initandlisten] Fatal Assertion 17441
2021-03-06T18:52:46.769+0000 I CONTROL  [initandlisten]
 0x1622348 0x15c50a0 0x15abc08 0xdc63ec 0x13e0730 0x13e026c 0x13f883c 0x13faf84 0x122a440 0xcf1838 0xcf2f1c 0xcf3acc 0xcf4e00 0xcf3e28 0x76bc9678
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"10000","o":"1612348","s":"_ZN5mongo15printStackTraceERSo"},{"b":"10000","o":"15B50A0","s":"_ZN5mongo10logContextEPKc"},{"b":"10000","o":"159BC08","s":"_ZN5mongo13fassertFailedEi"},{"b":"10000","o":"DB63EC","s":"_ZN5mongo7fassertEib"},{"b":"10000","o":"13D0730","s":"
_ZNK5mongo17RecordStoreV1Base21getNextRecordInExtentEPNS_16OperationContextERKNS_7DiskLocE"},{"b":"10000","o":"13D026C","s":"_ZNK5mongo17RecordStoreV1Base13getNextRecordEPNS_16OperationContextERKNS_7DiskLocE"},{"b":"10000","o":"13E883C","s":"_ZN5mongo27SimpleRecordStoreV1Iterator7getNe
xtEv"},{"b":"10000","o":"13EAF84","s":"_ZN5mongo12MMAPV1Engine14repairDatabaseEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEbb"},{"b":"10000","o":"121A440","s":"_ZN5mongo14repairDatabaseEPNS_16OperationContextEPNS_13StorageEngineERKNSt7__cxx1112basic_str
ingIcSt11char_traitsIcESaIcEEEbb"},{"b":"10000","o":"CE1838"},{"b":"10000","o":"CE2F1C"},{"b":"10000","o":"CE3ACC","s":"_ZN5mongo13initAndListenEi"},{"b":"10000","o":"CE4E00"},{"b":"10000","o":"CE3E28","s":"main"},{"b":"76BB3000","o":"16678","s":"__libc_start_main"}],"processInfo":{ "m
ongodbVersion" : "3.0.14", "gitVersion" : "08352afcca24bfc145240a0fac9d28b978ab77f3", "uname" : { "sysname" : "Linux", "release" : "5.4.83-v7+", "version" : "#1379 SMP Mon Dec 14 13:08:57 GMT 2020", "machine" : "armv7l" }, "somap" : [ { "elfType" : 2, "b" : "10000", "buildId" : "77BB9B
C6C28CA032211CCD119B903FDEE2C6A7D8" }, { "b" : "7EDCA000", "path" : "linux-vdso.so.1", "elfType" : 3, "buildId" : "8E8ADD944B36D89CB5A4AE6DAB825D428D5407ED" }, { "b" : "76F22000", "path" : "/lib/arm-linux-gnueabihf/librt.so.1", "elfType" : 3, "buildId" : "4C7E415AA306267E5BA73CD0FE8F6F
8ABC5D9370" }, { "b" : "76F0F000", "path" : "/lib/arm-linux-gnueabihf/libdl.so.2", "elfType" : 3, "buildId" : "99B3CD788031A72A37B9C9F10C5A63FEABF1BCDB" }, { "b" : "76DC7000", "path" : "/usr/lib/arm-linux-gnueabihf/libstdc++.so.6", "elfType" : 3, "buildId" : "5909F48F93D947CDD017977DA4
79EC563E8B426E" }, { "b" : "76D48000", "path" : "/lib/arm-linux-gnueabihf/libm.so.6", "elfType" : 3, "buildId" : "1128E26D3F2FA311FE65EDF9E3930D2162AF9BE8" }, { "b" : "76D1B000", "path" : "/lib/arm-linux-gnueabihf/libgcc_s.so.1", "elfType" : 3, "buildId" : "030EF284554E9F6259572226A3F2
6F86F86E1B35" }, { "b" : "76CF2000", "path" : "/lib/arm-linux-gnueabihf/libpthread.so.0", "elfType" : 3, "buildId" : "4B15D4A8FE60C9A013D924976C36C1281A60E04D" }, { "b" : "76BB3000", "path" : "/lib/arm-linux-gnueabihf/libc.so.6", "elfType" : 3, "buildId" : "B84C7156F66DE515C6257D0A4A71
D9F31CE6F9CF" }, { "b" : "76F39000", "path" : "/lib/ld-linux-armhf.so.3", "elfType" : 3, "buildId" : "21F72FB00897D4F06093D6F0451C9CA7D1F6E14C" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x2C) [0x1622348]
 mongod(_ZN5mongo10logContextEPKc+0x88) [0x15c50a0]
 mongod(_ZN5mongo13fassertFailedEi+0x78) [0x15abc08]
 mongod(_ZN5mongo7fassertEib+0x34) [0xdc63ec]
 mongod(_ZNK5mongo17RecordStoreV1Base21getNextRecordInExtentEPNS_16OperationContextERKNS_7DiskLocE+0x90) [0x13e0730]
 mongod(_ZNK5mongo17RecordStoreV1Base13getNextRecordEPNS_16OperationContextERKNS_7DiskLocE+0x30) [0x13e026c]
 mongod(_ZN5mongo27SimpleRecordStoreV1Iterator7getNextEv+0x8C) [0x13f883c]
 mongod(_ZN5mongo12MMAPV1Engine14repairDatabaseEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEbb+0xD00) [0x13faf84]
 mongod(_ZN5mongo14repairDatabaseEPNS_16OperationContextEPNS_13StorageEngineERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEbb+0x1C8) [0x122a440]
 mongod(+0xCE1838) [0xcf1838]
 mongod(+0xCE2F1C) [0xcf2f1c]
 mongod(_ZN5mongo13initAndListenEi+0x20) [0xcf3acc]
 mongod(+0xCE4E00) [0xcf4e00]
 mongod(main+0x28) [0xcf3e28]
 libc.so.6(__libc_start_main+0x114) [0x76bc9678]
-----  END BACKTRACE  -----
2021-03-06T18:52:46.770+0000 I -        [initandlisten]

***aborting after fassert() failure

Edit 2: Trying to run a repair manually doesn't seem to solve the problem:

docker run -it -v db:/data/db andresvidal/rpi3-mongodb3:latest mongod --repair
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm/v7) and no specific platform was requested
2021-03-06T19:23:47.049+0000 I CONTROL
2021-03-06T19:23:47.049+0000 W CONTROL  32-bit servers don't have journaling enabled by default. Please use --journal if you want durability.
2021-03-06T19:23:47.049+0000 I CONTROL
2021-03-06T19:23:47.075+0000 I CONTROL  [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 32-bit host=fa58e6e86cff
2021-03-06T19:23:47.075+0000 I CONTROL  [initandlisten] db version v3.0.14
2021-03-06T19:23:47.075+0000 I CONTROL  [initandlisten] git version: 08352afcca24bfc145240a0fac9d28b978ab77f3
2021-03-06T19:23:47.075+0000 I CONTROL  [initandlisten] build info: Linux raspberrypi 4.9.41-v7+ #1023 SMP Tue Aug 8 16:00:15 BST 2017 armv7l BOOST_LIB_VERSION=1_49
2021-03-06T19:23:47.075+0000 I CONTROL  [initandlisten] allocator: system
2021-03-06T19:23:47.075+0000 I CONTROL  [initandlisten] options: { repair: true }
2021-03-06T19:23:47.159+0000 I CONTROL  [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2021-03-06T19:23:47.159+0000 I CONTROL  [initandlisten]
2021-03-06T19:23:47.159+0000 I CONTROL  [initandlisten]
2021-03-06T19:23:47.159+0000 I CONTROL  [initandlisten] ** NOTE: This is a 32 bit MongoDB binary.
2021-03-06T19:23:47.159+0000 I CONTROL  [initandlisten] **       32 bit builds are limited to less than 2GB of data (or less with --journal).
2021-03-06T19:23:47.159+0000 I CONTROL  [initandlisten] **       Note that journaling defaults to off for 32 bit and is currently off.
2021-03-06T19:23:47.159+0000 I CONTROL  [initandlisten] **       See http://dochub.mongodb.org/core/32bit
2021-03-06T19:23:47.159+0000 I CONTROL  [initandlisten]
2021-03-06T19:23:47.166+0000 I STORAGE  [initandlisten] finished checking dbs
2021-03-06T19:23:47.167+0000 I CONTROL  [initandlisten] now exiting
2021-03-06T19:23:47.167+0000 I NETWORK  [initandlisten] shutdown: going to close listening sockets...
2021-03-06T19:23:47.168+0000 I NETWORK  [initandlisten] removing socket file: /tmp/mongodb-27017.sock
2021-03-06T19:23:47.168+0000 I NETWORK  [initandlisten] shutdown: going to flush diaglog...
2021-03-06T19:23:47.168+0000 I NETWORK  [initandlisten] shutdown: going to close sockets...
2021-03-06T19:23:47.168+0000 I STORAGE  [initandlisten] shutdown: waiting for fs preallocator...
2021-03-06T19:23:47.169+0000 I STORAGE  [initandlisten] shutdown: closing all files...
2021-03-06T19:23:47.169+0000 I STORAGE  [initandlisten] closeAllFiles() finished
2021-03-06T19:23:47.169+0000 I STORAGE  [initandlisten] shutdown: removing fs lock...
2021-03-06T19:23:47.169+0000 I CONTROL  [initandlisten] dbexit:  rc: 0

Edit 3:

mongodump --repair -d /data/db on a stopped instance can't find the database

mongodump --repair -d /data/db on a running mongo instance gives me the following, after which my container crashes again:

 Failed: error getting collections for database `/data/db`: error running `listCollections`. Database: `/data/db` Err: Invalid ns [/data/db.$cmd]

mongodump --repair on a running mongo instance gives me:

2021-03-07T17:31:44.556+0000    writing repair of unifi.wlanconf to dump/unifi/wlanconf.bson
2021-03-07T17:31:44.560+0000            repair cursor found 4 documents in unifi.wlanconf
2021-03-07T17:31:44.561+0000    writing unifi.wlanconf metadata to dump/unifi/wlanconf.metadata.json
2021-03-07T17:31:44.564+0000    done dumping unifi.wlanconf (0 documents)
2021-03-07T17:31:44.565+0000    writing repair of unifi.site to dump/unifi/site.bson
2021-03-07T17:31:44.569+0000            repair cursor found 4 documents in unifi.site
2021-03-07T17:31:44.569+0000    writing unifi.site metadata to dump/unifi/site.metadata.json
2021-03-07T17:31:44.572+0000    done dumping unifi.site (0 documents)
2021-03-07T17:31:44.572+0000    writing repair of unifi.networkconf to dump/unifi/networkconf.bson
2021-03-07T17:31:44.576+0000            repair cursor found 4 documents in unifi.networkconf
2021-03-07T17:31:44.576+0000    writing unifi.networkconf metadata to dump/unifi/networkconf.metadata.json
2021-03-07T17:31:44.579+0000    done dumping unifi.networkconf (0 documents)
2021-03-07T17:31:44.580+0000    writing repair of unifi.privilege to dump/unifi/privilege.bson
2021-03-07T17:31:44.610+0000            repair cursor found 4 documents in unifi.privilege
2021-03-07T17:31:44.610+0000    writing unifi.privilege metadata to dump/unifi/privilege.metadata.json
2021-03-07T17:31:44.616+0000    done dumping unifi.privilege (0 documents)
2021-03-07T17:31:44.616+0000    writing repair of unifi.apgroup to dump/unifi/apgroup.bson
2021-03-07T17:31:44.640+0000            repair cursor found 2 documents in unifi.apgroup
2021-03-07T17:31:44.640+0000    writing unifi.apgroup metadata to dump/unifi/apgroup.metadata.json
2021-03-07T17:31:44.672+0000    done dumping unifi.apgroup (0 documents)
2021-03-07T17:31:44.675+0000    Failed: repair error: error reading collection: EOF

2 Answers


You have datafile corruption from the unclean dismount of the disk during the shutdown. Even if you restore the database, you can still face problems due to inconsistent keys in the database. The following is a procedure to properly address these issues; a concrete sketch adapted to the compose file above follows the steps.

Recovering MongoDB from an abrupt failure

  1. If the database files are on your host, make a copy of them before starting this procedure. To copy them out of a container, you can use

    docker cp <container_name>:<location of files in container> <location on host>
    

    If the database files are still inside the container, get them out of the container and make a copy.

  2. Start a repair container over the files as follows:

    docker run -it -v <data folder>:/data/db <image name>:<image-version> mongod --repair
    

    The image name depends on the platform: for the Raspberry Pi 3 (armv7) it is andresvidal/rpi3-mongodb3; for arm64v8 or amd64 it is mongo.

    Make sure you use the same MongoDB image version as the one that created the data files.

    If the files are beyond repair, try:

    docker run -it -v <data folder>:/data/db mongo:<image-version> mongodump --repair --dbpath /data/db
    
  3. Once the files are repaired, start a container over the database and export the data with

    docker run -it -v <data folder>:/data/db mongo:<image-version> mongodump --dbpath /data/db
    
  4. Start a clean database for your project and use mongorestore to import the data into the new database.
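
For the setup in the question, where the data lives in the named Docker volume db and the image is andresvidal/rpi3-mongodb3, the procedure might look roughly like the sketch below. This is untested and only illustrative: the ./db-backup and ./dump paths are made up for the example, and since the 3.0-era tools don't support --dbpath, the dump and restore run against a running mongod instead.

    # 1. back up the named volume to the host before touching anything
    docker run --rm -v db:/data/db -v "$(pwd)/db-backup":/backup \
        --entrypoint cp andresvidal/rpi3-mongodb3 -a /data/db /backup

    # 2. run the repair against the same volume, using the same image version
    docker run -it -v db:/data/db andresvidal/rpi3-mongodb3 mongod --repair

    # 3. dump from a running instance (3.0 tools have no --dbpath flag)
    docker-compose up -d mongo
    docker exec -it unifi_mongo mongodump --out /dump
    docker cp unifi_mongo:/dump ./dump

    # 4. restore into a fresh mongo running on a new, empty volume
    docker run -d --name mongo_fresh -v db_fresh:/data/db andresvidal/rpi3-mongodb3
    docker cp ./dump mongo_fresh:/dump
    docker exec -it mongo_fresh mongorestore /dump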


  • Thanks - how would I copy the database files in a docker setup? I'm not sure how to access them. – Jordan Mar 06 '21 at 18:54
  • Updated with the instruction. – jordanvrtanoski Mar 06 '21 at 19:04
  • What files do I need to copy for <data folder>, something like: docker cp unifi_mongo:/data/db tmpbak? Also, what version do I use for mongo:<image-version>, as well as the <image name>? I was thinking something like "docker run -it -v db:/data/db andresvidal/rpi3-mongodb3:latest mongod --repair" but I get a "WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm/v7) and no specific platform was requested" – Jordan Mar 06 '21 at 19:06
  • Also I'm getting a "error parsing command line options: --dbpath and related flags are not supported in 3.0 tools" error, and if I omit that, I get "Failed: error connecting to db server: no reachable servers" – Jordan Mar 06 '21 at 19:26
  • You need to use `andresvidal/rpi3-mongodb3` for armv7 (RPI3). Your files, as I can see from the docker-compose, are in `/data/db`, so you need to make a copy of them. – jordanvrtanoski Mar 06 '21 at 19:46
  • Yeah, I can't seem to run mongodump, it's giving me the "dbpath and related flags are not supported in 3.0 tools" error. I updated with log output above – Jordan Mar 06 '21 at 19:47
  • In your last log it looks like the files are corrected. For `mongodump`, since it is an older version, use `-d` instead of `--dbpath`. Check [this thread](https://stackoverflow.com/a/49063652/2816703) for more details – jordanvrtanoski Mar 06 '21 at 19:53
  • Yeah, -d just gives me " Failed: error connecting to db server: no reachable servers" when I run "docker run -it -v db:/data/db andresvidal/rpi3-mongodb3:latest mongodump --repair -d /data/db" – Jordan Mar 06 '21 at 20:00
  • Ok, start the container with the database, enter the container with `docker exec -ti <container> sh`, and try the `mongodump` from inside the container. – jordanvrtanoski Mar 06 '21 at 20:01
  • Same problem from within: root@38bf31a0f50b:/data# mongodump --repair -d /data/db 2021-03-06T20:09:58.318+0000 Failed: error connecting to db server: no reachable servers – Jordan Mar 06 '21 at 20:10
  • Do I need to start mongo first to do this? I just ran "mongod" and then "mongodump --repair -d /data/db" together, but that gave me "Failed: error getting collections for database `/data/db`: error running `listCollections`. Database: `/data/db` Err: Invalid ns [/data/db.$cmd]" – Jordan Mar 06 '21 at 20:21
  • Aha! Turns out I actually could just delete the lock, even though the mongod --repair failed. Started the mongo container without starting mongod by overriding the entrypoint to tail /dev/null and was able to exec -it into it and remove the lock – Jordan Mar 06 '21 at 20:46
  • When I run mongodump --repair I get "2021-03-07T17:31:44.675+0000 Failed: repair error: error reading collection: EOF" (updated full log in the description under Edit 3) – Jordan Mar 07 '21 at 17:32
  • Do you have the backup files from step #1? If so, start from that point. Another idea that comes to mind is to export collection by collection and note which collections are failing. Also, the EOF you are getting looks like a failure of MongoDB rather than of the export, so check the log files of the database to see what the error code is. – jordanvrtanoski Mar 07 '21 at 17:53
  • Oh, you mean run the repair directly on the dumped files on the host machine instead of the container? – Jordan Mar 07 '21 at 17:55
  • No, it's not required to do it on the host; you need to run it with the same version of the container. You can start the procedure from the beginning (run steps 2, 3 and 4). – jordanvrtanoski Mar 07 '21 at 19:02

Aha! So for some reason none of the mongod --repair stuff was working for me. The problem was keeping the mongo container up long enough to be able to remove the lock.

I did this by overriding the entrypoint in my docker-compose.yml, and then I was able to get into the container and remove the lock with "docker exec -ti unifi_mongo /bin/bash" (the full sequence is sketched after the snippet below).

 mongo:
    #    image: mongo
    image: andresvidal/rpi3-mongodb3
    container_name: ${COMPOSE_PROJECT_NAME}_mongo
    networks:
      - unifi
    restart: on-failure
    volumes:
      - db:/data/db
      - dbcfg:/data/configdb
    entrypoint: ["tail", "-f", "/dev/null"] ############### For repairing mongo, just loop so container stays alive while not crashing trying to run mongo on startup 