I have built an image based on mariadb:10.1 that only adds a new cluster.conf. The first node starts successfully, but the second node fails with the following error. Can somebody help me debug this?

Error log tail

2016-09-28 10:12:55 139799503415232 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
     at gcomm/src/pc.cpp:connect():162
2016-09-28 10:12:55 139799503415232 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
2016-09-28 10:12:55 139799503415232 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1380: Failed to open channel 'test_cluster' at 'gcomm://172.17.0.2,172.17.0.3,172.17.0.4': -110 (Connection timed out)
2016-09-28 10:12:55 139799503415232 [ERROR] WSREP: gcs connect failed: Connection timed out
2016-09-28 10:12:55 139799503415232 [ERROR] WSREP: wsrep::connect(gcomm://172.17.0.2,172.17.0.3,172.17.0.4) failed: 7
2016-09-28 10:12:55 139799503415232 [ERROR] Aborting

MySQL init process failed.

Debugging steps taken

NOTE: I verified that the container IP addresses match the ones shown.

  1. To check that networking between containers works, I created another container, which could log in to the first container's MySQL instance.
  2. This is definitely not related to MYSQL_HOST.
  3. To see whether the container was running out of memory, I used docker stats. The failed container used only about 142MB throughout its lifecycle until it failed, far less than the total memory it was allowed (~4GB).
  4. I am using Docker for Mac, but running the same setup on a CentOS VirtualBox VM gives the same result, so Docker for Mac does not appear to be the problem.

Config

[mysqld]
user=mysql
binlog_format=ROW
bind-address=0.0.0.0
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=0
innodb_buffer_pool_size=122M
innodb_file_per_table=1
innodb_doublewrite=1
query_cache_size=0
query_cache_type=0
wsrep_on=ON
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_sst_method=rsync

Steps to start containers

# bootstrap node
docker run --rm -e MYSQL_ROOT_PASSWORD=123 \
  activatedgeek/mariadb:devel \
    --wsrep-cluster-name=test_cluster \
    --wsrep-cluster-address=gcomm://172.17.0.2,172.17.0.3,172.17.0.4 \
    --wsrep-new-cluster

# add node into cluster
docker run --rm -e MYSQL_ROOT_PASSWORD=123 \
  activatedgeek/mariadb:devel \
    --wsrep-cluster-name=test_cluster \
    --wsrep-cluster-address=gcomm://172.17.0.2,172.17.0.3,172.17.0.4

# add node into cluster
docker run --rm -e MYSQL_ROOT_PASSWORD=123 \
  activatedgeek/mariadb:devel \
    --wsrep-cluster-name=test_cluster \
    --wsrep-cluster-address=gcomm://172.17.0.2,172.17.0.3,172.17.0.4
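The same address list is repeated in all three commands, so a typo in one of them is easy to miss. A minimal sketch of building it once is shown below. The helper name `gcomm_address` is hypothetical and not part of the image or of Galera; the `docker inspect` line in the comment is only a suggested way to confirm a container's IP, and `<container>` is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: join node IPs into a gcomm:// cluster address.
# To confirm a container's IP first, something like this can be used:
#   docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' <container>
gcomm_address() {
  old_ifs=$IFS
  IFS=,                 # "$*" joins the arguments with the first character of IFS
  addr="gcomm://$*"
  IFS=$old_ifs
  printf '%s\n' "$addr"
}
```

It could then be passed to each node as `--wsrep-cluster-address="$(gcomm_address 172.17.0.2 172.17.0.3 172.17.0.4)"`.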
psiyumm
  • One of the better docker-Galera docs: https://www.pythian.com/blog/building-a-mariadb-galera-cluster-with-docker/ – Rick James Sep 28 '16 at 23:37
  • @RickJames That doesn't really help here. Could you point out what is wrong with the above setup instead? It is pretty much the same. – psiyumm Sep 29 '16 at 05:27
  • One page suggests just plain `gcomm://`, another says never do that, yet another suggests that all 3 addresses need to be in the list. I've had problems, too, and don't recall the "right" answer. – Rick James Sep 29 '16 at 15:31
  • @RickJames The documentation says that all 3 should be in the URL. I also need a provider option for the primary component in `wsrep_provider_options` so the nodes can recover if they crash. The documentation suggests `--wsrep-new-cluster` as the recommended way. Do you see any other problems with the above setup? I was trying for a 3-node cluster, by the way, and the second node declared the first node stable, but the MySQL init failed. – psiyumm Sep 30 '16 at 05:43

2 Answers

This problem is caused by the init process hanging. The configuration and CLI arguments above are correct. The only thing that needs to be done before the init process starts is to create an empty mysql directory inside the data directory (/var/lib/mysql by default). It must be created on every node except the bootstrap node.

mkdir -p /var/lib/mysql/mysql
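As a rough sketch, the rule "every node except the bootstrap node" could be wrapped in a small helper that runs before the image's entrypoint. The function name `prepare_datadir` and its `role` argument are hypothetical, and the sketch assumes the entrypoint skips its init step when the mysql directory already exists, as the fix above implies.

```shell
#!/bin/sh
# Hypothetical helper: pre-create the empty mysql directory on joining nodes.
# $1 = data directory (defaults to /var/lib/mysql)
# $2 = node role, "bootstrap" or "joiner" (defaults to joiner)
prepare_datadir() {
  datadir="${1:-/var/lib/mysql}"
  role="${2:-joiner}"
  # The bootstrap node must still run the normal init, so leave it untouched.
  if [ "$role" != "bootstrap" ]; then
    mkdir -p "$datadir/mysql"
  fi
}
```

On a joining node this would be called as `prepare_datadir /var/lib/mysql joiner` before mysqld starts; the node then receives its data from the cluster through SST.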

See the sample MariaDB Cluster for usage. It uses a custom MariaDB image and is a proof of concept for creating clusters.

psiyumm

I guess your containers should either expose the required ports:

-p 3306:3306 -p 4444:4444 -p 4567:4567 -p 4568:4568

or should be linked together with --link.
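Whether those ports are actually reachable from another container can be checked with a short script such as the sketch below. It assumes bash and the coreutils `timeout` command are available in the container; the function names `port_open` and `check_galera_ports` are hypothetical. Note that 4444 (SST) and 4568 (IST) normally accept connections only while a state transfer is in progress, so "unreachable" for those two is not necessarily an error.

```shell
#!/bin/sh
# Return 0 if a TCP connection to host ($1) on port ($2) succeeds within 2 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash must be installed.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Report the state of the four ports listed above for one node.
# 3306 = client, 4444 = SST, 4567 = group communication, 4568 = IST
check_galera_ports() {
  host=$1
  for port in 3306 4444 4567 4568; do
    if port_open "$host" "$port"; then
      echo "$host:$port open"
    else
      echo "$host:$port unreachable"
    fi
  done
}
```

For example, `check_galera_ports 172.17.0.2` run from the second node shows whether the bootstrap node's group communication port (4567) can be reached.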

Ahmed Ashour
  • No, that wasn't really the problem. The containers were already able to communicate with each other. See my answer to the question, where I solved this. – psiyumm Apr 05 '17 at 03:29