Broken Docker Swarm Cluster – How to resolve this error

May 17, 2021

Wondering how to resolve the ‘Broken Docker Swarm Cluster’ error? We can help you.

As part of our Docker Hosting services, we assist our customers with several Docker queries.

Today, let us see how to resolve this error.

Causes for Broken Docker Swarm Cluster

The following symptoms point to a broken Docker Swarm cluster:

  •  The deployed application does not work because of networking issues.
  •  dockerd writes many error messages to /var/log/syslog on several nodes:
    ~~
    level=warning msg="cannot find proper key indices while processing key update"
    level=error msg="agent: session failed" error="rpc error: code = Aborted desc = dispatcher is stopped"
    level=warning msg="memberlist: Failed fallback ping: No installed keys could decrypt the message"
    level=warning msg="memberlist: Decrypt packet failed: No installed keys could decrypt the message"
    ~~
  •  netstat -tnlp does not show the usual dockerd bindings on the leader manager.
  •  After docker swarm leave, the manager does not see the updated node list.
  •  Deployments hang.
  •  Rebooting a worker or manager does not fix the broken state of the cluster.
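
To check quickly whether a node is logging the messages listed above, a small helper along these lines can count them. This is only a sketch: the function name `count_swarm_errors` is made up for illustration, and it assumes the dockerd messages land in a plain-text log file such as /var/log/syslog.

```shell
# Count log lines that contain each of the error signatures listed above.
# Usage: count_swarm_errors [LOGFILE]   (LOGFILE defaults to /var/log/syslog)
# count_swarm_errors is a hypothetical helper, not part of Docker.
count_swarm_errors() {
  local logfile="${1:-/var/log/syslog}"
  local pattern
  for pattern in \
    'cannot find proper key indices while processing key update' \
    'dispatcher is stopped' \
    'No installed keys could decrypt the message'
  do
    # grep -c prints the number of matching lines (0 when nothing matches);
    # -F treats the pattern as a fixed string rather than a regex.
    printf '%s: %s\n' "$pattern" "$(grep -cF -- "$pattern" "$logfile")"
  done
}
```

Running `count_swarm_errors` on each node and comparing the counts gives a rough idea of which nodes are affected.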

 

Before this issue appeared, all nodes had logged a large number of “dispatcher is stopped” errors.

Usually, after a “dispatcher is stopped” error, the cluster re-initializes communication automatically and no administrator intervention is necessary.

Consider a swarm with three managers, which periodically elect a leader. In our case the election failed and manager-0 became unreachable from the other nodes.

The other two managers did not form a new quorum because of the stopped dispatcher, and the cluster did not change the status of manager-0.

As a result, manager-0 could not renew the docker network, and the other managers and the worker nodes became unreachable.
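
The quorum arithmetic behind this scenario can be sketched as follows. Swarm managers use Raft, which needs a majority of managers to agree, so with three managers two reachable ones would normally be enough; here the stopped dispatcher prevented that. The helper names `quorum_size` and `unreachable_managers` are made up for illustration.

```shell
# Majority needed among N managers: floor(N/2) + 1.
# quorum_size is a hypothetical helper.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Read "HOSTNAME MANAGER_STATUS" lines on stdin and print the hostnames
# whose manager status is "Unreachable". Hypothetical helper.
unreachable_managers() {
  awk '$2 == "Unreachable" { print $1 }'
}

# On a live manager the input could come from (not run here):
#   docker node ls --filter role=manager \
#     --format '{{.Hostname}} {{.ManagerStatus}}' | unreachable_managers
```

For example, `quorum_size 3` prints 2, so a three-manager swarm tolerates the loss of one manager under normal conditions.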

 

Solution for Broken Docker Swarm Cluster

Let us now see the steps our Support Techs follow to resolve the error.

The solution is to destroy and reinitialize the whole cluster.

Reinitialize docker swarm cluster

The steps to reinitialize the Docker Swarm cluster are as follows.

1. Optional: if tags (node labels) are in use, save the tags assigned to the nodes for later reassignment. On a manager node, run this script:

~~
# Print "<hostname>: <labels>" for every worker node, sorted by hostname.
for node in $(docker node ls --filter role=worker --format '{{ .Hostname }}'); do
  # .Spec.Labels prints as map[key:value ...]; sed strips the map[ ] wrapper.
  tags=$(docker node inspect "$node" -f '{{.Spec.Labels}}' |
    sed -e 's/^map\[//' -e 's/\]$//')
  printf "%s: %s\n" "$node" "$tags"
done | sort
~~

The tags can be assigned back later with:

$ docker node update --label-add <TAG>=true <WORKER>

2. On each node, force it to leave the swarm:

$ docker swarm leave --force

3. On each node, restart the Docker service:

$ systemctl restart docker.service
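
With many nodes, steps 2 and 3 can be prepared as a dry run first. The sketch below only prints the commands; it assumes SSH access to every node, which the steps above do not mention, and `print_reset_commands` is a made-up helper name.

```shell
# Print (do not execute) the leave-and-restart commands for each host
# passed as an argument.
# Assumption: every node is reachable over SSH with sufficient privileges.
# print_reset_commands is a hypothetical helper.
print_reset_commands() {
  local host
  for host in "$@"; do
    # ";" rather than "&&" so the restart still runs on a node that has
    # already left the swarm.
    printf "ssh %s 'docker swarm leave --force; systemctl restart docker.service'\n" "$host"
  done
}
```

Review the printed lines before running them, since leaving the swarm is destructive.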

4. On the manager, create a new cluster:

$ docker swarm init --availability drain --advertise-addr <INTERNAL IP>

The internal IP address is the intranet IP of the cluster, which the nodes use to communicate with each other.
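
Before passing an address to --advertise-addr, it can help to confirm that it really is an internal one. The following sketch assumes "internal" means an RFC 1918 private IPv4 range, which may not match every network. It is a prefix check rather than a full address validator, and `is_private_ipv4` is a made-up helper name.

```shell
# Succeed (exit status 0) when the argument starts like an RFC 1918
# private IPv4 address: 10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16.
# is_private_ipv4 is a hypothetical helper; it does not validate the
# remaining octets.
is_private_ipv4() {
  local second
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.*)
      # 172.16.0.0/12 covers second octets 16 through 31.
      second=${1#172.}
      second=${second%%.*}
      [ "$second" -ge 16 ] 2>/dev/null && [ "$second" -le 31 ] 2>/dev/null ;;
    *) return 1 ;;
  esac
}
```

For instance, `is_private_ipv4 172.30.0.58 && echo internal` prints "internal" for the address used later in this article.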

5. Then, generate the tokens used to invite manager and worker nodes:

~~
$ docker swarm join-token manager
To add a manager to this swarm, run the following command:

docker swarm join --token SWMTKN-1-6bsyhhxe3txagx… 172.30.0.58:2377

$ docker swarm join-token worker
To add a worker to this swarm, run the following command:

docker swarm join --token SWMTKN-1-6bsyhhxe3txagy… 172.30.0.58:2377
~~

The output of each command is the command for joining the cluster in that role. Copy it and run it on every node that should join as a manager or as a worker.
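
When the join step is scripted, the join command can be assembled from its parts instead of being copied by hand. This is a sketch: `build_join_command` is a made-up helper, and the token in the usage note is a placeholder rather than a real token.

```shell
# Build a "docker swarm join" command from a token and a manager address.
# On a live manager the bare token can be printed with:
#   docker swarm join-token -q worker
# build_join_command is a hypothetical helper.
build_join_command() {
  local token="$1"
  local manager_addr="$2"     # for example 172.30.0.58
  local port="${3:-2377}"     # 2377 is the port shown in the output above
  printf 'docker swarm join --token %s %s:%s\n' "$token" "$manager_addr" "$port"
}
```

For example, `build_join_command "<TOKEN>" 172.30.0.58` prints the command to run on the joining node, with `<TOKEN>` standing in for the real value.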

6. On the manager, confirm that the nodes have joined the cluster:

$ docker node ls
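
The check in step 6 can also be automated. The sketch below reads one "HOSTNAME STATUS" pair per line and succeeds only when the expected number of nodes is listed and all of them are Ready. `all_nodes_ready` is a made-up helper name.

```shell
# Read "HOSTNAME STATUS" lines on stdin. Exit 0 only if exactly $1 nodes
# are listed and every one of them reports the status "Ready".
# all_nodes_ready is a hypothetical helper.
all_nodes_ready() {
  awk -v expected="$1" '
    NF              { total++ }   # count non-empty lines
    $2 == "Ready"   { ready++ }
    END { exit !(total == expected && ready == expected) }
  '
}

# On a live manager the input could come from (not run here):
#   docker node ls --format '{{.Hostname}} {{.Status}}' | all_nodes_ready 5
```

The expected node count (5 in the comment) is an example value; use the real size of the cluster.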

Also set the availability of the manager nodes to Drain:

$ docker node update --availability drain <HOSTNAME>

Manager availability is Active by default; set it to Drain if the manager should not run any containers.

7. Optional: if tags were saved in step 1, add them back to the nodes now:

$ docker node update --label-add <TAG>=true <WORKER>
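
If the output of the script in step 1 was saved to a file, the update commands can be generated from it. This sketch assumes the saved lines look like "worker-1: ssd:true zone:a" and that label keys and values contain no spaces and keys contain no colon. The helper name `label_restore_commands` and the file name in the usage note are made up.

```shell
# Read "<hostname>: key:value key:value ..." lines on stdin and print one
# "docker node update --label-add" command per label.
# Assumption: no spaces in keys or values, no colon in keys.
# label_restore_commands is a hypothetical helper.
label_restore_commands() {
  local node labels pair key value
  while read -r node labels; do
    node="${node%:}"            # strip the colon that follows the hostname
    for pair in $labels; do
      key="${pair%%:*}"         # text before the first colon
      value="${pair#*:}"        # text after the first colon
      printf 'docker node update --label-add %s=%s %s\n' "$key" "$value" "$node"
    done
  done
}
```

For example, `label_restore_commands < saved-labels.txt` prints the commands for review; they can then be run on a manager.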

8. Finally, deploy the stack again:

$ docker stack deploy -c docker-compose.yml <STACK_NAME>
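
After the deploy, it is worth checking that every service reaches its desired replica count. The following sketch lists services whose running and desired counts differ. `unconverged_services` is a made-up helper name.

```shell
# Read "NAME REPLICAS" lines (for example "web 3/3") on stdin and print
# the services whose running count differs from the desired count.
# unconverged_services is a hypothetical helper.
unconverged_services() {
  awk '{
    split($2, r, "/")                     # r[1] = running, r[2] = desired
    if (r[1] != r[2]) print $1 " " $2
  }'
}

# On a live manager the input could come from (not run here):
#   docker stack services --format '{{.Name}} {{.Replicas}}' <STACK_NAME> \
#     | unconverged_services
```

Empty output means that all listed services have converged.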

 


Conclusion

Today, we saw the steps our Support Techs follow to resolve a broken Docker Swarm cluster.
