Today, while investigating another issue, I noticed an agent logging connectivity issues to one of our management servers. It turned out that we had 4 MS having trouble accepting connections on port 5723. A telnet connection was refused as well, and agents were complaining a lot.
The OpsMgr Connector could not connect to MANAGEMENTSRV:5723. The error code is 10061L(No connection could be made because the target machine actively refused it.).
OpsMgr was unable to set up a communications channel to MANAGEMENTSRV. Communication will resume when MANAGEMENTSRV is available and communication from this computer is allowed.
Rebooting the affected management servers seems to resolve the issue. Using Splunk, I was able to find out when it happened: during a network outage. But why? Communication to SQL etc. was just fine. Agents failed over to the two MS still in operation.
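The manual telnet test against port 5723 can be scripted so that all management servers are checked in one go. Below is a minimal Python sketch of that idea, not something taken from our environment: the function names `check_port` and `check_all` are made up for illustration, and `MANAGEMENTSRV` is the same placeholder host name as in the events above. It only tells you whether the TCP port accepts connections, not whether the management server is otherwise healthy.

```python
import socket

SCOM_PORT = 5723  # port the agents could not reach in the events above


def check_port(host, port=SCOM_PORT, timeout=3.0):
    """Try a TCP connect and return 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The target actively refused the connection (what the agents
        # report as error 10061 on Windows).
        return "refused"
    except socket.timeout:
        # No answer within the timeout, e.g. traffic dropped on the way.
        return "timeout"
    except OSError as exc:
        # Anything else, such as a host name that does not resolve.
        return f"error: {exc}"


def check_all(hosts, port=SCOM_PORT):
    """Return a dict mapping each host name to its check_port() result."""
    return {host: check_port(host, port) for host in hosts}
```

Calling `check_all(["MANAGEMENTSRV"])` with the real server names in place of the placeholder returns one result per server, which makes it easy to see which MS are refusing connections and to confirm that a reboot brought the listener back.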
SquaredUp graph showing avg. batch per sec for a server that is working and one that is not…