Replace Management Server *NIX monitoring

I’m replacing the 2012 management servers in my SCOM2016 management group with server 2019, so that I can then in-place upgrade to SCOM2019. I’m doing this one at a time by removing the management server from SCOM, deleting the VM, then setting a new one up with the same name and IP and adding it back to the 2016 management group.

I’ve done the easy ones, now I need to tackle the two that make up my *NIX monitoring resource pool. If I replace one of the two servers in the resource pool then copy its new certificate to the existing second member (and vice versa) will that maintain *NIX monitoring whilst I then do the same thing to the second server? For example:

Existing *NIX Resource Pool (SCOM2016 MG):
SCOMMS1 - server 2012
SCOMMS2 - server 2012

goes to:

SCOMMS1 - server 2019
SCOMMS2 - server 2012
copy certificates both ways

then to:

SCOMMS1 - server 2019
SCOMMS2 - server 2019
copy certificates both ways

Will that work? Or is there a better way?

So I did the above and replaced one of my 2012 management servers with a VM running Windows server 2019, SCOM2016 with the same name and IP, joined it back to the SCOM Management Group and did the certificate malarkey so that *nix agents can report to it.

Sadly, I think I’m hitting an error with TLS/SSL enforcement, which I set up following Kevin Holman’s guide. Both the 2012 server and the 2019 server are now TLS1.2 enforced. When I join the 2019 server to the *nix resource pool and the agents balance across the two management servers, after a couple of minutes they stop heartbeating and go critical.

The error is:
WSManFault
The server certificate on the destination computer (servername: 1270) has the following errors:
Encountered an internal error in the SSL library.

This only happens on the 2019 server. Both servers are locked down the same way and run the same SCOM2016 software; the only difference is the OS. I’ve tried relaxing the TLS/SSL enforcement but it doesn’t seem to help. Does anyone have any ideas what’s going on?

Hi Peter,

Not something I have much experience in, unfortunately. But I have run into some issues in similar areas before.

Do you know if the error is coming from the Linux or the Windows side? It sounds like the Linux side.

If you use Test-WSMan from the Windows side, do you get any additional errors?

Test-WSMan -Port 1270 -ComputerName "fqdn" -Authentication Basic -Credential (Get-Credential) -UseSSL

If you enable more detailed logging on the MS do you get anything?
From the same link do you see anything in the agent logs?

Do you see anything in the SChannel logs (assuming you have that enabled)?

If you look at the certificate on the agent does it match up with what you expect (for whatever reason my lab agent seems to have the cert in a different place to where the docs think it should be)?

openssl x509 -in /etc/opt/omi/ssl/omi.pem -noout -text

For example, I know that SHA1 was deprecated at some point in favour of SHA256. I’m not aware of SHA256 being mandatory at any point, but it could be that you have older certs that it’s now throwing a wobbly about? My lab machine shows
Signature Algorithm: sha256WithRSAEncryption
but it’s also never seen anything other than SCOM 2019.
If so, the Wayback Machine captured a script for updating, originally from the TechNet Script Center, that might be helpful here.
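
If you have a lot of agents to check, the SHA1 vs SHA256 question can be answered without reading the whole certificate dump. This is only a rough sketch: classify_sig_alg is a made-up helper name, it only looks at the first "Signature Algorithm" line that openssl prints, and the cert path is the one from my lab, so it may differ on your agents.

```shell
#!/bin/sh
# classify_sig_alg (hypothetical helper): reads the output of
# `openssl x509 -noout -text` on stdin and reports whether the
# certificate's signature algorithm looks weak (SHA1/MD5) or not.
classify_sig_alg() {
    # Keep only the value of the first "Signature Algorithm:" line.
    alg=$(sed -n 's/^[[:space:]]*Signature Algorithm:[[:space:]]*//p' | head -n 1)
    case "$alg" in
        "")                  printf '%s\n' "unknown" ;;
        sha1*|md5*|*SHA1)    printf '%s\n' "weak: $alg" ;;
        *)                   printf '%s\n' "ok: $alg" ;;
    esac
}

# Usage on an agent (path is an assumption, adjust to where your cert lives):
#   openssl x509 -in /etc/opt/omi/ssl/omi.pem -noout -text | classify_sig_alg
```

A "weak" result would support the old-certificate theory; an "ok" result points back towards protocol or cipher settings instead.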

Good luck!

Thanks very much for the suggestions, plenty to look at there. The Test-WSMan seems to throw the same error that comes up in EventLogs about ‘internal error in the SSL library’.

Test-WSMan : <f:WSManFault xmlns:f="http://schemas.microsoft.com/wbem/wsman/1/wsmanfault" Code="12175"
Machine="ITS-P-xxx.ac.uk"><f:Message>The server certificate on the destination computer
(its-p-baxxxx.ac.uk:1270) has the following errors:
Encountered an internal error in the SSL library.   </f:Message></f:WSManFault>
At line:1 char:1
+ Test-WSMan -Port 1270 -ComputerName its-p-baxxxx.ac.uk -Authe ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (its-p-baxxxx.ac.uk:String) [Test-WSMan], InvalidOperationException
    + FullyQualifiedErrorId : WsManError,Microsoft.WSMan.Management.TestWSManCommand

The SHA1/SHA256 point is interesting. I wonder whether the 2019 server is locked down more tightly for that than the original 2012 management server.

The certificate on one of the agents I tested looks fine:

            X509v3 Extended Key Usage:
                TLS Web Server Authentication
    Signature Algorithm: sha256WithRSAEncryption

I’ve gone back to the 2019 management server and relaxed the TLS enforcement so that TLS1.0, 1.1 and 1.2 are enabled for client and server, and it’s now monitoring Linux agents. However, the 2012 management server has legacy protocols disabled apart from TLS1.2, and that works. I wonder whether the way that TLS is controlled in the registry is different between 2012 and 2019, and whether that has some impact on the Linux SCOM agent.
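
One way to narrow this down from the agent side might be to probe port 1270 with a single TLS version at a time and see which handshakes the agent accepts. The following is a minimal sketch, not something I’ve validated against SCOM: tls_flag and probe_tls are made-up helper names, the host name in the usage line is a placeholder, and it assumes openssl s_client exits non-zero when the handshake fails. Newer OpenSSL builds may refuse TLS1.0/1.1 on the client side regardless, so a failure for those versions can reflect the probing machine rather than the agent.

```shell
#!/bin/sh
# tls_flag (hypothetical helper): map a TLS version label to the
# matching `openssl s_client` protocol flag.
tls_flag() {
    case "$1" in
        1.0) printf '%s\n' "-tls1" ;;
        1.1) printf '%s\n' "-tls1_1" ;;
        1.2) printf '%s\n' "-tls1_2" ;;
        *)   printf '%s\n' "unsupported version: $1" >&2; return 1 ;;
    esac
}

# probe_tls (hypothetical helper): attempt one handshake per TLS version
# against host:port (port defaults to 1270) and print the outcome.
probe_tls() {
    host=$1
    port=${2:-1270}
    for v in 1.0 1.1 1.2; do
        flag=$(tls_flag "$v")
        # stdin from /dev/null so s_client exits once the handshake is done.
        if openssl s_client -connect "$host:$port" "$flag" </dev/null >/dev/null 2>&1; then
            printf '%s\n' "TLS $v: handshake ok"
        else
            printf '%s\n' "TLS $v: handshake failed"
        fi
    done
}

# Usage (placeholder host name):
#   probe_tls agent.example.com 1270
```

If the agent only completes a TLS1.2 handshake here but the 2019 management server still fails with TLS1.2 enforced, that would suggest the difference lies in what the Windows side offers rather than in the agent’s protocol support.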