Linux monitoring

Hi,

I have a strange situation with the Linux servers we most recently added to SCOM. The network adapter and heartbeat provider stay grayed out, and the server itself is also displayed as grayed out. Everything else is monitored.

Any suggestions?

Try putting the server in Maintenance Mode for 30 minutes and see if it lights up after that.

Looking at a Linux server in our SCOM I can see that the adapter and heartbeat are not monitored. In your case it seems rather strange.

Any other alerts related to this Linux server?

Try looking at the OperationsManager logs on the Management Server hosting the Linux server for warnings or errors regarding your server. Also check the logs on the Linux server itself.

Maybe try to redeploy the agent on the server?

Is this happening to just this server? Is it a special version of Linux? Is it supported?

https://docs.microsoft.com/en-us/system-center/scom/plan-supported-crossplat-os?view=sc-om-2019

Sorry, I don't know what could be wrong. I would open a case with Microsoft and see if they can help.

Support for 1801 ended 2019-08-08

To be frank, I don’t really know where the problem could be. In the past I have had issues with a specific Linux version. In that case the SCOM agent was the problem.

Hi Everyone,

We have installed a completely new Linux server from an older RH7.4 template, and the result after adding the server to SCOM is the same. Everything is monitored, but the server is grayed out.

Any ideas about what the problem could be?

Thanks, Luc

Your problem sounds familiar to me, but I can’t remember what the precise solution was.

I wonder if either your certificates are not in sync across all of your management servers (MS) or if you have a mix of SHA1/SHA256 certificates.

I would verify that certificates are in sync across all of your management servers and that you have only SHA256 certs (remove any SHA1 certs).

On the Linux agent, I would double-check that it’s using the correct certificate (SHA256).
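If it helps, here is a rough sketch of how that check could be scripted. It assumes you dump the agent certificate as text with `openssl x509 -noout -text -in <path>` and then look at the "Signature Algorithm" line. The function names and the sample text are made up for illustration, and the certificate path is something you would need to confirm on your own agent (on many SCOM agents it is under /etc/opt/microsoft/scx/ssl/, but treat that as an assumption).

```python
import re
import subprocess

# Hypothetical sample of `openssl x509 -noout -text` output, for illustration only.
SAMPLE_SHA256 = """Certificate:
    Data:
        Version: 1 (0x0)
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=example-ms
"""


def read_cert_text(cert_path):
    """Return the text dump of a PEM certificate using the openssl CLI."""
    result = subprocess.run(
        ["openssl", "x509", "-noout", "-text", "-in", cert_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def signature_algorithm(cert_text):
    """Return the first 'Signature Algorithm' value in the dump, or None."""
    match = re.search(r"Signature Algorithm:\s*(\S+)", cert_text)
    return match.group(1) if match else None


def is_sha1_signed(cert_text):
    """True if the certificate's signature algorithm name mentions SHA-1."""
    algo = signature_algorithm(cert_text)
    return algo is not None and "sha1" in algo.lower()
```

On the agent you would call something like `is_sha1_signed(read_cert_text(path_to_agent_cert))`, where `path_to_agent_cert` is whatever path your agent actually uses. If it returns True, the certificate is SHA1-signed and worth regenerating and re-signing.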

In my travels, I’ve seen issues where monitoring starts healthy, then goes gray. For example, the discovery runs on MS1, and the agent is healthy. But at some point, control is shifted to another management server (MS2), and the agent goes gray because MS2 isn’t authorized to talk to the agent.

But you also mentioned that your monitoring is split (some healthy, some gray), so that has me wondering if your Linux accounts are not set up correctly. As a test, use the same account for both the Privileged and Unprivileged accounts. Some commands require elevation, and those might be failing.

Hi Peter, thanks for your answer. I did this, but it doesn’t solve the problem. Is there anything else I can try? Could it be in the configuration of the Linux server?

Hi Peter, it is strange. I also found the image that you posted above on the internet, along with the article, but none of the solutions work. There is nothing to be found in the SCOM log and nothing in the Linux log. The servers are monitored well and we get alerts, but the overall state stays grayed out. Redeploying doesn’t help either, and when I redeploy an agent on a Linux server that is green, it turns green again. Any other suggestions?

Hi Peter, we are working with SCOM 1801 and it is an RH7 server. It is strange, because we have this situation on only 4 RH7 servers; all the rest display correctly. Removing the agents and redeploying them on a good server and on a bad server results in the same situation.

Thanks Peter, I think we need to do that. But I don’t know if SCOM 1801 is still supported, and we could not migrate to 2019 because we have a lot of RH6 servers. With the information I have, I think the problem must be on the Linux server. What do you think, given the information I gave you?

Thanks Peter, the version of Linux is the same on a working server and on a non-working server, and the agents are the same, so I think it must be in the config of the Linux server.