Some background: We have multiple customised Windows Service and Process Monitors configured in our environment. These were configured via the SCOM UI wizard and point to associated SCOM groups with the relevant Windows Servers as members.
This has worked absolutely fine for many years. Recently, however, after removing multiple servers from different SCOM groups, the associated customised monitors are still being applied to those Windows Servers. We have tried the usual step of clearing the agent cache, to no avail. This is now causing problems by creating false positive alerts and health state issues across multiple servers.
Any ideas on this? The only resolution we can think of is to remove the custom monitors (where applicable) and recreate them!?
Thanks for the responses so far. Yes, we have basically removed the Windows Servers from the SCOM groups which the customised monitors are targeted at, so we have not necessarily applied an override per se on the discovery. My assumption was that the monitor and/or discovery would be disabled when the servers were removed from the targeted group.
Would you recommend that I apply an override disabling the customised monitors for a new group containing these servers (or for each server individually), then run the “remove-scomdisabledclassinstance” PowerShell cmdlet? My only concern with the latter is that it’s never been run here before, and the various articles I’ve read indicate it could cause significant database performance degradation.
I am not aware of the extent of performance degradation which occurs but I suppose that would depend on your retention periods and how many class instances were undiscovered during that period of time.
It has been a while since I have looked into the XML created when a service template is used, but from what I recall, a class is created which is then populated by a discovery for the service, and that discovery targets the Windows Computer class. There are also two monitors created, one to query WMI and one to query the Service Control Manager. An override is then made to enable the monitor for the group which you selected in the wizard.
So in theory, after the server is removed from the group, the next client polling cycle should inform the client that it has new configuration and would shortly thereafter stop monitoring for the service.
If someone were to change the monitor to default to enabled then I believe any server with the service would then be monitored regardless of group membership.
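One way to check whether the defaults have been changed is to list the monitors and discoveries in the management pack that holds the template, along with their default Enabled state. This is only a rough sketch: it assumes the OperationsManager module is loaded on a machine connected to the management group, and the management pack display name is a placeholder for wherever the template was saved.

```powershell
# Sketch only: assumes the OperationsManager module and an existing
# management group connection (e.g. run on a management server).
Import-Module OperationsManager

# Hypothetical name: the unsealed MP the service/process template was saved to.
$mpDisplayName = 'Custom Service Monitoring'

$mp = Get-SCOMManagementPack -DisplayName $mpDisplayName
if (-not $mp) { throw "Management pack '$mpDisplayName' not found." }

# Enabled here is the default state defined in the MP, before any overrides.
Get-SCOMMonitor -ManagementPack $mp |
    Select-Object DisplayName, Name, Enabled

Get-SCOMDiscovery -ManagementPack $mp |
    Select-Object DisplayName, Name, Enabled
```

If anything shows as enabled by default that you expected to be enabled only through a group override, that would explain servers outside the group still being picked up.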
With the service/process monitors the discovery is disabled by default and is enabled for the group/object that you create them for.
There is a quirk where remove-scomdisabledclassinstance only works on discoveries that are set as enabled by default.
So to work around this, once you’ve removed the object from the group, go to the discovery and place an enforced override against the specific object that disables the discovery for it. Run remove-scomdisabledclassinstance, and then you’ll be able to remove the override. I’ve had to do this many times and have never had issues running remove-scomdisabledclassinstance.
The more often you run remove-scomdisabledclassinstance, the less of a performance hit you will take, and that hit only lasts while it is running; there are no long-term effects (arguably it makes everything better afterwards).
If you are concerned, run it the first time during a period when a performance hit is acceptable: 6pm, overnight, 4am, whenever.
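For anyone who prefers to script the workaround rather than click through the console, something along these lines should do it. Treat it as a sketch under assumptions: the server name, management pack name and discovery name pattern are all placeholders, the override is written to the (unsealed) management pack you pass in, and it assumes the Windows Computer object's display name is the server's FQDN.

```powershell
# Sketch only: all names below are hypothetical placeholders.
Import-Module OperationsManager

$serverFqdn    = 'server01.contoso.com'        # server removed from the group
$mpDisplayName = 'Custom Service Monitoring'   # unsealed MP holding the template
$discoveryLike = '*My Custom Service*'         # display name pattern of the discovery

$mp = Get-SCOMManagementPack -DisplayName $mpDisplayName
if (-not $mp) { throw "Management pack '$mpDisplayName' not found." }

$discovery = Get-SCOMDiscovery -ManagementPack $mp |
    Where-Object { $_.DisplayName -like $discoveryLike }
if (-not $discovery) { throw "No discovery matching '$discoveryLike'." }

# The template discovery targets Windows Computer, so the override goes
# against the Windows Computer object for the server.
$computerClass = Get-SCOMClass -Name 'Microsoft.Windows.Computer'
$computer = Get-SCOMClassInstance -Class $computerClass |
    Where-Object { $_.DisplayName -eq $serverFqdn }
if (-not $computer) { throw "Windows Computer '$serverFqdn' not found." }

# Enforced override that disables the discovery for this one object.
Disable-SCOMDiscovery -Discovery $discovery -ManagementPack $mp `
    -Instance $computer -Enforce

# Removes instances whose discovery is now disabled; may prompt for confirmation.
Remove-SCOMDisabledClassInstance
```

Once the instance has gone, the override can be removed again from the console (Authoring > Overrides).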
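If you want a feel for how long it actually takes in your environment, you could time the first run during that quiet window. A small sketch, assuming the cmdlet accepts the standard -Confirm switch so the prompt can be suppressed:

```powershell
# Sketch only: run during a maintenance window on a management server.
Import-Module OperationsManager

# Time the cleanup so later runs can be compared against the first one.
# -Confirm:$false is assumed to suppress the confirmation prompt.
$elapsed = Measure-Command {
    Remove-SCOMDisabledClassInstance -Confirm:$false
}

Write-Output ("Remove-SCOMDisabledClassInstance took {0:N1} minutes" -f $elapsed.TotalMinutes)
```

Comparing that figure across a few runs should show whether it really does get cheaper the more often it is run.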
Thanks ever so much for your replies. I will apply an override on the server(s) that no longer require the service monitoring, then run remove-scomdisabledclassinstance at a quiet time to mitigate any performance overhead. As you say, Matthaus, the more often I run this cmdlet the less of a performance hit it will have, so I might use the OpsMgr Self Maintenance Management Pack and enable it to run weekly.
When you say it is pointed at the group, do you mean it is only enabled for the group? And is it targeting a custom class or something like Windows Server?
Never mind, you answered this by saying you used the template. I would recommend ensuring the server is undiscovered from the class which is created by the template, and then running remove-scomdisabledclassinstance as suggested below.
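To confirm whether a server still has an instance of the template's class, a check like the following might help. It is a sketch: the class display name and server name are placeholders, and it relies on the instance's Path being the FQDN of the hosting computer, which is the usual case for classes hosted by Windows Computer.

```powershell
# Sketch only: names below are hypothetical placeholders.
Import-Module OperationsManager

$serverFqdn = 'server01.contoso.com'
$className  = 'My Custom Service'   # display name of the class the template created

$class = Get-SCOMClass -DisplayName $className
if (-not $class) { throw "Class '$className' not found." }

# Instances of the template class that are still hosted on this server.
$remaining = Get-SCOMClassInstance -Class $class |
    Where-Object { $_.Path -eq $serverFqdn }

if ($remaining) {
    $remaining | Select-Object DisplayName, Path, HealthState
} else {
    Write-Output "No '$className' instances remain on $serverFqdn."
}
```

If instances are still listed after the override and cleanup, the discovery is probably still enabled for that server through some other group or override.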