Windows Service monitoring best practices

Hi, while migrating to SCOM 2019, I’m reviewing our existing monitoring. We have a lot of Windows services related to applications that we are monitoring. We are using the standard SCOM wizard to set up these monitors. I was thinking of creating a class for each application (with MP Studio and the fragments from Kevin Holman) and also using these classes to create the service monitors in our new SCOM 2019 environment. Is this the best way, or is the standard wizard the way to go? Any tips? Thanks, Luc


I don’t think there is one best way, as every environment is different. Your idea sounds great. You could also look at using SquaredUp Enterprise Applications as an option; it’s a really cool feature :smile:


Hey Luc, SCOM actually has a way to do it via the GUI. Check this out:


I used the Service Template on occasion, but each time it also created some unnecessary rules and other objects alongside the monitor.

I would recommend using the Basic Service Monitor, as that is a single unit without the unwanted extras and is easier to remove should you need to (i.e. only one thing to remove).

For me, after seeing all the unnecessary code generated by the GUI, I went with building MPs in VSAE. Often, service monitoring was just the entry point into deeper monitoring. I also didn’t like the default alert message. At first I created an override to disable the default Microsoft.SystemCenter.NTService.ServiceStateMonitor monitor and created my own on top of it. But I didn’t like that Health Explorer would then always show a disabled monitor, along with a lot of other irritants, so right now I am going this way:

  • Created a “base” MP which defines a computer role (Base=“Windows!Microsoft.Windows.ComputerRole”) and a service class (Base=“System!System.LocalApplication”) which has the same properties as the NTService class.
  • Created a new monitor on the service class which uses the same data source as Microsoft.SystemCenter.NTService.ServiceStateMonitor.
  • On that monitor, created a diagnostic task that returns the number of times the service has stopped in the past 30 minutes (configurable). It does an event log search.
  • Then added a recovery that restarts the service, but only runs if the number returned by the diagnostic is below a configured threshold. This way, if the service constantly fails, SCOM will stop trying to restart it after a couple of attempts.
  • I also added some tasks (start, stop, restart services).
  • For each application, I create a new MP which references the base MP created above, create a new computer role class based on the one above, and run two discoveries: first a registry discovery targeted at Windows.Computer to discover the role, then a WMI (or script) discovery targeted at the computer role class to discover the services. This way I don’t need groups to narrow the discovery. The registry discovery is lightweight and can run on all agents. The WMI discovery gives a lot more options (wildcards, anyone?) than a registry one, and although it is more resource-expensive, it only runs on the agents where the computer role was detected, so no biggie.
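
To make the diagnostic-plus-recovery idea above a bit more concrete, here is a minimal sketch of just the decision logic, written in Python for readability. This is not the actual MP code: in a real management pack the diagnostic and recovery would be implemented as MP modules and scripts, and the timestamps would come from the event log search. The function names, the 30-minute window default and the `max_stops` value of 3 are assumptions made up for illustration.

```python
from datetime import datetime, timedelta


def count_recent_stops(stop_times, now, window_minutes=30):
    """Count service-stop events that fall inside the look-back window.

    stop_times: datetimes of 'service stopped' events (hypothetical input;
    in SCOM these would come from the diagnostic's event log search).
    """
    cutoff = now - timedelta(minutes=window_minutes)
    return sum(1 for t in stop_times if cutoff <= t <= now)


def should_restart(stop_times, now, window_minutes=30, max_stops=3):
    """Allow the recovery only while the stop count is below the threshold."""
    return count_recent_stops(stop_times, now, window_minutes) < max_stops


# Example: the service stopped 2, 10 and 45 minutes ago.
now = datetime(2024, 1, 1, 12, 0)
stops = [now - timedelta(minutes=m) for m in (2, 10, 45)]

# Only two stops fall inside the 30-minute window, so a restart is allowed.
print(should_restart(stops, now))

# With a stricter threshold of 2, the recovery would be skipped.
print(should_restart(stops, now, max_stops=2))
```

Keeping the count (diagnostic) separate from the decision (recovery condition) mirrors the split described in the list: the count is visible to the operator on its own, and the threshold can be overridden per application.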

For the last point, maybe best practice would be to use a single MP with templates, but as I said, I usually take the opportunity to dig deeper into application monitoring, and this is my entry point. Yes, I realize this is a lot of code just for service monitoring (I should post it somewhere eventually), but after 8 years (already!) of administering SCOM, I like the flexibility this gives me!