I’m building a dashboard with 3 columns, each looking at an application from a different perspective, as shown below.

1) Application Health (Processes running/TCP Ports listening)

2) System Health (Memory/CPU/Disk/etc.)

3) Database Health (DB Management Pack metrics)

System and Database are complete, but I need help figuring out how to instrument the application health in SCOM/SquaredUp. Notice that I have 2 Data Centers that have the same apps running on different servers. In each DC, I’d like to monitor the health of the application on all servers hosting that application. For example, let’s say the “BOS” application has 2 servers behind a load balancer and they both run the same apps/processes. I’d like this BOS-App icon to show green if both servers have both processes running (a.exe and b.exe). If any process is not running on one of the 2 servers in this Datacenter, I’d like that icon to go yellow. If any process is down on both servers in this Datacenter, I’d like the icon to go red. The same goes for Datacenter 2: two different servers running the same 2 processes. See diagram below. The circles next to the application are the ones I’m trying to focus on. Note that I already used VADA to build out the DAs for the “Sys” column of icons, which are linked to SCOM IDs and are working perfectly.
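To make the intended rollup concrete, here is a minimal sketch of the green/yellow/red rule in Python. It only illustrates the logic and is not a SCOM, VADA, or SquaredUp API. The function name `rollup_health`, the input shape, and the server names `srv1`/`srv2` are assumptions made for the example.

```python
from typing import Dict


def rollup_health(status: Dict[str, Dict[str, bool]]) -> str:
    """Roll up per-server process state into one colour for a datacenter.

    `status` maps a server name to {process name: is_running}.
    This input shape is hypothetical, chosen only for illustration.
    """
    # Collect every process name reported by any server.
    processes = {proc for procs in status.values() for proc in procs}

    degraded = False
    for proc in processes:
        # A process missing from a server's report is treated as not running.
        running = [procs.get(proc, False) for procs in status.values()]
        if not any(running):
            # The process is down on every server: critical.
            return "red"
        if not all(running):
            # The process is down on at least one server but up elsewhere.
            degraded = True

    return "yellow" if degraded else "green"


if __name__ == "__main__":
    bos_dc1 = {
        "srv1": {"a.exe": True, "b.exe": False},
        "srv2": {"a.exe": True, "b.exe": True},
    }
    print(rollup_health(bos_dc1))
```

In this sketch a datacenter stays yellow as long as every process is still running on at least one server, even if each server is missing a different process. If that case should count as red, the rule would need to be tightened.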

Looking for recommendations on how to build the objects in SCOM or VADA/SCOM to monitor only the application .EXEs for the 2 servers and roll up the status as described.  Do I start with building out a DA with the 2 servers in VADA and then create another DA in SCOM to roll the status up to this “parent” DA?  In that case, can I disable all system level monitors on these 2 servers in SCOM for these DAs and not affect the other DAs used for the “System” icons?  Or do I build groups or DAs in SCOM somehow to monitor the applications? I don’t know how to make this work properly in SCOM, without affecting the systems that are already part of other DAs.

The goal is to be able to quickly see from a single pane of glass if there is an application problem, system problem, or a database problem affecting the application.

Jelly answered
    • This blog post could be helpful. The tricky part is getting the rollup to be a warning when one server is down and critical when everything is down. http://blogs.catapultsystems.com/cfuller/archive/2010/09/13/using-distributed-applications-to-generate-actionable-alerting/ One way to do it would be via SquaredUp’s PowerShell MP, so that you check the status on both servers and post back a warning if you find one process running and critical if you find zero processes running. We have done a similar setup, but we don’t set them up as warnings. We have…