Goal: find a way to reduce the CPU and disk-latency impact when all 4000+ VMs download new config changes (overrides, new versions of MPs, etc.) at the “same” time.
This is not very good for the ESX hosts, the SAN, or the VMs themselves.
I am not even sure that is a valid registry key anymore. It was an old key with old guidance from SCOM 2007, back when we had an RMS that did EVERYTHING. When the config service became federated, it was completely rewritten, and these settings are now handled by the configservice.config file.
I work with some very large clients, and I have never edited that reg key. Nor do I do much tweaking of any config files. Too many times I have seen us edit settings as a band-aid for the real problem - too much instance space, and too much instance property change. Write simpler management packs - be happier.
So there is no way to change how agents get new configuration? What I’m looking for is a way to spread the load so that not all agents download new changes at the same time. Is it possible to have all agents get their configuration spread across a window of 10 minutes? Or to give the agents under each management server different settings?
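To make the idea concrete, here is a minimal sketch of what "spreading the load across a 10-minute window" would mean. It is not a SCOM setting or API, only an illustration of the concept: each agent name is hashed to a stable offset inside the window, so the same agent always lands in the same slot and the agents as a whole are spread roughly evenly. The agent names, the function names and the window constant are all hypothetical.

```python
import hashlib
from collections import Counter

WINDOW_SECONDS = 10 * 60  # the 10-minute window from the question


def offset_for(agent_name: str, window_seconds: int = WINDOW_SECONDS) -> int:
    """Return a stable delay in seconds, in [0, window_seconds), for one agent."""
    # Hash the lower-cased name so the offset does not depend on casing.
    digest = hashlib.sha256(agent_name.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % window_seconds


def agents_per_minute(agent_names, window_seconds: int = WINDOW_SECONDS) -> Counter:
    """Count how many agents would start in each minute of the window."""
    return Counter(offset_for(name, window_seconds) // 60 for name in agent_names)


if __name__ == "__main__":
    # Hypothetical agent names, for illustration only.
    agents = [f"vm{i:04d}.example.local" for i in range(4000)]
    for minute, count in sorted(agents_per_minute(agents).items()):
        print(f"minute {minute}: {count} agents")
```

Hashing rather than picking a random delay keeps the assignment repeatable, which would make it easier to reason about which VMs are busy at a given moment.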
When all agents get a change (at the “same time”), CPU usage rises on every server at once. This causes a CPU spike on the virtualization hosts.
And when all agents download new MPs or config changes at the same time, this can result in write latency on the SAN.
We experience these issues when we update management packs, import new management packs, or change groups, overrides, or Run As account configuration.
-> Each of these changes forces the agents to download new configuration, and that is when we experience this problem.
Of course, we can make most changes in a test environment first, but at some point they must be imported into production…
Therefore, I wonder whether it’s possible to spread the rollout of changes so that not all agents perform the same operation at almost the same time.
I suppose all SCOM environments are different, but this is something we notice, and we would like to find a solution or at least reduce the impact in some way.
Thanks for feedback!
(4000+ VMs on VMware ESX hosts and an HP 3PAR SAN)
Event 1201
Source: Health Service
“New management pack with id xxx received” - Splunk Search
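One way to quantify how close to the "same time" the agents really are is to bucket the Event 1201 timestamps per minute. The following is a minimal sketch, assuming the search results have been exported as (host, ISO 8601 timestamp) pairs. The sample rows, the field layout and the function names are hypothetical and do not reflect the actual Splunk output.

```python
from datetime import datetime


def downloads_per_minute(events):
    """Count distinct hosts that logged Event 1201 in each minute.

    `events` is an iterable of (host, iso_timestamp) pairs.
    """
    hosts_by_minute = {}
    for host, stamp in events:
        # Truncate the timestamp to the start of its minute.
        minute = datetime.fromisoformat(stamp).replace(second=0, microsecond=0)
        hosts_by_minute.setdefault(minute, set()).add(host)
    return {minute: len(hosts) for minute, hosts in sorted(hosts_by_minute.items())}


def peak_share(per_minute):
    """Fraction of the summed per-minute counts that falls in the busiest minute."""
    total = sum(per_minute.values())
    return max(per_minute.values()) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical sample rows, for illustration only.
    sample = [
        ("vm0001", "2024-01-01T10:00:05"),
        ("vm0002", "2024-01-01T10:00:40"),
        ("vm0003", "2024-01-01T10:00:59"),
        ("vm0004", "2024-01-01T10:03:10"),
    ]
    per_minute = downloads_per_minute(sample)
    for minute, count in per_minute.items():
        print(minute.isoformat(), count)
    print(f"busiest minute holds {peak_share(per_minute):.0%} of downloads")
```

Comparing the per-minute counts with the CPU and write-latency graphs for the same period would show whether the spikes really line up with the management pack downloads.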