Previously, when we had a Private Bytes and Handle Count leak in SCOM agents on Windows Server 2008/R2, we would install the following KBs:
KB2685811, KB2685813
Now, with the newer OS versions (Windows Server 2012 and higher), we don’t have any solution for this problem.
Our Health Service Private Bytes and Handle Count threshold values have already been modified according to Kevin’s article.
Our current version of SCOM is SCOM 2012 UR12 (I know it’s outdated, but we plan to update it soon).
What evidence do you have that this is happening? Is the SCOM agent constantly restarting on some machines?
We have thousands of 2012 servers and don’t have an issue except on a few, typically something like a SharePoint SQL server with 500+ DBs, and for those we just set the thresholds much higher.
We have a group that has these ‘high use’ thresholds set against it, and we plop in a server as required.
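To gather that kind of evidence, it can help to separate a server that simply runs hot (high but flat handle count, where a higher threshold is fine) from one that keeps climbing. Here is a minimal sketch, assuming you have already exported timed Handle Count samples for the agent process. The sample data and the 50-handles-per-hour cutoff are made up for illustration and are not SCOM defaults.

```python
from statistics import mean


def growth_per_hour(samples):
    """Least-squares slope of value against time.

    samples: list of (hours_since_start, value) pairs, for example the
    Handle Count of the agent process taken from a counter export.
    """
    times = [t for t, _ in samples]
    values = [v for _, v in samples]
    t_mean, v_mean = mean(times), mean(values)
    denom = sum((t - t_mean) ** 2 for t in times)
    if denom == 0:
        raise ValueError("need samples at two or more distinct times")
    return sum((t - t_mean) * (v - v_mean) for t, v in samples) / denom


def looks_like_leak(samples, min_growth_per_hour=50.0):
    # The cutoff is an illustrative assumption; tune it to your environment.
    return growth_per_hour(samples) >= min_growth_per_hour


# Hypothetical samples: (hours since start, handle count)
busy_but_stable = [(0, 9000), (1, 9100), (2, 8950), (3, 9050), (4, 9000)]
leaking = [(0, 2000), (1, 2600), (2, 3150), (3, 3800), (4, 4400)]

print(growth_per_hour(busy_but_stable))  # -5.0
print(growth_per_hour(leaking))          # 600.0
```

A flat or slightly negative slope points toward raising the threshold; a steady positive slope over many hours is worth investigating as a leak.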
All our MPs are relatively up to date, and we do have a variety of management packs that require a lot of resources (AD, SQL, Cluster, Exchange, etc.).
But because this issue appears on only 30 servers within our environment, it doesn’t seem to me like a problem with the MPs themselves, more like a resource leak…
Is there any particular way to investigate this?
Because I don’t see any common denominator between those servers…
Annoyingly, I was trying to link to the question I asked here, but that seemed to fail (updated to shortened URL). You can probably get to it through my profile.
I was having the same sort of problems you are having: on certain servers in a certain management group, the handle count was running away. For me it came down to a management pack having problems. I got rid of the offending MP and the resource leak went away.
If you think about it, the MPs run code on the agents. If one of those pieces of code has issues, then the agent running it has issues.
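If the culprit is likely an MP, one way to hunt for a common denominator is to compare which MPs apply to the affected servers against the healthy ones. A rough sketch, assuming you can build a server-to-MP inventory yourself; the server names and MP names below are placeholders, not real data.

```python
def suspect_management_packs(mps_by_server, affected):
    """Rank MPs present on every affected server by how rare they are elsewhere.

    mps_by_server: dict mapping server name -> set of MP names (hypothetical
    inventory you would have to collect yourself).
    affected: iterable of server names showing the leak.
    Returns (mp_name, share_of_healthy_servers_with_it) pairs, rarest first.
    """
    affected = set(affected)
    unknown = affected - set(mps_by_server)
    if unknown:
        raise KeyError(f"no inventory for: {sorted(unknown)}")
    if not affected:
        return []

    # MPs that every affected server has in common.
    common = set.intersection(*(set(mps_by_server[s]) for s in affected))
    healthy = [s for s in mps_by_server if s not in affected]

    def healthy_share(mp):
        if not healthy:
            return 0.0
        return sum(mp in mps_by_server[s] for s in healthy) / len(healthy)

    return sorted(((mp, healthy_share(mp)) for mp in common),
                  key=lambda pair: (pair[1], pair[0]))


# Placeholder inventory for illustration only.
inventory = {
    "srv01": {"Windows Server 2016", "SQL", "Cluster"},
    "srv02": {"Windows Server 2016", "SQL"},
    "srv03": {"SQL", "Cluster"},
    "srv04": {"SQL", "Active Directory"},
}
suspects = suspect_management_packs(inventory, ["srv01", "srv02"])
print(suspects)  # [('Windows Server 2016', 0.0), ('SQL', 1.0)]
```

An MP that sits on all the leaking servers but on few healthy ones goes to the top of the list. It is only a hint, since a buggy workflow may misbehave on some targets and not others.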
Using Process Explorer, I could see that the agent was creating tens of thousands of authentication token handles and not disposing of them. I couldn’t work out why until I went to update the Server 2016 MP and saw a change log entry in the latest version saying that they had addressed a handle leak. I updated the MP and the problems disappeared.
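If you would rather not eyeball tens of thousands of rows, a tally by handle type makes the dominant type obvious. This is only a sketch: it assumes you have saved the handle list to text with one handle per line and the type as the first tab-separated field. That layout is my assumption, so adjust the parsing to whatever your export actually looks like. The sample lines are made up.

```python
from collections import Counter


def tally_handle_types(lines):
    """Count handles per type, most common first.

    Assumes each line looks like '<type><TAB><name>'; this layout is an
    assumption about the export, not a documented format.
    """
    counts = Counter()
    for line in lines:
        handle_type = line.split("\t", 1)[0].strip()
        if handle_type:  # skip blank lines
            counts[handle_type] += 1
    return counts.most_common()


# Made-up sample lines for illustration.
sample = [
    "Token\texample-token-1",
    "File\texample-file-1",
    "Token\texample-token-2",
    "Key\texample-key-1",
    "Token\texample-token-3",
    "File\texample-file-2",
    "",
]
print(tally_handle_types(sample))  # [('Token', 3), ('File', 2), ('Key', 1)]
```

In my case the top entry would have been the token type by a huge margin, which is what pointed at something authenticating over and over without cleaning up.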
I keep a log of all the changes I make and when, so in hindsight I should have checked it for changes made around the time the problem started and narrowed it down from there.
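The change-log check is easy to script if the log has timestamps. A minimal sketch, assuming the log is a list of (datetime, description) entries; the entries, the onset date, and the 14-day window are invented for the example.

```python
from datetime import datetime, timedelta


def changes_before_onset(change_log, onset, window_days=14):
    """Return changes made in the window leading up to the onset, newest first.

    change_log: list of (datetime, description) pairs.
    onset: datetime when the handle growth was first noticed.
    window_days: how far back to look (an arbitrary default).
    """
    start = onset - timedelta(days=window_days)
    hits = [(when, what) for when, what in change_log if start <= when <= onset]
    return sorted(hits, key=lambda entry: entry[0], reverse=True)


# Invented change log for illustration.
log = [
    (datetime(2018, 1, 5), "Imported MP A"),
    (datetime(2018, 3, 12), "Updated MP B"),
    (datetime(2018, 3, 18), "Added override C"),
    (datetime(2018, 3, 25), "Imported MP D"),
]
for when, what in changes_before_onset(log, onset=datetime(2018, 3, 20)):
    print(when.date(), what)
# 2018-03-18 Added override C
# 2018-03-12 Updated MP B
```

Anything that landed shortly before the growth started is the first place to look; widen the window if nothing turns up.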