0 answers, 1 selected
1 questions, 1 solved
Member for 1 years, 8 months, 5 days
1344 profile views
Last seen 30 Aug, 2018
08 Jan, 2019 Best answer
08 Jan, 2019 Selected answer Health Service Private Bytes and Handle Count leak in SCOM Agent on Windows 2012 and Higher OS
29 Aug, 2018 Commented Annoyingly, I was trying to link to the question I asked here, but that seemed to fail (updated to a shortened URL). You can probably get to it through my profile. I was having the same sort of problems you were: on certain servers in a certain management group the handle count was running away. For me it came down to a faulty management pack. I got rid of the offending MP and the resource leak went away. If you think about it, the MPs run code on the agents, so if one of those pieces of code has issues then the agent running it has issues. Using Process Explorer I could see that the process was creating tens of thousands of authentication token handles and not disposing of them. I couldn't work out why until I went to update the Server 2016 MP and saw a note in the latest version saying that they had addressed a handle leak. I updated the MP and the problems disappeared. I keep a log of all the changes I make and when, so in hindsight I should have checked my log for changes made around the time the problem started and narrowed it down from there.
12 Jun, 2018 Commented Sorry it's taken so long to get back to you! It's been crazy round here recently. Not really, to be honest. The only thing along those lines I have done is a custom monitor to try and spot problems with that APM bug on IIS machines. I've pulled that out but the problems persist. As I'm not monitoring anything properly in that MG, I'm going to pull out all the management packs and see if that makes any difference. If it does, I'll add them back in slowly to see if I can identify a problematic one. Thanks for all your help so far! I've not really had any help from other sources, so it is much appreciated!
04 Jun, 2018 Commented OK, I get that the handles will be higher in general in this sort of situation, but over the weekend I raised the threshold to 60K handles and they just hit that instead and restart anyway (though the process is slower). Judging by the increase in alert count, this is happening roughly 6 times a day even on the higher threshold. I also created a new VM that does literally nothing but sit there as a test for this scenario, and it too is exhibiting the same problem (the next step will be to remove it and manage it via dev to see if the same thing happens there). Surely this can't be normal? I'm pretty close to binning the management group and starting again. Good point about disabling the monitors for the other management servers; I had missed that one.
1 votes received 100/100 1up 0down
1 vote cast 100/100 1up 0down