SCOM connectivity error: unauthorized

We have been having an issue for a while (SCOM 2016 UR3, SquaredUp 4.0.4) where I sometimes get a browser login prompt that I cannot log in to. If I look at the transit log I see the following. The only way to fix this issue is to reboot the SquaredUp server. An IISReset or app pool recycle does not fix it. I even tried restarting the Microsoft Monitoring Agent. Has anyone seen this before?

System.UnauthorizedAccessException: The user does not have sufficient permission to perform the operation. ---> Microsoft.EnterpriseManagement.Common.UnauthorizedAccessEnterpriseManagementException: The user does not have sufficient permission to perform the operation. ---> System.ServiceModel.Security.SecurityNegotiationException: The caller was not authenticated by the service. ---> System.ServiceModel.FaultException: The request for security token could not be satisfied because authentication failed.
at System.ServiceModel.Security.SecurityUtils.ThrowIfNegotiationFault(Message message, EndpointAddress target)
at System.ServiceModel.Security.SspiNegotiationTokenProvider.GetNextOutgoingMessageBody(Message incomingMessage, SspiNegotiationTokenProviderState sspiState)
--- End of inner exception stack trace ---

Server stack trace:
at System.ServiceModel.Security.IssuanceTokenProviderBase`1.DoNegotiation(TimeSpan timeout)
at System.ServiceModel.Security.SspiNegotiationTokenProvider.OnOpen(TimeSpan timeout)
at System.ServiceModel.Security.WrapperSecurityCommunicationObject.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Security.CommunicationObjectSecurityTokenProvider.Open(TimeSpan timeout)
at System.ServiceModel.Security.SecurityProtocol.OnOpen(TimeSpan timeout)
at System.ServiceModel.Security.WrapperSecurityCommunicationObject.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Channels.SecurityChannelFactory`1.ClientSecurityChannel`1.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Channels.LayeredChannel`1.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.CallOpenOnce.System.ServiceModel.Channels.ServiceChannel.ICallOnce.Call(ServiceChannel channel, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.CallOnceManager.CallOnce(TimeSpan timeout, CallOnceManager cascade)
at System.ServiceModel.Channels.ServiceChannel.EnsureOpened(TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs)
at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

I get a similar issue where a session gets kicked out and I can't log back in. In our case SquaredUp is load balanced, and it seems to happen when our NetScaler is having a ‘moment’.

Did you know that SCOM 2016 is now on UR8? Might be worth considering an update!


Out of interest can the user log on to the SCOM console from that machine when this issue is occurring?

You could also try klist purge on the machine for the user or other accounts. It kills cached Kerberos tickets, which might pin the issue down to that. You might have to loop through all logon sessions.
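For the "loop through all sessions" part, here is a rough Python sketch of how that could be scripted. It assumes `klist sessions` prints one line per logon session in the form `[0] Session 1 0:0x2b5f7 DOMAIN\user Kerberos:RemoteInteractive`, and that `klist -li <id> purge` accepts the low part of the logon ID. The function names are made up for illustration, so check the output format on your own server before relying on it.

```python
import re
import subprocess

# Matches a logon ID such as "0:0x2b5f7" and captures the low part ("0x2b5f7").
LOGON_ID = re.compile(r"\b\d+:(0x[0-9a-fA-F]+)")


def parse_logon_ids(sessions_output):
    """Return the low-part logon IDs found in `klist sessions` output."""
    ids = []
    for line in sessions_output.splitlines():
        # Session rows are assumed to start with an index like "[0]".
        if not line.lstrip().startswith("["):
            continue
        match = LOGON_ID.search(line)
        if match and match.group(1) not in ids:
            ids.append(match.group(1))
    return ids


def purge_commands(logon_ids):
    """Build one `klist -li <id> purge` command per logon session."""
    return [["klist", "-li", logon_id, "purge"] for logon_id in logon_ids]


def purge_all(run=subprocess.run):
    """List sessions, then purge the ticket cache of each one.

    Needs an elevated prompt on the Windows server to reach other sessions.
    """
    listing = run(["klist", "sessions"], capture_output=True, text=True, check=True)
    for command in purge_commands(parse_logon_ids(listing.stdout)):
        run(command, check=False)
```

Calling `purge_all()` from an elevated Python prompt on the SquaredUp server would run the purge for every session it finds. If that clears the login prompt without a reboot, it points at a stale ticket rather than at IIS or the app pool.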

So the issue was a particular Mac client in our environment. Apparently the Mac does not get a new ticket once its current one expires, and when that Mac tried to access SquaredUp, the Kerberos ticket would become invalid (BADOPTION) on the SquaredUp server, causing any new session to fail. That is why it was always hit or miss. I am not sure how a single client could cause this and am still investigating, but at least we know why it was occurring. Once we had the user stop accessing the server from his Mac, the Kerberos issues went away.
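In case it helps anyone matching codes like BADOPTION against what they see in their logs, below is a small Python lookup of a few Kerberos error codes from RFC 4120. It assumes you have a numeric failure code to hand, for example from a Kerberos audit event on a domain controller. The table is deliberately partial and the function name is just for illustration.

```python
# A partial table of Kerberos error codes as defined in RFC 4120.
KERBEROS_CODES = {
    0x6: "KDC_ERR_C_PRINCIPAL_UNKNOWN",
    0x7: "KDC_ERR_S_PRINCIPAL_UNKNOWN",
    0xD: "KDC_ERR_BADOPTION",
    0x17: "KDC_ERR_KEY_EXPIRED",
    0x18: "KDC_ERR_PREAUTH_FAILED",
    0x20: "KRB_AP_ERR_TKT_EXPIRED",
    0x25: "KRB_AP_ERR_SKEW",
}


def describe_failure_code(code):
    """Translate a failure code (hex string such as "0xD", or an int) to its name."""
    value = int(code, 16) if isinstance(code, str) else int(code)
    return KERBEROS_CODES.get(value, "unknown (0x%X)" % value)
```

For example, `describe_failure_code("0xD")` gives `KDC_ERR_BADOPTION`, the code mentioned above.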


Yes, we are planning on updating to SCOM 2019 in the next 2 months.

Thank you, I will try that. Yes, the SCOM console does work from that same PC. If klist purge does work but the issue keeps coming back, what could that mean?

The error message mentions something about struggling to create a security token. When you reboot the machine that also discards all of the Kerberos tokens so that’s why I’m thinking that might be involved.

Kerberos is a strange beast, and there could be multiple reasons for it not working.

There are a good couple of Microsoft blogs about it that can explain things and how to track them down way better than I ever could:

https://docs.microsoft.com/en-gb/archive/blogs/askds/kerberos-for-the-busy-admin

https://docs.microsoft.com/en-gb/archive/blogs/tspring/viewing-and-purging-cached-kerberos-tickets

https://docs.microsoft.com/en-gb/archive/blogs/vivek/part3-troubleshooting-kerberos-authentication-and-things-to-check-when-it-fails

The other issue that sometimes happens is that SquaredUp shows I have read-only access to SCOM, while the SCOM console shows I have full access. If I purge all sessions and then run an iisreset, this does solve the issue. We have seen this problem on both our prod and dev environments with SquaredUp.

Oh wow, yeah that is really weird! Thanks for sharing the cause, that’s not one I would have jumped to!
