top N performance with large scope

We have a process that is monitored (with performance data collected) on almost all of our 2500 servers. We would like to display top processor usage for this process, but because “All Windows Computers” is such a large scope, no data is returned in SquaredUp. A dashboard built in the Operations Console does eventually display the data. I have confirmed that a smaller scope shows the data, so the problem has to be related to the size of the scope.

Any ideas?

Hello,

Are you using the built-in template to monitor this process? If so, you can easily configure an alert based on CPU and/or memory usage for that process, and would therefore not need a top N graph to display that information.

If seeing this information is important to you then I’d recommend scoping your view to your most critical servers only.

I understand your problem; we run into the same issue too. The SquaredUp performance plugin will time out before data is displayed for large scopes, and there’s no option to change the timeout value for that. The SCOM console has no such problem.

Maybe you could use the SQL plugin and define your query through that, limiting it to show only the top 20 or so. It could be faster than the performance plugin.

I don’t have access to an installation of the same size, so I am not experiencing that kind of problem myself.

Has anyone had a moment to check this out in v3? I’m curious whether the new version handles the large number of objects better :) I’ve only got a tiny environment in comparison.

Hi

As someone stated above, if you use the SQL plugin you could, for instance, get the top 10 (or however many you prefer) CPU state changes for the CPU monitor. This shows which servers’ CPU monitors change state most often, which is a rough indicator of which servers use the most CPU.

The query below (run against the OpsDB) lists the top 10 for the last day (easily changed):

select distinct top 10 count(sce.StateId) as NumStateChanges,
bme.DisplayName AS ObjectName,
bme.Path,
m.DisplayName as MonitorDisplayName,
m.Name as MonitorIdName,
mt.typename AS TargetClass
from StateChangeEvent sce with (nolock)
join state s with (nolock) on sce.StateId = s.StateId
join BaseManagedEntity bme with (nolock) on s.BasemanagedEntityId = bme.BasemanagedEntityId
join MonitorView m with (nolock) on s.MonitorId = m.Id
join managedtype mt with (nolock) on m.TargetMonitoringClassId = mt.ManagedTypeId
where m.IsUnitMonitor = 1
-- Scoped to a specific monitor (comment out the line below to include all unit monitors):
AND m.Name like ('%CPUUtilization%')
-- Scoped to a specific computer (remove the "--" below):
-- AND bme.Path like ('%sql%')
-- Scoped to within the last 1 day
AND sce.TimeGenerated > dateadd(dd,-1,getutcdate())
group by s.BasemanagedEntityId,bme.DisplayName,bme.Path,m.DisplayName,m.Name,mt.typename
order by NumStateChanges desc

I just tried this with about 900 Windows computers, and it still works. I think the object limit is around 1200.

Kind regards,

J

Yup, an alert is an option, but we just wanted to identify a potential problem here: how has this process been behaving over the last week or so? An alert only fires when the threshold is breached, so we would not be able to identify a process running at 35% CPU for two hours…

The SQL plugin is way faster. I was under the assumption that the performance plugin more or less “wrote” the query for me and displayed it nicely, but I may be wrong. Now, running this query:

select top 10 Path, ObjectName, CounterName, InstanceName, SampleValue, DateTime from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId
WHERE CounterName = '% Processor Time' AND InstanceName = 'MsMpEng'

This gives some data. Maybe I can build more intelligence into that…
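One possible next step, as a rough sketch only: the query above has no ORDER BY, so its top 10 rows are arbitrary samples rather than the busiest servers. Grouping by computer and ordering by the average would give an actual top N. The 7-day window, the column aliases (AvgValue, MaxValue, Samples), and the assumption that DateTime is stored in UTC are mine and not confirmed in this thread; the views and columns are the same ones used in the query above.

```sql
-- Sketch: rank computers by the average CPU of one process over the last 7 days.
-- Assumptions: 7-day window, DateTime stored in UTC, Path identifies the computer.
SELECT TOP 10
    vme.Path,
    vpri.InstanceName,
    AVG(pvpr.SampleValue) AS AvgValue,   -- average % Processor Time in the window
    MAX(pvpr.SampleValue) AS MaxValue,   -- highest single sample in the window
    COUNT(*)              AS Samples     -- number of samples behind the average
FROM Perf.vPerfRaw AS pvpr
INNER JOIN vManagedEntity AS vme
    ON pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
INNER JOIN vPerformanceRuleInstance AS vpri
    ON pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
INNER JOIN vPerformanceRule AS vpr
    ON vpr.RuleRowId = vpri.RuleRowId
WHERE vpr.CounterName = '% Processor Time'
  AND vpri.InstanceName = 'MsMpEng'
  AND pvpr.DateTime > DATEADD(dd, -7, GETUTCDATE())
GROUP BY vme.Path, vpri.InstanceName
ORDER BY AvgValue DESC;
```

Because the aggregation happens in SQL Server and only 10 rows come back, the size of the scope should matter much less than it does for the performance plugin. If the raw view does not retain a full week in your data warehouse, an hourly aggregate view may be a better source, but I have not tested that here.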

We have found this to be most useful: use the advanced option in the scope and take advantage of the criteria section.
http://support.squaredup.com/support/solutions/articles/213709-how-to-use-criteria-when-scoping-alerts

Hi jswadley, I don’t quite see how this solves my problem. I don’t want to use any criteria other than “list me the top 10 objects based on this perf. counter” :) The problem seems to be that my scope is too large and therefore times out.

What do you consider a large scope? 2680 computers will produce this error:
The JSON request was too large to be deserialized.

But 40 Exchange servers will display. I’m having a hard time finding anything in between :)

My environment is fewer than 100 machines, nowhere near your total!