How to verify OpsMgr Data Warehouse Grooming is actually happening
How do I know data warehouse grooming and aggregation is actually happening?
You can run the query below, which will return the date / time of the oldest and newest aggregated samples in the Data Warehouse and the number of days in the data set of your hourly samples.
select
min(datetime)as MinDate,
max(datetime)as MaxDate ,
datediff(d,min(datetime),max(datetime)) AS NoOfDaysInDataSet
from Perf.vPerfHourly
The following will give you the number of samples - it was more than 14 million in the lab Data Warehouse I used.
select count(*) from Perf.PerfHourly
If you want to see all the columns contained, retrieve a few rows with the query below
select top 10 * from Perf.PerfHourly
and here’s some of the data you’ll see:
Columns in the output include SampleCount (the number of samples in the hourly aggregation), AverageValue, MinValue, MaxValue and StandardDeviation. If you run a report, I am sure you’ve seen these last 4 in a chart.
What is Standard Deviation anyway?
While min, max and average values are obvious enough, you may not be familiar with StandardDeviation if you didn’t take Statistics in school, so here’s a definition.
Standard Deviation is the measure of how widely dispersed the values in the samples that were aggregated. A low value here means the samples were tightly clustered, meaning there was not much fluctuation in the values in the raw performance data. A high value means the counter was bouncing up and down quite a lot (that there were sharp spikes and/or drops).