Saturday 17 February 2018

Checking SQL Server stale statistics

Clearly, it is of paramount importance to monitor the state of statistics because they always play a significant role in the performance of the whole database. When statistics are not up to date the performance will indeed decrease tremendously over time and I am sure nobody wants to have a sluggish system. So we possibly have the following questions:
  • How many times did you find yourself in a big problem related to unforeseen performance issues even some days after having carried out dutifully the respective database maintenance tasks? 
  • Did you wonder why rebuilding index and statistics update tasks are not good enough? 
  • If everything was working smoothly, why did the performance start going down unexpectedly? 
  • Why are the indexes not being used as usual? do I need to create more indexes? 
The answer to all these questions might be found by digging deeper into the stale statistics. It is worth noting that Statistics contain all the information that SQL Optimizer needs to generate the best possible execution plans for your queries. If they are not up to date then the impact might be disastrous. So, having the right indexes is not good enough when the statistics went stale.

Generally, it is advisable to update statistics if it is verified that at least 20 percent of the data changed (as long as they have more than five hundred rows). But can we know how many rows changed for a table so that we can just update its Index Statistics? In SQL Server almost everything can be checked and this is not an exception. Today, I am coming up with a script to check which Index Statistics have gone stale so that we can focus our work on them and keep expected performance going as well. The script is built on the base of STATS_DATE function which returns the date of the most recent update for statistics on a table or indexed view.

SELECT DB_NAME() AS DatabaseName,
SCHEMA_NAME(t.[schema_id]) AS SchemaName,
t.name AS TableName,
ix.name AS IndexName,
STATS_DATE(ix.id,ix.indid) AS 'StatsLastUpdate', -- using STATS_DATE function
ix.rowcnt AS 'RowCount',
ix.rowmodctr AS '#RowsChanged',
CAST((CAST(ix.rowmodctr AS DECIMAL(20,8))/CAST(ix.rowcnt AS DECIMAL(20,2)) * 100.0) AS DECIMAL(20,2)) AS '%RowsChanged'
FROM sys.sysindexes ix
INNER JOIN sys.tables t ON t.[object_id] = ix.[id]
WHERE ix.id > 100 -- excluding system object statistics
AND ix.indid > 0 -- excluding heaps or tables that do not have any indexes
AND ix.rowcnt >= 500 -- only indexes with more than 500 rows
ORDER BY  [%RowsChanged] DESC
Bearing in mind there are two types of statistics: Column Statistics and Index Statistics. The first type are statistics linked to columns and created automatically as result of querying the columns, so SQL Server normally creates them unless you have AUTO_CREATE_STATISTICS database option disable, whereas the second one are statistics linked to indexes and these statistics are created at the same time when the indexes are. Likewise, Index Statistics are always updated with FULLSCAN while indexes are being rebuilt. So, there is no need to update Index Statistics explicitly after rebuilding the indexes, if so, it might be harmful because UPDATE STATISTICS command without parameters is based on default sampling and it is not the best option for the vast majority of cases.
Talking of a little more of Column Statistics, after rebuilding indexes, we just have to update Column Statistics and, for this case, it might be good enough to use default sampling; however, it is also advisable to use FULLSCAN if possible.
Many of us may be wondering how many times the statistics need to be updated per month? the answer depends entirely on how frequently your data is being changed (updated, inserted, deleted). Updating statistics once a week might be enough for highly transactional environments whereas for others once a month would be more than good. That is all for now, I hope you find the script helpful. Let me know any remarks you may have. Stay tuned.

No comments:

Post a Comment

Let me know any remarks or questions you may have. Please write down your name.

HELLO, I'M PERCY REYES! — a book lover, healthy lifestyle lover... I've been working as a senior SQL Server Database Administrator (DBA) for over 20 years; I'm a three-time awarded Microsoft Data Platform MVP. I'm currently doing a PhD in Computer Science (cryptography) at Loughborough University, England — working on cryptographic Boolean functions, algorithmic cryptanalysis, number theory, and other algebraic aspects of cryptography. READ MORE