Saturday, 17 February 2018

Checking SQL Server stale statistics

Clearly, it is of paramount importance to monitor the state of statistics because they always play a significant role on the performance of the whole database. When statistics are not up to date the performance will indeed decrease tremendously over time and I am sure nobody wants to have a sluggish system. How many times did you find yourself in a big problem related to unforeseen performance issues even some days after having carried out dutifully the respective database maintenance tasks? did you wonder why rebuilding index and statistics update tasks are not being good enough? if everything was working smoothly, why did the performance start going down unexpectedly? Why are the indexes not being used as usual? do I need to create more indexes? the answer might be found by digging deeper into the stale statistics. It is worth noting that Statistics contain all the information that SQL Optimizer needs to generate the best possible execution plans for your queries. If they are not up to date then the impact might be disastrous. So, having the right indexes are not good enough when the statistics went stale.

Generally, it is advisable to update statistics if it is verified that at least 20 per cent of the data changed (as long as they have more than five hundred rows). But can we know how many rows changed for a table so that we can just update its Index Statistics? In SQL Server almost everything can be checked and this is not an exception. Today I an coming with a script to check which Index Statistics have gone stale so that we can focus our work on them and keep expected performance going as well. The script is built on the base of STATS_DATE function which retuns the date of the most recent update for statistics on a table or indexed view.

SELECT DB_NAME() AS DatabaseName,
SCHEMA_NAME(t.[schema_id]) AS SchemaName, AS TableName, AS IndexName,
STATS_DATE(,ix.indid) AS 'StatsLastUpdate', -- using STATS_DATE function
ix.rowcnt AS 'RowCount',
ix.rowmodctr AS '#RowsChanged',
CAST((CAST(ix.rowmodctr AS DECIMAL(20,8))/CAST(ix.rowcnt AS DECIMAL(20,2)) * 100.0) AS DECIMAL(20,2)) AS '%RowsChanged'
FROM sys.sysindexes ix
INNER JOIN sys.tables t ON t.[object_id] = ix.[id]
WHERE > 100 -- excluding system object statistics
AND ix.indid > 0 -- excluding heaps or tables that do not any indexes
AND ix.rowcnt >= 500 -- only indexes with more than 500 rows
ORDER BY  [%RowsChanged] DESC
Bearing in mind there are two types of statistics: Column Statistics and Index Statistics. The first type are statistics linked to columns and created automatically as result of querying the columns, so SQL Server normally creates them unless you have AUTO_CREATE_STATISTICS database option disable, whereas the second one are statistics linked to indexes and these statistics are created at the same time when the indexes are. Likewise, Index Statistics are always updated with FULLSCAN while indexes are being rebuilt. So, there is no need to update Index Statistics explicitly after rebuilding the indexes, if so, it might be harmful because UPDATE STATISTICS command without pamareters is based on default sampling and it is not the best option for the vast majority of cases.
Talking of a little more of Column Statistics, after rebuilding indexes we just have to update Column Statistics and for this case it might be good enough to use default sampling, however, it is also advisable to use FULLSCAN if your environment is worthy of it.
Many of us may be wondering how many times the statistics need to be updated per month? the answer depends entirely on how frequently your data is being changed (updated, inserted, deleted). Updating statistics once a week might be enough for highly transactional environments whereas for others once a month would be more than good. That is all for now, I hope you find the script helpful. Le me know any remarks you may have. Stay tuned.

No comments:

Post a Comment