Showing posts with label Performance Tuning. Show all posts
Showing posts with label Performance Tuning. Show all posts

Thursday 4 March 2021

Don't Place the Blame on SQL Server

I have never worked for Microsoft, but SQL Server has given me a lot in terms of learning, community and opportunities, all these together have helped me do a great job as a Database Administrator (DBA) for many years. Since I started working with SQL Server nearly 15 years ago, I have heard a lot of complaints about SQL Server being nowhere near as good as Oracle. Much as I would have liked to ignore these fruitless discussions, I couldn't see the point of comparing products in such a compulsive way. Is it not true that our skills are more crucial than the technology itself? — Or perhaps some people just try to find something to blame. Whatever the case, I am convinced that we, as database professionals, are compelled to make the most out of any specific database technology. 
No matter what technology we are working with, we are at the wheel — technology is just a tool — so it is not the best to blame technology on the ground of one's inefficiency.

Saturday 17 February 2018

Checking SQL Server stale statistics

Clearly, it is of paramount importance to monitor the state of statistics because they always play a significant role in the performance of the whole database. When statistics are not up to date the performance will indeed decrease tremendously over time and I am sure nobody wants to have a sluggish system. So we possibly have the following questions:
  • How many times did you find yourself in a big problem related to unforeseen performance issues even some days after having carried out dutifully the respective database maintenance tasks? 
  • Did you wonder why rebuilding index and statistics update tasks are not good enough? 
  • If everything was working smoothly, why did the performance start going down unexpectedly? 
  • Why are the indexes not being used as usual? do I need to create more indexes? 
The answer to all these questions might be found by digging deeper into the stale statistics. It is worth noting that Statistics contain all the information that SQL Optimizer needs to generate the best possible execution plans for your queries. If they are not up to date then the impact might be disastrous. So, having the right indexes is not good enough when the statistics went stale.

Tuesday 13 February 2018

Avoid changing default ANSI database options

Not having another way of fixing some specific errors, at times some people may consider turning off any ANSI database options as a final solution. To be perfectly honest, I do recommend getting to the bottom of each problem and then fixing it at that level instead of changing default ANSI settings (unless it is just a quick fix or is truly necessary because you verified the benefits are significant). For instance, it is by no means uncommon to turn ANSI_WARRINGS off to fix the following error:

Msg 8152, Level 16, State 14, Line 5
String or binary data would be truncated.

Friday 1 December 2017

SQLServer Buffer Manager Buffer cache hit ratio, Page Life,Lazy Write/sec, Checkpoint Pages/sec

Without any shadow of a doubt, one of the procedures to follow in order to monitor and diagnose performance issues of a database server is to analyse meaningful performance counters related to SQL Server. As a Database Administrator, we need to get used to dealing with this sorts of problems so it truly helpful to know how to analyse and interpret them by contrasting with other windows performance counters.

To begin with, we can analyse SQLServer Buffer cache hit ratio performance counter which gives us the percentage of cache memory usage. This percentage should mostly be above 99% for OLTP database servers. Therefore, nowadays 4GB or 8GB of RAM is not good enough for the vast majority of them. The knock-on effect of this lack of RAM is that there will be a huge increase on the I/O disk activity as a result of using much more paging (data pages are moving in and out from memory very frequently because there is not enough space to locate new ones). This will also cause a huge disk queue which will keep disk subsystem busy, consequently, it will impact directly on the database performance detrimentally. In this context, we also need to look into other SQL Server performance counters like SQLServer: Buffer Manager: Page Life Expectancy, SQLServer: Buffer Manager: Lazy Write/sec, and SQLServer: Buffer Manager: Checkpoint Pages/sec. 

As I mentioned before, there are other useful performance counters to diagnose and tune cache memory issues like Buffer Manager: Page Life Expectancy which indicates the time in seconds that a page has been inside of memory. This time should mostly be above 300 seconds, that is, 5 minutes. So, values below it should be considered as an alert, which means that SQL Server is under high memory pressure because bad-written queries may be using a lot. Clearly, the final solution is not to add more memory but identify and optimise those problematic queries. Only after that may you opt to add memory.

The other performance counter to check is SQLServer: Buffer Manager: Lazy Write/sec which gives the quantity of pages moved out per second from memory. Not only could it mean a lack of memory, but also there are copious amounts of Checkpoints which are not good at all because it will cause recompilation of stored procedures at the same time.The value of this counter should mostly be below 20. So, if you notice this value above 20, you may need to check SQLServer: Buffer Manager: Checkpoint Pages/sec counter as well. Checkpoints move out all pages from memory to disk.

Finally, keep in mind that it really important to do diagnostic tasks regularly which allow to take preventive actions so as to improve the whole performance over time. I do hope you find this post helpful. Let me know any remarks you may have. Thanks for reading. Stay tuned.

Thursday 23 November 2017

Dropping columns statistics after migrating to SQL Server 2014/2016/2017

While migrating databases to new versions of SQL Server, there are some tasks that need to be carried out in order to warrant strong consistent data and physical integrity of databases in tandem with their performance. For instance, it is common to run DBCC CHECKDB, rebuild all indexes, and update columns and index statistics. In some scenarios, we might consider dropping columns statistics and let SQL Server create them again according to new algorithms especially when the new cardinality estimator (CE) is going to be used after upgrading to SQL Server 2014/2016/2017. I personally do that, after restoring databases on the new version of SQL Server I proceed to drop all columns statistics (always having AUTO_CREATE_STATISTICS option enabled) and then SQL Server will definitely create them again based on the nature of database queries and the logic of THE new CE. In a long-term perspective this technique is the most recommendable from my point of view as not only will we have new versions of statistics but also purge old and unused statistics.
Here I will share a script to delete columns statistics. Be caution and only drop them if you have AUTO_CREATE_STATISTICS option enabled, otherwise no statistics will be created and the database performance will be affected tremendously. That is all for now. Let me know any remarks you may have.

  IF db_name() NOT IN ('model','master','distribution','msdb','tempdb')
        DECLARE @schema_name varchar(max)      
        DECLARE @table_name varchar(max)     
        DECLARE @stat_name varchar(max)        
        DECLARE @update_stat_cmd varchar(max)        
        DECLARE @update_stat_msg_header   varchar(max)      
        DECLARE update_stat_cursor CURSOR FOR    
              select  schema_name(o.[schema_id]),  object_name(s1.[object_id]) ,   
              from (  
                select s.[object_id], from sys.stats s  
                left join sys.indexes i on  
                where is null) s1  
              inner join sys.objects o on o.[object_id]=s1.[object_id]  
              where  o.type='U'  
              order by schema_name(o.[schema_id]),  object_name(s1.[object_id]) ,   
         OPEN update_stat_cursor        
         FETCH NEXT FROM update_stat_cursor INTO  @schema_name, @table_name,  @stat_name         
         WHILE (@@fetch_status = 0)        
           DECLARE @ini DATETIME, @fin DATETIME     
           SET @update_stat_msg_header =  '->Dropping ['+ RTRIM(@schema_name)  +'].[' + RTRIM(@table_name) + '].[' +@stat_name + ']'      
           PRINT @update_stat_msg_header    
           SET @update_stat_cmd ='DROP STATISTICS ['+ RTRIM(@schema_name)  +'].[' + RTRIM(@table_name) + '].[' +@stat_name + ']'    
           SET @ini=GETDATE()    
           EXEC (@update_stat_cmd)      
           SET @fin=GETDATE()    
           FETCH NEXT FROM update_stat_cursor INTO @schema_name, @table_name,  @stat_name            
         PRINT ' '       
         PRINT '----------------------------------------------------------------------------- '        
         SET @update_stat_msg_header = '*************  THERE ARE NO MORE STATISTICS TO BE UPDATED **************'         
         PRINT @update_stat_msg_header        
         PRINT ' '        
         PRINT 'All statistics not linked to any index were rebuilt!'        
        CLOSE update_stat_cursor        
        DEALLOCATE update_stat_cursor    

Friday 3 November 2017

AUTO_CLOSE database option and its impact on the performance

The AUTO_CLOSE database option is only one of the many options related to performance and availability of the database in SQL Server.  When AUTO_CLOSE is set to ON for a database, SQL Server closes all its files and releases the resources used for it shortly after the last connection is closed. This action will reduce the usage of memory, nevertheless, it is barely insignificant (12KB or 20KB). Furthermore, having this option turned on, the first connection will have to open the database again, as a result, it will experience a delay. I highly recommend having disabled this option all the time. Here is the code to turn this option off:

This option is turned off by default, but I have found many databases with this option turned on which also impacts on it performance. The other disadvantage of having enabled it is that when the last established connection to the database is closed, its files are accessible to be manipulated directly via Windows by some user, which means that someone is completely able to delete them while the database engine is running. So, we need to work with lots of cautiousness when it comes to changing not only this database option but also others.

That is all for now, let me know any remarks you may have. Thanks for reading again. Stay tuned.

Tuesday 12 September 2017

Detecting poor cursor usage in SQL Server

Without a shadow of a doubt, poor cursor usage issue is an essential aspect to review so as to reduce the possibility of having CPU bottleneck issues, and determine whether cursors are the most appropriate means to accomplish the processing or whether a set-based operation is possible. It is well know that set-based operation is generally more efficient, but if you make the decision of using cursors, you should ensure that it does not represent an issue for the database in question. Thus, detecting poor cursor usage and taking certain measures to ease the problem is crucial, and in SQL Server there are some techniques available to be used to diagnose the issue. To begin with, by using performance counters we are able to analyse the issue and find out the extent to which poor cursor usage is adversely affecting on the performance of specific workloads or the whole database. For instance, here is the counter SQL Server: Cursor Manager By Type: Cursor Requests/Sec which retrieves information about the number of SQL cursor requests received by the server. On the top of that, it is possible to filter out by the cursor manager instance such as API Cursor (only the API cursor information), TSQL Global Cursor and TSQL Local Cursor. Have a look at SQL Server, Cursor Manager by Type Object to get more info about the counter.
On the other hand, using SQL Trace is also helpful, for example, use a trace that includes the RPC:Completed event class search for 'sp_cursorfetch' statements. The value of the fourth parameter is the number of rows returned by the fetch. It is worth noting that the maximum number of rows that are requested to be returned is specified as an input parameter in the corresponding RPC:Starting event class. Finally, by using the DMV 'sys.dm_exec_cursors' we can also determine whether poor cursor usage exists in the database server as Transact-SQL cursors always have a fetch buffer of 1 and for API cursors it should be higher.

select c.* 
from sys.dm_exec_sessions s
    cross apply sys.dm_exec_cursors(con.session_id) as c
where cur.fetch_buffer_size = 1 
    and LIKE 'API%'

Consequently, if it is seen that API Cursors have a fetch buffer size of 1 then consider enabling multiple active results (MARS) when connecting to SQL Server and consult the appropriate documentation for your specific API to determine how to specify a higher fetch buffer size for the cursor either ODBC (SQL_ATTR_ROW_ARRAY_SIZE) or OLE DB (IRowset::GetNextRows, IRowsetLocate::GetRowsAt). After that, we can retrieve more details about the session and connection of the users associated to the harmful cursors so as to decide what to do.

select s.session_id, cn.client_net_address, s.login_name, s.status,s.client_interface_name, s.program_name, 
       c.cursor_id,,, c.plan_generation_num, c.creation_time, c.is_open, c.fetch_status, 
       c.fetch_buffer_size, c.worker_time, c.reads, c.writes, c.dormant_duration
from sys.dm_exec_connections cn
inner join  sys.dm_exec_sessions s on cn.session_id = s.session_id
cross apply sys.dm_exec_cursors(s.session_id) as c
where c.fetch_buffer_size = 1 
     and LIKE 'API%'

That is all for now, thanks for reading. Let me know any remarks you may have. Stay tuned.

Friday 4 August 2017

Troubleshooting timeout expired errors

Having .NET Applications working with SQL Server, some unforeseen timeout errors can be raised for two possible causes: bad-written SQL code (especially Ad-Hoc queries) and application issues. Under this circumstance, developers will generally blame SQL Server at first glance because they may think it is a misconfiguration related to timeout in SQL Server. In my experience troubleshooting these issues, I would say that lots of them are not mostly related to SQL Server configuration itself. Additionally, when it comes to application as the root cause, reading the SQL Server error log, you will not find any error about timeout or failed login events. What does it mean? it simply means that the application never tried to connect to SQL Server because it was still working at application level when the error was raised. Thus, the cause of timeout error is inside the application at Net SqlClient Data Provider level (see the CommandTimeout property value configured for the application connection). This 'CommandTimeout' property specifies the number of seconds that a application provider should wait for result sets before sending a timeout signal. It is well known that the default is 30 seconds, therefore, it may not be good enough due to application performance problems.
Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.
In order to get it fixed, firstly, developers should check .NET code within applications. In some cases, the developers I worked with found infinite loops or slow codes which spent much time before sending SQL queries to the database engine, as a result of this, the timeout threshold was reached at application level and, obviously, the error was raised. By the way, the following error could also be related to timeout:
Unable to connect to SQL Server session database.
Having checked the application, you can increase the 'CommandTimeout' property value to 60 seconds (if it is not enough then change to 120 or 180, and also considerate using 0 (unlimited) which should be a workaround while the problem is being traced and fixed). That is all for now, let me know any remarks you may have. Thanks for reading. Stay tuned.

Thursday 20 July 2017

Detecting excessive compilation and recompilation issues

Undoubtedly, recompilation is a big topic to reckon with, more importantly, in database environments processing data that is changing rapidly over time and compounded by ad-hoc workloads which may cause CPU bottleneck, so it is of paramount importance to detect excessive compilation and recompilation issues and address them to warrant stable performance for the queries, and in this sense, there are some tools to be used to detect these issues such as Performance Monitor, Extended Events, SQL Server Profiler Trace, DMVs, etc. When it comes to using Performance Monitor we should concentrate the efforts on analising the performance counters SQL Server: SQL Statistics: Batch Requests/sec, SQL Server: SQL Statistics: SQL Compilations/sec, and SQL Server: SQL Statistics: SQL Recompilations/sec. What's more, you can save a trace file capturing the events SP:Recompile, SQL:StmtRecompile, and CursorRecompile, then you can use the following query to see all the recompilation events:

select spid, StartTime, Textdata, EventSubclass, ObjectID, DatabaseID, SQLHandle 
from fn_trace_gettable('C:\RecompilationTrace_01.trc', 1)
where EventClass in(37,75,166) -- 37 = Sp:Recompile, 75 = CursorRecompile, 166 = SQL:StmtRecompile

Furthermore, we can also capture the showplan XML for query compile, but doing that has significant performance overhead because it is captured for each compilation or recompilation. So, just do it for a very short time as long as you see a high value for the SQL Compilations/sec counter in Performance Monitor. Once you know where the problem is you can use Engine Tuning Advisor to see whether any indexing changes improve the compile time and the execution time of the query.
Talking of DMVs to diagnose recompilation issues, looking into 'sys.dm_exec_query_optimizer_info' is very helpful, in particular, look at the Elapsed Time, which is the time elapsed due to optimizations, and also Final Cost. If you see that Time Elapsed is very close to the CPU time, you might reach the conclusion that the compilation and recompilation time is attributable to that high CPU use. Another DMV to use is 'sys.dm_exec_query_stats' whose most important columns to look at are sql_handle, total worker time, plan generation number (the number of times the query has recompiled), and statement Start Offset. Here is an example to check the top 20 most procedures that have been recompiled.

select top 20 SQLText.text, sql_handle, plan_generation_num, execution_count, dbid, objectid 
from sys.dm_exec_query_stats
   cross apply sys.dm_exec_sql_text(sql_handle) as SQLText
where plan_generation_num >1
order by plan_generation_num desc

There is a plethora of recommendations to deal with recompilation and keep the performance in optimum conditions, but for now you can take into consideration the following options:
  • Check to see whether the stored procedure was created with the WITH RECOMPILE option or whether the RECOMPILE query hint was used. If a procedure was created with the WITH RECOMPILE option, since SQL Server 2005, you may be able to take advantage of a statement-level RECOMPILE hint if a particular statement within that procedure needs to be recompiled. Using this hint at the statement level avoids the need of recompiling the whole procedure each time it executes, while at the same time allowing the individual statement to be compiled. 
  • Recompilations can occur due to changes in statistics, and you can use the KEEPFIXED PLAN query hint to make recompilations occur only when there is the need to ensure correctness and not to respond to changes in statistics. For instance, in this context recompilation can only occur if the underlying table structure or its schema that is referenced by a statement changes, or if a table is marked with the sp_recompile stored procedure, all resulting in the fact that the plan no longer applies and consequently triggering the recompilation event. 
  • Using the KEEP PLAN query hint is useful to set the recompilation threshold of temporary tables to be the same as permanent tables. Take a look at the EventSubclass column which displays 'Statistics Changed' for an operation on a temporary table.
  • Turning off the automatic updates of statistics for indexes and statistics that are defined on a table or indexed view prevents recompilations that are due to statistics changes on that object. It is worth noting that turning off the auto-stats option is not always a good idea. This is because the query optimizer is no longer sensitive to data changes in those objects resulting in suboptimal query plans. To be honest, I never turned off this option because I always preferred trusting in SQL Server criteria and instead opting to work on optimising queries.
  • Keep in mind that recompilation thresholds for temporary tables are lower than for normal tables, so if the recompilations on a temporary table are due to statistics changes, you can change the temporary tables to table variables. A change in the cardinality of a table variable does not cause a recompilation. The side effect of this approach is that the query optimizer does not keep track of a table variable's cardinality because statistics are not created or maintained on table variables. This can result in less optimal query plans, however, you can test the different options and choose the best one. Generally, temporary tables provide much better performance than tables variables when lots of data is involved.
  • Recompilation might also occur as a result from SET option changes, so one can diagnose it by using SQL Server Profiler to determine which SET option changed. It is highly advisable to avoid changing SET options within stored procedures and it is much better to set them at the connection level, and for the overwhelmingly majority of cases the default SET options work well. It is also very important to ensure that SET options are not changed during the lifetime of the connection.
  • To avoid recompilations that are due to deferred compilations, do not combine DML with DDL and do not create the DDL as a result from conditional IF statements.
  • To avoid recompilation and also to avoid ambiguity between objects, batches should have qualified object names, for example, dbo.Table1, User1.MySP, etc.
That is all for now, let me know any remarks you may have.

Friday 26 May 2017

How to know which non-clustered indexes are not being used any longer

It is said that indexes are extremely useful to boost the performance of databases, especially indexes on huge tables. I am in agreement with that and I also support the idea of having the right indexes to achieve the best performance if they are created and used properly. Consequently, we have to vigilant of it, indexes may have been created to solve a particular problem in a moment, however, it does not mean that they are going to be useful forever but only for some days or months while data does not change too much. So, I should say as well that indexes are not the be-all and end-all for every business case, and we must always monitor the index usage and should consider the choose of dropping them if we verify they are not being using any longer. Keep in mind, not only could indexes be big and occupy a lot of space, but also impact on write operations negatively. Therefore, we should avoid wasting space, and having done that we can also optimise preventive indexes maintenance time.

This post will show a code to know exactly which non-clustered indexes are being used since the last restart of SQL Server engine. This code also filters out the indexes which are not related to primary keys. As you have seen in the code, we figure out the indexes by checking the value of columns user_seeks, user_scans, and user_lookups, if they are equal to zero then is says to us they were not used until this moment. Please be cautious, before using this code to get the indexes to be deleted, you should make sure that the database engine is not been restarted recently, otherwise, you will delete the wrong indexes and will impact on the performance tremendously. I strongly suggest using this code only if the latest restart of SQL Server has been done three months ago, more and less.

SELECT  i.type_desc,last_user_seek, last_user_scan, last_user_lookup,
        OBJECT_NAME(s.[object_id]) AS [Table Name] , AS [Index Name] ,
        i.index_id ,
        user_updates AS [Total Writes] ,
        user_seeks + user_scans + user_lookups AS [Total Reads] ,
        user_updates - ( user_seeks + user_scans + user_lookups ) AS [Difference]
, 'DROP INDEX ['+ schema_name(o.[schema_id]) +'].['+ OBJECT_NAME(s.[object_id]) +'].[' +  + '];' as Statement_Drop_Index
,i.is_unique, i.is_unique_constraint, i.is_primary_key
FROM    sys.dm_db_index_usage_stats AS s WITH ( NOLOCK )
INNER JOIN sys.indexes AS i WITH ( NOLOCK )
INNER JOIN sys.objects o on o.object_id = i.object_id
    ON s.[object_id] = i.[object_id]  AND i.index_id = s.index_id
WHERE OBJECTPROPERTY(s.[object_id], 'IsUserTable') = 1
      AND s.database_id = DB_ID()     
      AND i.index_id > 1 and i.is_primary_key=0 and i.is_unique=0 and i.is_unique_constraint=0
      AND isnull(user_seeks, 0) + isnull(user_scans, 0) + isnull(user_lookups,0) =0 
ORDER BY  OBJECT_NAME(s.[object_id]), [Difference] DESC, [Total Writes] DESC,[Total Reads] ASC ;

When you execute it, you will also get a column containing the T-SQL code to delete the indexes which have not been used since the latest restart of SQL Server, and then you can execute it to delete them. To sum up, do not get me wrong, I am not against using indexes, on the contrary, I am in favour of using them properly, and if there were indexes that are not being using since a long time ago then they should be deleted. That's all for now. Let me know any remarks you may have. Thanks for reading.

Monday 26 December 2016

How to delete large quantity of data with no growth of the Transaction Log and no blocking issue

Without a shadow of a doubt, at times while working on optimising not only database performance but also hard disk usage, we have to face with challenging tasks which need to be completed rapidly without impacting on production environment. It is well-known that deleting data from large tables could be a truly hard task to complete as the much bigger they are the much more time they will take. Consequently, this task will also block the tables much time causing performance issues and stopping systems from working. Today's post is going to show one technique to delete historic data so as to only keep recent data and boost the performance without a hitch. (This technique applies for SQL Server 2005 Enterprise Edition up to SQL Server 2014 Enterprise. SQL Server 2016 supports TRUNCATE table with partitions.)

To begin with, the technique consists in using SQL Server partition switching which allows to access and transfer subsets of data quickly and efficiently. In this way, we may need to alter the large table and make it partitioned but in this example we are going to create a new one. Now let's say there is a need to keep only data from the latest three months. Here is the structure of the table 'LogTracker' which will be partitioned shortly after. It is also included a specific column named 'MonthNumber' to save the number of the month which at the same time will be the partition number.

CREATE TABLE [dbo].[LogTracker](
    [LogID] [uniqueidentifier] NOT NULL,
    [CreateDate] [datetime]  NULL,
    [ErrorDesc] [nvarchar](MAX)  NULL,
    [User] [nvarchar](40) NULL,
    [MonthNumber]  AS DATEPART(MONTH, [CreateDate]) PERSISTED NOT NULL
    [LogID] ASC,
    [MonthNumber] ASC

There must also be a non-partitioned table (with the same structure) to move the data from the partitioned one. This table will hold the data during a very short time and then it will be truncated. Because it does not have any foreign keys or constraints the truncate operation will be completed successfully.

CREATE TABLE [dbo].[LogTracker_Temp](
    [LogID] [uniqueidentifier] NOT NULL,
    [CreateDate] [datetime]  NULL,
    [ErrorDesc] [nvarchar](MAX)  NULL,
    [User] [nvarchar](40) NULL,
    [MonthNumber]  AS DATEPART(MONTH, [CreateDate]) PERSISTED NOT NULL
    [LogID] ASC,
    [MonthNumber] ASC

Now it is time to create the Function and the Scheme partition we will use to partition the table 'LogTracker'. For this particular example, the data will be classified by month, which means that each partition will contain data from one month.

CREATE PARTITION FUNCTION [upf_LogTracker](INT) AS RANGE LEFT FOR VALUES (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)

Finally, we will partition the table.

WITH (MOVE TO ups_LogTracker([MonthNumber]))
ALTER TABLE dbo.LogTracker ADD CONSTRAINT PK_LogTracker PRIMARY KEY ([LogID],[MonthNumber])

Having successfully completed everything, we are able to insert some data for the purpose of this illustration. Obviously, we can insert millions of rows and the performance of deletion by using this technique will be the same, it will practically be instantaneous. There is no difference between deleting some rows or millions as they are using switching partition which is essentially a logical transfer of data between two tables.

INSERT INTO LogTracker SELECT NEWID(), '2016-06-23 13:13:56', 'Error A Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-07-24 04:18:12', 'Error C Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-08-20 01:34:57', 'Error B Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-09-21 08:09:16', 'Error D Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-11-22 05:26:45', 'Error A Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-11-23 15:43:56', 'Error A Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-12-24 23:28:12', 'Error C Desc', 'preyes'
As you can see, the data is classified by month.

SELECT $PARTITION.[upf_LogTracker](MonthNumber),* FROM LogTracker

With this code we are going to delete the historic data and keep only the most recent three months. It moves out the data of each month from 'LogTracker' to the stage table 'LogTracker_Temp' which in no time is truncated. It starts moving data from the month number one until 'the current month - 3'.

DECLARE @PartitionNumberToArchive INT=1
WHILE (@PartitionNumberToArchive<=@M)
    ALTER TABLE dbo.LogTracker SWITCH PARTITION @PartitionNumberToArchive TO dbo.LogTracker_temp
    TRUNCATE TABLE dbo.LogTracker_temp
    SET @PartitionNumberToArchive=@PartitionNumberToArchive+1

The previous code will finish in one o some seconds, instantaneously.

Naturally, the code needs to be scheduled and executed via a SQL Job at the end of each month. To be perfectly honest, not only will this deletion finish in one or two seconds, but also it will not cause any impact on the performance and the Transaction Log will not grow as it does by using traditionally techniques. I hope this post is truly useful for you and you make the most out of it. Let me know any remarks you may have. That is all for now. Thanks for reading.

Wednesday 7 December 2016

New per-operator level performance stats included in SQL2016 SP1 and SQL2014 SP2

Traditionally, we were used to getting only basic runtime performance stats per operator regarding to CPU, I/O and Memory, which were not good enough to allow us to automate recollecting tasks of them. I strongly believe that DBAs are always overwhelmingly interested in diagnosing some issues proactively so as to prevent bad-written queries from using hardware resources inefficiently. Luckily, this convoluted situation has been changed radically in SQL Server 2016 and SQL Server 2014 SP2 since Microsoft has included very useful information per operator (and also per thread in the scope of each operator) which can be seen looking into the Showplan XML. For instance, we are going to see that information for the following query (in this case it has run under a serial plan, that is one thread).

This is its actual execution plan.

It is highlighted all what is new in SQL2016, SQL2016 SP1 and SQL2014 SP2. Therefore, we can appreciate CPU and I/O runtime performance stats for the Clustered Index Scan operator on 'PurchaseOrderDetail' table such as I/O reads, CPU time and Elapsed time. In a parallel plan you will be able to see information per thread in each operator.

And now we will see not only CPU and I/O performance stats for the Hash operator but also Memory performance stats.

All this information can also be seen in the Properties window in the context of each operator inside the Actual Execution Plan by using the latest version of SSMS. If you display the detail of each statistic, the detail per thread will be seen when working with parallel execution plan.

Having seen this, not only are we now able to look into useful performance stats by using Showplan XML and SSMS, but also we can create some scripts for administration and optimization purposes in order to pick up queries with poor performance and then optimise them. That is all for now. Let me know any remark you may have. Thanks for reading.

Friday 4 March 2016

How to split the tempdb database into more files

Naturally, as DBAs we do know is mandatory to modify data and log properties of the tempdb database. Unless we do a customized configuration of it, SQL Server will create only one data file and log file by default. The reality is that we will mostly need to create more files on production environments. There are many recommendations not only about how to create them but also the quantity of files on OLTP environments. To be perfectly honest, I do not believe that the number of data files only depends on the number of cores, but it also depends on concurrency, tempdb contention issues, the workloads on your server and, clearly, the performance of your queries. So, there is no a rule for it. Moreover, depending on the SQL Server version we are working on, we may NOT need to split the tempdb into many data files. There are some situations where it will work splendidly with only one data file since each database environment is unique and, therefore, we need to determine the best for it.

Today we are not going to discuss more details about it. I just would like to suggest splitting your tempdb database into four data files (if you have four or eight core) or eight data files (for 16, 32, 64, or more cores) and ONLY one log file. Ideally, we also need to locate them in different drives RAID1, RAID5 or RAID10. Now having very clear the situation, I will show you one small script to split your default tempdb database into 8 data files and remaining the only one log file.

USE [master]
ALTER DATABASE [tempdb] MODIFY FILE (NAME=N'templog', NEWNAME= N'tempdev_Log')
ALTER DATABASE [tempdb] MODIFY FILE (NAME=N'tempdev', NEWNAME=N'tempdev_Data01')
ALTER DATABASE [tempdb] MODIFY FILE (NAME=N'tempdev_Log',  FILENAME= N'D:SQLTempDBtempdev_Log.ldf', SIZE = 2048MB , FILEGROWTH = 2048MB)
ALTER DATABASE [tempdb] MODIFY FILE (NAME=N'tempdev_Data01',  FILENAME= N'D:SQLTempDBtempdev_Data01.mdf', SIZE = 512MB , FILEGROWTH = 1024MB )
ALTER DATABASE [tempdb] ADD FILE ( NAME = N'tempdev_Data02', FILENAME = N'D:SQLTempDBtempdev_Data02.ndf' , SIZE = 512MB , FILEGROWTH = 1024MB )
ALTER DATABASE [tempdb] ADD FILE ( NAME = N'tempdev_Data03', FILENAME = N'D:SQLTempDBtempdev_Data03.ndf' , SIZE = 512MB , FILEGROWTH = 1024MB )
ALTER DATABASE [tempdb] ADD FILE ( NAME = N'tempdev_Data04', FILENAME = N'D:SQLTempDBtempdev_Data04.ndf' , SIZE = 512MB , FILEGROWTH = 1024MB )
ALTER DATABASE [tempdb] ADD FILE ( NAME = N'tempdev_Data05', FILENAME = N'D:SQLTempDBtempdev_Data05.ndf' , SIZE = 512MB , FILEGROWTH = 1024MB )
ALTER DATABASE [tempdb] ADD FILE ( NAME = N'tempdev_Data06', FILENAME = N'D:SQLTempDBtempdev_Data06.ndf' , SIZE = 512MB , FILEGROWTH = 1024MB )
ALTER DATABASE [tempdb] ADD FILE ( NAME = N'tempdev_Data07', FILENAME = N'D:SQLTempDBtempdev_Data07.ndf' , SIZE = 512MB , FILEGROWTH = 1024MB )
ALTER DATABASE [tempdb] ADD FILE ( NAME = N'tempdev_Data08', FILENAME = N'D:SQLTempDBtempdev_Data08.ndf' , SIZE = 512MB , FILEGROWTH = 1024MB )

After having successfully executed, we must restart the database engine in order to get the new files created and all changes done. I hope you find this post interesting. Let me know any remark if you may have. Thanks for reading!

Saturday 13 February 2016

Looking into some concepts about the well-known tempdb database

Today's post is going to look into some concepts about the well-known tempdb database. As DBAs, we have to keep an eye on it not only at the beginning of the installation, but also as part of our monitoring tasks on a daily basis. To do so, we really need to understand how it works and what enhancements have come from SQL Server 2005 to forward. To begin with, the tempdb is the shared database per instance in SQL Server which is used to stored and manage temporary objects. It has a number of changes since SQL Server 2005, that is, there are new tempdb usages and internal optimizations enhancements. Nevertheless, the tempdb architecture is mostly unchanged since SQL Server 2000. 
In a user database, the transactions have the ACID attributes: atomicity, concurrency, isolation, and durability whereas in the tempdb database the transactions lose the durability attribute which basically means that they do not persist after a SQL Server shut down event. Most of these internal operations on tempdb do not generate log records as there is no need to roll back. So, these operations are faster. Furthermore, some of the database options cannot be modified for tempdb and others are limited or restricted. Here they are:
  • Auto Shrink is not allowed for tempdb.
  • Database Shrink and File Shrink capabilities are also limited.
  • The database CHECKSUM option cannot be enabled. 
  • A database snapshot cannot be created on tempdb. 
  • DBCC CHECKALLOC and DBCC CHECKCATALOG are not supported. 
  • Only offline checking for DBCC CHECKTABLE is performed.

Tempdb Space Usage: The following types of objects can occupy tempdb space: Internal Objects, Version Stores, and User Objects.

Internal Objects: Internal Objects metadata is stored in memory (it means that the metadata is hidden in the tempdb and it does not appear in catalog views such as 'sys.all_objects') and each of them occupies at least nine pages (one IAM page and eight data pages) in tempdb. Page Allocations and Update operations on it does not generate log records. The Internal Objects are used
  • to store intermediate runs for sort
  • to store intermediate results for hash joins and hash aggregates
  • to store XML variables or other large object (LOB) data type variables (text, image, ntext, varchar(max), varbinary(max), and all others)
  • by queries that need a spool to store intermediate results
  • by keyset cursors to store the keys
  • by static cursors to store a query result
  • by Service Broker (Query Notification and Event Notification) to store messages in transit
  • by INSTEAD OF triggers to store data for internal processing
  • by DBCC CHECKDB (it internally uses a query that may need to spool intermediate results)
Version Stores: They do not appear in catalog views such as 'sys.all_objects'. Here are more concepts:
  • Version stores are used to store row versions generated by transactions for features such as snapshot isolation, triggers, MARS (multiple active result sets), and online index build.
  • The online index build version store is for row versions from tables that have online index build operations on them.
  • The common version store is for row versions from all other tables in all databases.
  • The version store consists of append-only store units which are highly optimized for sequential inserts and random look up. Inserts into the version store do not generate log records.
  • Each unit can store many row versions. If there are versions to be stored, a new store unit is created about every minute.
In the following cases the versions of rows are generated for
  • SNAPSHOT isolation and read committed snapshot isolation (the versions are generated by DML operations in the database when the respective database options are enabled)
  • AFTER triggers (the versions are generated for all the update operations by the transaction that fired the trigger during the INSERT, DELETE, or UPDATE statement in any database, independent of database options. INSTEAD OF triggers do not generate versions)
  • MARS (the versions are generated by the UPDATE or DELETE statement when there is a pending SELECT on the same transaction and the same connection)
  • building an ONLINE index
User Objects: They do appear in catalog views such as 'sys.all_objects. Now some details about it:
  • The 'sp_spaceused' system stored procedure can show the size occupied by these objects.
  • User Objects include both user-defined tables and indexes, and system catalog tables and indexes.
  • Operations on User Objects in tempdb are mostly logged. Bulk copy program (BCP), bulk insert, SELECT INTO, and index rebuild operations are bulk logged.
  • User-defined tables include the global temporary tables such as ##t, and local temporary tables such as #t.
  • Local temporary tables also include table variables such as @t and the mapping index for online clustered index build with the SORT_IN_TEMPDB option.
Performance Enhancements in SQL Server 2005 or later
  1. The tempdb logging optimization avoids logging the “after value” in certain log records in tempdb.
  2. Instant data file initialization works by not zeroing out the NTFS file when the file is created or when the size of the file is increased.
  3. There is less use of the UP type page latch when allocating pages and extents in tempdb. Proportional fill has been optimized to reduce UP latch contention.
  4. Proportional fill has been optimized to reduce UP latch contention.
  5. There is now deferred drop in tempdb.
  6. Worktable caching is improved.
  7. SQL Server 2005 or later caches the temporary table that is created by using a CREATE TABLE or SELECT INTO statement.
Recommendations for managing
  1. The tempdb files must be configured with initial size and auto-growth based on your workloads. Do not let with the default sizes.
  2. The tempdb files must be located on RAID0 (for better performance) or RAID1 (if you need have more writes than reads) or RAID5 (if you have more reads than writes). RAID10 is the best option but no all companies can justify this.
  3. The tempdb files must be located on separated disks to avoid contention issues and improves the performance.
  4. Tempdb database must be created with one data file per physical processor if the quantity of CPUs is eight or less. If there are more then eight CPUs then you can start off with eight data files and after increase the number of files by four in case there was PAGELATCH contentions on one of the allocation bitmap pages (including PFS pages, GAM pages and SGAM pages) until the issue is solved, if not, add four files more, and so on.
  5. Do not forget, tempdb data files or log file should not grow so much in a short time, if this happens then something is working wrong, so you need to analyze the workloads performance and detect the query that you have to optimize.
That is all for now. Let me know any remarks you may have about tempdb database. Thanks for reading.

Friday 12 November 2010

SQL11 (Denali): OFFSET and FETCH (New options for ORDER BY)

It is known that ORDER BY clause has been improved in 'Denali' (code name of the upcoming version SQL Server 2012) adding more options as a result from looking for new solutions to work on specific business cases such as fetching top N rows per window of results from the result set which we used to do by combining TOP, OVER, and ORDER BY. Nevertheless, the performance of this approach was poor and not even close to being as good as OFFSET and FETCH in 'Denali'.

OFFSET keyword is used for excluding first N rows (default value is zero) whereas FETCH is used for reading/fetching M rows per each window of results starting from position (N+1). Therefore, OFFSET and FETCH allow to get only rows from position (N+1) to (N+M). I do recommend creating one Index (preferably Index Clustered) on ORDER BY columns and using these columns on WHERE clause as much as possible in order to avoid costly operations such as Index Scans or Table Scans. I have performed many tests on 10 millions of rows, and to be to honest, it worked very well. To illustrate, the following query will start fetching from position 4+1=5 (OFFSET position=4 for SQL Server) to the end of the table.

SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address   
FROM dbo.Customers    
ORDER BY CustomerID 

Now with DESC the results will be different (excluding last 4 rows)

SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address   
FROM dbo.Customers    

Basically, each execution of the query is done independently, which means that not only are the results sent back to the client side, but also immediately the resources are released. Here is another example how to get only the top 3 rows excluding first 4 rows (OFFSET=4), which means that it only reads the fifth, sixth, and seventh rows.

SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../  
FROM dbo.Customers    
ORDER BY CustomerID  

Having a need to read the next three rows (eighth, ninth, and tenth rows), we have to increase OFFSET value in three.
SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../  
FROM dbo.Customers    
ORDER BY CustomerID  
   OFFSET (7) ROWS ---ó OFFSET (7) ROW 

Here the script to read dynamically each result by using variables (and optionally OPTIMIZE FOR hint to optimise this Ad-Hoc query). 
SET @Start=0 

SET @Next=3
SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../  
FROM dbo.Customers    
ORDER BY CustomerID 
     OFFSET (@Start) ROWS    

 OPTION(OPTIMIZE FOR (@Start=4,@Next=5))
Now the same code inside an stored procedure.

CREATE PROCEDURE dbo.PageFetch(@PageNumber INT, @PageSize INT)
CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../  
FROM dbo.Customers    
ORDER BY CustomerID 
     OFFSET (@PageSize * (@PageNumber -1) ) ROWS

Querying the results for second, third and fifth windows of results with OFFSET.
EXEC dbo.PageFetch 2,3;
EXEC dbo.PageFetch 3,3;
EXEC dbo.PageFetch 5,3;

Luckily, OFFSET and FETCH are supported by subqueries, functions, derived tables, but not by indexed views and using directly with TOP, OVER, INSERT, UPDATE, MERGE, o DELETE (except inside of independent queries at lower level such as subqueries). Here an illustration.

SELECT  TOP(2) ROW_NUMBER() OVER(ORDER BY CustomerID) AS 'NRO', CompanyName, ContactName, ContactTitle  FROM (
        SELECT  CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../  
        FROM dbo.Customers    
        ORDER BY CustomerID  
           OFFSET (4) ROWS ---ó OFFSET (4) ROW 
) AS T1

Finally, using it with Views.

SELECT CustomerID, CompanyName, ContactName, ContactTitle, Address --/..*.../  
FROM dbo.Customers  
ORDER BY CustomerID 
   OFFSET (0) ROWS ---ó OFFSET (4) ROW 
SELECT CustomerID, CompanyName, ContactName FROM dbo.v1
WHERE  CustomerID LIKE 'AN%' 

To sum up, OFFSET and FETCH are excellent improvements which allow to implement solutions to fetch only specific pages/windows of results from the result set with ease. Let me know any remarks you may have. Thanks for reading!

Sunday 5 February 2006

Improving the performance by using 'Boost SQL Server Priority' option

Boost SQL Server Priority is one of the many options that we can configure at SQL Server instance level in order to improve the performance of the database engine by changing its priority running on Microsoft Windows Server.

By default, the value of this option is 0 (zero) which says to Windows Server to execute SQL Server under normal priority either running on one processor or symmetric multiprocessor environment. On the other hand, if we configure this value to 1, SQL Server will be executed with a major priority which might improve the performance of a database server. This option works very well on servers which are only dedicated to SQL Server. In contrast, it is not highly recommended using it when there are more applications running on the same server.

'Boost SQL Server Priority' option can be configured via either SQL Management Studio or T-SQL. In this post we are going to see how to configure it via the second way. Because this option is an Advanced Option, we firstly need to enable the possibility of accessing to it, which means that we have to enable 'show advanced options' option:
sys.sp_configure 'show advanced options', 1
Shortly after that, we must confirm the change without restating the database engine:
Having done that, we move on configuring 'Boost SQL Server Priority'.
EXEC sys.sp_configure N'priority boost', N'1'
Now we need to commit and override the change.
That is all we need to do in order to enable this option. It couldn't have been simpler.
To sum up, at this point I need to ask you to work with a lot of cautiousness as this configuration might impact on the whole database engine performance. If there were many SQL Server database engines running on the same server and this option was enabled only for some of them, the rest of them might be impacted detrimentally. Nevertheless, I personally do not recommend enabling it on machines running on only one processor or many database engines (or other applications). Thanks for reading. Stay tuned.
HELLO, I'M PERCY REYES! — a book lover, healthy lifestyle lover... I've been working as a senior SQL Server Database Administrator (DBA) for over 20 years; I'm a three-time awarded Microsoft Data Platform MVP. I'm currently doing a PhD in Computer Science (cryptography) at Loughborough University, England — working on cryptographic Boolean functions, algorithmic cryptanalysis, number theory, and other algebraic aspects of cryptography. READ MORE