Percy Reyes

Monday, 26 December 2016

How to delete large quantity of data with no growth of the Transaction Log and no blocking issue

Without a shadow of a doubt, at times while working on optimising not only database performance but also hard disk usage, we have to face with challenging tasks which need to be completed rapidly without impacting on production environment. It is well-known that deleting data from large tables could be a truly hard task to complete as the much bigger they are the much more time they will take. Consequently, this task will also block the tables much time causing performance issues and stopping systems from working. Today's post is going to show one technique to delete historic data so as to only keep recent data and boost the performance without a hitch. (This technique applies for SQL Server 2005 Enterprise Edition up to SQL Server 2014 Enterprise. SQL Server 2016 supports TRUNCATE table with partitions.)

To begin with, the technique consists in using SQL Server partition switching which allows to access and transfer subsets of data quickly and efficiently. In this way, we may need to alter the large table and make it partitioned but in this example we are going to create a new one. Now let's say there is a need to keep only data from the latest three months. Here is the structure of the table 'LogTracker' which will be partitioned shortly after. It is also included a specific column named 'MonthNumber' to save the number of the month which at the same time will be the partition number.

CREATE TABLE [dbo].[LogTracker](
    [LogID] [uniqueidentifier] NOT NULL,
    [CreateDate] [datetime]  NULL,
    [ErrorDesc] [nvarchar](MAX)  NULL,
    [User] [nvarchar](40) NULL,
    [MonthNumber]  AS DATEPART(MONTH, [CreateDate]) PERSISTED NOT NULL
 CONSTRAINT [PK_LogTracker] PRIMARY KEY CLUSTERED 
  (
    [LogID] ASC,
    [MonthNumber] ASC
  )
)

There must also be a non-partitioned table (with the same structure) to move the data from the partitioned one. This table will hold the data during a very short time and then it will be truncated. Because it does not have any foreign keys or constraints the truncate operation will be completed successfully.

CREATE TABLE [dbo].[LogTracker_Temp](
    [LogID] [uniqueidentifier] NOT NULL,
    [CreateDate] [datetime]  NULL,
    [ErrorDesc] [nvarchar](MAX)  NULL,
    [User] [nvarchar](40) NULL,
    [MonthNumber]  AS DATEPART(MONTH, [CreateDate]) PERSISTED NOT NULL
 CONSTRAINT [PK_LogTracker_Temp] PRIMARY KEY CLUSTERED 
  (
    [LogID] ASC,
    [MonthNumber] ASC
  )
)

Now it is time to create the Function and the Scheme partition we will use to partition the table 'LogTracker'. For this particular example, the data will be classified by month, which means that each partition will contain data from one month.

CREATE PARTITION FUNCTION [upf_LogTracker](INT) AS RANGE LEFT FOR VALUES (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
GO
CREATE PARTITION SCHEME [ups_LogTracker] AS PARTITION [upf_LogTracker] ALL TO ([PRIMARY])
GO

Finally, we will partition the table.

ALTER TABLE dbo.LogTracker DROP CONSTRAINT PK_LogTracker
WITH (MOVE TO ups_LogTracker([MonthNumber]))
GO
ALTER TABLE dbo.LogTracker ADD CONSTRAINT PK_LogTracker PRIMARY KEY ([LogID],[MonthNumber])
GO

Having successfully completed everything, we are able to insert some data for the purpose of this illustration. Obviously, we can insert millions of rows and the performance of deletion by using this technique will be the same, it will practically be instantaneous. There is no difference between deleting some rows or millions as they are using switching partition which is essentially a logical transfer of data between two tables.

INSERT INTO LogTracker SELECT NEWID(), '2016-06-23 13:13:56', 'Error A Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-07-24 04:18:12', 'Error C Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-08-20 01:34:57', 'Error B Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-09-21 08:09:16', 'Error D Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-11-22 05:26:45', 'Error A Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-11-23 15:43:56', 'Error A Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-12-24 23:28:12', 'Error C Desc', 'preyes'

As you can see, the data is classified by month.

SELECT $PARTITION.[upf_LogTracker](MonthNumber),* FROM LogTracker

With this code we are going to delete the historic data and keep only the most recent three months. It moves out the data of each month from 'LogTracker' to the stage table 'LogTracker_Temp' which in no time is truncated. It starts moving data from the month number one until 'the current month - 3'.

DECLARE @M INT = CASE WHEN DATEPART(MONTH, GETDATE()) - 3 >0 THEN DATEPART(MONTH, GETDATE()) - 3  ELSE 12 + DATEPART(MONTH, GETDATE()) - 3 END
DECLARE @PartitionNumberToArchive INT=1
WHILE (@PartitionNumberToArchive<=@M)
BEGIN
    ALTER TABLE dbo.LogTracker SWITCH PARTITION @PartitionNumberToArchive TO dbo.LogTracker_temp
    TRUNCATE TABLE dbo.LogTracker_temp
    UPDATE STATISTICS dbo.LogTracker WITH FULLSCAN
    SET @PartitionNumberToArchive=@PartitionNumberToArchive+1
END

The previous code will finish in one o some seconds, instantaneously.

Naturally, the code needs to be scheduled and executed via a SQL Job at the end of each month. To be perfectly honest, not only will this deletion finish in one or two seconds, but also it will not cause any impact on the performance and the Transaction Log will not grow as it does by using traditionally techniques. I hope this post is truly useful for you and you make the most out of it. Let me know any remarks you may have. That is all for now. Thanks for reading.

Sunday, 11 December 2016

SQL2016: DROP IF EXISTS and CREATE OR ALTER statements

Beyond all doubt, using old-fashioned ways of manipulating database objects is not only a wasting of time but also boring. Therefore, it is well-known that there are situations where we face with many errors while altering or dropping certain database objects because they do not exist or maybe they already are created. The traditional way of completing this task suggests writing a logic to validate first the existence of the object and then make a decision depending on the result, for instance, we usually do it by querying the object from the 'sys.all_objects' system view. Having said that, I am of the idea that this way of working is always a very unproductive task. Luckily, this situation has changed as now with the arrival of SQL Server 2016 we are able to drop/creater/alter database objects without thinking much about whether they are or not existing objects.

Let me expand on what I am saying. New language features of T-SQL have been introduced in SQL Server 2016 such as DROP IF EXISTS and CREATE OR ALTER statements. With DROP IF EXISTS we can drop some sorts of object and validate their existence in only one statement. The objects supported with DROP IF EXISTS are PROCEDURE, TABLE (included COLUMN and CONSTRAINT), TRIGGER, VIEW, FUNCTION, INDEX, ASSEMBLY, ROLE, AGGREGATE, RULE, TYPE, DATABASE, SCHEMA, USER, DEFAULT, SECURITY POLICY, SEQUENCE, and SYNONYM. Thus we do not need to do something like this any longer.

IF EXISTS (SELECT * FROM sys.objects WHERE name = 'MyProcNumberOne')
 DROP PROCEDURE MyProcNumberOne

Here are some examples of DROP IF EXISTS statement.

DROP PROCEDURE IF EXISTS [dbo].[MyProcNumberOne]
GO
DROP TABLE IF EXISTS [dbo].[MyTableDemo]
GO
DROP VIEW IF EXISTS [dbo].[MyViewTest]
GO
DROP VIEW IF EXISTS [dbo].[MyFunctionTwo]
GO
DROP INDEX IF EXISTS [dbo].[IX_MyTableDemo_04]

Now DROP IF EXISTS working on columns and constraints.

ALTER TABLE [dbo].[MyTableDemo] DROP COLUMN IF EXISTS Col2
GO
ALTER TABLE [dbo].[MyTableDemo] DROP CONSTRAINT IF EXISTS FK_MyTableDemo_10
GO

Now talking of the other new statement CREATE OR ALTER, it supports STORED PROCEDURES (including natively compiled), FUNCTIONS (including natively compiled), TRIGGERS, and VIEWS. So we are not going to need the traditional validation any more.

IF NOT EXISTS (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[MyProcNumberOne]')
CREATE PROCEDURE dbo.mysproc (@Par1 INT, @Par2 VARCHAR(10), ... )
AS
BEGIN
...
..

CREATE OR ALTER statement is also easy to use. If the object exists then it will be altered, otherwise it will be created. I couldn't have been simpler.

CREATE OR ALTER [dbo].[MyProcNumberOne] (@Par1 INT, @Par2 VARCHAR(10), ... )
AS
BEGIN
...
..

To be perfectly honest, I do think that these new language features are the be-all and end-all as they are practical, simple and allow us to work in the sense of improving the quality of deploying new business functionalities and integration of systems. Surely, you will make the most out of these features. That is all for now. Let me know any remarks you may have. Thanks for reading.

Wednesday, 7 December 2016

New per-operator level performance stats included in SQL2016 SP1 and SQL2014 SP2

Traditionally, we were used to getting only basic runtime performance stats per operator regarding to CPU, I/O and Memory, which were not good enough to allow us to automate recollecting tasks of them. I strongly believe that DBAs are always overwhelmingly interested in diagnosing some issues proactively so as to prevent bad-written queries from using hardware resources inefficiently. Luckily, this convoluted situation has been changed radically in SQL Server 2016 and SQL Server 2014 SP2 since Microsoft has included very useful information per operator (and also per thread in the scope of each operator) which can be seen looking into the Showplan XML. For instance, we are going to see that information for the following query (in this case it has run under a serial plan, that is one thread).

This is its actual execution plan.

It is highlighted all what is new in SQL2016, SQL2016 SP1 and SQL2014 SP2. Therefore, we can appreciate CPU and I/O runtime performance stats for the Clustered Index Scan operator on 'PurchaseOrderDetail' table such as I/O reads, CPU time and Elapsed time. In a parallel plan you will be able to see information per thread in each operator.

And now we will see not only CPU and I/O performance stats for the Hash operator but also Memory performance stats.

All this information can also be seen in the Properties window in the context of each operator inside the Actual Execution Plan by using the latest version of SSMS. If you display the detail of each statistic, the detail per thread will be seen when working with parallel execution plan.

Having seen this, not only are we now able to look into useful performance stats by using Showplan XML and SSMS, but also we can create some scripts for administration and optimization purposes in order to pick up queries with poor performance and then optimise them. That is all for now. Let me know any remark you may have. Thanks for reading.

Monday, 14 November 2016

How to create new Logins and Users for an AlwaysOn Availability Group

(This tip also applies for Log Shipping and Database Mirroring)

Surely, after setting up an AlwaysOn Availability Group (AG) we are going to need to create additional Logins and Users who must have access and the same permissions across the replica servers so that users can work on each server properly. At times it may end up being a confusing task as SQL Logins may not have matched to the correct SQL Users and therefore users will only access to the SQL instance but not databases in the Secondary Replicas. Do not forget that we cannot create Users directly in Secondary Replicas as they are only read-only, read intent or no access.

To begin with, we need to understand some basic concepts. Logins and Users in SQL Server are different things. Logins allow to access only to the SQL instance and perform tasks at that level such as create/alter/drop databases, alter configurations of the SQL instance, etc. whereas Users allow to get into the databases and work inside them doing delete/update/insert data, create/alter/drop tables, stored procedures, triggers, etc. So, traditionally we always need to have a Login and a User matched each other so as to have access to databases.(Since SQL Server 2012 it is possible to create SQL User without a SQL Login but this aspect is not covered in this post.)

In this context, when it comes to creating new Logins and Users not only will we need to create them in the Primary Replica, but also we have to make sure they also are created in each Secondary Replica so that Users and Applications can log on successfully with the same permissions as they have in the Primary Replica. So we might get some big issues if they are not matched correctly.

Obviously, we need to create the new Login at Primary Replica level and then give the permissions inside the databases. Consequently, ONLY the user will be replicated immediately to each Secondary Replica. So, because the Login is NOT a database object it will not be replicated (it is a server object), which means that we need to create the Login in each Secondary Replica server manually. If the Login is a Windows Login then all what we need to do is to create that Windows Login in each Secondary Replica. A Windows login only has a SID across the whole domain so it will be automatically matched to its Windows User with the same name in each database in the Secondary Replica server. We do not need to know its SID before creating it.

USE [master]
GO
CREATE LOGIN [DOMAIN\percy.reyes] FROM WINDOWS WITH DEFAULT_DATABASE=[master]
GO

Now when it comes to working with SQL Logins and SQL Users we must figure out and take the correct SID of the SQL Login in order to use it to create the same SQL Login in each Secondary Replica server. Therefore, we can query the SID of the SQL Login we need by executing the following code like this.

Having got it, we can use it to create the SQL Login by indicating the correct SID. (Do not forget to use the same password and other properties too.)

USE [master]
GO
CREATE LOGIN [preyes] WITH  PASSWORD=N'P4s$Word&1', SID= 0x8AA780458D0F9E4F95D4204B8E022C83, CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF
GO

In conclusion, only after successfully following this technique will the SQL Login match automatically to the correct user inside the database. It also means that the user will be able to log on any Secondary Replica server with the right permissions inside the databases. That is all for now. Let me know any remarks you may have. Thanks for reading.

Friday, 28 October 2016

How to monitor database file space usage in SQL Server

Undoubtedly, when it comes to monitoring database file space usage we need a simple way to do it. Fortunately, SQL Server has considerable quantity of DMV and System Functions to allows us to get closely related information about it. Today's post has the intention of providing a practical way of warning us in case the free database file space is less than the threshold of 20%. With this alert in mind, we will be able to make a preventive decision in order to prevent the performance of in-flight transactions from being impacted.

I am coming up with a simple query to achieve that. Firstly, we need to create a physical temporary table 'FileSpace' which will hold related information for each database. Having successfully creating the table, the following step is to execute the adequate logic, in the context of each database user via 'sp_msforeachdb', to collect file space usage which will be reported in the end. I am getting space usage information of each database file by using FILEPROPERTY system function.

The execution of the entire script (including creation of the table) may be automated via a SQL Job in order to send yourself an alert with format HTML by using database mail. Unluckily, this code does not include the logic to send that email but it is not something difficult to complete, therefore, just do it. That is all for now. Let me know any remarks you may have. Thanks for reading.

Here is the whole script.

USE [master]
GO
CREATE TABLE [dbo].[FileSpace](
    [DatabaseName] [nvarchar](128) NULL,
    [FileName] [sysname] NOT NULL,
    Type varchar(10) not null,
    [FileSizeGB] [decimal](10, 2) NULL,
    [SpaceUsedGB] [decimal](10, 1) NULL,
    [SpaceFreeGB] [decimal](10, 1) NULL,
    [SpaceFree%] [decimal](10, 1) NULL,
    Physical_Name varchar(max)
) ON [PRIMARY]
GO
 
EXEC sp_MSforeachdb '
USE [?]
insert into master.DBO.[FileSpace]([DatabaseName] ,    [FileName] , Type,    [FileSizeGB],    [SpaceUsedGB] ,    [SpaceFreeGB] ,    [SpaceFree%],Physical_Name )
select db_name() DatabaseName, name FileName, Type_Desc ,
cast(size/128.0/1024.0 as decimal(10,2)) SizeGB, 
cast(FILEPROPERTY(name,''SpaceUsed'') /128.0/1024.0  as decimal(10,1)) SpaceUsedGB, 
cast((size - FILEPROPERTY(name,''SpaceUsed'') ) /128.0/1024.0 as decimal(10,1)) SpaceFreeGB, 
cast(((size - FILEPROPERTY(name,''SpaceUsed'') )/(size*1.0)) *100 as decimal(10,1)) [SpaceFree%], physical_name
FROM sys.database_files'

SELECT * FROM FileSpace
WHERE [SpaceFree%]<=20
DROP TABLE FileSpace

Pages