Percy Reyes

Friday 26 May 2017

How to know which non-clustered indexes are not being used any longer

It is said that indexes are extremely useful to boost the performance of databases, especially indexes on huge tables. I am in agreement with that and I also support the idea of having the right indexes to achieve the best performance if they are created and used properly. Consequently, we have to vigilant of it, indexes may have been created to solve a particular problem in a moment, however, it does not mean that they are going to be useful forever but only for some days or months while data does not change too much. So, I should say as well that indexes are not the be-all and end-all for every business case, and we must always monitor the index usage and should consider the choose of dropping them if we verify they are not being using any longer. Keep in mind, not only could indexes be big and occupy a lot of space, but also impact on write operations negatively. Therefore, we should avoid wasting space, and having done that we can also optimise preventive indexes maintenance time.

This post will show a code to know exactly which non-clustered indexes are being used since the last restart of SQL Server engine. This code also filters out the indexes which are not related to primary keys. As you have seen in the code, we figure out the indexes by checking the value of columns user_seeks, user_scans, and user_lookups, if they are equal to zero then is says to us they were not used until this moment. Please be cautious, before using this code to get the indexes to be deleted, you should make sure that the database engine is not been restarted recently, otherwise, you will delete the wrong indexes and will impact on the performance tremendously. I strongly suggest using this code only if the latest restart of SQL Server has been done three months ago, more and less.

SELECT  i.type_desc,last_user_seek, last_user_scan, last_user_lookup,
        OBJECT_NAME(s.[object_id]) AS [Table Name] ,
        i.name AS [Index Name] ,
        i.index_id ,
        user_updates AS [Total Writes] ,
        user_seeks + user_scans + user_lookups AS [Total Reads] ,
        user_updates - ( user_seeks + user_scans + user_lookups ) AS [Difference]
, 'DROP INDEX ['+ schema_name(o.[schema_id]) +'].['+ OBJECT_NAME(s.[object_id]) +'].[' + i.name  + '];' as Statement_Drop_Index
,i.is_unique, i.is_unique_constraint, i.is_primary_key
FROM    sys.dm_db_index_usage_stats AS s WITH ( NOLOCK )
INNER JOIN sys.indexes AS i WITH ( NOLOCK )
INNER JOIN sys.objects o on o.object_id = i.object_id
    ON s.[object_id] = i.[object_id]  AND i.index_id = s.index_id
WHERE OBJECTPROPERTY(s.[object_id], 'IsUserTable') = 1
      AND s.database_id = DB_ID()     
      AND i.index_id > 1 and i.is_primary_key=0 and i.is_unique=0 and i.is_unique_constraint=0
      AND isnull(user_seeks, 0) + isnull(user_scans, 0) + isnull(user_lookups,0) =0 
ORDER BY  OBJECT_NAME(s.[object_id]), [Difference] DESC, [Total Writes] DESC,[Total Reads] ASC ;

When you execute it, you will also get a column containing the T-SQL code to delete the indexes which have not been used since the latest restart of SQL Server, and then you can execute it to delete them. To sum up, do not get me wrong, I am not against using indexes, on the contrary, I am in favour of using them properly, and if there were indexes that are not being using since a long time ago then they should be deleted. That's all for now. Let me know any remarks you may have. Thanks for reading.

Thursday 11 May 2017

ObjectExplorer error: Dedicated administrator connections are not supported

Dedicated Administrator Connection (DAC) is a type of privileged connection that we can establish to SQL Server database engine when it does not respond because of too much workload. In this context, DAC is truly useful which allow to diagnosis and solve this sort of problems without restarting the database server any longer. Its is said that it is not possible to establish a DAC connection to the Object Explorer panel via SSMS so that we can get the following error while trying do it in that way:

Cannot connect to admin:<SQLInstanceFullName>.
Dedicated administrator connections are not supported. (ObjectExplorer)

It is true up to a point, but it does not mean that SSMS does not support DAC connections. Therefore, the question is: can we ONLY establish a DAC connection to SQL Server by using SQLCMD? the answer is 'NO'. We also can do it via SSMS while creating a new SQL Query, which means that instead of clicking on 'Object Explorer' and 'Connect', we must click on 'New Query' and finally write 'admin:<Full Name of SQL Instance>' on 'Server Name' text box. That is all, it couldn't have been simpler.

Having done that, we will have a DAC connection ready via SSMS. Let me know any remarks you may have. Thanks for reading. Stay tuned.

Sunday 30 April 2017

Converting Unix Timestamp into SQL Server DateTime

It is well known that developing and integrating applications based on different technologies may end up being a huge challenge, especially when it comes to dealing with data stored in diverse data type formats as it is Timestamp in Unix and DateTime in SQL Server. This post intents to show you a method in SQL Server to convert Unix Timestamp into Datetime. Here it is:

CREATE FUNCTION [dbo].[UNIX_TIMESTAMP_TO_DATETIME] (
@timestamp integer
)
RETURNS datetime
AS
BEGIN
  DECLARE @return datetime
  SET @timestamp = @timestamp - 18000
  SELECT @return = DATEADD(second, @timestamp,{d '1970-01-01'});
   
  RETURN @return
END

The function takes as a parameter a Unix Timestamp value so that it is converted into Datetime, and it will then give you the equivalent value in SQL Server Datetime format. I know it might not be the be-all and end-all, nevertheless, I am pretty sure it will be use for many people. I hope you make the most out of it. That’s all for now. Let me know any remarks you may have. Thanks for reading. Stay tuned.

Wednesday 12 April 2017

Checking Disk Space usage and Database Files usage per partition disk

When it comes to monitoring and managing disk space we may have some graphic tools to help us complete these tasks rapidly, nevertheless, not everybody can afford to pay the tools, and most of them do not give some details we may need on the spur of the moment. For example, we may have the need to check how much disk space is occupied by database files exclusively so that a wise decision can be made to ensure we are using disk space properly of a database server. We may need to see a report something like this.

In the report we can appreciate that all partition disks are displayed, each of them has information related to the space used in total (db_file_size_on_disk_db) by all database files (data & log) and it also includes the equivalent in percentage (db_file_size_on_disk_pct). Thus, we will be able to know whether or not the disk space is being using properly. Furthermore, disk space capacity and disk space available in gigabytes and percentage can also seen (disk_capacity_gb, disk_space_available_gb, disk_space_available_pct).

Now it is high time to show the code to get that report containing that information:

CREATE PROC [dbo].[sp_DBA_mon_db_files_size_on_disk] (@volume_mount_point char(1)=NULL)
/*WITH ENCRYPTION*/
AS
begin
if (@volume_mount_point is not null )
begin
    SELECT    vs.volume_mount_point, 
    vs.logical_volume_name,   
    
    DB_NAME(f.database_id) database_name,  f.type_desc,
    vs.total_bytes/1024/1024/1024.0 AS [disk_capacity_gb],
    cast( f.[size] / 128 /1024.0 as decimal(10,2)) AS [db_file_size_on_disk_gb],  
    cast( ( f.[size]*100/ 128 /1024.0 ) /   (vs.total_bytes/1024/1024/1024.0 )  as DECIMAL(10,2) ) AS [db_file_size_on_disk_pct],
    cast(vs.available_bytes/1024/1024/1024.0 as DECIMAL(10,2)) AS [disk_space_available_gb],    
   cast ( ( vs.available_bytes *100.0  / vs.total_bytes) as DECIMAL(10,2) )  AS [disk_space_available_pct],
   f.physical_name,
   vs.file_system_type  
    FROM  sys.master_files AS f
    CROSS APPLY   sys.dm_os_volume_stats(f.database_id, f.file_id) AS vs
    WHERE   f.database_id < 32767 and  vs.volume_mount_point=@volume_mount_point + ':\'
    order  by f.[size]  desc
end
 else
  begin
    SELECT vs.volume_mount_point, 
        vs.logical_volume_name,                
        vs.total_bytes/1024/1024/1024.0 AS [disk_capacity_gb],
        cast(SUM( f.[size] / 128 /1024.0) as decimal(10,2)) AS [db_file_size_on_disk_gb],  
       cast( (SUM( f.[size]/ 128 /1024.0) *100) /   (vs.total_bytes/1024/1024/1024.0 )  as DECIMAL(10,2) ) AS [db_file_size_on_disk_pct],
        cast(vs.available_bytes/1024/1024/1024.0  as DECIMAL(10,2)) AS [disk_space_available_gb],    
       cast ( ( vs.available_bytes *100.0  / vs.total_bytes) as DECIMAL(10,2) )  AS [disk_space_available_pct],
       vs.file_system_type  
    FROM  sys.master_files AS f
    CROSS APPLY   sys.dm_os_volume_stats(f.database_id, f.file_id) AS vs
    WHERE   f.database_id < 32767
    group by  vs.volume_mount_point,vs.total_bytes ,   vs.logical_volume_name,  vs.total_bytes ,vs.file_system_type  ,  vs.available_bytes 
    
 end
end

The report could receive the letter of the partition disk as a parameter, and also it can be omitted. As you can see in the definition of the stored procedure, the default value for @volume_mount_point is NULL which means, according to the logic of it, that general information will be displayed, that is the report in the graphic, otherwise, giving the letter of a partition will allow you to get detail information for every database file located in that disk partition. That's is all for now. I hope it is useful and practical for you. Let me know any remarks you may have. Thanks for reading. Stay tuned.

Wednesday 5 April 2017

Looking into Modern_Spanish and Latin1_General

The collations Modern_Spanish and Latin1_General are both Windows collations and support the same character set (Code Page 1252), but there are some differences between both collations related to the treatment of some characters in terms of sorting/doing comparisons. Some of the differences are noticeable especially when Accent-Insensitive is used, in Modern_Spanish n and ñ are considered different characters whereas in Latin1_General they are considered a character and an accented version of the same character.

Today I am going to show off an example to make it clear so that you can take it on board to pre-empt unforeseen results or conflict of collations. This example is based on Modern_Spanish_CI_AI and Latin1_General_CI_AI.

create table TempTable (
   Text_in_Latin1  varchar(100) collate Latin1_General_CI_AI,
   Text_in_Modern varchar(100) collate Modern_Spanish_CI_AI
)
go
insert into TempTable values ('Español', 'Español')
insert into TempTable values ('Espanol', 'Espanol')
insert into TempTable values ('Espanól', 'Espanól')
go
select Text_in_Latin1 from TempTable where Text_in_Latin1='Espanol'
go
select Text_in_Modern from TempTable where Text_in_Modern='Espanol'
go
drop table TempTable

Text_in_Latin1
--------------------------------------
Español
Espanol
Espanól

Text_in_Modern
--------------------------------------
Espanol
Espanól

As you can see, the first result demonstrates that Español, Espanol and Espanól are treated as the same word when we use Accent-Insensitive whereas in the second one Español and Espanol are treated as different words. It is worth noting that you only must change the collation if you really understand the behaviour change. For example, one might think that Latin1_General_CI_AS (Windows Collation) and SQL_Latin1_General_CP1_CI_AS (SQL Collation) are the same, but they are not the same, there is a slight difference in the short behaviour between both for specific special characters like '-'. Here another example about it.

create table TempTable (
   Text_in_Latin1  varchar(100) collate Latin1_General_CI_AS,
   Text_in_SQL_Latin1 varchar(100) collate SQL_Latin1_General_CP1_CI_AS
)
go
insert into TempTable values ('es-pe', 'es-pe')
insert into TempTable values ('espe', 'espe')
go
select Text_in_Latin1 from TempTable order by Text_in_Latin1
go
select Text_in_SQL_Latin1 from TempTable order by Text_in_SQL_Latin1
go
drop table TempTable

Text_in_Latin1
---------------------------------------
espe
es-pe

Text_in_SQL_Latin1
---------------------------------------
es-pe
espe

To be honest, I personally like having the same collation for all databases, but it has to be done carefully and, more importantly, if you really want that change. That is all for now, let me know any remarks you may have.

Monday 26 December 2016

How to delete large quantity of data with no growth of the Transaction Log and no blocking issue

Without a shadow of a doubt, at times while working on optimising not only database performance but also hard disk usage, we have to face with challenging tasks which need to be completed rapidly without impacting on production environment. It is well-known that deleting data from large tables could be a truly hard task to complete as the much bigger they are the much more time they will take. Consequently, this task will also block the tables much time causing performance issues and stopping systems from working. Today's post is going to show one technique to delete historic data so as to only keep recent data and boost the performance without a hitch. (This technique applies for SQL Server 2005 Enterprise Edition up to SQL Server 2014 Enterprise. SQL Server 2016 supports TRUNCATE table with partitions.)

To begin with, the technique consists in using SQL Server partition switching which allows to access and transfer subsets of data quickly and efficiently. In this way, we may need to alter the large table and make it partitioned but in this example we are going to create a new one. Now let's say there is a need to keep only data from the latest three months. Here is the structure of the table 'LogTracker' which will be partitioned shortly after. It is also included a specific column named 'MonthNumber' to save the number of the month which at the same time will be the partition number.

CREATE TABLE [dbo].[LogTracker](
    [LogID] [uniqueidentifier] NOT NULL,
    [CreateDate] [datetime]  NULL,
    [ErrorDesc] [nvarchar](MAX)  NULL,
    [User] [nvarchar](40) NULL,
    [MonthNumber]  AS DATEPART(MONTH, [CreateDate]) PERSISTED NOT NULL
 CONSTRAINT [PK_LogTracker] PRIMARY KEY CLUSTERED 
  (
    [LogID] ASC,
    [MonthNumber] ASC
  )
)

There must also be a non-partitioned table (with the same structure) to move the data from the partitioned one. This table will hold the data during a very short time and then it will be truncated. Because it does not have any foreign keys or constraints the truncate operation will be completed successfully.

CREATE TABLE [dbo].[LogTracker_Temp](
    [LogID] [uniqueidentifier] NOT NULL,
    [CreateDate] [datetime]  NULL,
    [ErrorDesc] [nvarchar](MAX)  NULL,
    [User] [nvarchar](40) NULL,
    [MonthNumber]  AS DATEPART(MONTH, [CreateDate]) PERSISTED NOT NULL
 CONSTRAINT [PK_LogTracker_Temp] PRIMARY KEY CLUSTERED 
  (
    [LogID] ASC,
    [MonthNumber] ASC
  )
)

Now it is time to create the Function and the Scheme partition we will use to partition the table 'LogTracker'. For this particular example, the data will be classified by month, which means that each partition will contain data from one month.

CREATE PARTITION FUNCTION [upf_LogTracker](INT) AS RANGE LEFT FOR VALUES (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
GO
CREATE PARTITION SCHEME [ups_LogTracker] AS PARTITION [upf_LogTracker] ALL TO ([PRIMARY])
GO

Finally, we will partition the table.

ALTER TABLE dbo.LogTracker DROP CONSTRAINT PK_LogTracker
WITH (MOVE TO ups_LogTracker([MonthNumber]))
GO
ALTER TABLE dbo.LogTracker ADD CONSTRAINT PK_LogTracker PRIMARY KEY ([LogID],[MonthNumber])
GO

Having successfully completed everything, we are able to insert some data for the purpose of this illustration. Obviously, we can insert millions of rows and the performance of deletion by using this technique will be the same, it will practically be instantaneous. There is no difference between deleting some rows or millions as they are using switching partition which is essentially a logical transfer of data between two tables.

INSERT INTO LogTracker SELECT NEWID(), '2016-06-23 13:13:56', 'Error A Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-07-24 04:18:12', 'Error C Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-08-20 01:34:57', 'Error B Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-09-21 08:09:16', 'Error D Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-11-22 05:26:45', 'Error A Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-11-23 15:43:56', 'Error A Desc', 'preyes'
INSERT INTO LogTracker SELECT NEWID(), '2016-12-24 23:28:12', 'Error C Desc', 'preyes'

As you can see, the data is classified by month.

SELECT $PARTITION.[upf_LogTracker](MonthNumber),* FROM LogTracker

With this code we are going to delete the historic data and keep only the most recent three months. It moves out the data of each month from 'LogTracker' to the stage table 'LogTracker_Temp' which in no time is truncated. It starts moving data from the month number one until 'the current month - 3'.

DECLARE @M INT = CASE WHEN DATEPART(MONTH, GETDATE()) - 3 >0 THEN DATEPART(MONTH, GETDATE()) - 3  ELSE 12 + DATEPART(MONTH, GETDATE()) - 3 END
DECLARE @PartitionNumberToArchive INT=1
WHILE (@PartitionNumberToArchive<=@M)
BEGIN
    ALTER TABLE dbo.LogTracker SWITCH PARTITION @PartitionNumberToArchive TO dbo.LogTracker_temp
    TRUNCATE TABLE dbo.LogTracker_temp
    UPDATE STATISTICS dbo.LogTracker WITH FULLSCAN
    SET @PartitionNumberToArchive=@PartitionNumberToArchive+1
END

The previous code will finish in one o some seconds, instantaneously.

Naturally, the code needs to be scheduled and executed via a SQL Job at the end of each month. To be perfectly honest, not only will this deletion finish in one or two seconds, but also it will not cause any impact on the performance and the Transaction Log will not grow as it does by using traditionally techniques. I hope this post is truly useful for you and you make the most out of it. Let me know any remarks you may have. That is all for now. Thanks for reading.

Sunday 11 December 2016

SQL2016: DROP IF EXISTS and CREATE OR ALTER statements

Beyond all doubt, using old-fashioned ways of manipulating database objects is not only a wasting of time but also boring. Therefore, it is well-known that there are situations where we face with many errors while altering or dropping certain database objects because they do not exist or maybe they already are created. The traditional way of completing this task suggests writing a logic to validate first the existence of the object and then make a decision depending on the result, for instance, we usually do it by querying the object from the 'sys.all_objects' system view. Having said that, I am of the idea that this way of working is always a very unproductive task. Luckily, this situation has changed as now with the arrival of SQL Server 2016 we are able to drop/creater/alter database objects without thinking much about whether they are or not existing objects.

Let me expand on what I am saying. New language features of T-SQL have been introduced in SQL Server 2016 such as DROP IF EXISTS and CREATE OR ALTER statements. With DROP IF EXISTS we can drop some sorts of object and validate their existence in only one statement. The objects supported with DROP IF EXISTS are PROCEDURE, TABLE (included COLUMN and CONSTRAINT), TRIGGER, VIEW, FUNCTION, INDEX, ASSEMBLY, ROLE, AGGREGATE, RULE, TYPE, DATABASE, SCHEMA, USER, DEFAULT, SECURITY POLICY, SEQUENCE, and SYNONYM. Thus we do not need to do something like this any longer.

IF EXISTS (SELECT * FROM sys.objects WHERE name = 'MyProcNumberOne')
 DROP PROCEDURE MyProcNumberOne

Here are some examples of DROP IF EXISTS statement.

DROP PROCEDURE IF EXISTS [dbo].[MyProcNumberOne]
GO
DROP TABLE IF EXISTS [dbo].[MyTableDemo]
GO
DROP VIEW IF EXISTS [dbo].[MyViewTest]
GO
DROP VIEW IF EXISTS [dbo].[MyFunctionTwo]
GO
DROP INDEX IF EXISTS [dbo].[IX_MyTableDemo_04]

Now DROP IF EXISTS working on columns and constraints.

ALTER TABLE [dbo].[MyTableDemo] DROP COLUMN IF EXISTS Col2
GO
ALTER TABLE [dbo].[MyTableDemo] DROP CONSTRAINT IF EXISTS FK_MyTableDemo_10
GO

Now talking of the other new statement CREATE OR ALTER, it supports STORED PROCEDURES (including natively compiled), FUNCTIONS (including natively compiled), TRIGGERS, and VIEWS. So we are not going to need the traditional validation any more.

IF NOT EXISTS (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[MyProcNumberOne]')
CREATE PROCEDURE dbo.mysproc (@Par1 INT, @Par2 VARCHAR(10), ... )
AS
BEGIN
...
..

CREATE OR ALTER statement is also easy to use. If the object exists then it will be altered, otherwise it will be created. I couldn't have been simpler.

CREATE OR ALTER [dbo].[MyProcNumberOne] (@Par1 INT, @Par2 VARCHAR(10), ... )
AS
BEGIN
...
..

To be perfectly honest, I do think that these new language features are the be-all and end-all as they are practical, simple and allow us to work in the sense of improving the quality of deploying new business functionalities and integration of systems. Surely, you will make the most out of these features. That is all for now. Let me know any remarks you may have. Thanks for reading.

Wednesday 7 December 2016

New per-operator level performance stats included in SQL2016 SP1 and SQL2014 SP2

Traditionally, we were used to getting only basic runtime performance stats per operator regarding to CPU, I/O and Memory, which were not good enough to allow us to automate recollecting tasks of them. I strongly believe that DBAs are always overwhelmingly interested in diagnosing some issues proactively so as to prevent bad-written queries from using hardware resources inefficiently. Luckily, this convoluted situation has been changed radically in SQL Server 2016 and SQL Server 2014 SP2 since Microsoft has included very useful information per operator (and also per thread in the scope of each operator) which can be seen looking into the Showplan XML. For instance, we are going to see that information for the following query (in this case it has run under a serial plan, that is one thread).

This is its actual execution plan.

It is highlighted all what is new in SQL2016, SQL2016 SP1 and SQL2014 SP2. Therefore, we can appreciate CPU and I/O runtime performance stats for the Clustered Index Scan operator on 'PurchaseOrderDetail' table such as I/O reads, CPU time and Elapsed time. In a parallel plan you will be able to see information per thread in each operator.

And now we will see not only CPU and I/O performance stats for the Hash operator but also Memory performance stats.

All this information can also be seen in the Properties window in the context of each operator inside the Actual Execution Plan by using the latest version of SSMS. If you display the detail of each statistic, the detail per thread will be seen when working with parallel execution plan.

Having seen this, not only are we now able to look into useful performance stats by using Showplan XML and SSMS, but also we can create some scripts for administration and optimization purposes in order to pick up queries with poor performance and then optimise them. That is all for now. Let me know any remark you may have. Thanks for reading.