Blog Archives

A viable alternative to dynamic SQL in administration scripts


As a DBA, you probably have your toolbox full of scripts, procedures and functions that you use for the day-to-day administration of your instances.

I’m no exception and my hard drive is full of scripts that I tend to accumulate and never throw away, even if I know I will never need (or find?) them again.

However, my preferred way to organize and maintain my administration scripts is a database called “TOOLS”, which contains all the scripts I regularly use.

One of the challenges involved in keeping the scripts in a database rather than in a script file is the inability to choose a database context for the execution. When a statement is encapsulated in a view, function or stored procedure in a database, every reference to a database-specific object is limited to the database that contains the programmable object itself. The only way to overcome this limitation is the use of dynamic sql.

For instance, if I want to query the name of the tables in a database, I can use the following statement:

SELECT name FROM sys.tables

The statement references a catalog view specific to a single database, so if I enclose it in a stored procedure, the table names returned by this query are those found in the database that contains the stored procedure itself:

USE TOOLS;
GO

CREATE PROCEDURE getTableNames
AS
SELECT name FROM sys.tables;
GO

EXEC getTableNames;

This is usually not an issue, since most stored procedures will not cross the boundaries of the database they are created in. Administration scripts are different, because they are meant as a single entry point to maintain the whole SQL server instance.

In order to let the statement work against a different database, you need to choose one of the following solutions:

  1. dynamic SQL
  2. sp_executesql
  3. marking as a system object
  4. … an alternative way

Each of these techniques has its PROs and its CONs and I will try to describe them in this post.

1. Dynamic SQL

It’s probably the easiest way to solve the issue: you just have to concatenate the database name to the objects names.

USE TOOLS;
GO

ALTER PROCEDURE getTableNames @db_name sysname
AS
BEGIN
    DECLARE @sql nvarchar(max)
    SET @sql = 'SELECT name FROM '+ QUOTENAME(@db_name) +'.sys.tables';

    EXEC(@sql)
END
GO

EXEC getTableNames 'msdb';

PROS:

  • very easy to implement for simple statements

CONS:

  • can rapidly turn to a nightmare with big, complicated statements, as each object must be concatenated with the database name. Different objects have different ways to be related to the database: tables and views can be concatenated directly, while functions such as OBJECT_NAME accept an additional parameter to specify the database name.
  • the statement has to be treated as a string and enclosed in quotes, which means that:
    • quotes must be escaped, and escaped quotes must be escaped again and escaped and re-escaped quotes… ok, you know what I mean
    • no development aids such as intellisense, just-in-time syntax checks and syntax coloring

 

2. sp_executesql

It’s a neater way to avoid concatenating the database name to each object referenced in the statement.

USE TOOLS;
GO

ALTER PROCEDURE getTableNames @db_name sysname
AS
BEGIN

    -- use a @sql variable to store the whole query
    -- without concatenating the database name

    DECLARE @sql nvarchar(max);

    SET @sql = 'SELECT name FROM sys.tables';

    -- concatenate the database name to the
    -- sp_executesql call, just once

    DECLARE @cmd nvarchar(max);

    SET @cmd = 'EXEC '+ QUOTENAME(@db_name) +'.sys.sp_executesql @sql';

    EXEC sp_executesql @cmd, N'@sql nvarchar(max)', @sql

END
GO

EXEC getTableNames 'msdb';

PROS:

  • the dynamic sql is taken as a whole and does not need to be cluttered with multiple concatenations

CONS:

  • needs some more work than a straight concatenation and can be seen as “obscure”
  • suffers from the same issues found with plain dynamic sql, because the statement is, again, treated as a string

3. System object

Nice and easy: every stored procedure you create in the master database with the “sp_” prefix can be executed from any database context.

Using the undocumented stored procedure sp_MS_marksystemobject you can also mark the stored procedure as a “system object” and let it reference the tables in the database from which it is invoked.

USE master;
GO

ALTER PROCEDURE sp_getTableNames
AS
BEGIN
    SELECT name FROM sys.tables
END
GO

EXEC sys.sp_MS_marksystemobject 'sp_getTableNames'
GO

USE msdb;
GO

EXEC sp_getTableNames;

PROS:

  • no need to use dynamic sql

CONS:

  • requires creating objects in the “master” database, which is something I tend to avoid
  • works with stored procedures only (actually, it works with other objects, such as tables and views, but you have to use the “sp_” prefix. The day I will find a view named “sp_getTableNames” in the master database it won’t be safe to stay near me)

An alternative method:

It would be really helpful if we could store the statement we want to execute inside an object that doesn’t involve dynamic sql and doesn’t need to be stored in the master database. In other words, we need a way to get the best of both worlds.

Is there such a solution? Apparently, there isn’t.

The ideal object to store a statement and reuse it later is a view, but there is no way to “execute” a view against a different database. In fact you don’t execute a view, you just select from it, which is quite a different thing.

What you “execute” when you select from a view is the statement in its definition (not really, but let me simplify).

So, what we would need to do is just read the definition from a view and use the statement against the target database. Sounds straightforward, but it’s not.

The definition of a view also contains the “CREATE VIEW” statement and stripping it off is not just as easy as it seems.

Let’s see the issue with an example: I will create a view to query the last update date of the index statistics in a database, using the query from Glenn Berry’s Diagnostic Queries.

USE TOOLS;
GO

-- When were Statistics last updated on all indexes?  (Query 48)
CREATE VIEW statisticsLastUpdate
AS
SELECT
     DB_NAME() AS database_name
    ,o.NAME AS stat_name
    ,i.NAME AS [Index Name]
    ,STATS_DATE(i.[object_id], i.index_id) AS [Statistics Date]
    ,s.auto_created
    ,s.no_recompute
    ,s.user_created
    ,st.row_count
FROM sys.objects AS o WITH (NOLOCK)
INNER JOIN sys.indexes AS i WITH (NOLOCK)
    ON o.[object_id] = i.[object_id]
INNER JOIN sys.stats AS s WITH (NOLOCK)
    ON i.[object_id] = s.[object_id]
    AND i.index_id = s.stats_id
INNER JOIN sys.dm_db_partition_stats AS st WITH (NOLOCK)
    ON o.[object_id] = st.[object_id]
    AND i.[index_id] = st.[index_id]
WHERE o.[type] = 'U';

I just had to remove ORDER BY and OPTION(RECOMPILE) because query hints cannot be used in views.

Querying the object definition returns the whole definition of the view, not only the SELECT statement:

SELECT OBJECT_DEFINITION(OBJECT_ID('statisticsLastUpdate')) AS definition
definition
-------------------------------------------------------------------
-- When were Statistics last updated on all indexes?  (Query 48)
CREATE VIEW statisticsLastUpdate
AS
SELECT
     DB_NAME() AS database_name
    ,o.NAME AS stat_name
    ,i.NAME AS [Index Name]
    ,STATS_DATE(i.[object_id], i.index_id) AS [Statistics Date]
    ,s.auto_created
    ,s.no_recompute

(1 row(s) affected)

In order to extract the SELECT statement, we would need something able to parse (properly!) the view definition and we all know how complex it can be.

Fortunately, SQL Server ships with an undocumented function used in replication that can help solving the problem: its name is fn_replgetparsedddlcmd.

This function accepts some parameters, lightly documented in the code: fn_replgetparsedddlcmd (@ddlcmd, @FirstToken, @objectType, @dbname, @owner, @objname, @targetobject)

Going back to the example, we can use this function to extract the SELECT statement from the view definition:

SELECT master.sys.fn_replgetparsedddlcmd(
    OBJECT_DEFINITION(OBJECT_ID('statisticsLastUpdate'))
        ,'CREATE'
        ,'VIEW'
        ,DB_NAME()
        ,'dbo'
        ,'statisticsLastUpdate'
        ,NULL
) AS statement
statement
---------------------------------------------------------------------
AS
SELECT
     DB_NAME() AS database_name
    ,o.NAME AS stat_name
    ,i.NAME AS [Index Name]
    ,STATS_DATE(i.[object_id], i.index_id) AS [Statistics Date]
    ,s.auto_created
    ,s.no_recompute
    ,s.user_created
    ,st.row_count

(1 row(s) affected)

The text returned by the function still contains the “AS” keyword, but removing it is a no-brainer:

DECLARE @cmd nvarchar(max)
SELECT @cmd = master.sys.fn_replgetparsedddlcmd(
    OBJECT_DEFINITION(OBJECT_ID('statisticsLastUpdate'))
        ,'CREATE'
        ,'VIEW'
        ,DB_NAME()
        ,'dbo'
        ,'statisticsLastUpdate'
        ,NULL
    )

SELECT @cmd = RIGHT(@cmd, LEN(@cmd) - 2) -- Removes "AS"

SELECT @cmd AS statement
statement
-------------------------------------------------------------------

SELECT
     DB_NAME() AS database_name
    ,o.NAME AS stat_name
    ,i.NAME AS [Index Name]
    ,STATS_DATE(i.[object_id], i.index_id) AS [Statistics Date]
    ,s.auto_created
    ,s.no_recompute
    ,s.user_created
    ,st.row_count

(1 row(s) affected)

Now that we are able to read the SELECT statement from a view’s definition, we can execute that statement against any database we like, or even against all the databases in the instance.

-- =============================================
-- Author:      Gianluca Sartori - spaghettidba
-- Create date: 2013-04-16
-- Description: Extracts the view definition
--              and runs the statement in the
--              database specified by @db_name
--              If the target database is a pattern,
--              the statement gets executed against
--              all databases matching the pattern.
-- =============================================

CREATE PROCEDURE [dba_execute_view]
     @view_name sysname
    ,@db_name sysname
AS
BEGIN

SET NOCOUNT,
    XACT_ABORT,
    QUOTED_IDENTIFIER,
    ANSI_NULLS,
    ANSI_PADDING,
    ANSI_WARNINGS,
    ARITHABORT,
    CONCAT_NULL_YIELDS_NULL ON;
    SET NUMERIC_ROUNDABORT OFF;

DECLARE @cmd nvarchar(max)
DECLARE @sql nvarchar(max)

DECLARE @vw_schema sysname
DECLARE @vw_name sysname

IF OBJECT_ID(@view_name) IS NULL
BEGIN
    RAISERROR('No suitable object found for name %s',16,1,@view_name)
    RETURN
END

IF DB_ID(@db_name) IS NULL
    AND @db_name NOT IN ('[USER]','[SYSTEM]')
    AND @db_name IS NOT NULL
BEGIN
    RAISERROR('No suitable database found for name %s',16,1,@view_name)
    RETURN
END

SELECT @vw_schema = OBJECT_SCHEMA_NAME(OBJECT_ID(@view_name)),
    @vw_name = OBJECT_NAME(OBJECT_ID(@view_name))

SELECT @cmd = master.sys.fn_replgetparsedddlcmd(
    OBJECT_DEFINITION(OBJECT_ID(@view_name))
        ,'CREATE'
        ,'VIEW'
        ,DB_NAME()
        ,@vw_schema
        ,@vw_name
        ,NULL
    )

SELECT @cmd = RIGHT(@cmd, LEN(@cmd) - 2) -- Removes "AS"

-- CREATE A TARGET TEMP TABLE
SET @sql = N'
    SELECT TOP(0) * INTO #results FROM ' + @view_name + ';

    INSERT #results
    EXEC [dba_ForEachDB]
        @statement = @cmd,
        @name_pattern = @db_name;

    SELECT * FROM #results;'

EXEC sp_executesql
     @sql
    ,N'@cmd nvarchar(max), @db_name sysname'
    ,@cmd
    ,@db_name

END

The procedure depends on dba_ForEachDB, the stored procedure I posted a couple of years ago that replaces the one shipped by Microsoft. If you still prefer their version, you’re free to modify the code as you wish.

Now that we have a stored procedure that “executes” a view, we can use it to query statistics update information from a different database:

EXEC [dba_execute_view] 'statisticsLastUpdate', 'msdb'

single_db

We could also query the same information from all user databases:

EXEC [dba_execute_view] 'statisticsLastUpdate', '[USER]'

multidb

That’s it, very easy and straightforward.

Just one suggestion for the SELECT statements in the views: add a DB_NAME() column, in order to understand where the data comes from, or it’s going to be a total mess.

Next steps:

This is just the basic idea, the code can be improved in many ways.

For instance, we could add a parameter to decide whether the results must be piped to a temporary table or not. As you probably know, INSERT…EXEC cannot be nested, so you might want to pipe the results to a table in a different way.

Another thing you might want to add is the ability to order the results according to an additional parameter.

To sum it up, with a little help from Microsoft, we can now safely create a database packed with all our administration stuff and execute the queries against any database in our instance.

Moving system databases to the default data and log paths


Recently I had to assess and tune quite a lot of SQL Server instances and one the things that are often overlooked is the location of the system databases.

I often see instance where the system databases are located in the system drives under the SQL Server default installation path, which is bad for many reasons, especially for tempdb.

I had to move the system databases so many times that I ended up coding a script to automate the process.

The script finds all system databases that are not sitting in the default data and log paths and issues the ALTER DATABASE statements needed to move the files to the default paths.

Obviously, to let the script work, the default data and log paths must have been set in the instance properties:

MoveSystemDBs

You may also point out that moving all system databases to the default data and log paths is not always a good idea. And you would be right: for instance, if possible, the tempdb database should be working on a fast dedicated disk. However, very often I find myself dealing with low-end servers where separate data and log disks are a luxury, not to mention a dedicated tempdb disk.  If you are concerned about moving tempd to the default data and log paths, you can modify the script accordingly.

-- =============================================
-- Author:      Gianluca Sartori - spaghettidba
-- Create date: 2013-03-22
-- Description: Moves the system databases to the
--              default data and log paths and 
--              updates SQL Server startup params
--              accordingly.
-- =============================================
SET NOCOUNT ON;

USE master;

-- Find default data and log paths
-- reading from the registry

DECLARE @defaultDataLocation nvarchar(4000)
DECLARE @defaultLogLocation nvarchar(4000)

EXEC master.dbo.xp_instance_regread
    N'HKEY_LOCAL_MACHINE',
    N'Software\Microsoft\MSSQLServer\MSSQLServer',
    N'DefaultData',
    @defaultDataLocation OUTPUT

EXEC master.dbo.xp_instance_regread
    N'HKEY_LOCAL_MACHINE',
    N'Software\Microsoft\MSSQLServer\MSSQLServer',
    N'DefaultLog',
    @defaultLogLocation OUTPUT

-- Loop through all system databases
-- and move to the default data and log paths

DECLARE @sql nvarchar(max)

DECLARE stmts CURSOR STATIC LOCAL FORWARD_ONLY
FOR
SELECT
    ' ALTER DATABASE '+ DB_NAME(database_id) +
    ' MODIFY FILE ( ' +
    '     NAME = '''+ name +''', ' +
    '     FILENAME = '''+
    CASE type_desc
        WHEN 'ROWS' THEN @defaultDataLocation
        ELSE @defaultLogLocation
    END +
    '\'+ RIGHT(physical_name,CHARINDEX('\',REVERSE(physical_name),1)-1) +'''' +
    ' )'
FROM sys.master_files
WHERE DB_NAME(database_id) IN ('master','model','msdb','tempdb')
    AND (
        physical_name NOT LIKE @defaultDataLocation + '%'
        OR physical_name NOT LIKE @defaultLogLocation + '%'
    )

OPEN stmts
FETCH NEXT FROM stmts INTO @sql

WHILE @@FETCH_STATUS = 0
BEGIN

    PRINT @sql
    EXEC(@sql)

    FETCH NEXT FROM stmts INTO @sql
END

CLOSE stmts
DEALLOCATE stmts

-- Update SQL Server startup parameters
-- to reflect the new master data and log
-- files locations

DECLARE @val nvarchar(500)
DECLARE @key nvarchar(100)

DECLARE @regvalues TABLE (
    parameter nvarchar(100),
    value nvarchar(500)
)

INSERT @regvalues
EXEC master.dbo.xp_instance_regenumvalues
        N'HKEY_LOCAL_MACHINE',
        N'SOFTWARE\Microsoft\MSSQLServer\MSSQLServer\Parameters'

DECLARE reg CURSOR STATIC LOCAL FORWARD_ONLY
FOR
SELECT *
FROM @regvalues
WHERE value LIKE '-d%'
    OR value LIKE '-l%'

OPEN reg

FETCH NEXT FROM reg INTO @key, @val

WHILE @@FETCH_STATUS = 0
BEGIN

    IF @val LIKE '-d%'
        SET @val = '-d' + (
            SELECT physical_name
            FROM sys.master_files
            WHERE DB_NAME(database_id) = 'master'
                AND type_desc = 'ROWS'
        )
    IF @val LIKE '-l%'
        SET @val = '-l' + (
            SELECT physical_name
            FROM sys.master_files
            WHERE DB_NAME(database_id) = 'master'
                AND type_desc = 'LOG'
        )

    EXEC master.dbo.xp_instance_regwrite
        N'HKEY_LOCAL_MACHINE',
        N'SOFTWARE\Microsoft\MSSQLServer\MSSQLServer\Parameters',
        @key,
        N'REG_SZ',
        @val

    FETCH NEXT FROM reg INTO @key, @val

END

CLOSE reg
DEALLOCATE reg

After running this script, you can shut down the SQL Server service and move the data and log files to the appropriate locations.

When the files are ready, you can bring SQL Server back online.

BE CAREFUL! Before running this script against a clustered instance, check what the xp_instance_regread commands return: I have seen cases with SQL Server not reading from the appropriate keys.

dba_runCHECKDB v2(012)


If you are one among the many that downloaded my consistency check stored procedure called “dba_RunCHECKDB”, you may have noticed a “small” glitch… it doesn’t work on SQL Server 2012!

This is due to the resultset definition of DBCC CHECKDB, which has changed again in SQL Server 2012. Trying to pipe the results of that command in the table definition for SQL Server 2008 produces a column mismatch and it obviously fails.

Fixing the code is very easy indeed, but I could never find the time to post the corrected version until today.

Also, I had to discover the new table definition for DBCC CHECKDB, and it was not just as easy as it used to be in SQL Server 2008. In fact, a couple of days ago I posted a way to discover the new resultset definition working around the cumbersome metadata discovery feature introduced in SQL Server 2012.

Basically, the new output of DBCC CHECKDB now includes 6 new columns:

    CREATE TABLE ##DBCC_OUTPUT(
        Error int NULL,
        [Level] int NULL,
        State int NULL,
        MessageText nvarchar(2048) NULL,
        RepairLevel nvarchar(22) NULL,
        Status int NULL,
        DbId int NULL, -- was smallint in SQL2005
        DbFragId int NULL,      -- new in SQL2012
        ObjectId int NULL,
        IndexId int NULL,
        PartitionId bigint NULL,
        AllocUnitId bigint NULL,
        RidDbId smallint NULL,  -- new in SQL2012
        RidPruId smallint NULL, -- new in SQL2012
        [File] smallint NULL,
        Page int NULL,
        Slot int NULL,
        RefDbId smallint NULL,  -- new in SQL2012
        RefPruId smallint NULL, -- new in SQL2012
        RefFile smallint NULL,  -- new in SQL2012
        RefPage int NULL,
        RefSlot int NULL,
        Allocation smallint NULL
    )

If you Google the name of one of these new columns, you will probably find a lot of blog posts (no official documentation, unfortunately) that describes the new output of DBCC CHECKDB, but none of them is strictly correct: all of them indicate the smallint columns as int.

Not a big deal, actually, but still incorrect.

I will refrain from posting the whole procedure here: I updated the code in the original post, that you can find clicking here. You can also download the code from the Code Repository.

As usual, suggestions and comments are welcome.

Discovering resultset definition of DBCC commands in SQL Server 2012


Back in 2011 I showed a method to discover the resultset definition of DBCC undocumented commands.

At the time, SQL Server 2012 had not been released yet and nothing suggested that the linked server trick could stop working on the new major version. Surprisingly enough it did.

If you try to run the same code showed in that old post on a 2012 instance, you will get a quite explicit error message:

DECLARE @srv nvarchar(4000);
SET @srv = @@SERVERNAME; -- gather this server name

-- Create the linked server
EXEC master.dbo.sp_addlinkedserver
    @server     = N'LOOPBACK',
    @srvproduct = N'SQLServ', -- it’s not a typo: it can’t be “SQLServer”
    @provider   = N'SQLNCLI', -- change to SQLOLEDB for SQLServer 2000
    @datasrc    = @srv;

-- Set the authentication to "current security context"
EXEC master.dbo.sp_addlinkedsrvlogin
    @rmtsrvname  = N'LOOPBACK',
    @useself     = N'True',
    @locallogin  = NULL,
    @rmtuser     = NULL,
    @rmtpassword = NULL;

USE tempdb;
GO

CREATE PROCEDURE loginfo
AS
BEGIN
    SET NOCOUNT ON;

    DBCC LOGINFO();
END
GO

SELECT *
INTO tempdb.dbo.loginfo_output
FROM OPENQUERY(LOOPBACK, 'SET FMTONLY OFF; EXEC tempdb.dbo.loginfo');

DROP PROCEDURE loginfo;
GO
Msg 11528, Level 16, State 1, Procedure sp_describe_first_result_set, Line 1
The metadata could not be determined because statement 'DBCC LOGINFO();' in procedure 'loginfo' does not support metadata discovery.

This behaviour has to do with the way SQL Server 2012 tries to discover metadata at parse/bind time, when the resultset is not available yet for DBCC commands.

Fortunately, there is still a way to discover metadata when you have a SQL Server instance of a previous version available.

On my laptop I have a 2008R2 instance I can use to query the 2012 instance with a linked server:

-- run this on the 2008R2 instance
USE [master]
GO

EXEC master.dbo.sp_addlinkedserver
     @server = N'LOCALHOST\SQL2012'
    ,@srvproduct = N'SQL Server'

EXEC master.dbo.sp_addlinkedsrvlogin
     @rmtsrvname = N'LOCALHOST\SQL2012'
    ,@useself = N'True'
    ,@locallogin = NULL
    ,@rmtuser = NULL
    ,@rmtpassword = NULL
GO

SELECT *
INTO tempdb.dbo.loginfo_output
FROM OPENQUERY([LOCALHOST\SQL2012], 'SET FMTONLY OFF; EXEC tempdb.dbo.loginfo');
GO

This code pipes the results of the DBCC command into a table in the tempdb database in my 2008R2 instance. The table can now be scripted using SSMS:

Using the 2008R2 instance as a “Trojan Horse” for the metadata discovery, you can see that the resultset definition of DBCC LOGINFO() has changed again in SQL Server 2012:

CREATE TABLE [dbo].[loginfo_output](
	[RecoveryUnitId] [int] NULL,  -- new in SQL2012
	[FileId] [int] NULL,
	[FileSize] [bigint] NULL,
	[StartOffset] [bigint] NULL,
	[FSeqNo] [int] NULL,
	[Status] [int] NULL,
	[Parity] [tinyint] NULL,
	[CreateLSN] [numeric](25, 0) NULL
)

This trick will be particularly useful for an upcoming (and long overdue) post, so… stay tuned!

Using QUERYTRACEON in plan guides


Yesterday the CSS team made the QUERYTRACEON hint publicly documented.

This means that now it’s officially supported and you can use it in production code.

After reading the post on the CSS blog, I started to wonder whether there is some actual use in production for this query hint, given that it requires the same privileges as DBCC TRACEON, which means you have to be a member of the sysadmin role.

In fact, if you try to use that hint when connected as a low privileged user, you get a very precise error message, that leaves no room for interpretation:

SELECT *
FROM [AdventureWorks2012].[Person].[Person]
OPTION (QUERYTRACEON 4199)

Msg 2571, Level 14, State 3, Line 1
User ‘guest’ does not have permission to run DBCC TRACEON.

How can a query hint available to sysadmins only be possibly useful for production?

I posted my doubt on Twitter using the #sqlhelp hashtag and got interesting replies from Paul Randal, Paul White and Robert Davis.

My concerns were not about the usefulness of the hint per se, but about the usefulness in production code. Often 140 chars are not enough when you want to express your thoughts clearly, in fact I decided to write this blog post to clarify what I mean.

As we have seen, the QUERYTRACEON query hint cannot be used directly by users not in the sysadmin role, but it can be used in stored procedures with “EXECUTE AS owner” and in plan guides.

While it’s completely clear what happens when the hint is used in procedures executed in the context of the owner, what happens in plan guides is not so obvious (at least, not to me). In fact, given that the secuirty context is not changed when the plan guide is matched and applied, I would have expected it to fail miserably when executed by a low privileged user, but it’s not the case.

Let’s try and see what happens:

First of all we need a query “complex enough” to let the optimizer take plan guides into account. A straight “SELECT * FROM table” and anything else that results in a trivial plan won’t be enough.

SELECT *
FROM [Person].[Person] AS P
INNER JOIN [Person].[PersonPhone] AS H
    ON P.BusinessEntityID = H.BusinessEntityID
INNER JOIN [Person].[BusinessEntity] AS BE
    ON P.BusinessEntityID = BE.BusinessEntityID
INNER JOIN [Person].[BusinessEntityAddress] AS BEA
    ON BE.BusinessEntityID = BEA.BusinessEntityID
WHERE BEA.ModifiedDate > '20080101'

Then we need a plan guide to apply the QUERYTRACEON hint:

EXEC sp_create_plan_guide @name = N'[querytraceon]', @stmt = N'SELECT *
FROM [Person].[Person] AS P
INNER JOIN [Person].[PersonPhone] AS H
	ON P.BusinessEntityID = H.BusinessEntityID
INNER JOIN [Person].[BusinessEntity] AS BE
	ON P.BusinessEntityID = BE.BusinessEntityID
INNER JOIN [Person].[BusinessEntityAddress] AS BEA
	ON BE.BusinessEntityID = BEA.BusinessEntityID
WHERE BEA.ModifiedDate > ''20080101''', @type = N'SQL', @hints = N'OPTION (QUERYTRACEON 4199)'

If we enable the plan guide and try to issue this query in the context of a low privileged user, we can see no errors thrown any more:

CREATE LOGIN testlogin WITH PASSWORD = 'testlogin123';
GO
USE AdventureWorks2012;
GO
CREATE USER testlogin FOR LOGIN testlogin;
GO
GRANT SELECT TO testlogin;
GO
EXECUTE AS USER = 'testlogin';
GO
SELECT *
FROM [Person].[Person] AS P
INNER JOIN [Person].[PersonPhone] AS H
    ON P.BusinessEntityID = H.BusinessEntityID
INNER JOIN [Person].[BusinessEntity] AS BE
    ON P.BusinessEntityID = BE.BusinessEntityID
INNER JOIN [Person].[BusinessEntityAddress] AS BEA
    ON BE.BusinessEntityID = BEA.BusinessEntityID
WHERE BEA.ModifiedDate > '20080101';
GO
REVERT;
GO

If we open a profiler trace and capture the “Plan Guide Successful” and “Plan Guide Unsuccessful” events, we can see that the optimizer matches the plan guide and enforces the use of the query hint.

Lesson learned: even if  users are not allowed to issue that particular query hint directly, adding it to a plan guide is a way to let anyone use it indirectly.

Bottom line is OPTION QUERYTRACEON can indeed be very useful when we identify some queries that obtain a decent query plan only when a specific trace flag is active and we don’t want to enable it for the whole instance. In those cases, a plan guide or a stored procedure in the owner’s context can be the answer.

Replay a T-SQL batch against all databases


It’s been quite a lot since I last posted on this blog and I apologize with my readers, both of them :-) .

Today I would like to share with you a handy script I coded recently during a SQL Server health check. One of the tools I find immensely valuable for conducting a SQL Server assessment is Glenn Berry’s SQL Server Diagnostic Information Queries. The script contains several queries that can help you collect and analyze a whole lot of information about a SQL Server instance and I use it quite a lot.

The script comes with a blank results spreadsheet, that can be used to save the information gathered by the individual queries. Basically, the spreadsheet is organized in tabs, one for each query and has no preformatted column names, so that you can run the query, select the whole results grid, copy with headers and paste everything to the appropriate tab.

When working with multiple instances, SSMS can help automating this task with multiserver queries. Depending on your SSMS settings, the results of a multiserver query can be merged into a single grid, with an additional column holding the server name.

This feature is very handy, because it lets you run a statement against multiple servers without changing the statement itself.

This works very well for the queries in the first part of Glenn Berry’s script, which is dedicated to instance-level checks. The second part of the script is database-specific and you have to repeat the run+copy+paste process for each database in your instance.

It would be great if there was a feature in SSMS that allowed you to obtain the same results as the multiserver queries, scaled down to the database level. Unfortunately, SSMS has no such feature and  the only possible solution is to code it yourself… or borrow my script!

Before rushing to the code, let’s describe briefly the idea behind and the challenges involved.

It would be quite easy to take a single statement and use it with sp_MsForEachDB, but this solution has several shortcomings:

  • The results would display as individual grids
  • There would be no easy way to determine which results grid belongs to which database
  • The statement would have to be surrounded with quotes and existing quotes would have to be doubled, with an increased and unwanted complexity

The ideal tool for this task should simply take a statement and run it against all [user] databases without modifying the statement at all, merge the results in a single result set and add an additional column to hold the database name. Apparently, sp_MSForEachDB, besides being undocumented and potentially nasty, is not the right tool for the job.

That said, the only option left is to capture the statement from its query window, combining a trace, a loopback linked server and various other tricks.

Here’s the code:


-- =============================================
-- Author:      Gianluca Sartori - @spaghettidba
-- Create date: 2012-06-26
-- Description: Records statements to replay
--              against all databases.
-- =============================================
CREATE PROCEDURE replay_statements_on_each_db
    @action varchar(10) = 'RECORD',
    @start_statement_id int = NULL,
    @end_statement_id   int = NULL
AS
BEGIN

    SET NOCOUNT ON;

    DECLARE @TraceFile nvarchar(256);
    DECLARE @TraceFileNoExt nvarchar(256);
    DECLARE @LastPathSeparator int;
    DECLARE @TracePath nvarchar(256);
    DECLARE @TraceID int;
    DECLARE @fs bigint = 5;
    DECLARE @r int;
    DECLARE @spiid int = @@SPID;
    DECLARE @srv nvarchar(4000);
    DECLARE @ErrorMessage nvarchar(4000);
    DECLARE @ErrorSeverity int;
    DECLARE @ErrorState int;
    DECLARE @sql nvarchar(max);
    DECLARE @statement nvarchar(max);
    DECLARE @column_list nvarchar(max);

    IF @action NOT IN ('RECORD','STOPRECORD','SHOWQUERY','REPLAY')
        RAISERROR('A valid @action (RECORD,STOPRECORD,SHOWQUERY,REPLAY) must be specified.',16,1)

    -- *********************************************** --
    -- *                 RECORD                      * --
    -- *********************************************** --
    IF @action = 'RECORD'
    BEGIN

        BEGIN TRY

            -- Identify the path of the default trace
            SELECT @TraceFile = path
            FROM master.sys.traces
            WHERE id = 1

            -- Split the directory / filename parts of the path
            SELECT @LastPathSeparator = MAX(number)
            FROM master.dbo.spt_values
            WHERE type = 'P'
                  AND number BETWEEN 1 AND LEN(@tracefile)
                  AND CHARINDEX('\', @TraceFile, number) = number
            --' fix WordPress's sql parser quirks'

            SELECT @TraceFile =
                  SUBSTRING(
                         @TraceFile
                        ,1
                        ,@LastPathSeparator
                  )
                  + 'REPLAY_'
                  + CONVERT(char(8),GETDATE(),112)
                  + REPLACE(CONVERT(varchar(8),GETDATE(),108),':','')
                  + '.trc'

            SET @TraceFileNoExt = REPLACE(@TraceFile,N'.trc',N'')

            -- create trace
            EXEC sp_trace_create @TraceID OUTPUT, 0, @TraceFileNoExt, @fs, NULL;

            --add filters and events
            EXEC sp_trace_setevent @TraceID, 41, 1, 1;
            EXEC sp_trace_setevent @TraceID, 41, 12, 1;
            EXEC sp_trace_setevent @TraceID, 41, 13, 1;

            EXEC sp_trace_setfilter @TraceID, 1, 0, 7, N'%fn_trace_gettable%'
            EXEC sp_trace_setfilter @TraceID, 1, 0, 7, N'%replay_statements_on_each_db%'
            EXEC sp_trace_setfilter @TraceID, 12, 0, 0, @spiid

            --start the trace
            EXEC sp_trace_setstatus @TraceID, 1

            --create a global temporary table to store the statements
            IF OBJECT_ID('tempdb..##replay_info') IS NOT NULL
                DROP TABLE ##replay_info;

            CREATE TABLE ##replay_info (
                trace_id int,
                statement_id int,
                statement_text nvarchar(max)
            );

            --save the trace id in the global temp table
            INSERT INTO ##replay_info (trace_id) VALUES(@TraceID);

        END TRY
        BEGIN CATCH

            --cleanup the trace
            IF EXISTS( SELECT 1 FROM sys.traces WHERE id = @TraceId AND status = 1 ) EXEC sp_trace_setstatus @TraceID, 0;
            IF EXISTS( SELECT 1 FROM sys.traces WHERE id = @TraceId AND status = 0 ) EXEC sp_trace_setstatus @TraceID, 2;

            IF OBJECT_ID('tempdb..##replay_info') IS NOT NULL
                DROP TABLE ##replay_info;

            SELECT @ErrorMessage  = ERROR_MESSAGE(),
                   @ErrorSeverity = ERROR_SEVERITY(),
                   @ErrorState    = ERROR_STATE();

            RAISERROR(@ErrorMessage, @ErrorSeverity, @ErrorState);

        END CATCH

    END

    -- *********************************************** --
    -- *              STOP RECORDING                 * --
    -- *********************************************** --
    IF @action = 'STOPRECORD'
    BEGIN

        BEGIN TRY

            -- gather the trace id
            SELECT @TraceID = trace_id
            FROM ##replay_info;

            IF @TraceId IS NULL
                RAISERROR('No data has been recorded!',16,1)

            DELETE FROM ##replay_info;

            -- identify the trace file
            SELECT TOP(1) @TraceFile = path
            FROM sys.traces
            WHERE path like '%REPLAY[_]______________.trc'
            ORDER BY id DESC

            -- populate the global temporary table with
            -- the statements recorded in the
            INSERT INTO ##replay_info
            SELECT @TraceID,
                ROW_NUMBER() OVER(ORDER BY (SELECT NULL)),
                TextData
            FROM fn_trace_gettable(@traceFile, DEFAULT)
            WHERE TextData IS NOT NULL;

            --stop and deltete the trace
            IF EXISTS( SELECT 1 FROM sys.traces WHERE id = @TraceId AND status = 1 ) EXEC sp_trace_setstatus @TraceID, 0;
            IF EXISTS( SELECT 1 FROM sys.traces WHERE id = @TraceId AND status = 0 ) EXEC sp_trace_setstatus @TraceID, 2;

        END TRY
        BEGIN CATCH

            --stop and deltete the trace
            IF EXISTS( SELECT 1 FROM sys.traces WHERE id = @TraceId AND status = 1 ) EXEC sp_trace_setstatus @TraceID, 0;
            IF EXISTS( SELECT 1 FROM sys.traces WHERE id = @TraceId AND status = 0 ) EXEC sp_trace_setstatus @TraceID, 2;

            SELECT @ErrorMessage  = ERROR_MESSAGE(),
                   @ErrorSeverity = ERROR_SEVERITY(),
                   @ErrorState    = ERROR_STATE();

            RAISERROR(@ErrorMessage, @ErrorSeverity, @ErrorState);

        END CATCH

    END

    -- *********************************************** --
    -- *           SHOW COLLECTED QUERIES            * --
    -- *********************************************** --
    IF @action = 'SHOWQUERY'
    BEGIN
        BEGIN TRY

            IF OBJECT_ID('tempdb..##replay_info') IS NULL
                RAISERROR('No data has been recorded yet',16,1);

            SET @sql = 'SELECT statement_id, statement_text FROM ##replay_info ';

            IF @start_statement_id IS NOT NULL AND @end_statement_id IS NULL
                SET @sql = @sql + ' WHERE statement_id = @start_statement_id ';

            IF @start_statement_id IS NOT NULL AND @end_statement_id IS NOT NULL
                SET @sql = @sql + ' WHERE statement_id
                                    BETWEEN @start_statement_id AND @end_statement_id';

            EXEC sp_executesql
                 @sql
                ,N'@start_statement_id int, @end_statement_id int'
                ,@start_statement_id
                ,@end_statement_id;

        END TRY
        BEGIN CATCH
            SELECT @ErrorMessage  = ERROR_MESSAGE(),
                   @ErrorSeverity = ERROR_SEVERITY(),
                   @ErrorState    = ERROR_STATE();

            RAISERROR(@ErrorMessage, @ErrorSeverity, @ErrorState);
        END CATCH
    END

    -- *********************************************** --
    -- *                 REPLAY                      * --
    -- *********************************************** --
    IF @action = 'REPLAY'
    BEGIN

        BEGIN TRY

            --load the selected statement(s)
            SET @statement = '
                SET @sql = ''''
                SELECT @sql += statement_text + '' ''
                FROM ##replay_info
            ';

            IF @start_statement_id IS NOT NULL AND @end_statement_id IS NULL
                SET @statement =
                    @statement
                    + ' WHERE statement_id = @start_statement_id ';

            IF @start_statement_id IS NOT NULL AND @end_statement_id IS NOT NULL
                SET @statement =
                    @statement
                    + ' WHERE statement_id
                        BETWEEN @start_statement_id AND @end_statement_id';

            EXEC sp_executesql
                 @statement
                ,N'@start_statement_id int, @end_statement_id int, @sql nvarchar(max) OUTPUT'
                ,@start_statement_id
                ,@end_statement_id
                ,@sql OUTPUT;

            IF NULLIF(LTRIM(@sql),'') IS NULL
                RAISERROR('Unable to locate the statement(s) specified.',16,1)

            SET @srv = @@SERVERNAME; -- gather this server name

            IF EXISTS (SELECT * FROM sys.servers WHERE name = 'TMPLOOPBACK')
                EXEC sp_dropserver 'TMPLOOPBACK';

            -- Create a loopback linked server
            EXEC master.dbo.sp_addlinkedserver
                @server     = N'TMPLOOPBACK',
                @srvproduct = N'SQLServ', -- it’s not a typo: it can’t be “SQLServer”
                @provider   = N'SQLNCLI', -- change to SQLOLEDB for SQLServer 2000
                @datasrc    = @srv;

            -- Set the authentication to "current security context"
            EXEC master.dbo.sp_addlinkedsrvlogin
                @rmtsrvname  = N'TMPLOOPBACK',
                @useself     = N'True',
                @locallogin  = NULL,
                @rmtuser     = NULL,
                @rmtpassword = NULL;

            -- Use a permanent table in Tempdb to store the output
            IF OBJECT_ID('tempdb..___outputTable') IS NOT NULL
                DROP TABLE tempdb..___outputTable;

            -- Execute the statement in Tempdb to discover the column definition
            SET @statement = '
                SELECT TOP(0) *
                INTO tempdb..___outputTable
                FROM OPENQUERY(TMPLOOPBACK,''
                    SET FMTONLY OFF; EXEC tempdb.sys.sp_executesql N''''' + REPLACE(@sql,'''','''''''''') + '''''
                '')
            ';

            EXEC(@statement);

            SET @statement = @sql;

            -- Build the column list of the output table
            SET @column_list = STUFF((
                SELECT ',' + QUOTENAME(C.name)
                FROM tempdb.sys.columns AS C
                INNER JOIN tempdb.sys.tables AS T
                    ON C.object_id = T.object_id
                WHERE T.name = '___outputTable'
                FOR XML PATH('')
            ),1,1,SPACE(0));

            -- Add a "Database Name" column
            ALTER TABLE tempdb..___outputTable ADD Database__Name sysname;

            -- Build a sql statement to execute
            -- the recorded statement against all databases
            SET @sql =
                'N''INSERT tempdb..___outputTable(' + @column_list + ') EXEC(@statement); UPDATE tempdb..___outputTable SET Database__Name = DB_NAME() WHERE Database__Name IS NULL;''';

            -- Build a statement to execute on each database context
            ;WITH dbs AS (
                SELECT *,
                    system_db = CASE WHEN name IN ('master','model','msdb','tempdb') THEN 1 ELSE 0 END
                FROM sys.databases
                WHERE   DATABASEPROPERTY(name, 'IsSingleUser') = 0
                    AND HAS_DBACCESS(name) = 1
                    AND state_desc = 'ONLINE'
            )
            SELECT @sql = (
                SELECT
                    'EXEC ' + QUOTENAME(name) + '.sys.sp_executesql ' +
                        @sql + ',' +
                        'N''@statement nvarchar(max)'',' +
                        '@statement;' + char(10) AS [text()]
                FROM dbs
                ORDER BY name
                FOR XML PATH('')
            );

            -- Execute multi-db sql and pass in the actual statement
            EXEC sp_executeSQL @sql, N'@statement nvarchar(max)', @statement

            --
            SET @sql = '
                SELECT Database__Name AS [Database  Name], ' + @column_list + '
                FROM tempdb..___outputTable
                ORDER BY 1;
            '

            EXEC sp_executesql @sql;

            EXEC tempdb.sys.sp_executesql N'DROP TABLE ___outputTable';

        END TRY
        BEGIN CATCH
            SELECT @ErrorMessage  = ERROR_MESSAGE(),
                   @ErrorSeverity = ERROR_SEVERITY(),
                   @ErrorState    = ERROR_STATE();

            RAISERROR(@ErrorMessage, @ErrorSeverity, @ErrorState);
        END CATCH

    END

END

As you can see, the code creates a stored procedure that accepts a parameter named @action, which is used to determine what the procedure should do. Specialized sections of the procedure handle every possible value for the parameter, with the following logic:

First of all you start recording, then you execute the statements to repeat on each database, then you stop recording. From that moment on, you can enumerate the statements captured and execute them, passing a specific statement id or a range of ids.

The typical use of the procedure could look like this:


-- start recording
EXECUTE replay_statements_on_each_db
    @action = 'RECORD'

-- run the statements you want to replay
SELECT DATABASEPROPERTYEX(DB_NAME(),'Recovery') AS RecoveryModel

-- stop recording
EXECUTE replay_statements_on_each_db
    @action = 'STOPRECORD'

-- display captured statements
EXECUTE replay_statements_on_each_db
    @action = 'SHOWQUERY'

-- execute the first statement
EXECUTE replay_statements_on_each_db
    @action             = 'REPLAY',
    @start_statement_id = 1,
    @end_statement_id   = 1

You can see the results of the script execution here:

Obviuosly this approach is totally overkill for just selecting the database recovery model, but it can become very handy when the statement’s complexity raises.

This seems a perfect fit for Glen Berry’s diagnostic queries, which is where we started from. You can go back to that script and add the record instructions just before the database specific queries start:

At the end of the script you can add the instructions to stop recording and show the queries captured by the procedure.

Once the statements are recorded, you can run any of the statements against all databases. For instance, I decided to run the top active writes index query (query 51).

As expected, the procedure adds the database name column to the result set and then displays the merged results.

You may have noticed that I skipped the first statement in the database-specific section of the script, which is a DBCC command. Unfortunately, not all kind of statement can be captured with this procedure, because some limitations apply. Besides the inability to capture some DBCC commands, please note that the column names must be explicitly set.

I think that a CLR procedure could overcome these limitations, or at least some of them. I hope I will find the time to try the CLR method soon and I promise I will blog the results.

SQL Server and Custom Date Formats


Today SQL Server Central is featuring my article Dealing with custom date formats in T-SQL.

There’s a lot of code on that page and I thought that making it available for download would make it easier to play with.

You can download the code from this page or from the Code Repository.

I was also asked to include a performance chart for the different methods included in the article. Here’s a quick’n'dirty Excel bar chart (I didn’t include the recursive iTVF for the sake of readability).
 

I hope you enjoy reading the article as much as I enjoyed writing it.

How to Eat a SQL Elephant in 10 Bites


One byte at a time, obviously!

No elephants were harmed during photoshopping.

Sometimes, when you have to optimize a poor performing query, you may find yourself staring at a huge statement, wondering where to start.

Some developers think that a single elephant statement is better than multiple small statements, but this is not always the case.

Let’s try to look from the perspective of software quality:

  • Efficiency
    The optimizer will likely come up with a suboptimal plan, giving up early on optimizations and transformations.
  • Reliability
    Any slight change in statistics could lead the optimizer to produce a different and less efficient plan.
  • Maintainability
    A single huge statement is less readable and maintainable than multiple small statements.

With those points in mind, the only sensible thing to do is cut the elephant into smaller pieces and eat them one at a time.

What should I do with this query??

This is how I do it:

  1. Lay out the original code and read the statement carefully
  2. Decide whether a full rewrite is more convenient
  3. Set up a test environment
  4. Identify the query parts
    • Identify the main tables
    • Identify non correlated subqueries and UNIONs
    • Identify correlated subqueries
  5. Write a query outline
  6. Break the statement into parts with CTEs, views, functions and temporary tables
  7. Merge redundant subqueries
  8. Put it all together
  9. Verify the output based on multiple different input values
  10. Comment your work thoroughly
The queries you will find in the pictures are (in very small part) a MySQL stored procedure I had to rewrite recently, so don’t try to run them in SQL Server. The syntax may be different, but the method still stands.

1.     Lay out the original code and read the statement carefully

Use one of the many SQL formatters you can find online. My favorite one is Tao Klerk’s Poor Man’s T-SQL Formatter: it’s very easy to use and configure and it comes with a handy SSMS add-in and plugins for Notepad++ and WinMerge. Moreover, it’s free and open source. A must-have.

Looks much better now.

Once your code is readable, don’t rush to the keyboard: take your time and read it carefully.

  • Do you understand (more or less) what it is supposed to do?
  • Do you think you could have coded it yourself?
  • Do you know all the T-SQL constructs it contains?

If you answered “yes” to all the above, you’re ready to go to the next step.

2.     Decide whether a full rewrite is more convenient

OK, that code sucks and you have to do something. It’s time to make a decision:

  1. Take the business rules behind the statement and rewrite it from scratch
    When the statement is too complicated and unreadable, it might be less time-consuming to throw the old statement away and write your own version.
    Usually it is quite easy when you know exactly what the code is supposed to do. Just make sure you’re not making wrong assumptions and be prepared to compare your query with the original one many times.
  2. Refactor the statement
    When the business rules are unclear (or unknown) starting from scratch is not an option. No, don’t laugh! The business logic may have been buried in the sands of time or simply you may be working on a query without any will to understand the business processes behind it.
    Bring a big knife: you’re going to cut the elephant in pieces.
  3. Leave the statement unchanged
    Sometimes the statement is too big or too complicated to bother taking the time to rewrite it. For instance, this query would take months to rewrite manually.
    It works? Great: leave it alone.

3.     Set up a test environment

It doesn’t matter how you decide to do it: at the end of the day you will have to compare the results of your rewritten query with the results of the “elephant” and make sure you did not introduce errors in your code.

The best way to do this is to prepare a script that compares the results of the original query with the results of your rewritten version. This is the script I am using (you will find it in the code repository, as usual).

-- =============================================
-- Author:      Gianluca Sartori - spaghettidba
-- Create date: 2012-03-14
-- Description: Runs two T-SQL statements and 
--              compares the results
-- =============================================

-- Drop temporary tables
IF OBJECT_ID('tempdb..#original') IS NOT NULL 
    DROP TABLE #original;

IF OBJECT_ID('tempdb..#rewritten') IS NOT NULL 
    DROP TABLE #rewritten;

-- Store the results of the original 
-- query into a temporary table
WITH original AS (
    <original, text, >
)
SELECT *
INTO #original
FROM original;

-- Add a sort column
ALTER TABLE #original ADD [______sortcolumn] int identity(1,1);



-- Store the results of the rewritten 
-- query into a temporary table
WITH rewritten AS (
    <rewritten, text, >
)
SELECT *
INTO #rewritten
FROM rewritten;

-- Add a sort column
ALTER TABLE #rewritten ADD [______sortcolumn] int identity(1,1);


-- Compare the results
SELECT 'original' AS source, *
FROM (
    SELECT * 
    FROM #original 
    
    EXCEPT 
    
    SELECT * 
    FROM #rewritten
) AS A

UNION ALL

SELECT 'rewritten' AS source, *
FROM (
    SELECT * 
    FROM #rewritten 
    
    EXCEPT 
    
    SELECT * 
    FROM #original
) AS B;

The script is a SSMS query template that takes the results of the original and the rewritten query and compares the resultsets, returning all the missing or different rows. The script uses two CTEs to wrap the two queries: this means that the ORDER BY predicate (if any) will have to be moved outside the CTE.

Also, the results of the two queries are piped to temporary tables, which means that you can’t have duplicate column names in the result set.

Another thing worth noting is that the statements to compare cannot be stored procedures. One simple way to overcome this limitation is to use the technique I described in this post.

The queries inside the CTEs should then be rewritten as:

SELECT *
FROM OPENQUERY(LOOPBACK,'<original, text,>')

Obviously, all the quotes must be doubled, which is the reason why I didn’t set up the script this way in the first place. It’s annoying, but it’s the only way I know of to pipe the output of a stored procedure into a temporary table without knowing the resultset definition in advance. If you can do better, suggestions are always welcome.

4.     Identify the query parts

OK, now you have everything ready and you can start eating the elephant. The first thing to do is to identify all the autonomous blocks in the query and give them a name. You can do this at any granularity and repeat the task as many times as you like: the important thing is that at the end of this process you have a list of query parts and a name for each part.

Identify the main parts and give them a name.

Identify the main tables

Usually I like the idea that the data comes from one “main” table and all the rest comes from correlated tables. For instance, if I have to return a resultset containing some columns from the “SalesOrderHeader” table and some columns from the “SalesOrderDetail” table, I consider SalesOrderHeader the main table and SalesOrderHeader a correlated table. It fits well with my mindset, but you are free to see things the way you prefer.

Probably these tables are already identified by an alias: note down the aliases and move on.

Identify non correlated subqueries and UNIONs

Non-correlated subqueries are considered as inline views. Often these subqueries are joined to the main tables to enrich the resultset with additional columns.

Don’t be scared away by huge subqueries: you can always repeat all the steps for any single subquery and rewrite it to be more compact and readable.

Again, just note down the aliases and move to the next step.

Identify correlated subqueries

Correlated subqueries are not different from non-correlated subqueries, with the exception that you will have less freedom to move them from their current position in the query. However, that difference doesn’t matter for the moment: give them a name and note it down.

5.     Write a query outline

Use the names you identified in the previous step and write a query outline. It won’t execute, but it gives you the big picture.

Won't execute, but describes what the query does.

If you really want the big picture, print the query. It may seem crazy, but sometimes I find it useful to be able to see the query as a whole, with all the parts with their names highlighted in different colors.

A touch of colour for my office.

Yes, that’s a single SELECT statement, printed in Courier new 8 pt. on 9 letter sheets, hanging on the wall in my office.

6.     Break the statement in parts with CTEs, views, functions and temporary tables

SQL Server offers a fair amount of tools that allow breaking a single statement into parts:

  • Common Table Expressions
  • Subqueries
  • Views
  • Inline Table Valued Functions
  • Multi-Statement Table Valued Functions
  • Stored procedures
  • Temporary Tables
  • Table Variables

Ideally, you will choose the one that performs best in your scenario, but you could also take usability and modularity into account.

CTEs and subqueries are a good choice when the statement they contain is not used elsewhere and there is no need to reuse that code.

Table Valued functions and views, on the contrary, are most suitable when there is an actual need to incapsulate the code in modules to be reused in multiple places.

Generally speaking, you will use temporary tables or table variables when the subquery gets used more than once in the statement, thus reducing the load.

A place for everything and everything in its place.

Though I would really like to go into deeper details on the performance pros and cons of each construct, that would take an insane amount of time and space. You can find a number of articles and blogs on those topics and I will refrain from echoing them here.

7.     Merge redundant subqueries

Some parts of your query may be redundant and you may have the opportunity to merge those parts. The merged query will be more compact and will likely perform significantly better.

For instance, you could have multiple subqueries that perform aggregate calculations on the same row set:

SELECT ProductID
    ,Name
    ,AverageSellOutPrice = (
        SELECT AVG(UnitPrice)
        FROM Sales.SalesOrderDetail
        WHERE ProductID = PR.ProductID
    )
    ,MinimumSellOutPrice = (
        SELECT MIN(UnitPrice)
        FROM Sales.SalesOrderDetail
        WHERE ProductID = PR.ProductID
    )
    ,MaximumSellOutPrice = (
        SELECT MAX(UnitPrice)
        FROM Sales.SalesOrderDetail
        WHERE ProductID = PR.ProductID
    )
FROM Production.Product AS PR;

The above query can be rewritten easily to avoid hitting the SalesOrderDetail table multiple times:

SELECT ProductID
    ,Name
    ,AverageSellOutPrice
    ,MinimumSellOutPrice
    ,MaximumSellOutPrice
FROM Production.Product AS PR
CROSS APPLY (
    SELECT AVG(UnitPrice), MIN(UnitPrice), MAX(UnitPrice)
    FROM Sales.SalesOrderDetail
    WHERE ProductID = PR.ProductID
) AS SellOuPrices (AverageSellOutPrice, MinimumSellOutPrice, MaximumSellOutPrice);

Another typical situation where you can merge some parts is when multiple subqueries perform counts on slightly different row sets:

SELECT ProductID
    ,Name
    ,OnlineOrders = (
        SELECT COUNT(*)
        FROM Sales.SalesOrderHeader AS SOH
        WHERE SOH.OnlineOrderFlag = 1
            AND EXISTS (
                SELECT *
                FROM Sales.SalesOrderDetail
                WHERE SalesOrderID = SOH.SalesOrderID
                    AND ProductID = PR.ProductID
            )
    )
    ,OfflineOrders = (
        SELECT COUNT(*)
        FROM Sales.SalesOrderHeader AS SOH
        WHERE SOH.OnlineOrderFlag = 0
            AND EXISTS (
                SELECT *
                FROM Sales.SalesOrderDetail
                WHERE SalesOrderID = SOH.SalesOrderID
                    AND ProductID = PR.ProductID
            )
    )
FROM Production.Product AS PR;

The only difference between the two subqueries is the predicate on SOH.OnlineOrderFlag. The two queries can be merged introducing a CASE expression in the aggregate:

SELECT ProductID
    ,Name
    ,ISNULL(OnlineOrders,0) AS OnlineOrders
    ,ISNULL(OfflineOrders,0) AS OfflineOrders
FROM Production.Product AS PR
CROSS APPLY (
    SELECT SUM(CASE WHEN SOH.OnlineOrderFlag = 1 THEN 1 ELSE 0 END),
           SUM(CASE WHEN SOH.OnlineOrderFlag = 0 THEN 1 ELSE 0 END)
    FROM Sales.SalesOrderHeader AS SOH
    WHERE EXISTS (
            SELECT *
            FROM Sales.SalesOrderDetail
            WHERE SalesOrderID = SOH.SalesOrderID
                AND ProductID = PR.ProductID
        )
) AS Orderscount (OnlineOrders, OfflineOrders);

There are infinite possibilities and enumerating them all would be far beyond the scope of this post. This is one of the topics that my students often find hard to understand and I realize that it really takes some experience to identify merge opportunities and implement them.

Hi query, you look very fit. Did you lose weight?

8.     Put it all together

Remember the query outline you wrote previously? It’s time to put it into action.

Some of the identifiers may have gone away in the merge process, some others are still there and have been transformed into different SQL constructs, such as CTEs, iTVFs or temporary tables.

9.     Verify the output based on multiple different input values

Now it’s time to see if your new query works exactly like the original one. You already have a script for that: you can go on and use it.

Remember that the test can be considered meaningful only if you repeat it a reasonably large number of times, with different parameters. Some queries could appear to be identical, but still be semantically different. Make sure the rewritten version handles NULLs and out-of-range parameters in the same way.

10.Comment your work thoroughly

If you don’t comment your work, somebody will find it even more difficult to maintain than the elephant you found when you started.

Comments are for free and don’t affect the query performance in any way. Don’t add comments that mimic what the query does, instead, write a meaningful description of the output of the query.

For instance, given a code fragment like this:

SELECT SalesOrderID, OrderDate, ProductID
INTO #orders
FROM Sales.SalesOrderHeader AS H
INNER JOIN Sales.SalesOrderDetail AS D
    ON H.SalesOrderID = D.SalesOrderID
WHERE OrderDate BETWEEN @StartDate AND @EndDate

a comment like “joins OrderHeader to OrderDetail” adds nothing to the clarity of the code. A comment like “Selects the orders placed between the @StartDate and @EndDate and saves the results in a temporary table for later use” would be a much better choice.

Elephant eaten. (Burp!)

If you don't see a hat, sorry: you're getting old.

After all, it was not too big, was it?

Setting up an e-mail alert for DBCC CHECKDB errors


Some months ago I posted a script on a SQLServerCentral forum to help a member automating the execution of DBCC CHECKDB and send and e-mail alert in case a consistency error is found.

The original thread can be found here.

I noticed that many people are actually using that script and I also got some useful feedback on the code itself, so I decided to write this post to make an enhanced version available to everyone.

The Problem

Your primary responsibility as a DBA is to safeguard your data with backups. I mean intact backups! Keep in mind that when you back up a corrupt database, you will also restore a corrupt database.

A task that checks the database integrity should be part of your backup strategy and you should be notified immediately when corruption is found.

Unfortunately, the built-in consistency check Maintenance Task does not provide an alerting feature and you have to code it yourself.

The Solution

SQL Server 2000 and above accept the “WITH TABLERESULTS” option for most DBCC commands to output the messages as a result set. Those results can be saved to a table and processed to identify messages generated by corrupt data and raise an alert.

If you don’t know how to discover the resultset definition of DBCC CHECKDB WITH TABLERESULTS, I suggest that you take a look at this post.

Here is the complete code of the stored procedure I am using on my production databases:

-- You have a TOOLS database, don't you?
-- If not, create it: you'll thank me later.
USE TOOLS;
GO
-- =============================================
-- Author:      Gianluca Sartori - spaghettidba
-- Create date: 2011-06-30
-- Description: Runs DBCC CHECKDB on the database(s) specified 
--              and returns a table result that can be used in 
--              reporting and alerting.
-- =============================================
ALTER PROCEDURE [maint].[dba_runCHECKDB] 
    @dbName           sysname = NULL,    -- Database name. If NULL, will check all databases
    @PHYSICAL_ONLY    bit     = 0,       -- Set to 1 to perform physical check only. Defaults to 0.
    @allMessages      bit     = 0,       -- Set to 1 to return all the messages generated by DBCC
                                         -- Set to 0 to return one summary message for each database (default)
    @dbmail_profile   sysname = NULL,    -- DBMail profile to use when sending the results
    @dbmail_recipient sysname = NULL,    -- DBMail recipient
    @log_to_table     bit     = 0        -- Set to 1 to enable logging to table DBCC_CHECKDB_HISTORY
AS 
BEGIN

    SET NOCOUNT,
        XACT_ABORT,
        QUOTED_IDENTIFIER,
        ANSI_NULLS,
        ANSI_PADDING,
        ANSI_WARNINGS,
        ARITHABORT,
        CONCAT_NULL_YIELDS_NULL ON;
    SET NUMERIC_ROUNDABORT OFF;

    DECLARE @version int
    DECLARE @sql nvarchar(4000)
    DECLARE @ErrorMessage nvarchar(2048)
    DECLARE @body nvarchar(max)
    
    DECLARE @Message nvarchar(4000)
    DECLARE @Severity int
    DECLARE @State int
    
    -- determine major version: DBCC output can be different 
    SELECT @version = CAST(REPLACE(CAST(SERVERPROPERTY('ProductVersion') AS char(2)),'.','') AS int)



    IF OBJECT_ID('tempdb..##DBCC_OUTPUT') IS NOT NULL
        DROP TABLE ##DBCC_OUTPUT

    /*
    -- SQL2000 outputs these columns:
    CREATE TABLE ##DBCC_OUTPUT(
        Error int NULL,
        [Level] int NULL,
        State int NULL,
        MessageText nvarchar(200) NULL,
        RepairLevel nvarchar(22) NULL,
        Status int NULL,
        DbId smallint NULL,
        Id int NULL,
        IndId smallint NULL,
        [File] smallint NULL,
        Page int NULL,
        Slot int NULL,
        RefFile smallint NULL,
        RefPage int NULL,
        RefSlot int NULL,
        Allocation smallint NULL
    )
    */
    
    -- SQL2005, SQL2008, SQL2008R2 produce this output
    -- except for the columns explicitly marked as new in SQL 2012
    CREATE TABLE ##DBCC_OUTPUT(
        Error int NULL,
        [Level] int NULL,
        State int NULL,
        MessageText nvarchar(2048) NULL,
        RepairLevel nvarchar(22) NULL,
        Status int NULL,
        DbId int NULL, -- was smallint in SQL2005
        DbFragId int NULL,      -- new in SQL2012
        ObjectId int NULL,
        IndexId int NULL,
        PartitionId bigint NULL,
        AllocUnitId bigint NULL,
        RidDbId smallint NULL,  -- new in SQL2012
        RidPruId smallint NULL, -- new in SQL2012
        [File] smallint NULL,
        Page int NULL,
        Slot int NULL,
        RefDbId smallint NULL,  -- new in SQL2012
        RefPruId smallint NULL, -- new in SQL2012
        RefFile smallint NULL,  -- new in SQL2012
        RefPage int NULL,
        RefSlot int NULL,
        Allocation smallint NULL
    )


    -- Add a computed column
    ALTER TABLE ##DBCC_OUTPUT ADD Outcome AS
        CASE 
            WHEN Error = 8989 AND MessageText LIKE '%0 allocation errors and 0 consistency errors%' THEN 0 
            WHEN Error <> 8989 THEN NULL
            ELSE 1 
        END
        
    -- Add an identity column to sort results when sending the email
    ALTER TABLE ##DBCC_OUTPUT ADD RowID int IDENTITY(1,1)



    DECLARE @localTran bit
    IF @@TRANCOUNT = 0
    BEGIN
        SET @localTran = 1
        BEGIN TRANSACTION LocalTran
    END
    
    BEGIN TRY

        -- Create the history table if needed
        IF @log_to_table = 1 
            AND OBJECT_ID('maint.DBCC_CHECKDB_HISTORY') IS NULL
        BEGIN
            SELECT TOP(0)
                RowId,
                Error,
                [Level],
                State,
                MessageText,
                RepairLevel,
                Status,
                DbId,
                DbFragId,
                CAST(NULL AS sysname) AS DatabaseName,
                ObjectId,
                IndexId,
                PartitionId,
                AllocUnitId,
                RidDbId,
                RidPruId,
                [File],
                Page,
                Slot,
                RefDbId,
                RefPruId,
                RefFile,
                RefPage,
                RefSlot,
                Allocation,
                Outcome,
                GETDATE() AS RunDate
            INTO maint.DBCC_CHECKDB_HISTORY
            FROM ##DBCC_OUTPUT
            
            ALTER TABLE maint.DBCC_CHECKDB_HISTORY
                ADD CONSTRAINT PK_DBCC_CHECKDB_HISTORY PRIMARY KEY CLUSTERED(RowId)
        END

          
        -- Open a cursor on the matching databases (version dependant)
        IF @version = 8 
        BEGIN
        
            DECLARE c_databases CURSOR LOCAL FAST_FORWARD
            FOR
            SELECT QUOTENAME(name) AS name
            FROM master..sysdatabases
            WHERE name = ISNULL(@dbName, name)
                AND Name <> 'tempdb'
                AND DATABASEPROPERTY(name, 'IsOffline') = 0
                AND DATABASEPROPERTY(name, 'IsReadOnly') = 0
                
        END
        ELSE
        BEGIN
        
            DECLARE c_databases CURSOR LOCAL FAST_FORWARD
            FOR
            SELECT QUOTENAME(name) AS name
            FROM master.sys.databases
            WHERE name = ISNULL(@dbName, Name)
                AND Name <> 'tempdb'
                AND state_desc = 'ONLINE'
                AND is_read_only = 0
                AND source_database_id IS NULL -- Exclude Snapshots	
                
        END

        OPEN c_databases
        FETCH NEXT FROM c_databases INTO @dbName
        

        WHILE @@FETCH_STATUS = 0
        BEGIN
            
            -- Build a SQL string
            SET @sql = 'DBCC CHECKDB('+ @dbName +') WITH TABLERESULTS, ALL_ERRORMSGS '
            
            IF @PHYSICAL_ONLY = 1 
                SET @sql = @sql + ', PHYSICAL_ONLY '

            BEGIN TRY
            
                IF @version = 8 
                BEGIN
                    -- SQL2000 has a shorter column list
                    INSERT INTO ##DBCC_OUTPUT (
                        Error, 
                        [Level],
                        State,
                        MessageText,
                        RepairLevel,
                        Status,
                        DbId,
                        ObjectId,
                        IndexId,
                        [File],
                        Page,
                        Slot,
                        RefFile,
                        RefPage,
                        RefSlot,
                        Allocation
                    )
                    EXEC(@sql)
                END
                
                IF @version > 8 AND @version < 11
                BEGIN
                    -- SQL2005/2008 use this column list
                    INSERT INTO ##DBCC_OUTPUT (
                        Error,
                        [Level],
                        State,
                        MessageText,
                        RepairLevel,
                        Status, 
                        DbId,
                        ObjectId,
                        IndexId,
                        PartitionId,
                        AllocUnitId,
                        [File],
                        Page,
                        Slot,
                        RefFile,
                        RefPage,
                        RefSlot,
                        Allocation 
                    )
                    EXEC(@sql)
                END

                IF @version >= 11
                BEGIN
                    -- SQL2012 uses all columns 
                    INSERT INTO ##DBCC_OUTPUT (
                        Error,
                        [Level],
                        State,
                        MessageText,
                        RepairLevel,
                        Status, 
                        DbId,
                        DbFragId,
                        ObjectId,
                        IndexId,
                        PartitionId,
                        AllocUnitId,
                        RidDbId,
                        RidPruId,
                        [File],
                        Page,
                        Slot,
                        RefDbId,
                        RefPruId,
                        RefFile,
                        RefPage,
                        RefSlot,
                        Allocation 
                    )
                    EXEC(@sql)
                END
                
            END TRY
            BEGIN CATCH
                SELECT  @ErrorMessage = 'Unable to run DBCC on database ' + @dbName + ': ' + ERROR_MESSAGE()
                        
                INSERT INTO ##DBCC_OUTPUT (Error, MessageText)
                SELECT Error = 8989,
                       MessageText = @ErrorMessage
                        
            END CATCH
                
            FETCH NEXT FROM c_databases INTO @dbName
        END
        
        CLOSE c_databases
        DEALLOCATE c_databases

        IF NOT EXISTS (
            SELECT 1 FROM ##DBCC_OUTPUT 
        )
        BEGIN 
            RAISERROR('No database matches the name specified.',10,1)
        END
        
        
        IF @log_to_table = 1 
        BEGIN 
            INSERT INTO maint.DBCC_CHECKDB_HISTORY (
                Error,
                [Level],
                State,
                MessageText,
                RepairLevel,
                Status,
                DbId,
                DbFragId,
                DatabaseName,
                ObjectId,
                IndexId,
                PartitionId,
                AllocUnitId,
                RidDbId,
                RidPruId,
                [File],
                Page,
                Slot,
                RefDbId,
                RefPruId,
                RefFile,
                RefPage,
                RefSlot,
                Allocation,
                Outcome,
                RunDate
            )
            SELECT 
                Error,
                [Level],
                State,
                MessageText,
                RepairLevel,
                Status,
                DbId,
                DbFragId,
                DatabaseName = ISNULL(DB_NAME(DbId),'resourcedb'),
                ObjectId,
                IndexId,
                PartitionId,
                AllocUnitId,
                RidDbId,
                RidPruId,
                [File],
                Page,
                Slot,
                RefDbId,
                RefPruId,
                RefFile,
                RefPage,
                RefSlot,
                Allocation,
                Outcome,
                RunDate = GETDATE()
            FROM ##DBCC_OUTPUT
            WHERE Error = 8989
                OR @AllMessages = 1
                OR DbId IN (
                    SELECT DbId 
                    FROM ##DBCC_OUTPUT
                    WHERE Error = 8989 
                        AND Outcome = 1
                )
        END
        

        -- Build the final SQL statement	
        SET @sql = 
            'SELECT ISNULL(DB_NAME(DbId),''resourcedb'') AS DatabaseName, ' + 
                CASE @allMessages 
                    WHEN 1 THEN '*' 
                    ELSE 'MessageText, Outcome' 
                END + '
            FROM ##DBCC_OUTPUT 
            WHERE 1 = 1 ' +
            CASE @allMessages WHEN 1 THEN '' ELSE 'AND Error = 8989' END 

        IF @dbmail_profile IS NULL OR @dbmail_recipient IS NULL
        BEGIN
            -- Query DBCC output directly
            EXEC(@sql)
        END
        ELSE
        BEGIN 
            -- Pipe DBCC output to a variable
            SET @sql = '
                SELECT @body = (
                    SELECT ISNULL(MessageText,'''') + char(10) AS [text()] 
                    FROM ( ' + @sql + ' AND Error = 8989 ) AS src
                    WHERE Outcome = 1 
                    ORDER BY 1 DESC
                    FOR XML PATH('''')
                )'
                
            EXEC sp_executesql @sql, N'@body nvarchar(max) OUTPUT', @body OUTPUT
            


            -- Send CHECKDB report
            IF @body IS NOT NULL 
            BEGIN
                IF @version > 8
                BEGIN
                    EXEC msdb.dbo.sp_send_dbmail 
                        @profile_name = @dbmail_profile,
                        @recipients   = @dbmail_recipient,
                        @subject      = 'Consistency error found!',
                        @body         = @body,
                        @importance   = 'High',
                        @query        = '
                            SET NOCOUNT ON;
                            SELECT TOP(5000) ISNULL(DB_NAME(DbId),''resourcedb'') + '' -'' AS DatabaseName, MessageText
                            FROM ##DBCC_OUTPUT WITH (NOLOCK)
                            WHERE DbId IN (
                                SELECT DbId 
                                FROM ##DBCC_OUTPUT WITH (NOLOCK)
                                WHERE Error = 8989 
                                    AND Outcome = 1
                            )
                            ORDER BY RowId ASC ',
                        @attach_query_result_as_file = 1,
                        @query_result_no_padding     = 1,
                        @query_result_header         = 0,
                        @exclude_query_output        = 1,
                        @query_attachment_filename   = 'DBCC_CHECKDB_Errors.log'
                END
                ELSE
                BEGIN
                    -- SQL2000, no database mail here.
                    -- Create the stored procedure sp_send_cdosysmail that you can find here:
                    -- http://support.microsoft.com/default.aspx?scid=kb;EN-US;312839
                    -- No attachments, sorry.
                    EXEC TOOLS.dbo.sp_send_cdosysmail
                        @From    = @dbmail_profile, 
                        @To      = @dbmail_recipient, 
                        @Subject = 'Consistency error found!', 
                        @Body    = @body
                END

            END

        END

        IF @localTran = 1 AND XACT_STATE() <> 0
            COMMIT TRAN LocalTran

        IF OBJECT_ID('tempdb..##DBCC_OUTPUT') IS NOT NULL
            DROP TABLE ##DBCC_OUTPUT
            
    END TRY
    BEGIN CATCH

        SELECT  @Message = ERROR_MESSAGE(),
                @Severity = ERROR_SEVERITY(),
                @State = ERROR_STATE()
    
        IF @localTran = 1 AND XACT_STATE() <> 0
            ROLLBACK TRAN

        RAISERROR ( @Message, @Severity, @State)
        
    END CATCH

END

Once the stored procedure is ready, you can run it against the desired databases:

EXEC [maint].[dba_runCHECKDB]
	@dbName        = 'model',
	@PHYSICAL_ONLY = 0,
	@allmessages   = 0

Setting up an e-mail alert

In order to receive an e-mail alert, you can use a SQL Agent job and schedule this script to run every night, or whenever you find appropriate.

EXEC [maint].[dba_runCHECKDB]
    @dbName           = NULL,
    @PHYSICAL_ONLY    = 0,
    @allmessages      = 0,
    @dbmail_profile   = 'DBA_profile',
    @dbmail_recipient = 'dba@mycompany.com'

The e-mail message generated by the stored procedure contains the summary outcome and a detailed log, attached as a text file:

Logging to a table

If needed, you can save the output of this procedure to a history table that logs the outcome of DBCC CHECKDB in time:

-- Run the stored procedure with @log_to_table = 1
EXEC TOOLS.maint.dba_runCHECKDB
    @dbName        = NULL,
    @PHYSICAL_ONLY = 0,
    @allMessages   = 0,
    @log_to_table  = 1

-- Query the latest results
SELECT *
FROM (
    SELECT *, RN = ROW_NUMBER()  OVER (PARTITION BY DBId ORDER BY RunDate DESC)
    FROM DBCC_CHECKDB_HISTORY
    WHERE Outcome IS NOT NULL
) AS dbcc_history
WHERE RN = 1

When invoked with the @log_to_table parameter for the first time, the procedure creates a log table that will be used to store the results. Subsequent executions will append to the table.

No excuses!

The web is full of blogs, articles and forums on how to automate DBCC CHECKDB. If your data has any value to you, CHECKDB must be part of your maintenance strategy.

Run! Check the last time you performed a successful CHECKDB on your databases NOW! Was it last year? You may be in big trouble.

Discovering resultset definition of DBCC commands


Lots of blog posts and discussion threads suggest piping the output of DBCC commands to a table for further processing. That’s a great idea, but, unfortunately, an irritatingly high number of those posts contains an inaccurate table definition for the command output.

The reason behind this widespread inaccuracy is twofold.

On one hand the output of many DBCC commands changed over time and versions of SQL Server, and a table that was the perfect fit for the command in SQL Server 2000 is not  perfect any more. In this case, the blog/article/thread is simply old, but many people will keep referring to that source assuming that things did not change.

On the other hand, the output is not always documented in BOL, and people often have to guess the table definition based on the data returned by the command. I’ve been guilty of this myself and I’ve been corrected many times, until I decided that I needed a better way to discover the output definition.

You are a database professional and you don’t like to guess, because guessing is never as good as knowing it for sure.

In order to stop guessing, you will have to create a linked server named “loopback” that points back to the same instance where you are running the DBCC command.

I am sure you are asking yourself why you need such a strange thing as a loopback linked server. The idea behind is that you need a way to query the command as if it was a table or a view, so that it can be used as a valid source for a SELECT…INTO statement. The perfect tool for this kind of task is the OPENQUERY command, which allows sending pass-through queries, that don’t necessarily need to be SELECT statements. OPENQUERY requires a linked server, which can be any OLEDB data source, including a remote server or the same SQL Server instance where the linked server lies.

OK, let’s create it:

DECLARE @srv nvarchar(4000);
SET @srv = @@SERVERNAME; -- gather this server name

-- Create the linked server
EXEC master.dbo.sp_addlinkedserver
@server     = N'LOOPBACK',
@srvproduct = N'SQLServ', -- it’s not a typo: it can’t be “SQLServer”
@provider   = N'SQLNCLI', -- change to SQLOLEDB for SQLServer 2000
@datasrc    = @srv;


-- Set the authentication to "current security context"
EXEC master.dbo.sp_addlinkedsrvlogin
@rmtsrvname  = N'LOOPBACK',
@useself     = N'True',
@locallogin  = NULL,
@rmtuser     = NULL,
@rmtpassword = NULL;

In order to capture the output of DBCC commands, you have to wrap them inside a stored procedure, otherwise SQL Server could complain about missing column information. I don’t know the exact technical reason behind this error (I suppose it has to do with the way metadata is propagated), but this limitation can be overcome wrapping the command into a stored procedure and using “SET FMTONLY OFF” in the pass-through query.

This is also a nice way to overcome the single INSERT…EXEC limit (and implement many more interesting tricks that I hope to cover in future posts).

For instance, to capture the table definition of DBCC LOGINFO(), you will have to create a stored procedure similar to this:

USE tempdb;
GO

CREATE PROCEDURE loginfo
AS
BEGIN
    SET NOCOUNT ON;

    DBCC LOGINFO();

END
GO

With the stored procedure and the linked server in place, you can set up the call using OPENQUERY:

SELECT *
INTO tempdb.dbo.loginfo_output
FROM OPENQUERY(LOOPBACK, 'SET FMTONLY OFF; EXEC tempdb.dbo.loginfo');

DROP PROCEDURE loginfo;
GO

Running this script will create a table named “loginfo_output” in the tempdb database: you can find it in your object explorer and script it out to a new query editor window.

Repeating these steps on instances running different versions on SQL Server reveals that the table definition changed in SQL2005 and then remained the same in 2008 and 2008R2.

-- SQL Server 2000
CREATE TABLE [dbo].[loginfo_output](
    [FileId]      [int] NULL,
    [FileSize]    [numeric](20, 0) NULL,
    [StartOffset] [numeric](20, 0) NULL,
    [FSeqNo]      [int] NULL,
    [Status]      [int] NULL,
    [Parity]      [tinyint] NULL,
    [CreateLSN]   [numeric](25, 0) NULL
)


-- SQL Server 2005, 2008 and 2008R2
CREATE TABLE [dbo].[loginfo_output](
    [FileId]      [int] NULL,
    [FileSize]    [bigint] NULL,
    [StartOffset] [bigint] NULL,
    [FSeqNo]      [int] NULL,
    [Status]      [int] NULL,
    [Parity]      [tinyint] NULL,
    [CreateLSN]   [numeric](25, 0) NULL
)

Now that you know how the output looks like, you can happily pipe the results of DBCC LOGINFO to an appropriate table:

-- Declare variable for dynamic sql
DECLARE @sql nvarchar(max)

-- Drop the table if already exists
IF OBJECT_ID('tempdb..loginfo_output') IS NOT NULL
    DROP TABLE tempdb..loginfo_output

-- Check SQL Server version
IF CAST(REPLACE(LEFT(CAST(SERVERPROPERTY('ProductVersion') AS nvarchar(128)),2),'.','') AS int) > 8
BEGIN
    -- SQL Server 2005+
    SET @sql = '
        CREATE TABLE tempdb..loginfo_output(
            [FileId]      [int] NULL,
            [FileSize]    [bigint] NULL,
            [StartOffset] [bigint] NULL,
            [FSeqNo]      [int] NULL,
            [Status]      [int] NULL,
            [Parity]      [tinyint] NULL,
            [CreateLSN]   [numeric](25, 0) NULL
        )
        '
END
ELSE
BEGIN
    -- SQL Server 2000
    SET @sql = '
        CREATE TABLE tempdb..loginfo_output(
            [FileId]      [int] NULL,
            [FileSize]    [numeric](20, 0) NULL,
            [StartOffset] [numeric](20, 0) NULL,
            [FSeqNo]      [int] NULL,
            [Status]      [int] NULL,
            [Parity]      [tinyint] NULL,
            [CreateLSN]   [numeric](25, 0) NULL
        )
        '
END

-- Create the output table
EXEC(@sql)

-- Execute DBCC command and
-- pipe results to the output table
INSERT tempdb..loginfo_output
EXEC('DBCC LOGINFO()')


-- Display results
SELECT *
FROM tempdb..loginfo_output

You could ask with good reason why you should use an output table when you could query the wrapper stored procedure directly with OPENQUERY. Based on observation, the trick does not always work and SQL Server can randomly complain about missing column information.

Msg 7357, Level 16, State 2, Line 2
Cannot process the object "loginfo". The OLE DB provider "SQLNCLI10" for linked server "LOOPBACK" indicates that either the object has no columns or the current user does not have permissions on that object.

Again, I don’t have an in-depth technical answer: I can only report what I observed. It’s not a big deal indeed, because the output definition changes very slowly (typically between SQL Server versions) and you probably would review your code anyway when upgrading to a newer version. I guess you can live with a hardcoded table definition when the price to pay for having it dynamic is a random failure.

This post showed you how to capture the output of DBCC LOGINFO, but the same technique can be used for all DBCC commands that allow specifying WITH TABLERESULTS, extended stored procedures, remote stored procedures and all those programmable objects than cannot be inspected easily.

Now that you have the right tool in your hands, do yourself a favour: stop guessing!

Follow

Get every new post delivered to your Inbox.

Join 252 other followers