Blog Archives

Installing SQL Server 2016 Language Reference Help from disk


A couple of years ago I blogged about Installing the SQL Server 2014 Language Reference Help from disk.

With SQL Server 2016 things changed significantly: we have the new Help Viewer 2.2, which is shipped with the Management Studio setup kit.

However, despite all the changes in the way help works and is shipped, I am still unable to download and install help content from the web, so I resorted to using the same trick that I used for SQL Server 2014.

This time the URLs and the files to download are different:

    1. Point your browser to http://services.mtps.microsoft.com/ServiceAPI/catalogs/sql2016/en-us
    2. Download the Language Reference Files:

      If you’re a PowerShell person, these three lines will do:

Invoke-WebRequest -Uri "http://packages.mtps.microsoft.com/sql_2016_branding_en-us(1bd6e667-f159-ac3b-f0a5-964c04ca5a13).cab" `
	-OutFile "sql_2016_branding_en-us(1bd6e667-f159-ac3b-f0a5-964c04ca5a13).cab"
Invoke-WebRequest -Uri "http://packages.mtps.microsoft.com/v2sql_shared_language_reference_b4621_sql_130_en-us_1(83748a56-8810-751f-d453-00c5accc862d).cab" `
	-OutFile "v2sql_shared_language_reference_b4621_sql_130_en-us_1(83748a56-8810-751f-d453-00c5accc862d).cab"
Invoke-WebRequest -Uri "http://packages.mtps.microsoft.com/v2sql_shared_language_reference_b4621_sql_130_en-us_2(ccc38276-b744-93bd-9008-fe79b294ff41).cab" `
	-OutFile "v2sql_shared_language_reference_b4621_sql_130_en-us_2(ccc38276-b744-93bd-9008-fe79b294ff41).cab"
      1. Create a text file name HelpContentSetup.msha in the same folder as the .cab files and paste the following html:
<html xmlns="http://www.w3.org/1999/xhtml">
<head />
<body class="vendor-book">
    <div class="details">
        <span class="vendor">Microsoft</span>
        <span class="locale">en-us</span>
        <span class="product">SQL Server 2016</span>
        <span class="name">Microsoft SQL Server Language Reference</span>
    </div>
    <div class="package-list">
        <div class="package">
            <span class="name">SQL_2016_Branding_en-US</span>
            <span class="deployed">False</span>
            <a class="current-link" href="sql_2016_branding_en-us(1bd6e667-f159-ac3b-f0a5-964c04ca5a13).cab">sql_2016_branding_en-us(1bd6e667-f159-ac3b-f0a5-964c04ca5a13).cab</a>
        </div>
        <div class="package">
            <span class="name">v2SQL_Shared_Language_Reference_B4621_SQL_130_en-us_1</span>
            <span class="deployed">False</span>
            <a class="current-link" href="v2sql_shared_language_reference_b4621_sql_130_en-us_1(83748a56-8810-751f-d453-00c5accc862d).cab">v2sql_shared_language_reference_b4621_sql_130_en-us_1(83748a56-8810-751f-d453-00c5accc862d).cab</a>
        </div>
        <div class="package">
            <span class="name">v2SQL_Shared_Language_Reference_B4621_SQL_130_en-us_2</span>
            <span class="deployed">False</span>
            <a class="current-link" href="v2sql_shared_language_reference_b4621_sql_130_en-us_2(ccc38276-b744-93bd-9008-fe79b294ff41).cab">v2sql_shared_language_reference_b4621_sql_130_en-us_2(ccc38276-b744-93bd-9008-fe79b294ff41).cab</a>
        </div>
    </div>
</body>
</html>
      1. First, set the Help Viewer to open help from the local sources:
        1-viewer
      2. Then select the “Add and Remove Help Content” command:
        2-addremove
      3. This command opens the Help Viewer and asks for the content to add.
        Browse to the file you created in step 3.
        Click “Add” on all the items you wish to add to the library. In this case you will have only 1 item.
        When done, click the “Update” button.
        3-addcontent
      4. Unfortunately, during the installation phase of the library item, something crashes and the installation won’t proceed until you tell it to ignore or report the error.
        4-crash
      5. Despite the crash, everything works as expected and you will find the topic installed in your help library:
        5-installed

Here it is, nice and easy. Hope it works for you too.

Counting the number of rows in a table


Don’t be fooled by the title of this post: while counting the number of rows in a table is a trivial task for you, it is not trivial at all for SQL Server.

Every time you run your COUNT(*) query, SQL Server has to scan an index or a heap to calculate that seemingly innocuous number and send it to your application. This means a lot of unnecessary reads and unnecessary blocking.

Jes Schultz Borland blogged about it some time ago and also Aaron Bertrand has a blog post on this subject. I will refrain from repeating here what they both said: go read their blogs to understand why COUNT(*) is a not a good tool for this task.

The alternative to COUNT(*) is reading the count from the table metadata, querying sys.partitions, something along these lines:

SELECT SUM(p.rows)
FROM sys.partitions p
WHERE p.object_id = OBJECT_ID('MyTable')
    AND p.index_id IN (0,1); -- heap or clustered index

Many variations of this query include JOINs to sys.tables, sys.schemas or sys.indexes, which are not strictly necessary in my opinion. However, the shortest version of the count is still quite verbose and error prone.

Fortunately, there’s a shorter version of this query that relies on the system function OBJECTPROPERTYEX:

SELECT OBJECTPROPERTYEX(OBJECT_ID('MyTable'),'cardinality')

Where does it read data from? STATISTICS IO doesn’t return anything for this query, so I had to set up an Extended Events session to capture lock_acquired events and find out the system tables read by this function:

sysalloc

Basically, it’s just sysallocunits and sysrowsets.

It’s nice, short and easy to remember. Enjoy.

How to post a T-SQL question on a public forum


If you want to have faster turnaround on your forum questions, you will need to provide enough information to the forum users in order to answer your question.

In particular, talking about T-SQL questions, there are three things that your question must include:

  1. Table scripts
  2. Sample data
  3. Expected output

 

Table Script and Sample data

Please make sure that anyone trying to answer your question can quickly work on the same data set you’re working on, or, at least the problematic part of it. The data should be in the same place where you have it, which is inside your tables.

You will have to provide a script that creates your table and inserts data inside that table.

Converting your data to INSERT statements can be tedious: fortunately, some tools can do it for you.

How do you convert a SSMS results grid, a CSV file or an Excel spreadsheet to INSERT statements? In other words, how do you convert this…

table_data

into this?

USE [tempdb]
GO

CREATE TABLE [dbo].[Person](
	[BusinessEntityID] [int] NOT NULL PRIMARY KEY CLUSTERED,
	[PersonType] [nchar](2) NOT NULL,
	[FirstName] [nvarchar](50) NOT NULL,
	[LastName] [nvarchar](50) NOT NULL
)

GO

INSERT INTO Person VALUES (6106,'IN','Beth','Carlson');
INSERT INTO Person VALUES (17889,'IN','Dennis','Li');
INSERT INTO Person VALUES (17989,'IN','Brent','Li');
INSERT INTO Person VALUES (9424,'IN','Brad','Raji');
INSERT INTO Person VALUES (5842,'IN','Aimee','She');
INSERT INTO Person VALUES (2144,'GC','Carol','Philips');
INSERT INTO Person VALUES (2582,'IN','Gregory','Tang');
INSERT INTO Person VALUES (2012,'SC','Jian','Wang');
INSERT INTO Person VALUES (12624,'IN','Clayton','She');
INSERT INTO Person VALUES (12509,'IN','Madison','Russell');
GO

The easiest way to perform the transformation is to copy all the data and paste it over at ConvertCSV:

ConvertCSVToSQL_1

ConvertCSVToSQL_2

ConvertCSVToSQL_3

Another great tool for this task is SQLFiddle.

OPTIONAL: The insert statements will include the field names: if you want to make your code more concise, you can remove that part by selecting the column names with your mouse holding the ALT key and then delete the selection. Here’s a description of how the rectangular selection works in SSMS 2012 and 2014 (doesn’t work in SSMS 2008).

Expected output

The expected output should be something immediately readable and understandable. There’s another tool that can help you obtain it.

Go to https://ozh.github.io/ascii-tables/, paste your data in the “Input” textarea, press “Create Table” and grab your table from the “Output” textarea.text tables generatorHere’s what your output should look like:

+------------+-------------+
| PersonType | PersonCount |
+------------+-------------+
| GC         |           1 |
| IN         |           8 |
| SC         |           1 |
+------------+-------------+

Show what you have tried

Everybody will be more willing to help you if you show that you have put some effort into solving your problem. If you have a query, include it, even if it doesn’t do exactly what you’re after.

Please please please, format your query before posting! You can format your queries online for free at PoorSQL.com

PoorMans2

Simply paste your code then open the “Formatted SQL” tab to grab your code in a more readable way.

Putting it all together

Here is what your question should look like when everything is ok:

Hi all, I have a table called Person and I have to extract the number of rows for each person type.

This is the table script and some sample data:

USE [tempdb]
GO

CREATE TABLE [dbo].[Person](
	[BusinessEntityID] [int] NOT NULL PRIMARY KEY CLUSTERED,
	[PersonType] [nchar](2) NOT NULL,
	[FirstName] [nvarchar](50) NOT NULL,
	[LastName] [nvarchar](50) NOT NULL
)

GO

INSERT INTO Person VALUES (6106,'IN','Beth','Carlson');
INSERT INTO Person VALUES (17889,'IN','Dennis','Li');
INSERT INTO Person VALUES (17989,'IN','Brent','Li');
INSERT INTO Person VALUES (9424,'IN','Brad','Raji');
INSERT INTO Person VALUES (5842,'IN','Aimee','She');
INSERT INTO Person VALUES (2144,'GC','Carol','Philips');
INSERT INTO Person VALUES (2582,'IN','Gregory','Tang');
INSERT INTO Person VALUES (2012,'SC','Jian','Wang');
INSERT INTO Person VALUES (12624,'IN','Clayton','She');
INSERT INTO Person VALUES (12509,'IN','Madison','Russell');

This is what I’m trying to obtain:

+------------+-------------+
| PersonType | PersonCount |
+------------+-------------+
| GC         |           1 |
| IN         |           8 |
| SC         |           1 |
+------------+-------------+

Here is what I have tried:

SELECT PersonType
FROM Person

How do I do that?

If you include this information in your posts, I promise you will get blazingly fast answers.

Announcing ExtendedTSQLCollector


I haven’t been blogging much lately, actually I haven’t been blogging at all in the last 4 months. The reason behind is I have been putting all my efforts in a new project I started recently, which absorbed all my attention and spare time.

I am proud to announce that my project is now live and available to everyone for download.

1397849640_69471The project name is ExtendedTSQLCollector and you can find it at http://extendedtsqlcollector.codeplex.com. As you may have already guessed, it’s a bridge between two technologies that were not meant to work together, that could instead bring great advantages when combined: Extended Events and Data Collector.

ExtendedTSQLCollector is a set of two Collector Types built to overcome some of the limitations found in the built-in collector types and extend their functionality to include the ability to collect data from XE sessions.

The first Collector Type is the “Extended T-SQL Query” collector type, which was my initial goal when I started the project. If you have had the chance to play with the built-in “Generic T-SQL Query” collector type, you may have noticed that not all datatypes are supported. For instance, it’s impossible to collect data from XML or varchar(max) columns. This is due to the intermediate format used by this collector type: the SSIS raw files.

The “Extended T-SQL Query” collector type uses a different intermediate format, which allows collecting data of any data type. This is particularly useful, because SQL Server exposes lots of information in XML format (just think of the execution plans!) and you no longer need to code custom SSIS packages to collect that data.

The second Collector Type is the “Extended XE Reader” collector type, which takes advantage of the Extended Events streaming APIs to collect data from an Extended Events session, without the need to specify additional targets such as .xel files or ring buffers. This means no file system bloat due to .xel rollover files and no memory consumption for additional ring buffers: all the events are read directly from the session and processed in near real-time.

In addition to the filter predicates defined in the XE session, you can add more filter predicates on the data to collect and upload to the MDW and decide which columns (fields and actions) to collect. The collector will take care of creating the target table in your MDW database and upload all the data that satisfies the filter predicates.

The near real-time behavior of this collector type allowed me to include an additional feature to the mix: the ability to fire alerts in response to Extended Events. The current release (1.5) allows firing email alerts when the events are captured, with additional filter predicates and the ability to include event fields and actions in the email body. You can find more information on XE alerts in the documentation.

Here is an example of the email alerts generated by the XEReader collector type for the blocked_process event:

email

 

Another part of the project is the CollectionSet Manager, a GUI to install the collector types to the target servers and configure collection sets and collection items. I think that one of the reasons why the Data Collector is very underutilized by DBAs is the lack of a Graphical UI. Besides the features specific to the ExtendedTSQLCollector, such as installing the collector type, this small utility aims at providing the features missing in the SSMS Data Collector UI. This part of the project is still at an early stage, but I am planning to release it in the next few months.

My journey through the ins and outs of the Data Collector allowed me to understand deeply how it works and how to set it up and troubleshoot it. Now I am planning to start a blog series on this topic, from the basics to the advanced features. Stay tuned 🙂

I don’t want to go into deep details on the setup and configuration of this small project: I just wanted to ignite your curiosity and make you rush to codeplex to download your copy of ExtendedTSQLCollector.

What are you waiting for?

Uniquifiers: all rows or the duplicate keys only?


Some days ago I was talking with my friend Davide Mauri about the uniquifier that SQL Server adds to clustered indexes when they are not declared as UNIQUE.

We were not completely sure whether this behaviour applied to duplicate keys only or to all keys, even when unique.

The best way to discover the truth is a script to test what happens behind the scenes:

-- =============================================
-- Author:      Gianluca Sartori - @spaghettidba
-- Create date: 2014-03-15
-- Description: Checks whether the UNIQUIFIER column
--              is added to a column only on
--              duplicate clustering keys or all
--              keys, regardless of uniqueness
-- =============================================

USE tempdb
GO

IF OBJECT_ID('sizeOfMyTable') IS NOT NULL
    DROP VIEW sizeOfMyTable;
GO

-- Create a view to query table size information
-- Not very elegant, but saves a lot of typing

CREATE VIEW sizeOfMyTable
AS
SELECT OBJECT_NAME(si.object_id) AS table_name,
    si.name AS index_name,
    SUM(total_pages) AS total_pages,
    SUM(used_pages) AS used_pages,
    SUM(data_pages) AS data_pages
FROM sys.partitions AS p
INNER JOIN sys.allocation_units AS AU
    ON P.hobt_id = AU.container_id
INNER JOIN sys.indexes AS si
    ON si.index_id = p.index_id
    AND si.object_id = p.object_id
WHERE si.object_id = OBJECT_ID('#testUniquifier')
GROUP BY OBJECT_NAME(si.object_id),
    si.name
GO

IF OBJECT_ID('#testUniquifier') IS NOT NULL
    DROP TABLE #testUniquifier;

-- Create a test table

CREATE TABLE #testUniquifier (
    i int NOT NULL
)

-- Results table: will receive table size
-- in different scenarios

DECLARE @results TABLE(
    description varchar(500),
    table_name sysname,
    index_name sysname,
    total_pages int,
    used_pages int,
    data_pages int
);

-- INSERTS 100K UNIQUE VALUES

INSERT INTO #testUniquifier
SELECT TOP(100000) ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM sys.all_columns AS AC
CROSS JOIN sys.all_columns AS AC1;

-- -----------------------------------------------------------------
-- TEST1: CREATES A UNIQUE CLUSTERED INDEX (NO UNIQUIFIER)
-- -----------------------------------------------------------------

CREATE UNIQUE CLUSTERED INDEX UK_test ON #testUniquifier(i);

INSERT @results
SELECT 'Unique clustered index' AS description, * FROM sizeOfMyTable;

DROP INDEX UK_test ON #testUniquifier

-- -----------------------------------------------------------------
-- TEST2: CREATES A NON-UNIQUE CLUSTERED INDEX
--        NO DUPLICATES ARE PRESENT YET
-- -----------------------------------------------------------------

CREATE CLUSTERED INDEX IX_test ON #testUniquifier(i)

INSERT @results
SELECT 'Non-Unique clustered index, no duplicates' AS description, * FROM sizeOfMyTable

DROP INDEX IX_test ON #testUniquifier

-- -----------------------------------------------------------------
-- TEST3: CREATES A NON-UNIQUE CLUSTERED INDEX
--        10000 DUPLICATE VALUES ARE PRESENT
-- -----------------------------------------------------------------

UPDATE TOP(10000) #testUniquifier SET i = 1

CREATE CLUSTERED INDEX IX_test ON #testUniquifier(i)

INSERT @results
SELECT 'Non-Unique clustered index, some duplicates' AS description, * FROM sizeOfMyTable

DROP INDEX IX_test ON #testUniquifier

-- -----------------------------------------------------------------
-- TEST4: CREATES A NON-UNIQUE CLUSTERED INDEX
--        ALL ROWS CONTAIN THE SAME VALUE (1)
-- -----------------------------------------------------------------

UPDATE #testUniquifier SET i = 1

CREATE CLUSTERED INDEX IX_test ON #testUniquifier(i)

INSERT @results
SELECT 'Non-Unique clustered index, all duplicates' AS description, * FROM sizeOfMyTable

-- -----------------------------------------------------------------
-- Display results
-- -----------------------------------------------------------------

SELECT *
FROM @results;

As you can see, the uniquifier is added only to the keys that are duplicated:

uniquifier

Another way to discover the same results would be looking at the output of DBCC PAGE().

Looking at the text output of DBCC PAGE, uniquifiers are displayed as 0 (zero) when the values are not set, but the values are actually missing from the page.

This becomes even clearer when using DBCC PAGE WITH TABLERESULTS:

IF OBJECT_ID('tempdb..#formatteddata') IS NOT NULL	
    DROP TABLE #formatteddata;

SELECT *, ROWNUM = ROW_NUMBER() OVER (ORDER BY page_id, slot_id)
INTO #formatteddata
FROM #testUniquifier
CROSS APPLY sys.fn_PhysLocCracker(%%physloc%%);

IF OBJECT_ID('tempdb..#dbccpage') IS NOT NULL
    DROP TABLE #dbccpage;


CREATE TABLE #dbccpage (
    page_id int,
    ParentObject varchar(128),
    Object varchar(128),
    Field varchar(128),
    value varchar(4000),
    Slot AS SUBSTRING(Object, NULLIF(CHARINDEX('Slot ',Object,1),0) + 5, ISNULL(NULLIF(CHARINDEX(' ',Object,6),0),0) - 5)
)

DECLARE @current_page_id int;

DECLARE pages CURSOR STATIC LOCAL FORWARD_ONLY READ_ONLY
FOR
SELECT DISTINCT page_id 
FROM #formatteddata

OPEN pages

FETCH NEXT FROM pages INTO @current_page_id

WHILE @@FETCH_STATUS = 0
BEGIN 
    INSERT INTO #dbccpage (ParentObject, Object, Field, value)
    EXEC sp_executesql N'DBCC PAGE (2, 1, @pageid, 3) WITH TABLERESULTS;', N'@pageid int', @current_page_id

    UPDATE #dbccpage 
    SET page_id = @current_page_id
    WHERE page_id IS NULL

    FETCH NEXT FROM pages INTO @current_page_id
END

CLOSE pages;
DEALLOCATE pages;


WITH PageData AS  (
    SELECT page_id, slot, field, value
    FROM #dbccpage
    WHERE field IN ('i', 'UNIQUIFIER')
),
Uniquifiers AS (
    SELECT *
    FROM PageData
    PIVOT (MAX(value) FOR field IN ([i], [UNIQUIFIER])) AS pvt
),
sourceData AS (
    SELECT *
    FROM #formatteddata
)
SELECT src.ROWNUM, 
    src.i, 
    src.page_id,
    src.slot_id,
    UNIQUIFIER
FROM sourceData AS src
LEFT JOIN Uniquifiers AS unq
    ON src.slot_id = unq.slot
    AND src.page_id = unq.page_id
ORDER BY ROWNUM;

If you run the code in the different situations outlined before (unique clustered index, non-unique clustered index with or without duplicate keys) you will find the uniquifiers associated with each duplicate key and you will also notice that no uniquifier is generated for the keys that are unique.

COPY_ONLY backups and Log Shipping


Last week I was in the process of migrating a couple of SQL Server instances from 2008 R2 to 2012.

In order to let the migration complete quickly, I set up log shipping from the old instance to the new instance. Obviously, the existing backup jobs had to be disabled, otherwise they would have broken the log chain.

That got me thinking: was there a way to keep both “regular” transaction log backups (taken by the backup tool) and the transaction log backups taken by log shipping?

 ls_architecture

The first thing that came to my mind was the COPY_ONLY option available since SQL Server 2005.

You probably know that COPY_ONLY backups are useful when you have to take a backup for a special purpose, for instance when you have to restore from production to test. With the COPY_ONLY option, database backups don’t break the differential base and transaction log backups don’t break the log chain.

My initial thought was that I could ship COPY_ONLY backups to the secondary and keep taking scheduled transaction log backups with the existing backup tools.

I was dead wrong.

Let’s see it with an example on a TEST database.

I took 5 backups:

  1. FULL database backup, to initialize the log chain. Please note that COPY_ONLY backups cannot be used to initialize the log chain.
  2. LOG backup
  3. LOG backup with the COPY_ONLY option
  4. LOG backup
  5. LOG backup with the COPY_ONLY option

The backup information can be queried from backupset in msdb:

SELECT
     ROW_NUMBER() OVER(ORDER BY bs.backup_start_date) AS [backup #]
    ,first_lsn
    ,last_lsn
    ,backup_start_date
    ,type
    ,is_copy_only
    ,DENSE_RANK() OVER(ORDER BY type, bs.first_lsn) AS sequence
FROM msdb.dbo.backupset bs
WHERE bs.database_name = 'TEST'

 backupsets

As you can see, the COPY_ONLY backups don’t truncate the transaction log and losing one of those backups wouldn’t break the log chain.

However, all backups always start from the first available LSN, which means that scheduled log backups taken without the COPY_ONLY option truncate the transaction log and make significant portions of the transaction log unavailable in the next COPY_ONLY backup.

You can see it clearly in the following picture: the LSNs highlighted in red should contain no gaps in order to be restored successfully to the secondary, but the regular TLOG backups break the log chain in the COPY_ONLY backups.

backupsets2

That means that there’s little or no point in taking COPY_ONLY transaction log backups, as “regular” backups will always determine gaps in the log chain.

When log shipping is used, the secondary server is the only backup you can have, unless you keep the TLOG backups or use your backup tool directly to ship the logs.

Why on earth should one take a COPY_ONLY TLOG backup (more than one at least) is beyond my comprehension, but that’s a whole different story.

Ten features you had in Profiler that are missing in Extended Events


Oooooops!

F1CRASH

I exchanged some emails about my post with Jonathan Kehayias and looks like I was wrong on many of the points I made.

I don’t want to keep misleading information around and I definitely need to fix my wrong assumptions.

Unfortunately, I don’t have the time to correct it immediately and I’m afraid it will have to remain like this for a while.

Sorry for the inconvenience, I promise I will try to fix it in the next few days.

SQL2014: Defining non-unique indexes in the CREATE TABLE statement


Now that my SQL Server 2014 CTP1 virtual machine is ready, I started to play with it and some new features and differences with the previous versions are starting to appear.

What I want to write about today is a T-SQL enhancement to DDL statements that brings in some new interesting considerations.

SQL Server 2014 now supports a new T-SQL syntax that allows defining an index in the CREATE TABLE statement without having to issue separate CREATE INDEX statements.

Up to now, the same could be achieved only with PRIMARY KEY and UNIQUE constraints, thus allowing UNIQUE indexes only.

For instance, the following statement creates a table with a unique clustered index on order_id and a unique nonclustered index on PO_number:

CREATE TABLE #orders (
     order_id uniqueidentifier NOT NULL
        PRIMARY KEY CLUSTERED DEFAULT NEWSEQUENTIALID()
    ,PO_number varchar(50) NOT NULL UNIQUE
    ,order_date datetime NOT NULL
    ,total_amount decimal(18,3)
)

OK, but what if I want to add a non-unique index to my table?

SQL Server 2014 offers a new syntax to do that inline with the table DDL:

CREATE TABLE #orders (
     order_id uniqueidentifier NOT NULL
        PRIMARY KEY CLUSTERED DEFAULT NEWSEQUENTIALID()
    ,PO_number varchar(50) NOT NULL UNIQUE
     -- creates a nonclustered index on order_date
    ,order_date datetime NOT NULL INDEX IX_order_date
    ,total_amount decimal(18,3)
)

A similar syntax can be used to create a compound index:

CREATE TABLE #orders (
     order_id uniqueidentifier NOT NULL
        PRIMARY KEY CLUSTERED DEFAULT NEWSEQUENTIALID()
    ,PO_number varchar(50) NOT NULL UNIQUE
    ,order_date datetime NOT NULL INDEX IX_order_date
    ,total_amount decimal(18,3)
    -- creates a compound index on PO_number and order_date
    ,INDEX IX_orders_compound(PO_number, order_date)
)

An interesting aspect of this new syntax is that it allows creating non-unique nonclustered indexes to table variables, which is something that couldn’t be done in the previous versions.

The syntax to use is the same as for permanent tables:

DECLARE @orders TABLE (
     order_id uniqueidentifier NOT NULL
        PRIMARY KEY CLUSTERED DEFAULT NEWSEQUENTIALID()
    ,PO_number varchar(50) NOT NULL UNIQUE
    ,order_date datetime NOT NULL INDEX IX_order_date
    ,total_amount decimal(18,3)
)

Cool! But, wait: does this mean that table variables will now behave in the same way permanent tables do?

Not exactly.

Table variables don’t have statistics, and being able to create indexes on them won’t change anything in this regard.

Do you want a proof? OK, the skeptics can run the following code. Please make sure you capture the actual execution plan.

SET NOCOUNT ON;

-- create the table variable
DECLARE @orders TABLE (
     order_id uniqueidentifier NOT NULL
        PRIMARY KEY CLUSTERED DEFAULT NEWSEQUENTIALID()
    ,PO_number varchar(50) NOT NULL UNIQUE
    ,order_date datetime NOT NULL INDEX IX_order_date
    ,total_amount decimal(18,3)
)

-- insert some data
INSERT INTO @orders (order_date, PO_number, total_amount)
SELECT
     order_date   = DATEADD(second, CHECKSUM(NEWID()), GETDATE())
    ,PO_number    = CAST(NEWID() AS varchar(50))
    ,total_amount = CHECKSUM(NEWID()) / 1000.0
FROM sys.all_columns

-
SELECT COUNT(*)
FROM @orders
WHERE order_date > GETDATE()
OPTION (
     -- activate some (undocumented) trace flags to show
     -- statistics usage. More information on the flags
     -- can be found on Paul White's blog:
     -- http://sqlblog.com/blogs/paul_white/archive/2011/09/21/how-to-find-the-statistics-used-to-compile-an-execution-plan.aspx

     -- redirect output to the messages tab
     QUERYTRACEON 3604
     -- show "interesting" statistics
    ,QUERYTRACEON 9292
    -- show loaded statistics
    ,QUERYTRACEON 9402
    -- add RECOMPILE to let the optimizer "see"
    -- the table cardinality
    ,RECOMPILE
)

The output of the above batch is empty. Looks like no stats were loaded.

The actual execution plan confirms that no stats were loaded and the estimated cardinality of the table variable is way off:

plan_1

If we repeat the test with a temporary table, we see a different behaviour.

SET NOCOUNT ON;

IF OBJECT_ID('tempdb..#orders') IS NOT NULL
    DROP TABLE #orders;

CREATE TABLE #orders (
     order_id uniqueidentifier NOT NULL
        PRIMARY KEY CLUSTERED DEFAULT NEWSEQUENTIALID()
    ,PO_number varchar(50) NOT NULL UNIQUE
    ,order_date datetime NOT NULL INDEX IX_order_date
    ,total_amount decimal(18,3)
)

INSERT INTO #orders (order_date, PO_number, total_amount)
SELECT
     order_date   = DATEADD(second, CHECKSUM(NEWID()), GETDATE())
    ,PO_number    = CAST(NEWID() AS varchar(50))
    ,total_amount = CHECKSUM(NEWID()) / 1000.0
FROM sys.all_columns

SELECT COUNT(*)
FROM #orders
WHERE order_date > GETDATE()
OPTION (
     QUERYTRACEON 3604
    ,QUERYTRACEON 9292
    ,QUERYTRACEON 9402
)

This time the messages tab contains some output:

Stats header loaded: DbName: tempdb, ObjName: #orders, IndexId: 2, ColumnName: order_date, EmptyTable: FALSE

Stats header loaded: DbName: tempdb, ObjName: #orders, IndexId: 2, ColumnName: order_date, EmptyTable: FALSE

The optimizer identified the statistics on the oder_date column as “interesting” and then loaded the stats header.

Again, the actual execution plan confirms that a better estimation is available:

plan_2

The lack of statistics has always been the most significant difference between table variables and temporary tables and SQL2014 doesn’t appear to change the rules (yet).

A viable alternative to dynamic SQL in administration scripts


As a DBA, you probably have your toolbox full of scripts, procedures and functions that you use for the day-to-day administration of your instances.

I’m no exception and my hard drive is full of scripts that I tend to accumulate and never throw away, even if I know I will never need (or find?) them again.

However, my preferred way to organize and maintain my administration scripts is a database called “TOOLS”, which contains all the scripts I regularly use.

One of the challenges involved in keeping the scripts in a database rather than in a script file is the inability to choose a database context for the execution. When a statement is encapsulated in a view, function or stored procedure in a database, every reference to a database-specific object is limited to the database that contains the programmable object itself. The only way to overcome this limitation is the use of dynamic sql.

For instance, if I want to query the name of the tables in a database, I can use the following statement:

SELECT name FROM sys.tables

The statement references a catalog view specific to a single database, so if I enclose it in a stored procedure, the table names returned by this query are those found in the database that contains the stored procedure itself:

USE TOOLS;
GO

CREATE PROCEDURE getTableNames
AS
SELECT name FROM sys.tables;
GO

EXEC getTableNames;

This is usually not an issue, since most stored procedures will not cross the boundaries of the database they are created in. Administration scripts are different, because they are meant as a single entry point to maintain the whole SQL server instance.

In order to let the statement work against a different database, you need to choose one of the following solutions:

  1. dynamic SQL
  2. sp_executesql
  3. marking as a system object
  4. … an alternative way

Each of these techniques has its PROs and its CONs and I will try to describe them in this post.

1. Dynamic SQL

It’s probably the easiest way to solve the issue: you just have to concatenate the database name to the objects names.

USE TOOLS;
GO

ALTER PROCEDURE getTableNames @db_name sysname
AS
BEGIN
    DECLARE @sql nvarchar(max)
    SET @sql = 'SELECT name FROM '+ QUOTENAME(@db_name) +'.sys.tables';

    EXEC(@sql)
END
GO

EXEC getTableNames 'msdb';

PROS:

  • very easy to implement for simple statements

CONS:

  • can rapidly turn to a nightmare with big, complicated statements, as each object must be concatenated with the database name. Different objects have different ways to be related to the database: tables and views can be concatenated directly, while functions such as OBJECT_NAME accept an additional parameter to specify the database name.
  • the statement has to be treated as a string and enclosed in quotes, which means that:
    • quotes must be escaped, and escaped quotes must be escaped again and escaped and re-escaped quotes… ok, you know what I mean
    • no development aids such as intellisense, just-in-time syntax checks and syntax coloring

 

2. sp_executesql

It’s a neater way to avoid concatenating the database name to each object referenced in the statement.

USE TOOLS;
GO

ALTER PROCEDURE getTableNames @db_name sysname
AS
BEGIN

    -- use a @sql variable to store the whole query
    -- without concatenating the database name

    DECLARE @sql nvarchar(max);

    SET @sql = 'SELECT name FROM sys.tables';

    -- concatenate the database name to the
    -- sp_executesql call, just once

    DECLARE @cmd nvarchar(max);

    SET @cmd = 'EXEC '+ QUOTENAME(@db_name) +'.sys.sp_executesql @sql';

    EXEC sp_executesql @cmd, N'@sql nvarchar(max)', @sql

END
GO

EXEC getTableNames 'msdb';

PROS:

  • the dynamic sql is taken as a whole and does not need to be cluttered with multiple concatenations

CONS:

  • needs some more work than a straight concatenation and can be seen as “obscure”
  • suffers from the same issues found with plain dynamic sql, because the statement is, again, treated as a string

3. System object

Nice and easy: every stored procedure you create in the master database with the “sp_” prefix can be executed from any database context.

Using the undocumented stored procedure sp_MS_marksystemobject you can also mark the stored procedure as a “system object” and let it reference the tables in the database from which it is invoked.

USE master;
GO

ALTER PROCEDURE sp_getTableNames
AS
BEGIN
    SELECT name FROM sys.tables
END
GO

EXEC sys.sp_MS_marksystemobject 'sp_getTableNames'
GO

USE msdb;
GO

EXEC sp_getTableNames;

PROS:

  • no need to use dynamic sql

CONS:

  • requires creating objects in the “master” database, which is something I tend to avoid
  • works with stored procedures only (actually, it works with other objects, such as tables and views, but you have to use the “sp_” prefix. The day I will find a view named “sp_getTableNames” in the master database it won’t be safe to stay near me)

An alternative method:

It would be really helpful if we could store the statement we want to execute inside an object that doesn’t involve dynamic sql and doesn’t need to be stored in the master database. In other words, we need a way to get the best of both worlds.

Is there such a solution? Apparently, there isn’t.

The ideal object to store a statement and reuse it later is a view, but there is no way to “execute” a view against a different database. In fact you don’t execute a view, you just select from it, which is quite a different thing.

What you “execute” when you select from a view is the statement in its definition (not really, but let me simplify).

So, what we would need to do is just read the definition from a view and use the statement against the target database. Sounds straightforward, but it’s not.

The definition of a view also contains the “CREATE VIEW” statement and stripping it off is not just as easy as it seems.

Let’s see the issue with an example: I will create a view to query the last update date of the index statistics in a database, using the query from Glenn Berry’s Diagnostic Queries.

USE TOOLS;
GO

-- When were Statistics last updated on all indexes?  (Query 48)
CREATE VIEW statisticsLastUpdate
AS
SELECT
     DB_NAME() AS database_name
    ,o.NAME AS stat_name
    ,i.NAME AS [Index Name]
    ,STATS_DATE(i.[object_id], i.index_id) AS [Statistics Date]
    ,s.auto_created
    ,s.no_recompute
    ,s.user_created
    ,st.row_count
FROM sys.objects AS o WITH (NOLOCK)
INNER JOIN sys.indexes AS i WITH (NOLOCK)
    ON o.[object_id] = i.[object_id]
INNER JOIN sys.stats AS s WITH (NOLOCK)
    ON i.[object_id] = s.[object_id]
    AND i.index_id = s.stats_id
INNER JOIN sys.dm_db_partition_stats AS st WITH (NOLOCK)
    ON o.[object_id] = st.[object_id]
    AND i.[index_id] = st.[index_id]
WHERE o.[type] = 'U';

I just had to remove ORDER BY and OPTION(RECOMPILE) because query hints cannot be used in views.

Querying the object definition returns the whole definition of the view, not only the SELECT statement:

SELECT OBJECT_DEFINITION(OBJECT_ID('statisticsLastUpdate')) AS definition
definition
-------------------------------------------------------------------
-- When were Statistics last updated on all indexes?  (Query 48)
CREATE VIEW statisticsLastUpdate
AS
SELECT
     DB_NAME() AS database_name
    ,o.NAME AS stat_name
    ,i.NAME AS [Index Name]
    ,STATS_DATE(i.[object_id], i.index_id) AS [Statistics Date]
    ,s.auto_created
    ,s.no_recompute

(1 row(s) affected)

In order to extract the SELECT statement, we would need something able to parse (properly!) the view definition and we all know how complex it can be.

Fortunately, SQL Server ships with an undocumented function used in replication that can help solving the problem: its name is fn_replgetparsedddlcmd.

This function accepts some parameters, lightly documented in the code: fn_replgetparsedddlcmd (@ddlcmd, @FirstToken, @objectType, @dbname, @owner, @objname, @targetobject)

Going back to the example, we can use this function to extract the SELECT statement from the view definition:

SELECT master.sys.fn_replgetparsedddlcmd(
    OBJECT_DEFINITION(OBJECT_ID('statisticsLastUpdate'))
        ,'CREATE'
        ,'VIEW'
        ,DB_NAME()
        ,'dbo'
        ,'statisticsLastUpdate'
        ,NULL
) AS statement
statement
---------------------------------------------------------------------
AS
SELECT
     DB_NAME() AS database_name
    ,o.NAME AS stat_name
    ,i.NAME AS [Index Name]
    ,STATS_DATE(i.[object_id], i.index_id) AS [Statistics Date]
    ,s.auto_created
    ,s.no_recompute
    ,s.user_created
    ,st.row_count

(1 row(s) affected)

The text returned by the function still contains the “AS” keyword, but removing it is a no-brainer:

DECLARE @cmd nvarchar(max)
SELECT @cmd = master.sys.fn_replgetparsedddlcmd(
    OBJECT_DEFINITION(OBJECT_ID('statisticsLastUpdate'))
        ,'CREATE'
        ,'VIEW'
        ,DB_NAME()
        ,'dbo'
        ,'statisticsLastUpdate'
        ,NULL
    )

SELECT @cmd = RIGHT(@cmd, LEN(@cmd) - 2) -- Removes "AS"

SELECT @cmd AS statement
statement
-------------------------------------------------------------------

SELECT
     DB_NAME() AS database_name
    ,o.NAME AS stat_name
    ,i.NAME AS [Index Name]
    ,STATS_DATE(i.[object_id], i.index_id) AS [Statistics Date]
    ,s.auto_created
    ,s.no_recompute
    ,s.user_created
    ,st.row_count

(1 row(s) affected)

Now that we are able to read the SELECT statement from a view’s definition, we can execute that statement against any database we like, or even against all the databases in the instance.

-- =============================================
-- Author:      Gianluca Sartori - spaghettidba
-- Create date: 2013-04-16
-- Description: Extracts the view definition
--              and runs the statement in the
--              database specified by @db_name
--              If the target database is a pattern,
--              the statement gets executed against
--              all databases matching the pattern.
-- =============================================

CREATE PROCEDURE [dba_execute_view]
     @view_name sysname
    ,@db_name sysname
AS
BEGIN

SET NOCOUNT,
    XACT_ABORT,
    QUOTED_IDENTIFIER,
    ANSI_NULLS,
    ANSI_PADDING,
    ANSI_WARNINGS,
    ARITHABORT,
    CONCAT_NULL_YIELDS_NULL ON;
    SET NUMERIC_ROUNDABORT OFF;

DECLARE @cmd nvarchar(max)
DECLARE @sql nvarchar(max)

DECLARE @vw_schema sysname
DECLARE @vw_name sysname

IF OBJECT_ID(@view_name) IS NULL
BEGIN
    RAISERROR('No suitable object found for name %s',16,1,@view_name)
    RETURN
END

IF DB_ID(@db_name) IS NULL
    AND @db_name NOT IN ('[USER]','[SYSTEM]')
    AND @db_name IS NOT NULL
BEGIN
    RAISERROR('No suitable database found for name %s',16,1,@view_name)
    RETURN
END

SELECT @vw_schema = OBJECT_SCHEMA_NAME(OBJECT_ID(@view_name)),
    @vw_name = OBJECT_NAME(OBJECT_ID(@view_name))

SELECT @cmd = master.sys.fn_replgetparsedddlcmd(
    OBJECT_DEFINITION(OBJECT_ID(@view_name))
        ,'CREATE'
        ,'VIEW'
        ,DB_NAME()
        ,@vw_schema
        ,@vw_name
        ,NULL
    )

SELECT @cmd = RIGHT(@cmd, LEN(@cmd) - 2) -- Removes "AS"

-- CREATE A TARGET TEMP TABLE
SET @sql = N'
    SELECT TOP(0) * INTO #results FROM ' + @view_name + ';

    INSERT #results
    EXEC [dba_ForEachDB]
        @statement = @cmd,
        @name_pattern = @db_name;

    SELECT * FROM #results;'

EXEC sp_executesql
     @sql
    ,N'@cmd nvarchar(max), @db_name sysname'
    ,@cmd
    ,@db_name

END

The procedure depends on dba_ForEachDB, the stored procedure I posted a couple of years ago that replaces the one shipped by Microsoft. If you still prefer their version, you’re free to modify the code as you wish.

Now that we have a stored procedure that “executes” a view, we can use it to query statistics update information from a different database:

EXEC [dba_execute_view] 'statisticsLastUpdate', 'msdb'

single_db

We could also query the same information from all user databases:

EXEC [dba_execute_view] 'statisticsLastUpdate', '[USER]'

multidb

That’s it, very easy and straightforward.

Just one suggestion for the SELECT statements in the views: add a DB_NAME() column, in order to understand where the data comes from, or it’s going to be a total mess.

Next steps:

This is just the basic idea, the code can be improved in many ways.

For instance, we could add a parameter to decide whether the results must be piped to a temporary table or not. As you probably know, INSERT…EXEC cannot be nested, so you might want to pipe the results to a table in a different way.

Another thing you might want to add is the ability to order the results according to an additional parameter.

To sum it up, with a little help from Microsoft, we can now safely create a database packed with all our administration stuff and execute the queries against any database in our instance.

Moving system databases to the default data and log paths


Recently I had to assess and tune quite a lot of SQL Server instances and one the things that are often overlooked is the location of the system databases.

I often see instance where the system databases are located in the system drives under the SQL Server default installation path, which is bad for many reasons, especially for tempdb.

I had to move the system databases so many times that I ended up coding a script to automate the process.

The script finds all system databases that are not sitting in the default data and log paths and issues the ALTER DATABASE statements needed to move the files to the default paths.

Obviously, to let the script work, the default data and log paths must have been set in the instance properties:

MoveSystemDBs

You may also point out that moving all system databases to the default data and log paths is not always a good idea. And you would be right: for instance, if possible, the tempdb database should be working on a fast dedicated disk. However, very often I find myself dealing with low-end servers where separate data and log disks are a luxury, not to mention a dedicated tempdb disk.  If you are concerned about moving tempd to the default data and log paths, you can modify the script accordingly.

-- =============================================
-- Author:      Gianluca Sartori - spaghettidba
-- Create date: 2013-03-22
-- Description: Moves the system databases to the
--              default data and log paths and 
--              updates SQL Server startup params
--              accordingly.
-- =============================================
SET NOCOUNT ON;

USE master;

-- Find default data and log paths
-- reading from the registry

DECLARE @defaultDataLocation nvarchar(4000)
DECLARE @defaultLogLocation nvarchar(4000)

EXEC master.dbo.xp_instance_regread
    N'HKEY_LOCAL_MACHINE',
    N'Software\Microsoft\MSSQLServer\MSSQLServer',
    N'DefaultData',
    @defaultDataLocation OUTPUT

EXEC master.dbo.xp_instance_regread
    N'HKEY_LOCAL_MACHINE',
    N'Software\Microsoft\MSSQLServer\MSSQLServer',
    N'DefaultLog',
    @defaultLogLocation OUTPUT

-- Loop through all system databases
-- and move to the default data and log paths

DECLARE @sql nvarchar(max)

DECLARE stmts CURSOR STATIC LOCAL FORWARD_ONLY
FOR
SELECT
    ' ALTER DATABASE '+ DB_NAME(database_id) +
    ' MODIFY FILE ( ' +
    '     NAME = '''+ name +''', ' +
    '     FILENAME = '''+
    CASE type_desc
        WHEN 'ROWS' THEN @defaultDataLocation
        ELSE @defaultLogLocation
    END +
    '\'+ RIGHT(physical_name,CHARINDEX('\',REVERSE(physical_name),1)-1) +'''' +
    ' )'
FROM sys.master_files
WHERE DB_NAME(database_id) IN ('master','model','msdb','tempdb')
    AND (
        physical_name NOT LIKE @defaultDataLocation + '%'
        OR physical_name NOT LIKE @defaultLogLocation + '%'
    )

OPEN stmts
FETCH NEXT FROM stmts INTO @sql

WHILE @@FETCH_STATUS = 0
BEGIN

    PRINT @sql
    EXEC(@sql)

    FETCH NEXT FROM stmts INTO @sql
END

CLOSE stmts
DEALLOCATE stmts

-- Update SQL Server startup parameters
-- to reflect the new master data and log
-- files locations

DECLARE @val nvarchar(500)
DECLARE @key nvarchar(100)

DECLARE @regvalues TABLE (
    parameter nvarchar(100),
    value nvarchar(500)
)

INSERT @regvalues
EXEC master.dbo.xp_instance_regenumvalues
        N'HKEY_LOCAL_MACHINE',
        N'SOFTWARE\Microsoft\MSSQLServer\MSSQLServer\Parameters'

DECLARE reg CURSOR STATIC LOCAL FORWARD_ONLY
FOR
SELECT *
FROM @regvalues
WHERE value LIKE '-d%'
    OR value LIKE '-l%'

OPEN reg

FETCH NEXT FROM reg INTO @key, @val

WHILE @@FETCH_STATUS = 0
BEGIN

    IF @val LIKE '-d%'
        SET @val = '-d' + (
            SELECT physical_name
            FROM sys.master_files
            WHERE DB_NAME(database_id) = 'master'
                AND type_desc = 'ROWS'
        )
    IF @val LIKE '-l%'
        SET @val = '-l' + (
            SELECT physical_name
            FROM sys.master_files
            WHERE DB_NAME(database_id) = 'master'
                AND type_desc = 'LOG'
        )

    EXEC master.dbo.xp_instance_regwrite
        N'HKEY_LOCAL_MACHINE',
        N'SOFTWARE\Microsoft\MSSQLServer\MSSQLServer\Parameters',
        @key,
        N'REG_SZ',
        @val

    FETCH NEXT FROM reg INTO @key, @val

END

CLOSE reg
DEALLOCATE reg

After running this script, you can shut down the SQL Server service and move the data and log files to the appropriate locations.

When the files are ready, you can bring SQL Server back online.

BE CAREFUL! Before running this script against a clustered instance, check what the xp_instance_regread commands return: I have seen cases with SQL Server not reading from the appropriate keys.