Category Archives: SQL Server

SQL Server Infernals – Circle 3: Shaky Typers


Infernals

Choosing the right data type for your columns is first of all a design decision that has tremendous impact on the correctness of the database schema. It is not just about performance or space usage: the data type is the first constraint on your data and it decides what can be persisted in your columns and what is not acceptable.

Choosing the wrong data type for your columns is a mistake that might make your life as a DBA look like hell.

What they say in Heaven

Guided by angelic spells, the hands that design databases in Heaven always choose the right data type. Database architects always look at the logical schema and ask the right questions about each attribute and they always manage to understand what the attribute is used for and what it will be used for in the future.

What will put you to hell

Choosing the wrong data type is like trying to fit a square peg in a round hole. The worst thing about it is that you end up damaging the peg… ahem… the data.

SquarePegRoundHole

  1. Using numeric data types for non-numeric attributes: Even if a telephone number contains only digits and it’s called telephone number, it is not a number at all. It does not allow mathematical operations and it has no order relation (saying that a telephone number is greater than another one makes no sense). In fact, a telephone number is a code you have to dial to contact a telephone extension. The same can be said for ZIP codes, which only allow numeric digits, but are nothing like a number. Storing this data in a numeric column is looking for trouble.
  1. Storing data as their human-readable representation: A Notable example is dates stored as (var)char. The string representation of a date is not a date at all: without the validation rules included in the date types, any invalid date could be saved in your column, including ‘2015-02-30’ or ‘2015-33-99’. Moreover, varchar columns do not allow date manipulation functions, such as DATEADD, DATEDIFF, YEAR, MONTH and so on. Another reason why this is a terrible idea is that dates have their own sorting rules, which you lose when you store them as strings. You also need more storage space to save a string representation of a date compared to the proper date type. If you really want to convert a date to a string, you can find many algorithms and functions to perform the conversion in this article I wrote for SQLServerCentral in 2012, but please do it in your presentation layer, not when storing the data in your tables.
    Another surprisingly common mistake in the AS/400 world is storing dates in three separate integer columns for year, month and day. I have no idea where this pattern comes from, but it definitely belongs to hell.
    While much more uncommon in the wild, the same applies to numbers: storing them as varchars is a terrible idea.
    Extra evil bonus: you get double evil points for storing dates and numbers as nvarchar: double the storage, double the pain.
  1. Using deprecated data types: (n)text and image are things of the past: get over it. The replacement (n)varchar(max) and varbinary(max) are much more powerful and flexible.
  1. Using “extended” data type just to “be safe”: This applies both to numeric and character columns: using a bigger data type just to play it safe can be a good idea at times, but not when the size of the column is well known upfront and is instead a vital constraint on the data itself. For instance, a ZIP code longer than 5 characters is obviously an error. A social security number longer than 9 digits is not valid.
    Along the same lines, storing years in a int column is only going to be a waste of storage space. The same can be said about small lookup tables with just a handful of rows in them, where the key column can be a smallint or even a tinyint: it won’t save much space in the lookup table itself, but it can save lots of space in the main tables (with many more rows) where the code is referenced.
  1. Storing fixed-size information in varchar columns: Similarly to the previous sin, when your attribute has a fixed character size, there is no point in using a varying character type. If your attribute has exactly 3 characters, why use varchar(3)?
    Extra evil bonus: varchar(1) will get you double points.
  1. Storing duration in time or datetime columns: Datetime and time represent points in time and they are not meant for storing durations. If you really want to store a duration, use a numeric column to store the number of seconds (it’s the ANSI standard unit measure for representing a duration). Even better, you could store the start/end date and time in two separate datetime columns. SQL Server 2016 also supports periods. 
  1. Getting Unicode wrong: Choosing nvarchar for attributes that will never contain Unicode data and choosing varchar for attributes that can contain Unicode data are equally evil and will get you instant damnation. For instance, a ZIP code will only contain numeric characters, so using Unicode data types will have the only outcome of wasting space. At the same time, storing customer business names or annotations in varchar columns means that you won’t be able to persist international characters. While it may appear quite unlikely that such characters will ever appear in your database, you will regret your decision when that happens (and it will). 
  1. Messing with XML: I’m not a big fan of XML in the database, but sometimes it can come handy. Storing XML data in a plain varchar column is a very bad idea. The XML data type provides validation rules that won’t allow in invalid or malformed XML and also provides functions to manipulate the XML data. Storing schema-less XML is another bad idea: if you have an XML schema use it, otherwise you will end up saving invalid data. On the other hand, using XML to go “beyond relational” and mimic Oracle’s nested tables will only get you damned. Fun times. 
  1. Using different data types in different tables for the same attribute: there’s only one thing worse than getting your data types wrong: getting them wrong in multiple places. Once you decided the data type to store an attribute, don’t change your mind when designing new tables. If it is a varchar(10), don’t use varchar(15) in your next table. Usually proper foreign key constraints help you avoid this issue, but it’s not always the case.
    If this query returns rows, chances are that you have schizophrenic columns in your database schema:

    WITH my_schema AS (
        SELECT OBJECT_NAME(c.object_id) AS table_name,
            c.name AS column_name,
            t.name AS type_name,
            c.max_length,
            c.precision,
            c.scale
        FROM sys.columns AS c
        INNER JOIN sys.types AS t
            ON c.system_type_id = t.system_type_id
    ),
    incarnations AS (
        SELECT *,
            DENSE_RANK() OVER (
                PARTITION BY column_name
                ORDER BY type_name, max_length, precision, scale
            ) AS incarnation_number
        FROM my_schema
    ),
    incarnation_count AS (
        SELECT *,
            MAX(incarnation_number) OVER (
                PARTITION BY column_name
            ) AS incarnation_count
        FROM incarnations
    )
    SELECT *
    FROM incarnation_count
    WHERE incarnation_count > 1
    ORDER BY incarnation_count DESC,
        column_name,
        type_name,
        max_length,
        precision,
        scale;
    

The lack of proper constraints will be the topic of the next post, when we will meet the anarchic designers. Stay tuned!

SQL Server Infernals – Circle 2: Generalizers


Infernals

Object-Oriented programming taught us that generalizing is a good thing and, whenever possible, we should do it. Complex class hierarchies are a good way of reusing code, hitting the specialized classes only when a special implementation is needed.

In the database world, the concept doesn’t play exactly well.

What they say in Heaven

In Heaven, there is a lookup table for each attribute, no matter how simple and no matter how small is the lookup table.

For instance, if your database is about sales, you probably have a Customers table and an Orders table, each with its own attributes resolved through a Foreign Key. The lookup tables are usually very small, with just a handful of rows in them:

lookup_good

Temptation comes from our own desires

Wouldn’t that be great if you could stop adding small, insignificant tables to your database schema? Wouldn’t it be a lot easier if you had ONE table to store all that lookup nonsense? “Less is more” after all, isn’t it?

If you had a “One True Lookup Table”, everything would be more elegant and simple. Look at this database schema:

lookup_bad

Isn’t it elegant and clean?

Whenever you need a new lookup table, you just have to add rows to your OTLT™ (thanks Phil Factor for the acronym):

INSERT INTO LookupTable
    (table_name, lookup_code, lookup_description)
VALUES
    ('Order_Status', 'OP', 'Open'),
    ('Order_Status', 'CL', 'Closed'),
    ('Order_Status', 'SH', 'Shipped');

Devil’s in the details

You may have less tables to deal with now, but there’s a price to pay. A bigger price than you would have expected.

  1. No foreign keys: Did you notice that the foreign keys are gone? In order to create a foreign key, you would have to add the lookup table name to the Orders and Customers tables, for each attribute stored in the lookup table. I don’t think you would like it.
  1. Generic data type: In order to merge all lookup tables in one, you need to choose a “generic” data type that fits for all. The most generic data type is a character-based type, so you’ll probably end up with a huge nvarchar column. You probably don’t want the same huge column in the referencing tables and you could end up having different data types between the main tables and the lookups. One more not-so-good idea. Moreover, when you’re joining your tables with the lookup table, you will have implicit (or explicit) conversions happening, which is a performance nightmare.
  2. Single Hotspot: Instead of hitting multiple tables for lookups, everyone will hit the same table over and over. This will create a hotspot in the database, with locking and latching issues all over the place.
  1. Acrobatic constraints: Defining constraints on a generic table becomes very difficult. Not an impossible deal, but very difficult. For the schema in this example, you could define a CHECK constraint to enforce the use of the correct data type, but the syntax of the constraint will not be very straightforward:
    CHECK(
        CASE
            WHEN lookup_code = 'states'     AND lookup_code LIKE '[A-Z][A-Z]'      THEN 1
            WHEN lookup_code = 'priorities' AND lookup_code LIKE '[0-9]'           THEN 1
            WHEN lookup_code = 'countries'  AND lookup_code LIKE '[0-9][0-9][0-9]' THEN 1
            WHEN lookup_code = 'status'     AND lookup_code LIKE '[A-Z][A-Z]'      THEN 1
            ELSE 0
        END = 1
    )
    

It could get even worse

As soon as you start to realize that trading multiple lookup tables for an OTLT is not a good deal, devil will raise the bid and offer the ultimate generalization: the Entity Attribute Value, also known as “EAV”.

If you come to think of it, who needs fixed attributes in a table when you can have as many attributes as you want in a general-purpose table? Why messing with ALTER TABLE statements when you can have a single table that can store an infinite number of attributes that you can bind to any row in any table?

A typical EAV schema looks like this:

EAV

This way, you can have any type of attribute bound to your main entities. For instance, to define a “ship_date” attribute in your Orders table, you just have to insert a couple of rows in your EAV schema:

INSERT INTO Entities
    (entity_id, entity_name)
VALUES
    (1, 'Orders');

INSERT INTO AttributeNames
    (attribute_id, entity_id, attribute_name)
VALUES
    (1, 1, 'ship_date');

INSERT INTO AttributeValues
    (attribute_id, entity_id, id, value)
VALUES
    (1, 1, 123, '2015-06-24 22:10:00.000');

Looks like a great idea, doesn’t it? Unfortunately, it is not.

  1. Generic data types: again, what would prevent a date such as ‘2015-02-30 18:30:00.000’ from being assigned to the ship date? Uh-oh: nothing.
  1. No foreign keys: again, enforcing foreign key constraints would be impossible.
  1. A single hotspot in the database: every attribute for every table involved in this nonsense would have to be looked up in the same table.
  1. No constraints: how would you enforce a constraint as simple as “NOT NULL”? Good luck with that.
  1. Dreadful reporting queries: when you will be asked to create a report on a table that uses this paradigm (I said “when”, not “if”, because it will happen), you will have to OUTER JOIN to the EAV table for each and every attribute that you want to retrieve. In case you are wondering if this is good or bad, take into account that the optimizer starts to freak out when it finds too many JOINS in a query and will likely timeout looking for a decent execution plan, feeding you the best it could come up with (usually, a mess).

It depends?

Some software solutions are entirely based on user-defined attributes and the ability to define them is a central feature. For instance, many CRM solutions are heavily dependent on user-defined attributes. However, there are many ways to achieve the same results without resorting to an EAV design. For instance, one could wonder why ALTERing the database schema seems to be a less desirable solution.

There are also many flavors of EAV, with different degrees of evil involved. Some implementations at least provide different columns for different data types, some others use XML or JSON.

The EAV design comes with the intent of solving a real world problem that doesn’t have a definitive answer in the relational model. In partial defense of the “generalizers”, it has to be said that this is a challenging problem. Nevertheless, like Dante put his political enemies to hell, I am the “poet” and I’m afraid that the generalizers will have to get accustomed to sulfur. It just takes a couple of thousand years, after all.

Who will be damned next?

In the next circle of the SQL Server hell we will meet the shaky typers – the poor souls that chose the wrong data types for their columns. Stay tuned for more!

SQL Server Infernals – Circle 1: Undernormalizers


Infernals

There’s a special place in the SQL Server Hell for those who design their schema without following the Best Practices. In this first episode of SQL Server Infernals, we will explore together the Row of the Poor Schema Designers, also known as “undernormalizers”.

What they say in Heaven

In Heaven, where all Best Practices are followed and everything runs smoothly while angels sing, they design their databases following the rules of normalization. Once upon a time, there was a man who spent a considerable amount of his life working on defining the rules of the relational model. That man was Edgar Codd.

Mr. Codd laid down the rules of normalization, which are known as “normal forms”. The normal forms define the attributes of a well-designed database schema. While there are more normal forms, it is widely accepted that a schema is normalized when it follows the first three normal forms. Here is the simplest possible enunciations of each:

  • 1NF – Every relation has a primary key, every relation contains only atomic attributes
  • 2NF – 1NF + Every attribute in a relation depends on the whole key
  • 3NF – 2NF + Every attribute in a relation depends only on the key

In a single line: “The key, the whole key, nothing but the key (so help me Codd)”.

Clues you’re doing it wrong

  • Repeating data (redundancies): the same information has to be saved in multiple places
  • Inconsistent data between tables (anomalies): the same information has different values in different tables
  • Data separated by commas
  • Structured data in “note” columns
  • Columns with a numeric suffix (e.g. Zone1, Zone2, Zone3…)

databasedesign

What will put you to hell

  1. No primary key: did you notice that the normal forms talk about “relations” rather than “tables”? The relational model is a mathematical model, which, at some point has to be translated to a physical implementation. Tables are exactly this: the physical implementation of relations.
    If your table has no primary key and relations must have a primary key, chances are that your table is the physical implementation of something else (a bin, a pile, a bag… whatever: not a relation anyway).
    When tables have no primary key, any data can be stored inside them, even duplicate rows. Once duplicate data is inside the table, there is no way to tell which row is good and which one is the duplicate.
  1. Surrogate keys only: this is in fact a variation on the “no primary key” sin: if your table has a surrogate key (such as an identity or uniqueidentifier column), make sure that it is not the only unique key in the table, otherwise you will end up storing duplicates, with only the surrogate key as a difference. This is no different from having no primary key at all.
    If you decide that your table is best implemented with a surrogate key (often because the natural key is composite or too wide), make sure that you create a UNIQUE constraint on the natural key.
  1. Non-atomic attributes: if your table has columns that contain multiple values, the likelihood of a design mistake goes to the roof. For instance, when you find data such as “sales@company.com,marketing@company.com” in a “email” column, chances are that the designer of the database forgot to take into account that the entity (for instance a customer) might have multiple email addresses.
    While many efficient split algorithms are available, storing the data in this format has many downsides:

    • Indexing individual items is impossible
    • Searching for individual items is hard
    • Updating an item requires writing the whole comma separated value
    • Locking a single item is impossible (reduced concurrency)
    • CHECK constraints are hard to implement

    Whenever you find non-atomic attributes, be prepared to refactor the database schema, because something is really wrong and there is no way to fix it without moving the attribute to a different table.

  1. Use of NULL when not necessary: NULL is a constraint on the data: if an attribute is mandatory, make it mandatory! Allowing NULLs on mandatory data will open the door to data that does not meet the business rules. What are you doing with rows that are missing mandatory attributes?
    If your table has too many NULLs in it, you probably have designed it wrong and you are trying to fit too many attributes in a single table (an implicit dependency?): move them to a separate table.
  1. Use of dummy data: The other side of the coin is “no NULLs allowed anywhere”. Not all attributes are mandatory: if you pretend it is so, the users will start putting dummy data into your columns to work around the restriction. Typical examples are “.” or empty strings for character-based columns and “0” for numeric-based columns. Once those dummy values are in, can you tell the difference between “dummy” zeros and “real” zeros? Of course you can’t.
  1. Designing the database when specs are incomplete/unclear: This is the worst of all mistakes. Changing the database schema once it is in production is a bloodbath: everything built on top of that schema will have to change. Many of the design mistakes described above are the consequence of incomplete specifics or lack of analysis.
    It is discouraging to note how some popular design patterns do not take into account the intrinsic complexity of refactoring a database schema and demand the implementation of the schema to automated tools, based on the object classes that represent the domain. Yes, I’m talking about you, Code First. In my book, “code first” is a synonym of “design someday”. Don’t let any automated tool design your database schema: you know better than that!
  1. Premature denormalization: some devil’s advocates will tell you that normalization slows down the database and that you should be denormalizing your schema from the start. Don’t believe what they say! Most normalized schemas can cope with sustained reads and SQL Server offers many features (such as indexed views) to deal with high numbers of joins, which is usually the point for denormalization. Of course, if you’re working on a BI project, denormalization is expected and desirable instead.
    An old saying goes: “Normalize ‘til it hurts, then denormalize ‘til it works”, but there’s nothing preventing the database from working on a normalized schema. If reads on the database are slow, it is quite unlikely that the cause is over-normalization: it is much more presumable that your queries and/or your indexes suck are sub-optimal.

In the next episode of SQL Server Infernals I will discuss a particular database design sin: the dynamic schema. Stay tuned for more.

Announcing SQL Server Infernals


Infernals

Today I’m starting a new blog series called “SQL Server Infernals”.

Throughout this series, I will take your hand and walk you through the hell of SQL Server Worst Practices, as Virgil did with Dante in his Commedia.

You may ask why you should care about worst practices, when you have loads of great sources for Best Practices. The answer is that they are not enough.

  • There are too many Best Practices: how are you supposed to know all of them?
  • There is no time to follow them all: when you’re in a hurry, sometimes it’s enough to know that you’re not doing it completely wrong.
  • They seem to be all equally important: experience helps you understand which Best Practices really are important and which ones are not.
  • It’s not always clear what happens when you don’t follow them.

On the other hand, Worst Practices can help you understand what threat is behind the corner:

  • They show you the mistakes to avoid
  • You can learn from someone else’s mistakes (I made plenty throughout my career…)

Obviously, as usual, it depends: not everything is black or white and sometimes you will find in this blog series something that not everyone will agree is a Worst Practice. Sometimes you are forced by some constraint to adopt a solution that you will find here listed as a bad idea. No problem: nobody is pointing a finger at your work.

I will break the Worst Practices in categories, each one related to a specific area of SQL Server:

worst_practices_areas

Stay tuned for your walk through the SQL Server hell!

Your pilgrim’s guide,
The SpaghettiDBA

Native Client Aliases don’t like Trailing Spaces


I usually don’t post small things like this, but today I fought with this obnoxious problem long enough to convince me that it deserved a shout out to the community.

When you create an alias in the SQL Server Configuration Manager, make sure that the alias name contains no spaces, otherwise it won’t work as you expect.

In my case, I had a Reporting Services instance with many data sources pointing to a SQL Server instance (let’s call it MyServer) and I wanted to redirect connections to a different instance using an alias. So I opened Configuration Manager and created an alias like this:

alias

To my great surprise, the alias didn’t work and it took quite some time to notice that something was wrong. The server “MyServer” was a perfectly working existing instance of SQL Server, so no connection was dropped: they all just happened to contact the wrong server. If spotting the problem was hard, fixing it turned out to be even harder: why on earth did the alias refuse to work, while all other aliases were working perfectly?

It turned out to be the simplest of all answers: a trailing space in the alias name.

Just looking at the alias properties it wasn’t too obvious that something was off, but clicking on the alias name field, the cursor appeared slightly more on the right than it should have been:

alias2

Bottom line is: always check your assumptions, because problems like to hide where you won’t search for them.

Counting the number of rows in a table


Don’t be fooled by the title of this post: while counting the number of rows in a table is a trivial task for you, it is not trivial at all for SQL Server.

Every time you run your COUNT(*) query, SQL Server has to scan an index or a heap to calculate that seemingly innocuous number and send it to your application. This means a lot of unnecessary reads and unnecessary blocking.

Jes Schultz Borland blogged about it some time ago and also Aaron Bertrand has a blog post on this subject. I will refrain from repeating here what they both said: go read their blogs to understand why COUNT(*) is a not a good tool for this task.

The alternative to COUNT(*) is reading the count from the table metadata, querying sys.partitions, something along these lines:

SELECT SUM(p.rows)
FROM sys.partitions p
WHERE p.object_id = OBJECT_ID('MyTable')
    AND p.index_id IN (0,1); -- heap or clustered index

Many variations of this query include JOINs to sys.tables, sys.schemas or sys.indexes, which are not strictly necessary in my opinion. However, the shortest version of the count is still quite verbose and error prone.

Fortunately, there’s a shorter version of this query that relies on the system function OBJECTPROPERTYEX:

SELECT OBJECTPROPERTYEX(OBJECT_ID('MyTable'),'cardinality')

Where does it read data from? STATISTICS IO doesn’t return anything for this query, so I had to set up an Extended Events session to capture lock_acquired events and find out the system tables read by this function:

sysalloc

Basically, it’s just sysallocunits and sysrowsets.

It’s nice, short and easy to remember. Enjoy.

How to post a T-SQL question on a public forum


If you want to have faster turnaround on your forum questions, you will need to provide enough information to the forum users in order to answer your question.

In particular, talking about T-SQL questions, there are three things that your question must include:

  1. Table scripts
  2. Sample data
  3. Expected output

 

Table Script and Sample data

Please make sure that anyone trying to answer your question can quickly work on the same data set you’re working on, or, at least the problematic part of it. The data should be in the same place where you have it, which is inside your tables.

You will have to provide a script that creates your table and inserts data inside that table.

Converting your data to INSERT statements can be tedious: fortunately, some tools can do it for you.

How do you convert a SSMS results grid, a CSV file or an Excel spreadsheet to INSERT statements? In other words, how do you convert this…

table_data

into this?

USE [tempdb]
GO

CREATE TABLE [dbo].[Person](
	[BusinessEntityID] [int] NOT NULL PRIMARY KEY CLUSTERED,
	[PersonType] [nchar](2) NOT NULL,
	[FirstName] [nvarchar](50) NOT NULL,
	[LastName] [nvarchar](50) NOT NULL
)

GO

INSERT INTO Person VALUES (6106,'IN','Beth','Carlson');
INSERT INTO Person VALUES (17889,'IN','Dennis','Li');
INSERT INTO Person VALUES (17989,'IN','Brent','Li');
INSERT INTO Person VALUES (9424,'IN','Brad','Raji');
INSERT INTO Person VALUES (5842,'IN','Aimee','She');
INSERT INTO Person VALUES (2144,'GC','Carol','Philips');
INSERT INTO Person VALUES (2582,'IN','Gregory','Tang');
INSERT INTO Person VALUES (2012,'SC','Jian','Wang');
INSERT INTO Person VALUES (12624,'IN','Clayton','She');
INSERT INTO Person VALUES (12509,'IN','Madison','Russell');
GO

The easiest way to perform the transformation is to copy all the data and paste it over at ConvertCSV:

ConvertCSVToSQL_1

ConvertCSVToSQL_2

ConvertCSVToSQL_3

Another great tool for this task is SQLFiddle.

OPTIONAL: The insert statements will include the field names: if you want to make your code more concise, you can remove that part by selecting the column names with your mouse holding the ALT key and then delete the selection. Here’s a description of how the rectangular selection works in SSMS 2012 and 2014 (doesn’t work in SSMS 2008).

Expected output

The expected output should be something immediately readable and understandable. There’s another tool that can help you obtain it.

Go to http://www.sensefulsolutions.com/2010/10/format-text-as-table.html and paste your data in the textarea, press “Create Table” and voila.

FormatAsTable_2

Here’s what your output should look like:

+------------+-------------+
| PersonType | PersonCount |
+------------+-------------+
| GC         |           1 |
| IN         |           8 |
| SC         |           1 |
+------------+-------------+

Show what you have tried

Everybody will be more willing to help you if you show that you have put some effort into solving your problem. If you have a query, include it, even if it doesn’t do exactly what you’re after.

Please please please, format your query before posting! You can format your queries online for free at PoorSQL.com

PoorMans2

Simply paste your code then open the “Formatted SQL” tab to grab your code in a more readable way.

Putting it all together

Here is what your question should look like when everything is ok:

Hi all, I have a table called Person and I have to extract the number of rows for each person type.

This is the table script and some sample data:

USE [tempdb]
GO

CREATE TABLE [dbo].[Person](
	[BusinessEntityID] [int] NOT NULL PRIMARY KEY CLUSTERED,
	[PersonType] [nchar](2) NOT NULL,
	[FirstName] [nvarchar](50) NOT NULL,
	[LastName] [nvarchar](50) NOT NULL
)

GO

INSERT INTO Person VALUES (6106,'IN','Beth','Carlson');
INSERT INTO Person VALUES (17889,'IN','Dennis','Li');
INSERT INTO Person VALUES (17989,'IN','Brent','Li');
INSERT INTO Person VALUES (9424,'IN','Brad','Raji');
INSERT INTO Person VALUES (5842,'IN','Aimee','She');
INSERT INTO Person VALUES (2144,'GC','Carol','Philips');
INSERT INTO Person VALUES (2582,'IN','Gregory','Tang');
INSERT INTO Person VALUES (2012,'SC','Jian','Wang');
INSERT INTO Person VALUES (12624,'IN','Clayton','She');
INSERT INTO Person VALUES (12509,'IN','Madison','Russell');

This is what I’m trying to obtain:

+------------+-------------+
| PersonType | PersonCount |
+------------+-------------+
| GC         |           1 |
| IN         |           8 |
| SC         |           1 |
+------------+-------------+

Here is what I have tried:

SELECT PersonType
FROM Person

How do I do that?

If you include this information in your posts, I promise you will get blazingly fast answers.

Tracking Table Usage and Identifying Unused Objects


One of the things I hate the most about “old” databases is the fact that unused tables are kept forever, because nobody knows whether they’re used or not. Sometimes it’s really hard to tell. Some databases are accessed by a huge number of applications, reports, ETL tools and God knows what else. In these cases, deciding whether you should drop a table or not is a tough call.

Search your codebase

The easiest way to know if a table is used, is to search the codebase for occurences of the table name. However, finding the table name in the code does not mean it is used: there are code branches that in turn are not used. Modern languages and development tools can help you identify unused methods and objects, but it’s not always feasible or 100% reliable (binary dependencies, scripts, dynamic code are, off top of my head, some exceptions).
On the other hand, not finding the table name in the code does not mean you can delete it with no issues. The table could be used by dynamic code and the name retrieved from a configuration file or a table in the database.

In other cases, the source code is not available at all.

Index usage: clues, not evidence

Another way to approach the problem is by measuring the effects of the code execution against the database, in other words, by looking at the information stored by SQL Server whenever a table is accessed.

The DMV sys.dm_db_index_usage_stats records information on all seeks, scans, lookups and updates against indexes and is a very good place to start the investigation. If something is writing to the table or reading from it, you will see the numbers go up and the dates moving forward.
Great, so we’re done and this post is over? Not exactly: there are some more facts to take into account.

First of all, the DMV gets cleared every time SQL Server is restarted, so the accuracy of the data returned is heavily dependant on how long the instance has been running. Moreover, some actions (rebuilding the index, to name one) reset the index usage stats and if you want to rely on sensible stats, your only option is to persist the data in some place regularly.

To achieve this goal, I coded this simple stored procedure that reads the stats from the DMV and stores it in a table, updating the read and write counts for each subsequent execution.

-- You have a TOOLS database, right?
-- If not, create one, you will thank me later
USE TOOLS;
GO

-- A place for everything, everything in its place
IF SCHEMA_ID('meta') IS NULL
	EXEC('CREATE SCHEMA meta;')
GO

-- This table will hold index usage summarized at table level
CREATE TABLE meta.index_usage(
       db_name sysname,
       schema_name sysname,
       object_name sysname,
       read_count bigint,
       last_read datetime,
       write_count bigint,
       last_write datetime,
       PRIMARY KEY CLUSTERED (db_name, schema_name, object_name)
)

-- This table will hold the last snapshot taken
-- It will be used to capture the snapshot and
-- merge it with the destination table
CREATE TABLE meta.index_usage_last_snapshot(
       db_name sysname,
       schema_name sysname,
       object_name sysname,
       read_count bigint,
       last_read datetime,
       write_count bigint,
       last_write datetime,
       PRIMARY KEY CLUSTERED (db_name, schema_name, object_name)
)
GO

-- This procedure captures index usage stats
-- and merges the stats with the ones already captured
CREATE PROCEDURE meta.record_index_usage
AS
BEGIN

SET NOCOUNT ON;

IF OBJECT_ID('tempdb..#stats') IS NOT NULL
       DROP TABLE #stats;

-- We will use the index stats multiple times, so parking
-- them in a temp table is convenient
CREATE TABLE #stats(
       db_name sysname,
       schema_name sysname,
       object_name sysname,
       read_count bigint,
       last_read datetime,
       write_count bigint,
       last_write datetime,
       PRIMARY KEY CLUSTERED (db_name, schema_name, object_name)
);

-- Reads index usage stats and aggregates stats at table level
-- Aggregated data is saved in the temporary table
WITH index_stats AS (
       SELECT DB_NAME(database_id) AS db_name,
              OBJECT_SCHEMA_NAME(object_id,database_id) AS schema_name,
              OBJECT_NAME(object_id, database_id) AS object_name,
              user_seeks + user_scans + user_lookups AS read_count,
              user_updates AS write_count,
              last_read = (
                  SELECT MAX(value)
                  FROM (
                      VALUES(last_user_seek),(last_user_scan),(last_user_lookup)
                  ) AS v(value)
              ),
              last_write = last_user_update
       FROM sys.dm_db_index_usage_stats
       WHERE DB_NAME(database_id) NOT IN ('master','model','tempdb','msdb')
)
INSERT INTO #stats
SELECT db_name,
       schema_name,
       object_name,
       SUM(read_count) AS read_count,
       MAX(last_read) AS last_read,
       SUM(write_count) AS write_count,
       MAX(last_write) AS last_write
FROM index_stats
GROUP BY db_name,
       schema_name,
       object_name;

DECLARE @last_date_in_snapshot datetime;
DECLARE @sqlserver_start_date datetime;

-- reads maximum read/write date from the data already saved in the last snapshot table
SELECT @last_date_in_snapshot = MAX(CASE WHEN last_read > last_write THEN last_read ELSE last_write END)
FROM meta.index_usage_last_snapshot;

-- reads SQL Server start time
SELECT @sqlserver_start_date = sqlserver_start_time FROM sys.dm_os_sys_info;

-- handle restarted server: last snapshot is before server start time
IF (@last_date_in_snapshot) < (@sqlserver_start_date)
       TRUNCATE TABLE meta.index_usage_last_snapshot;

-- handle snapshot table empty
IF NOT EXISTS(SELECT * FROM meta.index_usage_last_snapshot)
       INSERT INTO meta.index_usage_last_snapshot
       SELECT * FROM #stats;

-- merges data in the target table with the new collected data
WITH offset_stats AS (
       SELECT newstats.db_name,
              newstats.schema_name,
              newstats.object_name,
              -- if new < old, the stats have been reset
              newstats.read_count -
                  CASE
                      WHEN newstats.read_count < ISNULL(oldstats.read_count,0) THEN 0
                      ELSE ISNULL(oldstats.read_count,0)
                  END
                  AS read_count,
              newstats.last_read,
              -- if new < old, the stats have been reset
              newstats.write_count -
                  CASE
                      WHEN newstats.write_count < ISNULL(oldstats.write_count,0) THEN 0
                      ELSE ISNULL(oldstats.write_count,0)
                  END
              AS write_count,
              newstats.last_write
       FROM #stats AS newstats
       LEFT JOIN meta.index_usage_last_snapshot AS oldstats
              ON newstats.db_name = oldstats.db_name
              AND newstats.schema_name = oldstats.schema_name
              AND newstats.object_name = oldstats.object_name
)
MERGE INTO meta.index_usage AS dest
USING offset_stats AS src
       ON src.db_name = dest.db_name
       AND src.schema_name = dest.schema_name
       AND src.object_name = dest.object_name
WHEN MATCHED THEN
       UPDATE SET read_count += src.read_count,
              last_read = src.last_read,
              write_count += src.write_count,
              last_write = src.last_write
WHEN NOT MATCHED BY TARGET THEN
       INSERT VALUES (
           src.db_name,
           src.schema_name,
           src.object_name,
           src.read_count,
           src.last_read,
           src.write_count,
           src.last_write
       );

-- empty the last snapshot
TRUNCATE TABLE meta.index_usage_last_snapshot;

-- replace it with the new collected data
INSERT INTO meta.index_usage_last_snapshot
SELECT * FROM #stats;

END

GO

You can schedule the execution of the stored procedure every hour or so and you will see data flow in the meta.index_usage_last_snapshot table. Last read/write date will be updated and the read/write counts will be incremented by comparing saved counts with the captured ones: if I had 1000 reads in the previous snapshot and I capture 1200 reads, the total reads column must be incremented by 200.

So, if I don’t find my table in this list after monitoring for some days, is it safe to assume that it can be deleted? Probably yes. More on that later.
What these stats don’t tell you is what to do when you do find the table in the list. It would be reasonable to think that the table is used, but there are several reasons why it may have ended up being read or written and not all of them will be ascribable to an application.
For instance, if a table is merge replicated, the replication agents will access it and read counts will go up. What the index usage stats tell us is that something is using a table but it says nothing about the nature of that something. If you want to find out more, you need to set up some kind of monitoring that records additional information about where reads and writes come from.

Extended Events to the rescue

For this purpose, an audit is probably too verbose, because it will record an entry for each access to each table being audited. The audit file will grow very quickly if not limited to a few objects to investigate. Moreover, audits have to be set up for each table and kept running for a reasonable time before drawing conclusions.

Audits are based on Extended Events: is there another way to do the same thing Audits do using extended events directly? Of course there is, but it’s trickier than you would expect.

First of all, the Extended Events used by the audit feature are not available directly. You’ve been hearing several times that audits use Extended Events but nobody ever told you which events they are using: the reason is that those events are not usable in a custom Extended Events session (the SecAudit package is marked as “private”). As a consequence, if you want to audit table access, you will have to use some other kind of event.

In order to find out which Extended Events provide information at the object level, we can query the sys.dm_xe_object_columns DMV:

SELECT object_name, description
FROM sys.dm_xe_object_columns
WHERE name = 'object_id'

As you will see, the only event that could help in this case is the lock_acquired event. Whenever a table is accessed, a lock will be taken and capturing those locks is a quick and easy way to discover activity on the tables.

Here is the definition of a session to capture locking information:

CREATE EVENT SESSION [audit_table_usage] ON SERVER
ADD EVENT sqlserver.lock_acquired (
    SET collect_database_name = (0)
        ,collect_resource_description = (1)
    ACTION(sqlserver.client_app_name, sqlserver.is_system, sqlserver.server_principal_name)
    WHERE (
        [package0].[equal_boolean]([sqlserver].[is_system], (0)) -- user SPID
        AND [package0].[equal_uint64]([resource_type], (5)) -- OBJECT
        AND [package0].[not_equal_uint64]([database_id], (32767))  -- resourcedb
        AND [package0].[greater_than_uint64]([database_id], (4)) -- user database
        AND [package0].[greater_than_equal_int64]([object_id], (245575913)) -- user object
        AND (
               [mode] = (1) -- SCH-S
            OR [mode] = (6) -- IS
            OR [mode] = (8) -- IX
            OR [mode] = (3) -- S
            OR [mode] = (5) -- X
        )
    )
)
WITH (
     MAX_MEMORY = 20480 KB
    ,EVENT_RETENTION_MODE = ALLOW_MULTIPLE_EVENT_LOSS
    ,MAX_DISPATCH_LATENCY = 30 SECONDS
    ,MAX_EVENT_SIZE = 0 KB
    ,MEMORY_PARTITION_MODE = NONE
    ,TRACK_CAUSALITY = OFF
    ,STARTUP_STATE = OFF
);
GO

If you start this session and monitor the data captured with the “Watch live data” window, you will soon notice that a huge number of events gets captured, which means that the output will also be huge and analyzing it can become a daunting task. Saving this data to a file target is not the way to go here: is there another way?

The main point here is that there is no need for the individual events, but the interesting information is the aggregated data from those events. Ideally, you would need to group by object_id and get the maximum read or write date. If possible, counting reads and writes by object_id would be great. At a first look, it seems like a good fit for the histogram target, however you will soon discover that the histogram target can “group” on a single column, which is not what you want. Object_ids are not unique and you can have the same object_id in different databases. Moreover, the histogram target can only count events and is not suitable for other types of aggregation, such as MAX.

Streaming the events with Powershell

Fortunately, when something is not available natively, you can code your own implementation. In this case, you can use the Extended Events streaming API to attach to the session and evaluate the events as soon as they show up in the stream.

In this example, I will show you how to capture the client application name along with the database and object id and group events on these 3 fields. If you are interested in additional fields (such as host name or login name), you will need to group by those fields as well.

In the same way, if you want to aggregate additional fields, you will have to implement your own logic. In this example, I am computing the MAX aggregate for the read and write events, without computing the COUNT. The reason is that it’s not easy to predict whether the count will be accurate or not, because different kind of locks will be taken in different situations (under snapshot isolation no shared locks are taken, so you have to rely on SCH-S locks; when no dirty pages are present SQL Server takes IS locks and not S locks…).

Before going to the Powershell code, you will need two tables to store the information:

USE TOOLS;
GO
CREATE TABLE meta.table_usage_xe(
       db_name sysname,
       schema_name sysname,
       object_name sysname,
       client_app_name nvarchar(128),
       last_read datetime,
       last_write datetime,
       PRIMARY KEY(db_name, schema_name, object_name, client_app_name)
);

CREATE TABLE meta.table_usage_xe_last_snapshot(
       database_id int,
       object_id int,
       client_app_name nvarchar(128),
       last_read datetime,
       last_write datetime,
       PRIMARY KEY(database_id, object_id, client_app_name)
);

Now that you have a nice place to store the aggregated information, you can start this script to capture the events and persist them.

sl $Env:Temp

Add-Type -Path 'C:\Program Files\Microsoft SQL Server\120\Shared\Microsoft.SqlServer.XE.Core.dll'
Add-Type -Path 'C:\Program Files\Microsoft SQL Server\120\Shared\Microsoft.SqlServer.XEvent.Linq.dll'

$connectionString = 'Data Source = YourServerNameGoesHere; Initial Catalog = master; Integrated Security = SSPI'

$SessionName = "audit_table_usage"

# loads all object ids for table objects and their database id
# table object_ids will be saved in order to rule out whether
# the locked object is a table or something else.
$commandText = "
DECLARE @results TABLE (
       object_id int,
       database_id int
);

DECLARE @sql nvarchar(max);

SET @sql = '
       SELECT object_id, db_id()
       FROM sys.tables t
       WHERE is_ms_shipped = 0
';

DECLARE @statement nvarchar(max);

SET @statement = (
       SELECT 'EXEC ' + QUOTENAME(name) + '.sys.sp_executesql @sql; '
       FROM sys.databases d
       WHERE name NOT IN ('master','model','msdb','tempdb')
       FOR XML PATH(''), TYPE
).value('.','nvarchar(max)');

INSERT @results
EXEC sp_executesql @statement, N'@sql nvarchar(max)', @sql;

SELECT *
FROM @results
"

$objCache = @{}

$conn = New-Object -TypeName System.Data.SqlClient.SqlConnection -ArgumentList $connectionString
$cmd = New-Object -TypeName System.Data.SqlClient.SqlCommand
$cmd.CommandText = $commandText
$cmd.Connection = $conn
$conn.Open()
$conn.ChangeDatabase("master")
$rdr = $cmd.ExecuteReader()

# load table object_ids and store them in a hashtable

while ($rdr.Read()) {
	$objId = $rdr.GetInt32(0)
	$dbId = $rdr.GetInt32(1)
	if(-not $objCache.ContainsKey($objId)){
		$objCache.add($objId,@($dbId))
	}
	else {
		$arr = $objCache.Get_Item($objId)
		$arr += $dbId
		$objCache.set_Item($objId, $arr)
	}
}

$conn.Close()

# create a DataTable to hold lock information in memory
$queue = New-Object -TypeName System.Data.DataTable
$queue.TableName = $SessionName

[Void]$queue.Columns.Add("database_id",[Int32])
[Void]$queue.Columns.Add("object_id",[Int32])
[Void]$queue.Columns.Add("client_app_name",[String])
[Void]$queue.Columns.Add("last_read",[DateTime])
[Void]$queue.Columns.Add("last_write",[DateTime])

# create a DataView to perform searches in the DataTable
$dview = New-Object -TypeName System.Data.DataView
$dview.Table = $queue
$dview.Sort = "database_id, client_app_name, object_id"

$last_dump = [DateTime]::Now

# connect to the Extended Events session
[Microsoft.SqlServer.XEvent.Linq.QueryableXEventData] $events = New-Object -TypeName Microsoft.SqlServer.XEvent.Linq.QueryableXEventData `
    -ArgumentList @($connectionString, $SessionName, [Microsoft.SqlServer.XEvent.Linq.EventStreamSourceOptions]::EventStream, [Microsoft.SqlServer.XEvent.Linq.EventStreamCacheOptions]::DoNotCache)

$events | % {
    $currentEvent = $_

	$database_id = $currentEvent.Fields["database_id"].Value
	$client_app_name = $currentEvent.Actions["client_app_name"].Value
	if($client_app_name -eq $null) { $client_app_name = [string]::Empty }
	$object_id = $currentEvent.Fields["object_id"].Value
	$mode = $currentEvent.Fields["mode"].Value

	# search the object id in the object cache
	# if found (and database id matches) ==> table
	# otherwise ==> some other kind of object (not interesting)
	if($objCache.ContainsKey($object_id) -and $objCache.Get_Item($object_id) -contains $database_id)
	{
		# search the DataTable by database_id, client app name and object_id
		$found_rows = $dview.FindRows(@($database_id, $client_app_name, $object_id))

		# if not found, add a row
		if($found_rows.Count -eq 0){
			$current_row = $queue.Rows.Add()
			$current_row["database_id"] = $database_id
			$current_row["client_app_name"] = $client_app_name
			$current_row["object_id"] = $object_id
		}
		else {
			$current_row = $found_rows[0]
		}

		if(($mode.Value -eq "IX") -or ($mode.Value -eq "X")) {
			# Exclusive or Intent-Exclusive lock: count this as a write
			$current_row["last_write"] = [DateTime]::Now
		}
		else {
			# Shared or Intent-Shared lock: count this as a read
			# SCH-S locks counted here as well (snapshot isolation ==> no shared locks)
			$current_row["last_read"] = [DateTime]::Now
		}
    }

	$ts = New-TimeSpan -Start $last_dump -End (get-date)

	# Dump to database every 5 minutes
	if($ts.TotalMinutes -gt 5) {
		$last_dump = [DateTime]::Now

		# BCP data to the staging table TOOLS.meta.table_usage_xe_last_snapshot
		$bcp = New-Object -TypeName System.Data.SqlClient.SqlBulkCopy -ArgumentList @($connectionString)
		$bcp.DestinationTableName = "TOOLS.meta.table_usage_xe_last_snapshot"
		$bcp.Batchsize = 1000
		$bcp.BulkCopyTimeout = 0

		$bcp.WriteToServer($queue)

		# Merge data with the destination table TOOLS.meta.table_usage_xe
		$statement = "
			BEGIN TRANSACTION

			BEGIN TRY

				MERGE INTO meta.table_usage_xe AS dest
				USING (
					SELECT db_name(database_id) AS db_name,
						object_schema_name(object_id, database_id) AS schema_name,
						object_name(object_id, database_id) AS object_name,
						client_app_name,
						last_read,
						last_write
					FROM meta.table_usage_xe_last_snapshot
				) AS src
					ON src.db_name = dest.db_name
					AND src.schema_name = dest.schema_name
					AND src.object_name = dest.object_name
					AND src.client_app_name = dest.client_app_name
				WHEN MATCHED THEN
					UPDATE SET last_read = src.last_read,
						last_write = src.last_write
				WHEN NOT MATCHED THEN
					INSERT (db_name, schema_name, object_name, client_app_name, last_read, last_write)
					VALUES (db_name, schema_name, object_name, client_app_name, last_read, last_write);

				TRUNCATE TABLE meta.table_usage_xe_last_snapshot;

				COMMIT;

			END TRY
			BEGIN CATCH
				ROLLBACK;
				THROW;
			END CATCH
		"

		$conn = New-Object -TypeName System.Data.SqlClient.SqlConnection -ArgumentList $connectionString
		$cmd = New-Object -TypeName System.Data.SqlClient.SqlCommand
		$cmd.CommandText = $statement
		$cmd.Connection = $conn
		$conn.Open()
		$conn.ChangeDatabase("TOOLS")
		[Void]$cmd.ExecuteNonQuery()
		$conn.Close()

		$queue.Rows.Clear()

	}

}

WARNING: Be careful running this script against a production server: I tried it with a reasonaly busy server and the CPU/memory load of powershell.exe is non-negligible. On the other hand, the load imposed by the session per se is very low: make sure you run this script from a different machine and not on the database server.

What to do with unused objects

After monitoring for a reasonable amount of time, you will start to notice that some objects are never used and you will probably want to delete them. Don’t!

In my experience, as soon as you delete an object, something that uses it (and you didn’t capture) pops up and fails. In those cases, you want to restore the objects very quickly. I usually move everything to a “trash” schema and have it sitting there for some time (six months/one year) and eventually empty the trash. If somebody asks for a restore, it’s just as simple as an ALTER SCHEMA … TRANSFER statement.

Bottom line

Cleaning up clutter from a database is not simple: hopefully the techniques in this post will help you in the task. Everything would be much simpler if the Extended Events histogram target was more flexible, but please keep in mind that it’s not about the tools: these techniques can help you identify unused objects when no other information is available, but nothing is a good substitute for a correct use of the database. When new tables are added to a database, keep track of the request and take notes about who uses the tables for which purpose: everything will be much easier in the long run.

Should I check tempdb for corruption?


You all know that checking our databases for corruption regularly is a must. But what about tempdb? Do you need to check it as well?

The maintenance plans that come with SQL Server do not run DBCC CHECKDB on tempdb, which is a strong indicator that it’s a special database and something different is happening behind the scenes. If you think that relying on the behavior of a poor tool such as maintenance plans to make assumptions on internals is a bit far-fetched, well, I see your point. However, you can get more clues by running DBCC CHECKDB against your tempdb and see what it outputs:

DBCC CHECKDB will not check SQL Server catalog or Service Broker consistency because a database snapshot could not be created or because WITH TABLOCK was specified.

See? Something special is happening here: CHECKDB cannot take a database snapshot to run the consistency checks and has to run as if WITH TABLOCK was specified.

Big deal? It depends. Running CHECKDB WITH TABLOCK means that you revert to the SQL2000 behavior, when table locks were taken to perform the consistency checks. This means that you might impact your workload with blocking. Not good.

Well, tempdb is recreated each time you start the instance and doesn’t contain any data worth worrying about, so you don’t need to check if it got corrupt, right? Again, it depends: tempdb might get so corrupted that it could start to throw errors at the sessions using it. It might also get corrupted to a point that triggers a stop of the instance. Not good.

So, if you check tempdb, you get blocking and if you don’t, you can get errors. What’s a reasonable balance?

Don’t check tempdb in the same schedule you use for checking user databases: find a reasonable maintenance window when any blocking issue is acceptable. Probably once a week is enough.

What should you do if you find corruption in tempdb? Don’t restart the instance right away: it could still survive for some time before problems arise. The whole point of checking tempdb is avoiding unpredicted downtime, so if you cycle the instance as soon as you find corruption, you’re killing the only reason to check it. Plan downtime with your users and cycle the instance when it causes less harm to your business.
Moreover, if something corrupted your tempdb, it has likely corrupted your user databases, so make sure you check everything before shutting down the instance (time to take tail-log backups?).

If your tempdb gets corrupted, restarting the instance won’t be enough to make corruption go away: you will have to stop SQL Server, delete the files manually and the start SQL Server to let it create the new tempdb files.

Even if you don’t mind losing the data stored in tempdb, getting corruption is a sign that something went wrong: make sure you investigate the issue and find the root cause. It could be a problem with the I/O subsystem or a malfunctioning RAM module or something else: fix the root cause before it affects the user databases.

Call to action:

Are you checking your tempdb for corruption? If you are not, don’t panic: choose an appropriate schedule and start checking it.

Speaking at SQLSaturday Pordenone


Next week, on Saturday 28, make sure you don’t miss SQLSaturday Pordenone!

Pordenone is the place where the Italian adventure with SQLSaturday started, more than two years ago. It was the beginning of a journey that brought many SQLSaturdays to Italy, with our most successful one in Parma last November.

Now we’re back in Pordenone to top that result!

We have a fantastic schedule for this event, with a great speaker lineup and great topics for the sessions. Everything is set in the right direction to be a great day of free learning and fun.

I will have two sessions this time:

SQL Server Security in an Insecure World

In this session I will talk about security, with a general introduction to the topic and then I’ll go straight to demonstrate some of the vulnerabilities that attackers could use to take over your server. Yes, I’ll be demonstrating SQL-Injection attacks: SQL-I is still a top security issue, even if we’re in 2015. Everyone must be aware of the risks and take action immediately.
I will also describe the security features available in SQL Server to lock down the server as much as possible, but the key concept I will try to drive is that security is a process, not a feature.
If you want to find out more, join me at 12:00 PM in room S7.

Extending the Data Collector to Monitor SQL Server effortlessly

In this session I will try to promote one of the least used features in SQL Server: the Data Collector. It doesn’t have all the bells and whistles of the expensive monitoring suites, but it does the job pretty well. Things start to be painfully difficult when you try to extend it with additional collection sets, but the good news is that there’s an open-source project that provides a GUI to manage and customize the collection sets. The project is called ExtendedTSQLCollector and it does much more than just adding a GUI to the Data Collector: it also provides two additional collector types to collect data from LOB columns (in case you’re wondering, no – the vanilla Data Collector doesn’t support LOB columns) and Extended Events sessions.
I will also demonstrate a convenient way to centralize and extend the Data Collector reports to create a reporting and alerting solution for free.
Sounds interesting? Join me at 4:30 PM in room S7.

So, what are you waiting for? Register now and join us in Pordenone!

Follow

Get every new post delivered to your Inbox.

Join 659 other followers