You all know that checking our databases for corruption regularly is a must. But what about tempdb? Do you need to check it as well?
The maintenance plans that come with SQL Server do not run DBCC CHECKDB on tempdb, which is a strong indicator that it’s a special database and something different is happening behind the scenes. If you think that relying on the behavior of a poor tool such as maintenance plans to make assumptions on internals is a bit far-fetched, well, I see your point. However, you can get more clues by running DBCC CHECKDB against your tempdb and see what it outputs:
DBCC CHECKDB will not check SQL Server catalog or Service Broker consistency because a database snapshot could not be created or because WITH TABLOCK was specified.
See? Something special is happening here: CHECKDB cannot take a database snapshot to run the consistency checks and has to run as if WITH TABLOCK was specified.
Big deal? It depends. Running CHECKDB WITH TABLOCK means that you revert to the SQL2000 behavior, when table locks were taken to perform the consistency checks. This means that you might impact your workload with blocking. Not good.
Well, tempdb is recreated each time you start the instance and doesn’t contain any data worth worrying about, so you don’t need to check if it got corrupt, right? Again, it depends: tempdb might get so corrupted that it could start to throw errors at the sessions using it. It might also get corrupted to a point that triggers a stop of the instance. Not good.
So, if you check tempdb, you get blocking and if you don’t, you can get errors. What’s a reasonable balance?
Don’t check tempdb in the same schedule you use for checking user databases: find a reasonable maintenance window when any blocking issue is acceptable. Probably once a week is enough.
What should you do if you find corruption in tempdb? Don’t restart the instance right away: it could still survive for some time before problems arise. The whole point of checking tempdb is avoiding unpredicted downtime, so if you cycle the instance as soon as you find corruption, you’re killing the only reason to check it. Plan downtime with your users and cycle the instance when it causes less harm to your business.
Moreover, if something corrupted your tempdb, it has likely corrupted your user databases, so make sure you check everything before shutting down the instance (time to take tail-log backups?).
If your tempdb gets corrupted, restarting the instance won’t be enough to make corruption go away: you will have to stop SQL Server, delete the files manually and the start SQL Server to let it create the new tempdb files.
Even if you don’t mind losing the data stored in tempdb, getting corruption is a sign that something went wrong: make sure you investigate the issue and find the root cause. It could be a problem with the I/O subsystem or a malfunctioning RAM module or something else: fix the root cause before it affects the user databases.
Call to action:
Are you checking your tempdb for corruption? If you are not, don’t panic: choose an appropriate schedule and start checking it.
If you are one among the many that downloaded my consistency check stored procedure called “dba_RunCHECKDB”, you may have noticed a “small” glitch… it doesn’t work on SQL Server 2012!
This is due to the resultset definition of DBCC CHECKDB, which has changed again in SQL Server 2012. Trying to pipe the results of that command in the table definition for SQL Server 2008 produces a column mismatch and it obviously fails.
Fixing the code is very easy indeed, but I could never find the time to post the corrected version until today.
Also, I had to discover the new table definition for DBCC CHECKDB, and it was not just as easy as it used to be in SQL Server 2008. In fact, a couple of days ago I posted a way to discover the new resultset definition working around the cumbersome metadata discovery feature introduced in SQL Server 2012.
Basically, the new output of DBCC CHECKDB now includes 6 new columns:
CREATE TABLE ##DBCC_OUTPUT( Error int NULL, [Level] int NULL, State int NULL, MessageText nvarchar(2048) NULL, RepairLevel nvarchar(22) NULL, Status int NULL, DbId int NULL, -- was smallint in SQL2005 DbFragId int NULL, -- new in SQL2012 ObjectId int NULL, IndexId int NULL, PartitionId bigint NULL, AllocUnitId bigint NULL, RidDbId smallint NULL, -- new in SQL2012 RidPruId smallint NULL, -- new in SQL2012 [File] smallint NULL, Page int NULL, Slot int NULL, RefDbId smallint NULL, -- new in SQL2012 RefPruId smallint NULL, -- new in SQL2012 RefFile smallint NULL, -- new in SQL2012 RefPage int NULL, RefSlot int NULL, Allocation smallint NULL )
If you Google the name of one of these new columns, you will probably find a lot of blog posts (no official documentation, unfortunately) that describes the new output of DBCC CHECKDB, but none of them is strictly correct: all of them indicate the smallint columns as int.
Not a big deal, actually, but still incorrect.
As usual, suggestions and comments are welcome.
Some months ago I posted a script on a SQLServerCentral forum to help a member automating the execution of DBCC CHECKDB and send and e-mail alert in case a consistency error is found.
The original thread can be found here.
I noticed that many people are actually using that script and I also got some useful feedback on the code itself, so I decided to write this post to make an enhanced version available to everyone.
Your primary responsibility as a DBA is to safeguard your data with backups. I mean intact backups! Keep in mind that when you back up a corrupt database, you will also restore a corrupt database.
A task that checks the database integrity should be part of your backup strategy and you should be notified immediately when corruption is found.
Unfortunately, the built-in consistency check Maintenance Task does not provide an alerting feature and you have to code it yourself.
SQL Server 2000 and above accept the “WITH TABLERESULTS” option for most DBCC commands to output the messages as a result set. Those results can be saved to a table and processed to identify messages generated by corrupt data and raise an alert.
If you don’t know how to discover the resultset definition of DBCC CHECKDB WITH TABLERESULTS, I suggest that you take a look at this post.
Here is the complete code of the stored procedure I am using on my production databases:
Once the stored procedure is ready, you can run it against the desired databases:
EXEC [maint].[dba_runCHECKDB] @dbName = 'model', @PHYSICAL_ONLY = 0, @allmessages = 0
Setting up an e-mail alert
In order to receive an e-mail alert, you can use a SQL Agent job and schedule this script to run every night, or whenever you find appropriate.
EXEC [maint].[dba_runCHECKDB] @dbName = NULL, @PHYSICAL_ONLY = 0, @allmessages = 0, @dbmail_profile = 'DBA_profile', @dbmail_recipient = 'firstname.lastname@example.org'
The e-mail message generated by the stored procedure contains the summary outcome and a detailed log, attached as a text file:
Logging to a table
If needed, you can save the output of this procedure to a history table that logs the outcome of DBCC CHECKDB in time:
-- Run the stored procedure with @log_to_table = 1 EXEC TOOLS.maint.dba_runCHECKDB @dbName = NULL, @PHYSICAL_ONLY = 0, @allMessages = 0, @log_to_table = 1 -- Query the latest results SELECT * FROM ( SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY DBId ORDER BY RunDate DESC) FROM DBCC_CHECKDB_HISTORY WHERE Outcome IS NOT NULL ) AS dbcc_history WHERE RN = 1
When invoked with the @log_to_table parameter for the first time, the procedure creates a log table that will be used to store the results. Subsequent executions will append to the table.
The web is full of blogs, articles and forums on how to automate DBCC CHECKDB. If your data has any value to you, CHECKDB must be part of your maintenance strategy.
Run! Check the last time you performed a successful CHECKDB on your databases NOW! Was it last year? You may be in big trouble.