Managing the errorlog file

I frequently see recommendations to regularly run sp_cycle_errorlog, so that the errorlog doesn’t become huge. My main concern with that is that the errorlog contains valuable information.

When I do a health check on a SQL Server machine, I want a few months worth of errorlog information available. I typically use my own scripts for this, available here. Almost every time I go through the errorlog, I find valuable information. Some things you address, like find whatever it is that is attempting to login every minute. Other things you might not have control over, but the information is valuable to have.

So, if you run sp_cycle_errorlog every day or week, you end up with only a week worth, or a few weeks worth of errorlog file information.

Suggestion 1: Increase the number of errorlog files.

You probably want more than 6 history errorlog files. For instance, a client of mine told me that he was about to patch a server a few days before I was to visit that client. That patch procedure resulted in enough re-start of SQL Server so we ended up with only 4 days worth of errorlog files. Yes, this client had the default of 6 historic errorlog files. I typically increase this to 15. You can do this by right-clicking the “SQL Server Logs” folder under “Management” in SSMS. If you want to use T-SQL, you can use xp_instance_regwrite, as in:

EXEC xp_instance_regwrite
N’HKEY_LOCAL_MACHINE’
,N’Software\Microsoft\MSSQLServer\MSSQLServer’
,N’NumErrorLogs’, REG_DWORD, 15;

Suggestion 2: Set a limit for the size of the errorlog file.

But what about the size? Say that we have crash dumps, for instance. Or other things that start to happen very frequently. The good news is that as of SQL Server 2012, we can set a max size for the errorlog file. There is no GUI for this, so we have to manipulate the registry directly. Again, we can use xp_instance_regwrite. Below will limit the size to 30 MB:

EXEC xp_instance_regwrite
N’HKEY_LOCAL_MACHINE’
,N’Software\Microsoft\MSSQLServer\MSSQLServer’
,N’ErrorLogSizeInKb’, REG_DWORD, 30720;

With 15 files, you can patch of your SQL Server machine without aging out all old errorlog files. And with a max size of 30 MB, you keep each file manageable in size. And you keep the total size of errorlog files for that instance to 450 MB. Not enough to fill your disks. But enough to have historical information for when you are about to perform a health check on your SQL Server instance.

References: this by Jan Kåre and this by Paul Randal.

Ola Hallengrens Maintenance Solution now supports mirrored backup

You probably know that you can mirror a backup to several destinations, assuming you are on a supported edition (Enterprise or Developer). This is not the same as striping; you can compare striping to RAID 0, and mirroring to RAID 1.

Ola now supports mirroring in his maintenance solution, found here. A couple of examples:

EXECUTE dbo.DatabaseBackup
@Databases = 'USER_DATABASES',
@Directory = 'C:\Backup',
@MirrorDirectory = 'D:\Backup',
@BackupType = 'FULL',
@CleanupTime = 24,
@MirrorCleanupTime = 48

EXECUTE dbo.DatabaseBackup
@Databases = 'USER_DATABASES',
@Directory = 'C:\Backup,D:\Backup',
@MirrorDirectory = 'E:\Backup,F:\Backup',
@BackupType = 'FULL',
@CleanupTime = 24,
@MirrorCleanupTime = 48

Note that if any of the destinations are unanavilable, then the backup fails for all destinations. SQL Server do not produce the backup to the ones that are available. This has nothing to do with Ola’s solution, it is just how MS decided to implement backup mirroring.

 

How often do you rebuild your heaps?

Never? You are not alone. None of the maintenance solutions I use includes functionality to rebuild a heap, and that includes Maintanance Plans and Ola Hallengren’s maintenance solution.

“So why would you want to rebuild a heap, when it isn’t sorted in the first place?”, you ask. The answer is to get rid of Forwarding Pointers, and these can really hurt performance by adding lots more logical and physical reads, and random I/O. See for instance this from Kalen Delaney, this from Hugo Kornelis and this from the SQLCAT team.

SQL Server 2008 added the ALTER TABLE command, with the REBUILD clause. And this is what I’m using in my procedure rebuild_heaps which rebuilds all fragmented heaps on a SQL Server.

You find the procedure here: http://karaszi.com/rebuild-all-fragmented-heaps.

Do you clean up your Database Mail log tables?

Database Mail has a couple of log tables in the msdb database. These can become large over time. I’ve seen MSDB databases over 1 GB in size, where normal size is less than 50 MB (heavy usage of old SSIS deployment model excluded).

Unfortunately Maintenance Plans do not have built-in functionality for this, nor does Ola Hallengren’s excellent maintenance solution ( http://ola.hallengren.com/ ). All you have to do is to schedule an agent job to be executed, say, every week, having one T-SQL jobstep containing: 

DECLARE @DeleteOlder DATETIME

SET @DeleteOlder = DATEADD(MONTH, -1, CURRENT_TIMESTAMP)

EXECUTE msdb.dbo.sysmail_delete_mailitems_sp @sent_before = @DeleteOlder

EXECUTE msdb.dbo.sysmail_delete_log_sp @logged_before = @DeleteOlder

Above removes mail history older than one month. Adjust to your liking, using the values in the DATEADD function.

As always, remember to comment your job and to specify appropriate database for the T-SQL jobstep (for documentation purposes, msdb in this case).

Wait random number of minutes

Why on earth would you want to do that? you ask. Say you have a job that is scheduled to start at the same time over a number of servers. This might be because you have an SQL Server Master/Target server environment (MSX/TSX) or you quite simply script a job and execute that script on several servers. You probably want to spread the load on your SAN and virtual machine host a bit. This is the exact reason I use this procedure. I frequently use MSX servers and I usually add a job step (executing this procedure) to wait a random number of minutes between 0 and 30.

You find the procedure here.

Analyzing the errorlog

How often do you do this? Look over each message (type) in the errorlog file and determine whether this is something you want to act on. Sure, some (but not all) of you have some monitoring solution in place, but are you 100% confident that it really will notify for all messages that you might find interesting? That there isn’t even one little message hiding in there that you would find valuable knowing about? Or how about messages that you typically don’t are about, but knowing that you have a high frequency can be valuable information?

So, this boils down to actually reading the errorlog file. Some of you probably already have scripts and tool that makes this easier than just reading every simple message from top to bottom. I wanted to share how I do it, and this is why I wrote my Analyze SQL Server logs article. Check it out. And, feedback is always welcome!

Will EMPTYFILE on primary ldf "doom" it somehow?

I just read a newgroup question whether doing SHRINKFILE with the EMPTYFILE option for the primary log file somehow cause ill effects.

Shrinkfile for the ldf will not move any data (log records) or so. For an ldf file it is basically a preparation to tell the engine that you are about to remove this file (ALTER DATABASE … REMOVE FILE).

Now, the first (primary) log file is special and cannot be removed. So, what if we do an EMPTYFILE on the primary log file. Will we end up in some limbo-state? I did a test and performed EMPTYFILE on the primary file. Nothing bad happened. I then did EMPTYFILE on the other log file and removed that file successfully. So it seems that this should not cause any havoc. Just pretend you never did that EMPTYFILE operation against the primary log file.

It isn’t doable to create a repro script which show shrinking and removing nf log files.It will require some engagement for you. The reason is that we never know from what file and where the nect virtual log file comes from. So, if you are about to run below, be prepared to read up on DBCC LOGINFO and other command, understand what VLF is, perhaps some operation need to be done everal times before what we expect will happen… And as always, use at own risk.

--Drop and create database named x
SET NOCOUNT ON
USE 
master
IF DB_ID('x'IS NOT NULL DROP DATABASE x
GO
CREATE DATABASE [x] ON  PRIMARY
NAME N'x'FILENAME N'C:\DemoDatabases\DbFiles\a\x.mdf' SIZE 10MBFILEGROWTH 3MB )
LOG ON
NAME N'x_log'FILENAME N'C:\DemoDatabases\DbFiles\a\x_log.ldf' SIZE 2MB FILEGROWTH 1MB)
,( 
NAME N'x_log2'FILENAME N'C:\DemoDatabases\DbFiles\a\x_log.ldf2' SIZE 2MB FILEGROWTH 1MB)
GO

--Get the database out of "auto-truncate" mode.
ALTER DATABASE SET RECOVERY FULL
BACKUP DATABASE 
TO DISK = 'nul'

--Fill up the log some
USE x
CREATE TABLE t(c1 INT IDENTITYc2 CHAR(300DEFAULT 'a')
GO
INSERT INTO DEFAULT VALUES
DELETE FROM 
t
GO 2000

--Investigate log
DBCC SQLPERF(logspace)
DBCC LOGINFO

--Empty log
BACKUP LOG TO DISK = 'nul'

--Investigate log
DBCC SQLPERF(logspace)
DBCC LOGINFO

--"Empty" primary log file
DBCC SHRINKFILE(2EMPTYFILE)

--Investigate log
DBCC SQLPERF(logspace)
DBCC LOGINFO

--Fill up the log some
INSERT INTO DEFAULT VALUES
DELETE FROM 
t
GO 2000

--Investigate log
DBCC SQLPERF(logspace)
DBCC LOGINFO

-- Do above several times and see
-- that 2 is still allocated from...

--Can we get rid of file 3?
BACKUP LOG TO DISK = 'nul'
DBCC SHRINKFILE(3EMPTYFILE)
--We might need to do above a few times
--until 3 is "clean" - no used VLFs

--Investigate log
DBCC SQLPERF(logspace)
DBCC LOGINFO

ALTER DATABASE x REMOVE FILE x_log2

--Might need to do some stuff to get rid of file physically
CHECKPOINT
BACKUP 
LOG TO DISK = 'nul'
GO
SELECT FROM sys.database_files

Heaps, forwarding pointers, ALTER TABLE REBUILD and non-clustered indexes

Let’s start with some background on forwarding pointers:

Forwarding pointers in heaps can be a mess to get rid of. A forwarding pointer happens when you modify a row so that the row doesn’t fit on the same page anymore. SQL Server moves the row to a new page and leaves a forwarding pointer where the old row used to be. This means that non-clustered indexes are not affected by the moving of the row – it can still point to the old location. So the modification is less costly than if all non-clustered indexes needed to reflect the row move. But having forwarding pointers can be bad for performance when you read data. A perhaps less known fact is that a scan over a table needs to follow forwarding pointers – essentially “jumping back and forth” a lot if you have many forwarding pointers. That can be really bad for performance (if you have table scans, of course). So, how do we get rid of forwarding pointers? Well, we can shrink the database file, but that is a little like curing a headache by jumping into a big meat grinder. We can export all data and re-import it – not very practical.

Greg Linwood reminded me that in SQL Server 2008, we can do ALTER TABLE … REBUILD. Now, I knew about this option, and every time I tell about it in class I’ve been thinking silently for myself “I need to test whether this is a way to get rid of fwd pointers”. (You generally talk about ALTER TABLE … REBUILD when you talk about enabling compression on a heap.) So, doing a REBUILD of a table using ALTER TABLE sounds promising. Will it get rid of forwarding pointers? Will it also rebuild all non-clustered indexes?

Quick answer for those who don’t care reading the TSQL script:
ALTER TABLE … REBUILD will remove forwarding pointers, but for some strange reason it will also rebuild all non-clustered indexes on that table.

See the TSQL code below. It was adapted from a challenge by Linchi Shea to produce a data loading script resulting in worst performance (where I immediately thought of forwarding pointers). See for instance http://sqlblog.com/blogs/linchi_shea/archive/2009/06/07/t-sql-exercise-simplest-data-loading-script-to-produce-worst-query-performance-june-7th-update.aspx.

USE tempdb
GO

IF EXISTS (SELECT FROM sysobjects WHERE name 'test')
DROP TABLE test
GO
CREATE TABLE test (
INT NOT NULL
,x2 INT NOT NULL
,CHAR(10) NOT NULL DEFAULT ('')
,CHAR(10) NOT NULL DEFAULT('')
)

DECLARE @rows INT = 666666@toKeep INT = 400000@diff INT

INSERT test (xx2)
SELECT TOP(@rows)
ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS r
,ROW_NUMBER() OVER (ORDER BY (SELECT 1)) % 10 AS s
FROM master..spt_values a CROSS JOIN master..spt_values b
GO
ALTER TABLE test ALTER COLUMN CHAR(892)
ALTER TABLE test ALTER COLUMN CHAR(100)
GO

DECLARE @rows INT = 666666@toKeep INT = 400000@diff INT
DELETE TOP(@rows @toKeep)
FROM test WHERE x2 IN(2468)
GO

CREATE INDEX x1 ON test(x)
CREATE INDEX x2 ON test(x2)

SELECT FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('test'), NULL, NULL, 'DETAILED')
--First  run (no nc indexes ): 387157 fwd records (out of 400000 rows), 96104 pages
--Second run (two nc indexes): 387157 fwd records (out of 400000 rows), 96105 pages

CHECKPOINT
DBCC DROPCLEANBUFFERS
DECLARE @t time SYSDATETIME()
ALTER TABLE test REBUILD
SELECT DATEDIFF(ms@tCAST(SYSDATETIME() AS time))
--First run, no non-clustered indexes, three subsequent executions (ms): 19351, 20683, 20275
--Second run, with two non-clustered indexes, three subsequent executions (ms): 31803, 35065, 37511

SELECT FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('test'), NULL, NULL, 'DETAILED')
--First  run, heap = "index" 0 (no nc indexes ): 0 fwd records, 50002 pages
--Second run, heap = "index" 0 (two nc indexes): 0 fwd records, 50003 pages
--But: avg_page_space_used_in_percent changes for the nc indexes after the rebuild.
--That I take as nc indexes are also rebuilt.

Do maintenance plans require SSIS?

To many, this is a non-issue, since SSIS is installed anyhow. But not everyody installs SSIS. I for instance prefer to keep my production systems as clean as possible and only have what I really need (a principle which is harder and harder to live after as years go by…). Also, not all Editions of SQL Server comes with SSIS.

I did a test some months ago for SQL Server 2005 with a recent sp and also on SQL Server 2008. SQL Server 2008 did require SSIS (I tested both without and with SSIS installed), where 2005 sp2 didn’t. I recently learned from Phil Brammer in MVP group that there has been progress. Here’s the run-down, you don’t need SSIS installed to execute maint plans:

SQL Server 2005 requires sp2.

SQL Server 2008 requires CU3 or sp1 (see http://support.microsoft.com/kb/961126/). I dodn’t test this combo (2008 sp1 without SSIS), but I have no reason to doubt that KB article.

Management Pack for SQL Agent Alerts

I have finished my article about suggested SQL Server Agent Alerts. Perhaps calling this a “Management Pack” is OTT, but hopefully it will be useful to some of you folks.

We are talking about having Agent sending email to us when SQL Server writes messages to EventLog. The functionality has been in the product since version 6.0, but I have always lacked some good “default setting”. I’m talking about some easy way to tell Agent to notify me for messages which I typically want to be notified for. (Many of you are already using various Management software that has this functionality, so this won’t be for you 🙂 ).

I very much welcome feedback, you find all information in the article:

http://karaszi.com/agent-alerts-management-pack