Managing the errorlog file

I frequently see recommendations to regularly run sp_cycle_errorlog, so that the errorlog doesn’t become huge. My main concern with that is that the errorlog contains valuable information.

When I do a health check on a SQL Server machine, I want a few months worth of errorlog information available. I typically use my own scripts for this, available here. Almost every time I go through the errorlog, I find valuable information. Some things you address, like find whatever it is that is attempting to login every minute. Other things you might not have control over, but the information is valuable to have.

So, if you run sp_cycle_errorlog every day or week, you end up with only a week worth, or a few weeks worth of errorlog file information.

Suggestion 1: Increase the number of errorlog files.

You probably want more than 6 history errorlog files. For instance, a client of mine told me that he was about to patch a server a few days before I was to visit that client. That patch procedure resulted in enough re-start of SQL Server so we ended up with only 4 days worth of errorlog files. Yes, this client had the default of 6 historic errorlog files. I typically increase this to 15. You can do this by right-clicking the “SQL Server Logs” folder under “Management” in SSMS. If you want to use T-SQL, you can use xp_instance_regwrite, as in:

EXEC xp_instance_regwrite
N’HKEY_LOCAL_MACHINE’
,N’Software\Microsoft\MSSQLServer\MSSQLServer’
,N’NumErrorLogs’, REG_DWORD, 15;

Suggestion 2: Set a limit for the size of the errorlog file.

But what about the size? Say that we have crash dumps, for instance. Or other things that start to happen very frequently. The good news is that as of SQL Server 2012, we can set a max size for the errorlog file. There is no GUI for this, so we have to manipulate the registry directly. Again, we can use xp_instance_regwrite. Below will limit the size to 30 MB:

EXEC xp_instance_regwrite
N’HKEY_LOCAL_MACHINE’
,N’Software\Microsoft\MSSQLServer\MSSQLServer’
,N’ErrorLogSizeInKb’, REG_DWORD, 30720;

With 15 files, you can patch of your SQL Server machine without aging out all old errorlog files. And with a max size of 30 MB, you keep each file manageable in size. And you keep the total size of errorlog files for that instance to 450 MB. Not enough to fill your disks. But enough to have historical information for when you are about to perform a health check on your SQL Server instance.

References: this by Jan Kåre and this by Paul Randal.

Can you restore from your backups? Are you sure?

 

A few days ago, we were doing restore practice drills with a client. I had tested the stuff before this, so the practice was more for the client’s DBAs to test various restore scenarios, with me being able to point to the right direction (when needed), supplement the run-book and stuff like that. Always fun, I like these drills!

Anyhow, This client does regular SQL Server backups to disk (full, and for some databases also log) at 19:00. They also snap the machines every night at 04:00. We don’t want to have dependencies on the machine snap, but it is nice to have in case a machine it totaled and we now can restore from such a snapshot. The issue is that this machine snapshot is seen as a full backup by SQL Server. We all know that a full backup do not affect the log backup chain, but the restore GUI doesn’t care about that!

So the restore GUI suggest that you restore from 04:00 full backup (which isn’t a restoreable backup as it was a snapshot) and then the subsequent log backups. What we need to do is to restore from earlier 19:00 full backup, and then all log backups – ignoring the 04:00 snapshot backup.

Fortunately, my client by themselves (without my intervention) did the restore using T-SQL commands, knowing what backup exists, and figuring out what to restore. But I also wanted them to test the GUI, just so they know how that look like. Of course, you can do a restore from 19:00 to 03:55, and script that to a query window. Then then from 04:00 to current time (or whatever) and script that too,. And then stitch these together. But just typing (with some copy-paste) the commands are much easier.

My point? Test your restores. Do not expect anything. A production situation is not the right time to try to figure these things and trying to cope with it.

About this snapshot software: The next version is expected to have an option to produce the snapshot as a COPY_ONLY backup. Isn’t that great? Now we expect the restore GUI to skip this COPY_ONLY backup, right? No, that was not that I saw. Having an option to produce the backup as COPY_ONLY finally allow us to implement differential backups, but it (from my tests) won’t help with the restore GUI issues. Btw, here is a related post.

Here’s a query that might be helpful if you want to see what type of backups are produced. (I appreciate feedback from anybody if you can see if a snapshot backup sets 1 in the is_snapshot column – I don’t have environment to test at the moment…)

 

SELECT TOP(100)
database_name
,CASE bs.TYPE
WHEN 'D' THEN 'Full'
WHEN 'I' THEN 'Differential'
WHEN 'L' THEN 'Log'
WHEN 'F' THEN 'File or filegroup'
WHEN 'G' THEN 'Differential file '
WHEN 'P' THEN 'Partial'
WHEN 'Q' THEN 'Differential partial'
END AS backup_type
,bs.is_copy_only
,bs.is_snapshot
,DATEDIFF(SECOND, bs.backup_start_date, bs.backup_finish_date) AS backup_time_sec
,mf.physical_device_name
,bs.database_name
FROM msdb.dbo.backupset AS bs
INNER JOIN msdb.dbo.backupmediafamily AS mf ON bs.media_set_id = mf.media_set_id
ORDER BY backup_finish_date DESC;

 

 

Ola Hallengrens Maintenance Solution now supports mirrored backup

You probably know that you can mirror a backup to several destinations, assuming you are on a supported edition (Enterprise or Developer). This is not the same as striping; you can compare striping to RAID 0, and mirroring to RAID 1.

Ola now supports mirroring in his maintenance solution, found here. A couple of examples:

EXECUTE dbo.DatabaseBackup
@Databases = 'USER_DATABASES',
@Directory = 'C:\Backup',
@MirrorDirectory = 'D:\Backup',
@BackupType = 'FULL',
@CleanupTime = 24,
@MirrorCleanupTime = 48

EXECUTE dbo.DatabaseBackup
@Databases = 'USER_DATABASES',
@Directory = 'C:\Backup,D:\Backup',
@MirrorDirectory = 'E:\Backup,F:\Backup',
@BackupType = 'FULL',
@CleanupTime = 24,
@MirrorCleanupTime = 48

Note that if any of the destinations are unanavilable, then the backup fails for all destinations. SQL Server do not produce the backup to the ones that are available. This has nothing to do with Ola’s solution, it is just how MS decided to implement backup mirroring.

 

Updated sp_indexinfo

It was time to give sp_indexinfo some love. The procedure is meant to be the “ultimate” index information procedure, providing lots of information about all indexes in a database or all indexes for a certain table. Here is what I did in this update:

  • Changed the second query that retrieves missing index information so it generates the index name (based on schema name, table name and column named – limited to 128 characters).
  • Re-arranged and shortened column names to make output more compact and more commonly used column moved to the right.
  • Uncommented some columns that were previously commented. (At least one, filter, has to be commented if you want to run this on 2005.)
  • Added support for columnstore indexes.
  • Decoded the type for columnstore indexes to col-store.

You find the procedure here. 

Does your 3:rd party backup software ruin your SQL Server backup plan?

Sounds scary, huh? And it surely can be! Here’s an example of one such case:

A client of mine had implemented a 3:rd party backup solution in their VM Ware environment. This shouldn’t affect the SQL Server backups. Or that was the plan. I won’t mention what backup software this is, since I’m sure that there are other software also doing strange things. And in the end, it is up to you to verify that your backup and restore strategy is solid. So let us just call this software “BACK”.

BACK performs a snapshot backup of the virtual machine. The client had scheduled this so it happened at about 4 am each day.

Our SQL Server backups were implements so that 7 pm we did a full backup and every hour we did a log backup.

Nothing strange here and these should be independent of each other, right? Wrong!

When looking at the backup history, I realized that SQL Server would see the snapshot backup as a full backup. OK, that will possibly affect out plan to go for differential backups, but it shouldn’t affect our transaction log backups. We all know that a database backup doesn’t break or affect the log backup chain.

But what I did find was that BACKUP performed a log backup immediately after the snapshot database backup. And the log backup was taken to the file name “nul”. Yes, the binary wasteland.

The end result of this was that the log backups that we performed were usable between 7 pm and 4 am. After that, the log backups are useless. No option to restore anything for the work performed during the working day. It would have been more honest if BACK would set the recovery model to simple instead. That would at least give us a more fair chance to catch the problem (sooner then we did).

Now, we did find an option in BACK to not truncate the log (or whatever they called the checkbox), but by default it did perform this disastrous log backup to nul.

The next step was to consider implementation of the differential backup plan. But time was running out for my assignment so we only managed a quick check. And from what we found out, it seemed that BACK doesn’t have an option to produce its snapshot backup to be seen as a COPY_ONLY backup. This means that it prohibits us to implements SQL Server differential backup and saving some 500 GB backup storage produced each week (for only one of the servers). Now, let me leave a disclaimer here since I didn’t have time to investigate this much as all, but this is what it looks like at the moment. I will dig deeper into the differential backup strategy next time I’m with this client.

The moral of the story? You already know that. Test your restore strategy. You might just be (unpleasantly) surprised!

Using whoami to check for instant file initialization

Read this if you want to read more about instant file initialization (IFI). In an earlier blog post, I posted a script that uses trace flags, created a dummy-database and then sp_readerrorlog to check for IFI.

Another option is to use the OS command whoami, as documented here. Below script uses whoami to check for IFI, or more precisely SQL Server having the “Perform Volume Maintenance Tasks” policy. It uses xp_cmdshell, meaning you have to be sysadmin in order to use it.


IF OBJECT_ID('tempdb..#res') IS NOT NULL DROP TABLE #res
SET NOCOUNT ON
GO

DECLARE
@was_xp_cmd_on tinyint
,@was_show_adv_on tinyint
,@is_ifi_enabled tinyint

--Exit if we aren't sysadmin
IF IS_SRVROLEMEMBER('sysadmin') <> 1
BEGIN
RAISERROR('You must be sysadmin to execute this script', 16, 1)
RETURN
END

--Save state for show advanced options
SET @was_show_adv_on =
(
SELECT CAST(value_in_use AS tinyint)
FROM sys.configurations
WHERE name = 'show advanced options'
)

--Turn on show advanced options, if neccesary
IF @was_show_adv_on = 0
BEGIN
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
END

--Save state for xp_cmdshell
SET @was_xp_cmd_on =
(
SELECT CAST(value_in_use AS tinyint)
FROM sys.configurations
WHERE name = 'xp_cmdshell'
)

--Turn on xp_cmdshell, if neccesary
IF @was_xp_cmd_on = 0
BEGIN
EXEC sp_configure 'xp_cmdshell', 1
RECONFIGURE
END

CREATE TABLE #res (col VARCHAR(255))

INSERT INTO #res(col)
EXEC xp_cmdshell 'whoami /priv /nh'

SET @is_ifi_enabled =
(
SELECT CASE WHEN PATINDEX('%Enabled%', col) > 0 THEN 1 ELSE 0 END
FROM #res
WHERE col LIKE '%SeManageVolumePrivilege%'
)

IF @is_ifi_enabled = 1
SELECT 'Instant file initialization is enabled'
ELSE
SELECT 'Instant file initialization is NOT enabled'

--Reset state for xp_cmdshell
IF @was_xp_cmd_on = 0
BEGIN
EXEC sp_configure 'xp_cmdshell', 0
RECONFIGURE
END

--Reset state for show advanced options
IF @was_show_adv_on = 0
BEGIN
EXEC sp_configure 'show advanced options', 0
RECONFIGURE
END

How often do you rebuild your heaps?

Never? You are not alone. None of the maintenance solutions I use includes functionality to rebuild a heap, and that includes Maintanance Plans and Ola Hallengren’s maintenance solution.

“So why would you want to rebuild a heap, when it isn’t sorted in the first place?”, you ask. The answer is to get rid of Forwarding Pointers, and these can really hurt performance by adding lots more logical and physical reads, and random I/O. See for instance this from Kalen Delaney, this from Hugo Kornelis and this from the SQLCAT team.

SQL Server 2008 added the ALTER TABLE command, with the REBUILD clause. And this is what I’m using in my procedure rebuild_heaps which rebuilds all fragmented heaps on a SQL Server.

You find the procedure here: http://karaszi.com/rebuild-all-fragmented-heaps.

Setting max server memory

If there is one server setting that is close to universal to configure, then it is probably the “max server memory” setting. The setting is documented here. There are plenty of articles out there on this subject. The purpose for this article is for me to have somewhere to point when I get the question: “What value should I set this to?”. I usually refer to Jonathan Kehayias’ blog post when I get this question: http://www.sqlskills.com/blogs/jonathan/how-much-memory-does-my-sql-server-actually-need/, another place to look is the MS documentation for memory planing. For starters you want a simple formula to begin with, and then some hints on what to monitor if you want to fine-tune the value. Jonathan’s articles provide both. The simple formula for how much to reserve for the OS is:

1 GB
Plus 1 GB for every 4 GB in the machine, between 4 and 16 GB
Plus 1 GB for every 8 GB in the machine, above 16 GB

And here’s a TSQL script if you don’t want to do the math yourself. Note that you need to specify how much memory you have in the machine.


--Based on Jonathan Kehayias' blog post:
--http://www.sqlskills.com/blogs/jonathan/how-much-memory-does-my-sql-server-actually-need/

IF OBJECT_ID('tempdb..#mem') IS NOT NULL DROP TABLE #mem
GO

DECLARE
@memInMachine DECIMAL(9,2)
,@memOsBase DECIMAL(9,2)
,@memOs4_16GB DECIMAL(9,2)
,@memOsOver_16GB DECIMAL(9,2)
,@memOsTot DECIMAL(9,2)
,@memForSql DECIMAL(9,2)
,@CurrentMem DECIMAL(9,2)
,@sql VARCHAR(1000)

CREATE TABLE #mem(mem DECIMAL(9,2))

--Get current mem setting----------------------------------------------------------------------------------------------
SET @CurrentMem = (SELECT CAST(value AS INT)/1024. FROM sys.configurations WHERE name = 'max server memory (MB)')

--Get memory in machine------------------------------------------------------------------------------------------------
IF CAST(LEFT(CAST(SERVERPROPERTY('ResourceVersion') AS VARCHAR(20)), 1) AS INT) = 9
SET @sql = 'SELECT physical_memory_in_bytes/(1024*1024*1024.) FROM sys.dm_os_sys_info'
ELSE
IF CAST(LEFT(CAST(SERVERPROPERTY('ResourceVersion') AS VARCHAR(20)), 2) AS INT) >= 11
SET @sql = 'SELECT physical_memory_kb/(1024*1024.) FROM sys.dm_os_sys_info'
ELSE
SET @sql = 'SELECT physical_memory_in_bytes/(1024*1024*1024.) FROM sys.dm_os_sys_info'

SET @sql = 'DECLARE @mem decimal(9,2) SET @mem = (' + @sql + ') INSERT INTO #mem(mem) VALUES(@mem)'
PRINT @sql
EXEC(@sql)
SET @memInMachine = (SELECT MAX(mem) FROM #mem)

--Calculate recommended memory setting---------------------------------------------------------------------------------
SET @memOsBase = 1

SET @memOs4_16GB =
CASE
WHEN @memInMachine <= 4 THEN 0
WHEN @memInMachine > 4 AND @memInMachine <= 16 THEN (@memInMachine - 4) / 4
WHEN @memInMachine >= 16 THEN 3
END

SET @memOsOver_16GB =
CASE
WHEN @memInMachine <= 16 THEN 0
ELSE (@memInMachine - 16) / 8
END

SET @memOsTot = @memOsBase + @memOs4_16GB + @memOsOver_16GB
SET @memForSql = @memInMachine - @memOsTot

--Output findings------------------------------------------------------------------------------------------------------
SELECT
@CurrentMem AS CurrentMemConfig
, @memInMachine AS MemInMachine
, @memOsTot AS MemForOS
, @memForSql AS memForSql
,'EXEC sp_configure ''max server memory'', ' + CAST(CAST(@memForSql * 1024 AS INT) AS VARCHAR(10)) + ' RECONFIGURE' AS CommandToExecute
,'Assumes dedicated instance. Only use the value after you verify it is reasonable.' AS Comment

Edit 1 2014-03-06: Got the memory in the machine from sys.dm_os_sys_info, suggested by Ola Hallengren.

Edit 2 2014-03-20: Adjusted script to work on 2008R2 and lower, as suggested by Shanky. Also added current mem config to output. Changed output from PRINT to SELECT (to facilitate multi-server query window).

Edit 3 2014-03-22: Adjusted script to support 2005, as suggested by Steve Meder. Also changed to only one resultset.

Edit 4 2014-05-30: Fixed some bugs for 2005, reported by Lee Linares.

 

Do you clean up your Database Mail log tables?

Database Mail has a couple of log tables in the msdb database. These can become large over time. I’ve seen MSDB databases over 1 GB in size, where normal size is less than 50 MB (heavy usage of old SSIS deployment model excluded).

Unfortunately Maintenance Plans do not have built-in functionality for this, nor does Ola Hallengren’s excellent maintenance solution ( http://ola.hallengren.com/ ). All you have to do is to schedule an agent job to be executed, say, every week, having one T-SQL jobstep containing: 

DECLARE @DeleteOlder DATETIME

SET @DeleteOlder = DATEADD(MONTH, -1, CURRENT_TIMESTAMP)

EXECUTE msdb.dbo.sysmail_delete_mailitems_sp @sent_before = @DeleteOlder

EXECUTE msdb.dbo.sysmail_delete_log_sp @logged_before = @DeleteOlder

Above removes mail history older than one month. Adjust to your liking, using the values in the DATEADD function.

As always, remember to comment your job and to specify appropriate database for the T-SQL jobstep (for documentation purposes, msdb in this case).

Do you want improved performance?

Can you survive a few lost transactions if your server does a “hard shutdown”? If so, check out SQL Server 2014 and “Delayed Durability”.

A cornerstone in SQL Server’s transaction handling has up until 2014 been “durability” for a committed transaction. Durability is by the way the “D” in the ACID acronym: Atomicity, Consistency, Isolation and Durability.

Durability means that SQL Server has do perform a synchronous write to the LDF file for each transaction. This so that SQL Server can re-construct all committed transactions up until the point of a (potentially hard) shutdown. 

In SQL Server 2014, MS has planned for a database setting called “Delayed Durability”. Setting this means that SQL Server can bath writes to the ldf file, meaning a potentially significant improved performance for applications where you have many small transactions.

I did a quick test, using a bench from an earlier blog post of mine (http://sqlblog.com/blogs/tibor_karaszi/archive/2008/08/14/are-inserts-quicker-to-heap-or-clustered-tables.aspx) to test what difference I would see for that workload. Roughly (for 50000 rows, on a PC with single spinning disk HD):

All inserts in one transaction averaged about 0.3 seconds.

One transaction per row with Delayed Durability set to OFF approx 12 seconds.

One transaction per row with delayed durability set to Forced approx 1.2 seconds. 

As you can see, for this workload we got about a tenfold performance improvement by letting SQL Server batch the write operations to the ldf file. The question is how much improvement you get for your workload and if you can tolerate to lose some modifications in case of a hard shutdown?