Log Page Life Expectancy over time

You often see Page Life Expectancy referred to as an interesting performance monitor counter. And it can be! It indicates for how long a page is expected to stay in cache, from the time it was brought into cache.

But just looking at a snapshot value doesn’t say that much. It might be high, but that is because you haven’t had a high turnover of you pages for the past couple of hours. Or the other way around, you happen to look just after a very rare monster query. Furthermore, having a log can show you how PLE changes over time.

“The app was slow at two o’clock today.”

You look at the PLE log and you see how PLE dropped at that time, which indicates that something was executed that did lots of I/O. Or you don’t see PLE drop at all, and can assume that it probably was a blocking situation instead.

Some of you already have performance monitoring solutions that include PLE logging over time. For you that don’t, I created a very simple solution. It involved a table, a scheduled Agent job and queries to read the data. That’s it.

You find it at http://karaszi.com/log-page-life-expectancy-over-time .

Improvements and comments are of course appreciated. 🙂

Update for Books Online with functioning index (2016)

Good news, everybody! Microsoft recently released an update for the local Books Online, with a functioning index for the relational engine.

Earlier versions were missing those index bits, but not for everyone (a theory is that it depends on what server you hit when you downloaded the BOL bits). Basic steps to get them (see my earlier blog, linked to at the end of this, for details):

  • SSMS 2016
  • Help
  • Add and Remove Help Content. This opens Help Viewer 2.2 (HW).
  • If you have the old BOL bits, you should see that there is an update:
    • Remove them
    • Update
    • HW might hang at the end, force-finish it.
  • Add the relevant BOL bits (under Recommended Documentation), the ones under “SQL Server” is for 2012.
  • Update
  • HW might hang at the end, force-finish it.

There are other issues with BOL to hopefully be addressed. See my earlier blog post for details.

July SSMS 2016 changes transaction semantics

This can be a very nasty one if you’re not observant. The July update of SSMS now has the option to change XACT_ABORT setting for a query window using

  • Tools
  • Options
  • Query Execution
  • SQL Server
  • Advanced
  • SET XACT_ABORT
Now, exposing one more SET option can’t be bad, right? But Erland Sommarskog found out that this is checked by default. Now, using XACT_ABORT might now be a bad thing, au contraire. See this one of Erland’s series of error handle articles.
The problem is that it is a change. Quite of a sudden, your transaction semantics isn’t the same, and if you execute that script from somewhere else, or even an older SSMS, it doesn’t behave the same. For instance, if you don’t use TRY CATCH (which you should be using), then the batch will now stop and rollback on most errors, instead of allowing you to do the old eighties style IF @@ERROR <> 0 dance.
Personally, I SET my set commands instead of checking a checkbox in SSMS. But I don’t SET all set options there are, just the ones I want to change from default. I hope that this one can be reverted so it isn’t checked by default.

Books Online for SQL Server 2016

This one is for those of you who prefer to use a local SQL Server Documentation, a.k.a. Books Online (BOL), instead of using the web-based BOL. A local BOL is essential for me. Navigating the contents, switching between pages and searching is lightyears ahead and quicker in the local BOL.

(Update: Added bits about the missing index.)

(Update 2: there now is a functioning index.)

Erland Sommarskog (http://www.sommarskog.se/) is possibly even more adamant that I am about having a local BOL and he has been very helpful. Also thanks to Carla Sabotta and Jennifer Hubbard for the help and patience.

Getting a functional BOL for SQL Server 2016 is a bit more complicated than for 2014. In this earlier blog post I discussed how to get a proper Table of Contents for the 2014 BOL.

Some basics about the local BOL (from now on I call this BOL):

  • It uses HelpViewer.
  •    SQL Server 2012 and 2014 used HelpViewer 1.1.
  •    SQL server 2016 uses HelpViewer 2.2.
  • F1 from within SSMS will always take you to the web based BOL.
  • When you install the SQL Server 2016 engine (not SSMS) you get two program entries. Ignore these:
  •    “SQL Server Documentation” opens the old HelpViewer 1.1. There is no 2016 BOL for HelpViewer 1.1.
  •    “Manage Help Settings” takes us to the old Help Library Manager, which is for HelpViewer 1.1.

There is no stand-alone download for the 2016 books (which you would unpack on your harddrive, as we had to do for BOL 2014). You need to do it from inside the HelpViewer 2.2 program. Here’s how to do it:

From inside SQL Server Management Studio (SSMS):

  • “Help”
  • “Add and Remove Help Content”. This opens HelpViewer 2.2.
  • Make sure that “Online” is selected.
  • Add the SQL Server 2016 parts. Don’t do what I did and look for a “SQL Server 2016” header in bold typeface. The 2016 parts is right under your nose, under “Recommended Documentation”.
  • “Update”

Unfortunately there is currently a bug in HelpViewer 2.2 and it will freeze. See https://msdn.microsoft.com/library/mt654096.aspx on how to handle this. You only need to do these steps once, but you will probably need to end HelpViewer in the Task Manager whenever you update your content.

Now you should have the 2016 BOL. Selecting “Help”, “View Help” from SSMS will still open the web-based BOL, though (a web-browser inside SSMS). So to open the local BOL 2016 from inside SSMS, you will use:

  • “Help”
  • “Add and Remove Help Content”. (Yes, the same that you used to add the books.)

(Update) Note that the index isn’t there for the relational database engine, including T-SQL reference. If you for instance in the index type GROUP BY or @@SERVERNAME you will either get no hits, or be taken to the SQL Server 2012 documentation assuming you also installed that). We’ll see if MS will build an index for the relational database topics. My contacts at MS say that they do have an index, where I don’t. What do you see? Please let us know below if you, after installed BOL 2016 according to this blog post has a working index for the relational database engine. For instance if you get a 2016 hit for GROUP BY.(/Update)

(Update 2): I today, 2016-09-05, noticed that MS recently released an update for BOL 2016 and the index is there. Yay! (/Update2)

How about a program icon from which you can start BOL 2016 outside of SSMS? Create shortcut somewhere which point to below:
“C:\Program Files (x86)\Microsoft Help Viewer\v2.2\HlpViewer.exe” /catalogName SSMS16 /locale en-US

What if you don’t have SSMS installed, but still want a local BOL? There’s nothing from the SQL Server team to help you here. You would have to install HelpViewer 2.2 on your own (something I haven’t bothered to investigate).

See https://msdn.microsoft.com/en-US/library/mt736393(SQL.130).aspx and https://msdn.microsoft.com/en-us/library/mt703151(v=sql.1).aspx
(the later is a bit more generic).

Getting a functional Books Online

As you probably know, the SQL Server Documentation (aka Books Online or BOL) as of SQL Server 2012 takes us to a web-site. Searching and navigating a web-site when you want rapid and precise help is a horrible experience.

  • Fortunately, there’s still a local BOL.
  • Unfortunately, the local BOL for SQL Server 2014 has been broken for a long time.
  • Fortunately, there is a way to fix it.

The trick is to not do it how you are supposed to do it. At least not all steps. Here’s how to get a functional local BOL:

  1. Download the help files from https://www.microsoft.com/en-us/download/confirmation.aspx?id=42557
  2. Unpack the files somewhere
  3. Open the Manage Help Settings application
  4. “Choose online or local help”
  5.  Select “I want to use local help”
  6. OK
  7. Do not select “Install content from Online”. This is where you need to divert from how you (generally) are supposed to do it.
  8. “Install content from disk”
  9. Point to wherever you unpacked the files in step 2 above
  10. “Add” all books
  11. “Update”

So it is only a matter of downloading the files ourselves instead of letting the Help Library Manager (“Manage Help Settings”) tool do it.

If you wonder in what way the BOL is broken, then it is the Table of Content (TOC) which lacks a lot of subjects (for instance the T-SQL reference lacks bunch of things).

Restoreability and SSMS

I have written about this before, how SSMS generates restore commands that will fail. This post is about showing what it might look like using screenshots. If you always use T-SQL directly to do restore, then you won’t be affected by this. But if you expect to be able to perform restore using the restore dialog, then read on.

The problem
The issue is that SSMS base a restore sequence on backups that are impossible to restore from. There are two cases I have found:

Copy only backups
The purpose of the COPY_ONLY option for the backup command is to do an out-of-bands backup. I.e., you do an ad-hoc backup as a one-time thing, restore this on your test server and then delete the backup file. The copy only option is for you to say that you don’t want to affect your scheduled backups with this ad-hoc backup. We expect SSMS to not base the restore sequence on these backups – since it is likely that the backup file has been deleted.

Snapshot backups
If you perform a snapshot of a virtual machine (or “checkpoint” as Hyper-V calls it) then this snapshot might interact with VSS in Windows so you get a consistent state of your SQL Server databases. I.e., when you restore/apply such a snapshot, your databases should be just fine. This is great, as long as it doesn’t mess up the SQL Server backups that we produce. It won’t. But the restore GUI in SSMS can be messed up by such a snapshot.

The timeline dialog
The problem seems to occur in the timeline dialog. In the restore GUI, there’s a nifty timeline dialog where you can use a slider to graphically specify at what point in time you want to restore the database to. The GUI uses the backup history information in msdb and based on that figures out what restore operations to perform. This is actually really nice and friendly. As long as it works. The problem is when you specify an earlier point in time, it sometimes uses the incorrect full backup – a full backup which isn’t possible to restore from.

Examples:

RestoreabilityCopyOnlyOK.PNG

Above, the GUI correctly skips a copy only backup. A full copy only backup was produced after the full backup listed above, but the GUI is smart enough to not base the restore sequence on this copy only backup. This is how it should work.

RestoreabilityTimeline

RestoreabilityCopyOnlyNotOK

Above, the GUI incorrectly base the restore on a copy only backup. After using the timeline dialog to point to an earlier point in time, you can see that the GUI now has changed so it bases the restore on this potentially non-existing copy only backup. Not a nice situation to be in if the person doing the restore hasn’t practiced using the T-SQL RESTORE commands.

RestoreabilitySnapOK

Above, the GUI correctly skips a snapshot backup. A snapshot backup using VSS was produced after the full backup listed above, but the GUI is smart enough to not base the restore sequence on this snapshot backup. This is how it should work.

RestoreabilitySnapNotOK.PNG

RestoreabilitySnapNotOKTSQL

Above, the GUI incorrectly base the restore on a snapshot backup. After using the timeline dialog to point to an earlier point in time, you can see that the GUI now has changed so it bases the restore on the snapshot backup. This is immensely bad since the snapshot doesn’t exist in SQL Server. It is just a placeholder so that SQL Server is aware that a snapshot was performed at that point in time. Look at the RESTORE command it produces!

You might wonder how to tell if something is producing VSS snapshots of your machine? You can see that in the errorlog file. Here are a few lines from the errorlog on a machine where I used Hyper-V to produce a proper VSS snapshot (edited for readability):

2016-03-16 13:30:23.75      I/O is frozen on database Adventureworks2014.
2016-03-16 13:30:23.75      I/O is frozen on database theBackupTestTib.
2016-03-16 13:30:23.75      I/O is frozen on database master.
2016-03-16 13:30:29.33      I/O was resumed on database master.
2016-03-16 13:30:29.33      I/O was resumed on database theBackupTestTib.
2016-03-16 13:30:29.34      I/O was resumed on database Adventureworks2014.

Bottom line
You already know this. Practice doing restores – as realistically as possible and using different scenarios.

Disclaimer: Things might change. The tests I did was using SSMS 2014 for the copy only backups and for SSMS 2016 CTP 3.3 for snapshot backups. I have seen this behaviour since SQL Server 2012, though. I wish that this will be fixed in a future version of SSMS, but considering that my requests has gone unnoticed before, I don’t have much hopes. But if you do see something different, let me know and I can re-test. Just make sure to add as much details as possible.

I used my stored procedure at found here to perform the backups – made the T-SQL used for these tests a bit easier to read:

--Show that restore based on backup history idn't possible
--is the most recent full backup is missing, even if COPY_ONLY was used.

--Create the BackupDbWithTs procedure first:
--http://www.karaszi.com/sqlserver/util_backup_script_like_MP.asp

--We will do full backups, diff backups and log backups.
--In between, we will also do a COPY_ONLY full backup (think of it as an ad-hos backup subsequentially deleted).

---------------------------------------------------------------------------------------------
--Drop and create the test database
--NOTE: This will drop a database named theBackupTestTib is such exists!!!!
---------------------------------------------------------------------------------------------
IF DB_ID('theBackupTestTib') IS NOT NULL
DROP DATABASE theBackupTestTib
GO

CREATE DATABASE theBackupTestTib
GO

--Set to full recovery
ALTER DATABASE theBackupTestTib SET RECOVERY FULL
GO

---------------------------------------------------------------------------------------------
--Create a table so we can make a modification in the database between each backup
---------------------------------------------------------------------------------------------
IF OBJECT_ID('theBackupTestTib.dbo.t') IS NOT NULL DROP TABLE theBackupTestTib.dbo.t
GO

CREATE TABLE theBackupTestTib.dbo.t(c1 INT IDENTITY, c2 CHAR(30) DEFAULT 'Hello backup world')
GO

---------------------------------------------------------------------------------------------
--Declare and set variables.
--Adjust to your liking.
DECLARE
@i INT = 1
,@time_in_between VARCHAR(20) = '00:01:00'    --Time between each backup. 1 sec is fine to test thet restore GUI work.
--Set to higher if you want to use the timeline restore dialog and visually see your backups.
,@iterations INT = 2                      --Number of iterations. Adjust as you wish, but do at least two.
,@db SYSNAME = 'theBackupTestTib'
,@path VARCHAR(200) = 'R:\'                   --Where to put your backup files. Delete backup files after you are finished.

WHILE @i <= @iterations
BEGIN
INSERT INTO theBackupTestTib.dbo.t  DEFAULT VALUES;

EXEC MASTER..BackupDbWithTs @db, @path, 'FULL', 'N', 'N', 'N'; INSERT INTO theBackupTestTib.dbo.t DEFAULT VALUES; WAITFOR DELAY @time_in_between
EXEC MASTER..BackupDbWithTs @db, @path, 'LOG ', 'N', 'N', 'N'; INSERT INTO theBackupTestTib.dbo.t DEFAULT VALUES; WAITFOR DELAY @time_in_between
EXEC MASTER..BackupDbWithTs @db, @path, 'LOG ', 'N', 'N', 'N'; INSERT INTO theBackupTestTib.dbo.t DEFAULT VALUES; WAITFOR DELAY @time_in_between

--Perform the COPY_ONLY or snapshot backup now:
EXEC MASTER..BackupDbWithTs @db, @path, 'FULL', 'N', 'N', 'Y'; INSERT INTO theBackupTestTib.dbo.t DEFAULT VALUES; WAITFOR DELAY @time_in_between --COPY_ONLY

EXEC MASTER..BackupDbWithTs @db, @path, 'DIFF', 'N', 'N', 'N'; INSERT INTO theBackupTestTib.dbo.t DEFAULT VALUES; WAITFOR DELAY @time_in_between
EXEC MASTER..BackupDbWithTs @db, @path, 'LOG ', 'N', 'N', 'N'; INSERT INTO theBackupTestTib.dbo.t DEFAULT VALUES; WAITFOR DELAY @time_in_between
EXEC MASTER..BackupDbWithTs @db, @path, 'LOG ', 'N', 'N', 'N'; WAITFOR DELAY @time_in_between

SET @i = @i + 1
END

Misunderstandings about the COPY_ONLY backup option

The COPY_ONLY option for the backup command never ceases to cause confusion. What does it really do? And in what way does it affect your restore sequence? Or not?

There are two sides to this. Restoreability and how the GUI behaves:

Restoreability

If you specify COPY_ONLY for a full backup, it will not affect the following differential backups. I.e., the following differential backups will be based on  the last full backup which was not performed with COPY_ONLY. Another way of looking at this is that a full backup using COPY_ONLY will not reset the DIFF (also known as DCM) page, which is page 6 and repeated approximately every 4 GB in each database file.

If you specify COPY_ONLY for a log backup, it will not affect the following log backups. I.e., the log backup chain will be without the one(s) produced using COPY_ONLY. Another way of looking at this is that a log backup using COPY_ONLY does not empty (truncate) the log.

That’s it! COPY_ONLY for a full backup does not, in any way, affect the restoreability for log backups. Say you perform

FULL_A
LOG_A
LOG_B
FULL_B
LOG_C

You can restore FULL_A, LOG_A, LOG_B and LOG_C, regardless of whether you used COPY_ONLY for FULL_B or not! I see misunderstanding about this all the time.

The restore GUI

And then we have the restore GUI. I really want to like the GUI, but it… keep disappointing me. For various reasons. To the extent that it frequently becomes useless. I’ve blogged about it before. I know that many of you readers don’t use the GUI when you perform restore. But there are so many “accident” DBAs and less experienced DBA’s which expect to be able to use the restore GUI – and they should! But they can’t. Enough ranting, time for some details.

Prior to SQL Server 2012, the restore GUI didn’t include COPY_ONLY backups. The design reasoning was probably that you did an out-of-band backup and that backup file is no longer there (you deleted it after restoring to the test server, for instance). Sound reasoning, in my opinion.

But, as of 2012, the restore GUI includes COPY_ONLY backups. So, say you do:

FULL_A
LOG_A
LOG_B
FULL_B using COPY_ONLY
LOG_C

Also, say that the backup file for FULL_B was deleted. You use the GUI and specify that you want to the restore up until LOG_C. The restore GUI will now select FULL_B and LOG_C, which is a disaster since FULL_B was deleted (it was produced using COPY_ONLY). And you can’t make the restore GUI to base the restores on FULL_A. If you don’t have the practiced to type your restore commands, your organization (boss?) will not be happy with you now.

So, why did Microsoft do this change in 2012? My guess is it because of availability groups. If you perform a backup on a read-only replica, you need to use COPY_ONLY. Btw, we have the same situation if you did a VSS snapshot of the machine (which I blogged about before).

Bottom line? You know this already. Learn how things work, and practice your restore scenarios. Nothing new here.

(I typed everything above off the top of my head. If there is anything you don’t agree with, then please be very specific. I don’t mind discussions, I appreciate it. But to be fruitful, it need to be concrete. So, be specific, ideally including TSQL commands, and specifics on how you drive the GUI (if relevant). And, remember that things can change over time; service packs and new versions of SSMS – so please include those details as well. 🙂 )

We are now recommended to install cumulative updates

Microsoft just released a couple of CUs for SQL Server 2012. What is exiting is how Microsoft phrases their recommendations whether we should install them or not. Below is from the KB of one of those recently released CUs:

  • Microsoft recommends ongoing, proactive installation of CUs as they become available:
  • SQL Server CUs are certified to the same levels as service packs and should be installed at the same level of confidence.
  • Historical data shows that a significant number of support cases involve an issue that has already been addressed in a released CU.
  • CUs may contain added value over and above hotfixes. This includes supportability, manageability, and reliability updates.

Now, that is a pretty significant change from what they used to say. In addition, requiring the CU is much easier. You just go to MS Download, select whether you want 32 or 64 bit and then download the bits immediately.

Check it out yourself, go to for instance https://www.microsoft.com/en-us/download/details.aspx?id=50731.

Or check out how the KB for a new SQL Server CU: https://support.microsoft.com/en-us/kb/3120313 (see the “Notes for the update” section).

Managing the errorlog file

I frequently see recommendations to regularly run sp_cycle_errorlog, so that the errorlog doesn’t become huge. My main concern with that is that the errorlog contains valuable information.

When I do a health check on a SQL Server machine, I want a few months worth of errorlog information available. I typically use my own scripts for this, available here. Almost every time I go through the errorlog, I find valuable information. Some things you address, like find whatever it is that is attempting to login every minute. Other things you might not have control over, but the information is valuable to have.

So, if you run sp_cycle_errorlog every day or week, you end up with only a week worth, or a few weeks worth of errorlog file information.

Suggestion 1: Increase the number of errorlog files.

You probably want more than 6 history errorlog files. For instance, a client of mine told me that he was about to patch a server a few days before I was to visit that client. That patch procedure resulted in enough re-start of SQL Server so we ended up with only 4 days worth of errorlog files. Yes, this client had the default of 6 historic errorlog files. I typically increase this to 15. You can do this by right-clicking the “SQL Server Logs” folder under “Management” in SSMS. If you want to use T-SQL, you can use xp_instance_regwrite, as in:

EXEC xp_instance_regwrite
N’HKEY_LOCAL_MACHINE’
,N’Software\Microsoft\MSSQLServer\MSSQLServer’
,N’NumErrorLogs’, REG_DWORD, 15;

Suggestion 2: Set a limit for the size of the errorlog file.

But what about the size? Say that we have crash dumps, for instance. Or other things that start to happen very frequently. The good news is that as of SQL Server 2012, we can set a max size for the errorlog file. There is no GUI for this, so we have to manipulate the registry directly. Again, we can use xp_instance_regwrite. Below will limit the size to 30 MB:

EXEC xp_instance_regwrite
N’HKEY_LOCAL_MACHINE’
,N’Software\Microsoft\MSSQLServer\MSSQLServer’
,N’ErrorLogSizeInKb’, REG_DWORD, 30720;

With 15 files, you can patch of your SQL Server machine without aging out all old errorlog files. And with a max size of 30 MB, you keep each file manageable in size. And you keep the total size of errorlog files for that instance to 450 MB. Not enough to fill your disks. But enough to have historical information for when you are about to perform a health check on your SQL Server instance.

References: this by Jan Kåre and this by Paul Randal.

Can you restore from your backups? Are you sure?

 

A few days ago, we were doing restore practice drills with a client. I had tested the stuff before this, so the practice was more for the client’s DBAs to test various restore scenarios, with me being able to point to the right direction (when needed), supplement the run-book and stuff like that. Always fun, I like these drills!

Anyhow, This client does regular SQL Server backups to disk (full, and for some databases also log) at 19:00. They also snap the machines every night at 04:00. We don’t want to have dependencies on the machine snap, but it is nice to have in case a machine it totaled and we now can restore from such a snapshot. The issue is that this machine snapshot is seen as a full backup by SQL Server. We all know that a full backup do not affect the log backup chain, but the restore GUI doesn’t care about that!

So the restore GUI suggest that you restore from 04:00 full backup (which isn’t a restoreable backup as it was a snapshot) and then the subsequent log backups. What we need to do is to restore from earlier 19:00 full backup, and then all log backups – ignoring the 04:00 snapshot backup.

Fortunately, my client by themselves (without my intervention) did the restore using T-SQL commands, knowing what backup exists, and figuring out what to restore. But I also wanted them to test the GUI, just so they know how that look like. Of course, you can do a restore from 19:00 to 03:55, and script that to a query window. Then then from 04:00 to current time (or whatever) and script that too,. And then stitch these together. But just typing (with some copy-paste) the commands are much easier.

My point? Test your restores. Do not expect anything. A production situation is not the right time to try to figure these things and trying to cope with it.

About this snapshot software: The next version is expected to have an option to produce the snapshot as a COPY_ONLY backup. Isn’t that great? Now we expect the restore GUI to skip this COPY_ONLY backup, right? No, that was not that I saw. Having an option to produce the backup as COPY_ONLY finally allow us to implement differential backups, but it (from my tests) won’t help with the restore GUI issues. Btw, here is a related post.

Here’s a query that might be helpful if you want to see what type of backups are produced. (I appreciate feedback from anybody if you can see if a snapshot backup sets 1 in the is_snapshot column – I don’t have environment to test at the moment…)

 

SELECT TOP(100)
database_name
,CASE bs.TYPE
WHEN 'D' THEN 'Full'
WHEN 'I' THEN 'Differential'
WHEN 'L' THEN 'Log'
WHEN 'F' THEN 'File or filegroup'
WHEN 'G' THEN 'Differential file '
WHEN 'P' THEN 'Partial'
WHEN 'Q' THEN 'Differential partial'
END AS backup_type
,bs.is_copy_only
,bs.is_snapshot
,DATEDIFF(SECOND, bs.backup_start_date, bs.backup_finish_date) AS backup_time_sec
,mf.physical_device_name
,bs.database_name
FROM msdb.dbo.backupset AS bs
INNER JOIN msdb.dbo.backupmediafamily AS mf ON bs.media_set_id = mf.media_set_id
ORDER BY backup_finish_date DESC;