Mirroring: what happens if the principal loses contact with both mirror and witness?

Imagine a database mirroring setup where you have two LANs with a WAN link in between. Let's call them site A and site B. Say that the principal is currently in site A, and both the mirror and the witness are in site B. I.e., we are running database mirroring with a witness and, assuming we are running safety FULL (synchronous), we have automatic failover.

Now, what does failover really mean when it comes to mirroring? The simple answer is that the mirror will finish the recovery process (UNDO) and make the database available. So, what if the mirror and witness become isolated from the principal? In our setup, this would happen if site A is disconnected from site B. Well, the failover protocol is as follows:

If the mirror cannot contact the principal, it asks the witness whether it has contact with the principal. If the witness responds and says it doesn't have contact with the principal either, then the two consider this a quorum. They are in the majority, and failover occurs. I.e., the mirror will make the database available (and essentially becomes the principal).

But, think about the poor principal now. It is isolated and knows nothing about what happens at the other end. The only logical thing to do is to make the database inaccessible. If that didn’t happen, then we would have a case where both sites had the database available, and we wouldn’t want that!

This is nothing new, and it is clearly stated in Ron Talmage’s great whitepaper on mirroring: http://technet.microsoft.com/en-us/library/cc917680.aspx.

This is easy to test. Set up mirroring (you can even do it between three instances on the same OS). Make sure you are running safety FULL and have a witness. Now check which Windows process ID the principal has:

SELECT SERVERPROPERTY('ProcessID')

Based on the above, you now use Task Manager to kill the *other* two SQL Servers. Make sure that SQL Server Agent isn't running for those two, or else Agent will start those SQL Servers again. Management Studio will now show the database as "In Recovery", and if you try to access it, for instance from a query window, you get the error below:

Msg 955, Level 14, State 1, Line 1
Database Adventureworks2008 is enabled for Database Mirroring,
but the database lacks quorum: the database cannot be opened.
Check the partner and witness connections if configured.
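
To see how the still-running partners reason about the situation during such a test, you can also query the sys.database_mirroring catalog view. Below is a minimal sketch (not part of the original test, just the catalog view) that you can run on whichever instance you can still reach:

SELECT  d.name
       ,m.mirroring_role_desc
       ,m.mirroring_state_desc
       ,m.mirroring_witness_state_desc
FROM sys.database_mirroring AS m
INNER JOIN sys.databases AS d ON d.database_id = m.database_id
WHERE m.mirroring_guid IS NOT NULL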

How selective do we need to be for an index to be used?

You know the answer already: It depends. But I often see some percentage value quoted and the point of this post is to show that there is no such percentage value.

To get the most out of this blog post, you should understand the basic structure of an index, i.e. what the b+ tree looks like. You should also understand the difference between a clustered and a non-clustered index. In essence, you should be able to visualize these structures and searches through them as you read the text. If you find that difficult, draw a few versions on a piece of paper and "navigate" through them by tracing through them with a pen or your finger. After a while, you will do this in your mind. For instance, check out the sections below.

I'm not saying that we shouldn't consider selectivity when designing indexes – of course we should! And I'm not saying that one shouldn't have some vague "feeling" for how much data will be returned when making such decisions. What I will prove is that there is in reality no set percentage that the optimizer uses. The comment we usually see is something like:

“If we return more than 5% of the rows, then an index will not be used.”

Where did that 5% number come from? I can assure you that it is not some hard-wired number in the optimizer (except for an edge case, see below). The optimizer aims at running the query with as low a cost as possible. Considering the data access part (think WHERE clause and the condition), this is pretty much about reading as few pages as possible (few page accesses).

Just to cut down a bit on the thread that might follow this type of blog post ("Hey, when I do this, your observations don't match – your blog post is incorrect!"), let us first consider some special cases:

Clustered index
The clustered index *is* the data. If the search condition (SARG) is seekable, then SQL Server will obviously seek through the clustered index instead of scanning it. Anything else would be stupid. There can be *other* search conditions that are more efficient, but we are considering one search condition at a time.

Non-clustered index that covers the query
This is pretty much the same argument as above. Since all the data is in the index (it "covers the query"), not seeking it would be stupid. Again, there can be cheaper alternatives for other search conditions, but we are considering one condition at a time.

The value is not known to the optimizer
This is what happens when you compare against a T-SQL variable, something like "colname = @v". The optimizer has no knowledge of the contents of this variable. Either it uses density (where applicable, as for "="), as stored in the statistics information of the index, or, where that isn't applicable (as for ">", "<", "BETWEEN" etc.), the optimizer actually does use some hard-wired percentage value. This value can change between versions, so give it a spin if you want to know what value you have for your version/build number. Note that a variable is not the same thing as a parameter. SQL Server sniffs parameters (parameter sniffing). Read this for elaboration: http://msdn.microsoft.com/en-us/library/ee343986.aspx.
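
To make the variable vs. parameter difference concrete, here is a minimal sketch (table and column names are made up for the illustration): the first query compares against a local variable, so the optimizer has to guess; the second passes the value as a parameter to sp_executesql, so it can be sniffed.

DECLARE @v INT
SET @v = 1000

--Local variable: the optimizer cannot see the value at compile time,
--so it uses density (for "=") or a hard-wired percentage (for ">", "<", BETWEEN):
SELECT * FROM dbo.SomeTable WHERE SomeCol > @v

--Parameter: sniffed at compile time, so the estimate comes from the statistics histogram:
EXEC sp_executesql
     N'SELECT * FROM dbo.SomeTable WHERE SomeCol > @p'
    ,N'@p INT'
    ,@p = @v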

The search expression is not seek-able
I hope you know this already, but just to point it out: in most cases, having some calculation on the column side will void the ability to seek through the index. This should ideally be known to all T-SQL developers: never do calculations on the column side! So, things to avoid are expressions like "colname * 23 > 45" or "SOMEFUNCTION(colname) = 44".
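
As a quick illustration (assumed table dbo.Orders with an index on OrderDate): the first query has the calculation on the column side and cannot seek; the rewrite leaves the column untouched, so the index can be seeked.

--Calculation on the column side: no seek on an index on OrderDate
SELECT * FROM dbo.Orders WHERE YEAR(OrderDate) = 2008

--Rewritten with the column side left alone: seekable
SELECT * FROM dbo.Orders
WHERE OrderDate >= '20080101' AND OrderDate < '20090101'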

Hopefully by now we all understand that there are always special cases and exceptions. The more of Kalen's books you have read, the more you understand this. What we are discussing here is the typical situation. OK? Fine. So, "Why is there no percentage value that the optimizer uses?", you ask. Because the value will differ. In short, SQL Server wants to read as few pages as possible. In the simplest example, the alternative to an index seek is a table scan, so we will use this as the basis for our discussion. There can be other alternatives to the table scan (using some other index for some other condition), but that doesn't change the principle – the "it depends" concept.

In essence, it is all about the alternative. As I said, our example will use a table scan as the alternative. A table scan (or clustered index scan if it is a clustered table) means that SQL Server will look at every page and see which rows satisfy the search condition on each page.

My example has two different tables, both with 100,000 rows. Both tables have an integer column with consecutively increasing unique values, which also has a non-clustered index. I will see how selective I need to be when searching on this column in order for an index search to be done instead of a table scan, i.e., find this percentage cut-off value.

The fewrows table only fits one row per data page. This means 100,000 data pages. My tests show that the cutoff for fewrows is about 31,000 rows. I.e., we need to be more selective than 31% for the index to be used.

The manyrows table fits 384 rows per page. This means 260 pages. My tests show that the cutoff for manyrows is about 155 rows. I.e., we need to be more selective than 0.16% for the index to be used.

You might end up with different exact numbers, depending on what you have in the statistics, the build number of your SQL Server, etc. But what you will see is a similar pattern: a huge difference between the two.

It is all about the alternative
If I look at my indexes using sys.dm_db_index_physical_stats, I will see that the non-clustered indexes on the int column for the two tables are exactly the same (same number of pages in the index, etc.). So, two indexes with the same characteristics have very different cut-off values. How can that be? It is because the alternative differs. The alternative for this example is a table scan. For the fewrows table, the alternative means reading 100,000 pages. But for the manyrows table, the alternative means reading only 260 pages. There can of course be other alternatives, like using some other index for some other search condition. This is, in the end, why we don't have a set percentage value: it is all about the alternative!

Conclusion
The typical cases will of course fall somewhere between my more extreme examples. But my examples show that there is no set percentage value used by the optimizer. I showed that for my tests, the percentage value can be as low as 0.16% or as high as 31%. What matters is the alternative!

T-SQL

USE tempdb
GO

IF OBJECT_ID('manyrows') IS NOT NULL DROP TABLE manyrows
IF OBJECT_ID('fewrows') IS NOT NULL DROP TABLE fewrows
GO

CREATE TABLE manyrows(c1 INT IDENTITY PRIMARY KEY, c2 INT, c3 INT)
CREATE TABLE fewrows(c1 INT IDENTITY PRIMARY KEY, c2 INT, c3 CHAR(4500))
GO

INSERT INTO manyrows
SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.object_id) AS c2, 0 AS c3
FROM sys.columns AS a, sys.columns AS b

INSERT INTO fewrows
SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.object_id) AS c2, 'hi' AS c3
FROM sys.columns AS a, sys.columns AS b

CREATE INDEX x ON manyrows (c2)
CREATE INDEX x ON fewrows (c2)

--Number of pages etc:
EXEC sp_indexinfo 'manyrows'
-- Data: 265 pages (2.07 MB)
-- Index x: 187 pages (1.46 MB)

EXEC sp_indexinfo 'fewrows'
-- Data: 100385 pages (784 MB)
-- Index x: 187 pages (1.46 MB)

SELECT OBJECT_NAME(object_id) AS table_name, *
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED')

--Run below with showplan:
SELECT * FROM manyrows
WHERE c2 BETWEEN 1 AND 155
--155 = ix search, 156 = ts
--Cut-off is 0.16%

SELECT * FROM fewrows
WHERE c2 BETWEEN 1 AND 31000
--31000 = ix search, 32000 = ts
--Cut-off is 31%

Restore database to the point of disaster

This is really basic, but so often overlooked and misunderstood. Basically, we have a database, and something goes south. Can we restore all the way up to that point? I.e., even if the last backup (db or log) is earlier than the disaster?
Yes, of course we can (except for the more extreme cases, read on), but many don't realize that, or don't do it, for some strange reason.

This blog post was inspired by a thread in the MSDN forums, which exposed just this misunderstanding. Basically, the scenario was that they do a db backup and a log backup only once a day. Now, doing log backups that infrequently is of course a bit weird, but that is beside the point. The point is that you can recover all the way up to the point of disaster. Of course, it depends on what the disaster is (don't expect too much if the planet blows up, for instance).

Since "log backup only once a day" was mentioned, I will first elaborate a bit on frequency for database vs. log backups. For the sake of discussion, say we do both a db and a log backup once a day. You say:
"What? Both db backup and log backup once a day – why would anybody do it that way? Wouldn't one do log backups more frequently than db backups?"
Yes, of course (but I actually see such weird implementations from time to time). But again, that doesn't change the topic at hand. I will still elaborate on it first, just so we don't see blurring comments later arguing this irrelevant point.

So, let's first sort out two different cases:

A) Log backup before the db backup
1: db backup
2: log backup
3: db backup
crash
Here we will use backup 3 when we later restore.

B) Db backup before log backup
1: db backup
2: db backup
3: log backup
crash
Here we will use backups 2 and 3 when we later restore.

Do you see that A) and B) are really the same thing? What is relevant is that we have all log records available (in the ldf file/log backups) since the db backup we choose to use as the starting point for the restore. Actually, for A), we might as well use backups 1 and 2 (and skip 3)!

"Hang on", you say, "we're not done yet. What about the modifications since the last log backup? Gotcha!"
No worries, this is where it gets interesting, and below is really the heart of the topic. Clearly, we need to get the log records out of the ldf file and into a log backup (file). If we can do that, then we will call this backup number 4 and use it as the last backup for our restore. After doing that restore, we have no data loss!

So, how do we produce a log backup after a disaster?
It depends on the disaster! Let’s discuss a few scenarios:

a) Planet Earth blows up.
No can do. I doubt that any of you have mirrored data centers on the Moon or Mars, and people stationed off-Earth for these situations. Of course, I'm being silly. But my point is that you can always have a disaster such that you can't produce that last log backup. No matter how much you mirror: if the disaster takes out all mirrors, then you are toast. Remember that when you talk SLAs. That fact is not popular, but it can't be argued. It is all about limiting the risk exposure – not eliminating it. Anybody who believes we can eliminate risk exposure is dreaming. Agreed? Good. Let's move on to (hopefully) more realistic scenarios:

b) Something happens with the data file(s) – lost, corrupt or so. The ldf file is still there.
This is the easy case, but so often overlooked. What you do now is back up the log of the damaged database, using the NO_TRUNCATE option. Something like:
BACKUP LOG dbname TO DISK = 'C:\dbname.trn' WITH NO_TRUNCATE
Yes, it really is that simple. Then restore the backups from above, including this last log backup. Don't believe me? Test it – a T-SQL sketch of the steps follows the list below.

  1. Create database and table
  2. Insert some data
  3. Do db backup (1)
  4. Insert some more data
  5. Do log backup (2)
  6. Insert some more data
  7. Stop SQL Server
  8. Delete mdf file
  9. Start SQL Server
  10. Do log backup using NO_TRUNCATE (3)
  11. Restore 1, 2 and 3.
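
Here is a minimal T-SQL sketch of those steps. The database name (dbDemo) and the C:\ paths are just assumptions for the illustration:

--1: create database and table (full recovery so the log is kept)
CREATE DATABASE dbDemo
ALTER DATABASE dbDemo SET RECOVERY FULL
CREATE TABLE dbDemo.dbo.t(c1 INT IDENTITY)

--2: insert some data
INSERT INTO dbDemo.dbo.t DEFAULT VALUES

--3: db backup (1)
BACKUP DATABASE dbDemo TO DISK = 'C:\dbDemo.bak' WITH INIT

--4: insert some more data
INSERT INTO dbDemo.dbo.t DEFAULT VALUES

--5: log backup (2)
BACKUP LOG dbDemo TO DISK = 'C:\dbDemo1.trn' WITH INIT

--6: insert some more data
INSERT INTO dbDemo.dbo.t DEFAULT VALUES

--7-9: stop SQL Server, delete the mdf file, start SQL Server again

--10: log backup using NO_TRUNCATE (3)
BACKUP LOG dbDemo TO DISK = 'C:\dbDemo2.trn' WITH INIT, NO_TRUNCATE

--11: restore 1, 2 and 3
RESTORE DATABASE dbDemo FROM DISK = 'C:\dbDemo.bak' WITH REPLACE, NORECOVERY
RESTORE LOG dbDemo FROM DISK = 'C:\dbDemo1.trn' WITH NORECOVERY
RESTORE LOG dbDemo FROM DISK = 'C:\dbDemo2.trn' WITH RECOVERY

--All three rows should be back:
SELECT * FROM dbDemo.dbo.t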

c) Something happens with the database. The ldf file is NOT still there.
Clearly, if the ldf file is really gone, we can't do a log backup – however much we might want to. Remember the old days, when redundancy for disks (RAID) wasn't as common as it is today? "If there's anywhere you want redundancy, it is for the transaction log files!"

d) Something happens with the data file(s) – lost, corrupt or so. The ldf file is still there, but the installation is toast – we can't start SQL Server.
This seems a bit more nerve-wracking, right? Not to worry; just do the right steps and you will be fine. You probably ask now:

“But how can we backup the transaction log when our SQL Server won’t start?”

That is a good question. You need to get that ldf file to a healthy SQL Server and make SQL Server believe it is the ldf file for a broken database on that instance. It is not really complicated. Just use a dummy database on that SQL Server as an intermediate – to get the right meta-data into that SQL Server, so that in turn it will allow you to produce this last log backup. I will show just that:

I have two instances on my machine (named "a" and "b"). I will create and damage a database on instance a, and then produce a log backup for that orphaned ldf file on a different instance, b. I will pretend these are on two different machines, using separate folders for the database files: "C:\a" and "C:\b". Here's the T-SQL, starting with instance a:

IF DB_ID('x') IS NOT NULL DROP DATABASE x
GO
CREATE DATABASE x
ON PRIMARY
(NAME = N'x', FILENAME = N'C:\a\x.mdf', SIZE = 10MB, FILEGROWTH = 10MB)
LOG ON
(NAME = N'x_log', FILENAME = N'C:\a\x_log.ldf', SIZE = 5MB, FILEGROWTH = 5MB)
GO
ALTER DATABASE x SET RECOVERY FULL
CREATE TABLE x.dbo.t(c1 INT IDENTITY)
INSERT INTO x.dbo.t DEFAULT VALUES --1
BACKUP DATABASE x TO DISK = 'C:\x.bak' WITH INIT
INSERT INTO x.dbo.t DEFAULT VALUES --2
BACKUP LOG x TO DISK = 'C:\x1.trn' WITH INIT
INSERT INTO x.dbo.t DEFAULT VALUES --3
SELECT * FROM x.dbo.t

--Stop SQL Server and delete below file
--C:\a\x.mdf
--Start SQL Server

--Oops, damaged database...:
SELECT * FROM x.dbo.t

--Stop SQL Server, pretend the installation is toast

Do we agree that we have a damaged database, and that modifications have been made since the last log backup? OK, fine. We now pretend that SQL Server instance "a" doesn't start anymore. So, I will try to produce a log backup from that ldf file on instance "b":

IF DB_ID('x2') IS NOT NULL DROP DATABASE x2
IF DB_ID('x') IS NOT NULL DROP DATABASE x
GO
CREATE DATABASE x2
ON PRIMARY
(NAME = N'x2', FILENAME = N'C:\b\x2.mdf', SIZE = 9MB, FILEGROWTH = 8MB)
LOG ON
(NAME = N'x2_log', FILENAME = N'C:\b\x2_log.ldf', SIZE = 6MB, FILEGROWTH = 7MB)
GO

--Stop SQL Server and delete below files
--C:\b\x2.mdf
--C:\b\x2_log.ldf

--Copy the C:\a\x_log.ldf to C:\b\x2_log.ldf

--Start SQL Server

--Produce our last log backup:
BACKUP LOG x2 TO DISK = 'C:\x2.trn' WITH INIT, NO_TRUNCATE

--Restore the database, up to the last transaction.
--Investigate logical file names for MOVE options first:
RESTORE FILELISTONLY FROM DISK = 'C:\x.bak'

RESTORE DATABASE x FROM DISK = 'C:\x.bak'
WITH
NORECOVERY
,MOVE 'x' TO 'C:\b\x.mdf'
,MOVE 'x_log' TO 'C:\b\x_log.ldf'

RESTORE LOG x FROM DISK = 'C:\x1.trn' WITH NORECOVERY
RESTORE LOG x FROM DISK = 'C:\x2.trn' WITH RECOVERY

--Data there?
SELECT * FROM x.dbo.t
--See? That wasn't so difficult.

Note how I even named the dummy database differently on instance b, with different physical file names and different file sizes (all compared to what we had on instance a). Typically, you would use the same database name and the same file names, but I want to show that we don't really have to know a whole lot about the damaged database in order to produce a log backup from its ldf file!

Case closed.

 

Are log records removed from ldf file for rollbacks?

Seems like a simple enough question, right? This question (but more targeted, read on) was raised in an MCT forum. While the discussion was ongoing and I tried to come up with answers, I realized that this question is really several questions. First, what is a rollback? I can see three different types of rollbacks (there might be more, of course):

  1. Regular rollback, as in ROLLBACK TRAN (or lost/terminated connection)
  2. Rollback done by restore recovery. I.e., the end time of the backup included some transaction which wasn't committed, and you restore using RECOVERY, so SQL Server needs to roll back this transaction (UNDO).
  3. Rollback done by crash (restart) recovery. I.e., SQL Server was shut down with some open transaction.

I’m going to try to show whether log records are removed or still present for these three types of rollback situations. I will use the fn_dblog function. This isn’t documented or supported, but search the internet and you will find how to use it. The result isn’t documented either, of course, so we have to guess a bit what the various values mean…

The TSQL script has some common parts (the same part executed repeatedly, once for each test):

  1. Create a database
  2. Make sure it is in full recovery
  3. Do a database backup
  4. Create a table (myTable)
  5. Insert a row into myTable. This last operation generates 5 log records for the myTable table: one for the PFS page, two for IAM pages, one format page for the heap and the last one is a LOP_INSERT_ROWS.
  6. Start a transaction
  7. Insert one more row into myTable. We now have one more log record for myTable (LOP_INSERT_ROWS). Looking at the transaction id for this last insert, we see two log records (one LOP_BEGIN_XACT and the LOP_INSERT_ROWS). Note that this transaction is now open!

Here is the above mentioned first part of the script:

USE master
IF DB_ID('x') IS NOT NULL DROP DATABASE x
GO
CREATE DATABASE x
GO
ALTER DATABASE x SET RECOVERY FULL
BACKUP DATABASE x TO DISK = 'C:\x.bak' WITH INIT

USE x
CREATE TABLE myTable(c1 INT IDENTITY)
INSERT INTO myTable DEFAULT VALUES

SELECT * FROM fn_dblog(NULL, NULL) WHERE AllocUnitName LIKE '%myTable%'
--5 rows

BEGIN TRAN
INSERT INTO myTable DEFAULT VALUES

SELECT * FROM fn_dblog(NULL, NULL) WHERE AllocUnitName LIKE '%myTable%'
--6 rows

SELECT * FROM fn_dblog(NULL, NULL)
WHERE [Transaction ID] = (SELECT TOP(1) [Transaction ID]
                          FROM fn_dblog(NULL, NULL)
                          WHERE AllocUnitName LIKE '%myTable%'
                          ORDER BY [Current LSN] DESC)
--2 rows

Now, on to the different cases: 

1. Regular rollback, as in ROLLBACK TRAN.
It seems reasonable to me that SQL Server will just add some "rollback transaction" log records here. So, let's try that (continuing on the first part above)… We now have 7 log records for myTable, with an added LOP_DELETE_ROWS which undoes the previous insert. And for our transaction ID, we have 4 rows, the two added rows being a LOP_DELETE_ROWS (compensation log record) and a LOP_ABORT_XACT.

--Regular ROLLBACK
ROLLBACK TRAN

SELECT * FROM fn_dblog(NULL, NULL) WHERE AllocUnitName LIKE '%myTable%'
--7 rows

SELECT * FROM fn_dblog(NULL, NULL)
WHERE [Transaction ID] = (SELECT TOP(1) [Transaction ID]
                          FROM fn_dblog(NULL, NULL)
                          WHERE AllocUnitName LIKE '%myTable%'
                          ORDER BY [Current LSN] DESC)
--4 rows

2. Restore recovery
Now, what does this really mean? Restoring a transaction log backup is a good example. SQL Server reads log records from the transaction log backup file and writes them into the LDF file. This is what we call the "data copy" phase. Then SQL Server performs REDO (a.k.a. roll forward). And finally, SQL Server performs UNDO (roll back), assuming we don't do the restore using the NORECOVERY option. Restoring from a database backup isn't any different, except that the log records are of course read from the database backup file.

Here it seems likely that SQL Server will wipe more or less everything from the LDF file as soon as the database is restored and brought online. Why? The log in this newly restored database can't serve as a starting point for a restore operation for this database; you first need a database backup. So, no use hanging onto the log records either! Let's see if we can verify that:

--Restore recovery

--Do this from a different connection
BACKUP LOG x TO DISK = 'C:\x.bak'
--We now have an open transaction in the database!

--Perform RESTORE
ROLLBACK
USE master
RESTORE DATABASE x FROM DISK = 'C:\x.bak' WITH REPLACE, NORECOVERY
RESTORE LOG x FROM DISK = 'C:\x.bak' WITH FILE = 2, RECOVERY

--Rollback was done, and database brought online.
--What log records do we have?

USE x
SELECT * FROM fn_dblog(NULL, NULL) WHERE AllocUnitName LIKE '%myTable%'
--0 rows

--Use the transaction ID from earlier SELECT
SELECT * FROM fn_dblog(NULL, NULL)
WHERE [Transaction ID] = '0000:00000203'
--0 rows

So, SQL Server will remove user-defined stuff from LDF file after restore recovery was performed. Makes sense.

3. Crash recovery (a.k.a. restart or startup recovery)
I couldn't really guess here. So, let's give it a spin immediately and see:

--Crash recovery

--Do below from a separate connection
SHUTDOWN WITH NOWAIT

--Start SQL Server again and examine the log records:

SELECT * FROM fn_dblog(NULL, NULL) WHERE AllocUnitName LIKE '%myTable%'
--5 rows

--Use the transaction ID from earlier SELECT
SELECT * FROM fn_dblog(NULL, NULL)
WHERE [Transaction ID] = '0000:00000204'
--0 rows

Hmm, so SQL Server removes the log records after the rollback was performed. Makes sense.

But this got me thinking some more. How can this be done… physically? SQL Server would just "reverse" the head of the log a bit. But what if we have later transactions for other connections, which have been committed? SQL Server can't ignore those, of course. These need to be kept in the LDF file for subsequent log backups. On the other hand, I doubt that SQL Server will somehow physically delete things "in the middle" of an ldf file. Time for yet another test:

--Crash recovery, part two

--Do below from a separate connection
--First something which creates more recent log records
CREATE TABLE y(c1 INT)
INSERT INTO y(c1) VALUES(1)
SHUTDOWN WITH NOWAIT

--Start SQL Server again and examine the log records:

SELECT * FROM fn_dblog(NULL, NULL) WHERE AllocUnitName LIKE '%myTable%'
--7 rows

--Use the transaction ID from earlier SELECT
SELECT * FROM fn_dblog(NULL, NULL)
WHERE [Transaction ID] = (SELECT TOP(1) [Transaction ID]
                          FROM fn_dblog(NULL, NULL)
                          WHERE AllocUnitName LIKE '%myTable%'
                          ORDER BY [Current LSN] DESC)
--4 rows

Now the log records for our rolled back transaction are still there! So, just as when we did a regular rollback, SQL Server inserted a LOP_DELETE_ROWS to reflect the undo of the INSERT, and then a LOP_ABORT_XACT.

Conclusion
Isn’t it beautiful when it all makes sense? Here are my conclusions, whether log records are kept or removed from transaction log file (LDF) for various types of rollback scenarios:

  • Regular rollback. Log records are not removed. Compensation log records are logged, reflecting undo of the modifications, and then an LOP_ABORT_XACT is logged.
  • Restore recovery. Log records are removed.
  • Crash recovery. It depends. If you have a transaction which is at the very head of the log, then those log records can be removed. If there are other, subsequent committed transactions, then compensation log records are logged, reflecting undo of the modifications, and then a LOP_ABORT_XACT is logged.

Adding a PK online?

I just read in a forum about a user who wants to replicate a table, but the table doesn't have a PK. The table is pretty large, and having the table unavailable while adding the PK is undesirable. The table has a clustered index already, and there are other columns which are known to be unique (presence of unique indexes).

What I wanted to test is whether we can just add the PK constraint using the ONLINE option. The short answer is "yes". We can't turn a unique index into a PK using some meta-data-only operation, unfortunately – that would be the easiest step. But we can add a unique constraint using the ONLINE option – there's even example syntax for this in BOL. We can then remove the pre-existing unique index. Since we are using ONLINE, we need to be on Enterprise or Developer Edition.
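
For the unique-constraint route, the swap could look something like below. Table, column and index names are made up for the illustration; the point is only the WITH (ONLINE = ON) part.

--Add the new unique constraint online (Enterprise or Developer Edition):
ALTER TABLE dbo.SomeTable
  ADD CONSTRAINT UQ_SomeTable_keycol UNIQUE NONCLUSTERED (keycol) WITH (ONLINE = ON)

--Then drop the pre-existing unique index:
DROP INDEX ix_SomeTable_keycol ON dbo.SomeTable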

I wanted to test this, and below is my test script:

USE tempdb
SET NOCOUNT ON
GO

IF OBJECT_ID('t') IS NOT NULL DROP TABLE t
GO
CREATE TABLE t(c1 INT NOT NULL, c2 CHAR(100))
CREATE UNIQUE CLUSTERED INDEX x ON t(c1)

INSERT INTO t
SELECT TOP(5000000) ROW_NUMBER() OVER(ORDER BY a.id), 'x'
FROM syscolumns AS a
CROSS JOIN syscolumns AS b
CROSS JOIN syscolumns AS c
GO

-----------------------------------------------------
--Now try to add a PK "online"...:
-----------------------------------------------------

--Add a nullable identity?
ALTER TABLE t ADD c3 INT IDENTITY NULL
--Msg 8147, Level 16, State 1, Line 1
--Could not create IDENTITY attribute on nullable column 'c3', table 't'.
GO

--Add a PK using ONLINE?
--Prepare a new connection with following INSERTs
--to verify it can run simultaneously:
--INSERT INTO t(c1, c2) VALUES(5000001, 't')
--INSERT INTO t(c1, c2) VALUES(5000001, 't')
--INSERT INTO t(c1, c2) VALUES(5000002, 't')
--GO
--INSERT INTO t(c1, c2) VALUES(5000003, 't')

--Above prepared? OK, execute below and jump to
--other window to verify it is online
ALTER TABLE t ADD CONSTRAINT PK_t PRIMARY KEY NONCLUSTERED (c1) WITH(ONLINE = ON)
GO

--Verify the indexes using my own sp_indexinfo
EXEC sp_indexinfo 't'

Is there an overhead to RPC events?

I recently read a discussion whether RPC events add overhead compared to just submitting text. I got curious and did some testing, which I want to share.

Background, SQL events
Using most APIs, if you just submit a query to SQL Server, you will get what we call an SQL event (for instance, in Profiler, an SQL:BatchCompleted). The client app submits some text, SQL Server parses the text, and either creates an execution plan or re-uses a cached plan (if one exists, based on a checksum of the query text submitted). Now, this is very straightforward. But there are also problems, mainly the risk of SQL injection and bad query plan re-use. Your query probably has a WHERE clause, and if you just concatenate the values you will have in the WHERE clause, then you will submit different text each time. Different text means different execution plans – bad plan re-use. Also, if you have some malicious user or are under hacker attack, then somebody can inject some query or part of a query inside what you think would be only the search value (this is a classic).

Stored procedures as RPC events
Now, consider if we are using stored procedures. We might want to pass the proc name and the values for each parameter, where we pass the parameter values as binary data. Do this right (for instance in ADO, you configure your command object's CommandType property as CommandType.StoredProcedure instead of CommandType.Text, which is the default), and the batch is shown in Profiler as an RPC event, for instance RPC:Completed. So, consider executing the same procedure either by constructing a string such as "EXEC myProc @p1 = '20080223', @p2 = 'Stockholm'" as text (SQL event), or passing it as a stored procedure using parameter objects (RPC event). We expect the RPC event to be more efficient, right? We'll get to that in a moment.

Queries as RPC events
Now, we can also submit queries, without using stored procedures, so that they are passed as RPC events. We do this by adding at least one parameter to the command object. This causes ADO.NET to use an RPC event and execute sp_executesql, parameterizing according to the parameter objects for the command object. This makes your code safe from SQL injection and allows for better plan re-use. I.e., in most cases a very good thing – second best to using stored procedures!
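
To give a feel for what such an RPC event amounts to on the server side, this is roughly the call ADO.NET produces for a parameterized version of the query used in the test code further down (the parameter name @theID is just an assumption):

EXEC sp_executesql
     N'SELECT status FROM Sales.SalesOrderHeader WHERE SalesOrderID = @theID'
    ,N'@theID INT'
    ,@theID = 43664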

Non-parameterized queries as RPC events
Now, consider if we don't want to parameterize our queries (we don't have a WHERE clause, we just don't want to, we use some programming layer which doesn't allow for it, or we are just plain curious – like now). Can we still get RPC events? Yes, by adding a dummy parameter to the command object. We don't have to use this parameter in the query – its mere existence causes an RPC event instead of an SQL event. This allows us to compare SQL events to RPC events and look at only the overhead of the sp_executesql part.

Overhead for using sp_executesql
Below is some VB.NET code which does just that. For the first two cases, we use exactly the same query for every execution and compare timings for doing them as SQL events or RPC events. For the second two, we change the value we search for (but do not parameterize the search condition), which causes a different execution plan for each execution. I verified, by counting execution plans, that the first two use the same execution plan and the second two generate a new plan for each iteration in the loop. Here's the VB.NET code:

Imports System.Data
Imports System.Data.SqlClient
Module Module1

Sub Main()

Dim i As Int32, iterations As Int32 = 20000
Dim tickcount As Int64
Dim ProductID As Int32 = 43664
Dim ProdStatus As Int16
Dim sql As String

sql = "SELECT status FROM Sales.SalesOrderHeader WHERE SalesOrderID = "
Dim cn As New SqlConnection("SERVER=localhost\a;DATABASE=AdventureWorks2008;INTEGRATED SECURITY=TRUE")
cn.Open()

Dim utilCmd As New SqlCommand("DBCC FREEPROCCACHE", cn)
utilCmd.ExecuteNonQuery()

'Definitions
'**********************************************************************************************
'Same query every execution, SQL:BatchCompleted
Dim cmdAllText As New SqlCommand(sql & "43664", cn)
cmdAllText.CommandType = CommandType.Text

'Same query every execution, RPC:Completed
Dim cmdParmText As New SqlCommand(sql & "43664", cn)
cmdParmText.CommandType = CommandType.Text
cmdParmText.Parameters.Add("@dummy", SqlDbType.Int)
cmdParmText.Parameters("@dummy").Value = -9999

'Different query every execution, SQL:BatchCompleted
Dim cmdAllText2 As New SqlCommand()
cmdAllText2.Connection = cn
cmdAllText2.CommandType = CommandType.Text

'Different query every execution, RPC:Completed
Dim cmdParmText2 As New SqlCommand()
cmdParmText2.Connection = cn
cmdParmText2.CommandType = CommandType.Text
cmdParmText2.Parameters.Add("@dummy", SqlDbType.Int)
cmdParmText2.Parameters("@dummy").Value = -9999

'Execution
'**********************************************************************************************
Console.WriteLine("Same query every execution, SQL:BatchCompleted")
tickcount = Environment.TickCount
For i = 1 To iterations
    ProdStatus = cmdAllText.ExecuteScalar()
Next
tickcount = Environment.TickCount - tickcount
Call PrintOuput(iterations, tickcount)

Console.WriteLine("Same query every execution, RPC:Completed")
tickcount = Environment.TickCount
For i = 1 To iterations
    ProdStatus = cmdParmText.ExecuteScalar()
Next
tickcount = Environment.TickCount - tickcount
Call PrintOuput(iterations, tickcount)

Console.WriteLine("Different query every execution, SQL:BatchCompleted")
tickcount = Environment.TickCount
For i = 1 To iterations
    cmdAllText2.CommandText = sql & i.ToString()
    ProdStatus = cmdAllText2.ExecuteScalar()
Next
tickcount = Environment.TickCount - tickcount
Call PrintOuput(iterations, tickcount)

Console.WriteLine("Different query every execution, RPC:Completed")
tickcount = Environment.TickCount
For i = 1 To iterations
    cmdParmText2.CommandText = sql & i.ToString()
    ProdStatus = cmdParmText2.ExecuteScalar()
Next
tickcount = Environment.TickCount - tickcount
Call PrintOuput(iterations, tickcount)

Console.ReadLine()

End Sub

Sub PrintOuput(ByVal iterations As Int32, ByVal tickcount As Int64)
    Console.WriteLine("Number of executions: " & iterations.ToString)
    Console.WriteLine("Total exec time (ms): " & tickcount)
    Console.WriteLine("Exec time per query (ms): " & tickcount / iterations)
    Console.WriteLine("")
End Sub

End Module

And here’s a typical execution result from above code:

Same query every execution, SQL:BatchCompleted
Number of executions: 20000
Total exec time (ms): 1357
Exec time per query (ms): 0,06785

Same query every execution, RPC:Completed
Number of executions: 20000
Total exec time (ms): 1513
Exec time per query (ms): 0,07565

Different query every execution, SQL:BatchCompleted
Number of executions: 20000
Total exec time (ms): 2402
Exec time per query (ms): 0,1201

Different query every execution, RPC:Completed
Number of executions: 20000
Total exec time (ms): 14446
Exec time per query (ms): 0,7223

As you can see, there is an overhead to using RPC events and sp_executesql. I.e., if we don't parameterize our command object, then we only pay, with no gain. The overhead is definitely noticeable where we generate a new plan for each execution. An interesting aspect is that if I remove the call to DBCC FREEPROCCACHE (so we still get a new plan for each iteration in the loop, but plans can be re-used from a prior execution of the VB code), then this is much cheaper and closer to the others (about 5.7 seconds instead of 14 seconds). So, we do pay extra for sp_executesql, especially when we generate a new plan.

Is the above a weird example?
Yes, it is, and I want to emphasize that. Typically, you will use sp_executesql when you actually parameterize your queries. That gives better plan re-use (and protects from SQL injection). The above is only meant to show whether sp_executesql has a cost – without considering the benefits.

How about using stored procedures?
Say you are using stored procedures, and wonder how different ways to execute a procedure compare? Let's have a look. We can do it in three ways (probably more, but below are the ones I'm interested in):

  1. Build the proc command and pass it as a string. No parameter objects, no sp_executesql. You are still open to SQL injection. Don't do this; we only show it for comparison reasons. Profiler will show an SQL:BatchCompleted event with TextData something like "EXEC dbo.GetOrderStatusRes @SalesOrderID = 43664".
  2. Use parameters for your command object, but forget to define the command object as of type CommandType.StoredProcedure. Profiler will show an RPC:Completed event with TextData something like "exec sp_executesql N'EXEC dbo.GetOrderStatusRes @SalesOrderID = @theID',N'@theID int',@theID=43664".
  3. The right way. Define the command object as of type CommandType.StoredProcedure, and of course add parameter object to the command object. Profiler will show an RPC:Completed event with TextData something like:
    “declare @p2 tinyint
    set @p2=5
    exec dbo.GetOrderStatusOut @SalesOrderID=43664,@Status=@p2 output
    select @p2”
    Note that the Profiler tool adds the declare, set and select parts; they are not part of what is submitted from the client app. The client app actually submits the parameters as binary values. Profiler adds this to be nice to us, so we can copy that text and execute it, read it, etc.

Anyhow, here’s the VB code:

Imports System.Data
Imports System.Data.SqlClient
Module Module1

Sub Main()

Dim i As Int32, iterations As Int32 = 20000
Dim tickcount As Int64
Dim ProductID As Int32 = 43664
Dim ProdStatus As Int16

Dim cn As New SqlConnection("SERVER=localhost\a;DATABASE=AdventureWorks2008;INTEGRATED SECURITY=TRUE")
cn.Open()

'Create the procedures we will execute
Dim utilCmd As New SqlCommand("", cn)
utilCmd.CommandText = "IF OBJECT_ID('dbo.GetOrderStatusRes') IS NOT NULL DROP PROC dbo.GetOrderStatusRes" & vbCrLf
utilCmd.CommandText += "IF OBJECT_ID('dbo.GetOrderStatusOut') IS NOT NULL DROP PROC dbo.GetOrderStatusOut"
utilCmd.ExecuteNonQuery()
utilCmd.CommandText = "CREATE PROC dbo.GetOrderStatusRes @SalesOrderID int AS SELECT status FROM Sales.SalesOrderHeader WHERE SalesOrderID = @SalesOrderID"
utilCmd.ExecuteNonQuery()
utilCmd.CommandText = "CREATE PROC dbo.GetOrderStatusOut @SalesOrderID int, @status tinyint OUT AS SET @status = (SELECT status FROM Sales.SalesOrderHeader WHERE SalesOrderID = @SalesOrderID)"
utilCmd.ExecuteNonQuery()

'Run each once so it is in cache
utilCmd.CommandText = "EXEC dbo.GetOrderStatusRes @SalesOrderID = 43664"
utilCmd.ExecuteScalar()
utilCmd.CommandText = "DECLARE @s tinyint EXEC dbo.GetOrderStatusOut @SalesOrderID = 43664, @status = @s OUT"
utilCmd.ExecuteNonQuery()

utilCmd.CommandText = "DBCC FREEPROCCACHE"

'Definitions
'**********************************************************************************************
'Exec proc as string, SQL:BatchCompleted
Dim cmdAllText As New SqlCommand("EXEC dbo.GetOrderStatusRes @SalesOrderID = " & ProductID.ToString(), cn)
cmdAllText.CommandType = CommandType.Text

'Parameterize, but still as text, RPC:Completed with sp_executesql
Dim cmdParmText As New SqlCommand("EXEC dbo.GetOrderStatusRes @SalesOrderID = @theID", cn)
cmdParmText.CommandType = CommandType.Text
cmdParmText.Parameters.Add("@theID", SqlDbType.Int)
cmdParmText.Parameters("@theID").Value = ProductID

'Parameterize as stored procedure, RPC:Completed
Dim cmdRpc As New SqlCommand("dbo.GetOrderStatusOut", cn)
cmdRpc.CommandType = CommandType.StoredProcedure
Dim prmSalesOrderID As SqlParameter = New SqlParameter("@SalesOrderID", SqlDbType.Int)
prmSalesOrderID.Value = ProductID
cmdRpc.Parameters.Add(prmSalesOrderID)
Dim prmStatus As SqlParameter = New SqlParameter("@Status", SqlDbType.TinyInt)
prmStatus.Direction = ParameterDirection.Output
cmdRpc.Parameters.Add(prmStatus)

'Execution
'**********************************************************************************************
utilCmd.ExecuteNonQuery()
Console.ReadLine()

Console.WriteLine("Exec proc as string, SQL:BatchCompleted")
tickcount = Environment.TickCount
For i = 1 To iterations
    ProdStatus = cmdAllText.ExecuteScalar()
Next
tickcount = Environment.TickCount - tickcount
Call PrintOuput(iterations, tickcount)
Console.ReadLine()

utilCmd.ExecuteNonQuery()
Console.WriteLine("Parameterize, but still as text, RPC:Completed with sp_executesql")
tickcount = Environment.TickCount
For i = 1 To iterations
    ProdStatus = cmdParmText.ExecuteScalar()
Next
tickcount = Environment.TickCount - tickcount
Call PrintOuput(iterations, tickcount)
Console.ReadLine()

utilCmd.ExecuteNonQuery()
Console.WriteLine("Parameterize as stored procedure, RPC:Completed")
tickcount = Environment.TickCount
For i = 1 To iterations
    cmdRpc.ExecuteNonQuery()
Next
tickcount = Environment.TickCount - tickcount
Call PrintOuput(iterations, tickcount)

Console.ReadLine()

End Sub

Sub PrintOuput(ByVal iterations As Int32, ByVal tickcount As Int64)
    Console.WriteLine("Number of executions: " & iterations.ToString)
    Console.WriteLine("Total exec time (ms): " & tickcount)
    Console.WriteLine("Exec time per query (ms): " & tickcount / iterations)
    Console.WriteLine("")
End Sub

End Module

And here’s the result from a typical execution:

Exec proc as string, SQL:BatchCompleted
Number of executions: 20000
Total exec time (ms): 1810
Exec time per query (ms): 0,0905

Parameterize, but still as text, RPC:Completed with sp_executesql
Number of executions: 20000
Total exec time (ms): 1700
Exec time per query (ms): 0,085

Parameterize as stored procedure, RPC:Completed
Number of executions: 20000
Total exec time (ms): 1388
Exec time per query (ms): 0,0694

We can see that doing it the "proper" way is cheapest, but there's not a big difference between the three. The first alternative is not good, though, since we aren't protected from SQL injection. And since you will then be using parameter objects anyhow, just go ahead and define the CommandType as stored procedure while you're at it.

Using sa as owner for jobs and databases

This blog post is not about avoiding logging in using the sa login. Hopefully we all know about that and work towards avoiding the practice.

Instead, I want to talk about using sa not to log in (authenticate), but as the owner of jobs and databases. I want to keep these things de-individualized – so we avoid situations like person A leaving the company and us not daring to remove that login/Windows account. We can of course create some SQL login or Windows login especially for this purpose and use that. But sa is already there. Another advantage is that sa always has the same sid number (which makes moving databases across instances a bit easier).

The way Agent works is that if the job owner is a member of the sysadmin server role, then it won't attempt any impersonation for the job steps. I.e., Agent won't use SETUSER (2000 and earlier) or EXECUTE AS USER = (2005 and later). This means that Agent will never actually authenticate using sa (Agent always authenticates using Windows authentication – and then verifies that the owner is sysadmin). I.e., we can change the password for sa, disable sa, or even run in Windows Only mode.

And, just to be obvious: if the job should be owned by some individual in order to operate in the proper security context, then we should use that individual as owner and not sa!
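
For reference, here is a minimal sketch of how to set sa as owner; the database and job names are made up:

--Database owner:
ALTER AUTHORIZATION ON DATABASE::SomeDatabase TO sa

--Job owner:
EXEC msdb.dbo.sp_update_job
     @job_name = N'SomeJob'
    ,@owner_login_name = N'sa'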

How do you handle job and database ownership? Do you have situations where the owner does matter, details?

Be careful with constraints calling UDFs

You might just not get what you think. I would be surprised if this hasn't been blogged about already, but if so, it is worth repeating. Here's the deal (example from a forum, slightly re-worked):

I want the values in one column to be unique, as long as the value in another column is 1. Can I use a UDF for that?

On the surface, yes. You can write a UDF to which you pass the value which should be conditionally unique, and in that UDF check how many rows have this value AND othercolumn = 1. If more than 1 row, the function returns 0, else 1 (or something else to signal "OK" or "not OK"). Now, you can call this function in a CHECK constraint, something like CHECK(myFunction(uniqueCol) = 1). This will on the surface do its job, as long as you INSERT into the table. But if you update a row and only set otherColumn for some row from 0 to 1, then the check constraint will not be checked. The optimizer is smart enough to understand that the update doesn't change anything that we refer to in our CHECK constraint, so why bother checking the constraint? The end result here is that the constraint doesn't do what we want it to do. Use a trigger instead (or some other method – see the sketch after the repro). Here's a repro:

USE tempdb
GO
IF OBJECT_ID('t') IS NOT NULL DROP TABLE t
IF OBJECT_ID('t_uq') IS NOT NULL DROP FUNCTION t_uq
GO

CREATE TABLE t(c0 INT, c1 NVARCHAR(50), c2 bit)
GO

CREATE FUNCTION t_uq(@c1 NVARCHAR(50))
RETURNS bit
AS
BEGIN
DECLARE @ret bit
IF (SELECT COUNT(*) FROM t WHERE c1 = @c1 AND c2 = 1) > 1
  SET @ret = 0
ELSE
  SET @ret = 1
RETURN @ret
END
GO

ALTER TABLE t ADD CONSTRAINT t_c CHECK(dbo.t_uq(c1) = 1)

INSERT INTO t(c0, c1, c2) VALUES(1, 'a', 0) --OK
INSERT INTO t(c0, c1, c2) VALUES(2, 'a', 0) --OK
INSERT INTO t(c0, c1, c2) VALUES(3, 'b', 1) --OK
INSERT INTO t(c0, c1, c2) VALUES(4, 'b', 1) --Fails

--So far so good, but watch now:

UPDATE t SET c2 = 1 WHERE c0 = 2
--No error, the constraint doesn't do its job!

--We have invalid data:
SELECT * FROM t
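
As one example of "some other method": on SQL Server 2008 and later, a filtered unique index enforces the rule for both INSERT and UPDATE. A minimal sketch (note that creating it on the table above, after the faulty UPDATE, will fail on the existing duplicates – which is rather the point):

CREATE UNIQUE INDEX uq_t_c1_where_c2_1 ON t(c1) WHERE c2 = 1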

Do you perform log backup for the model database?

Are you stupid, you might think… But stop and think for a while. Model is no different from other databases, and by default it is in full recovery model. So, as soon as you do your first database backup of it (you do back up your system databases, right?), the log for model will start filling up and autogrow. "But hey, I don't do any modifications in model!", you probably say now. Fair, but other things happen in each database from time to time. The bottom line is that the ldf file for model will start growing after a while. Perhaps not hugely, but I find it "un-neat" to have a model with a 3 MB mdf file and a 20 MB ldf file.

Personally, I prefer to have model in simple recovery, since I feel that is a better default recovery model. An alternative is to regularly set model to simple recovery and back to full recovery (scheduled job).
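
If you go for the first option, this is all it takes (new databases inherit their recovery model from model):

ALTER DATABASE model SET RECOVERY SIMPLE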