Performance – Page 3 – Tibor Karaszi's SQL Server blog

Do you want improved performance?

Tibor Karaszi/ 2014-02-132018-06-05/ Performance

Can you survive a few lost transactions if your server does a “hard shutdown”? If so, check out SQL Server 2014 and “Delayed Durability”.

A cornerstone in SQL Server’s transaction handling has up until 2014 been “durability” for a committed transaction. Durability is by the way the “D” in the ACID acronym: Atomicity, Consistency, Isolation and Durability.

Durability means that SQL Server has do perform a synchronous write to the LDF file for each transaction. This so that SQL Server can re-construct all committed transactions up until the point of a (potentially hard) shutdown.

In SQL Server 2014, MS has planned for a database setting called “Delayed Durability”. Setting this means that SQL Server can bath writes to the ldf file, meaning a potentially significant improved performance for applications where you have many small transactions.

I did a quick test, using a bench from an earlier blog post of mine (http://sqlblog.com/blogs/tibor_karaszi/archive/2008/08/14/are-inserts-quicker-to-heap-or-clustered-tables.aspx) to test what difference I would see for that workload. Roughly (for 50000 rows, on a PC with single spinning disk HD):

All inserts in one transaction averaged about 0.3 seconds.

One transaction per row with Delayed Durability set to OFF approx 12 seconds.

One transaction per row with delayed durability set to Forced approx 1.2 seconds.

As you can see, for this workload we got about a tenfold performance improvement by letting SQL Server batch the write operations to the ldf file. The question is how much improvement you get for your workload and if you can tolerate to lose some modifications in case of a hard shutdown?

Restricting memory using Resource Governor

Tibor Karaszi/ 2011-04-192018-06-05/ Performance

You might know that Resource Governor (RG) can be used to allow you to affect resource allocation inside SQL Server. RG was introduced with SQL Server 2008 and requires Enterprise Edition or similar/higher. There are two things you can govern using RG: cpu and memory.

For introductory information on RG, see for instance these BOL topics.

A misconception I see from time to time is that the memory setting is to restrict the Buffer Pool, BP, (mainly used for caching pages from disk). This seems difficult since a page in BP has a cost, but isn’t owned by whoever brought it into cache or who referenced it last time. So, it seems difficult with current implementation of cache handling and ageing to govern the BP memory for RG. What RG does govern is “working memory” for a query. Think memory used for sorting, locks, hashing and such.

We just had such a discussion in the MSDN forums, and I decided to do a test to see if we could show that RG do not govern the BP. Brief outline of the script

Restrict the BP to 300 MB
Create two databases (a and b)
Create a table with size of 255MB in each database
Create two logins with name a and b, which will be used to execute queries
The logins end up with two different resource pools (aPool and bPool), where aPool has max mem 80% and bPool has max mem 20%
A loop which generates queries that are executed using xp_cmdshell and SQLCMD
The loop first scan over the first approx 9% of data, first login a in database a, then in login b in database b
Next iteration is then done over the following 9 % of data
After 11 iteration, we’ve done most data
Finally check how much memory in the BP each database is using

If RG were to govern the BP, we expect to see database a using lots of more pages than database b. That not what we are seeing. You will probably see that database b is using slightly more memory than a. That is because you done exactly the same queries over exactly the same data for each database and the most recent database you queried was database b (pushing out some of the pages with same cost that database a had in cache). You might want to execute the loop a few times to possibly see things even out a bit.

Seeing is believing, they say, and to me this show that Resource Governor do not govern the Buffer Pool.

If you believe there are faults in how the script does things, feel free to discuss. If you have some alternative script you want us to try, I suggest you open a thread in the MSDN forum (for example) and there post a script which is pretty immediately executable for us(possibly with modifications of file path and server name). Unless the changes you propose are so minor that can be immediately explained in your text.

Use script at own risk.

USE MASTER SET NOCOUNT ON


--Configure Buffer Pool for max 300 MB memory

EXEC sp_configure 'max server memory', 300

RECONFIGURE
--Drop and create databases a and b

IF DB_ID('a') IS NOT NULL DROP DATABASE a

IF DB_ID('b') IS NOT NULL DROP DATABASE b

GO

CREATE DATABASE a

ON  PRIMARY

(NAME = N'a', FILENAME = N'R:\a.mdf' , SIZE = 250MB , FILEGROWTH = 5MB )

LOG ON

( NAME = N'a_log', FILENAME = N'R:\a_log.ldf' , SIZE = 30MB , FILEGROWTH = 5MB)
CREATE DATABASE b

ON  PRIMARY

(NAME = N'b', FILENAME = N'R:\b.mdf' , SIZE = 250MB , FILEGROWTH = 5MB )

LOG ON

( NAME = N'b_log', FILENAME = N'R:\b_log.ldf' , SIZE = 30MB , FILEGROWTH = 5MB)

GO

ALTER DATABASE a SET RECOVERY SIMPLE

ALTER DATABASE b SET RECOVERY SIMPLE
--Create tables and populate with data, approx 250 MB each

CREATE TABLE a..t(c1 INT IDENTITY PRIMARY KEY CLUSTERED, c2 INT, c3 CHAR(100))

GO

INSERT INTO a..t (c2, c3)

SELECT TOP (90000) ROW_NUMBER() OVER(ORDER BY a.OBJECT_ID), 'hello'

FROM sys.columns AS a CROSS JOIN sys.columns AS b

GO 25

CREATE TABLE b..t(c1 INT IDENTITY PRIMARY KEY CLUSTERED, c2 INT, c3 CHAR(100))

GO

INSERT INTO b..t (c2, c3)

SELECT TOP (90000) ROW_NUMBER() OVER(ORDER BY a.OBJECT_ID), 'hello'

FROM sys.columns AS a CROSS JOIN sys.columns AS b

GO 25
--Logins

IF SUSER_ID('a') IS NOT NULL DROP LOGIN a

IF SUSER_ID('b') IS NOT NULL DROP LOGIN b

GO

CREATE LOGIN a WITH PASSWORD = 'pwd'

CREATE LOGIN b WITH PASSWORD = 'pwd'
--Users

USE a

IF DATABASE_PRINCIPAL_ID('a') IS NOT NULL DROP USER a

GO

CREATE USER a

GO

GRANT SELECT ON t TO a

GO

USE b

IF DATABASE_PRINCIPAL_ID('b') IS NOT NULL DROP USER b

GO

CREATE USER b

GO

GRANT SELECT ON t TO b

GO
--Configure RG:

USE MASTER

ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = NULL);

ALTER RESOURCE GOVERNOR RECONFIGURE;

IF EXISTS(SELECT * FROM sys.resource_governor_workload_groups WHERE name = 'aGroup')

DROP WORKLOAD GROUP aGroup

IF EXISTS(SELECT * FROM sys.resource_governor_workload_groups WHERE name = 'bGroup')

DROP WORKLOAD GROUP bGroup

IF EXISTS(SELECT * FROM sys.resource_governor_resource_pools WHERE name = 'aPool')

DROP RESOURCE POOL aPool

IF EXISTS(SELECT * FROM sys.resource_governor_resource_pools WHERE name = 'bPool')

DROP RESOURCE POOL bPool
CREATE RESOURCE POOL aPool WITH (MAX_MEMORY_PERCENT = 80);

CREATE RESOURCE POOL bPool WITH (MAX_MEMORY_PERCENT = 20);
CREATE WORKLOAD GROUP aGroup USING aPool;

CREATE WORKLOAD GROUP bGroup USING bPool;
IF OBJECT_ID ('dbo.MyClassifier') IS NOT NULL DROP FUNCTION dbo.MyClassifier;

GO
CREATE FUNCTION dbo.MyClassifier ()

RETURNS SYSNAME WITH SCHEMABINDING

AS

BEGIN

DECLARE @GroupName SYSNAME;

IF SUSER_SNAME() = 'a'

SET @GroupName = 'aGroup';

ELSE IF SUSER_SNAME() = 'b'

SET @GroupName = 'bGroup';

ELSE SET @GroupName = 'Default';

RETURN @GroupName;

END;

GO
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.MyClassifier);

GO

ALTER RESOURCE GOVERNOR RECONFIGURE;

GO
--Allow xp_cmdshell

EXEC sp_configure 'xp_cmdshell', 1

RECONFIGURE
--Execute queries as A and B to read bunch of data

--We have 2,250,000 in the tables

--We will construct this command: SELECT COUNT(c2) FROM a..t WHERE c1 BETWEEN 1 AND 200000

--The start and end values will differ, with ranges in 200,000 increments. 11 loops will add up to 2,000,000.

DECLARE @v INT = 1, @sql VARCHAR(2000)

WHILE @v <= 2200000

BEGIN

SET @sql = 'SQLCMD /Slocalhost\a /Ua /Ppwd /Q"SELECT COUNT(c2) FROM a..t WHERE c1 BETWEEN ' + CAST(@v AS VARCHAR(20)) + ' AND ' + CAST(@v + 199999 AS VARCHAR(20))+ '"'

EXEC xp_cmdshell @sql, no_output

SET @sql = 'SQLCMD /Slocalhost\a /Ub /Ppwd /Q"SELECT COUNT(c2) FROM b..t WHERE c1 BETWEEN ' + CAST(@v AS VARCHAR(20)) + ' AND ' + CAST(@v + 199999 AS VARCHAR(20))+ '"' 

EXEC xp_cmdshell @sql, no_output 

SET @v = @v + 200000

END
--Check how many pages from each db in BP:

SELECT  

DB_NAME(b.database_id) AS database_name

,(COUNT(*) * 8192) / (1024 * 1024)) AS buffer_count_MB

FROM  sys.dm_os_buffer_descriptors AS b 

GROUP BY  b.database_id

ORDER BY database_name
EXEC sp_configure 'xp_cmdshell', 0

RECONFIGURE
/*--CleanupDROP DATABASE aDROP DATABASE bALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = NULL);ALTER RESOURCE GOVERNOR RECONFIGURE;DROP WORKLOAD GROUP aGroupDROP WORKLOAD GROUP bGroupDROP RESOURCE POOL aPoolDROP RESOURCE POOL bPool*/

How selective do we need to be for an index to be used?

Tibor Karaszi/ 2010-04-012022-02-14/ Indexes, Performance

You know the answer already: It depends. But I often see some percentage value quoted and the point of this post is to show that there is no such percentage value.

To get the most out of this blog post, you should understand the basic structure for an index, i.e. how the b+ tree look like. You should also understand the difference between a clustered and a non-clustered index. In essence, you should be able to visualize these structures and searches through them as you read the text. If you find that difficult, draw a few versions on a piece of paper and “navigate” through them by tracing through them with a pen or your finger. After a while, you will do this in your mind. For instance, check out the sections under this.

I’m not saying that we shouldn’t consider selectivity when designing indexes – of course we should! I’m not saying that one shouldn’t have some vague “feeling” about how much data to be return when making such decisions. What I will prove is that there is in reality no set percentage that the optimizer uses. The comment we usually see is something like:

“If we return more than 5% of the rows, then an index will not be used.”

Where did that 5% number came from? I can assure you that this is not some hard-wired number in the optimizer (except for an edge-case, see below). The optimizer aims at running the query with as low cost as possible. Considering the data access part (think WHERE clause and the condition), this is pretty much about reading as few pages as possible (few page-accesses).

Just to cut down a bit on the thread that might follow these types of blogs (“Hey, when I do this, your observations doesn’t match, your blog post is incorrect!”), let us first consider some special cases:

Clustered index
The clustered index *is* the data. If the search condition (SARG) is SEEKable, then SQL Server will obviously seek through a clustered index instead of scan it. Anything else would be stupid. There can be *other* search conditions that are more efficient, but we are considering one search condition at-a-time.

Non-clustered index that covers the query
This is pretty much the same argument as for above. Since all data is in the index (“covers the query”), not seeking it would be stupid. Again, there can be cheaper alternatives for any of the other search conditions, but we are considering one condition at-a-time.

The value is not known to the optimizer
This is what happens when you have a TSQL variable you compare against. Something like “colname = @v”. The optimizer has no knowledge of the contents of this variable. Either it uses density (where applicable, like “=”), as stored in the statistics information of the index. Where not applicable (like “>”, “<“, “BETWEEN” etc), then the optimizer actually do use some hard-wired percentage value. This value can change between versions so give it a spin of you want to know what value you have for your version/build number. Note that a variable is not the same thing as a parameter. SQL Server sniffs parameters (parameter sniffing). Read this for elaboration: http://msdn.microsoft.com/en-us/library/ee343986.aspx.

The search expression is not seek-able
I hope you know this already, but just to point it out. In most cases, having some calculation at the column side will void the ability to seek through the index. This should ideally be known to all T-SQL developers: Never do calculations at the column side! So, things to avoid are like “colname * 23 > 45” or “SOMEFUNCTION(colname) = 44”.

Hopefully by now we all understand that there are always special cases and exceptions. The more of Kalen’s books you have read, the more you understand this. What we are discussing here is the typical situation. OK? Fine. So, “Why is there no percentage value that the optimizer uses?”, you ask. Because the value will differ. In short, SQL Server wants to read as few pages as possible. In the most simple example, the alternative to an index seek is a table scan. So we will use this as basis for your discussion. There can be other alternatives to the table scan (using some other index for some other condition), but that doesn’t change the principal “it depends” concept.

In essence, it is all about the alternative. As I said, our example wil use a table scan as alternative. A table scan (or clustered index scan if it is a clustered table) means that SQL Server will look at every page and see what rows satisfies the search condition on each page.

My example has two different tables, both with 100,000 rows. These tables both have an integer column with consecutive increasing unique values, which also has a non-clustered index. I will see how selective I need to be when searching on this column in order for an index search to be done, compared to a table scan. I.e, find this percentage cut-off value.

The fewrows table only fit one row per data page. This means 100,000 data pages. My tests show that the cutoff for fewrows is about 31,000 rows. I.e., we need to be more selective than 31% for the index to be used.

The manyrows table fit 384 rows per page. This means 260 pages. My tests show that the cutoff for fewrows is about 155 rows. I.e., we need to be more selective than 0.16% for the index to be used.

You might end up with different exact numbers, depending on what you have in the statistics, the build number of your SQL Server etc. But what you will see that a similar pattern. A huge difference between the two.

It is all about the alternative
If I look at my indexes using sys.dm_db_index_physical_stats, I will see that the non-clustered index on the int column for the two tables are exactly the same (same number of pages in the index, etc). So, two indexes with the same characteristics have very different cut-off values. How can that be? It is because the alternative differs. The alternative for this example is a table scan. For the bigrows table, the alternative means reading 100,000 pages. But for the smallrows table, the alternative means reading only 260 pages. There can of course be other alternatives, like using some other index for some other search condition. This is, in the end, why we don’t have a set percentage value: it is all about the alternative!

Conclusion
The typical cases will of course fall somewhere between my more extreme examples. But my examples show that there is no set percentage value used by the optimizer. I showed that for my test, the percentage value can be as low as 0.15% or as high as 31%. What matter is the alternative!

T-SQL

USE tempdb GO

IF OBJECT_ID('manyrows') IS NOT NULL DROP TABLE manyrows IF OBJECT_ID('fewrows') IS NOT NULL DROP TABLE fewrows GO


CREATE TABLE manyrows(c1 INT IDENTITY PRIMARY KEY, c2 INT, c3 INT)

CREATE TABLE fewrows(c1 INT IDENTITY PRIMARY KEY, c2 INT, c3 CHAR(4500))

GO
INSERT INTO manyrows

SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.OBJECT_ID) AS c2, 5 AS c3

FROM sys.columns AS a, sys.columns AS b
INSERT INTO fewrows

SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.OBJECT_ID) AS c2, 'hi' AS c3

FROM sys.columns AS a, sys.columns AS b
CREATE INDEX x ON manyrows (c2)

CREATE INDEX x ON fewrows (c2)
--Number of pages etc:

EXEC sp_indexinfo 'manyrows'

-- Data: 265 pages (2.07 MB)

-- Index x: 187 pages (1.46 MB)
EXEC sp_indexinfo 'fewrows'

-- Data: 100385 pages (784 MB)

-- Index x: 187 pages (1.46 MB)
SELECT OBJECT_NAME(OBJECT_ID), *, OBJECT_NAME(OBJECT_ID)

FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED')
--Run below with showplan:

SELECT * FROM manyrows

WHERE c2 BETWEEN 1 AND 155

--155 = ix search, 156 = ts

--Cut-off is 0.16%
SELECT * FROM fewrows

WHERE c2 BETWEEN 1 AND 31000

--31000 = ix search, 32000 = ts

--Cut-off is 31%

Is there an overhead to RPC events?

Tibor Karaszi/ 2010-01-112018-06-05/ Performance, Transact-SQL

I recently read a discussion whether RPC events add overhead compared to just submitting text. I got curious and did some testing, which I want to share.

Background, SQL events
Using most APIs, if you just submit a query to SQL Server, you will get what we call an SQL event (for instance in Profiler an SQL:BatchCompleted). The client app submits some text, SQL Server parses the text, and either creates an execution pland or re-uses a cached plan (if such exists, based on checksum of the query text submitted). Now, this is very straight forward. But there’s also problems, mainly risk for SQL Injection and bad query plan re-use. Your query probably has a WHERE clause, and if you just concatenate the values you will have in the WHERE clause, then you will submit different text each time. Different text means different execution plans – bad plan re-use. Also, if you have some malicious user or are under hacker-attack, then somebody can inject some query or part of query inside what you think would be only the search value (this is a classic).

Stored procedures as RPC events
Now, consider if we are using stored procedures. We might want to pass the proc name, and the values for each parameter, where we pass the parameter values as binary data. Do this right (for instance in ADO, you configure your command object’s CommandType property as being CommandType.StoredProcedure (instead of CommandType.Text which is default). This causes the batch be shown in Profiler as an RPC event, for instance RPC:Completed. So, consider executing the same procedure either by constructing a string such as “EXEC myProc @p1 = ‘20080223’, @p2 = ‘Stockholm'” as text (SQL event), or passing it as a stored procedure using parameter objects (RPC event). We expect RPC event to be more efficient, right? We’ll get to that in a moment.

Queries as RPC events
Now, we can also submit queries, without using stored procedures, so they are passed as RPC events. We do this by adding at least one parameter to the command object. This causes ADO.NET to use an RPC event and execute sp_executesql, parameterizing accoring to the parameter object for the command object. This causes your code to be safe from SQL injection and will allow for better plan re-use. I.e., in most cases a very good thing – second best to use stored procedures!

Non-parameterized queries as RPC events
Now, consider if we don’t want to parameterize our queries (we don’t have a WHERE clause, we just don’t want to, we use some programming layer which doesn’t allow for it, or we are just plan curious – like now). Can we still get RPC events? Yes, by adding a dummy parameter to the command object. We don’t have to use this command object in the query – just the existence causes an RPC event instead of SQL event. This allow us to compare SQL events to RPC event and look at only the overhead for the sp_executesql part.

Overhead for using sp_executesql
Below is some VB.NET code which does just that. For the first two, we use exactly the same query multiple executions and compare timing for doing them as SQL events or RPC events. For the second two, we change what produceID we search for (but not parameterizing the search condition) which causes different execution plans for each execution. I verified by counting execution plans that the first two uses the same execution pland and the second two generates a new plan for each iteration in the loop. Here’s the VB.NET code:

Imports System.Data Imports System.Data.SqlClient Module Module1


Sub Main()
Dim i As Int32, iterations As Int32 = 20000

Dim tickcount As Int64

Dim ProductID As Int32 = 43664

Dim ProdStatus As Int16

Dim sql As String
sql = "SELECT status FROM Sales.SalesOrderHeader WHERE SalesOrderID = "

Dim cn As New SqlConnection("SERVER=localhost\a;DATABASE=AdventureWorks2008;INTEGRATED SECURITY=TRUE")

cn.Open()
Dim utilCmd As New SqlCommand("DBCC FREEPROCCACHE", cn)

utilCmd.ExecuteNonQuery()
'Definitions

'**********************************************************************************************

'Same query every execution, SQL:BatchCompleted

Dim cmdAllText As New SqlCommand(sql + "43664", cn)

cmdAllText.CommandType = CommandType.Text
'Same query every execution, RPC:Completed

Dim cmdParmText As New SqlCommand(sql + "43664", cn)

cmdParmText.CommandType = CommandType.Text

cmdParmText.Parameters.Add("@dummy", SqlDbType.Int)

cmdParmText.Parameters("@dummy").Value = -9999
'Different query every execution, SQL:BatchCompleted

Dim cmdAllText2 As New SqlCommand()

cmdAllText2.Connection = cn

cmdAllText2.CommandType = CommandType.Text
'Different query every execution, RPC:Completed

Dim cmdParmText2 As New SqlCommand()

cmdParmText2.Connection = cn

cmdParmText2.CommandType = CommandType.Text

cmdParmText2.Parameters.Add("@dummy", SqlDbType.Int)

cmdParmText2.Parameters("@dummy").Value = -9999
'Execution

'**********************************************************************************************

Console.WriteLine("Same query every execution, SQL:BatchCompleted")

tickcount = Environment.TickCount

For i = 1 To iterations

ProdStatus = cmdAllText.ExecuteScalar()

Next

tickcount = Environment.TickCount - tickcount

Call PrintOuput(iterations, tickcount)
Console.WriteLine("Same query every execution, RPC:Completed")

tickcount = Environment.TickCount

For i = 1 To iterations

ProdStatus = cmdParmText.ExecuteScalar()

Next

tickcount = Environment.TickCount - tickcount

Call PrintOuput(iterations, tickcount)
Console.WriteLine("Different query every execution, SQL:BatchCompleted")

tickcount = Environment.TickCount

For i = 1 To iterations

cmdAllText2.CommandText = sql + i.ToString()

ProdStatus = cmdAllText2.ExecuteScalar()

Next

tickcount = Environment.TickCount - tickcount

Call PrintOuput(iterations, tickcount)
Console.WriteLine("Different query every execution, RPC:Completed")

tickcount = Environment.TickCount

For i = 1 To iterations

cmdParmText2.CommandText = sql + i.ToString()

ProdStatus = cmdParmText2.ExecuteScalar()

Next

tickcount = Environment.TickCount - tickcount

Call PrintOuput(iterations, tickcount)
Console.ReadLine()
End Sub
Sub PrintOuput(ByVal iterations As Int32, ByVal tickcount As Int64)

Console.WriteLine("Number of executions: " & iterations.ToString)

Console.WriteLine("Total exec time (ms): " & tickcount)

Console.WriteLine("Exec time per query (ms): " & tickcount / iterations)

Console.WriteLine("")

End Sub
End Module

And here’s a typical execution result from above code:

Same query every execution, SQL:BatchCompleted
Number of executions: 20000
Total exec time (ms): 1357
Exec time per query (ms): 0,06785

Same query every execution, RPC:Completed
Number of executions: 20000
Total exec time (ms): 1513
Exec time per query (ms): 0,07565

Different query every execution, SQL:BatchCompleted
Number of executions: 20000
Total exec time (ms): 2402
Exec time per query (ms): 0,1201

Different query every execution, RPC:Completed
Number of executions: 20000
Total exec time (ms): 14446
Exec time per query (ms): 0,7223

As you can see, there is an overhead to use RPC events and sp_executesql. I.e., if we don’t parameterize our command object, then we only pay with no gain. The overhead is definitely noticeable were we generate a new plan for each execution. An interesting aspect is that if I remove the call to DBCC FREEPROCCACHE, (so we get new plan for each iteration in the loop, but plans can be re-used from prior execution of the VB code), then this is much cheaper and closer to the others (about 5.7 seconds instead of 14 seconds). So, we do pay extra for sp_executesql especially when we generate a new plan.

Is above a weird example?
Yes, it is, and I want to emphasize that. Typically, you will use sp_executesql when you actually parameterize your queries. That will cause better plan re-use (and protect from SQL injection). Above is only to show whether sp_executesql has a cost – without considering the benefits.

How about using stored procedures?
Say you are using stored procedures, and wonder how different ways to execute a procedure compares? Let’s have a look. We can do it three ways (probably more, but below are the ones I’m interested in):

Build the proc command and pass it as a string. No parameter objects, no sp_executesql. You are still open to SQL injection. Don’t do this, we only want to show it for comparsion reasons. Profiler will show an SQL:BatchCompleted event with TextData something like “EXEC dbo.GetOrderStatusRes @SalesOrderID = 43664”.
Use parameters for your command objects, but you forget to define the command object of type CommandType.StoredProcedure. Profiler will show an RPC:Completed event with TextData something like “exec sp_executesql N’EXEC dbo.GetOrderStatusRes @SalesOrderID = @theID’,N’@theID int’,@theID=43664”.
The right way. Define the command object as of type CommandType.StoredProcedure, and of course add parameter object to the command object. Profiler will show an RPC:Completed event with TextData something like:
“declare @p2 tinyint
set @p2=5
exec dbo.GetOrderStatusOut @SalesOrderID=43664,@Status=@p2 output
select @p2”
Note that the Profiler tool adds the declare, set and select parts, it is not part of what is submitted from the client app. The client app actually submit the paramaters as binary values. Profiler add this to be nice to us, so we can copy that text and execute it, read it, etc.

Anyhow, here’s the VB code:

Imports System.Data Imports System.Data.SqlClient Module Module1


Sub Main()
Dim i As Int32, iterations As Int32 = 20000

Dim tickcount As Int64

Dim ProductID As Int32 = 43664

Dim ProdStatus As Int16

Dim sql As String
Dim cn As New SqlConnection("SERVER=localhost\a;DATABASE=AdventureWorks2008;INTEGRATED SECURITY=TRUE")

cn.Open()
'Create the procedures we will execute

Dim utilCmd As New SqlCommand("", cn)

utilCmd.CommandText = "IF OBJECT_ID('dbo.GetOrderStatusRes') IS NOT NULL DROP PROC dbo.GetOrderStatusRes" & vbCrLf

utilCmd.CommandText += "IF OBJECT_ID('dbo.GetOrderStatusOut') IS NOT NULL DROP PROC dbo.GetOrderStatusOut"

utilCmd.ExecuteNonQuery()

utilCmd.CommandText = "CREATE PROC dbo.GetOrderStatusRes @SalesOrderID int AS SELECT status FROM Sales.SalesOrderHeader WHERE SalesOrderID = @SalesOrderID"

utilCmd.ExecuteNonQuery()

utilCmd.CommandText = "CREATE PROC dbo.GetOrderStatusOut @SalesOrderID int, @status tinyint OUT AS SET @status = (SELECT status FROM Sales.SalesOrderHeader WHERE SalesOrderID = @SalesOrderID)"

utilCmd.ExecuteNonQuery()
'Run each once so it is in cache

utilCmd.CommandText = "EXEC dbo.GetOrderStatusRes @SalesOrderID = 43664"

utilCmd.ExecuteScalar()

utilCmd.CommandText = "DECLARE @s tinyint EXEC dbo.GetOrderStatusOut @SalesOrderID = 43664, @status = @s OUT"

utilCmd.ExecuteNonQuery()
utilCmd.CommandText = "DBCC FREEPROCCACHE"
'Definitions

'**********************************************************************************************

'Exec proc as string, SQL:BatchCompleted

Dim cmdAllText As New SqlCommand("EXEC dbo.GetOrderStatusRes @SalesOrderID = " & ProductID.ToString(), cn)

cmdAllText.CommandType = CommandType.Text
'Parameterize, but still as text, RPC:Completed with sp_executesql

Dim cmdParmText As New SqlCommand("EXEC dbo.GetOrderStatusRes @SalesOrderID = @theID", cn)

cmdParmText.CommandType = CommandType.Text

cmdParmText.Parameters.Add("@theID", SqlDbType.Int)

cmdParmText.Parameters("@theID").Value = ProductID
'Parameterize as stored procedure, RPC:Completed

Dim cmdRpc As New SqlCommand("dbo.GetOrderStatusOut", cn)

cmdRpc.CommandType = CommandType.StoredProcedure

Dim prmSalesOrderID As SqlParameter = New SqlParameter("@SalesOrderID", SqlDbType.Int)

prmSalesOrderID.Value = ProductID

cmdRpc.Parameters.Add(prmSalesOrderID)

Dim prmStatus As SqlParameter = New SqlParameter("@Status", SqlDbType.TinyInt)

prmStatus.Direction = ParameterDirection.Output

cmdRpc.Parameters.Add(prmStatus)
'Execution

'**********************************************************************************************

utilCmd.ExecuteNonQuery()

Console.ReadLine()
Console.WriteLine("Exec proc as string, SQL:BatchCompleted")

tickcount = Environment.TickCount

For i = 1 To iterations

ProdStatus = cmdAllText.ExecuteScalar()

Next

tickcount = Environment.TickCount - tickcount

Call PrintOuput(iterations, tickcount)

Console.ReadLine()
utilCmd.ExecuteNonQuery()

Console.WriteLine("Parameterize, but still as text, RPC:Completed with sp_executesql")

tickcount = Environment.TickCount

For i = 1 To iterations

ProdStatus = cmdParmText.ExecuteScalar()

Next

tickcount = Environment.TickCount - tickcount

Call PrintOuput(iterations, tickcount)

Console.ReadLine()
utilCmd.ExecuteNonQuery()

Console.WriteLine("Parameterize as stored procedure, RPC:Completed")

tickcount = Environment.TickCount

For i = 1 To iterations

cmdRpc.ExecutteNonQuery()

Next

tickcount = Environment.TickCount - tickcount

Call PrintOuput(iterations, tickcount)
Console.ReadLine()
End Sub

Sub PrintOuput(ByVal iterations As Int32, ByVal tickcount As Int64)

Console.WriteLine("Number of executions: " & iterations.ToString)

Console.WriteLine("Total exec time (ms): " & tickcount)

Console.WriteLine("Exec time per query (ms): " & tickcount / iterations)

Console.WriteLine("")

End Sub
End Module

And here’s the result from a typical execution:

Exec proc as string, SQL:BatchCompleted
Number of executions: 20000
Total exec time (ms): 1810
Exec time per query (ms): 0,0905

Parameterize, but still as text, RPC:Completed with sp_executesql
Number of executions: 20000
Total exec time (ms): 1700
Exec time per query (ms): 0,085

Parameterize as stored procedure, RPC:Completed
Number of executions: 20000
Total exec time (ms): 1388
Exec time per query (ms): 0,0694

We can see that doing it the “proper” way is cheapest, but there’s not a big difference between the three. The first alternative is not good, though, since we aren’t protected from SQL injection. And since you then will be using parameter object anyhow, just go ahead and define the CommandType as stored procedure while you’re at it.

Heaps, forwarding pointers, ALTER TABLE REBUILD and non-clustered indexes

Tibor Karaszi/ 2009-08-282018-06-05/ Maintenance, Performance

Let’s start with some background on forwarding pointers:

Forwarding pointers in heaps can be a mess to get rid of. A forwarding pointer happens when you modify a row so that the row doesn’t fit on the same page anymore. SQL Server moves the row to a new page and leaves a forwarding pointer where the old row used to be. This means that non-clustered indexes are not affected by the moving of the row – it can still point to the old location. So the modification is less costly than if all non-clustered indexes needed to reflect the row move. But having forwarding pointers can be bad for performance when you read data. A perhaps less known fact is that a scan over a table needs to follow forwarding pointers – essentially “jumping back and forth” a lot if you have many forwarding pointers. That can be really bad for performance (if you have table scans, of course). So, how do we get rid of forwarding pointers? Well, we can shrink the database file, but that is a little like curing a headache by jumping into a big meat grinder. We can export all data and re-import it – not very practical.

Greg Linwood reminded me that in SQL Server 2008, we can do ALTER TABLE … REBUILD. Now, I knew about this option, and every time I tell about it in class I’ve been thinking silently for myself “I need to test whether this is a way to get rid of fwd pointers”. (You generally talk about ALTER TABLE … REBUILD when you talk about enabling compression on a heap.) So, doing a REBUILD of a table using ALTER TABLE sounds promising. Will it get rid of forwarding pointers? Will it also rebuild all non-clustered indexes?

Quick answer for those who don’t care reading the TSQL script:
ALTER TABLE … REBUILD will remove forwarding pointers, but for some strange reason it will also rebuild all non-clustered indexes on that table.

See the TSQL code below. It was adapted from a challenge by Linchi Shea to produce a data loading script resulting in worst performance (where I immediately thought of forwarding pointers). See for instance http://sqlblog.com/blogs/linchi_shea/archive/2009/06/07/t-sql-exercise-simplest-data-loading-script-to-produce-worst-query-performance-june-7th-update.aspx.

USE tempdb GO


IF EXISTS (SELECT * FROM sysobjects WHERE name = 'test')

DROP TABLE test

GO

CREATE TABLE test (

x INT NOT NULL

,x2 INT NOT NULL

,y CHAR(10) NOT NULL DEFAULT ('')

,z CHAR(10) NOT NULL DEFAULT('')

)
DECLARE @rows INT = 666666, @toKeep INT = 400000, @diff INT
INSERT test (x, x2)

SELECT TOP(@rows)

ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS r

,ROW_NUMBER() OVER (ORDER BY (SELECT 1)) % 10 AS s

FROM master..spt_values a CROSS JOIN master..spt_values b

GO

ALTER TABLE test ALTER COLUMN y CHAR(892)

ALTER TABLE test ALTER COLUMN z CHAR(100)

GO
DECLARE @rows INT = 666666, @toKeep INT = 400000, @diff INT

DELETE TOP(@rows - @toKeep)

FROM test WHERE x2 IN(2, 4, 6, 8)

GO
CREATE INDEX x1 ON test(x)

CREATE INDEX x2 ON test(x2)
SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('test'), NULL, NULL, 'DETAILED')

--First  run (no nc indexes ): 387157 fwd records (out of 400000 rows), 96104 pages

--Second run (two nc indexes): 387157 fwd records (out of 400000 rows), 96105 pages
CHECKPOINT

DBCC DROPCLEANBUFFERS

DECLARE @t time = SYSDATETIME()

ALTER TABLE test REBUILD

SELECT DATEDIFF(ms, @t, CAST(SYSDATETIME() AS time))

--First run, no non-clustered indexes, three subsequent executions (ms): 19351, 20683, 20275

--Second run, with two non-clustered indexes, three subsequent executions (ms): 31803, 35065, 37511
SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('test'), NULL, NULL, 'DETAILED')

--First  run, heap = "index" 0 (no nc indexes ): 0 fwd records, 50002 pages

--Second run, heap = "index" 0 (two nc indexes): 0 fwd records, 50003 pages

--But: avg_page_space_used_in_percent changes for the nc indexes after the rebuild.

--That I take as nc indexes are also rebuilt.

Response time vs. resource consumption

Tibor Karaszi/ 2009-06-152018-06-05/ Performance

I see a trend towards more and more focusing on response time; and less and less on resource usage (resource consumption). I’ve even seen blanket statements such as the only thing that matters is response time. I do not agree. I feel that by being a good citizen and consume as few resources and possible, we contribute to the overall welfare of the system.

For instance, I’m fortunate to have some 8 km (5 miles) to my main client. I can take the car, which often is about 15 minutes or I can bicycle, which is about 20 minutes. For many reasons, I prefer to bicycle. The thing here is that I compromise a little bit and accept a few more minutes when going by bicycle, but I feel I’m a better citizen and contribute to a more sustainable society. But not only that: ever so often, the traffic is congested, and now the car takes some 40-45 minutes (bicycle still 20 minutes). By using the bicycle I both consume less resources and I also have a higher degree of predictability. Now, is this analogy appropriate to database performance? I don’t know, perhaps to some extent… But let me give you a database example, from real life, followed by a TSQL example:

I have a client who had this query which used to be quick (enough) and suddenly was very slow. Been there before, we know this can be just about anything. Anyhow, it was pretty quick for me to find the reason. The query had an ORDER BY and a FAST hint. The FAST hint tells SQL Server to return rows to the client as fast as possible, but possibly with a higher overall cost. The developer who added that FAST hint didn’t really think that hard about it, and just “threw it in there”. It sounds good, doesn’t it? There was a non-clustered index (non-covering) on the sort column and also some restrictions (WHERE clause).

With the FAST hint, SQL Server used the index on the sort column to drive the query and for each row it did a “bookmark lookup” to fetch the row. This means a page access for each row, but rows can be returned to the client application immediately (think streaming). Without the fast hint, SQL Server first sorted the relevant rows into a worktable and then returned the rows in sorted order from that worktable. So we have a tradeoff between reading a lot of pages (possibly some from cache) or doing some work up-front to sort data and then just read that sorted worktable sequentially.

The worrying part here is that with a small table, it will fit in cache and the difference between the two might not be that drastic (either way). But as table grew larger, it won’t fit in cache anymore and as we see logical I/O turning into physical I/O things really go south for the query with the FAST hint. This is what happened to my client. Table grew and a query which had OK response time suddenly was a disaster. If that FAST hint wasn’t there in the first place, my client wouldn’t have this slowness in the application over the two weeks it took until I had time to look over it and remove the FAST hint (I also added a couple of indexes, but that is beside the point).

Seeing is believing, right? At the end of this blog post, you will find TSQL that pretty much mimics my client’s case. It populates a table with 800,000 rows and there’s a non-clustered index on the sort column. We then try some variations to check response time, CPU seconds usage, I/O and duration. I measured response time using TSQL (as seen in the script). I also measured response time and the other metrics using Profiler.

The size of the table (clustered index on identity column) is 133MB and the non-clustered index to on the sort column is 11MB. This is a small table, but that makes things more manageable; and by setting the max server memory to a low value (60MB), we can still see the effect of logical vs. physical I/O.

We first run the query and have a filter that restricts to 4,000 rows out of 800,000 rows. Note that there’s no index on the filter column.

The query without a FAST hint was very consistent. The response time was either 0.05 seconds (without clearing cache first) or 1.9 seconds (if we clear cache first). This was regardless of if we configured with 500MB or 50MB memory for sp_configure ‘max server memory’.
The query with FAST hint was OK with memory setting of 500MB, so the table would fit in cache: 1.6 seconds to 4.4 seconds (depending on whether we empty cache before execution). But when we lower memory setting (think “large table”), the execution time jumped up to 73 seconds. That is a factor of between 48 and 1460.

Things got a bit different when we removed the WHERE clause to return all rows:

Query without FAST hint took between 10 seconds and 23 seconds (depending on whether we empty cache first) for a memory setting of 500MB. Lowering memory to 60MB made this one take between 23 and 41 seconds. Note that I here got some error messages from SQL Server regarding insufficient memory in the internal memory pool (possibly SQL Server now did some fall-back strategy for the query execution, which added to execution time).
The query with the FAST hint outperformed the one without for a memory setting of 500MB, with execution time between 2.2 and 5.6 seconds. Note that I configured SSMS to discard results so there is no processing of the returned 800,000 rows included here. With a memory setting of 60MB, we again bumped up execution time to some 74 seconds.

Here are the full numbers:

ms	ms Profiler	ms cpu	io	fast hint	memconf	cache clean	rows returned
1930	2023	202	18782	0	500	1	4000
53	60	110	18768	0	500	0	4000
4403	4497	2075	2695310	1	500	1	4000
1616	1622	1622	2551439	1	500	0	4000
1930	1977	171	18768	0	60	1	4000
56	59	94	18768	0	60	0	4000
72426	72479	10888	5513944	1	60	1	4000
72663	72728	10983	5521626	1	60	0	4000
23336	23391	2105	31738	0	500	1	800000
10263	10269	2559	31574	0	500	0	800000
5663	5703	2386	2695368	1	500	1	800000
2226	2235	2028	2551439	1	500	0	800000
40966	40975	2620	31654	0	60	1	800000
22906	22913	2714	31629	0	60	0	800000
73676	73709	11045	5512080	1	60	1	800000
74513	74557	11778	5522556	1	60	0	800000

For the sake of completeness, I should add that having a good supporting index for the restriction (for the queries that had a restriction) made the query go equally fast regardless of memory config or FAST hint (in fact the FAST hint was irrelevant with a good index).

Here’s the T-SQL if you want to play with it. As always, don’t execute anything if you don’t understand the code and the consequences of executing it!

EXEC sp_configure 'max server memory', 500 RECONFIGURE GO


USE master

GO

IF DB_ID('TestDb') IS NOT NULL DROP DATABASE TestDb

GO

CREATE DATABASE [TestDb]

ON

PRIMARY

(NAME = N'TDb', FILENAME = N'C:\TDb.mdf'

,SIZE= 100MB, MAXSIZE = 200MB, FILEGROWTH = 30MB )

LOG ON

(NAME = N'TDb_l', FILENAME = N'C:\TDb_l.ldf'

,SIZE = 200MB, MAXSIZE = 500MB, FILEGROWTH = 20MB )

GO
USE testDb
CREATE TABLE t(c1 INT IDENTITY PRIMARY KEY CLUSTERED, c2 INT, c3 INT, filler CHAR(150))
INSERT INTO t (c2, c3, filler)

SELECT TOP(800000) 1, 1, 'hi'

FROM sys.columns AS a

CROSS JOIN sys.columns AS b

CROSS JOIN sys.columns AS c
UPDATE t SET c2 = c1 % 20, c3 = c1 % 200
CREATE NONCLUSTERED INDEX x ON t(c2)
--Size of table and indexes

EXEC sp_indexinfo t

--Can be found at http://www.karaszi.com/SQLServer/util_sp_indexinfo.asp
IF OBJECT_ID('tmp') IS NOT NULL DROP TABLE tmp

GO

CREATE TABLE tmp

(seq tinyint IDENTITY(1,1) PRIMARY KEY NOT NULL

,ms INT NOT NULL

,ms_profiler INT NULL

,ms_cpu INT NULL

,io_ INT NULL

,fast_hint bit NOT NULL

,memconf smallint NOT NULL

,cache_clean bit NOT NULL

,rows_returned INT NOT NULL)

GO
-----------------------------------------------------------------------------------------

--Create procedures

IF OBJECT_ID('emptyCache') IS NOT NULL DROP PROC emptyCache

GO

CREATE PROC emptyCache AS

BEGIN

CHECKPOINT

DBCC DROPCLEANBUFFERS

END

GO
IF OBJECT_ID('do_it') IS NOT NULL DROP PROC do_it

GO

CREATE PROC do_it

@fast_hint bit, @memconf smallint, @cacheclean bit, @rows_returned INT

WITH RECOMPILE

AS

BEGIN

DECLARE @dt datetime SET @dt = GETDATE()
IF @fast_hint = CAST(0 AS bit)

IF @rows_returned = 4000

SELECT * FROM t WHERE c3 = 16 ORDER BY c2

ELSE --return all rows

SELECT * FROM t ORDER BY c2

ELSE --add FAST hint

IF @rows_returned = 4000

SELECT * FROM t WHERE c3 = 16 ORDER BY c2 OPTION(FAST 20)

ELSE --return all rows

SELECT * FROM t ORDER BY c2 OPTION(FAST 20)
INSERT INTO tmp(ms, fast_hint, memconf, cache_clean, rows_returned)

VALUES(DATEDIFF(ms, @dt, GETDATE()), @fast_hint, @memconf, @cacheclean, @rows_returned)

END

GO
TRUNCATE TABLE tmp

-----------------------------------------------------------------------------------------

--Return 4000 rows

-----------------------------------------------------------------------------------------
--500 MB memory

EXEC sp_configure 'max server memory', 500 RECONFIGURE

GO

--Without FAST

EXEC emptyCache

GO

EXEC do_it @fast_hint = 0, @memconf = 500, @cacheclean = 1, @rows_returned = 4000

GO

EXEC do_it @fast_hint = 0, @memconf = 500, @cacheclean = 0, @rows_returned = 4000

GO

--... with FAST

EXEC emptyCache

GO

EXEC do_it @fast_hint = 1, @memconf = 500, @cacheclean = 1, @rows_returned = 4000

GO

EXEC do_it @fast_hint = 1, @memconf = 500, @cacheclean = 0, @rows_returned = 4000

GO
--50 MB memory

EXEC sp_configure 'max server memory', 60 RECONFIGURE

GO

--Without FAST

EXEC emptyCache

GO

EXEC do_it @fast_hint = 0, @memconf = 60, @cacheclean = 1, @rows_returned = 4000

GO

EXEC do_it @fast_hint = 0, @memconf = 60, @cacheclean = 0, @rows_returned = 4000

GO

--... with FAST

EXEC emptyCache

GO

EXEC do_it @fast_hint = 1, @memconf = 60, @cacheclean = 1, @rows_returned = 4000

GO

EXEC do_it @fast_hint = 1, @memconf = 60, @cacheclean = 0, @rows_returned = 4000

GO
------------------------------------------------------------------------------------

--Return all 800,000 rows

------------------------------------------------------------------------------------
--500 MB memory

EXEC sp_configure 'max server memory', 500 RECONFIGURE

GO

--Without FAST

EXEC emptyCache

GO

EXEC do_it @fast_hint = 0, @memconf = 500, @cacheclean = 1, @rows_returned = 800000

GO

EXEC do_it @fast_hint = 0, @memconf = 500, @cacheclean = 0, @rows_returned = 800000

GO

--... with FAST

EXEC emptyCache

GO

EXEC do_it @fast_hint = 1, @memconf = 500, @cacheclean = 1, @rows_returneed = 800000

GO

EXEC do_it @fast_hint = 1, @memconf = 500, @cacheclean = 0, @rows_returned = 800000

GO
--50 MB memory

EXEC sp_configure 'max server memory', 60 RECONFIGURE

GO

--Without FAST

EXEC emptyCache

GO

EXEC do_it @fast_hint = 0, @memconf = 60, @cacheclean = 1, @rows_returned = 800000

GO

EXEC do_it @fast_hint = 0, @memconf = 60, @cacheclean = 0, @rows_returned = 800000

GO

--... with FAST

EXEC emptyCache

GO

EXEC do_it @fast_hint = 1, @memconf = 60, @cacheclean = 1, @rows_returned = 800000

GO

EXEC do_it @fast_hint = 1, @memconf = 60, @cacheclean = 0, @rows_returned = 800000

GO

Match those types!

Tibor Karaszi/ 2009-04-282022-02-14/ Performance

This is a recommendation I believe is worth repeating from time to time: Make sure you match data types when you write TSQL code. Else you in most cases end up with an implicit data type conversion. And in worst case, this conversion is performed at the column side – not the literal side of your query. What does that mean? Consider below:

WHERE Col = Val

Now, say that the types for above don’t match. Val might be some parameter (to a stored procedure, for instance), a variable or a written value (literal). In any case, when SQL Server need to do some operation )like comparison like here) involving several values which aren’t of the same type, then one of the values need to be converted so it is of the same type as the other. Which one is determined by the rules for “Data Type Precedence“.

What we don’t want is a conversion at the column side. Why? I would argue that an implicit conversion in the first place in many cases mean I misunderstood something about the types in my system and am on my way of producing a bug in my code. But having a more strict language is not the topic for this particular post (check out this). My point here is that it is bad for performance reasons. Just yesterday and today I was involved in a thread on the MSDN forum. Here’s the repro script from that thread (slightly adjusted by me):

USE tempdb GO IF OBJECT_ID('t') IS NOT NULL DROP TABLE t GO CREATE TABLE t ( c1 int IDENTITY(10000001,1) NOT NULL ,c2 char(8) NULL ,c3 datetime NULL ) GO INSERT INTO t(c3) SELECT TOP 3000000 '20080203' FROM master..spt_values t1 CROSS JOIN master..spt_values t2 CROSS JOIN master..spt_values t3 UPDATE t SET c2 = CAST(c1 AS char(8)) CREATE UNIQUE CLUSTERED INDEX x ON t(c2) GO

SELECT * FROM t WHERE c2 = N'10000009' SELECT * FROM t WHERE c2 = N'10000008'

Don’t bother too much about the first section. The relevant part here is that we have char(8) column with values which are unique (enforced by a unique index), on a table with three million rows.

Now, before we dig deeper: What is the problem with above queries? The type do not match! The type in the table is char(8), but literal is Unicode (the N before the string). Since nchar has higher precedence than char, then the column side needs to be converted to the value side. Not good. Try the code on SQL Server 2000 and you will see disaster. For me, STATISTICS IO gave 10755 logical reads for both queries, with Clustered Index Scan (which is the same as table scan). Hoover on the Clustered Index Scan icon and you will find “WHERE:(Convert([t].[c2])=[@1])”. See how the conversion is done at the column side? This, my friend, mean that SQL Server cannot seek an index to find the rows. Ouch! That’s just because we were a bit sloppy… Compare to below:

SELECT * FROM t WHERE c2 = '10000009' SELECT * FROM t WHERE c2 = '10000008'

You will find for both two above we have Clustered Index Seek with only three logical reads. That is a good improvement.

OK, enough about SQL Server 2000, lets discuss 2008/2005. At first I couldn’t see any difference between the two original queries (N’10000009′ and N’10000008′). Both seemed to be instant, and the same execution plan for both. I even saved the execution plan as XML and used WinDiff to compare the xml files. No difference. But the poster said the “9” query was slower than the “8” query. I executed to text and had ctrl-E pressed for a while. And now I could see a difference. The flicker was quicker for the “8” version vs. the “9” version. I used Profiler and confirmed. The “9” version had consistently about 90000 microsecond duration where the “8” version had about 0 duration. This was also reflected in CPU usage, as shown by Profiler. I hoped to use Extended Events to find something deeper about this, but that didn’t give me anything. Perhaps I didn’t capture the relevant events, quite simply. X/Event experts are welcome to suggest event session configurations. 🙂

My guess for the difference is in the execution plan. You will see an index seek for both. That is good, since SQL Server nowadays potentially can use an index even if you end up with an implicit conversion at the column side. But you will see a few more operators in addition to the index seek. See that Constant Scalar followed by a Compute Scalar? My guess is that the built-in internal function used inside the Compute Scalar operator is more expensive for N’10000009′ compared to N’10000008′. You can see this function if you use SET STATISTICS PROFILE ON. See the GetRangeThroughConvert() function. My guess is that the answer is the cost for this function. (Search the Internet for things like “Constant Scan” and “GetRangeThroughConvert” for other blog post on this topic.)

The moral of the story? Here we have a cost which can be relevant in some cases, but we will only see this if we happen to execute our query with the right values and are very very observant. But we can eliminate all this mess this by using the right type for the column in the first place!

Do you have Instant File Initialization?

Tibor Karaszi/ 2009-03-092018-06-05/ Install and upgrade, Performance

You might ask yourself: What is Instant File Initialization and why does it matter?

What Instant File Initialization is:
Instant File Initialization allow SQL Server to allocate storage (space from disks) very very quickly. As you probably know, when you delete files they are not physically deleted from the disk – they are only marked as deleted in the file system allocation structures. This is why various undelete programs can recover deleted files. So imagine a process requiring disk space. The disk area given to the process from the file system and operating system could potentially contains some secret information from deleted files. This is why the file system/OS insist to zero out the storage before the process can actually see it. (I’m not OS / file system expert so anyone is free to correct me here.) That is, unless the one allocating the storage has some special privileges.

When does the zeroing out of disk space occur?
Whenever SQL Server need disk space, such as:

Create database
Add file to database
Restore (if the restore process includes database creation)
File growth (both manual and auto-grow)
Backup [edit 2011-08-26: Not sure how this got here, AFAIK shouldn’t be here]

Can we avoid the slow zeroing out space?
Yes, but only if you are on SQL Server 2005 or higher and for some operations: creation and allocation of data database files (mdf and ndf). The SQL Server service account need to have appropriate permissions from the OS. To be more precise, it need to have a privilege called SE_MANAGE_VOLUME_NAME. This is by default granted to Administrators. Do you run your SQL Server as an account being member of Administrators? I hope not. Did you grant this permission to the service account?

How do I grant this privilege?
This is easy. Add the SQL Server service account to the “Perform Volume Maintenance Tasks” security policy.

Does it matter?
You be the judge. Just to give you an idea, I created a database with a data file of various size (I had the log file at 1MB for all tests in order for it to influence the least). I timed it both with and without Instant File Initialization. I ran it on my desktop machine which has a RAID0 of two 7200RPM disks:

Size

without IFI

with IFI

1GB

10.3 s

0.3 s

10GB

128 s

1.3 s

50GB

663 s

4.5 s

The difference is roughly a factor of 100!

When does it hurt?
Basically every time disk storage is allocated. But let us focus of the occasions where you can do anything about it, i.e., when you can have Instant File Initialization. Such occasions include:

Whenever a database is created. Space need to be allocated for the data file(s).
Whenever a data file is manually expanded in size.
Whenever you have auto-grow for a data file. Note that potentially some poor user will now sit and wait for the auto-grow to complete.
When you start SQL Server. Why? Tempdb need to be re-created.
When you perform restore, if the destination database not already exists with matching database file structure.

How can I tell if I have Instant File Initialization?
I find it easiest to just create a database and time it, using some reasonable size for your data file, like 5GB. Actually, run two test: One with 5GB data file and really small log file. And then another with very small data file and 5GB log file. Remember that we never have Instant File Initialization for log files. For instance, run below and you will see (adjust the file path for the database files). You need to adapt your code for file path name, possibly database name and the datetime handling if you are lower then SQL Server 2008:

DECLARE @t time(3) = SYSDATETIME() CREATE DATABASE IFI_test_ld ON PRIMARY (NAME = N'IFI_test', FILENAME = N'C:\IFI_test\IFI_test_ld.mdf', SIZE = 5GB, FILEGROWTH = 100MB) LOG ON (NAME = N'IFI_test_log', FILENAME = N'C:\IFI_test\IFI_test_ld.ldf', SIZE = 1MB, FILEGROWTH = 10MB) SELECT DATEDIFF(ms, @t, CAST(SYSDATETIME() AS time(3))) AS LargeDataFile SET @t = SYSDATETIME() CREATE DATABASE IFI_test_ll ON PRIMARY (NAME = N'IFI_test', FILENAME = N'C:\IFI_test\IFI_test_ll.mdf', SIZE = 3MB, FILEGROWTH = 100MB) LOG ON (NAME = N'IFI_test_log', FILENAME = N'C:\IFI_test\IFI_test_ll.ldf', SIZE = 5GB, FILEGROWTH = 10MB) SELECT DATEDIFF(ms, @t, CAST(SYSDATETIME() AS time(3))) AS LargeLogFile

Are numbers for above two about the same? If yes, then you don’t have Instant File Initialization. If the one with large data file is much quicker, then you do have Instant File Initialization. And now you also know approx how long it takes to allocate 1 GB with of data and log file for your SQL Server.

John Samson blogged about an alternative way to check, involving trace flags.

I’m curious: Did you have Instant File Initialization?

Does the Resource Governor govern?

Tibor Karaszi/ 2008-10-222018-06-05/ Performance

Two weeks ago, we did the “SQL 2008 Summit” roadshow here in Sweden. We did 4 cities around Sweden in 4 days (http://www.expertzone.se/sql2k8/). It was a bit exhaustive, but even more fun – being able to travel and spend some time with persons wish I could meet more often (Kalen), others I meet regularly but only at workplace (Roger, Patrik, Anna) and yet other persons I just met (George, Per).

One of my presentations was on Resource Governor (RG), and I has this super-simple demo meaning to show CPU throttling. I classified connections to one of two Workload Groups based on login name. One group used a Resource Pool with max CPU at 10% and the other a Resource Pool with max CPU at 90%. Since I have two CPU cores, I started two execution loops for each login. An execution loop uses SQLCMD to login using the appropriate loginID and execute a proc which loops and outputs a counter using RAISERROR and NOWAIT (so we see something happening in the console).

For two of my presentations it worked just as expected. For two presentations it didn’t: the CPU usage looked very very strange – nowhere near what we expected. So, during the final day, I managed to spend some time with Mikael Colliander from MS Sweden. First we couldn’t reproduce the strange behavior, but after several restart, re-boot etc. we could. We now finally got to look at what scheduler each connection was using and there was the answer. One connection (ending up in the pool with max 10% CPU) was alone on one scheduler meaning alone on one CPU! The other three connections (one one on 10% CPU and two on max 90% CPU) was using the other CPU. So for the CPU where we had only one connection (belonging to the pool to be capped at 10% CPU) we just had no contention. So this connection could use everything on that CPU since nobody else was assigned to the CPU.

Now when I understand why this happened, it doesn’t look that strange. But I think we need to be very careful when we monitor resource usage for our connections and are using resource governor. The more CPUs we have the less chance we will see the (perhaps expected) distribution of load.

Execution plan re-use, sp_executesql and TSQL variables

Tibor Karaszi/ 2008-08-292018-06-05/ Performance, Transact-SQL

Let me start by saying that the contents of this post is not very advanced. If you have read the excellent paper “Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005”, https://technet.microsoft.com/en-us/library/cc966425.aspx and understood it, you would already know below, and much more…

I was reading a thread in the open newsgroups today (.tools, posted by Mike), about an advice that the SQL Server 2005 Database Engine Tuning Advisor apparently gave. It advices to replace sp_executesql usage with declaring TSQL variables and using those in the WHERE clause. Translated to the Adventureworks database, it advices that instead of 1) below, we should use 2) below.

1)EXEC sp_executesql N'SELECT FirstName, LastName, PersonType, Title FROM Person.Person WHERE LastName = @P1', N'@P1 nvarchar(50)', 'Diaz'2) DECLARE @P1 NVARCHAR(50) SET @P1 = 'Diaz' SELECT FirstName, LastName, PersonType, Title FROM Person.Person WHERE LastName = @P1

Now, I could not reproduce this (make DTA give me the same advice, to replace 1) with 2) ). Perhaps I misunderstood the poster in the group, it is because I’m running SQL Server 2008 DTA and engine, I’m not looking in the right place for that advice, my data isn’t representative, I’m running the DTA with some other settings, etc.. But say that DTA does indeed give such an advice, would would it do that? To be honest, I don’t know. It can hardly have enough information to determine whether 1) or 2) is the best choice.

In short: Say we have an index on the LastName column and the name we look for can either be a very common name, like perhaps “Smith”, or a not so common name, like “Karaszi”. For the more common name, a table scan might be the best choice, where for the not-so-common name, an index seek it probably best thing. OK, a very simple example, but serves well for this discussion.

Back to the difference between 1) and 2). There are potentially very important differences between the two:

For 1), SQL Server will determine a plan based on the contents of the parameter when the plan is created. I.e., it can determine selectivity based on that and determine whether it is a good idea to use an index or not. The plan is then cached and can be re-used. But what if we for the first execution pass in something which is very selective, but typically we are not very selective? Or the other way around? I,e, the plan for one case might not be optimal for some other case. This is where we have new optimizer hints in 2005 like OPTIMIZE FOR and the RECOMPILE hints. I will not go into details about these here, since I’m focusing on the differences between 1) and 2). See the white paper I mentioned, and of course Books Online, for more details.

For 2), SQL Server (or rather: the optimizer) has no knowledge of the contents of the variable when the plan is produced. So it cannot use the statistics to determine selectivity. In above case, it instead uses density (stored with the statistics, assuming such exists for the column). Density is basically the 1/ number of unique values for the column(s). This might be a good representation for a typical lastname, but perhaps not too good for a very common or a very uncommon lastname. Now, in 2005, we have hints for these situations as well (RECOMPILE), but again, that is not the point here.

In order for DTA to give the best advice here, it would have to know about the distribution over the data for that column and also have many executions of that query to see if “all” executions are using a typical value (sp_executesql might be better) or if the stored density value is a good representation for “all” queries that are passed from the clients. I very much doubt that DTA has this level of sophistication. Basically, I don’t know why it advices this. There might be other aspects, like “avoid dynamic SQL” (which whether that holds for this case we could argue in another post), but DTA is about performance, not best practices.

Bottom line is that these things are not simple and we should be very cautious with “rules of thumbs”.

Here’s some TSQL code to demonstrate the differences between 1) and 2). As always, only execute after you read and understood the code!

--Create a copy of the person table --We will have lots of "Diaz" and very few "Gimmi" IF OBJECT_ID('p') IS NOT NULL DROP TABLE p GO SELECT * INTO p FROM Person.Person


CREATE INDEX x ON dbo.p(LastName)
--Create lots of Diaz

INSERT INTO p

SELECT  BusinessEntityID + 30000, PersonType, NameStyle, Title, FirstName, MiddleName, N'Diaz', Suffix, EmailPromotion, AdditionalContactInfo, Demographics, rowguid, ModifiedDate

FROM p
--Make sure we have up-to-date statistics

UPDATE STATISTICS p WITH FULLSCAN
--Verify execution plan and I/O cost

--for table scan with low selectivity

--and index seek with high selectivity

SET STATISTICS IO ON
DBCC FREEPROCCACHE

SELECT * FROM p WHERE LastName = 'Diaz'

--20183 rows, table scan, 7612 pages
DBCC FREEPROCCACHE

SELECT * FROM p WHERE LastName = 'Gimmi'

--1 row, index seek, 3 pages
------------------------------------------------------------

--sp_execute alternative

------------------------------------------------------------
--Table scan will be used for both because of execution plan re-use

DBCC FREEPROCCACHE

EXEC sp_executesql N'SELECT FirstName, LastName, PersonType, Title

FROM p

WHERE LastName = @P1',

N'@P1 nvarchar(50)', 'Diaz'

--20183 rows, table scan, 7612 pages
EXEC sp_executesql N'SELECT FirstName, LastName, PersonType, Title

FROM p

WHERE LastName = @P1',

N'@P1 nvarchar(50)', 'Gimmi'

--1 row, table scan, 7612 pages
--Other way around

--Index search will be used for both because of execution plan re-use

DBCC FREEPROCCACHE

EXEC sp_executesql N'SELECT FirstName, LastName, PersonType, Title

FROM p

WHERE LastName = @P1',

N'@P1 nvarchar(50)', 'Gimmi'

--1 row, index seek, 3 pages
EXEC sp_executesql N'SELECT FirstName, LastName, PersonType, Title

FROM p

WHERE LastName = @P1',

N'@P1 nvarchar(50)', 'Diaz'

--20183 rows, index seek, 20291 pages
------------------------------------------------------------------

--Alternative using variable

------------------------------------------------------------------

DBCC FREEPROCCACHE

DECLARE @P1 NVARCHAR(50)

SET @P1 = 'Diaz'

SELECT FirstName, LastName, PersonType, Title

FROM p

WHERE LastName = @P1

GO

--20183 rows, index seek, 20291 pages
DBCC FREEPROCCACHE

DECLARE @P1 NVARCHAR(50)

SET @P1 = 'Gimmi'

SELECT FirstName, LastName, PersonType, Title

FROM p

WHERE LastName = @P1

GO

--1 rows, index seek, 1 pages
--Same plan even though very different selectivity

--and emptying plan cache in between
--Estimated 33 rows for both above.

--See if that is drawn from statistics density?
--Formula for density: 1/#OfUniqueValuesInColumn

SELECT 1./COUNT(DISTINCT LastName) FROM p

--0.00082918739
--Does that match density in index statistics?

DBCC SHOW_STATISTICS('p', 'x')

--Yes
--How many rows in the table?

SELECT COUNT(*) FROM p

--39944
--So how many rows would we estimate based on density?

SELECT 0.00082918739 * 39944

--Yep, 33 rows.
--I.e, for the variable alternative, SQL Server has no

--knowledge of contents of those variables so it must use density instead.

--Clean up: IF OBJECT_ID('p') IS NOT NULL DROP TABLE p