Query parameterization settings

You have probably seen the recommendation to turn on the “optimize for ad-hoc workloads” setting. You might even have seen a more recent recommendation to set the database setting parameterization to forced (instead of the default, which is simple). The aim of this post is to briefly describe each and then run some tests with various settings.

I will not dive into each sub-area of this topic (plan caching, parameter sniffing, each setting, etc.). There is loads of information out there; use your favourite search engine if there are areas mentioned here that you aren’t familiar with.

If you are unlucky, you have an application that submits queries as text to SQL Server, where that text neither uses stored procedures nor is parameterized by the application. The end result is that SQL Server will recognize the query by producing a hash of the full text, including your search arguments. This typically gives you lots and lots of execution plans for the same query. You get very little plan re-use (unless your users search for the very same things), and you litter the plan cache, wasting precious memory.
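To see this for yourself, run the same query with two different literals and then look in the plan cache; each text gets its own entry. A minimal sketch (using the AdventureWorks table that the test script below uses):

SELECT BusinessEntityId FROM Person.PersonPhone WHERE PhoneNumberTypeID = 1
GO
SELECT BusinessEntityId FROM Person.PersonPhone WHERE PhoneNumberTypeID = 2
GO

--Two separate cache entries for what is logically the same query
SELECT cp.usecounts, cp.objtype, cp.size_in_bytes, st.text
FROM sys.dm_exec_cached_plans AS cp
 CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE '%PersonPhone%'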

Optimize for ad hoc workloads

The above setting exists at the instance level and, since a few versions back, also at the database level. With it enabled, SQL Server will not cache a plan for the first execution of a query. It does save a hash of the text representation, though, so it can recognize if you execute the exact same text again, and for the second execution a plan will be cached. This can save memory. It will not save CPU.
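Here is how you turn it on, at the instance level or (since SQL Server 2019) per database. A sketch; verify against your version:

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'optimize for ad hoc workloads', 1
RECONFIGURE

--Or, per database, SQL Server 2019 and later:
ALTER DATABASE SCOPED CONFIGURATION SET OPTIMIZE_FOR_AD_HOC_WORKLOADS = ON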

Parameterization forced

This is a database setting that makes SQL Server parameterize most queries. This means that what you search for (etc.) will be broken out as parameters, and the same plan will be used even if you don’t search for the same values. This can save CPU and possibly also memory.

But you can end up with a plan which is good for some values but bad for others. This is what we refer to as a “parameter sniffing” problem. There are no free lunches. There has been some work done in this area in SQL Server 2022, but it is not as simple as “solving the problem”. See for instance this post by Brent Ozar.
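Enabling it is a one-liner, and sys.databases shows the current setting. A sketch, using a hypothetical database name:

ALTER DATABASE YourDb SET PARAMETERIZATION FORCED

--Verify:
SELECT name, is_parameterization_forced FROM sys.databases WHERE name = 'YourDb'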

How do these settings relate and putting some numbers on it

I want to actually test things and see for myself instead of just reading some theoretical reasoning. This is risky, since my very simple test isn’t representative of your query load. Keep that in mind. Having said that, here’s what I did:

I executed a query 30,000 times. The query uses two search conditions on the same table and returns 0 rows. I’m interested in the plan aspect, not the query execution aspect.

Ideally I wanted to not return a resultset, but assigning the returned value to a variable caused the query to not be parameterized with forced parameterization. We don’t want to return 30,000 resultsets in SSMS (it takes ages), so the execution part was done using SQLCMD and some suitable command-line switches to minimize the output. Feel free to contact me if you have improvements to the test bench I was using, and I will possibly amend this blog post.

The plan generation is pretty cheap: full optimization with early termination (good enough plan found).

Here are my findings

Optimize for ad-hoc workloads   Parameterization forced   Time (seconds)   Memory used for plans (MB)
Off                             Off                       42               1640
On                              Off                       39               13
Off                             On                        3.2              470
On                              On                        3.3              3.4

Some observations:

  • Less surprising
    • The huge savings in memory with optimize for ad-hoc on.
    • The reduced execution time with forced parameterization on.
  • More surprising
    • The large memory usage with optimize for ad-hoc off and forced parameterization on. Possibly some text representation of the query, or some shell plan, causing this? (See the query below.)
    • How well optimize for ad-hoc and forced parameterization play together.
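One way to dig into that memory surprise is to break the plan cache down per object type. With forced parameterization, each distinct query text gets a “Prepared” entry for the parameterized plan plus an “Adhoc” shell entry, and I suspect the shells account for the memory. A sketch:

SELECT
 cp.objtype
,COUNT(*) AS plan_count
,SUM(cp.size_in_bytes) / (1024 * 1024.0) AS size_in_MB
FROM sys.dm_exec_cached_plans AS cp
GROUP BY cp.objtype
ORDER BY size_in_MB DESC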

Anyhow, there you have it. Remember that this was a very simplified test. Feel free to grab the SQL below and tweak it. Let me know if you find something interesting.


/*
Create a proc that generates lots of execution plans
Each execution is in itself very cheap
Takes about 1.2 seconds per 1000 executions, i.e., 30000 executions take about 40 seconds or less (for me)

We will be using this table: 
SELECT * FROM Person.PersonPhone

And the query is something like either of the two below:
DBCC FREEPROCCACHE
SELECT TOP(1) BusinessEntityId FROM Person.PersonPhone WHERE PhoneNumberTypeID = 1 AND ModifiedDate = '20120523'
DECLARE @b int SET @b = (SELECT TOP(1) BusinessEntityId FROM Person.PersonPhone WHERE PhoneNumberTypeID = 1 AND ModifiedDate = '20120523')

The first one we use if we want to test forced parameterization, which I couldn't get to work without actually returning a resultset.
Run this from SQLCMD, not SSMS, since SSMS has a huge overhead printing headers for all resultsets returned (or configure SSMS to discard the results).

The second query can be used from SSMS, but it won't parameterize even when the database setting is forced.

The plan is simple enough and very cheap.
*/



/**********************************************************************************************
Setup
**********************************************************************************************/

USE Adventureworks
GO

DROP INDEX IF EXISTS PhoneNumberTypeID ON Person.PersonPhone
DROP INDEX IF EXISTS ModifiedDate ON Person.PersonPhone
GO

CREATE INDEX PhoneNumberTypeID ON Person.PersonPhone(PhoneNumberTypeID)
CREATE INDEX ModifiedDate ON Person.PersonPhone(ModifiedDate)
GO

/**********************************************************************************************
A version that doesn't return a resultset; can be used from SSMS.
It will NOT be parameterized with the database setting "Forced Parameterization"
**********************************************************************************************/
CREATE OR ALTER PROC CreateAdHocPlansNoResultset
 @noOfPlans int = 20000
AS
DECLARE 
 @i int = 1
,@sql nvarchar(1000)
,@dt datetime2(1) = SYSDATETIME()
,@ModifiedDate char(8) = CONVERT(char(8), DATEADD(YEAR, -10, GETDATE()), 112)

SET NOCOUNT ON

WHILE @i <= @noOfPlans
BEGIN
 SET @sql = 'DECLARE @b int SET @b = (SELECT TOP(1) BusinessEntityId FROM Person.PersonPhone WHERE PhoneNumberTypeID = ' + CAST(@i AS varchar(5)) + ' AND ModifiedDate = ''' + @ModifiedDate + ''')'
-- PRINT @sql
 EXEC(@sql)
 SET @i +=1
END

SELECT DATEDIFF(ms, @dt, SYSDATETIME())
GO

/**********************************************************************************************
--A version that DOES return an (empty) resultset, and that will be parameterized.
**********************************************************************************************/
CREATE OR ALTER PROC CreateAdHocPlansWithResultset
 @noOfPlans int = 20000
AS
DECLARE 
 @i int = 1
,@sql nvarchar(1000)
,@dt datetime2(1) = SYSDATETIME()
,@ModifiedDate char(8) = CONVERT(char(8), DATEADD(YEAR, -10, GETDATE()), 112)

SET NOCOUNT ON

WHILE @i <= @noOfPlans
BEGIN
 SET @sql = 'SELECT TOP(1) BusinessEntityId FROM Person.PersonPhone WHERE PhoneNumberTypeID = ' + CAST(@i AS varchar(5)) + ' AND ModifiedDate = ''' + @ModifiedDate + ''''
-- PRINT @sql
 EXEC(@sql)
 SET @i +=1
END

SELECT DATEDIFF(ms, @dt, SYSDATETIME())
GO


/**********************************************************************************************
Test

Run with below combinations:
	Optimize	Force
a	Off			Off
b	On			Off
c	Off			On
d	On			On

Time each execution, and check memory usage for plans
**********************************************************************************************/

--a
EXEC sp_configure 'optimize for ad hoc workloads', 0
RECONFIGURE
ALTER DATABASE Adventureworks SET PARAMETERIZATION SIMPLE
GO
--42 seconds, 1640 MB memory usage

--b
EXEC sp_configure 'optimize for ad hoc workloads', 1
RECONFIGURE
ALTER DATABASE Adventureworks SET PARAMETERIZATION SIMPLE
GO
--39 seconds,   13 MB memory usage

--c
EXEC sp_configure 'optimize for ad hoc workloads', 0
RECONFIGURE
ALTER DATABASE Adventureworks SET PARAMETERIZATION FORCED
GO
-- 3.2 seconds, 470 MB memory usage (the actual text representation of each query, I believe)

--d
EXEC sp_configure 'optimize for ad hoc workloads', 1
RECONFIGURE
ALTER DATABASE Adventureworks SET PARAMETERIZATION FORCED
GO
-- 3.3 seconds, 3.4 MB memory usage


/*
Run the tests and check the execution time, with above combination settings. 
Use either SQLCMD (adjust servername) or straight T-SQL from SSMS, but then the query won't be parameterized.
SQLCMD /S <servername> -d Adventureworks /Q"DBCC FREEPROCCACHE EXEC CreateAdHocPlansWithResultset @NoOfPlans = 30000" -h-1
*/
DBCC FREEPROCCACHE EXEC CreateAdHocPlansNoResultset @NoOfPlans = 30000


--Investigate plan cache

/*
I have my own proc for this:
EXEC sp_showplans
*/

--Break-down per how many times the plans have been executed
SELECT 
 c.usecounts
,COUNT(*) AS antal
,SUM(c.size_in_bytes / (1024 * 1024.0)) AS size_in_MB
FROM sys.dm_exec_cached_plans AS c
GROUP BY c.usecounts
ORDER BY usecounts ASC

--Break-down per database
SELECT 
 c.usecounts
,DB_NAME(CAST(a.value AS int)) AS db_name_
,COUNT(*) AS antal
,SUM(c.size_in_bytes / (1024 * 1024.0)) AS size_in_MB
FROM sys.dm_exec_cached_plans AS c
 CROSS APPLY sys.dm_exec_plan_attributes(c.plan_handle) AS a
 WHERE a.attribute = 'dbid'
GROUP BY c.usecounts, DB_NAME(CAST(a.value AS int))
ORDER BY usecounts ASC

Turning off identity cache comes with a performance penalty

Have you ever experienced a sudden jump in an identity column in SQL Server? Yeah, SQL Server caches identity values, and if you have a hard shutdown, you lose a number of values from that cache.

Should you care about the gap? In most cases: no. The identity value should be meaningless. In many cases I think that it is just an aesthetic issue to not have these gaps. (I’ve seen cases where you do run into problems because of the gap, I should add – but not frequently.)

For the SEQUENCE object, we have the CACHE option to specify how many values to cache, i.e., the maximum number of values we can lose in a hard shutdown.
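For instance, a sequence created like the sketch below (the name is made up) can lose at most 50 values over a hard shutdown:

CREATE SEQUENCE dbo.OrderNumber AS int
 START WITH 1
 INCREMENT BY 1
 CACHE 50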

For identity, we have the IDENTITY_CACHE database scoped configuration, introduced in SQL Server 2017: caching on or off, with on as the default. We also have trace flag 272, at the instance level.

However, disabling the caching isn’t free. I’ve seen numerous posts about disabling the cache to avoid gaps, but I didn’t find any elaboration on performance. Hence this article. If you’ve done performance testing for this or know of such an article, please leave a comment!

The problem is, as always, how to set up a realistic test case. What would the typical table structure and insert pattern look like? How long is a piece of string? So I decided to skip that part and go for the most simplified case I could:

A table with an identity column and one more int column. (The reason for that second column is to hopefully avoid any special handling of identity-only tables in SQL Server.) The insert is an INSERT with a subselect, i.e., all rows in one transaction.

You know this already: your mileage will vary!

Anyhow, for my test, I consistently had about 5 times better performance with identity caching turned on. IMO, this is at least enough to show that caching identity values *can* have a performance aspect to it; don’t turn it off “just because”.

--Test performance penalty for lowering identity cache
SET NOCOUNT ON
USE master

DROP DATABASE IF EXISTS Identity_test
GO

CREATE DATABASE Identity_test
 ON  PRIMARY 
( NAME = N'Identity_test', FILENAME = N'C:\DemoDatabases\DbFiles\a\Identity_test.mdf' , SIZE = 500MB , MAXSIZE = UNLIMITED, FILEGROWTH = 64MB )
 LOG ON 
( NAME = N'Identity_test_log', FILENAME = N'C:\DemoDatabases\DbFiles\a\Identity_test_log.ldf' , SIZE = 500MB , MAXSIZE = 2048GB , FILEGROWTH = 64MB )
GO

USE Identity_test
DROP TABLE IF EXISTS t

--Our identity table
CREATE TABLE t(c1 int IDENTITY(1,1), c2 int)

--Where we will log execution times
CREATE TABLE timing(run int, descr varchar(50), ms int)


ALTER DATABASE SCOPED CONFIGURATION SET IDENTITY_CACHE = ON
GO
WAITFOR DELAY '00:00:01'
GO
DECLARE @a time(7) = SYSDATETIME()
INSERT INTO t
SELECT TOP(1000000) 1 FROM sys.columns AS a CROSS JOIN sys.columns AS b
INSERT INTO timing (run, descr, ms) 
VALUES(1, 'Cache on', DATEDIFF(ms, @a, CAST(SYSDATETIME() AS time(7))))
TRUNCATE TABLE t
GO

WAITFOR DELAY '00:00:01'
GO

ALTER DATABASE SCOPED CONFIGURATION SET IDENTITY_CACHE = OFF
GO
WAITFOR DELAY '00:00:01'
GO
DECLARE @a time(7) = SYSDATETIME()
INSERT INTO t
SELECT TOP(1000000) 1 FROM sys.columns AS a CROSS JOIN sys.columns AS b
INSERT INTO timing (run, descr, ms) 
VALUES(2, 'Cache off', DATEDIFF(ms, @a, CAST(SYSDATETIME() AS time(7))))
TRUNCATE TABLE t
GO


ALTER DATABASE SCOPED CONFIGURATION SET IDENTITY_CACHE = ON
GO
WAITFOR DELAY '00:00:01'
GO
DECLARE @a time(7) = SYSDATETIME()
INSERT INTO t
SELECT TOP(1000000) 1 FROM sys.columns AS a CROSS JOIN sys.columns AS b
INSERT INTO timing (run, descr, ms) 
VALUES(3, 'Cache on', DATEDIFF(ms, @a, CAST(SYSDATETIME() AS time(7))))
TRUNCATE TABLE t
GO

WAITFOR DELAY '00:00:01'
GO

ALTER DATABASE SCOPED CONFIGURATION SET IDENTITY_CACHE = OFF
GO
WAITFOR DELAY '00:00:01'
GO
DECLARE @a time(7) = SYSDATETIME()
INSERT INTO t
SELECT TOP(1000000) 1 FROM sys.columns AS a CROSS JOIN sys.columns AS b
INSERT INTO timing (run, descr, ms) 
VALUES(4, 'Cache off', DATEDIFF(ms, @a, CAST(SYSDATETIME() AS time(7))))
TRUNCATE TABLE t
GO


ALTER DATABASE SCOPED CONFIGURATION SET IDENTITY_CACHE = ON
GO
WAITFOR DELAY '00:00:01'
GO
DECLARE @a time(7) = SYSDATETIME()
INSERT INTO t
SELECT TOP(1000000) 1 FROM sys.columns AS a CROSS JOIN sys.columns AS b
INSERT INTO timing (run, descr, ms) 
VALUES(5, 'Cache on', DATEDIFF(ms, @a, CAST(SYSDATETIME() AS time(7))))
TRUNCATE TABLE t
GO

WAITFOR DELAY '00:00:01'
GO

ALTER DATABASE SCOPED CONFIGURATION SET IDENTITY_CACHE = OFF
GO
WAITFOR DELAY '00:00:01'
GO
DECLARE @a time(7) = SYSDATETIME()
INSERT INTO t
SELECT TOP(1000000) 1 FROM sys.columns AS a CROSS JOIN sys.columns AS b
INSERT INTO timing (run, descr, ms) 
VALUES(6, 'Cache off', DATEDIFF(ms, @a, CAST(SYSDATETIME() AS time(7))))
--TRUNCATE TABLE t
GO


--Result?
SELECT * FROM timing ORDER BY run

SELECT 
 AVG(CASE descr WHEN 'cache on' THEN ms ELSE NULL END) AS cache_on
,AVG(CASE descr WHEN 'cache off' THEN ms ELSE NULL END) AS cache_off
,CAST(
   CAST(AVG(CASE descr WHEN 'cache off' THEN ms ELSE NULL END) AS decimal(9,2)) / 
   AVG(CASE descr WHEN 'cache on' THEN ms ELSE NULL END) 
  AS decimal(9,2)) AS ratio
FROM timing

Scalar functions and improvements in SQL Server 2019

Now that SQL Server 2019 is released, I want to discuss one of the more interesting improvements, at least from a performance perspective. I believe that this alone can result in greatly improved performance and reduced resource usage. For some installations, I should add.

Just a couple of quick notes:

  • This post is about scalar UDFs that do SELECT. A forthcoming post will be about scalar functions that don’t do SELECT.
  • Jump towards the end to see a table with the performance differences. That is why you are here, after all! Then you can come back and read the introduction paragraphs.

For a couple of decades, “we all” have known that scalar functions are bad. Really bad. To the extent that they are verboten in many installations. Why are they so bad? For two reasons:

  1. The function is called once per row. Imagine if you have a SELECT in there, and the function is called 100,000 times. You have now executed that SELECT 100,000 times.
  2. Also, the overhead of calling the function is significant. So, even if it does something very simple, like adding two numbers, the overhead of calling it many times can be huge.

Don’t believe me? Read on.

I don’t want to get too theoretical regarding how this works, digging into execution plans etc. But the big thing here in SQL Server 2019 is that SQL Server can inline those functions for you! I won’t go into the prerequisites for this; just check out the product documentation. You will also see that this requires database compatibility level 150 (2019) and that the database option for this isn’t turned off.

I’m using the AdventureworksDW database, and two tables:

  • DimProduct, 606 rows, 6 MB.
  • FactResellerSalesXL, approx 11,000,000 rows. I have three versions of this.
    • Row-based clustered index. 2.48 GB.
    • Same as above but with page compression: 0.68 GB. I don’t use this table for my tests here.
    • Same as the first with a clustered columnstore index: 0.48 GB

UDF with I/O

Let’s start with a UDF that performs I/O. It is dead-simple, it will just summarize the sales amount from the fact table for a certain product. Here’s the first version:

CREATE OR ALTER FUNCTION ProdSalesAmt(@ProductKey int)
RETURNS int
AS
BEGIN
DECLARE @ret int
SET @ret =
(
SELECT SUM(d.UnitPrice * d.OrderQuantity)
FROM dbo.FactResellerSalesXL AS d
WHERE d.ProductKey = @ProductKey
)
IF @ret IS NULL
SET @ret = 0
RETURN @ret
END
GO

You could argue that the function can be simplified, so here is such a simplified version. The idea is to get rid of all procedural code constructs:

CREATE OR ALTER FUNCTION ProdSalesAmt_Simple(@ProductKey int)
RETURNS int
AS
BEGIN
RETURN (
SELECT ISNULL(SUM(d.UnitPrice * d.OrderQuantity), 0)
FROM dbo.FactResellerSalesXL AS d
WHERE d.ProductKey = @ProductKey
)
END
GO

Some of you know the trick of not using a scalar UDF but an inline table-valued function instead. The calling query has to use CROSS APPLY or OUTER APPLY, so the optimization isn’t transparent to the client code. Anyhow, here is such a version:

CREATE OR ALTER FUNCTION ProdSalesAmt_TVF(@ProductKey int)
RETURNS table
AS
RETURN (
SELECT ISNULL(SUM(d.UnitPrice * d.OrderQuantity), 0) AS SumSales
FROM dbo.FactResellerSalesXL AS d
WHERE d.ProductKey = @ProductKey
)
GO

I’m going to test with different compatibility levels for the database:

ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 140
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 150

And I’m also going to test with and without in-lining turned on.

ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING  = ON 
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING  = OFF 

Automatic inlining requires compatibility level 150 and that inlining is turned on. I did not see any difference between turning off inlining one way or the other, so I will just report it as inlining turned on or off.
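If you want to check whether a specific function qualifies, sys.sql_modules exposes an is_inlineable column in SQL Server 2019. For example:

SELECT OBJECT_NAME(object_id) AS function_name, is_inlineable
FROM sys.sql_modules
WHERE object_id = OBJECT_ID('dbo.ProdSalesAmt')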

But what if we didn’t use a function in the first place? Sure, functions are a neat programming construct. But neat programming constructs and performance don’t necessarily go hand-in-hand, as we have learned the hard way. So, I will also test a simple join as well as a correlated subquery.

So here are the queries I will use:

-- Q1 Basic scalar function
 SELECT p.ProductKey, p.EnglishProductName, dbo.ProdSalesAmt(p.ProductKey) AS SumSales
 FROM DimProduct AS p
 ORDER BY SumSales

-- Q2 Simplified scalar function
 SELECT p.ProductKey, p.EnglishProductName, dbo.ProdSalesAmt_Simple(p.ProductKey) AS SumSales
 FROM DimProduct AS p
 ORDER BY SumSales

-- Q3 Inline TVF instead of scalar function
 SELECT p.ProductKey, p.EnglishProductName, s.SumSales AS SumSales
 FROM DimProduct AS p
  OUTER APPLY dbo.ProdSalesAmt_TVF(p.ProductKey) AS s
 ORDER BY SumSales

-- Q4 Join instead of calling a function
 SELECT p.ProductKey, p.EnglishProductName, SUM(f.UnitPrice* f.OrderQuantity) AS SumSales
 FROM DimProduct AS p
  LEFT OUTER JOIN FactResellerSalesXL AS f ON f.ProductKey = p.ProductKey
 GROUP BY p.ProductKey, p.EnglishProductName
 ORDER BY SumSales

-- Q5 Correlated subquery
 SELECT p.ProductKey, p.EnglishProductName, 
 ( SELECT SUM(d.UnitPrice * d.OrderQuantity) 
   FROM dbo.FactResellerSalesXL AS d
   WHERE d.ProductKey = p.ProductKey ) AS SumSales
 FROM DimProduct AS p
 ORDER BY SumSales

-- Q6 Join using a table with a columnstore index
 SELECT p.ProductKey, p.EnglishProductName, SUM(f.UnitPrice* f.OrderQuantity) AS SumSales
 FROM DimProduct AS p
  LEFT OUTER JOIN FactResellerSalesXL_CCI AS f ON f.ProductKey = p.ProductKey
 GROUP BY p.ProductKey, p.EnglishProductName
 ORDER BY SumSales

I measured three things: the execution time as shown in SSMS, plus the duration and logical reads captured with an Extended Events trace. I will limit significant figures for duration and logical reads to 3.

Query   Description      Inlining   Time   Duration (ms)   Logical reads
Q1      Basic UDF        N          3:23   204,000         200,000,000
Q1      Basic UDF        Y          0:32   32,100          35,600,000
Q2      Simplified UDF   Y          0:32   32,100          35,600,000
Q3      Inline TVF                  0:32   33,200          35,600,000
Q4      Join                        0:00   692             331,000
Q5      Join corr sub               0:00   708             331,000
Q6      Join col-store              0:00   153             4,960

Conclusions

Let us focus on the time it takes, the “Time” column above. This is the end-user perspective, and is what matters in the very end.

Without inlining, the end-user would have time to grab a cup of coffee and chat with a friend while waiting. This is the pre-2019 behavior, assuming you didn’t do any optimization.

With inlining, the user can swap to the email program and check for new mail while waiting. Prior to 2019, we had to do manual work to get from three and a half minutes to half a minute: re-writing the function and the SQL that called the function. SQL Server 2019 basically does that for us! Note that I didn’t see any difference when simplifying the scalar function to only contain one SELECT.

But we could be smarter than that. For this simple example we could either do a join or a correlated subquery. It might not be that simple in all situations, but there can still be room for query optimizations, evidently. Going from half a minute to half a second is a big thing. Now the end-user will barely notice the execution time.

And imagine if the dev/DBA created a columnstore index to support the query. Now we basically have instant results.

Also consider the resource consumption. I’m thinking about the logical reads column here. I didn’t measure CPU time for this test, but I would expect it to roughly track the execution time and logical reads.

I will discuss a scalar UDF that doesn’t do SELECT (I/O) in a later blog post.

Disclaimers

You can do many more combinations than what I did, for instance scalar inlining with a columnstore index, etc. My goal here was not to test all possible permutations. Please give the others a try if you want!

This was only one example. I’m sure that there are examples where inlining doesn’t work that well, and perhaps also where the benefits are even greater. My aim here was to give it a try, with an example which was simple enough and something that I have seen in the wild (similar cases, that is).

Managing tempdb

This post is all about the tempdb database. It is not about capacity planning, like size or number of data files – there is plenty of info about that out there. Instead, this is about managing it, in the sense of changing initial file size, adding a file, removing a file etc. SSMS isn’t obvious for some of these things for the tempdb database, and for some things SSMS is flat out wrong.

I tested all operations on SQL Server 2017 and SSMS 17.8.1.


Background

Tempdb is re-created at start-up. Or rather, re-initialized (for lack of a better word). SQL Server does not delete the files and create new ones; that would take ages for large files if you don’t have Instant File Initialization (which you never have for log files). So how does SQL Server know what files to create: the logical and physical names, sizes etc.? The answer is sys.master_files.


sys.master_files

This is the “template” information for the tempdb files. I will refer to sys.master_files as the template below (not to be confused with model, which is the template for other things, but not for the file aspects of tempdb). Here you see how the tempdb files will be created/initialized at the next start-up. Don’t confuse this with tempdb.sys.database_files, which shows the current state of the tempdb files. The two queries below will show you the current state for the tempdb files and also the template state:

--Current
SELECT
'tempdb' AS db_name_
,file_id
,name
,physical_name
,size * 8/1024 AS size_MB
,type_desc
,CASE WHEN is_percent_growth = 1 THEN CAST(growth AS varchar(3)) + ' %' ELSE CAST(growth * 8/1024 AS varchar(10)) + ' MB' END AS growth
,max_size * 8/1024 AS max_size_MB
FROM tempdb.sys.database_files
ORDER BY type, file_id

--Template
SELECT
DB_NAME(database_id) AS db_name_
,file_id
,name
,physical_name
,size * 8/1024 AS size_MB
,type_desc
,CASE WHEN is_percent_growth = 1 THEN CAST(growth AS varchar(3)) + ' %' ELSE CAST(growth * 8/1024 AS varchar(10)) + ' MB' END AS growth
,max_size * 8/1024 AS max_size_MB
FROM master.sys.master_files
WHERE DB_NAME(database_id)  = 'tempdb'
ORDER BY db_name_, type, file_id


Increasing current and template file size to larger than what it currently is

This is pretty straight-forward, both using T-SQL directly and also using SSMS. Here’s what it looks like in SSMS:

(The word “Initial” for the header “Initial Size (MB)” is pure nonsense. What you see is the current file size, picked up from sys.database_files. I am happy to see that the word “Initial” is removed in SSMS 18.0.)

To increase the file size, you just type the desired, larger, size for the tempdb file. The T-SQL command to do this is also very straightforward, for example:

ALTER DATABASE tempdb 
 MODIFY FILE ( NAME = N'tempdev', SIZE = 5GB )

ALTER DATABASE tempdb 
 MODIFY FILE ( NAME = N'temp2', SIZE = 5GB )

Note that SSMS will generate the file size in KB for the T-SQL command, but the command accepts KB, MB, GB and even TB.

This operation increases the current file size as well as the template size.


Decreasing current file size

To reduce the current size, you can attempt a shrink operation, for instance using a T-SQL command such as:

USE tempdb
DBCC SHRINKFILE(tempdev, 100)

The second parameter is the desired size in MB. This will change the current size as well as the template size. In my experience, you should be prepared for limited success when shrinking a tempdb file to reduce the current size. In many cases, you will have to re-start SQL Server so the file is created with the template size. Or, hunt down whatever is using it and take it from there.
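As a starting point for that hunt, the space usage DMVs show tempdb allocations per session. A sketch:

SELECT
 su.session_id
,SUM(su.user_objects_alloc_page_count) AS user_object_pages
,SUM(su.internal_objects_alloc_page_count) AS internal_object_pages
FROM tempdb.sys.dm_db_session_space_usage AS su
GROUP BY su.session_id
ORDER BY SUM(su.user_objects_alloc_page_count) + SUM(su.internal_objects_alloc_page_count) DESC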


Changing template file size to smaller than what it currently is

One way is to change the size in the GUI to a smaller size; SSMS will then try to make the current size smaller as well as change the template size. If you don’t want to perform the shrink operation, then specify a larger size than the current size, script the command to a T-SQL query window, and in there just specify whatever size you want. It will generate a command such as:

ALTER DATABASE tempdb 
 MODIFY FILE ( NAME = N'tempdev', SIZE = 100MB )

If the specified size is smaller than the current file size, then it will only change the template, not the current size.


Adding file

This is dead-easy. In SSMS, just press the “Add” button and specify the desired name, size etc. This will generate T-SQL such as:

ALTER DATABASE tempdb 
 ADD FILE ( NAME = N'tempdb3', FILENAME = N'R:\SqlData\a\tempdb_mssql_3.ndf' , SIZE = 5GB , FILEGROWTH = 200MB )

This will add it immediately and also to the template (both sys.database_files and sys.master_files).


Remove file

This is simple enough if the file is empty. In SSMS you select the file and press the “Remove” button.  It will generate T-SQL such as:

ALTER DATABASE tempdb 
 REMOVE FILE tempdb3

But if the file is currently used, you will get an error message and nothing is changed: not the current state, nor the template.

You can always attempt to shrink first using the EMPTYFILE option:

USE tempdb
DBCC SHRINKFILE(tempdb3, EMPTYFILE)

If you’re lucky, then the file was emptied and you can now use the ALTER command with the REMOVE FILE option. If not, then you are back to where you started.

You can try to re-start SQL Server and see if that allows you to remove the file. If not, then perhaps use the “failsafe” startup switch: /f (see Erin Stellato’s blog about this: https://www.sqlskills.com/blogs/erin/remove-files-from-tempdb/ ). Not being connected from SSMS when executing the commands might also help.


Changing other file attributes

If you want to change things like autogrow or maxsize, then just use SSMS or T-SQL directly. Here’s an example T-SQL command to change both autogrow and maxsize:

ALTER DATABASE tempdb 
 MODIFY FILE ( NAME = N'tempdev', MAXSIZE = 500GB , FILEGROWTH = 500MB )


Moving a file

This operation requires a re-start. SSMS doesn’t allow you to do this, so use T-SQL directly, such as:

ALTER DATABASE tempdb 
 MODIFY FILE (NAME = tempdev, FILENAME = 'R:\SqlData\b\tempdb.mdf')

Warning: Henrik reminded me in the comments section to add a warning here. “Make sure that R:\SqlData\b is present and that SQL Server has sufficient rights to create files here!” He is of course correct. In worst case you end up with SQL Server refusing to start if it cannot create tempdb where you have specified. In such case you can try the -f startup switch, perhaps a topic for another blog post.

Now, this is a sneaky one. It changes both sys.database_files and sys.master_files. So, looking only at the catalog views, you would be fooled into thinking that the file was moved immediately. But that didn’t happen; check the disk. Fortunately, the command comes with a message such as:

The file "tempdev" has been modified in the system catalog.
The new path will be used the next time the database is started.

That makes it pretty obvious. Note that the old file will not be removed when you restart your SQL Server. Go delete that file, after verifying that the new file was indeed created (if the “old” one is still the current one, then you won’t be able to delete it as long as SQL Server is running).


Summary

Managing tempdb isn’t that hard. As soon as you understand that we have the current state (tempdb.sys.database_files) and the template (master.sys.master_files), it is easier to understand. And, yes, SSMS is a bit quirky for these operations. The reason, methinks, is that it uses the same GUI as for other databases, which don’t have the aspect of a “template”.

Did I miss any operation? Let me know and I might add it.


Agent no longer terminates on error

It is funny how you keep going, assuming things are as they used to be. Sometimes.

I had this thing with SQL Server Agent and T-SQL job step types. It used to be that such a job step would just stop if an error occurred. This isn’t how other tools like SSMS or SQLCMD behave. So to get the behavior we were used to, we would schedule our SQL using a CmdExec job step and call out to SQLCMD.EXE.

I usually point to Ola Hallengren’s great maintenance scripts as an example of this. I recently noticed that he nowadays uses a T-SQL job step when he calls his stored procedures. That made me curious, and he told me in an email conversation that Agent doesn’t stop on error anymore. Since SQL Server 2012! Hehe, so it has been a while.

I wanted to test this, of course. Not that I doubt Ola, but seeing is believing. So here’s the T-SQL I use in a job step to test this:

RAISERROR('10', 10, 1)
RAISERROR('11', 11, 1)
RAISERROR('12', 12, 1)
RAISERROR('13', 13, 1)
RAISERROR('14', 14, 1)
RAISERROR('15', 15, 1)
RAISERROR('16', 16, 1)
RAISERROR('17', 17, 1)
RAISERROR('18', 18, 1)
PRINT 'Do we get to here?'
SELECT 1/0
PRINT 'What about after division by zero?'

If you run the above from SSMS, you probably see the messages out of sequence. You don’t see that in the output from the Agent job step. I don’t know why that is. I added WITH NOWAIT for the RAISERROR messages (the old trick to flush the output buffer), but that didn’t change the order in which SSMS displays the messages. SQL Server Agent does show the messages in the expected order, however, so I won’t dig further into that.

Bottom line is that Agent doesn’t stop on (normal) errors anymore. I tested this on SQL Server 2017, but I don’t doubt Ola when he says that he tested this back to 2012.

Why can’t we have column alias in ORDER BY?

You probably think something like “What! Of course we can!”. And so did I when I read the title for a post about this in a forum. Let us first establish that you can have a column alias in ORDER BY:

SELECT
p.ProductID AS pid
,p.ListPrice AS pr
FROM Production.Product AS p
ORDER BY pr

The problem was that a table name was also in the ORDER BY clause. And this is when it becomes weird. The problem isn’t the column alias, it is the table alias. Basically the desire is to write something like:

SELECT
p.ProductID AS pid
,p.ListPrice AS pr
FROM Production.Product AS p
ORDER BY p.pr

Above was apparently allowed in SQL Server 2000 (I never did it myself, however). (Update: Erland Sommarskog caught my error; this was allowed in 2000, not 2005.) In order to understand why the above is backwards, we need to consider the logical query flow. The first part of a query to be executed is the FROM clause (logically, that is; the optimizer can do as it pleases as long as we get the same result). In the above example, I renamed the table (for the rest of the query) from “Production.Product” to “p”. That means that we cannot refer to “Product” or “Production.Product” anymore in the query. However, SQL has imperfections, and the topic of this blog is an attempt from MS to rectify some such earlier imperfections, i.e., improving the product.

Continuing the logical query flow (WHERE, GROUP BY, HAVING), towards the end we find the SELECT clause. This is where you can rename the columns (a.k.a. column alias) for the rest of the query. After the SELECT list, it is meaningless to talk about different tables (where the data originally comes from); all we have is the result of the query (so far). This is why we can, in the ORDER BY, refer to a column alias (generally); ORDER BY comes logically after the SELECT list.
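As a sketch, here is the same query annotated with the logical processing order, showing where the table alias takes over and where the column alias becomes visible:

SELECT                             --5. Column aliases are born here
 p.ProductID AS pid
,p.ListPrice AS pr
FROM Production.Product AS p       --1. Table renamed to "p" from here on
WHERE p.ListPrice > 0              --2. (then GROUP BY and HAVING as 3 and 4)
ORDER BY pr                        --6. Column alias visible; the table alias is logically gone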

However, the original problem was that one also wanted to include a table alias, which shouldn’t be allowed. So why is it allowed in the first place, considering it makes no sense from a logical standpoint? For that, we need to look at something weird which is allowed (both in T-SQL and ANSI SQL) and which can be somewhat useful:

SELECT
ProductID AS pid
,ListPrice AS pr
FROM Production.Product
ORDER BY ModifiedDate

We don’t need to return something to the client app in order to sort over it. Sure, that can be useful, but from a logical perspective, this is spooky. We sort over something which doesn’t exist! So, the imperfections of the SQL language are the reason things start to break down. Basically, what we do is sort over something as if we had included it in the SELECT list. We are saying, in the ORDER BY clause, something like:

“Hey, DBMS! Can you please pretend we included Product.ModifiedDate in the SELECT list so we can sort over it. But don’t return it to the client app, I don’t want to see that column.”

Where it really breaks down is when we combine table and column alias. Let me repeat the problematic query:

SELECT
p.ProductID AS pid
,p.ListPrice AS pr
FROM Production.Product AS p
ORDER BY p.pr

The problem is that it is meaningless to specify a table alias here; the column does exist in the SELECT list. The only reason we can refer to anything table-qualified at all is to let the DBMS pretend we included something additional from the base tables, so we could sort over it. Such as (allowed):

SELECT
ProductID AS pid
,ListPrice AS pr
FROM Production.Product AS p
ORDER BY p.ModifiedDate

One could of course argue that below would be more logical (not allowed):

SELECT
ProductID AS pid
,ListPrice AS pr
FROM Production.Product AS p
ORDER BY Product.ModifiedDate

Arguing that one of the two forms above is more logical than the other is pretty irrelevant, however. Both are weird constructs, and the language is what it is. Anyone feel like checking the ANSI SQL standard to see which of the above two (first, second or both) is allowed/preferred? Not me, I’m off to grab some breakfast!


Adding a PK online?

I just read in a forum about a user who wants to replicate a table, but the table doesn’t have a PK. The table is pretty large, and having the table unavailable while adding the PK is undesirable. The table has a clustered index already, and there are other columns which are known to be unique (presence of unique indexes).

What I wanted to test is whether we can just add the PK constraint using the ONLINE option. Short answer: yes. We can’t turn a unique index into a PK using some metadata-only operation, unfortunately; that would have been the easiest step. But we can add a unique constraint using the ONLINE option (there’s even example syntax for this in BOL). We can then remove the pre-existing unique index using ONLINE. Since we are using ONLINE, we need to be on Enterprise or Developer Edition.

I wanted to test this, and below is my test script:

USE tempdb
SET NOCOUNT ON
GO

IF OBJECT_ID('t') IS NOT NULL DROP TABLE t
GO
CREATE TABLE t(c1 INT NOT NULL, c2 CHAR(100))
CREATE UNIQUE CLUSTERED INDEX uq_c1 ON t(c1)

INSERT INTO t
SELECT TOP(5000000) ROW_NUMBER() OVER(ORDER BY a.id), 'x'
FROM syscolumns AS a
CROSS JOIN syscolumns AS b
CROSS JOIN syscolumns AS c
GO

-----------------------------------------------------
--Now try to add a PK "online"...:
-----------------------------------------------------

--Add a nullable identity?
ALTER TABLE t ADD c3 INT IDENTITY NULL
--Msg 8147, Level 16, State 1, Line 1
--Could not create IDENTITY attribute on nullable column 'c3', table 't'.
GO

--Add a PK using ONLINE?
--Prepare a new connection with following INSERTs
--to verify it can run simultaneously:
--INSERT INTO t(c1, c2) VALUES(5000001, 't')
--INSERT INTO t(c1, c2) VALUES(5000001, 't')
--INSERT INTO t(c1, c2) VALUES(5000002, 't')
--GO
--INSERT INTO t(c1, c2) VALUES(5000003, 't')

--Above prepared? OK, execute below and jump to
--other window to verify it is online
ALTER TABLE t ADD CONSTRAINT PK_t PRIMARY KEY NONCLUSTERED (c1) WITH(ONLINE = ON)
GO

--Verify the indexes using my own sp_indexinfo
EXEC sp_indexinfo 't'

Is there an overhead to RPC events?

I recently read a discussion whether RPC events add overhead compared to just submitting text. I got curious and did some testing, which I want to share.

Background, SQL events
Using most APIs, if you just submit a query to SQL Server, you will get what we call an SQL event (for instance, in Profiler, an SQL:BatchCompleted). The client app submits some text, SQL Server parses the text, and either creates an execution plan or re-uses a cached plan (if one exists, based on a checksum of the query text submitted). Now, this is very straightforward. But there are also problems, mainly the risk of SQL injection and bad query plan re-use. Your query probably has a WHERE clause, and if you just concatenate the values you have in the WHERE clause, then you will submit different text each time. Different text means different execution plans – bad plan re-use. Also, if you have some malicious user or are under hacker attack, then somebody can inject some query, or part of a query, inside what you think would be only the search value (this is a classic).

Stored procedures as RPC events
Now, consider if we are using stored procedures. We might want to pass the proc name and the values for each parameter, where the parameter values are passed as binary data. Do this right; for instance in ADO, you configure your command object’s CommandType property as CommandType.StoredProcedure (instead of CommandType.Text, which is the default). This causes the batch to be shown in Profiler as an RPC event, for instance RPC:Completed. So, consider executing the same procedure either by constructing a string such as “EXEC myProc @p1 = ‘20080223’, @p2 = ‘Stockholm'” as text (SQL event), or passing it as a stored procedure using parameter objects (RPC event). We expect the RPC event to be more efficient, right? We’ll get to that in a moment.

Queries as RPC events
Now, we can also submit queries, without using stored procedures, so that they are passed as RPC events. We do this by adding at least one parameter to the command object. This causes ADO.NET to execute sp_executesql, parameterizing according to the parameter objects for the command object, and the call shows up as an RPC event. This makes your code safe from SQL injection and allows for better plan re-use. I.e., in most cases a very good thing – second best to using stored procedures!
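On the server side, such a parameterized submission looks something like this (a sketch, based on the query used in the test code below):

EXEC sp_executesql
 N'SELECT status FROM Sales.SalesOrderHeader WHERE SalesOrderID = @theID'
,N'@theID int'
,@theID = 43664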

Non-parameterized queries as RPC events
Now, consider if we don’t want to parameterize our queries (we don’t have a WHERE clause, we just don’t want to, we use some programming layer which doesn’t allow for it, or we are just plain curious – like now). Can we still get RPC events? Yes, by adding a dummy parameter to the command object. We don’t have to use this parameter in the query – just its existence causes an RPC event instead of an SQL event. This allows us to compare SQL events to RPC events and look at only the overhead of the sp_executesql part.

Overhead for using sp_executesql
Below is some VB.NET code which does just that. For the first two, we use exactly the same query for multiple executions and compare timings when doing them as SQL events or RPC events. For the second two, we change which ProductID we search for (without parameterizing the search condition), which causes a different execution plan for each execution. I verified by counting execution plans that the first two use the same execution plan and the second two generate a new plan for each iteration in the loop. Here’s the VB.NET code:

Imports System.Data
Imports System.Data.SqlClient

Module Module1

    Sub Main()

        Dim i As Int32, iterations As Int32 = 20000
        Dim tickcount As Int64
        Dim ProductID As Int32 = 43664
        Dim ProdStatus As Int16
        Dim sql As String

        sql = "SELECT status FROM Sales.SalesOrderHeader WHERE SalesOrderID = "
        Dim cn As New SqlConnection("SERVER=localhost\a;DATABASE=AdventureWorks2008;INTEGRATED SECURITY=TRUE")
        cn.Open()

        Dim utilCmd As New SqlCommand("DBCC FREEPROCCACHE", cn)
        utilCmd.ExecuteNonQuery()

        'Definitions
        '**********************************************************************************************
        'Same query every execution, SQL:BatchCompleted
        Dim cmdAllText As New SqlCommand(sql & "43664", cn)
        cmdAllText.CommandType = CommandType.Text

        'Same query every execution, RPC:Completed (dummy parameter forces the RPC path)
        Dim cmdParmText As New SqlCommand(sql & "43664", cn)
        cmdParmText.CommandType = CommandType.Text
        cmdParmText.Parameters.Add("@dummy", SqlDbType.Int)
        cmdParmText.Parameters("@dummy").Value = -9999

        'Different query every execution, SQL:BatchCompleted
        Dim cmdAllText2 As New SqlCommand()
        cmdAllText2.Connection = cn
        cmdAllText2.CommandType = CommandType.Text

        'Different query every execution, RPC:Completed
        Dim cmdParmText2 As New SqlCommand()
        cmdParmText2.Connection = cn
        cmdParmText2.CommandType = CommandType.Text
        cmdParmText2.Parameters.Add("@dummy", SqlDbType.Int)
        cmdParmText2.Parameters("@dummy").Value = -9999

        'Execution
        '**********************************************************************************************
        Console.WriteLine("Same query every execution, SQL:BatchCompleted")
        tickcount = Environment.TickCount
        For i = 1 To iterations
            ProdStatus = cmdAllText.ExecuteScalar()
        Next
        tickcount = Environment.TickCount - tickcount
        Call PrintOuput(iterations, tickcount)

        Console.WriteLine("Same query every execution, RPC:Completed")
        tickcount = Environment.TickCount
        For i = 1 To iterations
            ProdStatus = cmdParmText.ExecuteScalar()
        Next
        tickcount = Environment.TickCount - tickcount
        Call PrintOuput(iterations, tickcount)

        Console.WriteLine("Different query every execution, SQL:BatchCompleted")
        tickcount = Environment.TickCount
        For i = 1 To iterations
            cmdAllText2.CommandText = sql & i.ToString()
            ProdStatus = cmdAllText2.ExecuteScalar()
        Next
        tickcount = Environment.TickCount - tickcount
        Call PrintOuput(iterations, tickcount)

        Console.WriteLine("Different query every execution, RPC:Completed")
        tickcount = Environment.TickCount
        For i = 1 To iterations
            cmdParmText2.CommandText = sql & i.ToString()
            ProdStatus = cmdParmText2.ExecuteScalar()
        Next
        tickcount = Environment.TickCount - tickcount
        Call PrintOuput(iterations, tickcount)

        Console.ReadLine()

    End Sub

    Sub PrintOuput(ByVal iterations As Int32, ByVal tickcount As Int64)
        Console.WriteLine("Number of executions: " & iterations.ToString)
        Console.WriteLine("Total exec time (ms): " & tickcount)
        Console.WriteLine("Exec time per query (ms): " & tickcount / iterations)
        Console.WriteLine("")
    End Sub

End Module

And here’s a typical execution result from above code:

Same query every execution, SQL:BatchCompleted
Number of executions: 20000
Total exec time (ms): 1357
Exec time per query (ms): 0,06785

Same query every execution, RPC:Completed
Number of executions: 20000
Total exec time (ms): 1513
Exec time per query (ms): 0,07565

Different query every execution, SQL:BatchCompleted
Number of executions: 20000
Total exec time (ms): 2402
Exec time per query (ms): 0,1201

Different query every execution, RPC:Completed
Number of executions: 20000
Total exec time (ms): 14446
Exec time per query (ms): 0,7223

As you can see, there is an overhead to using RPC events and sp_executesql. I.e., if we don’t parameterize our command object, then we only pay, with no gain. The overhead is definitely noticeable where we generate a new plan for each execution. An interesting aspect is that if I remove the call to DBCC FREEPROCCACHE (so we get a new plan for each iteration in the loop, but plans can be re-used from a prior execution of the VB code), then this is much cheaper and closer to the others (about 5.7 seconds instead of 14 seconds). So, we do pay extra for sp_executesql, especially when we generate a new plan.

Is above a weird example?
Yes, it is, and I want to emphasize that. Typically, you will use sp_executesql when you actually parameterize your queries. That will give better plan re-use (and protect from SQL injection). The above is only to show whether sp_executesql has a cost – without considering the benefits.

How about using stored procedures?
Say you are using stored procedures, and wonder how different ways to execute a procedure compare? Let’s have a look. We can do it three ways (probably more, but below are the ones I’m interested in):

  1. Build the proc command and pass it as a string: no parameter objects, no sp_executesql. You are still open to SQL injection. Don’t do this; we only show it for comparison reasons. Profiler will show an SQL:BatchCompleted event with TextData something like “EXEC dbo.GetOrderStatusRes @SalesOrderID = 43664”.
  2. Use parameters for your command object, but don’t define the command object as type CommandType.StoredProcedure. Profiler will show an RPC:Completed event with TextData something like “exec sp_executesql N’EXEC dbo.GetOrderStatusRes @SalesOrderID = @theID’,N’@theID int’,@theID=43664”.
  3. The right way. Define the command object as type CommandType.StoredProcedure, and of course add parameter objects to the command object. Profiler will show an RPC:Completed event with TextData something like:
    “declare @p2 tinyint
    set @p2=5
    exec dbo.GetOrderStatusOut @SalesOrderID=43664,@Status=@p2 output
    select @p2”
    Note that the Profiler tool adds the declare, set and select parts, it is not part of what is submitted from the client app. The client app actually submit the paramaters as binary values. Profiler add this to be nice to us, so we can copy that text and execute it, read it, etc.

Anyhow, here’s the VB code:

Imports System.Data
Imports System.Data.SqlClient

Module Module1

    Sub Main()

        Dim i As Int32, iterations As Int32 = 20000
        Dim tickcount As Int64
        Dim ProductID As Int32 = 43664
        Dim ProdStatus As Int16

        Dim cn As New SqlConnection("SERVER=localhost\a;DATABASE=AdventureWorks2008;INTEGRATED SECURITY=TRUE")
        cn.Open()

        'Create the procedures we will execute
        Dim utilCmd As New SqlCommand("", cn)
        utilCmd.CommandText = "IF OBJECT_ID('dbo.GetOrderStatusRes') IS NOT NULL DROP PROC dbo.GetOrderStatusRes" & vbCrLf & _
                              "IF OBJECT_ID('dbo.GetOrderStatusOut') IS NOT NULL DROP PROC dbo.GetOrderStatusOut"
        utilCmd.ExecuteNonQuery()
        utilCmd.CommandText = "CREATE PROC dbo.GetOrderStatusRes @SalesOrderID int AS SELECT status FROM Sales.SalesOrderHeader WHERE SalesOrderID = @SalesOrderID"
        utilCmd.ExecuteNonQuery()
        utilCmd.CommandText = "CREATE PROC dbo.GetOrderStatusOut @SalesOrderID int, @status tinyint OUT AS SET @status = (SELECT status FROM Sales.SalesOrderHeader WHERE SalesOrderID = @SalesOrderID)"
        utilCmd.ExecuteNonQuery()

        'Run each once so it is in cache
        utilCmd.CommandText = "EXEC dbo.GetOrderStatusRes @SalesOrderID = 43664"
        utilCmd.ExecuteScalar()
        utilCmd.CommandText = "DECLARE @s tinyint EXEC dbo.GetOrderStatusOut @SalesOrderID = 43664, @status = @s OUT"
        utilCmd.ExecuteNonQuery()

        utilCmd.CommandText = "DBCC FREEPROCCACHE"

        'Definitions
        '**********************************************************************************************
        'Exec proc as string, SQL:BatchCompleted
        Dim cmdAllText As New SqlCommand("EXEC dbo.GetOrderStatusRes @SalesOrderID = " & ProductID.ToString(), cn)
        cmdAllText.CommandType = CommandType.Text

        'Parameterize, but still as text, RPC:Completed with sp_executesql
        Dim cmdParmText As New SqlCommand("EXEC dbo.GetOrderStatusRes @SalesOrderID = @theID", cn)
        cmdParmText.CommandType = CommandType.Text
        cmdParmText.Parameters.Add("@theID", SqlDbType.Int)
        cmdParmText.Parameters("@theID").Value = ProductID

        'Parameterize as stored procedure, RPC:Completed
        Dim cmdRpc As New SqlCommand("dbo.GetOrderStatusOut", cn)
        cmdRpc.CommandType = CommandType.StoredProcedure
        Dim prmSalesOrderID As SqlParameter = New SqlParameter("@SalesOrderID", SqlDbType.Int)
        prmSalesOrderID.Value = ProductID
        cmdRpc.Parameters.Add(prmSalesOrderID)
        Dim prmStatus As SqlParameter = New SqlParameter("@Status", SqlDbType.TinyInt)
        prmStatus.Direction = ParameterDirection.Output
        cmdRpc.Parameters.Add(prmStatus)

        'Execution
        '**********************************************************************************************
        utilCmd.ExecuteNonQuery()
        Console.ReadLine()

        Console.WriteLine("Exec proc as string, SQL:BatchCompleted")
        tickcount = Environment.TickCount
        For i = 1 To iterations
            ProdStatus = cmdAllText.ExecuteScalar()
        Next
        tickcount = Environment.TickCount - tickcount
        Call PrintOuput(iterations, tickcount)
        Console.ReadLine()

        utilCmd.ExecuteNonQuery()
        Console.WriteLine("Parameterize, but still as text, RPC:Completed with sp_executesql")
        tickcount = Environment.TickCount
        For i = 1 To iterations
            ProdStatus = cmdParmText.ExecuteScalar()
        Next
        tickcount = Environment.TickCount - tickcount
        Call PrintOuput(iterations, tickcount)
        Console.ReadLine()

        utilCmd.ExecuteNonQuery()
        Console.WriteLine("Parameterize as stored procedure, RPC:Completed")
        tickcount = Environment.TickCount
        For i = 1 To iterations
            cmdRpc.ExecuteNonQuery()
        Next
        tickcount = Environment.TickCount - tickcount
        Call PrintOuput(iterations, tickcount)

        Console.ReadLine()

    End Sub

    Sub PrintOuput(ByVal iterations As Int32, ByVal tickcount As Int64)
        Console.WriteLine("Number of executions: " & iterations.ToString)
        Console.WriteLine("Total exec time (ms): " & tickcount)
        Console.WriteLine("Exec time per query (ms): " & tickcount / iterations)
        Console.WriteLine("")
    End Sub

End Module

And here’s the result from a typical execution:

Exec proc as string, SQL:BatchCompleted
Number of executions: 20000
Total exec time (ms): 1810
Exec time per query (ms): 0,0905

Parameterize, but still as text, RPC:Completed with sp_executesql
Number of executions: 20000
Total exec time (ms): 1700
Exec time per query (ms): 0,085

Parameterize as stored procedure, RPC:Completed
Number of executions: 20000
Total exec time (ms): 1388
Exec time per query (ms): 0,0694

We can see that doing it the “proper” way is cheapest, but there’s not a big difference between the three. The first alternative is not good, though, since we aren’t protected from SQL injection. And since you will then be using parameter objects anyhow, just go ahead and define the CommandType as stored procedure while you’re at it.

Be careful with constraints calling UDFs

You might just not get what you think. I would be surprised if this hasn’t been blogged already, but if so, it is worth repeating. Here’s the deal (example from a forum, slightly re-worked):

I want the values in one column to be unique, assuming the value in another column is 1. Can I use a UDF for that?

On the surface, yes. You can write a UDF to which you pass the value that should be conditionally unique, and in that UDF check how many rows have this value AND otherColumn = 1. If more than 1 row, the function returns 0, else 1 (or something else to signal “OK” or “not OK”). Now you can call this function in a CHECK constraint, something like CHECK(myFunction(uniqueCol) = 1). This will, on the surface, do its job, as long as you INSERT into the table. But if you update a row and only set otherColumn for some row from 0 to 1, then the CHECK constraint will not be checked. The optimizer is smart enough to understand that the update doesn’t change anything that we refer to in our CHECK constraint, so why bother checking the constraint? The end result is that the constraint doesn’t do what we want it to do. Use a trigger instead (or some other method; see the trigger sketch after the repro). Here’s a repro:

USE tempdb
GO
IF OBJECT_ID('t') IS NOT NULL DROP TABLE t
IF OBJECT_ID('t_uq') IS NOT NULL DROP FUNCTION t_uq
GO

CREATE TABLE t(c0 INT, c1 NVARCHAR(50), c2 bit)
GO

CREATE FUNCTION t_uq(@c1 NVARCHAR(50))
RETURNS bit
AS
BEGIN
DECLARE @ret bit
IF (SELECT COUNT(*) FROM t WHERE c1 = @c1 AND c2 = 1) > 1
 SET @ret = 0
ELSE
 SET @ret = 1
RETURN @ret
END
GO

ALTER TABLE t ADD CONSTRAINT t_c CHECK(dbo.t_uq(c1) = 1)

INSERT INTO t(c0, c1, c2) VALUES(1, 'a', 0) --OK
INSERT INTO t(c0, c1, c2) VALUES(2, 'a', 0) --OK
INSERT INTO t(c0, c1, c2) VALUES(3, 'b', 1) --OK
INSERT INTO t(c0, c1, c2) VALUES(4, 'b', 1) --Fails

--So far so good, but watch now:

UPDATE t SET c2 = 1 WHERE c0 = 2
--No error, the constraint doesn't do its job!

--We have invalid data:
SELECT * FROM t
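For completeness, here is what a trigger-based alternative might look like. This is only a sketch (the trigger name is made up); it checks the whole table and rolls back any statement that leaves more than one row per c1 value with c2 = 1:

CREATE TRIGGER t_uq_trg ON t AFTER INSERT, UPDATE
AS
BEGIN
 --Full-table check; good enough for a sketch, refine for large tables
 IF EXISTS (SELECT c1 FROM t WHERE c2 = 1 GROUP BY c1 HAVING COUNT(*) > 1)
 BEGIN
  ROLLBACK TRAN
  RAISERROR('Duplicate c1 values where c2 = 1.', 16, 1)
 END
END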

German translation of my updated datetime article

Frank Kalis has been kind enough to translate my datetime article (http://www.karaszi.com/SQLServer/info_datetime.asp) to German. I updated my article a while ago to reflect the new date- and time-related types in SQL Server 2008, and I just learned that Frank has now finished the translation of my updated article to German. Check it out at:

http://www.insidesql.org/beitraege/datetime/der-ultimative-guide-fuer-die-datetime-datentypen