Now that SQL Server 2019 is released I want to discuss one of the more interesting improvements, at least from a performance perspective. I believe that this alone can result in greatly improved performance and reduced resource usage. For some installations, I should add.
Just a couple of quick notes:
- This post is about scalar UDFs that do SELECT. A forthcoming post will be about scalar functions that don't do SELECT.
- Jump towards the end to see a table with the performance differences. That is why you are here, after all! Then you can come back and read the introduction paragraphs.
For a couple of decades, “we all” have known that scalar functions are bad. Really bad. To the extent that they are verboten in many installations. Why are they so bad? For two reasons:
- The function is called once per row. Imagine if you have a SELECT in there, and the function is called 100,000 times. You have now executed that SELECT 100,000 times.
- The overhead of calling the function is also significant. So, even if it does something very simple, like adding two numbers, the overhead of calling it many times can be huge.
Don’t believe me? Read on.
I don’t want to get too theoretical regarding how this works, digging into execution plans etc. But the big thing here in SQL Server 2019 is that SQL Server can inline those functions for you! I won’t go into the prerequisites for this, just check out the product documentation for those things. You will also see that this requires database compatibility level 150 (2019) and that the database option for this isn’t turned off.
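If you want to check these settings for your own database, you can query the catalog views. This is just a quick sketch using standard SQL Server metadata:

-- Check the compatibility level of the current database
SELECT name, compatibility_level
FROM sys.databases
WHERE name = DB_NAME()

-- Check the database scoped configuration for scalar UDF inlining
SELECT name, value
FROM sys.database_scoped_configurations
WHERE name = 'TSQL_SCALAR_UDF_INLINING'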
I’m using the AdventureworksDW database, and two tables:
- DimProduct, 606 rows, 6 MB.
- FactResellerSalesXL, approx 11,000,000 rows. I have three versions of this.
- Row-based clustered index. 2.48 GB.
- Same as above but with page compression: 0.68 GB. I don’t use this table for my tests here.
- Same as the first but with a clustered columnstore index: 0.48 GB.
UDF with I/O
Let’s start with a UDF that performs I/O. It is dead simple: it just summarizes the sales amount from the fact table for a certain product. Here’s the first version:
CREATE OR ALTER FUNCTION ProdSalesAmt(@ProductKey int)
RETURNS int
AS
BEGIN
DECLARE @ret int
SET @ret =
(
SELECT SUM(d.UnitPrice * d.OrderQuantity)
FROM dbo.FactResellerSalesXL AS d
WHERE d.ProductKey = @ProductKey
)
IF @ret IS NULL
SET @ret = 0
RETURN @ret
END
GO
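A quick way to sanity-check the function is to call it for a single product (the ProductKey value below is just an arbitrary example). In SQL Server 2019 you can also ask whether the engine considers the function eligible for inlining, via sys.sql_modules:

-- Call the function for one product (310 is just an example key)
SELECT dbo.ProdSalesAmt(310) AS SumSales

-- Check whether SQL Server 2019 considers the function inlineable
SELECT OBJECT_NAME(object_id) AS FunctionName, is_inlineable
FROM sys.sql_modules
WHERE object_id = OBJECT_ID('dbo.ProdSalesAmt')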
You will probably say that the function can be simplified, so here is such a simplified version. The idea is to get rid of all procedural code constructs:
CREATE OR ALTER FUNCTION ProdSalesAmt_Simple(@ProductKey int)
RETURNS int
AS
BEGIN
RETURN (
SELECT ISNULL(SUM(d.UnitPrice * d.OrderQuantity), 0)
FROM dbo.FactResellerSalesXL AS d
WHERE d.ProductKey = @ProductKey
)
END
GO
Some of you know the trick of not using a scalar UDF but an inline table-valued function (TVF) instead. The calling query then has to use CROSS APPLY (or OUTER APPLY), so the optimization isn’t transparent to the client code. Anyhow, here is such a version:
CREATE OR ALTER FUNCTION ProdSalesAmt_TVF(@ProductKey int)
RETURNS table
AS
RETURN (
SELECT ISNULL(SUM(d.UnitPrice * d.OrderQuantity), 0) AS SumSales
FROM dbo.FactResellerSalesXL AS d
WHERE d.ProductKey = @ProductKey
)
GO
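For reference, the TVF can also be called on its own, which is handy when testing it (again, the ProductKey value is just an example):

-- Call the inline TVF stand-alone for a single product
SELECT SumSales
FROM dbo.ProdSalesAmt_TVF(310)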
I’m going to test with different compatibility levels for the database:
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 140
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 150
And I’m also going to test with and without in-lining turned on.
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = ON
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = OFF
Automatic inlining requires compatibility level 150 and also that inlining is turned on. I did not see any difference depending on how inlining was turned off (lower compatibility level or the database scoped configuration), so I will just report it as inlining turned on or off.
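As a side note, and not something I used in these tests: inlining can also be controlled at a finer grain, with a per-query hint and a per-function option. Roughly like this:

-- Per query: disable inlining for this statement only
SELECT p.ProductKey, dbo.ProdSalesAmt(p.ProductKey) AS SumSales
FROM DimProduct AS p
OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'))

-- Per function: add WITH INLINE = OFF (or ON) to the function definition,
-- between RETURNS int and AS in the CREATE FUNCTION statements above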
But what if we didn’t use a function in the first place? Sure, functions are a neat programming construct. But neat programming constructs and performance don’t necessarily go hand in hand, as we have learned the hard way. So I will also test a plain join as well as a correlated subquery.
So here are the queries I will use:
-- Q1 Basic scalar function
SELECT p.ProductKey, p.EnglishProductName, dbo.ProdSalesAmt(p.ProductKey) AS SumSales
FROM DimProduct AS p
ORDER BY SumSales

-- Q2 Simplified scalar function
SELECT p.ProductKey, p.EnglishProductName, dbo.ProdSalesAmt_Simple(p.ProductKey) AS SumSales
FROM DimProduct AS p
ORDER BY SumSales

-- Q3 Inline TVF instead of scalar function
SELECT p.ProductKey, p.EnglishProductName, s.SumSales AS SumSales
FROM DimProduct AS p
OUTER APPLY dbo.ProdSalesAmt_TVF(p.ProductKey) AS s
ORDER BY SumSales

-- Q4 Join instead of calling a function
SELECT p.ProductKey, p.EnglishProductName, SUM(f.UnitPrice * f.OrderQuantity) AS SumSales
FROM DimProduct AS p
LEFT OUTER JOIN FactResellerSalesXL AS f ON f.ProductKey = p.ProductKey
GROUP BY p.ProductKey, p.EnglishProductName
ORDER BY SumSales

-- Q5 Correlated subquery
SELECT p.ProductKey, p.EnglishProductName,
(
SELECT SUM(d.UnitPrice * d.OrderQuantity)
FROM dbo.FactResellerSalesXL AS d
WHERE d.ProductKey = p.ProductKey
) AS SumSales
FROM DimProduct AS p
ORDER BY SumSales

-- Q6 Join using a table with a columnstore index
SELECT p.ProductKey, p.EnglishProductName, SUM(f.UnitPrice * f.OrderQuantity) AS SumSales
FROM DimProduct AS p
LEFT OUTER JOIN FactResellerSalesXL_CCI AS f ON f.ProductKey = p.ProductKey
GROUP BY p.ProductKey, p.EnglishProductName
ORDER BY SumSales
I measured three things: the execution time as shown in SSMS, plus duration and logical reads captured with an Extended Events trace. I will limit duration and logical reads to three significant figures.
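For reference, the Extended Events session was along these lines (session and file names are just placeholders, not my exact script):

-- Capture duration and logical reads per batch
CREATE EVENT SESSION UdfInliningTrace ON SERVER
ADD EVENT sqlserver.sql_batch_completed
(ACTION (sqlserver.sql_text))
ADD TARGET package0.event_file (SET filename = N'UdfInliningTrace')
GO
ALTER EVENT SESSION UdfInliningTrace ON SERVER STATE = START
GO

Here are the results: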
Query | Description | Inlining | Time (mm:ss) | Duration (ms) | Logical reads |
--- | --- | --- | --- | --- | --- |
Q1 | Basic UDF | N | 3:23 | 204,000 | 200,000,000 |
Q1 | Basic UDF | Y | 0:32 | 32,100 | 35,600,000 |
Q2 | Simplified UDF | Y | 0:32 | 32,100 | 35,600,000 |
Q3 | Inline TVF | - | 0:32 | 33,200 | 35,600,000 |
Q4 | Join | - | 0:00 | 692 | 331,000 |
Q5 | Correlated subquery | - | 0:00 | 708 | 331,000 |
Q6 | Join col-store | - | 0:00 | 153 | 4,960 |
Conclusions
Let us focus on the time it takes, the “Time” column above. This is the end-user perspective, and is what matters in the end.
Without inlining, the end-user would have time to grab a cup of coffee and chat with a friend while waiting. This is the pre-2019 behavior, assuming you didn’t do any optimization.
With inlining, the user can swap to the email program and check for new mail while waiting. Prior to 2019, we had to do manual work to get from three and a half minutes down to half a minute: re-writing the function and the SQL that called it. SQL Server 2019 basically does that for us! Note that I didn’t see any difference when simplifying the scalar function to only contain one SELECT.
But we could be smarter than that. For this simple example we could use either a join or a correlated subquery. It might not be that simple in all situations, but evidently there can still be room for query optimizations. Going from half a minute to half a second is a big thing. Now the end-user will barely notice the execution time.
And imagine if the dev/dba created a columnstore index to support the query. Now we basically have an instant result.
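For reference, creating such a columnstore copy of the fact table could look roughly like this (a sketch, not the exact script behind my FactResellerSalesXL_CCI table):

-- Copy the fact table and add a clustered columnstore index
SELECT *
INTO dbo.FactResellerSalesXL_CCI
FROM dbo.FactResellerSalesXL
GO
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactResellerSalesXL
ON dbo.FactResellerSalesXL_CCI
GO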
Also consider the resource consumption. I’m thinking about the logical reads column here. I didn’t measure CPU time for this test, but I would expect it to reflect the execution time and logical reads to some extent.
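If you want to see CPU time (and I/O) for your own runs without setting up a trace, SET STATISTICS is one simple option:

-- Report CPU time, elapsed time and logical reads per statement in the Messages tab
SET STATISTICS TIME ON
SET STATISTICS IO ON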
I will discuss a scalar UDF that doesn’t do SELECT (I/O) in a later blog post.
Disclaimers
You can do many more combinations than what I did. For instance, scalar inlining with a columnstore index, etc. My goal here was not to test all possible permutations. Please give the others a try if you want!
This was only one example. I’m sure that there are examples where inlining doesn’t work that well. And perhaps also cases where the benefits are even greater. My aim here was to give it a try, with an example that was simple enough and something that I have seen in the wild (similar cases, that is).