Xp_cmdshell and permissions

This blog post was inspired by a newsgroup discussion. The question, basically, is:
What do you need to do in order to use xp_cmdshell?

Note that there are obvious security implications of doing this. (I’m not recommending usage of xp_cmdshell in general; this is a technical blog post!) We first need to think about what happens here, at an architectural level:

Somebody who has logged in to SQL Server executes xp_cmdshell. For this, SQL Server needs to spawn a process in Windows, and a process in Windows needs to execute under a Windows user account.

So, what Windows account is used? If the SQL Server login executing xp_cmdshell is sysadmin, then SQL Server will use the service account (it will not “pretend to be somebody else”). But if the login isn’t sysadmin, then we need to configure which Windows account should be used (using sp_xp_cmdshell_proxy_account). Note that this configuration is the same for all non-sysadmins!

But there’s a little bit more to it. Below is an outline of what needs to be done. Steps 2 and 3 are only needed if the one who is to execute xp_cmdshell isn’t sysadmin. Note that the steps don’t have to be performed in the order listed below.

  1. We need to allow usage of xp_cmdshell in general (on 2005). Use “Surface Area Configuration” or sp_configure for this.
  2. We need to have a user in the master database which has execute permission on xp_cmdshell. If you are uncertain about the difference between logins and users, you should read up about it in BOL.
  3. We need to say what Windows account should be used when a non-sysadmin user is executing xp_cmdshell.

So, here’s the TSQL script that does all of the above:

--1, allow xp_cmdshell
EXEC sp_configure 'xp_cmdshell', 1
RECONFIGURE
GO

--2, grant permission to xp_cmdshell
USE master
CREATE LOGIN JohnDoe WITH PASSWORD = 'jlkw#.6('

--Note, we are in the master database!!!
CREATE USER JohnDoe FROM LOGIN JohnDoe

--Run as login JohnDoe
EXECUTE AS login = 'JohnDoe'
--Below fails, no execute permission on xp_cmdshell
EXEC xp_cmdshell 'DIR C:\*.*'
REVERT
GO

--Note, we are in the master database!!!
GRANT EXECUTE ON xp_cmdshell TO JohnDoe

--Try again
EXECUTE AS login = 'JohnDoe'
--Execution of xp_cmdshell is allowed.
--But I haven't configured the proxy account...
EXEC xp_cmdshell 'DIR C:\*.*'
REVERT
GO

 

--3, specify the proxy account for non-sysadmins
--Replace obvious parts!
EXEC sp_xp_cmdshell_proxy_account 'Domain\WinAccount', 'pwd'
EXECUTE AS login = 'JohnDoe'
--Execution of xp_cmdshell is allowed.
--And executes successfully!!!
EXEC xp_cmdshell 'DIR C:\*.*'
REVERT

 

--Cleanup
EXEC sp_xp_cmdshell_proxy_account null

DROP USER JohnDoe
DROP LOGIN JohnDoe
EXEC sp_configure 'xp_cmdshell', 0
RECONFIGURE

 

Bulk logged recovery model and log backups

This post is inspired by a discussion about the bulk logged recovery model and log backups. Some basics:

In bulk logged recovery model, some operations can be minimally logged, like:

  • SELECT INTO
  • Index create, rebuild and drop
  • Bulk loading of data

The interesting thing is that you can do a log backup after such a minimally logged operation. This log backup will contain both the log records and the data and index pages affected by the minimally logged operation.

You cannot do a point in time restore using STOPAT or similar for such a log backup (a log backup containing both log records and data pages), for obvious reasons. You can do a point in time restore of subsequent log backups, however, assuming that log backup doesn’t contain data pages (no bulk logged operation was done during the time span that the log backup covers).
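To make the point-in-time part concrete, here is a minimal sketch of a STOPAT restore sequence. The database name, backup file names and the timestamp are hypothetical, and it assumes the log backup we stop inside does not cover any minimally logged operation:

--Restore the database backup and the first log backup without recovering
RESTORE DATABASE MyDb FROM DISK = 'C:\MyDb.bak' WITH NORECOVERY
--This log backup covers the minimally logged operation, so no STOPAT here
RESTORE LOG MyDb FROM DISK = 'C:\MyDb_log1.trn' WITH NORECOVERY
--A later log backup with no minimally logged operation: STOPAT is allowed
RESTORE LOG MyDb FROM DISK = 'C:\MyDb_log2.trn'
WITH STOPAT = '20070801 12:00:00', RECOVERY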

You cannot do a log backup while in bulk logged mode if a bulk logged operation has occurred and the data files are lost. This is also pretty obvious, since SQL Server cannot go and grab the data pages it needs – the data files are lost!

But what if we have several filegroups, are in the bulk logged recovery model and have performed a bulk logged operation, but the bulk logged operation doesn’t affect the data file which is lost? Theoretically, SQL Server could allow such a log backup. So, let’s give it a try:

In the script below, we have two filegroups: PRIMARY and SEC. We do a bulk logged operation which affects only PRIMARY, and then delete the file for the secondary filegroup. Then we try a log backup. This fails, so the answer is that all data files need to be accessible in order to perform a log backup (in the bulk logged recovery model and after a minimally logged operation). The reason becomes more obvious if we first look at the messages from a successful such log backup:

Processed 72 pages for database 'BulkBackup', file 'BulkBackup' on file 1.
Processed 8 pages for database 'BulkBackup', file 'BulkBackup2' on file 1.
Processed 58 pages for database 'BulkBackup', file 'BulkBackup_log' on file 1.
BACKUP LOG successfully processed 138 pages in 0.098 seconds (11.530 MB/sec).

The file BulkBackup2 is unaffected by the bulk logged operation, but SQL Server still goes and grabs 8 pages from that file. The number 8 makes me suspect that this is the file header (the first 8 pages of each file), needed for some reason. (I get the same error with both NO_TRUNCATE and CONTINUE_AFTER_ERROR.)

The findings are also implied by Books Online. Here’s a quote:

“If the database is damaged, for example, if the database does not start, a tail-log backup succeeds only if the log files are undamaged, the database is in a state that supports tail-log backups, and the database does not contain any bulk-logged changes.”

The above does not talk about individual files; it addresses the whole database.

Finally, here’s the script. Read it so you understand it before you execute it. It will drop a database named BulkBackup, if such exists. You can change the recovery model to FULL if you want to verify that a log backup is indeed possible for a damaged database (in general terms…).

USE master
IF DB_ID('BulkBackup') IS NOT NULL DROP DATABASE BulkBackup
GO
CREATE DATABASE BulkBackup
ON
PRIMARY
(NAME = N'BulkBackup', FILENAME = N'C:\BulkBackup.mdf', SIZE = 20MB, FILEGROWTH = 10MB),
FILEGROUP SEC
(NAME = N'BulkBackup2', FILENAME = N'C:\BulkBackup2.ndf', SIZE = 20MB, FILEGROWTH = 10MB)
LOG ON
(NAME = N'BulkBackup_log', FILENAME = N'C:\BulkBackup_log.ldf', SIZE = 20MB, FILEGROWTH = 10MB)
GO
ALTER DATABASE BulkBackup SET RECOVERY BULK_LOGGED
ALTER DATABASE BulkBackup SET AUTO_CLOSE ON

USE BulkBackup
CREATE TABLE t(c1 int identity, c2 char(5)) ON SEC
INSERT INTO t SELECT TOP 10000 'hello' FROM syscolumns a CROSS JOIN syscolumns b
CREATE TABLE tSec(c1 int identity, c2 char(5)) ON "PRIMARY"
INSERT INTO tSec SELECT TOP 10000 'hello' FROM syscolumns a CROSS JOIN syscolumns b
BACKUP DATABASE BulkBackup TO DISK = 'nul'

--Bulk logged operation
SELECT * INTO tSec2 FROM tSec

--Verify log backup possible
--after bulk logged operation
BACKUP LOG BulkBackup TO DISK = 'nul'

--Bulk logged operation again
SELECT * INTO tSec3 FROM tSec

--Crash secondary database file
USE master
GO
WAITFOR DELAY '00:00:05'
EXEC sp_configure 'xp_cmdshell', 1
RECONFIGURE
EXEC xp_cmdshell 'DEL C:\BulkBackup2.ndf'
EXEC sp_configure 'xp_cmdshell', 0
RECONFIGURE

--Can we now do a log backup?
BACKUP LOG BulkBackup TO DISK = 'nul' WITH NO_TRUNCATE
--Apparently not. Error message is:

--BACKUP LOG is not possible because bulk logged changes exist in
--the database and one or more filegroups are unavailable.

--Cleanup:
GO
IF DB_ID('BulkBackup') IS NOT NULL DROP DATABASE BulkBackup

 

Are statistics over non-indexed columns updated by index rebuild?

Short answer: no.

This question came up today in the MCT discussion group. My gut instinct said no, but I wanted to test it to be certain. But first a brief background:

You can rebuild an index using DBCC DBREINDEX or, in 2005 (and now preferred), ALTER INDEX … REBUILD. Rebuilding an index internally creates a new index and, when that has been done, drops the old index.

So it is pretty obvious that we also get new statistics for that index (based on all data, not sampled, just as when we do CREATE INDEX). As an aside, reorganizing does *not* update the statistics…

But what about statistics over non-indexed columns? SQL Server can create these by itself, assuming you didn’t turn off that database option (AUTO_CREATE_STATISTICS). These are named something like _WA_Sys. You can also create them explicitly using CREATE STATISTICS.
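As a reference, here is a small query that lists the statistics on a table and shows which ones were auto-created. This is just a sketch using the sys.stats, sys.stats_columns and sys.columns catalog views; the table name t matches the script below:

--List statistics on the table; auto_created = 1 marks the _WA_Sys statistics
SELECT s.name AS stats_name, s.auto_created, s.user_created, c.name AS column_name
FROM sys.stats AS s
INNER JOIN sys.stats_columns AS sc
ON s.object_id = sc.object_id AND s.stats_id = sc.stats_id
INNER JOIN sys.columns AS c
ON sc.object_id = c.object_id AND sc.column_id = c.column_id
WHERE s.object_id = OBJECT_ID('t')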

A few words about the script below: I wanted to use the STATS_DATE function to retrieve the datetime for when the statistics were built/updated. But STATS_DATE doesn’t seem to work on statistics-only objects; it expects the id of an index… So, this is why I use DBCC SHOW_STATISTICS instead. And, unfortunately, DBCC SHOW_STATISTICS only displays the statistics build time with minute precision. This is why I have a WAITFOR of slightly more than a minute in between the operations.

I got the same result whether I rebuilt a clustered or a non-clustered index on the table, or even when specifying ALL indexes. Script:

USE tempdb
SET NOCOUNT ON
IF OBJECT_ID('t') IS NOT NULL DROP TABLE t
CREATE TABLE t(c1 int identity, c2 char(5))
INSERT INTO t (c2)
SELECT TOP 10000 'Hello' FROM syscolumns a, syscolumns b

CREATE CLUSTERED INDEX x1 ON t(c2)
CREATE STATISTICS s1 ON t(c1)

SELECT ' ' AS "Before mass modification"
DBCC SHOW_STATISTICS('t', 'x1') WITH STAT_HEADER, NO_INFOMSGS
DBCC SHOW_STATISTICS('t', 's1') WITH STAT_HEADER, NO_INFOMSGS

WAITFOR DELAY '00:01:02'

INSERT INTO t (c2)
SELECT TOP 10000 'Hi' FROM syscolumns a, syscolumns b
SELECT ' ' AS "Before index rebuild"
DBCC SHOW_STATISTICS('t', 'x1') WITH STAT_HEADER, NO_INFOMSGS
DBCC SHOW_STATISTICS('t', 's1') WITH STAT_HEADER, NO_INFOMSGS

WAITFOR DELAY '00:01:02'

--ALTER INDEX x1 ON t REBUILD
ALTER INDEX ALL ON t REBUILD
SELECT ' ' AS "After index rebuild"
DBCC SHOW_STATISTICS('t', 'x1') WITH STAT_HEADER, NO_INFOMSGS
DBCC SHOW_STATISTICS('t', 's1') WITH STAT_HEADER, NO_INFOMSGS

 

 

Is ‘YYYY-MM-DD’ always treated as year, month, day?

Most of you already know the answer to this question: no.

I won’t recap the rules for interpretation of the current datetime literals, as I have an article on the subject: http://karaszi.com/the-ultimate-guide-to-the-datetime-datatypes

However, SQL Server 2008 introduces a number of new date, datetime and time datatypes. One interesting aspect of these is that interpretation of the ANSI SQL datetime format, ‘YYYY-MM-DD’, is independent of language and date format settings.

SQL Server MVP Steve Kass opened up the discussion (along with a Connect suggestion) that it is time for the old datetime datatype to also always treat this format as year, month, day. In general, I’m all for a consistent treatment of this datetime format, and that should indeed be the most “natural” one and also adhere to ANSI SQL. However, I don’t feel qualified to estimate whether or how many backwards compatibility problems this would cause. Are any of you using the ‘YYYY-DD-MM’ format, for instance?

Here’s a script that shows you whether SQL Server treats ‘nnnn-nn-nn’ as ‘yyyy-mm-dd’ or something else, for each language setting. For fun, if you have the July CTP of SQL Server 2008, you can run the same script with the datatypes changed to datetime2 and compare the difference.

SET NOCOUNT ON
USE tempdb
IF OBJECT_ID('langdf') IS NOT NULL DROP TABLE langdf
CREATE TABLE langdf(lang sysname, dt datetime, correct bit)

DECLARE @lang sysname, @sql nvarchar(4000)
DECLARE c CURSOR FOR SELECT alias FROM master.dbo.syslanguages
OPEN c
WHILE 1 = 1
BEGIN
FETCH NEXT FROM c INTO @lang
IF @@FETCH_STATUS <> 0 BREAK
SET @sql =
'SET LANGUAGE "' +
@lang + '"
INSERT INTO langdf
SELECT
''' + @lang + '''
,CAST(''2007-02-09'' AS datetime)
,CASE WHEN CAST(''2007-02-09'' AS datetime) = ''20070209'' THEN 1 ELSE 0 END'
PRINT @sql
EXEC(@sql)
END
CLOSE c
DEALLOCATE c

SELECT
COUNT(NULLIF(correct, 0)) AS "correct interpretation"
,COUNT(NULLIF(correct, 1)) AS "incorrect interpretation"
FROM langdf

SELECT * FROM langdf
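As a side note, the unseparated format ‘YYYYMMDD’ (which the script uses as the reference value) is always interpreted as year, month, day for datetime, regardless of language and DATEFORMAT settings. A quick sketch of the difference, using the British language setting as an example:

SET LANGUAGE British
--The separated format is interpreted per the language setting (here: Sep 2, 2007)
SELECT CAST('2007-02-09' AS datetime) AS separated
--The unseparated format is always year, month, day (here: Feb 9, 2007)
,CAST('20070209' AS datetime) AS unseparated
SET LANGUAGE us_english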

Does BACKUP utilize pages in cache?

I’m having a conversation with Tony Rogerson (see his great blog at: http://sqlblogcasts.com/blogs/tonyrogerson/) about whether or not BACKUP will grab pages from cache (the Buffer Pool, BP) or always read them from disk.

It seems natural to grab the pages in the BP to eliminate some physical I/O. But thinking a bit more, I realized that the data in the BP might not be organized in a suitable way. As far as I know, there’s no overall extent structure (map) for the BP. And BACKUP is done on an extent basis (extents don’t have to be sequential on the backup media, though). So, how would SQL Server orchestrate this? It needs to lay out each extent on the backup media, but reading the BP, it will get something like page 5 from extent 78, then page 2 from extent 456, then page 3 from extent 8964, then page 2 from extent 78.

Anyhow, seeing is believing, so I created a repro for this (see the code at the end of this post).

My conclusion is that BACKUP does not read pages from the BP; it reads all pages (extents) from disk. My test showed no difference in elapsed time whether or not the data was in cache, and the same for I/O.

The conversation is still ongoing, and anyone is free to point out flaws in the script or different findings.

USE master
GO
IF DB_ID('testBpBackup') IS NOT NULL DROP DATABASE testBpBackup
GO
CREATE DATABASE testBpBackup
ON PRIMARY
(NAME = N'testBpBackup_Data'
,FILENAME = N'C:\testBpBackup_Data.mdf'
,SIZE = 1GB
,MAXSIZE = UNLIMITED
,FILEGROWTH = 50MB )
LOG ON
(NAME = N'testBpBackup_Log'
,FILENAME = N'C:\testBpBackup_Log.ldf'
,SIZE = 100MB
,MAXSIZE = UNLIMITED
,FILEGROWTH = 10MB)
GO
USE testBpBackup
GO
IF OBJECT_ID('t') IS NOT NULL DROP TABLE t
CREATE TABLE t (c1 int identity, c2 char(500))
INSERT INTO t
SELECT TOP 1000000 'Hello' FROM syscolumns a, syscolumns b, syscolumns c
EXEC sp_spaceused 't' --Approx 500MB
GO

--Execute below in one batch

--support variables
DECLARE @t datetime, @io_before bigint, @crap int

--Backup with data in bp
SET @crap = (SELECT COUNT(*) FROM t)
SET @t = CURRENT_TIMESTAMP
SET @io_before = (SELECT num_of_reads FROM sys.dm_io_virtual_file_stats(DB_ID(), 1))

BACKUP DATABASE testBpBackup TO DISK = 'nul'
SELECT
DATEDIFF(s, @t, CURRENT_TIMESTAMP) AS "seconds with data in BP",
(SELECT num_of_reads FROM sys.dm_io_virtual_file_stats(DB_ID(), 1)) - @io_before AS "I/O with data in BP"

--Backup with data not in bp
DBCC DROPCLEANBUFFERS
SET @t = CURRENT_TIMESTAMP
SET @io_before = (SELECT num_of_reads FROM sys.dm_io_virtual_file_stats(DB_ID(), 1))

BACKUP DATABASE testBpBackup TO DISK = 'nul'
SELECT
DATEDIFF(s, @t, CURRENT_TIMESTAMP) AS "seconds with data NOT in BP",
(SELECT num_of_reads FROM sys.dm_io_virtual_file_stats(DB_ID(), 1)) - @io_before AS "I/O with data NOT in BP"

 

No Notification Services in SQL Server 2008

If you have been wondering where Notification Services (NS) is in the prior CTPs of SQL Server 2008, you now have the answer. NS will not be carried forward to SQL Server 2008. Here’s a quote from the July CTP readme:

“SQL Server Notification Services will not be included as a component of SQL Server 2008, but will continue to be supported as part of the SQL Server 2005 product support life-cycle. Moving forward, support for key notification scenarios will be incorporated into SQL Server Reporting Services. Existing Reporting Services functionality, such as data driven subscriptions, addresses some of the notification requirements. Features to support additional notification scenarios may be expected in future releases. ”

So, now we know.

I’ve been delivering training on SQL Server 2005 for a while now, and one of the courses has a module on NS. Over time, I gradually adapted to the fact that almost no-one was interested in NS, and nowadays I just give a brief overview of what NS is and the basics of creating an NS solution. The funny thing is that when you “get” NS, you realize that it is a neat piece of infrastructural software, taking some code writing off your hands. I guess that there hasn’t been enough interest in NS to carry it forward to the next release, but that is pure speculation on my side…

Are execution plans for functions cached?

Obviously, we’re talking about multi-statement functions, since inline functions are just expanded like views in the end.
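For contrast, here is a minimal sketch of an inline function; the function name is just an example and it uses the AdventureWorks table Sales.SalesOrderDetail. Its body is a single SELECT that the optimizer expands into the calling query, just like a view:

--An inline table-valued function: no BEGIN/END block, just one SELECT.
--The optimizer expands it into the calling query, like a view.
CREATE FUNCTION dbo.fnInline(@rg uniqueidentifier)
RETURNS TABLE
AS
RETURN
(
SELECT SalesOrderDetailID, OrderQty
FROM Sales.SalesOrderDetail
WHERE rowguid = @rg
)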

My gut feeling for this is “yes”, but I wanted to be absolutely certain. So here goes:

  1. Create a function in Adventureworks
  2. Use that function in a SELECT statement
  3. Check if a plan exists in the plan cache for above

USE AdventureWorks
GO

IF OBJECT_ID('fn') IS NOT NULL DROP FUNCTION fn
GO

CREATE FUNCTION fn(@rg uniqueidentifier)
RETURNS @tbl TABLE(SalesOrderDetailID int NOT NULL PRIMARY KEY, OrderQty smallint NOT NULL)
AS
BEGIN
INSERT INTO @tbl(SalesOrderDetailID, OrderQty)
SELECT SalesOrderDetailID, OrderQty FROM Sales.SalesOrderDetail WHERE rowguid = @rg
RETURN
END
GO

SELECT * FROM fn('80667840-F962-4EE3-96E0-AECA108E0D4F')
GO

SELECT cp.cacheobjtype, cp.plan_handle, qp.query_plan
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE qp.objectid = OBJECT_ID('fn')

IF OBJECT_ID('fn') IS NOT NULL DROP FUNCTION fn

Note the execution plan in XML format, picked up from sys.dm_exec_query_plan. If you executed the query with results to grid, you can click on the XML and then save it as a file. Rename the file extension to .sqlplan and open that file in SSMS. You can now see the plan for this function graphically.

Online restore and filegroups

Is data available for a partitioned table if one of the partitions isn’t available? Instead of reading BOL, I wanted to test this. Here is an outline of what I did (you’ll find the TSQL code at the end of this post):

  • Create a database over several filegroups
  • Create a partitioned table
  • Insert some rows
  • Database backup
  • Insert some rows
  • Log backup
  • Insert some rows
  • Log backup
  • Insert some rows
  • Database restore of only one filegroup
  • Answer is yes: if the query is written so that the non-available partition is eliminated (either at optimization or at run time), we can access the available partitions.
  • Restore of the log backups – the whole table is now available

TSQL code. Note that this will drop a database named PartOnLineRestore if such exists:

SET NOCOUNT ON
USE master
IF DB_ID('PartOnLineRestore') IS NOT NULL DROP DATABASE PartOnLineRestore
GO
CREATE DATABASE PartOnLineRestore
ON
PRIMARY
(NAME = N'PartOnLineRestore', FILENAME = N'C:\PartOnLineRestore.mdf', SIZE = 3MB ),
FILEGROUP FG1
(NAME = N'FG1File', FILENAME = N'C:\FG1File.ndf', SIZE = 3MB),
FILEGROUP FG2
(NAME = N'FG2File', FILENAME = N'C:\FG2File.ndf', SIZE = 3MB ),
FILEGROUP FG3
(NAME = N'FG3File', FILENAME = N'C:\FG3File.ndf', SIZE = 3MB )
LOG ON
(NAME = N'PartOnLineRestore_log', FILENAME = N'C:\PartOnLineRestore_log.ldf', SIZE = 3MB)
GO
USE PartOnLineRestore

CREATE PARTITION FUNCTION PF1 (int) AS RANGE RIGHT FOR VALUES (1,2);

CREATE PARTITION SCHEME PS1 AS PARTITION PF1 TO (FG1, FG2, FG3);

IF OBJECT_ID('t') IS NOT NULL DROP TABLE t
CREATE TABLE t (a int, b int, c char(300) default 'hello');

-- Populate table with some rows
DECLARE @i int; SET @i=0;
WHILE @i<10000
BEGIN
SET @i=@i+1;
INSERT INTO t (a, b) VALUES (5*RAND(),100*RAND())
END;

--Partition table
CREATE CLUSTERED INDEX x ON t(a) ON PS1(a)

--Do some backups
USE master
ALTER DATABASE PartOnLineRestore SET RECOVERY FULL
BACKUP DATABASE PartOnLineRestore TO DISK = 'C:\PartOnLineRestore.bak' WITH INIT
INSERT INTO PartOnLineRestore..t (a, b) VALUES (5*RAND(),100*RAND())
BACKUP LOG PartOnLineRestore TO DISK = 'C:\PartOnLineRestore.bak' WITH NOINIT
INSERT INTO PartOnLineRestore..t (a, b) VALUES (5*RAND(),100*RAND())
BACKUP LOG PartOnLineRestore TO DISK = 'C:\PartOnLineRestore.bak' WITH NOINIT
INSERT INTO PartOnLineRestore..t (a, b) VALUES (5*RAND(),100*RAND())

--Filegroup restore
RESTORE DATABASE PartOnLineRestore FILEGROUP = 'FG1' FROM DISK = 'C:\PartOnLineRestore.bak'

--Can we access the table?
SELECT * FROM PartOnLineRestore..t
--Error message

SELECT * FROM PartOnLineRestore..t WHERE a = 0
--Error message

SELECT * FROM PartOnLineRestore..t WHERE a = 1
--OK, partition elimination at compile time

DECLARE @a int
SET @a = 1
SELECT * FROM PartOnLineRestore..t WHERE a = @a
--OK, partition elimination at run time

--Restore of log backups:
RESTORE LOG PartOnLineRestore FROM DISK = 'C:\PartOnLineRestore.bak' WITH FILE = 2
RESTORE LOG PartOnLineRestore FROM DISK = 'C:\PartOnLineRestore.bak' WITH FILE = 3
BACKUP LOG PartOnLineRestore TO DISK = 'C:\PartOnLineRestore.bak' WITH NOINIT
RESTORE LOG PartOnLineRestore FROM DISK = 'C:\PartOnLineRestore.bak' WITH FILE = 4

--All data available?
SELECT * FROM PartOnLineRestore..t WHERE a = 0
--OK

SELECT * FROM PartOnLineRestore..t
--OK

--Cleanup:
DROP DATABASE PartOnLineRestore

What’s in the default trace?

As you probably know, there’s a trace running by default in SQL Server 2005. The directory for the trace file is the SQL Server log directory, and you can turn this trace on and off with sp_configure.
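As a reference, here is a sketch of how to check the setting and find the current trace file. Note that ‘default trace enabled’ is an advanced option, so ‘show advanced options’ may need to be enabled first:

--Enable (or check) the default trace; it is on by default
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'default trace enabled', 1
RECONFIGURE

--Where does the default trace write its file?
SELECT id, path FROM sys.traces WHERE is_default = 1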

But how do we find out what events and columns are captured by this trace? We use a trace function and some trace catalog views.

The function fn_trace_geteventinfo returns which columns and events are captured by a configured trace. But we don’t want to see the column id and event id, we want the names. So we join this to the following catalog views:
sys.trace_events
sys.trace_categories
sys.trace_columns

Here’s the end result:

SELECT cat.name AS CategoryName, e.name AS EventName, c.name AS ColumnName
FROM fn_trace_geteventinfo(1) AS rt
INNER JOIN sys.trace_events AS e
ON rt.eventid = e.trace_event_id
INNER JOIN sys.trace_columns AS c
ON rt.columnid = c.trace_column_id
INNER JOIN sys.trace_categories AS cat
ON e.category_id = cat.category_id
ORDER BY CategoryName, EventName, ColumnName

And here’s one with only category and event:

SELECT DISTINCT cat.name AS CategoryName, e.name AS EventName
FROM fn_trace_geteventinfo(1) AS rt
INNER JOIN sys.trace_events AS e
ON rt.eventid = e.trace_event_id
INNER JOIN sys.trace_categories AS cat
ON e.category_id = cat.category_id
ORDER BY CategoryName, EventName
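And if you want to look at the actual events that have been captured, not just the trace definition, here is a sketch using fn_trace_gettable. The column list is kept short and is just an example; adjust it as you like:

--Read the events captured by the default trace
DECLARE @path nvarchar(260)
SELECT @path = path FROM sys.traces WHERE is_default = 1

SELECT t.StartTime, te.name AS EventName, t.DatabaseName, t.TextData
FROM fn_trace_gettable(@path, DEFAULT) AS t
INNER JOIN sys.trace_events AS te
ON t.EventClass = te.trace_event_id
ORDER BY t.StartTime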

Leaking roof and file shrinking

Shrinking and growing database files costs you resources and has other side effects as well. I have a technical description at http://karaszi.com/why-you-want-to-be-restrictive-with-shrink-of-database-files if you want to get into the details. Sure, you can have valid reasons to shrink, for example after archiving lots of data and needing to free some disk space. But even after pointing to my article I see people doing this on a regular basis, hence this analogy. The analogy uses a transaction log file as the example, but the same basic principles apply to data files:

Say you have a leaking roof at the office. You need to put a bucket under the leak to catch the water (the bucket is your database file; a log file, .ldf, for this analogy).

Every so often, you need to empty the bucket (a log backup if you are in the full or bulk logged recovery model, or a checkpoint if you are in the simple recovery model). Emptying the bucket doesn’t make the bucket smaller. The bucket still consumes the same number of cubic centimetres of your office space.

You can of course have one of the office workers (the machine resources) take an angle grinder and make the bucket physically smaller when you empty it. This consumes resources. But you would then need an office worker with a welder, perhaps every 10 minutes or every hour, to make the bucket larger again as more and more water builds up in the bucket.
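In SQL Server terms, a minimal sketch of what the analogy maps to; the database and file names are just examples:

--Empty the bucket: a log backup (full or bulk logged recovery model)
BACKUP LOG MyDb TO DISK = 'C:\MyDb_log.trn'

--The angle grinder: shrink the log file
USE MyDb
DBCC SHRINKFILE(MyDb_log, 100)   --target size in MB

--The welder: grow the file again (autogrow does the same thing, a bit at a time)
ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb_log, SIZE = 500MB)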

I don’t know anyone who has had a leaking roof, but if I were in that situation, I sure wouldn’t make the bucket larger and smaller all the time. I’d use a bucket large enough to accommodate the water that builds up between the times I empty it. This is how I handle my database files as well.