I’m having a conversation with Tony Rogerson (see his great blog at: http://sqlblogcasts.com/blogs/tonyrogerson/) about whether or not BACKUP will grab paged from cache (Buffer Pool, BP) or always read them from disk.
It seems natural to grab the pages in BP to eliminate some physical I/O. But thinking a bit more, I realized that the data in BP might not be organized in a suitable way. As far as I know, there’s no overall extent structure (map) for the BP. And BACKUP is done at extent basis (extents doesn’t have to be sequental on backup media, though). So, how would SQL Server orchestrate this? It need to lay each extent in backup media, but reading BP, it will get something like page 5 from extent 78, then page 2 from page 456, then page 3 from page 8964, then page 2 from extent 78.
Anyhow, seeing is believeing, so I created a repro for this (see code at end of this blog).
My conclusion is that backup does not read pages from BP, it will read all pages (extents) from disk. My test showed no different in lapse time whether or not data is in cache and same for I/O.
The conversation is still ongoing, and anyone is free to point out flaws in the script or different findings.
USE master
GO
IF DB_ID(‘testBpBackup’) IS NOT NULL DROP DATABASE testBpBackup
GO
CREATE DATABASE testBpBackup
ON PRIMARY
(NAME = N’testBpBackup_Data’
,FILENAME = N’C:\testBpBackup_Data.mdf’
,SIZE = 1GB
,MAXSIZE = UNLIMITED
,FILEGROWTH = 50MB )
LOG ON
(NAME = N’testBpBackup_Log’
,FILENAME = N’C:\testBpBackup_Log.ldf’
,SIZE = 100MB
,MAXSIZE = UNLIMITED
,FILEGROWTH = 10MB)
GO
USE testBpBackup
GO
IF OBJECT_ID(‘t’) IS NOT NULL DROP TABLE t
CREATE TABLE t (c1 int identity, c2 char(500))
INSERT INTO t
SELECT TOP 1000000 ‘Hello’ FROM syscolumns a, syscolumns b, syscolumns c
EXEC sp_spaceused ‘t’ –Approx 500MB
GO
–Execute below in one batch
–support variables
DECLARE @t datetime, @io_before bigint, @crap int
–Backup with data in bp
SET @crap = (SELECT COUNT(*) FROM t)
SET @t = CURRENT_TIMESTAMP
SET @io_before = (SELECT num_of_reads FROM sys.dm_io_virtual_file_stats( DB_ID(), 1))
BACKUP DATABASE testBpBackup TO DISK = ‘nul’
SELECT
DATEDIFF(s, @t, CURRENT_TIMESTAMP) AS “seconds with data in BP”,
(SELECT num_of_reads FROM sys.dm_io_virtual_file_stats( DB_ID(), 1)) – @io_before AS “I/O with data in BP”
–Backup with data not in bp
DBCC DROPCLEANBUFFERS
SET @t = CURRENT_TIMESTAMP
SET @io_before = (SELECT num_of_reads FROM sys.dm_io_virtual_file_stats( DB_ID(), 1))
BACKUP DATABASE testBpBackup TO DISK = ‘nul’
SELECT
DATEDIFF(s, @t, CURRENT_TIMESTAMP) AS “seconds with data NOT in BP”,
(SELECT num_of_reads FROM sys.dm_io_virtual_file_stats( DB_ID(), 1)) – @io_before AS “I/O with data NOT in BP”