How much do you gain from defragmenting your indexes?
Let me show you. Below are diagrams for tables (indexes) with 0% fragmentation vs. over 99% fragmentation. To the left is response time for the an unfragmented index and to the right is fragmented index. The query is super-simple doing pretty much only I/O (physical, not cached) and doesn’t return any data to the client. I.e., we pretty much compare only the I/O portion:
Not that much, huh? Read on for details:
It has been a few years since I wrote about index fragmentation. For my earlier posts, I both elaborate in writing about the topic and I also ran some tests to try to quantify the difference. I want to revisit the topic for a few reasons:
- I want to have one (1) post to point to when this topic comes up.
- I want to clean up the text.
- I’m running these tests on my laptop, and I have a new laptop since I last ran the tests. (I’d love to run these on a real server, but there are many reasons why that isn’t doable/practical for me.)
- The last time i wrote three blog posts. The main reason for that was that my tests for the first and second blog posts were flawed (because of buffer pool ramp-up and auto-stats).
In the end I want to discuss internal and external fragmentation and also show some numbers. The numbers are of course what I measured on my machine with my load. You find the T-SQL code at the end if you want to play with this yourself. Some of the text below are from my earlier blog posts.
See numbers and diagrams at the end, or at the top. I measured a few cases: the difference between no external fragmentation and severe external fragmentation (over 99%). I have both a narrow index and a wide index, and I read one (1), 10,000 and 100,000 rows using index searches (“range scan”). There were obviously no difference reading 1 row so I exclude that from my discussion below. For the other cases the extra time with an extreme level of external fragmentation is (from lowest impact to highest) 7%, 10%, 13% and 32%. The highest number (32%) is when reading many rows from a narrow index, i.e. many rows per page. Again, this is with an extreme level of fragmentation.
What is index fragmentation?
By index here I mean b-tree indexes. You know, the regular types of indexes. I’m not referring to “odd” indexes such as columnstore, hash, geospatial, fulltext, or xml indexes. For a b-tree index (from now on, I will just say “index”), the leaf level is stored according the order of the index key. I.e., the leaf is sorted. As you modify data, you can get page splits. Imagine inserting a row “in the middle” and there’s no room on the page. SQL server will split the page, by inserting a page from somewhere into the linked list, move 50% to that new page, and then add the row to where it should be.
- If you now follow the linked list, you jump back and forth in the data file: external fragmentation.
- The page split left us with two (nearly) half-full pages: internal fragmentation.
What do we do about it?
Duh, we defragment the indexes, of course! The two commonly used methods are:
- Rebuild the indexes. ALTER INDEX … REBUILD. This will create a new index and then remove the old index. This requires free space in the database file(s) about the same size as the index.
- Reorganize the index. ALTER INDEX … REORGANIZE. Basically walk the linked list and for every time you have to go backward in the file when going to the next page, you swap these two pages. It does a bit more than that (re-introduce the fillfactor), but essentially swapping places of the pages not going forwards physically in the file.
While doing defrag, we might leave some free space on each page (aka fillfactor). Rebuild allow us to specify the fillfactor, while reorg will re-apply whatever value you specified when creating the index.
Why might fragmentation hurt us?
Several aspects here. Let us discuss them one by one:
Lots of empty space
Yes, a page split leaves us with two half-full pages. At that point in time. But as more rows are added, they will fit on that free space.
In the end, you will probably average at about 75% fullness (75% being in between half-full and full). That is for indexes where the inserts (and possibly updates) are inserted “all over the place”.
But we also have increasing indexes. Think columns such as “OrderDate”. Or identity/sequence columns, which are pretty often primary keys, which in turn pretty often are also the clustered index. For such an index, inserts will always add rows to the end of the linked list, and SQL Server will just add pages to the end – not “splitting” any page. (Yeah, adding a page to the end of the linked list is technically called a page split in SQL Server lingo, but that show more the problems that SQL server has with terminology than anything else.) The point here is that increasing indexes will not have free space in them because of inserts going to the end.
Common values I’ve seen for fillfactor in rebuild scripts are between 70 and 80%.
- For increasing indexes, that just make the indexes bigger. And remember that it isn’t unlikely that the actual data is an increasing index (the clustered index). So, instead of having a 100 GB table, you now have a 120 GB table.
- For other indexes, you probably get about the same size “naturally” (without doing defrag) as when doing a defrag (depending on your fillfactor, of course).
Pages are out of order
Ask your SAN people how much difference there is between sequential I/O and random I/O in your SAN. The answer to this question will tell you whether this point is relevant. For a single spinning disk, there is a big difference. With sequential I/O you don’t have the disk head movement. But as you introduce more disks in various RAID configs, add in a SAN in the mix and now have a bunch of disks and also other concurrent activity apart from your SQL Server in that SAN, the picture isn’t so clear anymore. Now add SSD or some similar technology to the mix. In the end, there might not be much of a difference between random and sequential I/O on a modern storage solution.
Splitting a page takes time
Yup, sure does. If you can defragment frequently enough with a chosen fillfactor for each index so that they never split, then you avoid this. But how many of us do that? I.e., hand craft the chosen fillfactor for each index by observing the amount of fragmentation for each index since last time we did defrag, and repeat this week after week until we tuned the “perfect” fillfactor for each index? Some of us might actually do that, but as a consultant, I can tell you that none of my clients have opted for me to do that type of work. In the end, I’m not that certain that we reduce page splits by doing our regular defrag, at least not to the extent that we hope for.
Splitting writes to the transaction log.
Yes, it does. But again, how much difference is there in reality between defrag and not defrag regarding page splits occurring during business hours.
Why might doing defrag hurt us?
Of course there is another side of the coin! Defrag isn’t free. When people asking me whether to defrag, a short question back from me is “How much does it hurt?” If the answer is “None at all”, then sure go ahead! If the answer is “A lot!”, then we want to think twice.
It reduces concurrency because of locking
- Rebuild has an ONLINE option available if you are on Enterprise Edition (EE), which makes the rebuild essentially online. But using online causes the rebuilt to take longer time compared to offline.
- If not on EE, then the whole table will be locked.
- Shared lock if the rebuild is on a nonclustered index – this allow concurrent reads but not modifications.
- Exclusive locks if the index is a clustered index – disallowing both read and modifications.
- Reorg will only put a brief lock of the two pages “where it currently is”. This makes reorg much more online-ish than rebuild on non-EE.
It puts load your system
Say you have a 200 GB database, and rebuild everything. Now you shuffle 200 GB data. This isn’t free. Even if you use smarter scripts to only defrag what is actually fragmented in the first place, defrag will probably shuffle quite a lot of data. Your VM/SAN/SQL/whatever probably have other things to do at night-time. We all fight over these precious night/weekend hours. And for systems that are expected to be online pretty much all the time, this is even more pressing. Now throw this into the cloud and you can convert that type of work into real money!
You will need a larger LDF file
If you rebuild everything (blindly) then the ldf file need to be the same size as the (sum of) data file(s). A smart script perhaps only rebuild 30% of the data, cutting this down to some 30% of the data size. (This doesn’t apply to simple recovery model.)
The following log backup will be large
Again, if you rebuild 100 GB worth of indexes, then the subsequent log backup will be 100 GB in size. Etc. You get the picture by now. (Again this doesn’t apply to simple recovery model.)
What about page fullness?
To be honest, I find this to be a much more interesting aspect. I’d like to split this into two sub-sections. Increasing and “random” index.
This can be like an identity or ever increasing datetime column (think for instance order_date). This will not become fragmented from INSERTS and assuming that you don’t delete “every other row”, then you will end up with a nice unfragmented index with a high degree of page fullness. If you rebuild it with a fillfactor of 100%, you have just wasted resources. If you rebuild it with a lower fillfactor, you also waste space and cause SQL Server to read more data when the index is used – causing worse performance.
By this I mean index where the data distribution is all over the place. For instance an index on the LastName column. As you insert data, pages are split and you end up with a fillfactor of about 75% (which is in between a half-full and a full page). If you rebuild it with a similar fillfactor, you didn’t gain anything. If you rebuild it with a high fillfactor, then the following activity in your database will generate a high number of page splits. It seems reasonable to let it be, accepting a fillfactor of about 75% and accept a page split happening every now and then.
But we also delete data!
Do you? Really? OK, let us say that you do. What is the delete pattern? And the degree of deletes? How does this spread over your particular index? Deleting old data over some ever increasing index will just deallocate those pages. Deleting data which doesn’t match with the index key (rows deleted are spread all over the index) will leave empty space. This will be re-used as you insert new rows. But if you delete a lot of data all over the place, then you will have a low degree of page fullness. IMO, this is a rather unusual situation and should warrant a manually triggered index maintenance, in contrast to rebuilding indexes every week or so.
(And don’t get me started on heaps, but surely you don’t have those…? I once had a SELECT of all rows for a table with only 3000 rows, which took minutes. It had allocated more extents than number of rows in the table. Rebuild the heap, and that SELECT took, of course, sub-second. Until this degraded again, over time. Needless to say, this didn’t stay as a heap for long. And no, the table wasn’t wide, at all.)
But my friend had a query that went from 10 minutes to 1 second after defrag!
Yeah, that was because the statistics were also updated by the index rebuild resulting in a different execution plan. As soon as you see big differences, it is because of statistics update, not the defrag per se.
I like to provide the optimizer with as high-quality statistics as possible. Sure, index rebuild will give you new statistics. But if you condition it based on fragmentation, then you end up with not rebuilding a bunch of indexes and have stale statistics. Auto update statistics is OK, but you have to modify a decent amount of data before it kicks in. And when it kicks in, the user waits for it. Unless you do async update…. But I digress. Statistics management is a topic of its own. My point is that you probably don’t want to lose track of statistics management because you keep staring as index fragmentation.
Things to consider and watch out for
Here are some of the things that can throw you off when you play with these things.
Auto-update of statistics
I was bitten by this on my first test. I did a SELECT after modifying many rows (which fragmented the index) and this triggered auto-update statistics, meaning that SELECT took a very long time.
Buffer pool ramp-up
If you have Enterprise Edition, then there’s a “smartness” that reads more data than necessary if you have lots of free memory. Paul White told me about a lesser known optimization where SQL server will do “large I/O” (my own term for this) if you have plenty of unused memory. Instead of reading single pages, it reads whole extents. This makes sense, warming up the cache. https://blogs.msdn.microsoft.com/ialonso/2011/12/09/the-read-ahead-that-doesnt-count-as-read-ahead/
Just for fun, I ran my tests while disabling read-ahead (aka sequential-prefetch) and it took from 3 to 8 times longer with RA disabled. So, yes, RA can make a huge difference.
Show me the numbers already!
OK, sure! Some basics first: I was using two tables, both with a clustered index. I’m focusing on external fragmentation, i.e., jumping back and forth on disk when following the linked list. For both cases I was using a clustered index. The query seeked the index, and then followed the linked list to read as many pages/rows from the leaf that satisfied the search condition. I was both using a narrow index (about 200 rows per page) and a wide index (about 11 rows per page). I ran the queries a with 10,000 and also 100,000 rows selectivity. I didn’t re-read the same rows, so for each SELECT, the rows were read from disk.
Compare the numbers in the red circles. This is the average elapse time per SELECT. The top within each circle is without fragmentation and bottom is with over 99% fragmentation.
- I executes a SELECT statement several times, not reading the same data over and over again.
- For the millisecond column I measured the time it took to execute all queries and divide with number of queries. I.e., it is an average per query.
- For the logical_reads and physical_reads columns I used and Extended Events trace to capture number of I/O per SELECT statement and then average those numbers.
- I did a rebuild with a fillfactor to achieve the same page fullness. See above section about page fullness for a discussion of that topic. For this test I was interested in external fragmentation.
- Note a higher number of physical reads than logical reads. That is read-ahead in action. When disabling RA, I had the same numbers for both physical and logical reads (and between 3 to 8 times longer elapse time).
Here are some diagrams over the numbers. The narrow index, with no fragmentation (n) and high fragmentation (y), and reading 10,000 and 100,000 rows:
And the wide index, again with no fragmentation (n) and high fragmentation (y), and reading 10,000 and 100,000 rows:
What about the old blog posts?
I have added pretty much everything into this blog post, but in case you want to check them out:
This section is about the code I used. Feel free to use it for your own tests. Below is the basic flow of the code, from my last blog post. I did some modifications and tidy-up of the code for this new blog post, but I’m too lazy to go over the bulleted list below to reflect those changes.
- Create a proc which does bunch of reads to fill the buffer pool with data.
- Create some other supporting stuff, like trace definition, trace control procedure etc.
- Grow the data and log files for the Stackoverflow (10 GB) database.
- Create table to hold my measures.
- Create the narrow_index table which is a table with a clustered index which is narrow. A row is some 40 bytes (very rough calculation off the top of my head).
- Create the wide_index is a table with a clustered index which is wider – some 340 bytes.
- Set max server memory to 1 GB.
- Turn off auto-update statistics.
- Create the proc that will do the tests.
- Generate fragmentation in both indexes/tables.
- The narrow table has 20,000,000 rows and is about 970 MB in size.
- The wide table has 10,000,000 rows and is about 6.8 GB in size.
- Run the proc that does the test with 1 row selectivity.
- Run the proc that does the test with 10000 rows selectivity.
- Run the proc that does the test with 100000 rows selectivity.
- Each test runs 99 SELECTs, spread over the data (so not to read the same row).
- Average the result, into a table named resx.
- I did 4 such iterations, so I have the tables res1, res2, res3 and res4.
- Check the numbers from these tables and see if they correspond reasonably. They did.
- Average the results from these four table to get the end results.
The procedure which executes the SELECTS and captures the measures has this basic flow
- Stop the (XE) trace if it is running.
- Delete all old trace files.
- Do a first SELECT to get rid of auto-stats (yeah, I disabled it, but just in case).
- Empty the cache.
- Reads lots of other data into cache.
- Start the trace
- Do the SELECT, in a loop, reading as many rows as requested (param to the proc), as many times as requested (another param to the proc). Don’t read the same data over and over again. This is done both for the narrow and wide table/index.
- Capture memory usage for each table (buffer pool usage).
- Get the trace raw data into a temp table.
- Parse the trace data/above table using XQuery into the measures table.
- Ie., we now have one row for each SELECT with a bunch of measures.
--Measure cost of fragmentation in terms of execution time and memory usage --Tibor Karaszi, 2019. Revisited and modified 2022. --Requires the Stackoverflow database. I was using the 10 GB version. SET NOCOUNT ON --Setup section USE StackOverflow GO --Proc to study plans CREATE OR ALTER VIEW vShowPlans AS SELECT DB_NAME(sql.dbid) AS db_name_ ,OBJECT_NAME(sql.objectid) AS object_name_ ,p.cacheobjtype ,p.objtype ,s.execution_count ,s.total_worker_time --Add whatever measures you want ,p.size_in_bytes / 1024 AS size_in_kB ,SUBSTRING (sql.[text],(s.statement_start_offset/2) + 1,((CASE s.statement_end_offset WHEN -1 THEN DATALENGTH(sql.[text]) WHEN 0 THEN DATALENGTH(sql.[text]) ELSE s.statement_end_offset END - s.statement_start_offset)/2) + 1) AS text_query ,sql.text AS text_batch ,qp.query_plan --,* FROM sys.dm_exec_query_stats AS s INNER JOIN sys.dm_exec_cached_plans AS p ON s.plan_handle = p.plan_handle CROSS APPLY sys.dm_exec_sql_text(s.plan_handle) AS sql CROSS APPLY sys.dm_exec_query_plan(s.plan_handle) AS qp WHERE sql.dbid <> 32767 --Igorera resursdatabasen GO --Proc to fill buffer pool, reads lots of data, takes a while to execute CREATE OR ALTER PROC fill_bp AS DECLARE @a bigint SET @a = (SELECT AVG(CAST(PostId AS bigint)) FROM Comments) SET @a = (SELECT AVG(CAST(CommentCount AS bigint)) FROM Posts) SET @a = (SELECT AVG(CAST(PostId AS bigint)) FROM Comments) SET @a = (SELECT AVG(CAST(VoteTypeId AS bigint)) FROM Votes) SET @a = (SELECT AVG(CAST(PostId AS bigint)) FROM Comments) SET @a = (SELECT AVG(CAST(VoteTypeId AS bigint)) FROM Votes) SET @a = (SELECT AVG(CAST(VoteTypeId AS bigint)) FROM Votes) GO --Disable IAM order scan, so we know that SQL Server will follow the linked list --See https://sqlperformance.com/2015/01/t-sql-queries/allocation-order-scans EXEC sp_configure 'cursor threshold', 1000000 RECONFIGURE GO --Run the initial "prepp" section without restrictions, so it doesn't take too long EXEC sp_configure 'max server memory', 26357 RECONFIGURE ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 0 GO --Proc to change trace status CREATE OR ALTER PROC change_trace_status @state varchar(50) --start, stop, delete_all_files AS
IF @state = 'start' BEGIN IF NOT EXISTS (SELECT * FROM sys.dm_xe_sessions WHERE name = 'frag_test') ALTER EVENT SESSION frag_test ON SERVER STATE = START END IF @state = 'stop' BEGIN IF EXISTS (SELECT * FROM sys.dm_xe_sessions WHERE name = 'frag_test') ALTER EVENT SESSION frag_test ON SERVER STATE = STOP END --Delete XE file, using xp_cmdshell (ugly, I know) IF @state = 'delete_all_files' BEGIN EXEC sp_configure 'xp_cmdshell', 1 RECONFIGURE WITH OVERRIDE EXEC xp_cmdshell 'DEL C:\R\frag_test*.xel', no_output EXEC sp_configure 'xp_cmdshell', 0 RECONFIGURE WITH OVERRIDE END /* EXEC change_trace_status @state = 'start' EXEC change_trace_status @state = 'stop' EXEC change_trace_status @state = 'delete_all_files' */GO --Drop and create event session to keep track of execution time EXEC change_trace_status @state = 'stop' WAITFOR DELAY '00:00:01' IF EXISTS (SELECT * FROM sys.server_event_sessions WHERE name = 'frag_test') DROP EVENT SESSION frag_test ON SERVER EXEC change_trace_status @state = 'delete_all_files' CREATE EVENT SESSION frag_test ON SERVER ADD EVENT sqlserver.sp_statement_completed() ADD TARGET package0.event_file(SET filename=N'C:\R\frag_test') WITH (MAX_DISPATCH_LATENCY=10 SECONDS) GO --Create proc to report progress CREATE OR ALTER PROC #status @msg varchar(200) AS RAISERROR(@msg, 10, 1) WITH NOWAIT GO --Grow the data and log files for StackOverflow database. EXEC #status 'Grow the data and log files for StackOverflow database…' IF EXISTS( SELECT size * 8/(1024*1024), * FROM sys.database_files WHERE name = N'StackOverflow2010' AND size * 8/(1024*1024) < 30) ALTER DATABASE [StackOverflow] MODIFY FILE ( NAME = N'StackOverflow2010', SIZE = 30GB ) IF EXISTS( SELECT size * 8/(1024*1024), * FROM sys.database_files WHERE name = N'StackOverflow2010_log' AND size * 8/(1024*1024) < 15) ALTER DATABASE [StackOverflow] MODIFY FILE ( NAME = N'StackOverflow2010_log', SIZE = 15GB ) GO --Table to hold output DROP TABLE IF EXISTS measures CREATE TABLE measures ( id int identity(1,1) PRIMARY KEY NOT NULL ,rows_to_read int NOT NULL ,index_width varchar(20) NOT NULL ,fragmented varchar(2) NOT NULL ,usec_xe bigint NOT NULL ,usec_clock bigint NOT NULL ,cpu_microsec bigint NOT NULL ,physical_reads bigint NOT NULL ,logical_reads bigint NOT NULL ,mb_cache decimal(9,2) NOT NULL ); --Create the table for the narrow index EXEC #status 'Setup section. Create table with narrow index…' DROP TABLE IF EXISTS narrow_index --Adjust numbers. 20,000,000 rows means about 970 MB in the end --540 MB with dense data (we do not measure on this) --970 MB after fragmentation/rebuild with a fillfactor to get same number of pages as wehn data is fragmented SELECT TOP(1000100020) ROW_NUMBER() OVER( ORDER BY (SELECT NULL)) AS c1, CAST('Hello' AS varchar(82)) AS c2 INTO narrow_index FROM sys.columns AS a, sys.columns AS b, sys.columns AS c CREATE CLUSTERED INDEX x ON narrow_index(c1) --Create the table for the wide index EXEC #status 'Setup section. Create table with wide index…' DROP TABLE IF EXISTS wide_index --Adjust numbers. 10,000,000 rows give us about --3.4 GB with dense data (we do not measure on this) --6.8 GB after fragmentation/rebuild with a fillfactor to get same number of pages as wehn data is fragmented SELECT TOP(1000100010) ROW_NUMBER() OVER( ORDER BY (SELECT NULL)) AS c1 ,CAST('Hi' AS char(80)) AS c2 ,CAST('there' AS char(80)) AS c3 ,CAST('I''m' AS char(80)) AS c4 ,CAST('on' AS char(80)) AS c5 ,CAST('my' AS varchar(200)) AS c6 INTO wide_index FROM sys.columns AS a, sys.columns AS b, sys.columns AS c CREATE CLUSTERED INDEX x ON wide_index(c1) GO --Investigate the data if you want /* --wide index SELECT TOP(100) * FROM wide_index EXEC sp_indexinfo wide_index --Found on my web-site EXEC sp_help 'wide_index' --narrow index SELECT TOP(100) * FROM narrow_index EXEC sp_indexinfo narrow_index --Found on my web-site EXEC sp_help 'narrow_index' */ --/Setup section --Create the proc that executes our SQL EXEC #status 'Create the proc that executes our SQL…' GO CREATE OR ALTER PROC run_the_sql @fragmented varchar(20) ,@rows_to_read int ,@range_iterations int ,@fill_bp char(1) --'y' or 'n' ,@report_frag char(1) = 'n' AS DECLARE @sql nvarchar(1000) ,@sql_condition varchar(1000) ,@bp_wide_mb decimal(9,2) ,@bp_narrow_mb decimal(9,2) ,@range_start int ,@range_end int ,@range_iterations_counter int ,@a int ,@start_time datetime2 ,@exec_time_wide_index_usec bigint ,@exec_time_narrow_index_usec bigint ,@parm_definition nvarchar(200) SET STATISTICS IO OFF EXEC change_trace_status @state = 'stop' EXEC change_trace_status @state = 'delete_all_files' --Do a first execution to get autostats etc out of the way DBCC FREEPROCCACHE SET @range_iterations_counter = 1 SET @range_start = 1000 SET @range_end = @range_start + @rows_to_read - 1 SET @parm_definition = '@range_start int, @range_end int' SET @sql = N'DECLARE @a int SELECT @a = COUNT_BIG(c1) FROM wide_index WHERE c1 BETWEEN @range_start AND @range_end' EXEC sp_executesql @sql, @parm_definition, @range_start, @range_end SET @sql = N'DECLARE @a int SELECT @a = COUNT_BIG(c1) FROM narrow_index WHERE c1 BETWEEN @range_start AND @range_end' EXEC sp_executesql @sql, @parm_definition, @range_start, @range_end --Empty data cache (buffer pool, BP) CHECKPOINT DBCC DROPCLEANBUFFERS --Run proc to read stuff into BP if requested IF @fill_bp = 'y' EXEC fill_bp --Start the trace EXEC change_trace_status @state = 'start' --Do the SELECTs, narrow index SET @sql = N'DECLARE @a int SELECT @a = COUNT_BIG(c1) FROM narrow_index WHERE c1 BETWEEN @range_start AND @range_end' SET @range_iterations_counter = 1 SET @range_start = 1000 SET @range_end = @range_start + @rows_to_read - 1 SET @start_time = SYSDATETIME() WHILE @range_iterations_counter <= @range_iterations BEGIN EXEC sp_executesql @sql, @parm_definition, @range_start, @range_end SET @range_start = @range_start + @rows_to_read SET @range_end = @range_end + @rows_to_read SET @range_iterations_counter += 1 END SET @exec_time_narrow_index_usec = DATEDIFF_BIG(microsecond, @start_time, SYSDATETIME()) / @range_iterations --Do the SELECTs, wide index SET @sql = N'DECLARE @a int SELECT @a = COUNT_BIG(c1) FROM wide_index WHERE c1 BETWEEN @range_start AND @range_end' SET @range_iterations_counter = 1 SET @range_start = 1000 SET @range_end = @range_start + @rows_to_read - 1 SET @start_time = SYSDATETIME() WHILE @range_iterations_counter <= @range_iterations BEGIN EXEC sp_executesql @sql, @parm_definition, @range_start, @range_end SET @range_start = @range_start + @rows_to_read SET @range_end = @range_end + @rows_to_read SET @range_iterations_counter += 1 END SET @exec_time_wide_index_usec = DATEDIFF_BIG(microsecond, @start_time, SYSDATETIME()) / @range_iterations EXEC change_trace_status @state = 'stop' --Show the execution plan statistics --SELECT @fragmented AS frag_level, * FROM vShowPlans as p WHERE p.db_name_ = DB_NAME() AND text_query LIKE '%wide_index%' OR text_query LIKE '%narrow_index%' ORDER BY text_query --Keep track of BP usage SET @bp_wide_mb = ( SELECT CAST((COUNT(*) * 8.00) / 1024 AS DECIMAL(9,2)) AS MB FROM sys.allocation_units AS a JOIN sys.dm_os_buffer_descriptors AS b ON a.allocation_unit_id = b.allocation_unit_id JOIN sys.partitions AS p ON a.container_id = p.hobt_id WHERE p.object_id = OBJECT_ID('wide_index') AND b.database_id = DB_ID() ) SET @bp_narrow_mb = ( SELECT CAST((COUNT(*) * 8.00) / 1024 AS DECIMAL(9,2)) AS MB FROM sys.allocation_units AS a JOIN sys.dm_os_buffer_descriptors AS b ON a.allocation_unit_id = b.allocation_unit_id JOIN sys.partitions AS p ON a.container_id = p.hobt_id WHERE p.object_id = OBJECT_ID('narrow_index') AND b.database_id = DB_ID() ) --Wait for trace data to arrive in target WAITFOR DELAY '00:00:10' --Get the trace data into our table SELECT CAST(event_data AS XML) AS StatementData INTO #myRawXeData FROM sys.fn_xe_file_target_read_file('C:\R\frag_test*.xel', NULL, NULL, NULL); --Done with trace file, delete it EXEC change_trace_status @state = 'delete_all_files'; --Transform raw trace data into our measures table WITH t AS( SELECT StatementData.value('(event/data[@name="duration"]/value)','bigint') AS duration_microsec ,StatementData.value('(event/data[@name="cpu_time"]/value)','bigint') AS cpu_microsec ,StatementData.value('(event/data[@name="physical_reads"]/value)','bigint') AS physical_reads ,StatementData.value('(event/data[@name="logical_reads"]/value)','bigint') AS logical_reads ,StatementData.value('(event/data[@name="statement"]/value)','nvarchar(500)') AS statement_ --,StatementData.value('(event/@timestamp)','datetime') AS timestamp_ FROM #myRawXeData AS evts WHERE StatementData.value('(event/data[@name="statement"]/value)','nvarchar(500)') LIKE '%index WHERE c1%' --ORDER BY timestamp_ ), t2 AS ( SELECT CASE WHEN t.statement_ LIKE '%wide_index%' THEN 'wide_index' ELSE 'narrow_index' END AS index_width ,CASE @fragmented WHEN 'high_frag_level' THEN 'y' ELSE 'n' END AS fragmented ,duration_microsec ,CASE WHEN t.statement_ LIKE '%wide_index%' THEN @exec_time_wide_index_usec ELSE @exec_time_narrow_index_usec END AS usec_clock ,cpu_microsec ,physical_reads ,logical_reads ,CASE WHEN t.statement_ LIKE '%wide_index%' THEN @bp_wide_mb ELSE @bp_narrow_mb END AS mb_cache FROM t) INSERT INTO measures(rows_to_read, index_width, fragmented, usec_xe, usec_clock, cpu_microsec, physical_reads, logical_reads, mb_cache) SELECT @rows_to_read, index_width, fragmented, duration_microsec, usec_clock, cpu_microsec, physical_reads, logical_reads, mb_cache FROM t2; --Report fragmentation level, if requested IF @report_frag = 'y' --Note size of index and frag level, should be comparative between executions SELECT OBJECT_NAME(s.object_id) AS table_name ,s.index_type_desc ,CAST(s.avg_fragmentation_in_percent AS decimal(5,1)) AS frag_level ,s.page_count/1000 AS page_count_1000s ,s.index_depth FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS s WHERE s.index_level = 0 AND s.alloc_unit_type_desc = 'IN_ROW_DATA' AND OBJECT_NAME(s.object_id) IN('narrow_index', 'wide_index') ORDER BY index_id GO --END of proc that executes our SQL --1: Generate fragmentation in both indexes --Fragment wide ix EXEC #status 'Generate fragmentation in wide index…' UPDATE wide_index SET c6 = REPLICATE('a', 200) WHERE c1 % 20 = 0 UPDATE STATISTICS wide_index WITH FULLSCAN, MAXDOP = 0 --Fragment narrow ix EXEC #status 'Generate fragmentation in narrow index…' UPDATE narrow_index SET c2 = REPLICATE('a', 20) WHERE c1 % 100 = 0 UPDATE STATISTICS narrow_index WITH FULLSCAN, MAXDOP = 0 --Configure the instance and database --Execute this if you want to have a rather full BP, restricts memory to 3 GB EXEC sp_configure 'max server memory', 3000 RECONFIGURE --Turn off auto-update statistics ALTER DATABASE Stackoverflow SET AUTO_UPDATE_STATISTICS OFF --Turn off parallelism ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 1 --Just for fun, if you want: disable read-ahead (sequential pre-fetch) --DBCC TRACEON(652, -1) --Run the queries DROP TABLE IF EXISTS #myRawXeData EXEC #status 'Run queries with high frag level…' --EXEC run_the_sql @fragmented = 'high_frag_level', @rows_to_read = 1, @range_iterations = 100, @fill_bp = 'y', @report_frag = 'y' EXEC run_the_sql @fragmented = 'high_frag_level', @rows_to_read = 10000, @range_iterations = 100, @fill_bp = 'y', @report_frag = 'n' EXEC run_the_sql @fragmented = 'high_frag_level', @rows_to_read = 100000, @range_iterations = 100, @fill_bp = 'y', @report_frag = 'n' --2: Defrag the indexes with a fillfactor set to make same size as when fragmented EXEC sp_configure 'max server memory', 26357 RECONFIGURE EXEC #status 'Eliminate fragmentation in wide index…' ALTER INDEX x ON wide_index REBUILD WITH (FILLFACTOR = 48, MAXDOP = 0) EXEC #status 'Eliminate fragmentation in narrow index…' ALTER INDEX x ON narrow_index REBUILD WITH (FILLFACTOR = 50, MAXDOP = 0) EXEC sp_configure 'max server memory', 3000 RECONFIGURE --Run the queries DROP TABLE IF EXISTS #myRawXeData EXEC #status 'Run queries with low frag level…' --EXEC run_the_sql @fragmented = 'low_frag_level', @rows_to_read = 1, @range_iterations = 100, @fill_bp = 'y', @report_frag = 'y' EXEC run_the_sql @fragmented = 'low_frag_level', @rows_to_read = 10000, @range_iterations = 100, @fill_bp = 'y', @report_frag = 'n' EXEC run_the_sql @fragmented = 'low_frag_level', @rows_to_read = 100000, @range_iterations = 100, @fill_bp = 'y', @report_frag = 'n' ----------------------------------------------------------------------------------- --Reset configurations EXEC sp_configure 'cursor threshold', -1 EXEC sp_configure 'max server memory', 26357 RECONFIGURE ALTER DATABASE Stackoverflow SET AUTO_UPDATE_STATISTICS ON ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 0 DBCC TRACEOFF(652, -1) GO ----------------------------------------------------------------------------------- --Run below manually and investigate the output --Raw data from the trace SELECT * FROM measures ORDER BY rows_to_read, index_width, fragmented --Average the data. SELECT m.rows_to_read ,m.index_width ,m.fragmented --,AVG(m.usec_xe) AS usec_xe ,AVG(m.usec_clock) AS usec_clock ,AVG(m.mb_cache) AS mb_cache ,AVG(m.physical_reads) AS physical_reads ,AVG(m.logical_reads) AS logical_reads --,AVG(m.cpu_microsec) AS cpu_microsec --I didn't find this relevant, hence the exclusion FROM measures AS m GROUP BY m.rows_to_read, m.index_width, m.fragmented ORDER BY index_width, rows_to_read, fragmented; ----------------------------------------------------------------------------------- --Cleanup /* DROP TABLE IF EXISTS narrow_index DROP TABLE IF EXISTS myXeData DROP TABLE IF EXISTS myXeData2 DROP TABLE IF EXISTS bp_usage */