datawarehouseSqlserverandstories: TempDB Multiple Datafiles

Problem

Most DBAs are aware that (prior to SQL 2016) one of the most common, post installation changes that needs to be made in SQL Server is the addition of extra TempDB datafiles. What seems to be less well known is that ALL these files all need to be of equal size and need to expand in equal increments.

The addition of extra TempDB files is to alleviate contention on the PFS, GAM and SGAM pages (these are the pages that help SQL decide which pages to use to store data; meta-pages maybe?). The idea being that, basically, the extra datafiles means more PFS, GAM and SGAM pages, which means TempDB workloads have a greater number of pages they can use hence less time spent waiting for used pages to become unlatched. The reason for the EQUALLY sized datafiles is the way SQL allocates the data to each datafile. SQL will check to see if all the datafiles have the same amount of free space. If so, it will add data in a “round robin” fashion, where each datafile will take it in turn to store data. This is how the contention on the PFS, GAM and SGAM pages are reduced. If one of the TempDB files is of a greater size than the rest and has more free space, SQL will prioritise this one file over the others and use it until the free space becomes the same across all files in the filegroup.

Example

In the following example, i've used sys.dm_io_virtual_file_stats to capture the MB written and IO count on each file in TempDB. A baseline is taken initially, then a query is executed 5 times which creates a temporary table and fills it with data and finally another snapshot of sys.dm_io_virtual_file_stats is taken and the results used to find the totals.

Queries

Before starting the TempDB workload the following query is executed

SELECT GETDATE() as dtStart,
iovfs.*,
mf.name,
mf.type_desc
INTO #dm_io_virtual_file_stats_start
FROM sys.master_files mf
INNER JOIN sys.dm_io_virtual_file_stats(2,NULL) iovfs
ON mf.database_id = iovfs.database_id and mf.file_id = iovfs.file_id

The following query is executed 5 times to generated a workload in TempDB

SELECT TOP ( 2000000 )
fn = LEFT( o.name, 64 ),
ln = LEFT( c.name, 64 ), em = LEFT( o.name, LEN ) c.name )% 5+1 ) + '.'
+ LEFT( c.name, LEN ( o.name )% 5+2 ) + '@'
+ RIGHT( c.name, LEN ( o.name + c.name )% 12 + 1 )
+ LEFT( RTRIM ( CHECKSUM ( NEWID ())), 3 ) + '.com' ,
a = CASE WHEN c.name LIKE '%y%' THEN 0 ELSE 1 END
INTO #temptest
FROM sys.all_objects AS o
CROSS JOIN sys.all_columns AS c

And finally the following SQL is executed to compute the results.

SELECT
DB_NAME ( t1.[database_id] ) [DatabaseName] , t1.[name] [LogicalName] ,
(SUM(t2.[num_of_bytes_written])-SUM(t1.[num_of_bytes_written]))/1048576
AS [Written_MB]
, SUM(t2.[num_of_reads]+t2.[num_of_writes])-SUM(t1.[num_of_reads]+t1.[num_of_writes])
AS [IO_Count]
FROM #dm_io_virtual_file_stats_start AS t1
INNER JOIN
(SELECT iovfs.*,mf.[name],mf.[type_desc]
FROM sys.master_files mf
INNER JOIN sys.dm_io_virtual_file_stats (2,NULL) iovfs
ON mf.[database_id]= iovfs.[database_id] AND mf.[file_id] = iovfs.[file_id]) t2
ON t1.[database_id] = t2.[database_id] AND t1.[file_id] = t2.[file_id]
GROUP BY DB_NAME(t1.[database_id]), t1.[name]
ORDER BY IO_Count DESC

Results

1 TempDB file

Using a single TempDB datafile...

the following values were observed.

2 Equal TempDB files

Using two equal TempDB files

the following values were observed. Note the almost equal Written_MB and IO_Count.

2 Unequal TempDB files

Using two unequal TempDB files

the following values were observed. Note the unequal Written_MB and IO_Count.

4 Equal sized TempDB files

Using four equally sized TempDB files

the following values were observed.

4 Unequal TempDB files

And finally, Using four unequally sized TempDB files

the following values were observed.

Solutions

There are a couple of ways, when creating multiple TempDB files, to make sure they stay evenly sized. My preference is to dedicate TempDB datafiles to its own drive. Divide the drive size by the number of datafiles, create each datafile that size and dont allow them to autoextend. This can cause problems if an errent query uses all the space in TempDB, in which case the query will need to rewriten to use less TempDB space or additional space will need to be added to the drive.

A trace flag, -T1117, can be also be used. This trace flags grows all datafiles in a filegroup at the same time. Beware that this is a global trace flag and will affect all database on an instance.

Conclusion

Its important, when creating multiple TempDB files, that each file is of equal size. Be aware of TempDBs usage patterns and how each datafile will grow if fully used.

3 comments:

Paulo Morais Nascimento28 March 2017 at 07:18
Hello Iain,

You start your nice article by saying "Most DBAs are aware that (prior to SQL 2016) one of the most common..."

Does this means that spliting the system database tempdb datafiles is no longer needed in SQL Server 2016 (or vNext) ?

Thanks,
iain galloway28 March 2017 at 12:05
hello paulo

its still needed but now, during the install, theres an option to create multiple files. aaron bertrand explains the pros and cons in this article...

https://blogs.sentryone.com/aaronbertrand/sql-server-2016-tempdb-fixes/

thanks
iain
Paulo Morais Nascimento28 March 2017 at 16:53
Ok Iain, actually had some thoughts about that some time ago: https://www.linkedin.com/pulse/sql-server-tempdb-multiple-files-during-setup-paulo-a-nascimento

Cheers,
Paulo

Tuesday, 14 March 2017

TempDB Multiple Datafiles

3 comments: