Wednesday, 6 July 2016

Tempdb monitoring - Free space alert

SQL Server uses tempdb for many features, including row versioning, service broker and query execution amongst others.  During the execution of queries, data used for sorts and hash joins can be “spilled” to tempdb if the memory granted to the query isn't enough to store the data.  If this happens for a very big query or lots of small queries or for queries on instance that makes extensive use read committed snapshot or any combination of many reasons, an error resembling the one below can be received...


Msg 1105, Level 17, State 2, Line 3
Could not allocate space for object 'dbo.SORT temporary run storage: 140737494384640' in database 'tempdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.

Due to the way I configure tempdb (a single drive dedicated to the datafiles with multiple equal sized datafiles all pre-grown to their maximum size) the advice offered in the error message wont work. Instead, I create a SQL Agent alert to warn me before the space is all used up.

Recreating the problem

Running the following query in Adventure Works will create a plan with sorts and hash joins, but with the correct statistics and enough memory available, will not need to use tempdb...

SELECT * 
FROM Sales.SalesOrderDetail s1 
INNER HASH JOIN Sales.SalesOrderDetail s2 
ON s1.rowguid = s2.rowguid 
ORDER BY s1.rowguid












PerfMon confirms that the free space in tempdb doesn't change...



















By fooling the optimiser into thinking there's only 1 row in the table...

UPDATE STATISTICS Sales.SalesOrderDetail WITH ROWCOUNT = 1

...the query plan generated will include a spill to tempdb because the estimated number of rows is very wrong and the subsequent memory grant is too low.










...and tempdb will have to store the data, as can be seen in PerfMon.




















This isn't a problem if the space in tempdb is available but if the there isn't enough space, errors will occur.
 









Solutions

What can be done when tempdb becomes full? Creating more space is the obvious answer. Enabling autogrowth on the datafile(s) can help but its best practise to have multiple, equal sized tempdb datafiles. Turning on trace flag 1117, which expands all datafiles in a filegroup at the same rate is a good idea although this is a server wide setting so will affect other databases on the instance.

My solution, creating an alert, doesn't really solve the problem but at least gives an early warning so evasive action can be taken.

Tempdb is getting full” alert

Although I call it an alert, its actually made up of the following 3 things...

1 – A SQL Agent alert
2 – A SQL Agent job
3 – A stored procedure

Starting from the bottom, the stored procedure is used to find what's using tempdb and then send an email with this info. I check current running queries, the size of the row store and the size of permanent tables created in tempdb although other checks can be added. To check the current running queries, I use Adam Mechanics sp_whoisactive to look for queries with a tempdb_allocation above zero and insert the 5 with the highest values into a temporary table (yes, I realise what I'm doing here :-) ). I use sys.dm_os_performance_stats for the row store size and sys.allocation_units for the permanent table size.

Once all the tempdb checks are collected, a dynamic HTML block of code is built to be used to email the DBA team using sp_send_dbmail.

USE [master]
GO
CREATE PROC [dbo].[spTempDBSizeInfo]
AS
DECLARE @subject varchar(500)
DECLARE @body NVARCHAR(MAX)
DECLARE @body_format varchar(20) = 'HTML'

SELECT @subject = 'Free TempDB space on ' + @@ServerName + ' - ' + CAST(100 - cast((A.TotalSpaceInMB-cntr_value/1024)*100/A.TotalSpaceInMB as int) AS VARCHAR(3)) + '% left of ' + cast(A.TotalSpaceInMB as varchar(12)) + 'MB'
FROM sys.dm_os_performance_counters
CROSS APPLY (select sum(size)*8/1024 as TotalSpaceInMB from tempdb.sys.database_files where type = 0) AS A
WHERE counter_name = 'Free Space in tempdb (KB)'


CREATE TABLE [dbo].[#tblWhoIsActive](
[dd hh:mm:ss.mss] [varchar](8000) NULL,
[session_id] [smallint] NOT NULL,
[sql_text] [xml] NULL,
[login_name] [nvarchar](128) NOT NULL,
[wait_info] [nvarchar](4000) NULL,
[tran_log_writes] [nvarchar](4000) NULL,
[CPU] [varchar](30) NULL,
[tempdb_allocations] [varchar](30) NULL,
[tempdb_current] [varchar](30) NULL,
[blocking_session_id] [smallint] NULL,
[reads] [varchar](30) NULL,
[writes] [varchar](30) NULL,
[physical_reads] [varchar](30) NULL,
[query_plan] [xml] NULL,
[used_memory] [varchar](30) NULL,
[status] [varchar](30) NOT NULL,
[tran_start_time] [datetime] NULL,
[open_tran_count] [varchar](30) NULL,
[percent_complete] [varchar](30) NULL,
[host_name] [nvarchar](128) NULL,
[database_name] [nvarchar](128) NULL,
[program_name] [nvarchar](128) NULL,
[start_time] [datetime] NOT NULL,
[login_time] [datetime] NULL,
[request_id] [int] NULL,
[collection_time] [datetime] NOT NULL) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
EXEC sp_WhoIsActive
@get_transaction_info = 2,
@get_plans = 1,
@destination_table='#tblWhoIsActive'

SELECT TOP 5
[dd hh:mm:ss.mss],
session_id,
CAST(sql_text as varchar(MAX)) AS sql_text,
login_name,
(REPLACE(tempdb_current,',','')*8)/1024 as tempdb_current_MB,
(REPLACE(tempdb_allocations,',','')*8)/1024 as tempdb_allocations_MB,
CPU,
reads,
writes
INTO #tblWhoIsActive_temp
FROM #tblWhoIsActive
WHERE REPLACE(tempdb_allocations,',','') > 0
ORDER BY tempdb_allocations DESC
IF @@ROWCOUNT = 0
BEGIN
SET @body = N'<head>' + N'<style type="text/css">h2, body {font-family: Arial, verdana;} table{font-size:11px; border-collapse:collapse;} td{background-color:#F1F1F1; border:1px solid black; padding:3px;} th{background-color:#99CCFF;}</style>' +
N'<h2><font color="#0000ff" size="4">Free TempDB space</font></h2>' + N'</head>' + N'<p>' + ' ' + '</p>'+ N'<body>' + N' <hr> ' + N'<h1><font color="#0000ff" size="2">The top 5 TempDB intensive queries currently running are:</font></h1>' +
N' ' + N'<font color="#0000ff" size="2">No queries using TempDB currently running</font>' + N' ' + N' <br></br>' + N'<p>' + ' ' + '</p>'+ N' <hr> ' + N'<h1><font color="#0000ff" size="2">TempDB components size MB:</font></h1>' +
N' ' + N'<table border="1">' + N'<tr><th>PersistedTableSizeMB</th><th>VersionStoreSizeMB</th></tr>' +
CAST ( (
SELECT td = SUM(au.total_pages)*8/1024, '',
td = MAX(A.VersionStoreSizeMB), ''
FROM tempdb.sys.partitions p
INNER JOIN tempdb.sys.allocation_units au
ON p.hobt_id = au.container_id
CROSS APPLY (SELECT cntr_value/1024 AS VersionStoreSizeMB FROM sys.dm_os_performance_counters WHERE counter_name = 'Version Store Size (KB)') AS A
WHERE OBJECT_NAME(p.object_id) not like '#%'
FOR XML PATH('tr'), TYPE )
AS NVARCHAR(MAX) ) +
N'</table>' + N'</body>' ;
END
ELSE
BEGIN
SET @body = N'<head>' + N'<style type="text/css">h2, body {font-family: Arial, verdana;} table{font-size:11px; border-collapse:collapse;} td{background-color:#F1F1F1; border:1px solid black; padding:3px;} th{background-color:#99CCFF;}</style>' +
N'<h2><font color="#0000ff" size="4">Free TempDB space</font></h2>' + N'</head>' + N'<p>' + ' ' + '</p>'+ N'<body>' + N' <hr> ' + N'<h1><font color="#0000ff" size="2">The top 5 tempdb intensive queries currently running are:</font></h1>' +
N' ' + N'<table border="1">' + N'<tr><th>[dd hh:mm:ss.mss]</th><th>session_id</th><th>sql_text</th><th>login_name</th><th>tempdb_allocations_MB</th><th>tempdb_current_MB</th><th>CPU</th><th>reads</th><th>writes</th></tr>' +
CAST ( (
SELECT td = [dd hh:mm:ss.mss], '',
td = session_id, '',
td = CAST(sql_text AS VARCHAR(256)),'',
'',
td = login_name, '',
td = tempdb_allocations_MB, '',
td = tempdb_current_MB, '',
td = CPU, '',
td = reads, '',
td = writes, ''
FROM #tblWhoIsActive_temp
ORDER BY tempdb_allocations_MB DESC
FOR XML PATH('tr'), TYPE )
AS NVARCHAR(MAX) ) +
N'</table>' + N' <br></br>' + N'<p>' + ' ' + '</p>'+ N' <hr> ' + N'<h1><font color="#0000ff" size="2">TempDB components size MB:</font></h1>' + N' ' + N'<table border="1">' +
N'<tr><th>PersistedTableSizeMB</th><th>VersionStoreSizeMB</th></tr>' +
CAST ( (
SELECT
td = SUM(au.total_pages)*8/1024, '',
td = MAX(A.VersionStoreSizeMB), ''
from tempdb.sys.partitions p
INNER JOIN tempdb.sys.allocation_units au
ON p.hobt_id = au.container_id
CROSS APPLY
(SELECT cntr_value/1024 AS VersionStoreSizeMB FROM sys.dm_os_performance_counters WHERE counter_name = 'Version Store Size (KB)') AS A
WHERE object_name(p.object_id) not like '#%'
FOR XML PATH('tr'), TYPE ) AS NVARCHAR(MAX) ) +
N'</table>' + N'</body>'
END

EXEC MSDB.dbo.sp_send_dbmail
@profile_name = 'ProfileName'
,@recipients = 'DBATeam'
,@subject = @subject
,@body = @body
,@body_format = @body_format



The stored procedure is called by an unscheduled SQL Agent Job.























The SQL Agent job is executed by a performance condition alert being triggered within SQL Agent alerts, with the option set to only raise the alert every 5 minutes. This value maybe too high for some people but can be altered on the options page.

























The SQL Agent alert is configured to be triggered when the Free Space falls below a certain value, which I normally set at 50% of the size of tempdb.






















As tempdb size can differ for different instances, i've done a little bit of TSQL trickery to make the script runnable on any instance.


USE [msdb]
GO
DECLARE @PercentToWarn DECIMAL(12,2) = 50
SET @PercentToWarn = @PercentToWarn/100

DECLARE @InstanceName VARCHAR(128) =
(SELECT CASE WHEN SERVERPROPERTY ('InstanceName') IS NULL THEN 'SQLSERVER'
ELSE 'MSSQL$' + CAST(SERVERPROPERTY ('InstanceName') AS VARCHAR(32)) END)

DECLARE @TempWarnSize INT = 
 (SELECT SUM(size)*8 FROM tempdb.sys.database_files WHERE type = 0) * @PercentToWarn
 
DECLARE @Job_id UNIQUEIDENTIFIER = (
 select job_id FROM sysjobs WHERE name = 'zzTempDB_Size_Info')
 
DECLARE @PerfCond VARCHAR(1024) = 
@InstanceName + ':Transactions|Free Space in tempdb (KB)||<|' + CAST(@TempWarnSize AS VARCHAR(132)) + ''

BEGIN
TRY
EXEC msdb.dbo.sp_help_alert 'FreeSpaceInTempdb'
END TRY

BEGIN
CATCH
EXEC msdb.dbo.sp_add_alert
@name=N'FreeSpaceInTempdb',       
@enabled=1,       
@delay_between_responses=300,       
@include_event_description_in=0,      
@performance_condition=@PerfCond,       
@job_id=@Job_id
END CATCH                 
GO


Results 

When tempdb free space drops below the configured value (50% by default although on this instance its configured to 80%), an email will be sent the DBAs.










Further Reading 

Although the tempdb 2005 whitepaper is starting to get a bit old, the details contained within are still largely relevant to subsequent SQL Server versions and is an informative read.

https://technet.microsoft.com/en-gb/library/cc966545.aspx

Also worth watching is a PASS lecture called Inside Tempdb by Bob Ward.

https://www.youtube.com/watch?v=SvseGMobe2w&feature=share&list=PLF80A8A233EE9F22F