Capturing deadlock events

Microsoft SQL Server

Traditionally, trace flags 1204 or 1222 are used to write information about deadlock events to the SQL Server log. But since SQL Server 2012, there is a default Extended Event session running called system_health that can be used instead. Extended events are stored as XML, and in this post I’ll show a method to extract information when a deadlock occurs and store it in a regular table. By having the deadlocks stored in a table, it is possible to view the history of deadlocks in order to enhance deadlock investigations.

Viewing Deadlock information from the system_health session

First, if you are unsure about the mechanisms behind deadlocks, check out my blog post Deadlock Basics. There you will find instructions of how to create a deadlock situation on your own.

Before we start, make sure the system_health session is running by, in SSMS, checking Management -> Extended Events -> Sessions. There should be a green arrow indicating that the session is running:

Extended Events System Health Session running

Extended Events System Health Session running

Next, create a deadlock event by following the instructions here.

The system_health session writes to two targets, both an event file and the ring buffer:

System Health Event File And Ring Buffer

System Health Event File And Ring Buffer

From now on, I’m going to use the event file. This is because the ring buffer suffers from an extreme lag, sometimes taking days before the event shows up. Check out this connect item for more information on that. Or here. Another reason is that there is not GUI for the events in the ring buffer. For the events stored on file, there is a GUI that can be accessed by right clicking *.event_file and selecting “View Target Data”:

System Health Event File Target Data

System Health Event File Target Data

Right click the “name” header and select “Sort Descending”. Somewhere at the top you will see the deadlock you created above, named xml_deadlock_report. If you select the row there is a graphical display of the event:

Graphical Deadlock Event

Graphical Deadlock Event

The method described above can be very useful to investigate on recent deadlock events. But if you want to store the event to keep a history of deadlock events, a bit more work is required. Remember that the extended events sessions only keep data for a limited time period. On a busy system with many events the information can be pushed out from the session targets rather quickly.

Storing Deadlock events in a table

Extended Events are stored as XML. The event viewed above was rendered from an XML representation of the event. So a bit of XQuery is required to get the XML to a table format. First we need to get the complete file name for the file target:

DECLARE @XE_EVENT_FILE_TARGET XML, @XE_FILE_PATH NVARCHAR(260)

SELECT @XE_EVENT_FILE_TARGET = CONVERT(XML, xst.target_data)
FROM sys.dm_xe_sessions xs
JOIN sys.dm_xe_session_targets xst
ON xs.address = xst.event_session_address
WHERE xs.name = 'system_health'
AND xst.target_name = 'event_file'

SELECT @XE_FILE_PATH = t.c.value('File[1]/@name', 'varchar(260)') FROM @XE_EVENT_FILE_TARGET.nodes('/EventFileTarget') AS t(c)

SELECT @XE_FILE_PATH

This should return something like “C:\Program Files\Microsoft SQL Server\MSSQL12.SS2014\MSSQL\Log\system_health_0_130729460341480000.xel”

Next, we use the function sys.fn_xe_file_target_read_file to read the XML. Add the following SQL to the code above:

IF (SELECT OBJECT_ID('#Tind_EE_DATA')) IS NULL BEGIN
	DROP TABLE #Tlind_EE_DATA
END

SELECT CAST(event_data AS XML) AS event_data
INTO #TLind_EE_DATA
FROM sys.fn_xe_file_target_read_file(@XE_FILE_PATH, null, null, null)
WHERE object_name = 'xml_deadlock_report'

SELECT * FROM #TLind_EE_DATA

Now you will see a row for each deadlock event recorded in the current session file target. The column contains XML, so to extract columns from the XML add the following statement to the code above:

;WITH CTE_EVENTS AS
	(
		SELECT
			c.value('@timestamp', 'DATETIME2') AS DeadlockTimestamp,
			c.value('(@name)[1]', 'nvarchar(250)') AS DeadlockEvent,
			c.value('(data/value/deadlock/victim-list/victimProcess)[1]/@id', 'nvarchar(250)') AS VictimProcessId,
			c.value('(data/value/deadlock/process-list/process)[1]/@id', 'nvarchar(250)') AS Processid_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@waitresource', 'nvarchar(250)') AS WaitResource_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@waittime', 'int') AS WaitTime_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@ownerId', 'bigint') AS OwnerId_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@currentdb', 'int') AS CurrentDb_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@lasttranstarted', 'datetime2') AS LastTranStarted_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@lastbatchstarted', 'datetime2') AS LastBatchStarted_1,
			c.value('(data/value/deadlock/process-list/process/inputbuf)[1]', 'nvarchar(max)') AS Statement_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@clientapp', 'nvarchar(250)') AS Client_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@hostname', 'nvarchar(250)') AS Hostname_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@loginname', 'nvarchar(250)') AS Loginname_1,
			c.value('(data/value/deadlock/process-list/process)[2]/@id', 'nvarchar(250)') AS Processid_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@waitresource', 'nvarchar(250)') AS WaitResource_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@waittime', 'int') AS WaitTime_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@ownerId', 'bigint') AS OwnerId_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@currentdb', 'int') AS CurrentDb_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@lasttranstarted', 'datetime2') AS LastTranStarted_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@lastbatchstarted', 'datetime2') AS LastBatchStarted_2,
			c.value('(data/value/deadlock/process-list/process/inputbuf)[2]', 'nvarchar(max)') AS Statement_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@clientapp', 'nvarchar(250)') AS Client_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@hostname', 'nvarchar(250)') AS Hostname_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@loginname', 'nvarchar(250)') AS Loginname_2
		FROM #TLind_EE_DATA
		CROSS APPLY event_data.nodes('//event') AS t (c)
	)
SELECT *
FROM CTE_EVENTS

The result is columns with some of the information recorded on a deadlock event. The most interesting columns concern the time for the event, the statements involved and which users. Both for the deadlock victim and for the surviving transaction. Be careful to trust the Timestamp value however since it may be incorrect. If you want to extract even more information, click on the column with the XML to see the complete content:

Deadlock Event Data

Deadlock Event Data

So. The complete script to store all deadlock events in the system_health target file could look like this. I have added a table to store the information and a variable to only read the latest events.

SET QUOTED_IDENTIFIER ON
SET NOCOUNT ON

DECLARE @LastDeadLock DATETIME2, @XE_EVENT_FILE_TARGET XML, @XE_FILE_PATH NVARCHAR(260)

SELECT @XE_EVENT_FILE_TARGET = CONVERT(XML, xst.target_data)
FROM sys.dm_xe_sessions xs
JOIN sys.dm_xe_session_targets xst
ON xs.address = xst.event_session_address
WHERE xs.name = 'system_health'
AND xst.target_name = 'event_file'

SELECT @XE_FILE_PATH = t.c.value('File[1]/@name', 'varchar(260)') FROM @XE_EVENT_FILE_TARGET.nodes('/EventFileTarget') AS t(c)

SELECT CAST(event_data AS XML) AS event_data
INTO #TLind_EE_DATA
FROM sys.fn_xe_file_target_read_file(@XE_FILE_PATH, null, null, null)
WHERE object_name = 'xml_deadlock_report'

IF (SELECT OBJECT_ID('TLind_DEADLOCKS')) IS NULL BEGIN
	CREATE TABLE TLind_DEADLOCKS
		(
			DeadlockTimestamp DATETIME2 NOT NULL,
			DeadlockEvent NVARCHAR(250) NOT NULL,
			VictimProcessId NVARCHAR(250) NOT NULL,
			Processid_1 NVARCHAR(250) NOT NULL,
			WaitResource_1 NVARCHAR(250) NULL,
			WaitTime_1 INT NULL,
			OwnerId_1 BIGINT NULL,
			CurrentDb_1 INT NULL,
			LastTranStarted_1 DATETIME2 NULL,
			LastBatchStarted_1 DATETIME2 NOT NULL,
			Statement_1 NVARCHAR(MAX) NULL,
			Client_1 NVARCHAR(250) NULL,
			Hostname_1 NVARCHAR(250) NULL,
			Loginname_1 NVARCHAR(250) NULL,
			Processid_2 NVARCHAR(250) NOT NULL,
			WaitResource_2 NVARCHAR(250) NULL,
			WaitTime_2 INT NULL,
			OwnerId_2 BIGINT NULL,
			CurrentDb_2 INT NULL,
			TranStarted_2 DATETIME2 NULL,
			LastBatchStarted_2 DATETIME2 NOT NULL,
			Statement_2 NVARCHAR(MAX) NULL,
			Client_2 NVARCHAR(250) NULL,
			Hostname_2 NVARCHAR(250) NULL,
			Loginname_2 NVARCHAR(250) NULL
		)
END

SELECT @LastDeadLock = ISNULL(MAX(DeadlockTimestamp), '1900-01-01') FROM TLind_DEADLOCKS

;WITH CTE_EVENTS AS
	(
		SELECT
			c.value('@timestamp', 'DATETIME2') AS DeadlockTimestamp,
			c.value('(@name)[1]', 'nvarchar(250)') AS DeadlockEvent,
			c.value('(data/value/deadlock/victim-list/victimProcess)[1]/@id', 'nvarchar(250)') AS VictimProcessId,
			c.value('(data/value/deadlock/process-list/process)[1]/@id', 'nvarchar(250)') AS Processid_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@waitresource', 'nvarchar(250)') AS WaitResource_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@waittime', 'int') AS WaitTime_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@ownerId', 'bigint') AS OwnerId_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@currentdb', 'int') AS CurrentDb_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@lasttranstarted', 'datetime2') AS LastTranStarted_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@lastbatchstarted', 'datetime2') AS LastBatchStarted_1,
			c.value('(data/value/deadlock/process-list/process/inputbuf)[1]', 'nvarchar(max)') AS Statement_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@clientapp', 'nvarchar(250)') AS Client_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@hostname', 'nvarchar(250)') AS Hostname_1,
			c.value('(data/value/deadlock/process-list/process)[1]/@loginname', 'nvarchar(250)') AS Loginname_1,
			c.value('(data/value/deadlock/process-list/process)[2]/@id', 'nvarchar(250)') AS Processid_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@waitresource', 'nvarchar(250)') AS WaitResource_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@waittime', 'int') AS WaitTime_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@ownerId', 'bigint') AS OwnerId_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@currentdb', 'int') AS CurrentDb_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@lasttranstarted', 'datetime2') AS LastTranStarted_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@lastbatchstarted', 'datetime2') AS LastBatchStarted_2,
			c.value('(data/value/deadlock/process-list/process/inputbuf)[2]', 'nvarchar(max)') AS Statement_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@clientapp', 'nvarchar(250)') AS Client_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@hostname', 'nvarchar(250)') AS Hostname_2,
			c.value('(data/value/deadlock/process-list/process)[2]/@loginname', 'nvarchar(250)') AS Loginname_2
		FROM #TLind_EE_DATA
		CROSS APPLY event_data.nodes('//event') AS t (c)
	)
INSERT TLind_DEADLOCKS
SELECT *
FROM CTE_EVENTS
WHERE DeadlockTimestamp > @LastDeadLock

DROP TABLE #TLind_EE_DATA

I run this in tempdb, that’s why I check for the existence of the table TLind_DEADLOCKS. But if you run it in tempdb, remember to extract the information and store it in a more permanent storage. Or if you prefer, run it in another database.

So when should this script be run? One way would be to run it in SQL Agent on a schedule. But the disadvantage with that is that if you run it too often, valuable server resources are consumed. If you run it too infrequently, there is a risk if missing event that have been pushed out from the file. A better way would be to use Alerts. That way the script will only run whenever a deadlock actually occurs. So let’s create an alert that fires on error 1205 – Deadlock. But first we need to make sure that error 1205 is logged, since by default it is not:

--TO CHECK IF LOGGING IS ENABLED (is_event_logged)
SELECT * FROM sys.messages WHERE language_id = 1033 AND message_id = 1205

--TO ALTER THE LOGGING PARAMETER
EXEC sp_altermessage @message_id = 1205, @parameter = 'WITH_LOG' ,@parameter_value = 'true'
GO

Next you’ll need to make sure SQL Server Alerting is set up correctly. Here you can find more information on how to do that.

Next, create a SQL Server Agent job called DeadLockEventCapture. It only needs one step that runs the code above. Make sure to select tempdb or whatever database you want to use:

Dead Lock Event Capture

Dead Lock Event Capture

If you already are using alerts, check if there already is an alert for error 1205. If so, you could probably use the same alert. Else, create a new alert with the name 1205 – Deadlock, Error number 1205:

SQL Server Alert 1205

SQL Server Alert 1205

On the response tab, check “Execute job” and select the job DeadLockEventCapture:

SQL Alert Dead Lock Event Capture

SQL Alert Dead Lock Event Capture

On the Options tab I suggest that you enter a value for “Delay between responses” to at least 30 seconds. This is to prevent the job from running too often on a system that experiences many deadlocks.

*** Updated 2015-04-23 ***

Added columns so that waittime, ownerId and currentdb are extracted as well.

Leave a Reply

Required fields are marked *.