Better Notification from SQL Agent Jobs

I manage about 10 SQL servers and I rely heavily on SQL Agent jobs to do this. My aim is to get everything going smoothly with minimal intervention from me, but I want to know straight away when something goes wrong. SQL Agent has an excellent scheduler and what you can do in a job step is flexible and powerful but it’s not great at coping if the job goes wrong. Here’s my simple solution.

Let’s look at an example: a job to backup a number of databases, one in each step. I want it to backup as many as it can, so if there’s an error I want it to keep going and then tell me at the end that the job wasn’t successful.

You can’t pass data directly between job steps. MSDN says you can do it via a permanent table or global temporary table – the permanent table worked for me but not the global temporary table – but this doesn’t feel right either. Depending on how you configure what a step does if it fails, you can control things to a degree:

  • If each step quits the job I don’t get all my backups done.
  • If each step goes to the next step the job won’t notify me there was a problem because it got successfully to the last step and that succeeded too. (I raised a request on Connect about this. If the job history viewer can show an asterisk when one of the steps failed, the notification system should be able to make this distinction too.)
  • I suppose I could use the ‘Go to step X’ to create a structure with an optional ‘Step X-1 failed’ step after each step which is skipped when there’s no error. But that’s just ugly and fragile.

None of these work for me. So what I do is have each step just go to the next step and the last step is a stored proc that checks the job history for this run and raises an error if anything failed. That last step quits the job reporting failure or success. Then I set the job to notify me if it fails. If any backup fails the job just moves on to the next one and, at the end, the stored proc sees the error and raises a new one, which triggers the notification email to me.

Here’s the stored proc:

if exists (	select	1 
		from	dbo.sysobjects 	
		where	id = object_id(N'dbo.dba_CheckJobSuccess') 
			and objectproperty(id, N'IsProcedure') = 1	)
	drop proc dbo.dba_CheckJobSuccess;

set ansi_nulls on
set quoted_identifier on
create proc [dbo].[dba_CheckJobSuccess]
	@JobID		uniqueidentifier 
Version : 1.0
Date    : 07/05/2009
set nocount on
--Check for job success
declare @PrevInstance	int
	, @ErrCount	int
--Find the end of the history for the previous run
select	top 1 @PrevInstance = sjh.instance_id
from	msdb.dbo.sysjobhistory sjh with (nolock)
where	sjh.job_id = @JobID
	and sjh.step_id = 0 --0=(Job outcome)
order by sjh.instance_id desc
select	@PrevInstance = isnull(@PrevInstance,0)
--Look for failed steps since then
select	@ErrCount = count(*)
from	msdb.dbo.sysjobhistory sjh with (nolock)
where	sjh.job_id = @JobID
	and sjh.instance_id > @PrevInstance
	and sjh.step_id <> 0 --0=(Job outcome)
	and sjh.run_status = 0
select @ErrCount = isnull(@ErrCount,0)
--Raise an error if any step(s) failed
if @ErrCount > 0
	raiserror('One or more job steps failed.', 16, 1)

And here’s how the last step calls it:

--Check for job success
exec master.dbo.dba_CheckJobSuccess @JobID = $(ESCAPE_SQUOTE(JOBID))

The value for @JobID uses token replacement to get the ID of the job in which the code is executing so I can paste this into any job and it will just work.


One thought on “Better Notification from SQL Agent Jobs

  1. Pingback: Monitoring SQL Agent Jobs | David Wimbush

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s