Resynchronizing a target server (MSX – TSX)

I often use SQL Server Agent master / target servers (MSX/TSX). I find it so convenient to create the job once and then just add whatever targetservers (TSX) should have this job. Especially when you later modify the job. Again, just modify it once. The usage of MSX in general, and how I use it, can easily become a series of blog posts in itself. But that is not the point here.

Sometimes a TSX goes out-of-sync with its master. I’ve never understood exactly under what circumstances, but it feels like it happened when you do things “too quickly”. Like change a job, push it out, and before the push has finished, you change it again. Or something like that. A TSX going out-of-sync doesn’t happen frequently. I’ve had it a handful of times. And every time, I have spent time searching etc. on how to fix it.

A couple of days ago, a client of mine had this case, and he had himself tracked down a possible way to fix this. We decided to go ahead with this, and it worked just fine. So, the purpose here is to document the (very easy) fix, for whenever this happens again. And for all of you out there who might benefit, of course. The error you see is something like:

[291] An unresolved problem exists with the download instructions (sysdownloadlist) for target server ‘Y’ at MSX ‘X’

X here is obviously the master and Y the target. And the solution was quite simply (in the msdb database):

EXEC dbo.sp_resync_targetserver, N’Y’

Are you using MSX/TSX? Have you had sync issues? How did you handle them?

Express Edition revisited, focus on SSMS

(Note: I have re-written parts of this post in the light of the comments that SP1 of 2012 include Complete tools.)

I have decided to revisit the topic of whats included in Express Edition, with focus on the tools. I have a couple of reasons for this:

  • In my 2011 post, I never tried to connect from Express SSMS to a non-Express database engine.
  • I want to check if there are any significant differences in SQL Server 2012 Express Edition, compared to SQL Server 2008R2 Express Edition.

It isn’t uncommon that people want to have SQL Server Management Studio (SSMS) on their machines; and instead of searching for the install files for the full product, they download the freely available Express Edition and install SSMS from there. This was the main reason for this update post, and the reason I focus on SSMS and the tools in this post.

It turns out that both 2008R2 and 2012 RTM Express editions of SSMS includes a lot, but not quite everyting that the full version of SSMS has. And they don’t have Profiler or Database Engine Tuning Advisor. 2012 SP1 Express download does indeed have the Complete tool package.

Basic and Complete
The full SSMS (etc.) is referred to as “Management Tools – Complete”. This is only available with the Product you pay for and with 2012 SP1 Express. The only one available with the various free Express downloads (prior to 2012 SP1), is called “Management Tools – Basic”. You can explicitly request to install Basic from an install media that includes Complete, but you have to explicitly request that in the setup program. You don’t want to do that.

One difference between 2008R2 and 2012 is when you install from a pay-media and select that you want to install Express. For 2008R2, you then only have SSMS Basic available. For 2012, you have Complete. In other words, if you use a 2012 pay-media and select Express to install SSMS, you have the option to have the full-blown SSMS – Complete (including other tools, like Profiler).

The downloads
For SQL Server 2008R2, you have “Express Edition” and “Express Edition with Advanced Services”. The former is basically only the database engine, where the later has some Tools (SSMS Basic, primarily). See my earlier blog post for more details about 2008R2.

For 2012, there are bunch of downloads available. Note that if you want Complete tools, you need to download SP1 of the installers. You find SP1 here (and RTM, which you don’t want to use, here). SP1 includes Complete tools, and you will see that those downloads are significantly larger compared to RTM. It isn’t obvious what each exe files stand for, but scroll down and you will find pretty good explanations. I tried several of these (SSMS only, Express with Tools, Express with Advanced Services). They all have in common that for RTM the tool included is Basic, where for SP1 we have Complete.

So what is the difference between Basic and Complete?

In the table below, my focus was on what isn’t in Basic. In general, I don’t bother to list functionality which is available in both Basic and Complete. So, if the functionality isn’t in the table below, it is likely available in Basic. I might have missed something, of course! And my main focus was on SSMS and the database engine.

 

Component/Functionality 2008R2 2012 RTM 2012 SP1
Functionality in SSMS
Node for Agent Y Y Y
Graphical Execution Plans Y Y Y
Projects and Solutions N Y Y
Maint Plans, Wizard Y Y Y
Maint Plans, New, designer N (1) N (2) Y
Maint Plans, Modify N (1) N (2) Y
Node for SSIS Catalog N/A Y Y
Tools menu, Profiler N N Y
Tools menu, Tuning Advisor N N Y
Connect Object Explorer to:
Analysis Services N N Y
Reporting Services N N Y
Integration Services N N Y
Tools
Profiler N N Y
Database Engine Tuning Advisor N N Y

(1): The selections are there, but they were dead – nothing happened when you select them.
(2): The selections are there, but I got an error message when selecting any of them.

Backup and the evil RETAINDAYS option

“So what bad has this option done?”, you probably as yourself. Well, not much, but I find it evil because it confuses people, especially those new to SQL Server. I have many times seen people specifying something like 3, and expect SQL Server to keep the three most recent backups in the backup file and overwrite everything which is older than that. Well, that is not what the option does.

But before we go into details, let’s look at an example backup command which is using this option:

BACKUP DATABASE sqlmaint TO DISK = 'R:\sqlmaint.bak' WITH RETAINDAYS = 3

The RETAINDAYS is also exposed in the backup dialog in SSMS: “Backup set will expire: After x days“.

It is also exposed in Maintenance Plans, the backup task. The option is named “Backup set will expire: After x days“. It is only enabled if you select the “Back up databases across one or more files” option, which is not the default option. This makes sense.
The default option is “Create a backup files for every database“, which means that every time a backup is performed, a new file is created consisting of the database name, date and time. Since we will see that this option is only relevant when we do append, it makes sense in the RETAINDAYS not being enabled for this choice.

So what does this option do? All it does is make SQL Server return an error message of you try to do a backup using the INIT option (which means overwrite) before the date and time has occurred. In other words, it tries to help you in not overwriting a backup file, using the INIT option, before it is time. You can still overwrite earlier by either using the stronger FORMAT option instead of INIT; or by simply deleting the backup file. Btw, the RETAINDAYS parameter has a cousin named EXPIREDATE, which does the same thing but you specify a datetime value instead of number of days.

Backup generations
So, we have seen that RETAINDAYS do not in any way provide any automatic backup generation handling. There is no such functionality built-in in the BACKUP command. This means that when you are looking for functionality such as “keep backup files three days, and remove older than that”, you need to look outside the BACKUP command. I.e., some script or tool of some sort.

I have an example (without the “delete old files” part) here, which I mostly created as a starting point for those who want to roll their own and want to have some example to start with. Many of you are probably using Maintenance plans (the “Create a backup files for every database” option in the backup task, along with Maintenance Cleanup task). Another popular script/tool for this is Ola Hallengren’s Maintenance Solution, which you find at http://ola.hallengren.com/.

Analyzing the errorlog

How often do you do this? Look over each message (type) in the errorlog file and determine whether this is something you want to act on. Sure, some (but not all) of you have some monitoring solution in place, but are you 100% confident that it really will notify for all messages that you might find interesting? That there isn’t even one little message hiding in there that you would find valuable knowing about? Or how about messages that you typically don’t are about, but knowing that you have a high frequency can be valuable information?

So, this boils down to actually reading the errorlog file. Some of you probably already have scripts and tool that makes this easier than just reading every simple message from top to bottom. I wanted to share how I do it, and this is why I wrote my Analyze SQL Server logs article. Check it out. And, feedback is always welcome!

List columns where collation doesn’t match database collation

Below script lists all database/table/column where the column collation doesn’t match the database collation. I just wrote it for a migration project and thought I’d share it. I’m sure lots of tings can be improved, but below worked just fine for me for a one-time execution on a number of servers.

IF OBJECT_ID('tempdb..#res') IS NOT NULL DROP TABLE #res
GO

DECLARE
@db sysname
,@sql nvarchar(2000)

CREATE TABLE #res(server_name sysname, db_name sysname, db_collation sysname, table_name sysname, column_name sysname, column_collation sysname)

DECLARE c CURSOR FOR
SELECT name FROM sys.databases WHERE NAME NOT IN('master', 'model', 'tempdb', 'msdb') AND state_desc = 'ONLINE'

OPEN c
WHILE 1 = 1
BEGIN
FETCH NEXT FROM c INTO @db
IF @@FETCH_STATUS <> 0
BREAK
SET @sql =
'SELECT
@@SERVERNAME AS server_name
,''' + @db + ''' AS db_name
,CAST(DATABASEPROPERTYEX(''' + @db + ''', ''Collation'') AS sysname) AS db_collation
,OBJECT_NAME(c.object_id, ' + CAST(DB_ID(@db) AS sysname) + ') AS table_name
,c.name AS column_name
,c.collation_name AS column_collation
FROM ' + QUOTENAME(@db) + '.sys.columns AS c
INNER JOIN ' + QUOTENAME(@db) + '.sys.tables AS t ON t.object_id = c.object_id
WHERE t.type = ''U''
AND c.collation_name IS NOT NULL
AND c.collation_name <> CAST(DATABASEPROPERTYEX(''' + @db + ''', ''Collation'') AS sysname)
'
--PRINT @sql
INSERT INTO #res
EXEC(@sql)
END
CLOSE c
DEALLOCATE c
SELECT * FROM #res

Send email after Agent job execution

After executing a job, I want to have an email with the following information:

  • Success or failure in the mail subject (not just in the body)
  • Success or failure and execution time for each job step in mail body
  • Have output file(s) for the job steps attached to email

Agent has none of above built-in so I created my own procedure for this. This is meant to be added as a final job step for your jobs. You find all details and the procedure here: http://karaszi.com/mailafterjob-send-email-after-agent-job-include-ouput-files. Feedback is of course very welcome.

Who owns your jobs?

Yep, another Agent blog. The topic of job owner has been discussed here before, but this is a specific situation, which potentially is a ticking bomb. First a quick recap of how agent does things:

When a job is owned by somebody who is sysadmin, then no impersonation is performed.

  • T-SQL jobsteps will be executed as Agent’s own Windows authenticated login.
  • Other jobsteps (which operates as the OS level) is done in Agent’s service account’s context.

Above is, assuming you didn’t ask for impersonation for the jobstep (Run As).

Last week, at a SQL Server admin class, Andreas Jarbéus from Swedish Match asked about job owned by Windows account and what happens when that person leaves the company. I was about to say “no problem, if the job owner is sysadmin”, but I felt there was a bit more to it. The scenario is:

  • The job owner is member of a Windows/AD group. Say you have a “SQL Server DBA” group in your AD.
  • This group is added as a login to your SQL Servers. The persons in that group do not have individual logins in SQL Server – only through this group.
  • This group is sysadmin. (I’m not saying that having jobs in general owned by sysadmins is a best practice. You can for instance have vendors installing some software and you don’t want to make them sysadmin. In such a case, clearly we don’t want the job to keep running if that person would disappear. So, I’m targeting jobs owned by sysadmin here, and I would bet you have bunch of those.)
  • Somebody in this group creates an Agent job.
  • The person in the group leaves the company and is removed from this Windows/AD group.

Above is a pretty common scenario. The thing now is that the job no longer works. And here’s the reason:

If the owner was somebody in Windows/AD who’s SID exists in SQL Server (the individual was added as a login to SQL Server), then the job will still work. This also applies for SQL Logins which might be disabled (like “sa” for instance). The reason for this is that the SID for the job owner exists in sys.server_principals and SQL Server will know that this SID is sysadmin. Since Agent won’t do impersonation for sysadmins, there’s no need to “go out to Windows” and see if this SID exists in Windows/AD. Remove the login, and the job will stop working, though – but you have at least don’t something explicitly inside your SQL Server to cause this (removed the login).

But when the owner’s SID don’t exist in sysadmin we have a problem. The problem is that the recorded owner of the job is the SID for the Windows user, even though that SID doesn’t exist in sys.server_principals. As long as this still exists in Windows/AD, the job will work just fine. Agent will run this job owned by ?, ask Windows who this SID is and see that this SID is a member of a group which exists as a login in SQL Server and that is sysadmin. Now, imagine what happen if the SID doesn’t exist in Windows anymore. The job is owned by ?, and that is all that SQL Server knows. The SID no longer exist in Windows/AD so you there’s nobody to tell SQL Server “Yeah, this used to be a sysadmin in SQL Server – through a group membership in Windows/AD”.

Even worse, the job seems to keep working until you re-start the SQL Server service, some caching going on. Makes it even more difficult to determine why the job suddenly stops working: “We deleted this Windows account 4.5 months ago.” This is easy to repro, and you don’t even need a domain (local Windows accounts and groups work just fine):

  • Create a Windows group for this.
  • Create a Windows account for this.
  • Add the Windows account to above group.
  • Add above group as a login to SQL Server.
  • Make this group sysadmin.
  • Create a job, with two jobsteps:
  • T-SQL which executes: PRINT ORIGINAL_LOGIN() + ‘ | ‘ + SUSER_SNAME() + ‘ | ‘ + USER_NAME()
  • CmdExec which executes: WHOAMI.EXE
  • (Above isn’t really necessary, but useful for testing other impersonation aspects of Agent)
  • Specify the Windows account as owner of the job.
  • Run the job, make sure it works fine.
  • Remove the Windows account from the Windows group.
  • Run the job and see it fails.

I’m sure this has been blogged before, but I was a bit surprised by it. I have now added a check for this in my “Check best practices” stored procedure I run every day… On the flip side, this doesn’t seem to happen for multi-server (MSX) jobs, the ownership seems to have special handling for these – you will see if you check the owner_sid in sysjobs.

Agent Alerts Management Pack updated

I have updated the Agent Alerts Management Pack, cleaned up the T-SQL code which creates the alerts. You can now just copy, paste and execute that code. Earlier, for some of the some of the alert definitions it just generated the calls to the query result window, and you had to copy this to a query window and execute it. Now you just copy the code and execute it. You find it here: http://karaszi.com/agent-alerts-management-pack.

Restricting memory using Resource Governor

You might know that Resource Governor (RG) can be used to allow you to affect resource allocation inside SQL Server. RG was introduced with SQL Server 2008 and requires Enterprise Edition or similar/higher. There are two things you can govern using RG: cpu and memory.

For introductory information on RG, see for instance these BOL topics.

A misconception I see from time to time is that the memory setting is to restrict the Buffer Pool, BP, (mainly used for caching pages from disk). This seems difficult since a page in BP has a cost, but isn’t owned by whoever brought it into cache or who referenced it last time. So, it seems difficult with current implementation of cache handling and ageing to govern the BP memory for RG. What RG does govern is “working memory” for a query. Think memory used for sorting, locks, hashing and such.

We just had such a discussion in the MSDN forums, and I decided to do a test to see if we could show that RG do not govern the BP. Brief outline of the script

  • Restrict the BP to 300 MB
  • Create two databases (a and b)
  • Create a table with size of 255MB in each database
  • Create two logins with name a and b, which will be used to execute queries
  • The logins end up with two different resource pools (aPool and bPool), where aPool has max mem 80% and bPool has max mem 20%
  • A loop which generates queries that are executed using xp_cmdshell and SQLCMD
  • The loop first scan over the first approx 9% of data, first login a in database a, then in login b in database b
  • Next iteration is then done over the following 9 % of data
  • After 11 iteration, we’ve done most data
  • Finally check how much memory in the BP each database is using

If RG were to govern the BP, we expect to see database a using lots of more pages than database b. That not what we are seeing. You will probably see that database b is using slightly more memory than a. That is because you done exactly the same queries over exactly the same data for each database and the most recent database you queried was database b (pushing out some of the pages with same cost that database a had in cache). You might want to execute the loop a few times to possibly see things even out a bit.

Seeing is believing, they say, and to me this show that Resource Governor do not govern the Buffer Pool.

If you believe there are faults in how the script does things, feel free to discuss. If you have some alternative script you want us to try, I suggest you open a thread in the MSDN forum (for example) and there post a script which is pretty immediately executable for us(possibly with modifications of file path and server name). Unless the changes you propose are so minor that can be immediately explained in your text.

Use script at own risk.


USE MASTER
SET NOCOUNT ON

--Configure Buffer Pool for max 300 MB memory
EXEC sp_configure 'max server memory', 300
RECONFIGURE

--Drop and create databases a and b
IF DB_ID('a') IS NOT NULL DROP DATABASE a
IF DB_ID('b') IS NOT NULL DROP DATABASE b
GO
CREATE DATABASE a
ON  PRIMARY
(NAME = N'a', FILENAME = N'R:\a.mdf' , SIZE = 250MB , FILEGROWTH = 5MB )
LOG ON
( NAME = N'a_log', FILENAME = N'R:\a_log.ldf' , SIZE = 30MB , FILEGROWTH = 5MB)

CREATE DATABASE b
ON  PRIMARY
(NAME = N'b', FILENAME = N'R:\b.mdf' , SIZE = 250MB , FILEGROWTH = 5MB )
LOG ON
( NAME = N'b_log', FILENAME = N'R:\b_log.ldf' , SIZE = 30MB , FILEGROWTH = 5MB)
GO
ALTER DATABASE a SET RECOVERY SIMPLE
ALTER DATABASE b SET RECOVERY SIMPLE

--Create tables and populate with data, approx 250 MB each
CREATE TABLE a..t(c1 INT IDENTITY PRIMARY KEY CLUSTERED, c2 INT, c3 CHAR(100))
GO
INSERT INTO a..t (c2, c3)
SELECT TOP (90000) ROW_NUMBER() OVER(ORDER BY a.OBJECT_ID), 'hello'
FROM sys.columns AS a CROSS JOIN sys.columns AS b
GO 25
CREATE TABLE b..t(c1 INT IDENTITY PRIMARY KEY CLUSTERED, c2 INT, c3 CHAR(100))
GO
INSERT INTO b..t (c2, c3)
SELECT TOP (90000) ROW_NUMBER() OVER(ORDER BY a.OBJECT_ID), 'hello'
FROM sys.columns AS a CROSS JOIN sys.columns AS b
GO 25

--Logins
IF SUSER_ID('a') IS NOT NULL DROP LOGIN a
IF SUSER_ID('b') IS NOT NULL DROP LOGIN b
GO
CREATE LOGIN a WITH PASSWORD = 'pwd'
CREATE LOGIN b WITH PASSWORD = 'pwd'

--Users
USE a
IF DATABASE_PRINCIPAL_ID('a') IS NOT NULL DROP USER a
GO
CREATE USER a
GO
GRANT SELECT ON t TO a
GO
USE b
IF DATABASE_PRINCIPAL_ID('b') IS NOT NULL DROP USER b
GO
CREATE USER b
GO
GRANT SELECT ON t TO b
GO

--Configure RG:
USE MASTER
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = NULL);
ALTER RESOURCE GOVERNOR RECONFIGURE;
IF EXISTS(SELECT * FROM sys.resource_governor_workload_groups WHERE name = 'aGroup')
DROP WORKLOAD GROUP aGroup
IF EXISTS(SELECT * FROM sys.resource_governor_workload_groups WHERE name = 'bGroup')
DROP WORKLOAD GROUP bGroup
IF EXISTS(SELECT * FROM sys.resource_governor_resource_pools WHERE name = 'aPool')
DROP RESOURCE POOL aPool
IF EXISTS(SELECT * FROM sys.resource_governor_resource_pools WHERE name = 'bPool')
DROP RESOURCE POOL bPool

CREATE RESOURCE POOL aPool WITH (MAX_MEMORY_PERCENT = 80);
CREATE RESOURCE POOL bPool WITH (MAX_MEMORY_PERCENT = 20);

CREATE WORKLOAD GROUP aGroup USING aPool;
CREATE WORKLOAD GROUP bGroup USING bPool;

IF OBJECT_ID ('dbo.MyClassifier') IS NOT NULL DROP FUNCTION dbo.MyClassifier;
GO

CREATE FUNCTION dbo.MyClassifier ()
RETURNS SYSNAME WITH SCHEMABINDING
AS
BEGIN
DECLARE @GroupName SYSNAME;
IF SUSER_SNAME() = 'a'
SET @GroupName = 'aGroup';
ELSE IF SUSER_SNAME() = 'b'
SET @GroupName = 'bGroup';
ELSE SET @GroupName = 'Default';
RETURN @GroupName;
END;
GO

ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.MyClassifier);
GO
ALTER RESOURCE GOVERNOR RECONFIGURE;
GO

--Allow xp_cmdshell
EXEC sp_configure 'xp_cmdshell', 1
RECONFIGURE

--Execute queries as A and B to read bunch of data
--We have 2,250,000 in the tables
--We will construct this command: SELECT COUNT(c2) FROM a..t WHERE c1 BETWEEN 1 AND 200000
--The start and end values will differ, with ranges in 200,000 increments. 11 loops will add up to 2,000,000.
DECLARE @v INT = 1, @sql VARCHAR(2000)
WHILE @v <= 2200000
BEGIN
SET @sql = 'SQLCMD /Slocalhost\a /Ua /Ppwd /Q"SELECT COUNT(c2) FROM a..t WHERE c1 BETWEEN ' + CAST(@v AS VARCHAR(20)) + ' AND ' + CAST(@v + 199999 AS VARCHAR(20))+ '"'
EXEC xp_cmdshell @sql, no_output
SET @sql = 'SQLCMD /Slocalhost\a /Ub /Ppwd /Q"SELECT COUNT(c2) FROM b..t WHERE c1 BETWEEN ' + CAST(@v AS VARCHAR(20)) + ' AND ' + CAST(@v + 199999 AS VARCHAR(20))+ '"'
EXEC xp_cmdshell @sql, no_output
SET @v = @v + 200000
END

--Check how many pages from each db in BP:
SELECT
DB_NAME(b.database_id) AS database_name
,(COUNT(*) * 8192) / (1024 * 1024)) AS buffer_count_MB
FROM  sys.dm_os_buffer_descriptors AS b
GROUP BY  b.database_id
ORDER BY database_name

EXEC sp_configure 'xp_cmdshell', 0
RECONFIGURE

/*--CleanupDROP DATABASE aDROP DATABASE bALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = NULL);ALTER RESOURCE GOVERNOR RECONFIGURE;DROP WORKLOAD GROUP aGroupDROP WORKLOAD GROUP bGroupDROP RESOURCE POOL aPoolDROP RESOURCE POOL bPool*/