SCOM can cause Unchecked Growth in SSODB

Ok, that’s a bit of an attention grabbing headline, so let me clarify that statement: SCOM can cause unchecked growth in SSODB if you’re not regularly backing up the SSODB transaction log.

We encountered this one today – a client’s SSODB ran out of space overnight, causing the BizTalk environment to shut-down. On further investigation, it would appear that every time SCOM checks the health of Enterprise Single Sign-On, an entry is recorded in the SSOX_AuditXpLookup table:

ESSO appears to be clever enough to manage the size of this table, truncating it every 30 minutes, however this doesn’t help if you’re not managing the size of the database transaction log through backups. To quote the SQL Server documentation:

If log records were never deleted from the transaction log, it would eventually fill all the disk space that is available to the physical log files. Log truncation automatically frees space in the logical log for reuse by the transaction log.

All the more reason to enable and run the supplied Backup BizTalk Server job to help maintain the health of your BizTalk environment.

If you’re looking for more information on the Backup BizTalk Server Job, take a look at my series of posts on the topic: http://www.modhul.com/2009/06/29/configuring-biztalk-for-disaster-recovery-part-1/

Hat-tip to this MSDN Forums posting that helped us identify the issue.

BizTalk DR: Configuring BizTalk Backup for Disaster Recovery – Part 2

Originally posted by Nick Heppleston at: http://www.modhul.com/2009/09/04/configuring-biztalk-backup-for-disaster-recovery-part-2/

This is the second of a three-part series covering the BizTalk Backup Process, where I’m looking at one of the two supported methods to backup and restore a BizTalk Server environment – Log Shipping. The series will cover the following topics:

  • In Part 1, I’ll provide a high-level overview of the backup and restore process. Part 1 is already published, go an read it now!
  • In Part 2, I’ll walk through an actual backup and restore setup, looking at the configuration steps as we go.
  • In Part 3, I’ll demonstrate a long-running BizTalk application that is restored on a sample disaster recovery environment using the backup and restore process configured in Part 2.

Please accept my apologies for the wide screenshots in this part of the series. Please also note that I have not repeated the configuration steps on the MSDN website – this is an excellent resource and I do not wish to repeat or plagarise this work. In this part of the series, I have simply added my own observations that I thought were lacking from the MSDN text.

BizTalk Log Shipping – A Quick Recap

Before we look at how the BizTalk Log Shipping process is configured, let us quickly recap on the process; for a more detailed overview, I recommend that you read Part One of this series of blog posts, specifically the section which covers The Backup and Restore Process.

The backup and restore process (BizTalk Log Shipping) is as follows:

  • The BizTalk Backup SQL Agent Job creates data and transaction-log backups from the production environment, placing log marks into the transaction-log backups. An audit table is maintained which lists the backups that have been created, the type of backup (data or transaction-log backup) and details the location of the backup files. The physical data- and log-files are then pushed to a UNC share, as shown below:

BizTalk Backup Process Overview

  • The standby environment reads the audit table maintained by the production environment, using this information to determine which data- and transaction-log files need to be restored. These files are restored from the UNC share using the SQL ‘NORECOVERY’ option, leaving the databases in a non-operational state, allowing additional transaction-logs to be applied. Note: the transaction-logs are not restored to the transaction-log mark in this step, instead the full transaction-log is restored:

BizTalk Backup Process Overview

These two steps continue throughout live operations. In the event of a disaster recovery situation, where the production environment has been lost and the standby environment needs to be brought online, three recovery steps must be performed on the standby environment:

  • The most recent transaction-log backup is restored to its transaction-log mark using the SQL ‘WITH RECOVERY’ option, placing the databases into an operationally ready state.
  • Database and registry updates statements are run to re-configure the restored databases to use the standby environment.
  • BizTalk services are brought online (including the Enterprise Single Sing-On server and Business Rules Engine service etc.) and the standby environment is made live as the production environment.

Once the live environment is recovered and the switch from the standby environment back to the live environment is ready to be made, the process is reversed, with the standby environment creating backups that are restored to the live environment.

In this post, we’ll look at how to configure each of these steps.

BizTalk Log Shipping – Prerequisites

Several prerequisites need to be setup before configuring BizTalk Log Shipping. I would recommend that you spend time to ensure that these areas work, as they will save you a great deal of pain when configuring log shipping. They include:

  1. Before starting to configure the BizTalk Log Shipping process, ensure that the production and standby servers can communicate with each other over the network;
  2. The production and standby SQL Server instances need to communicate with each other via MSDTC (Microsoft Distributed Transaction Coordinator). There are a number of excellent resources on the internet addressing MSDTC configuration issues including Troubleshooting Problems with MSDTC on the MSDN website. Ensure that the two serverscan communicate over MSDTC before starting to configure Log Shipping;
  3. The BizTalk backup and restore SQL Agent jobs should run as under an isolated user account – choose a Windows user account for this purpose and create a SQL Server login for this account. Map this SQL Server login to the BTS_BACKUP_USERS database role in the BizTalk Server databases; and finally
  4. It is recommend that the data- and transaction-log backup files are written to a highly-available UNC share. Ensure that an UNC share is created that is accessible by both the production and standby servers and that the necessary permissions are assigned. Check that you can read, write and delete files from this share using Windows Explorer on both  servers.

Configuring The Backup Job

The first task in configuring BizTalk Log Shipping is to configure the SQL Agent ‘Backup BizTalk Server (BizTalkMgmtDb)’ backup job on the production environment. This job is created by the BizTalk installer and is responsible for a number of tasks, including:

  • Creating a full database backup at the specified interval;
  • Writing a transaction-log mark to the transaction-log and creating a backup of the transaction-log; and finally
  • Clearing the backup history table, based on specified criteria.

To configure the job, follow the instructions detailed in the excellent How to Configure the Backup BizTalk Server Job on the MSDN website.

When configuring the backup job, remember the following:

  • The Frequency parameter denotes the frequency at which a full backup is taken. By default, this is set to d (daily) which is sufficient for most BizTalk environments. If you require more frequent full backups, change this accordingly – other values include h (hourly), w (weekly), m (monthly), or y (yearly). Note that irrespective of what frequency you set here, transaction log backups will occur every time this job is run.
  • The (optional) Force full backup after partial backup failures parameter instructs the backup job to create a full backup if a transaction-log backup failed the last time the job ran, ignoring the Frequency setting. For example, if you use a setting of ‘daily’ for the Frequency parameter and the job fails to create a full set of transaction-log backups, the next time the job runs, it will create a full-backup followed by a further transaction-log backup; a second full-backup followed by a transaction-log backup will be created when the job runs again at midnight (UTC time) to honour the Frequency setting.
  • When setting the Location of backup files parameter, ensure that the user who is running this SQL Agent job has sufficient privileges to write to that location.

By default, this job runs once every fifteen minutes; this means that a transaction-log backup will be created and restored once every fifteen minutes (once you have configured the log shipping restore jobs, which execute once every minute). If you want to reduce delay in restoring these transaction-logs to your standby environment, reduce the scheduled frequency of the backup job. I would however recommend that this job is run no more frequently than once every three minutes; you will however need to find a happy medium that works best in your environment – this will be based on the load on your BizTalk Server and the rate at which your SQL Server transaction-logs grow.

Once you have configured the backup job, execute it starting at step one and ensure you have backup data- and log-files in your UNC share location, and entries in the BizTalkMgmtDb.dbo.adm_BackupHistory table, as follows:

BackupHistory

These entries in the Backup History table will be read by the restore jobs on the standby SQL Server and used to identify which data and log files need to be restored. Of particular interest in the above screenshot of the Backup History table are the following columns:

  • The BackupSetId column, which is used during the restore to identify an entire backup-set;
  • The MarkName column, which details the Transaction-Log Mark Name given to transaction-logs (Notice how the full database-backup has a value of NULL instead of a a log mark name, as these cannot be applied to full backups);
  • The BackupFilename and BackupFileLocation columns, which are used by the restore jobs to locate the data- and log-files to restore;
  • The BackupType column, which is used by the restore jobs to identify the type of backup the backup set relates to; and finally
  • The SetComplete column, which is used by the (optional) Force full backup after partial backup failures parameter of the backup job to determine whether a full or partial backup was taken during the last execution.

One final point to note about the ‘Backup BizTalk Server (BizTalkMgmtDb)’ backup job: if the SQL Agent Service or the backup-job is disabled and misses the time-window in which the full-backup should be taken (based on the Frequency parameter), the next execution of the job will cause a full backup to be created. For example, if you use a setting of ‘daily’ for the Frequency parameter and the job is disabled between 2000 and 0200 (i.e. during the time window that the daily full-backup will occur – 0000UTC), when the job is enabled, it will detect that its ‘frequency’ has been missed and cause a full-back to be created, followed by a transaction-log backup.

Forcing a Full Backup

There may be cases where you wish to perform an ad-hoc full-backup, outside of the Frequency window specified in Backup BizTalk Server SQL Agent Job – for example, you may want to take a full backup before applying a hotfix or service pack, or when replacing hardware.

When a backup is forced, the Frequency window is ignored next time the Backup BizTalk Server job executes and both data- and log-file backups are generated. If a problem occurs in the scenarios described above and the databases need to be recovered, only one transaction log needs to be restored for each database, reducing the recovery time significantly.

To force a full backup, execute the BizTalkMgmtDb.dbo.sp_ForceFullBackup stored procedure; the next time the Backup BizTalk Server job executes, both data- and log-file backups will be generated.

Configuring BizTalk Log Shipping – The Restore Jobs

We’re now ready to configure the standby environment to automagically restore our data- and log-file backups that are created by our production environment.

Several new jobs will be created in the standby SQL Server environment to handle the restoring of backups (along with a number of tables and stored procedures in the master database, which are used to hold information about the Log Shipping process). The jobs in particular are responsible for a number of tasks, including:

  • Retrieving the backup history from the production environment;
  • Restoring the data- and transaction-log backup files to the standby SQL Server in a non-operational state, allowing further transaction-logs to be restored; and finally
  • Restoring all of the databases to an operational state, by restoring the next available transaction-log to their last log mark only when disaster recovery is invoked and the standby SQL Server environment is to be made operational.

For more detailed information on these jobs, I recommend that you read the first part of this series, specifically the section which covers the Restore Process and the BizTalk Log Shipping Jobs.

To configure log shipping on the standby environment, follow the instructions detailed in the excellent How to Configure the Destination System for Log Shipping on the MSDN website.

When configuring the backup job, remember the following:

  • The standby environment will try and read the data- and log-files from the same location as they were written to by the backup job; it is therefore prudent to use a UNC share location (mapped to a drive letter if necessary) that both the standby and production environments have read/write permissions to.
  • If you encounter problems executing the bts_ConfigureBizTalkLogShipping stored procedure, ensure that you have correctly configured MSDTC between the production and standby environments; additionally, check that you have enabled Ad Hoc Distributed Queries on the standby SQL Server (if using SQL Server 2005/2008) as this stored procedure executes distributed queries over a linked server to the production SQL Server.
  • If you need to re-configure your standby environment, re-run the configuration process including starting with the LogShipping_Destination_Schema.sql and LogShipping_Destination_Logic.sql before executing the bts_ConfigureBizTalkLogShipping stored procedure.

A note about the Physical Location of Database Files when Restoring

If you have been around SQL Server for any length of time, you will have come across some of the more ‘helpful’ features of restoring data- and transaction-log files – when you restore a backup, SQL Server will attempt to re-create the database on the standby environment in exactly the same directory structure as the data- and log-file/s appeared on the production SQL Server.

If the directory structure doesn’t exist on the standby server exactly as it did on the production server, you will receive an error similar to the following and the restore will fail:

Executed as user: BIZTALKDRBizTalkBackup. Directory lookup for the file “D:MSSQL.4MSSQLDATABizTalkMgmtDb.mdf” failed with the operating system error 3(The system cannot find the path specified.). [SQLSTATE 42000] (Error 5133) File ‘BizTalkMgmtDb’ cannot be restored to ‘D:MSSQL.4MSSQLDATABizTalkMgmtDb.mdf’. Use WITH MOVE to identify a valid location for the file. [SQLSTATE 42000] (Error 3156) Directory lookup for the file “D:MSSQL.4MSSQLDATABizTalkMgmtDb_log.LDF” failed with the operating system error 3(The system cannot find the path specified.). [SQLSTATE 42000] (Error 5133) File ‘BizTalkMgmtDb_log’ cannot be restored to ‘D:MSSQL.4MSSQLDATABizTalkMgmtDb_log.LDF’. Use WITH MOVE to identify a valid location for the file. [SQLSTATE 42000] (Error 3156) Problems were identified while planning for the RESTORE statement. Previous messages provide details. [SQLSTATE 42000] (Error 3119) RESTORE DATABASE is terminating abnormally. [SQLSTATE 42000] (Error 3013). The step failed.

There are a number of ways we can resolve this, including using the WITH MOVE statement, as suggested in the error text, however this would require us to make changes to the stored procedure that is invoked by the Restore Databases SQL Agent job – this would invalidate our warranty with Microsoft as changes to the SQL Server portion of the codebase are not supported.

SQL Server uses several techniques to determine the physical location that databases should be restored to (further information can be found in Microsoft Knowledgebase Article 221465); the easiest method is to perform an implicit ‘MOVE’ – if the database already exists on the standby environment, but at a different location than specified in the backup set, the location on the server is used during the restore.

I would therefore recommend that you create the various BizTalk databases (including the Business Rules Engine Db and SSO Db) in their correct locations on the standby environment before you enable the restore job, this way SQL Server will restore the databases to your preferred location.

Testing the Restore Functionality

With log shipping successfully configured, enable the BTS Log Shipping – Get Backup History (DBServer: “[Database Server Name]”, DBName: “[Database Name]”) SQL Agent job to retrieve backup history information from the production environment.

If everything is configured correctly, you will see entries in the standby SQL Server master.dbo.bts_LogShippingHistory table which show the backups that are yet to be restored (see the Restored and RestoredDateTime columns highlighted in Green). In the following screenshot, there is one full database backup, plus three transaction-log backups to be restored.

LogShippingHistory-DatabasesToBeRestoredv2

With the backup history successfully retrieved from the production environment, we’re now ready to start restoring the data- and log-files to our standby environment.

Enable the BTS Log Shipping – Restore Databases (DBServer: “[Database Server Name]”, DBName: “[Database Name]”) SQL Agent job. As described in the first post of this series, this job restores the BizTalk databases to a non-operational state, allowing further transaction-logs to be restored. The job uses the  ‘NORECOVERY’ option, leaving each database in the ‘Restoring…’ state, thereby allowing additional transaction-logs to be applied.

In the screenshot below, the Restore Databases job has been enabled and has executed, restoring the full backup set (BackupType ‘Db’) and two transaction-log backup sets (BackupType ‘Lg”) as highlighted in Green; the transaction log set that has not been restored – highlighted in Orange – is the N-1 backup-set that is only restored when the standby environment is to be brought online as the new production environment; this is to ensure that we always have one transaction-log that can be restored to the transaction-log mark. Also note that the RestoredDateTime column is populated with the UTC time, not the actual system time.

LogShippingHistory-RestoredDatabasesv2

One final point to take away when restoring databases is the way in which the restore job handles full backups (BackupType ‘Db’). If the production environment creates a full backup – either because the Backup job has crossed its Frequency window, or the BizTalk Administrator has forced a full backup – the Log Shipping Restore job will only restore the transaction log created during the full backup – it skips the data-file backup as this isn’t required to maintain database integrity. As demonstrated in the screenshot below, the transaction logs before and after the full backup have been restored (highlighted in Green), however the data-file backup hasn’t been restored (highlighted in Purple); finally, the N-1 transaction log set is yet to be restored (highlighted in Orange):

LogShippingHistory-RestoredDatabasesFullBackupNotRestored

Cleaning the Standby Environment

You may want to clean the standby environment without performing a full reconfiguration of BizTalk Log Shipping, so that the restore process starts from scratch: the databases are re-created and restoration starts as if it is the first time it has run.

To achieve this, execute the master.dbo.bts_LogShippingClean stored procedure on the standby environment. This stored procedure deletes all references to the previously restored backup-sets and physically deletes the databases that are in the ‘restoring’ state. The bts_LogShippingClean stored proc accepts a single parameter @SourceId; the value you should use for this  parameter is based on your log shipping configuration which is available in the master.dbo.bts_LogShippingDatabases table, an example of which is shown below:

CleanupLogShipping-SourceId

These two actions force the BTS Log Shipping – Restore Databases job to re-create the databases in a restoring state by applying the last full backup, followed by subsequent transaction log backups the next time it runs. This is demonstrated in the screenshot below – the data-file and transaction-log backups are restored (highlighted in Green), with the N-1 transaction log set is yet to be restored (highlighted in Orange); also notice how the full- and transaction log backups are restored sequentially at the same time, within a few seconds of each other (see the RestoredDateTime column):

LogShippingHistory-LogShippingCleanFullBackupRestored

Handling Partial Backup-Sets

One of the most common errors I encountered while putting together this series of blog posts related to partially restored backups-sets. The SQL Agent BTS Log Shipping – Restore Databases job automagically detects this condition and raises an error in the Event Log similar to the following:

Executed as user: BIZTALKDRSqlAgent. A partial set has been detected but a valid full backup set does not exist yet.  Please force a full backup on the source server. [SQLSTATE 42000] (Error 50000)  @LastSetId: 487 [SQLSTATE 01000] (Error 0)  @RestoredToMark: 0 [SQLSTATE 01000] (Error 0)  Here101 [SQLSTATE 01000] (Error 0)  @NextSetId: 488 [SQLSTATE 01000] (Error 0)  Here102 [SQLSTATE 01000] (Error 0)   [SQLSTATE 01000] (Error 0)  Statement2: 1 [SQLSTATE 01000] (Error 0)  Statement3: 1 [SQLSTATE 01000] (Error 0).  The step failed.

The restore job identifies this error condition by simply checking for a backup set that has both restored and non-restored entries, as shown in the screenshot below (highlighted in Orange):

LogShippingHistory-PartialRestore

When a partially restored backup-set is detected, the restore job will attempt to identify and restore a more recent full data- and transaction-log backup, thereby minimizing the synchronization gap between the production and standby environments. If a more recent full backup cannot be found, the job will fail reporting the error detailed above and either wait until a new full backup is created as the backup job passes its Frequency window, or when the BizTalk Administration forces a full backup.

Wrap-Up

So that’s the end of Part 2. We’ve walked through configuring the backup and restore jobs and looked at how to check that these jobs are working correctly, how to force full backups and problems caused by partial restores.

In Part 3, we’ll demonstrate a long-running BizTalk application that is restored and brought operational on a standby environment using the instructions detailed here.

Postscript: Creating a Sandbox Environment to Test Log Shipping

Before jumping into configuring Log Shipping on a production environment, I would suggest that you become familiar with how the process itself works, either in a test or sandboxed environment. In writing this series of blog posts, I created a small VirtualBox based virtual-machine sandbox with which I could test the various scenario’s I planned on writing about. This sandbox consisted of:

  • A ‘production’ server hosting BizTalk and the BizTalk databases in an operational state;
  • A ‘standby’ server hosting BizTalk and the restored BizTalk databases;
  • A standalone server acting as an Active Directory Domain Controller.

All three servers communicate through a private VirtualBox network and cannot be seen outside of the physical host machine. The ‘production’ and ‘standby’ servers run Windows Server 2008, plus BizTalk Server 2009 and SQL Server 2008 (all Enterprise Edition), running with 512Mb RAM; the AD server runs Windows Server 2008 Core, running with 256Mb RAM. All three virtual-machines run extremely well given their small amount of RAM, however they need to be located on a separate HDD to the physical machine’s OS otherwise heavy I/O thrashing causes a significant degradation in performance.

Reblog this post [with Zemanta]

BizTalk DR: Configuring BizTalk Backup for Disaster Recovery – Part 1

Originally posted by Nick Heppleston at: http://www.modhul.com/2009/06/29/configuring-biztalk-for-disaster-recovery-part-1/

I’m often surprised by the lack of disaster recovery capability organisations have for BizTalk, the novel (read: ‘unsupported’) approaches taken to provide a theoretical level of cover that is then never tested, or the lengthy procedures (24 hours +) for a task that should be as simple as pressing ‘the big read panic button’ when things go horribly wrong.

In this three-part series on the BizTalk Backup and Restore process, I’m going to look at one of the two supported methods to backup and restore a BizTalk Server environment – Log Shipping. The series will cover the following topics:

  • In Part 1, I’ll provide a high-level overview of the backup and restore process.
  • In Part 2, I’ll walk through an actual backup and restore setup, looking at the configuration steps as we go.
  • In Part 3, I’ll demonstrate a long-running BizTalk application that is restored on a sample disaster recovery environment using the backup and restore process configured in Part 2.

BizTalk Backup and Restore – How Distributed Transactions Complicate Things

Before we look at how the BizTalk backup and restore process works, lets take a step back and look at why BizTalk’s use of distributed transactions mean we must use the supplied backup and restore jobs.

Distributed Transactions and Transactional-Log Marking

BizTalk uses distributed transactions to maintain consistency of its actions across several databases. To ensure that these distributed transactions are consistent across all databases after a restore, the BizTalk Backup Job uses Transactional-Log Marking to mark the transaction-logs of all BizTalk databases participating in the backup.

The marking transaction occurs when the BizTalk Backup Job runs: the job blocks new distributed transactions from starting, waits for currently running distributed transactions to complete, and then executes to place the marking transaction; once complete, distributed-transactions waiting for the marking process to finish can continue processing.

If we didn’t use transactional-log marking, the following scenario may occur: we have two databases which are being written to as part of a distributed transaction: if a transaction-log backup is taken (without using transaction log marking) from database-one before the transaction-log backup is taken on database-two, when we restore the two databases the distributed transaction is in an inconsistent state – committed on database-one, but there is no evidence of it on database-two.

The important point to consider here is that because of the distributed transactions, the databases need to be restored in the same transactionally consistent state. If we did not use the BizTalk Backup Jobs and the transaction-log marking feature, we cannot acheive this consistency.

Why is SQL Server Mirroring not Supported?

So why can’t we just use SQL Server Mirroring to transfer the transaction-logs from the live to disaster-recovery environments? Unfortunately, SQL Server Mirroring does not support distributed transactions: after a failover event, the new principal server contacts the MSDTC service, however the service has no knowledge of the new principal server. As a result, the MSDTC service unexpectedly terminates any transactions that are “preparing to commit”, while other transactions may already have comitted.

If you’re tempted to use mirroring to backup your BizTalk databases, don’t! Mirroring is not supported by Microsoft as a backup strategy. As old-fashioned as it is, Log Shipping is fully supported in all versions and all editions of SQL Server, the log shipping jobs are relatively easy to configure and provides all of the distributed transaction/transactional-log marking functionality required to ensure a consistent restore in a disaster recovery situation.

Further information on mirroring and distributed transactions can be found on MSDN: Database Mirroring and Cross-Database Transactions.

The Backup and Restore Process

The backup and restore process – otherwise known as ‘Log Shipping’ in MSDN documentation – is relatively easy; we’ll cover the various steps in more detail below, however to give you an overview, the process is as follows:

  • The BizTalk Backup SQL Agent Job creates data- and transaction-log backups from the production environment, placing log marks into the transaction-log backups. An audit table is maintained which lists the backups that have been created, the type of backup (data or log backup) and details the location of the backup files.
  • The BizTalk disaster recovery environment reads the audit table maintained by the production environment, using the information to determine which data- and transaction-log files need to be restored. These files are restored using the SQL ‘NORECOVERY’ option, leaving the databases in a non-operational state, allowing additional transaction-logs to be applied. Note: the transaction-logs are not restored to the transaction-log mark in this step, instead the full transaction-log is restored.

These two steps continue throughout live operations. In the event of a disaster recovery situation, where the production environment has been lost and the standby environment needs to be brought online, three recovery steps must be performed on the standby environment:

  • The most recent transaction-log backup is restored to its transaction-log mark using the SQL ‘WITH RECOVERY’ option, placing the databases into an operationally ready state.
  • Database and registry updates statements are run to re-configure the restored databases to use the standby environment.
  • BizTalk services are brought online (including the Enterprise Single Sing-On server and Business Rules Engine service etc.) and the standby environment is made live.

Once the live environment is recovered and the switch from the standby environment back to the live environment is ready to be made, the process is reversed, with the standby environment creating backups that are restored to the live environment.

The Backup Process

BizTalk Backup Process Overview

We’re now ready to take a deeper look into how the backup and restore process works from a high-level, as shown in the diagram above. In Part II we will look at configuring the backup and restore process itself.

Before we continue however its worth taking a moment to look at a fundamental concept within the backup process: Backup-Sets.

Backup Sets

A backup-set is a group of database backups that operate as a logical whole. During the backup process, all of the databases within the backup-set must be successfully backed-up, or the backup-set will be considered incomplete; likewise, when restoring, all of the databases within the backup-set must be successfully restored, or the restoration will be considered incomplete (although there are steps within the restore jobs that handle incomplete restores automatically). This concept ensures that all of the BizTalk databases are backed-up and restored to the same point, ensuring consistency in a disaster-recovery situation.

At a minimum, a backup-set must contain the BizTalk databases listed below, however it can also contain any custom application databases that need the same transactional consistency as the BizTalk databases:

  • Management database – BizTalkMgmtDb
  • MessageBox database – BizTalkMsgBoxDb
  • Tracking database – BizTalkDTADb
  • Rule-Engine database –  BizTalkRuleEngineDb
  • Enterprise Single Sign-On database – SSODb

The Backup BizTalk Server Job

Out of the box, a default BizTalk installation will create a SQL Agent job called Backup BizTalk Server. This job creates either full- or transactional-log backup-sets of the BizTalk databases ready for restoring in a disaster situation; it is also responsible for writing the transactional-log mark discussed earlier. The job will run by default every 15 minutes, however this can be configured to a time-interval that suits your environment.

The job also maintains a backup history which is read by the standby environment to determine what backups are available for restoration; the history also provides auditing capabilities that allow the backup job to recover when only partial backup-sets are created.

The Backup BizTalk Server job performs three steps:

1. Backup Full

Performs a full backup of the BizTalk databases (BizTalkMgmtDb, BizTalkMsgBoxDb, BizTalkDTADb etc.) and writes to the backup history tables once a backup-set has been created.

A full backup is performed if the ForFullBackup flag has been set, or if the job executes in the scheduled window for a full-backup. The scheduled window is determined by the value passed to the Frequency parameter in the Job Step properties – values can either be ‘H’ (hourly) or ‘D’ (daily – 12 midnight), the job default is daily full-backups.

The .BAK files generated by this step are written to the location specified by the Location of Backup Files parameter in the Job Step properties. This should be a highly-available UNC Share that is accessible by the DR environment in a disaster situation.

2. Mark and Backup Log

Performs a transactional-log mark and subsequent backup of the BizTalk database transaction-logs, writing to the backup history tables once a backup-set has been created. Before the log mark is applied, new distributed transactions are blocked, while in-process distributed transactions are allowed to complete. Once the log-mark has been applied, the block on new distributed-transactions is released.

The transactional-log .BAK backup files generated by this step are written to the location specified by the Location of Backup Files parameter in the Job Step properties. Again, this needs to be a highly-available UNC Share that is accessible by the standby environment in a disaster situation.

3. Clear Backup History

The final step deletes the backup-set histories from the backup history tables based on the value specified by the Days to Keep parameter in the Job Step properties; the default is 14 days.

With the Backup job configured and running, we can focus our attention on the restoration of the data and transaction-log files in our standby environment.

The Restore Process

BizTalk Backup - Restore Process Overview

The BizTalk Restore jobs – otherwise know as ‘Log Shipping’ jobs on MSDN – operate on the standby environment and restore the data- and transaction-log files created in the steps detailed above. Transaction logs are continually recovered using the NORECOVERY option which leaves the databases in the ‘Restoring’ state, allowing additional transaction-logs to be applied. In the event of a disaster situation, where the standby environment needs to be brought online as the replacement for the live environment, the transaction-logs are restored to the transaction-log mark using the WITH RECOVERY option, placing the database into an operationally ready state. The various BizTalk services then need to be brought online.

Configuration of the Log Shipping Jobs

Before the log shipping jobs can be started, they must first be configured and prepared for first use – this is a one-off task when configuring log shipping. The log shipping jobs, supporting tables & stored procedures are created by executing two SQL scripts in the master database. The standby environment is finally configured, which creates the necessary connections between the two environments, along with the log shipping SQL Agent jobs themselves. This connection between the two systems allows the standby environment to obtain information about the backups that have been performed (Step 1 in the diagram above) – and therefore need to be restored (Step 2.) – and the location of the data- and transaction-log backup files themselves.
 

The BizTalk Log Shipping Jobs

Following configuration of the standby environment, three new SQL Agent jobs will be created to perform the daily recovery of data- and transactional-log backup files; in the event of the standby environment being made live, they also restore the databases to the last known transactional-log mark.

A restore history is maintained by the restore jobs (which is separate to the backup history); this restore history provides auditing capabilities that allow the jobs to recover when only partial backups are successfully restored.

The following three jobs are created to provide log shipping capabilities of the backup files:

1. The ‘BTS Log Shipping – Get Backup History’ Job

This job queries the live SQL Server history tables to determine which backups have been performed and identify the location of the respective data and transaction-log backup files, ready for restoration in the following ‘Restore Databases’ job. This job executes every minute.

2. The ‘BTS Log Shipping – Restore Databases’ Job

This job restores the backup data- and log-files to the standby SQL Server in a non-operational state, allowing further transaction-logs to be restored. The job executes every minute and will continue to execute while there are data- or log-files to restore (so if you see this job running continually, its possible that you have a backlog of files that are being restored).

The job identifies either the next transaction-log or full backup-set to to apply, and restores the necessary files with the ‘NORECOVERY’ option, leaving each database in the ‘Restoring…’ state; this allows additional transaction-logs to be applied. Data-files take priority over transaction-log files in the restore order, which ensures that daily full backups are restored before further transaction-log backups. Once a data- or transaction-log backup has been restored, the log shipping history table in the standby environment is updated to record the status of the restore operation.

Even though this job executes every minute, the restore will only take place for transaction-logs where there are two or more transaction-log backup-sets to restore; this is to ensure that we always have one transaction-log that can be restored to the transaction-log mark (see the ‘Restore to Mark’ job below). In practice, this means that the restore of the current transaction-log will not happen until the next transaction-log is received, a delay that is determined by the frequency of the Backup BizTalk Server job, which is by default scheduled to execute once every 15 minutes.

In the event of a disaster recovery situation, where the production environment has been lost and the standby environment needs to be brought online, we execute the third log shipping job, which is disabled by default:

3. The ‘BTS Log Shipping – Restore to Mark’ Job

This job restores all of the databases to an operational state, by restoring the next available transaction-log to their last log mark; finally, the job creates SQL Agent jobs. This job has no schedule and is intended to be run by the DBA when the standby SQL Server is to be brought online in a disaster recovery event.

The stored procedure called by this job restores the final transaction log using the ‘WITH RECOVERY’ option, placing the database into an operationally ready state. Furthermore, the restore operation ensures that the transaction-log is restored to the last log mark, ensuring that all databases are restored to exactly the same point.

Finally, the stored procedure drops and re-creates the SQL Agent jobs on the standby SQL Server that were present on the live SQL Server; this is to ensure that any changes that have been made to the default configuration of these jobs is migrated to the standby environment before it is brought online.

Note: This job is disabled by default and only executed when the standby environment is to be brought online.

Once the Restore to Mark job has been executed, two further scripts need to be executed so the restored BizTalk databases points to the standby SQL Server instance and not to the old live SQL Server instance:

  • The first script updates the SQL Server instance name in the BizTalk Management, Rule Engine, HWS and  BAM databases and SQL Agent jobs.
  • The second script updates the SQL Server instance name in the registry for the BizTalk Management, Rule Engine, HWS and EDI databases.

Finally, the BizTalk Host Instances, Rule Engine Update, Enterprise Single Sign-On and WMI services need to be started. Once fully started, the standby environment is considered live and operational.

Wrap-Up

So that’s the end of Part 1. We’ve looked at how the BizTalk Backup and Restore process works at a high-level – how to backup, restore and bring our standby environment online in a disaster event.

In Part 2, we’ll walk through configuring the backup and restore jobs; in Part 3, we’ll demonstrate a long-running BizTalk application that is restored on a standby environment using the backup and restore process configured in Part 2.

BizTalk DR: Configuring BizTalk Backup for Disaster Recovery – Part 2

Reblog this post [with Zemanta]

Using Context.Write to *Update* a Context Property Value

Originally posted by Nick Heppleston at: http://www.modhul.com/2009/03/05/using-contextwrite-to-update-a-context-property-value/

This one has been done to death, but I am writing a moniker replacement pipeline component and noticed some interesting behaviour when looking to update Context Properties.

If you need to update an existing Context Property value, there is no Update() method defined in the IBaseMessageContext interface. However it is possible to write your new value which will perform an update. So use the following code to update the value in an existing Context Property:

private void WriteContextPropertyValue(IBaseMessage msg, string propName, string propNamespace, string propValue)
{
    message.Context.Write(propertyName, propertyNamespace, propertyValue);
}

Using the above code in an NUnit test, we can see that our original SOAP moniker is updated with the correct HTTP moniker with little more than a Context Property write.

contextpropertyupdateoutput

FTP Adapter Context Property Oddities

Originally posted by Nick Heppleston at: http://www.modhul.com/2009/03/04/ftp-adapter-context-property-oddities/

Interesting to see that the FTP adapter doesn’t capture the full Uri of the file in its ReceivedFileName context property – instead it simply gives us the filename:

ftp-context-properties2

Compare this with the FILE adapter where the full name (incl. the path) of the file is provided:

file-context-properties

So, if you need to capture say the folder structure of the FTP server, you need to do a few lines of magic to extract the folder name from the InboundTransportLocation property (shown in green in the first screenshot) and munge it with the ReceivedFileName.

Reblog this post [with Zemanta]

Gotcha when Exporting Adapter Config in Binding Files

Originally posted by Nick Heppleston at: http://www.modhul.com/2009/02/17/gotcha-when-exporting-adapter-config-in-binding-files/

This one caught me out today – hope it might help others in the future…

If you export Bindings to get the lovely syntax for the ReceiveLocationTransportTypeData element, make sure that all of the properties in the adapter configuration dialog that you are interested in are populated *before* you do the export. Otherwise, it would appear not all properties come out in the Binding File Xml.

In the example below, I have mistakenly not set the username and password properties for the FTP adapter and they have not appeared in the ReceiveLocationTransportTypeData element when I export the bindings:

<ReceiveLocationTransportTypeData>&lt;CustomProps&gt;&lt;AdapterConfig vt="8"&gt;&amp;lt;Config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"&amp;gt;&amp;lt;serverAddress&amp;gt;test-ftp-server&amp;lt;/serverAddress&amp;gt;&amp;lt;serverPort&amp;gt;21&amp;lt;/serverPort&amp;gt;&amp;lt;representationType&amp;gt;binary&amp;lt;/representationType&amp;gt;&amp;lt;maximumBatchSize&amp;gt;0&amp;lt;/maximumBatchSize&amp;gt;&amp;lt;maximumNumberOfFiles&amp;gt;0&amp;lt;/maximumNumberOfFiles&amp;gt;&amp;lt;passiveMode&amp;gt;False&amp;lt;/passiveMode&amp;gt;&amp;lt;firewallType&amp;gt;NoFirewall&amp;lt;/firewallType&amp;gt;&amp;lt;firewallPort&amp;gt;21&amp;lt;/firewallPort&amp;gt;&amp;lt;pollingUnitOfMeasure&amp;gt;Seconds&amp;lt;/pollingUnitOfMeasure&amp;gt;&amp;lt;pollingInterval&amp;gt;60&amp;lt;/pollingInterval&amp;gt;&amp;lt;errorThreshold&amp;gt;10&amp;lt;/errorThreshold&amp;gt;&amp;lt;maxFileSize&amp;gt;100&amp;lt;/maxFileSize&amp;gt;&amp;lt;uri&amp;gt;ftp://test-ftp-server:21&amp;lt;/uri&amp;gt;&amp;lt;/Config&amp;gt;&lt;/AdapterConfig&gt;&lt;/CustomProps&gt;</ReceiveLocationTransportTypeData>

Compare this to the ReceiveLocationTransportTypeData element after setting these properties and exporting the bindings – the userName and password properties are now included (yes, I know its like reading spaghetti, but you get the idea!):

<ReceiveLocationTransportTypeData>&lt;CustomProps&gt;&lt;AdapterConfig vt="8"&gt;&amp;lt;Config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"&amp;gt;&amp;lt;uri&amp;gt;ftp://test-ftp-server:21&amp;lt;/uri&amp;gt;&amp;lt;serverAddress&amp;gt;test-ftp-server&amp;lt;/serverAddress&amp;gt;&amp;lt;serverPort&amp;gt;21&amp;lt;/serverPort&amp;gt;&amp;lt;userName&amp;gt;user&amp;lt;/userName&amp;gt;&amp;lt;password&amp;gt;******&amp;lt;/password&amp;gt;&amp;lt;representationType&amp;gt;binary&amp;lt;/representationType&amp;gt;&amp;lt;maximumBatchSize&amp;gt;0&amp;lt;/maximumBatchSize&amp;gt;&amp;lt;maximumNumberOfFiles&amp;gt;0&amp;lt;/maximumNumberOfFiles&amp;gt;&amp;lt;passiveMode&amp;gt;False&amp;lt;/passiveMode&amp;gt;&amp;lt;firewallType&amp;gt;NoFirewall&amp;lt;/firewallType&amp;gt;&amp;lt;firewallPort&amp;gt;21&amp;lt;/firewallPort&amp;gt;&amp;lt;pollingUnitOfMeasure&amp;gt;Seconds&amp;lt;/pollingUnitOfMeasure&amp;gt;&amp;lt;pollingInterval&amp;gt;60&amp;lt;/pollingInterval&amp;gt;&amp;lt;errorThreshold&amp;gt;10&amp;lt;/errorThreshold&amp;gt;&amp;lt;maxFileSize&amp;gt;100&amp;lt;/maxFileSize&amp;gt;&amp;lt;/Config&amp;gt;&lt;/AdapterConfig&gt;&lt;/CustomProps&gt;</ReceiveLocationTransportTypeData>

I can understand why the adapter config only exports those properties that are used – why clutter the config with empty values, but a gotcha to watch-out for nonetheless.

Writing Unit Tests for Pipeline Components with NCover

I discovered NCover over the weekend and wow, am I a fan!

NCover is a code coverage tool which, to quote the website, “helps you test intelligently by revealing which tests haven’t been written yet.” In laymans terms, NCover graphically shows you which lines of code were not touched during the execution of your unit tests, allowing you to create tests accordingly to achieve 100% code coverage, as shown below:

Pipeline Component Testing

Armed with Tomas Restrepo’s excellent Pipeline Testing library, we can now comprehensively test our Pipeline Components:

1. Develop units tests to test your pipeline and ensure that the tests execute and are successful;

2. Invoke your testing framework from the command line to check that your tests will run correctly in the NCover environment (I’m using NUnit here, but you could also invoke VSTS):

"C:Program FilesNUnit 2.4.8binnunit-console.exe" BizTalkMessageArchivingComponent.Tests.dll

Which should produce the following output on the command line:

3. With our unit tests successfully executing, we can wrap the NUnit invocation in NCover loveliness which will inspect our code coverage while those tests executed:

ncover.console "C:Program FilesNUnit 2.4.8binnunit-console.exe" BizTalkMessageArchivingComponent.Tests.dll //ea Modhul.BizTalk.Pipelines.ArchiveMessages.Attributes.NCoverExcludeCoverage //et "Winterdom.*;BizTalkMessageArchivingComponent.Tests.*"

Producing command-line output similar to the following (notice the NUnit tests running between the ***Program Output*** and ***End Program Output*** text):

NCover has now determined the code coverage based of your unit tests and produces a Coverage.Xml file. This file contains information relating to the code coverage and can be loaded in the NCover.Explorer tool to produce a VS like environment that displays lines of code that were touched and un-touched by your unit tests.

4. Load the NCover.Explorer tool and open the Coverage.Xml file generated above, you will be presented with a screen detailing your code coverage. In the example below, you can see that the PerformImmediateCopy and StreamOnReadEvent methods do not have full code coverage – both have code which was not executed in our unit tests:

Clicking on one of the methods in the tree view loads the offending method, displaying the lines which were not executed by our tests in red; lines that were executed are displayed in blue:

Based on this information, we can now create tests to cater for these exceptions, ensuring we have 100% code coverage.

NCover is an excellent tool and although it isn’t free, I personally think its a must-have for any developers tool-kit.

BizTalk Assembly Redirection

Over on the BizTalk Gurus Forums, Richard Wallace asked for opinions on best practice for central business logic that would span over a number of BizTalk applications. One of his suggestions was to use .Net assemblies that contained this business logic, however he was concerned about the re-build / re-deploy tax that this would introduce into his BizTalk projects if he needed to change his business logic assemblies.

In my reply I suggested that he looked at assembly redirection, which allows developers to redirect one version of an assembly to another, through changes to an executable’s .config file – in our case, btsntsvc.exe.config. Having never done this in anger in BizTalk, I though this would be a good opportunity for a blog post – here are my findings.

Configuration Changes

To enable assembly redirection, the following lines need to be added to the runtime/assemblyBinding element in the .config file

   1:  <runtime>
   2:      <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
   3:          <dependentAssembly>
   4:              <assemblyIdentity name="BizTalkAssemblyRedirectionHelper" publicKeyToken="5cce5ee5c1d7dc25" culture="neutral" />
   5:              <bindingRedirect oldVersion="1.0.0.0" newVersion="2.0.0.0" />
   6:          </dependentAssembly>
   7:      </assemblyBinding>
   8:  </runtime>

On line 4, we identify the assembly we wish to redirect; on line 5 we identify the new version. If we had wanted, we could specify a range of ‘old’ assemblies that are to be redirected, as follows:

   1:  <bindingRedirect oldVersion="1.0.0.0-1.2.0.0" newVersion="2.0.0.0"/>

Restart the Host Instance/s for the redirect to take effect. If the new version of the assembly cannot be found in the GAC, you will only see an error when the CLR attempts to load the assembly at runtime, not when the Host Instance restarts, the error is a usual favourite:

Error details: Object reference not set to an instance of an object.

Testing Redirection: Orchestrations & Maps

To test the redirection, I created an orchestration and map that invoked simple helper classes contained within a separate assembly. The helper classes expose a single method which takes two strings as input parameters and returns a concatenation of the these two parameters – an example of the version 2.0 code can be seen below. Versions 1.0.0.0 and 2.0.0.0 were deployed to the GAC.

   1:  using System;
   2:  using System.Diagnostics;
   3:  
   4:  namespace BizTalkAssemblyRedirectionHelper
   5:  {
   6:      [Serializable]
   7:      public static class RedirectionHelper
   8:      {
   9:          public static string ConcatHelper(string a, string b)
  10:          {
  11:              Trace.WriteLine("RedirectionHelper - Orchestration, version 2.0.0.0");
  12:              return (a + b);
  13:          }
  14:      }
  15:  
  16:      public class RedirectionMapHelper
  17:      {
  18:          public string ConcatMapHelper(string a, string b)
  19:          {
  20:              Trace.WriteLine("RedirectionHelper - Map, version 2.0.0.0");
  21:              return (a + b);
  22:          }
  23:      }
  24:  }

Four messages were submitted into BizTalk; in-between the second and third messages, the assembly redirection code was enabled to point to version 2 of the assembly and the Host Instance restarted. The results of the test can be seen in the DebugView screenshot below – once the redirection config change is applied, both the Map and Orchestration start to use the version 2 of the assembly:

Assembly Redirection - DebugView Output

Conclusions

A few conclusions can be drawn from this test:

  • Assembly redirection can be used within BizTalk for helper classes in both Orchestrations and Maps (on ports & in orchestrations). There are other cases I can think of, such as pipeline components, for example.
  • The signatures of these helper methods must remain the same between releases.
  • There is a possible issue with build and revision numbers as highlighted by Richard Hallgren in BizTalk assembly version redirection.
Enhanced by Zemanta

BizTalk 2006 (incl. R2) Management Pack for Operations Manager 2007 Released

Looks like the BizTalk 2006 & BizTalk 2006 R2 Management Pack for Operations Manager 2007 has been released and is available in the Management Pack Catalog.

To quote the catalogue entry, the management pack:

…is a entirely new … providing comprehensive discovery and monitoring of BizTalk Server components and applications. In addition to general support for BizTalk Server 2006, BizTalk Server 2006 R2, this management pack provides coverage for new BizTalk Server 2006 R2 features, such as the native EDI runtime and RFID.

I think its great that we now have native EDI and RFID monitoring ‘out of the box’ – no more need to cludge something together to monitor these two aspects of the product. I’d personally like to see this MP backported to MOM 2005, but I doubt that will happen!

Truncating the BizTalk 2006 Tracking Database

In Truncating the BizTalk 2004 Tracking  Database I discussed how to truncate the tracking database in BizTalk 2004. Over on the BizTalk Gurus forums, user Nick Busy wanted to do the same thing for BizTalk Server 2006 – he’s kindly allowed me to repost his instructions for the community on this blog:

0. Before start, ensure you have got the database admin priveleges on the database

1. Stop all BizTalk Server Host Instances

2. Full backup BizTalkDTADb database (just in case)

3. Make scripts to create views (MANDATORY)

dbo.dtav_ServiceFacts
dbo.dtav_MessageFacts
dbo.dtav_FindMessageFacts

4. Run SQL script:

use BizTalkDTADb
GO

— Drop the Views (before you perform this, ensure you take copies of these views!)
— unfortunately, it’s necessary for SQL 2000, but you can skip it for SQL 2005
Drop View dbo.dtav_ServiceFacts
Drop View dbo.dtav_MessageFacts
Drop View dbo.dtav_FindMessageFacts
Go

— Truncate the necessary Tables
Truncate Table dta_CallChain
Truncate Table dta_DebugTrace
Truncate Table dta_MessageInOutEvents

Truncate Table dta_ServiceInstanceExceptions
Truncate Table dta_ServiceInstances

Truncate Table Tracking_Fragments1
Truncate Table Tracking_Parts1
Truncate Table Tracking_Spool1

Truncate Table dta_MessageFieldValues

— end of the script

5. Update statistics on BizTalkDTADb database

— update statistics
exec sp_updatestats

6. Run the saved scripts (see step 3) to recreate the dropped views from your own environment.

7. Shrink BizTalkDTADb database (sometimes it doesn’t work from GUI, so using sql command will help)

— shrink database
dbcc shrinkdatabase (BizTalkDTADb, 10)

8. Start BizTalk Server Host Instances

9. Configure and enable SQL Agent job “DTA Purge and Archive” (to avoid over-growing the database in the future)

P.S. The script above does not truncate Rule Engine related tables.

Thanks Nick, much appreciated.