Yet Another Message Splitter Pattern

I have just finished working on a small customer project that (at first sight) appeared trivial, but once I had put together a prototype it turned out to be anything but!

The client uses BizTalk Server 2004 to accept sales orders from a number of customers in several different formats. One of their larger customers sends messages in the xCBL format and provides a mechanism to accept ‘Smart Form’ data – information over and above that usually contained in a sales order, such as name badge, rubber stamp and business card details (layout, typography etc.)*

The required functionality was as follows:

  • Accept orders in the custom format (in this case, xCBL) and translate into an intermediate format (this was already completed);
  • Identify messages from those customers that could send Smart Form data;
  • Determine which lines (if any) on the order contained Smart Form product codes through a database lookup;
  • Send one e-mail per order containing the additional Smart Form detail to the customer services team for manual processing.

All of the message splitter patterns I have come across so far deal with the generation of new messages as each line is parsed (i.e. a 10 line order would produce 10 messages etc.). However, for this problem, I needed to update the state of each message (depending on whether it was a Smart Form product) and output a single e-mail/message containing only the affected lines at the end of the process.

(Yet Another) Message Splitter Pattern

The final orchestration is described in detail below (a large-ish screenshot is available) and is quite elegant (at least I think so!). It utilises a ‘global’ Xml Document variable to maintain the state of the order as each line is parsed, with that state being updated at the end of each line pass.

Before we get into the detail, let me provide a top level overview: a message is received containing all of the order lines. For each line we initiate a database call to determine whether the line contains a Smart Form product. If it does, an attribute on the line (the output attribute) is flagged as true (all lines start out flagged as false). After all lines have been parsed, we have a message that can be pushed through a final map that only outputs those lines flagged as output = true. This achieves our goal of only sending one message per order, rather than one message per Smart Form line.

Let’s get into the detail (you may wish to refer to the screenshot mentioned above)…

  • We start by receiving a message from the MsgBox in the ‘SmartForm’ message format – a stripped down format of the intermediate order message that just contains the data we will be interested in for Smart Form processing;
  • We create a new instance of our global Xml Document (nothing more than SmartFormXmlDocument = new XmlDocument()) for later processing;

We then iterate over each line:

  • After incrementing the line count, we build our Sql Adapter message to query the database and determine whether the product code on the current line is a Smart Form product (it actually passes across a customer identifier to allow this orchestration to be used in more than one customer scenario in the future).
  • With the response message from the Sql Adapter, we can make a decision as to whether the line is a Smart Form product.

We’re now at the core of the functionality of this pattern as the next few steps update the line in our message and maintain the state in the global Xml Document (these steps only occur if the line contains a Smart Form product):

  • We start by creating a ‘working’ instance of the Smart Form message into which we assign the current contents of the global Xml Document: WorkingSmartFormMsg = SmartFormXmlDocument.
  • With a working copy of the message (which contains the current state of the Xml Document), we perform some quick XPath to update the output attribute of the current line from false to true (which will allow us later to identify which lines are to be output in the final message).
  • Once we’ve made the update, we assign the message changes back to the Xml Document to maintain the state for the next iteration: SmartFormXmlDocument = WorkingSmartFormMsg, and throw away the working message. Simple!
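The copy, update and assign-back steps above can be sketched outside BizTalk. This is only a rough Python illustration, not the orchestration itself: the Order/Line/ProductCode element names and the is_smart_form callback (a stand-in for the Sql Adapter lookup) are assumptions made for the example.

```python
import copy
import xml.etree.ElementTree as ET


def flag_smart_form_lines(order_xml, is_smart_form):
    """Return the order with output="true" set on each Smart Form line.

    Element names (Order/Line/ProductCode) are hypothetical; is_smart_form
    stands in for the database lookup made through the Sql Adapter.
    """
    # 'state' plays the role of the global Xml Document.
    state = ET.fromstring(order_xml)
    line_count = len(state.findall("Line"))

    for position in range(1, line_count + 1):
        code = state.find(f"Line[{position}]/ProductCode").text
        if not is_smart_form(code):
            continue
        # Take a working copy of the current state, update the current
        # line, then assign the result back for the next iteration.
        working = copy.deepcopy(state)
        working.find(f"Line[{position}]").set("output", "true")
        state = working

    return state
```

Calling it with a two-line order and a lookup such as lambda code: code == "BADGE-01" leaves the first line flagged false and the second flagged true, assuming BADGE-01 is the product code on the second line.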

Once we have completed our final iteration around each line in the message, we will be left with an Xml Document with updated output attributes on each line.

To build the final output message, we simply run our modified message through a map (XSLT of course) and only output those nodes where output = true. The resultant message contains only those lines where the product code is a Smart Form code, and can be sent to the customer services team by e-mail through the SMTP adapter. If there are no matching product codes, the message is not sent out of the orchestration.
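The real work here is done by the map, but the filtering rule is simple enough to sketch in a few lines of Python. Again, this is an illustration under assumptions: the Order/Line/ProductCode names are made up for the example, and returning None stands in for "nothing is sent out of the orchestration".

```python
import xml.etree.ElementTree as ET


def build_output_message(flagged_xml):
    """Keep only the lines flagged output="true".

    Returns the filtered message as a string, or None when no line is
    flagged (i.e. nothing should be e-mailed). Element names are
    hypothetical.
    """
    doc = ET.fromstring(flagged_xml)

    # findall returns a list, so removing children while looping is safe.
    for line in doc.findall("Line"):
        if line.get("output") != "true":
            doc.remove(line)

    if doc.find("Line") is None:
        return None
    return ET.tostring(doc, encoding="unicode")
```

However many Smart Form lines an order carries, the result is a single message (or none at all), which is the whole point of the pattern.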

Good luck with the pattern. If you would like any help with any of these steps, please drop me a line.

*The old system utilised Python as a pre-processor to BizTalk to identify the lines that contained Smart Form product codes and generated an e-mail to the relevant customer services team, as this information could not be passed through their existing ERP system. However, following the departure of their resident Python developer and a few high-profile failures of the scripts, they wanted to bring the business process onto the BizTalk platform.

Computing Careers RSS Feed

I’m sure they are behind the times, but Computing Careers have just released their job searches as RSS feeds!

I can now get details of the latest permanent/contract BizTalk jobs delivered straight to Bloglines! Note to self: must get certified before the end of the year…

The feed is available here:

As an aside, where do you think is the best place to find your next job (especially if it’s BizTalk-related)?

Getting into the Flow

I came across an entry on Michael Buffington’s blog this evening where he discusses the techniques he uses to ‘get into the flow’. Reading his suggestions (such as a cold office, limiting his number of apps etc.) I started to think about how I go about focusing my mind to accomplish the job at hand.

At work I sit in an open plan office (no cubicles) that is so distracting it hurts (telephone, colleagues and no end of pointless crap to deal with). I like to get into the office by 8am and when I can, I will remain productive until midday. I’m often one of the first into the office and I find I can have an incredibly productive hour before the phone calls etc. begin – in fact I like to use this time to tackle any of my tougher tasks. I tend to spend the afternoon doing tasks that are necessary rather than enjoyable – these tend to be quick, 5 – 10 minute jobs that don’t require much brain power, but have me darting from one system to another, which would break my thought pattern if I tried to mix them with development.

On an evening I have a totally different beast to contend with – the fiancé! I have a really short commute to work (a 15 minute walk) and I find that I’m still in ‘work mode’ when I get home, so I tend to do anything I can other than sit at my laptop. At home I have the loft converted into an office which is quiet and separate to the living space so I can distance myself from home and work life. I tend to find that I am the most productive past 11pm until I’m exhausted and crawl into bed. Unfortunately, this doesn’t go down well with the other half and so I have to find a delicate balance between the ideas factory on my shoulders (that *never* seems to stop) and the fiancé ;-)

So, back at the coalface, a short list of ‘stuff’ that helps me get into the zone:

  • Productivity apps – I use a great tool called StrawpEX (based on the Nullsoft SEX tool) which is a handy note-taker, text jotter thing. It sits in the (Windows) system tray, launching the text area when clicked; saves are automatic, so I just have to click back to the app I was in or the desktop. I find that I can detail ideas, to-do items and update project tasks (read: print from Word and take to review meetings) very easily without the tool getting in my way. It also has a *very* small footprint.
  • Caffeine – I love my coffee (and it has to be decent percolated stuff, not instant) – a good brew can keep me going for a few hours and I never seem to crash. I *try* not to drink cans of Coke, Pepsi etc. at my desk and when I’m not drinking coffee, I will try and have a bottle of chilled water close at hand. One of the commenters on Michael’s blog mentioned green tea – apparently there is a mild caffeine buzz and no detrimental side effects or crashes (plus antioxidants!!) – Avey was right all along….
  • Music (at home) – I really love listening to the BBC radio programmes that are online (my licence fee goes to good use after all). One minute I may be listening to 1Xtra, Radio2, some politics show or comedy – unlike several people I’ve talked to, I never seem to get fazed by the talking on some of the shows. I also recently started to listen to the DAB station Chill; there is no talking, just easy-listening Tai Chi for your ears!
  • Silence (is golden) – Or at least a lack of phone/colleague distraction!

I also like some of the other ideas that Michael mentioned, such as a cold office and limiting the number of apps I’m running (bad, bad Bloglines!! bad, bad Gmail!!); I’ve also just read about a nifty little app called Temptation Blocker that is worth a try (bugger, the link doesn’t work).

I’m going to try and make a more conscious effort and see whether it is possible to increase my productivity with a few changes to my work and lifestyle pattern. Roll on the green tea…

Schema Repository Live

Calling all BizTalk developers! I’ve just opened the Schema Repository with the first (XSD) schema – cXML OrderRequest (v1.2.008).

I’ve created the schema repository because of a lack of working schemas for some of the more common message sets (such as cXML, Basda etc.). You can view the schema repository, or jump straight to the cXML schemas.

Schemas should be compatible with both BizTalk 2004 and 2006.

Happy schema developing! Let me know your feedback.

Host Tracking and the BizTalkMsgBoxDb

I’ve just finished working with a client to resolve a problem with the size of their BizTalkMsgBoxDb database; the database was approximately 7 GB (the TrackingData table contained over 7.5 million rows) and growing. This growth was causing serious disk space issues – both the data file and log files were on the same partition* and as the disk ran out of space the BizTalkMsgBoxDb transaction log failed to grow, resulting in the suspension of over 1000 messages.

We tracked the culprit down to the lack of a running Host Instance with ‘Host Tracking’ enabled. After some digging and help from Microsoft UK Support (thanks Christine!) it became apparent that although the BizTalk environment was tracking data (specifically inbound and outbound message bodies in orchestrations) there were no running Host Instances with ‘Host Tracking’ enabled. As a result, tracking data was not moving from the TrackingData table into the BizTalkDTADb database.

The Tracking Data Delivery Service (TDDS)

BizTalk 2004 Message Tracking and the TDDS process work as follows:

  • Data and events are tracked and stored in the BizTalkMsgBoxDb based on the settings made in HAT (tracking can be enabled for Pipelines, Orchestrations, Policies (Rules) and Messages), even if there is no host running with ‘Host Tracking’ enabled.
  • Tracking Data Delivery Service (TDDS) requests are sent from host instances; these requests cause data to be moved from the TrackingData table into the BizTalkDTADb database for future tracking, reporting etc.
  • Starting a single host instance with ‘Host Tracking’ enabled [appears] to purge the TrackingData table. I plan on doing more testing here, but any MSFT clarification would help my understanding of this process.

Back at the Coalface…

The fix for the client was relatively easy – simply start a host instance with ‘Host Tracking’ enabled and watch the purge happen. Unfortunately, it wasn’t quite that simple: because the SQL Server data and log partition was running extremely short of space, we needed to maintain adequate disk space for the purge to actually take place.

To sidestep this complication, we simply truncated the BizTalkDTADb database – using Mike Holdorf’s excellent blog entry on truncating this database as a starting point** – as disk space began to run low.

Eight hours and 7.5 million rows later, the MsgBox was back to normal. Don’t you just love BizTalk!!


* – It’s best practice to locate data files and log files on separate disks, as isolating the transaction log can allow the disk head to be ready for the next write by not having other files contend for the physical disk resource (for more information see Kimberly L. Tripp’s 8 steps to better transaction log throughput, among others).

** – I’ve attached the actual SQL script used to truncate the BizTalkDTADb database to this entry. WordPress doesn’t seem to want to let me upload a .sql file (understandable), so I will write up this script shortly. All credit for this goes to Mike Holdorf.

Flat-File Schema – Only Returning One Row

A post on the BizTalk General MSDN Newsgroup relating to flat-file schemas recently caught my eye, primarily because I was working on a bitch of a flat-file schema at the time, but also because I’m looking to give something back to the BizTalk community as a whole.

The post went something along the lines of:

I have an XML schema that I am populating from a dataset:

<Schema>
<NewDataSet>
<Table>
<Field1>
<Field2>
<Field3>

I get multiple rows back from the database so that it looks like this:

<NewDataSet>
<Table>
<Field1>aaa</Field1>
<Field2>bbb</Field2>
<Field3>ccc</Field3>
</Table>
<Table>
<Field1>xxx</Field1>
<Field2>yyy</Field2>
<Field3>zzz</Field3>
</Table>
</NewDataSet>

All seems OK, until I try and map this to a flat file schema – then I only get one row in my resulting file. The schema for the flat file looks like:

<Schema>
<Root>
<Field1>
<Field2>
<Field3>

The fix was quick and simple – all the poster needed to do was insert a new element to hold his data, and tell the parser that the element could repeat, as follows:

[Schema]
<Root>      <- Delimited (0x0D 0x0A Hex, Postfix)
<Record>    <- Positional/Delimited, 'Max Occurs' = Unbounded
<Field1 />
<Field2 />
<Field3 />
</Record>
</Root>
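To see why the repeating Record element matters, here is a rough Python sketch of what the flat-file output should look like once every Table row maps to its own record. It is not what the BizTalk assembler does internally; the comma field delimiter is an assumption for the example (the poster’s record could equally be positional), while the CR LF postfix matches the 0x0D 0x0A setting above.

```python
import xml.etree.ElementTree as ET


def to_flat_file(dataset_xml, field_delimiter=","):
    """Emit one delimited record per <Table> row, each ending in CR LF.

    The field delimiter is a hypothetical choice for illustration.
    """
    root = ET.fromstring(dataset_xml)
    records = []
    for table in root.findall("Table"):
        # Each child element of <Table> becomes one field in the record.
        fields = [child.text or "" for child in table]
        records.append(field_delimiter.join(fields) + "\r\n")
    return "".join(records)
```

Fed the two-row NewDataSet from the post, this produces two records (aaa,bbb,ccc and xxx,yyy,zzz), one per line. Without a repeating record in the schema there is only room for the first.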

As a general rule of thumb, it’s always a good idea to test flat-file schemas before attempting to map them (it helps to stop the headaches when things go wrong). I’ve found that the best way is to generate an XML representation of the schema and use that as an input message to test various scenarios (e.g. repeating elements, data content etc.):

1. Right-click the schema and select ‘Properties’;

2. Change the ‘Generate Instance Output Type’ property to ‘XML’ and click apply;

3. Right-click the schema again and select ‘Generate Instance’; this will generate an XML instance of your schema – modify to your heart’s content!

To test the (modified?) XML instance against your schema:

1. Right-click the schema and select ‘Properties’;

2. Change the ‘Validate Instance Input Type’ property to ‘XML’ and click apply;

3. Set the XML instance filename in the ‘Input Instance Filename’ property and click apply;

4. Right-click the schema again and select ‘Validate Instance’; this will validate your XML instance against the schema.

Direct Msg. Box Binding – Filter Expression Oddities

While working with direct Message Box binding I was stumped by the following errors:

error X2186: identifier ‘VALIDATION_COMPLETE’ does not exist in ‘TrackOrderMessage’; are you missing an assembly reference?
error X2007: cannot find symbol ‘VALIDATION_COMPLETE’
error X2163: an ‘activate’ predicate rvalue must be a string, character, boolean or numeric literal
error X2104: illegal ‘activate’ predicate

The resulting fix? Wrap the values in your subscription filters in string literals (i.e. “double quotes”) – as error X2163 indicates, an ‘activate’ predicate rvalue must be a string, character, boolean or numeric literal. Nice.

No support for ‘Well Formed Xml’ and ‘DTD’ in the Generate Schemas wizard ‘out of the box’

While creating a schema using the ‘Generate Schemas’ wizard I noticed that the ‘Well Formed XML’ and ‘DTD’ wizards are not loaded by default. If you try to use one you’ll receive the following error message:

WFX to XSD schema generation module is not installed. Execute C:\Program Files\Microsoft BizTalk Server 2006\SDK\Utilities\Schema Generator\InstallWFX.vbs to install the WFX to XSD schema generation module.

To resolve this issue, simply run the aforementioned VB script; you’ll also need to run the InstallDTD.vbs script to add DTD support.

Support is available when you re-run the wizard – no need to restart VS.