
Pro BizTalk 2006: Not a Book Review


A book from Apress just came out called Pro BizTalk 2006 by George Dunphy and Ahmed Metwally. I don't know them, but I know Marty, who wrote the foreword.

This is not a book review. I don't do that. I don't have the patience to think deeply enough about a book, to reflect on it, and comment on it with the depth a potential reader would expect. Jim does a great job doing that, so I will leave it to him.

But, I do categorize books on a simple continuum, and it goes something like this:

0- what? you killed a tree for this?

1- well thanks for the effort, I guess I wasn't the right audience

2- learned a lot, will keep for reference reasons, and loan it generously

3- my eyes are so wide open now they hurt in full sunlight

4- I will make whoever I can read this, even if it is against their will. I will buy them a copy and ship it to them without telling them.

This book is squarely in category four, if you are a BizTalk developer/architect. I am continually amazed at the quality of books from Apress. Wrox had good books, very timely, but they tended to be more 'beta' in content than solid.

So, if you are into BizTalk, this is a great book, but only breeze through the first chapter or two. If you are trying to get into BizTalk, it is also a great book. If you are looking for romance, I would probably direct you to the open source aisle.

How do you get it? I prefer Barnes & Noble. I don't do the affiliate program or anything; I'm just providing the link as a service.

BizTalk Performance Testing Tips


In a lot of BizTalk Server environments, performance is critical. It is not uncommon to hear from a client that they need to be able to process a specific level of transactions in a certain time window. Unfortunately, that is usually followed by the question: "So, how much hardware do I need?"

There isn't any way to answer that question, because there are too many unknowns. How big are the messages? How complex are the pipelines and maps? What about the orchestrations, if any? What other systems or adapters will be involved?

There are several strategies for finding out how much hardware you need. The first is a 'grow as you can' model. You deploy your system on a good foundation: a solid SQL Server and a single BizTalk server or a pair. Once in production, slowly increase the traffic or the number of consumers of the business process. As limits are reached, add more servers to the BizTalk group. This is a very organic model, and allows you to add only what you need.

This model won't work in some enterprises where budgeting and accounting are more important than the appropriateness of the solution. In these cases, they want a number up front (even before you could fairly SWAG it) and you have to stick with it. To that end, a lot of IT groups overestimate the cost of the project, almost in a negligent manner, and create a giant plan. This either leads to the company spending more than it should (it's always a bad thing to have to go back for a second dip in the money well in these types of organizations), or the project gets canceled for costing too much.

There is another way, and it is sort of a blend. You can prototype some of the processes on some trial hardware, and then extrapolate from there to determine the cost of the project. You will still get estimated figures, but they will be based on results, and not on beer and dreams.

Microsoft has finally made public a document called Managing a Successful Performance Lab, which helps you learn how to run a performance lab test.

I don't want to cover what is clearly laid out in the paper, but I do want to add some of my own thoughts and some high level guidance.

First, make sure that you select a business process that is representative of the work the system will be handling. Build that process out as you would for production, but don't go so far that you end up actually writing the system. It is OK to cut corners; this is a prototype. Just make sure that you involve the adapters and third-party systems you will use in production. Which adapters you use can really affect the system's performance.

Make sure you not only find a good process to test, but also set realistic expectations about the traffic it will need to support. For example, a system might sit idle through most of the day, and then have to process large batch files at night as submissions are sent in from partners. Or, the system might receive small requests throughout the day (web service calls, for example), and the occasional floodgate batch (5-10 a month). So, sit down and think about the traffic shaping for the system.
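One way to make that thinking concrete is to write the expected traffic mix down as data the whole team can review before testing starts. Here is a minimal sketch; the shares, batch sizes, and patterns are hypothetical placeholders, not measurements:

```python
# Hypothetical traffic profile; every number here is an illustrative
# placeholder to be replaced with your own expectations.
TRAFFIC_PROFILE = [
    {"name": "real-time requests", "share": 0.60, "batch_size": 1,     "pattern": "steady all day"},
    {"name": "partner batches",    "share": 0.35, "batch_size": 1000,  "pattern": "nightly"},
    {"name": "floodgate batches",  "share": 0.05, "batch_size": 10000, "pattern": "5-10 a month"},
]

# Sanity check: the shares should account for all expected traffic.
assert abs(sum(t["share"] for t in TRAFFIC_PROFILE) - 1.0) < 1e-9
```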

Then, set up your test environment. You should have at least two servers: one for SQL Server, and one for BTS. If you plan on having dedicated hosts (send, receive, exec), then extra boxes will help you model what your final production physical environment might look like.

Run the BPA! Download and run the BizTalk Best Practices Analyzer. Fix the first thing on the list, and then run it again. Repeat as necessary. This is a fabulous tool, and helps a great deal. Any issue it finds includes a link to specific instructions for fixing it. It will flag some practices that you can't or won't want to follow (false positives). But it will catch a lot of configuration and environmental issues for you, including the MS-DTC trap, which is probably the most common issue asked about on support groups.

Develop a test plan! Boy, I sound like a PM saying that. Plan out what tests you will run, and what they will entail. Develop a way to track results. The key to running good tests is to only ever CHANGE ONE THING AT A TIME. If you change more than one thing, you won't be able to verify what impact each change truly had. Again, only change ONE THING AT A TIME. It will be tempting to cut corners, but if you are going to do that, you might as well not do the performance tests at all, forge the numbers, spend the budget at Best Buy, and call it a day.

The test plan should also include what tasks should be done at the beginning and end of each session, run, and iteration. The steps should be followed ruthlessly. Again, human laziness is your enemy here. Your best bet is to script or automate as much of this as possible. You should also have a printed checklist and a pencil. A team of people will be better for this than one geek in a corner, as well; they can keep each other honest.

The test plan should include sample messages, and the performance counters that will be tracked for each run. You can always add more perf counters based on what you are looking for. The Perf Lab whitepaper can get you going in the right direction, but here are some you should track:

1. Spool depth

2. Throttling levels in the system

3. CPU %

4. Memory %

5. Disk Idle time on the SQL Server %

We usually track about 100 counters in our tests as a baseline. A separate machine should be used to track the counters. After each test, the perf counter log should be saved for later reference. We usually assign a number to each test run, and name the log file with that number. This number is then used in Excel to track the results.
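As a rough illustration of that bookkeeping, here is a small sketch (all paths and names are hypothetical) that stamps each perf log with its run number and appends a row to a CSV that Excel can open later:

```python
# Bookkeeping sketch; the paths below are hypothetical. Copies the counter
# log into a per-run archive and appends a summary row for Excel.
import csv
import shutil
from pathlib import Path

ARCHIVE = Path(r"C:\PerfLab\results")

def archive_run(run_id: int, counter_log: Path, notes: str) -> None:
    ARCHIVE.mkdir(parents=True, exist_ok=True)
    shutil.copy(counter_log, ARCHIVE / f"run_{run_id:03d}.blg")
    with open(ARCHIVE / "runs.csv", "a", newline="") as f:
        csv.writer(f).writerow([run_id, counter_log.name, notes])

archive_run(17, Path(r"C:\PerfLogs\btsperf.blg"), "100% good msgs, 10 tx/batch")
```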

The best way to put a load on your system is to use a tool called LoadGen from Microsoft. It is very configurable and extensible. We usually configure it to drop files in the pickup folder at a certain rate for a specific period of time.
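LoadGen is the right tool for measured runs, but as a rough stand-in, or just to sanity-check that a receive location is picking up files, something like this minimal sketch works. The paths, rate, and duration are hypothetical:

```python
# Crude file-drop load generator; paths, rate, and duration are made up.
import shutil
import time
import uuid
from pathlib import Path

SAMPLE = Path(r"C:\PerfLab\samples\order_10tx.xml")  # assumed sample message
PICKUP = Path(r"C:\PerfLab\pickup")                  # assumed receive folder
RATE_PER_SEC = 5
DURATION_SEC = 60

end = time.time() + DURATION_SEC
while time.time() < end:
    for _ in range(RATE_PER_SEC):
        # Unique names so the receive location never sees a collision.
        shutil.copy(SAMPLE, PICKUP / f"{uuid.uuid4()}.xml")
    time.sleep(1)
```

The pacing here is coarse (a burst of copies, then a one-second sleep), which is fine for a smoke test but not for a benchmark; use LoadGen for the real runs.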

We usually break up the test plan into runs. Each run represents a specific traffic shape. For example, we might start with a batch of 100% good messages (no errors) with 10 transactions per batch. Then each iteration of that run would place progressively more load on the system. Each run should have the same progression; the progressions are usually 1, 10, 20, 50, 100, 250, 500, 1000, etc. The next run would have a different traffic shape. We will usually do several runs that differ only in how many transactions there are per file: start with 10, then 100, then 500, etc.

The traffic shape patterns should become more complex in successive phases of testing. We usually start with simple batches, and then evolve the LoadGen configuration to generate more realistic scenarios with blends of traffic. For example: 20% of traffic steady and in small batches (real-time requests), 50% in regular but spaced-out medium-sized messages, 10% of traffic with significant errors, and the rest as a floodgate scenario. This mix should match the traffic shapes you worked out in your test plan.

Before each test, the various BizTalk databases should be cleaned out. There are scripts that can do this for you. You don't want later runs to be affected by slower inserts because the tracking database has grown very large. You should also reset any other systems that you are hitting. For example, if you are dropping failed batches to a SharePoint site for manual repair, that doclib in SharePoint should be cleaned out after each test. Your goal is to start each test with the same environment so the test results are reliable. With that in mind, you should pre-grow your SQL databases before testing so that the early test runs don't pay SQL's runtime auto-grow tax.
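As one example of scripting the reset, here is a minimal sketch that runs a cleanup script against the MessageBox with sqlcmd. The server name and script path are hypothetical, the cleanup script is whichever one you use, and this should only ever run against a test environment:

```python
# Reset sketch; server name and script path are hypothetical placeholders.
# Only ever run cleanup like this against a test environment.
import subprocess

SQL_SERVER = "PERFSQL01"                                # hypothetical lab SQL Server
CLEANUP_SQL = r"C:\PerfLab\scripts\cleanup_msgbox.sql"  # whatever cleanup script you use

subprocess.run(
    ["sqlcmd", "-S", SQL_SERVER, "-d", "BizTalkMsgBoxDb", "-i", CLEANUP_SQL],
    check=True,
)
```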

Before each test, a simple message should be run through the system to 'prime the pump.' We have found this helps normalize the results, making the numbers for small batches more reliable.

After all of the test runs are completed, you will need to determine a scale factor for the system. This scale factor will be used to estimate what the final production environment might be able to sustain. For example, one factor to account for the real process being twice as complex to execute, and a second factor to account for dual SQL servers and four quad servers in the BTS group.
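The extrapolation itself is simple arithmetic. Here is an illustrative example; every number below is a made-up placeholder showing the shape of the calculation, not a measured value:

```python
# Back-of-the-envelope extrapolation with placeholder numbers. Plug in
# your own lab results and estimated factors.
lab_throughput = 120.0     # msgs/sec sustained on the prototype hardware
complexity_factor = 0.5    # real process assumed roughly 2x as costly to run
scale_out_factor = 3.2     # estimated gain from dual SQL + four quads in the group

projected = lab_throughput * complexity_factor * scale_out_factor
print(f"Projected sustainable throughput: {projected:.0f} msgs/sec")  # 192
```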

Before the test you should become very comfortable with the topic of 'Maximum Sustainable Throughput' for your system. There are several blogs out there on this topic. It is also covered in the Performance Lab whitepaper mentioned above.

In short, MST is how many transactions your system can handle without creating a backlog that can't be recovered from. This is different from how many transactions can be completed per second, because each part of the system will operate at a different speed. Many times, after a perf lab is completed, a second round will be run specifically to find the MST for that system. These tests are usually set up to overdrive different parts of the system to narrow down and define the MST.
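A toy model makes the idea concrete: if messages arrive faster than they complete, the spool grows without bound, and MST is the highest arrival rate at which the backlog stays recoverable. The rates here are made up:

```python
# Toy backlog model with made-up rates. If arrivals outpace completions,
# the backlog (think: spool depth) climbs and never recovers.
def backlog_after(minutes: int, arrivals_per_min: int, completions_per_min: int) -> int:
    backlog = 0
    for _ in range(minutes):
        backlog = max(0, backlog + arrivals_per_min - completions_per_min)
    return backlog

print(backlog_after(60, 900, 1000))   # 0: under MST, the system keeps up
print(backlog_after(60, 1100, 1000))  # 6000: over MST, spool depth climbs
```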

A quick list of things to change between runs:

1- Which server in the group is running which host instances? Breaking send/recv/exec into separate hosts, even on the same box, is proven to improve performance, because each host instance then gets its own pool of threads and memory.

2- Maybe rework the maps, or the intake process on the receive side. Often, if performance is critical, a custom pipeline component will need to be developed.

3- Rework the orchestration to minimize the persistence points.

4- Tune the system in the host setup screens, or in the registry to better suit the majority of your traffic. BizTalk does come out of the box tuned very well for the typical business message. But if you end up processing a high number of tiny messages, or large messages, then you can get more performance by adjusting some of the tuning parameters.

That was a longer post than I expected, and I think I could keep on going. Maybe I will expand further in future postings, maybe with sample deliverables.

Announcing the CodeMash Conference 2007!


It has been almost a year since a bunch of us met at a Japanese restaurant last winter to discuss a new type of event. We had all delivered one-day conferences before, and had success. But we wanted more, and we thought attendees did as well.

We wanted something that spoke to technology a little further up the thought chain. Instead of a whole day on how to do .NET, we wanted sessions on topics that affect all developers. Better architecture. Better development practices. Better guidance.

We also wanted to learn more about our own platforms by learning about other platforms.

To that end, we have formed an Ohio non-profit organization whose goal is to put on technical education conferences. The first conference will be CodeMash Conference 2007.

It will be held at the Kalahari, in Sandusky, Ohio on January 18th and 19th, 2007.

We have arranged some major sponsors, and even more importantly, some GREAT speakers.

Great Speakers? YES! Bruce Eckel, Neal Ford, and Scott Guthrie! Check the site out for more information. We have other speakers in process, but we can't announce them quite yet.

This is not another .NET or Microsoft conference. Those are great. We are trying to reach out and lift developers above their platform. These platforms are just the tools we are using today.

The site has just been launched in a basic form, with an upgrade coming soon to allow for attendee registration.

Please check it out and register for email updates, or subscribe to the RSS feed.

www.codemash.org

Microsoft SOA and Business Process Conference 2006


This post was going to be a daily post about the SOA conference, as it was happening. Unfortunately, I had been on the road for weeks during that period, and then my stay at the conference was cut short for personal reasons. Here are some of my notes; I will probably blog about some of the topics in more depth as time permits.

 

The annual SOA conference at MS has just kicked off. We are expecting some major announcements pertaining to the future of the BizTalk platform. I have some ideas about what they will say, but I will hold out for the keynotes to see for sure.

A lot of well-known speakers will be here this week, and I look forward to talking about SOA and what advances there are in helping drive business value.

David Chappell is giving the first keynote right now. He defines some fundamental aspects of SOA (which is universally difficult to put a good definition on) as:

- Standardize on an SO communication protocol (SOAP is what finally made this 20-year-long overnight success catch on)

- WCF and SCA (non-Windows) are enhancing this.

- Create the necessary SO infrastructure

- Use BPM technologies effectively

Ultimately, this is all about providing business value. While SOA does this in many ways, the business agility enhancement is one of the best. The more easily IT can respond to changing business needs, the better the business will do.

The first keynote was excellent, and I just added him to my 'must see regardless of topic' speaker list. Keep in mind there are two David Chappells who speak on SOA. How funny is that?

THEMES FOR V.NEXT

There was a session with Oliver (top manager for BizTalk Server) about what the next version of BizTalk will look like. He was speaking about the version that comes after BizTalk Server 2006 R2.

- Mission-critical enterprise

now: high scale, reliable, managed and controlled

vNext: models raise the level of abstraction; a distributed execution environment; fill key gaps in the platform; treat n machines as one machine; the ability to scale up a normal business app

- People-ready process (people-centric business process)

now: people do the work; human interaction; UI, tasks, and roles; Office/DB integration

vNext: rich tool support; pre-built common activities; a portal for state and KPIs

- Rich connected apps

now: internet; massive user scale; access from anywhere; composed services; multiple trust domains

vNext: pre-built services; network transparency; convenient secure identity; tool support across the lifecycle

BizTalk Workgroup Edition

There might be a new edition that would fill a current need in enterprise environments. Some organizations run a central core for operations, and need a BTS box out in each distribution center they have. The licensing is too expensive in this model. MS is thinking of releasing a version that would be less expensive, but would only work with other BTS servers in its workgroup; in the situation above, that would be only those core enterprise BTS servers. Interesting idea.

 

There was some great talk on the new adapter framework, but that is for another post.