DaveWentzel.com            All Things Data

Service Broker Demystified - CLOSED conversations

Those new to Service Broker are often confused by CLOSED conversations: they hang around even after they are CLOSED.  A quick Google search will tell you this is for security purposes.  In this post I'll show you exactly why that is.  Also, invariably, one day a DBA will see a huge number of CLOSED conversations that never get cleaned out.  I'll also cover why that happens and how to avoid it.  

I've heard that people don't want to use Service Broker because it is too confusing. I started a blog series called Service Broker Demystified because SSB really isn't that difficult to understand. But there are some confusing aspects to SSB that I'm covering in this blog series.  Today we are going to cover why CLOSED conversations seem to stick around in your queues.  

Every CLOSED conversation will stay around for at least 30 minutes.  Every one.  (Strictly speaking, the entry lives in sys.conversation_endpoints rather than in the queue itself.)  

This is easy to demo.  You can download the repro script to try on your own.  

Let's build some basic scaffolding:

(Figure 1)
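
For readers without the repro script handy, here is a minimal sketch of what such scaffolding might look like. All object names (dbo.InitiatorQueue, dbo.TargetQueue, and the two //Demo/ services) are made up for illustration and may not match the actual script. It assumes Service Broker is already enabled in the current database.

```sql
-- Hypothetical names throughout; the original repro script may differ.
-- Assumes Service Broker is enabled in this database.
CREATE QUEUE dbo.InitiatorQueue;
CREATE QUEUE dbo.TargetQueue;

-- A service with no contract list can only initiate dialogs.
CREATE SERVICE [//Demo/InitiatorService] ON QUEUE dbo.InitiatorQueue;

-- The target must list at least one contract to accept dialogs.
-- The built-in [DEFAULT] contract keeps the demo minimal.
CREATE SERVICE [//Demo/TargetService] ON QUEUE dbo.TargetQueue ([DEFAULT]);
```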

That gives us a basic send and receive setup.  Now we'll send the most basic message to the receiver:  

(Figure 2)
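
A send along these lines might look like the sketch below. It assumes two services named //Demo/InitiatorService and //Demo/TargetService already exist in the same database (hypothetical names, not necessarily those in the repro script) and uses the built-in [DEFAULT] contract and message type.

```sql
-- Assumes hypothetical services //Demo/InitiatorService and
-- //Demo/TargetService exist in the current database.
DECLARE @h UNIQUEIDENTIFIER;

BEGIN DIALOG CONVERSATION @h
    FROM SERVICE [//Demo/InitiatorService]
    TO SERVICE N'//Demo/TargetService'
    ON CONTRACT [DEFAULT]
    WITH ENCRYPTION = OFF;  -- avoids needing a database master key for a local demo

-- No MESSAGE TYPE clause, so the DEFAULT message type is used.
SEND ON CONVERSATION @h (N'Hello');
```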

We get two sys.conversation_endpoints entries, one for the initiator and one for the target.  Both are CONVERSING.  Both share the same conversation_id (each endpoint gets its own conversation_handle).

(Figure 3)
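
A query along these lines shows the endpoints. The column list is just one reasonable choice for following the rest of this walkthrough.

```sql
-- One row per conversation endpoint in the current database.
-- is_initiator = 1 marks the initiator's row, 0 the target's.
SELECT conversation_handle,
       conversation_id,
       is_initiator,
       state_desc,
       far_service,
       security_timestamp
FROM sys.conversation_endpoints
ORDER BY conversation_id, is_initiator DESC;
```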

The next step is for the receiver to acknowledge the message.  To keep this simple we merely perform an END CONVERSATION:  

(Figure 4)
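
As a rough sketch, the target side could look like this. The queue name dbo.TargetQueue is an assumption for illustration, and a real receiver would normally inspect the message type and body before ending the conversation.

```sql
-- Assumes a hypothetical target queue named dbo.TargetQueue.
DECLARE @h UNIQUEIDENTIFIER,
        @type SYSNAME,
        @body VARBINARY(MAX);

-- Pull one message off the target queue.
RECEIVE TOP (1)
    @h = conversation_handle,
    @type = message_type_name,
    @body = message_body
FROM dbo.TargetQueue;

-- The target ends its side first.
IF @h IS NOT NULL
    END CONVERSATION @h;
```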

Now we see that the conversation's state_desc has changed.  The receiver has been set to CLOSED and the initiator is set to DISCONNECTED_INBOUND, meaning the far_service has ended its side of the conversation.  

(Figure 5)

Finally, the sender should always close its conversation on its side.  This is usually done as an activated proc on the sending queue.  

(Figure 6)
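
Done by hand rather than in an activated proc, the initiator's side might look like the sketch below. The queue name dbo.InitiatorQueue is an assumption. The message type name is the standard one Service Broker delivers when the far side ends the dialog.

```sql
-- Assumes a hypothetical initiator queue named dbo.InitiatorQueue.
DECLARE @h UNIQUEIDENTIFIER,
        @type SYSNAME;

-- The target's END CONVERSATION arrives here as an EndDialog message.
RECEIVE TOP (1)
    @h = conversation_handle,
    @type = message_type_name
FROM dbo.InitiatorQueue;

-- Close our side only once the target has closed its side.
IF @type = N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
    END CONVERSATION @h;
```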

Here I noted the time I executed the command.  We see the initiator's conversation endpoint is GONE, yet the receiver's endpoint is still listed as CLOSED.  Note also that the security_timestamp is set to about 30 minutes later, which is when this endpoint will be purged.  

(Figure 7)

Using that basic dialog pattern we can learn a lot about how Service Broker messages and dialogs work.  For instance:

...So why do CLOSED conversations live for 30 minutes in your queues?  

To prevent replay attacks.  In a previous post I mentioned that whenever SSB communicates with another service it will, by default, encrypt the message unless you specify ENCRYPTION = OFF when you start the dialog.  This gives us some interprocess security, but it is still possible for the original dialog to be replayed (for instance, by a "man in the middle") so that a valid dialog is fraudulently repeated.  By maintaining the CLOSED conversation handle for 30 minutes, SSB ensures that if it sees that conversation handle again from (what appears to be) the same network host with the same encryption key, it knows this might be a replay attack (or just programmer error), and the conversation will not be reopened and processed.  

So how exactly would a man-in-the-middle attack work?  

After the target closes his conversation it is marked as CLOSED (see Figure 5).  That is the entry that is maintained for 30 minutes.  In this case we can rest assured that both the conversation_handle and the conversation will not be altered by a third party.  SSB doesn't need to maintain or track this for the sending side of the conversation.  There's nothing really that we can do if a third party spoofs that side of the conversation.  But even that can't be spoofed because the entire conversation_handle is marked as CLOSED anyway (see Figure 7 above).  We don't need to maintain both sides of the conversation.  Just one.  

In security engineering a nonce is an arbitrary number used only once in a cryptographic communication.  The conversation_handle is similar in spirit to a nonce word.  Both are random or pseudo-random "things" issued during an authentication protocol.  Once the nonce is seen by either side (the same conversation_handle in a CLOSED state), then the communication channel cannot be reused or replayed.  HTTP digest authentication works this way and this is how SSB was modeled.  The timestamp is included, just like with digest authentication to ensure timeliness of the messages.  This is also why you can declare a LIFETIME on a dialog in SSB.  
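
For reference, declaring a LIFETIME looks roughly like this. The service names are hypothetical and the 600 seconds is an arbitrary value chosen for illustration. Note that both sides still have to END CONVERSATION explicitly. If the lifetime expires first, each open endpoint receives an error message instead.

```sql
-- Assumes hypothetical services //Demo/InitiatorService and
-- //Demo/TargetService exist in the current database.
DECLARE @h UNIQUEIDENTIFIER;

BEGIN DIALOG CONVERSATION @h
    FROM SERVICE [//Demo/InitiatorService]
    TO SERVICE N'//Demo/TargetService'
    ON CONTRACT [DEFAULT]
    WITH LIFETIME = 600,    -- seconds the dialog may stay open (arbitrary value)
         ENCRYPTION = OFF;  -- local demo only
```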

...so why do CLOSED entries in sys.conversation_endpoints sometimes NOT disappear after 30 mins?  

When this happens you'll usually see thousands of entries just like Figure 5.  Lots of CLOSED conversations.  This means you didn't model your dialogs correctly or you have a bug.  Here is the simple rule:  the target always does END CONVERSATION first.  No Exceptions!!!

This is the pattern that you SHOULD be using...if you aren't then you are susceptible to "conversation population explosion".  

  1. Initiator starts dialog and sends message
  2. Target RECEIVEs the request
    1. Target may do some processing and send another message back to the sender (optional)
    2. Initiator receives and processes that response and (optionally) sends another response or acknowledgement
  3. Target does END CONVERSATION
  4. Initiator processes END CONVERSATION and issues its own END CONVERSATION (usually via an activator on the initiator queue)
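
Step 4 could be sketched as an activated procedure like the one below. The procedure and queue names are assumptions, and this is only one way to write it. A production version would typically wrap the work in a transaction and log the error body.

```sql
-- Hypothetical names: dbo.InitiatorQueue and dbo.InitiatorQueue_Activated.
CREATE PROCEDURE dbo.InitiatorQueue_Activated
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @h UNIQUEIDENTIFIER, @type SYSNAME, @body VARBINARY(MAX);

    WHILE 1 = 1
    BEGIN
        -- Wait up to one second for the next message, then give up.
        WAITFOR (
            RECEIVE TOP (1)
                @h = conversation_handle,
                @type = message_type_name,
                @body = message_body
            FROM dbo.InitiatorQueue
        ), TIMEOUT 1000;

        IF @@ROWCOUNT = 0 BREAK;

        -- The target ended first (or an error arrived): close our side too.
        IF @type IN (N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog',
                     N'http://schemas.microsoft.com/SQL/ServiceBroker/Error')
            END CONVERSATION @h;
        -- Any other message type would be an application response (step 2.2).
    END
END;
GO

-- Have the queue launch the procedure whenever messages arrive.
ALTER QUEUE dbo.InitiatorQueue
WITH ACTIVATION (
    STATUS = ON,
    PROCEDURE_NAME = dbo.InitiatorQueue_Activated,
    MAX_QUEUE_READERS = 1,
    EXECUTE AS OWNER
);
```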

...so why does sys.conversation_endpoints sometimes get tons of DISCONNECTED_INBOUND entries?  

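Look at Figure 5 again.  DISCONNECTED_INBOUND means the far side has ended the conversation but the local side has not yet issued its own END CONVERSATION.  Entries pile up in that state when the Initiator issues the END CONVERSATION before the Target and the Target never ends its side.  This is a bug or design flaw.  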

In a future post I'll show you how to properly model a monolog, which is sometimes called a "fire and forget" message.  Let's say you have some process that you want to send a message to.  You don't care if it completes or even if it throws an error.  You might be tempted to model it this way:  

  1. Initiator starts dialog and sends message and immediately does END CONVERSATION
  2. Target RECEIVEs the request and does the work

What happens?  You leak conversation handles and sys.conversation_endpoints begins to fill up.  Don't model your conversations that way.  The initiator END'd first and the target never did any END CONVERSATION.
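
A quick way to check whether you are leaking is to count endpoints by state. This is just a diagnostic sketch. What counts as "too many" depends on your workload.

```sql
-- Summarize conversation endpoints by side, state, and far service.
-- A steadily growing count in one state usually points at a modeling bug.
SELECT is_initiator,
       state_desc,
       far_service,
       COUNT(*) AS endpoint_count
FROM sys.conversation_endpoints
GROUP BY is_initiator, state_desc, far_service
ORDER BY endpoint_count DESC;
```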

...so I found a bug in my design and I leaked a bunch of conversation_handles.  How do I fix it?  

END CONVERSATION conversation_handle WITH CLEANUP
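
To do that for a whole batch of leaked handles, a small cursor along these lines can work. This is a sketch for dev environments, not a definitive cleanup script. The state filter is an assumption you should adjust to your own leak, and WITH CLEANUP drops the endpoint without notifying the other side.

```sql
-- Dev-only sketch: forcibly remove leaked conversation endpoints.
DECLARE @h UNIQUEIDENTIFIER;

DECLARE leaked CURSOR LOCAL FAST_FORWARD FOR
    SELECT conversation_handle
    FROM sys.conversation_endpoints
    WHERE state_desc = N'DISCONNECTED_INBOUND';  -- adjust to match your leak

OPEN leaked;
FETCH NEXT FROM leaked INTO @h;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Removes the endpoint and any messages for it.
    END CONVERSATION @h WITH CLEANUP;
    FETCH NEXT FROM leaked INTO @h;
END

CLOSE leaked;
DEALLOCATE leaked;
```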

How can you remember to model your dialogs properly?  

Just use CB protocol.  If you are too young to remember that, just go watch Smokey and the Bandit.  This is known as voice procedure and is used in other industries as well.  Here's an example:  

CB Slang | Meaning | SSB Equivalent
Bandit:  "Snowman, snowman, what's your twenty.  Over" | Attention Snowman, where are you?  I am done talking and waiting for you to reply. | BEGIN DIALOG @h FROM SERVICE Bandit TO SERVICE 'Snowman'; SEND ON CONVERSATION @h ('What''s your twenty?');
Snowman:  "Hey Bandit I'm at a choke and puke on Route 80. Over" | I'm at a diner on Route 80.  I am done talking and waiting for you to reply. | RECEIVE FROM Queue; SEND ON CONVERSATION @h ('At the choke and puke.');
Bandit:  "I'll meet you there in 20 minutes. Over" | I'll meet you there in 20 minutes.  I am done talking and waiting for you to reply. | RECEIVE FROM SenderQueue; SEND ON CONVERSATION @h ('I''ll meet you there in 20 minutes.');
Snowman:  "10-4.  Over and out." | I understand.  I am done talking and do not expect a reply. | END CONVERSATION --on target side.
Bandit:  "Over and out." | I am also done talking. | END CONVERSATION --on sender side.

 



12 comments

Comment: 
I wish I had found this article about 3 days earlier, it would have saved me hours of frustration. Your whole series is fantastic and lives up to its "demystified" title. Thank you!

Comment: 
Thanks for the kind words and best of luck to you. --dave

Comment: 
I run SSB between 2 servers. I wonder why I saw only 1 endpoint with status 'CONVERSING' even when the receiver has received the message and issued the END CONVERSATION command. If I check the conversation status on the initiator side, it is blank. How can I be sure to remove a conversation that was already received at the target when the endpoint at the initiator is still 'CONVERSING'?

Comment: 
The initiator must also END CONVERSATION *after* the target has.  In most situations (not all, depends on your design) the easiest thing to do is run an activator on the initiator side that simply runs END CONVERSATION on anything *not* an error in the q.  Without seeing your design I probably shouldn't comment but I would SERIOUSLY reconsider using SSB for inter-instance communication.  It's fraught with danger and gotchas.  Much better to use a tool BUILT for this purpose (RabbitMQ, JMS, whatever).

Comment: 
At first glance it seemed to me that you wanted to say 'Don't be afraid to use SSB'. But your last comment seems to go the other way. But thank you very much for your document, it opened my mind to other message brokers like RabbitMQ.

Comment: 
SSB is great for asynch processing *within your database*.  It's simple to embed in triggers to do near real time ETL for instance.  These SSB patterns, while overly cumbersome to code up, are established and work fairly well if you are careful about error handling and alerting.  I would NEVER use SSB to communicate across db instances.  Never.  Why would you want to embed a queueing system into your database?  Considering SSB is a "headless" application, essentially, you are going to need to roll-your-own extensive monitoring.  Let's face it...for communicating between dbs there are simpler sql solutions like replication.  If you need full-blown asynchronous messaging why would you want to shoehorn that into SQL Server and SSB, which maybe a few hundred folks in the world really understand, instead of using a technology like JMS or something purpose-built for the use case?  Seems obvious to me.  Not everything belongs in the database.  Queues-in-the-db is considered an anti-pattern by just about everyone.  But that doesn't mean SSB is a technology that should always be avoided.  Like I said, asynchronous triggers are a perfect use case for SSB.  You wouldn't want to do that, necessarily, using something like JMS.  SQL Agent Jobs are another approach as well.  

Comment: 
I was looking at using SSB for pushing data to another system instead of SSIS to get as close to real time as possible, but now I'm second guessing that. My issue is the Disconnected_Inbound state on the initiator when the Closed state is on the receiver. You've spelled out what happens but now I'd like to know why it happens. What's the process that tells the initiator to close the conversation before it receives the response from the target? Is there a time to live when you don't include END CONVERSATION on the initiator, and can it be modified?

Comment: 
Jason, these are all really good questions.  SSB is not meant, IMHO, for the use case you want (real-time pushing of data changes).  SSB would, IMO, work well to track the changes you need and allow another mechanism to make a pull request for them as needed.  As I've said in many articles on my site, using SSB across instance-boundaries opens a whole can of worms around sys.routes, endpoints, etc that I don't think SSB handles gracefully.  I would need to see the EXACT conversation pattern your application uses to understand WHY you are seeing (unwanted) disconnected_inbound states.  Again, the more complicated you make the application workflow using SSB, the more you ultimately run into the unexpected.  The best pattern to use is, for lack of a better term, the "modified fire-and-forget pattern".  As we all know fire-and-forget is unsupported in SSB "dialogs"...but 99% of all use cases/workflows can frankly be distilled down into a series of "monologs" that are strung together.  You can do monologs (fire-and-forget) in SSB if you follow my pattern.  This is no different than JMS, RabbitMQ, or anything else AMQP...the best patterns are the simplest...fire-and-forget is the easiest to implement...all workflows can be made into monologs with careful introspection...always assume "at least once" delivery semantics...make all conversations idempotent.  Every time I've gotten creative (read "cocky") with SSB I've paid the price.  Finally, I believe there is a TTL in newer releases of SQL Server/SSB...unfortunately I have no experience with them.  

Comment: 
Hi Very excellent blog! One question concerning "...so I found a bug in my design and I leaked a bunch of conversation_handles. How do I fix it? ". Which SIDE of conversation has to call "END CONVERSATION conversation_handle WITH CLEANUP", the sender or receiver ?

Comment: 
It depends on your "workflow".  I wrote a little cursor for times when this happens (and it should only be in dev) that finds all cases and does the END...WITH CLEANUP first from the initiator.  If that fails I have it try from the target.  

Comment: 
Are you sure that the target should end the conversation first and only then the sender? In TCP it is the other way around.

Comment: 
I'm positive.
