Monday, April 13, 2009

What are JMS Grid Network Connections?

One of the interesting features in JMS Grid (aka Sun Message Service Grid) is the concept of Network Connections.

JMS Grid has two types of connections that can exist between active daemons. Cluster connections are the more familiar of the two: these are the synchronous connections that exist between daemons in a cluster. Each daemon has its own data store, so these connections ensure that the data store for each daemon is maintained in a consistent state. This includes ensuring the same messages are available at each daemon, and that messages are not simultaneously dequeued from two different daemons.
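To make the synchronous behaviour concrete, here is a minimal toy model in Python. It is only a sketch of the idea described above, not JMS Grid's implementation: the `Daemon` and `Cluster` classes and their methods are hypothetical names invented for illustration. Every enqueue and dequeue is applied to all daemons before the call returns, so a message taken from one daemon can never be taken again from another.

```python
class Daemon:
    """Toy stand-in for a daemon that keeps its own local message store."""

    def __init__(self, name):
        self.name = name
        self.store = []  # this daemon's private copy of the queue


class Cluster:
    """Hypothetical model of a synchronous cluster connection.

    Each operation is applied to every daemon before it returns,
    so all stores stay identical.
    """

    def __init__(self, daemons):
        self.daemons = daemons

    def enqueue(self, message):
        # The message becomes available at every daemon.
        for daemon in self.daemons:
            daemon.store.append(message)

    def dequeue(self, via):
        # 'via' is the daemon the client happens to be connected to.
        if not via.store:
            return None
        message = via.store[0]
        # Remove the message everywhere before handing it to the client.
        for daemon in self.daemons:
            daemon.store.remove(message)
        return message


a, b = Daemon("a"), Daemon("b")
cluster = Cluster([a, b])
cluster.enqueue("order-1")

first = cluster.dequeue(via=a)   # the client on daemon a gets the message
second = cluster.dequeue(via=b)  # the client on daemon b finds nothing left
```

In this model the second client sees an empty queue, which is the guarantee the cluster connection is there to provide.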

JMS Grid also has the concept of Network connections. These are the connections that can exist between clusters. Network connections are asynchronous, and are recommended for links spanning large geographical areas or unreliable networks.

The following diagram shows the distinction between these types of connection:



I was initially drawn to Network connections as a way of solving a specific high availability problem. The solution in question consisted of two sites: a primary site and a disaster recovery site. The primary site needed to be highly available in its own right and consisted of two daemons on a LAN: this was an obvious case for cluster connections.

In addition, the disaster recovery site needed an up-to-date store of all messages. In the case of a failover to disaster recovery, connections would be redirected to the DR daemons, and clients would continue dequeuing any existing messages or enqueuing new messages. Asynchronous network connections seemed like the perfect way of ensuring message state was persisted at the geographically remote disaster recovery site without impacting performance at the active site.

Sun's documentation implies this usage should succeed:

"Clusters can be connected to each other to form networks of clusters.
These intercluster connections are known as network connections...
Networks differ from clusters in that they are loosely coupled."


Unfortunately the JMS Grid User Guide is vague on the exact intention of network connections. The following statements imply some of the caveats that apply to network connections:

"Messages are only routed between two clusters when a message sent by a client on one cluster needs to be delivered to a client on the other cluster."

This raised the question of how JMS Grid knows which messages should be routed between clusters. My initial assumption was that all messages would be routed to all clusters, thereby allowing messages to be enqueued onto any daemon and dequeued from any daemon.

The following provided a clue to how this works:

"A network filter explicitly controls the propagation of subscriptions and hence messages across a network connection."

I initially overlooked this statement, but it is the key to understanding the purpose of network connections. Network connections are specifically intended for cases where messages are enqueued in one cluster, and then dequeued in a different (but specific) cluster at the other end of a network connection. This is not what I needed: I needed messages to be enqueued and dequeued from the same cluster, unless there was a failover to the DR site, in which case dequeuing would occur in a different cluster. I specifically wanted all messages to be available for dequeuing from any daemon in the architecture.
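The routing behaviour described above can be sketched with a small toy model. This is an illustration under stated assumptions, not JMS Grid's API: `ToyCluster`, `ToyNetworkConnection` and the glob-style filter pattern are all hypothetical, and the real network filter syntax may look nothing like this. The point is only that a message crosses the link when a subscription on the remote cluster has been propagated through the filter, and stays local otherwise.

```python
from fnmatch import fnmatchcase


class ToyCluster:
    """Hypothetical cluster: tracks local subscriptions and received messages."""

    def __init__(self, name):
        self.name = name
        self.subscriptions = set()  # destinations that local clients consume from
        self.received = []          # (destination, body) pairs delivered here

    def deliver(self, destination, body):
        self.received.append((destination, body))


class ToyNetworkConnection:
    """Hypothetical one-way view of a network connection to a remote cluster."""

    def __init__(self, remote, filter_pattern):
        self.remote = remote
        # Assumption: a glob pattern stands in for the real network filter.
        self.filter_pattern = filter_pattern

    def propagated_subscriptions(self):
        # Only remote subscriptions that pass the filter are visible locally.
        return {
            destination
            for destination in self.remote.subscriptions
            if fnmatchcase(destination, self.filter_pattern)
        }

    def route(self, destination, body):
        # A message is forwarded only if a propagated subscription wants it.
        if destination in self.propagated_subscriptions():
            self.remote.deliver(destination, body)
            return True
        return False


dr = ToyCluster("dr")
dr.subscriptions.add("orders.eu")
link = ToyNetworkConnection(remote=dr, filter_pattern="orders.*")

routed = link.route("orders.eu", "msg-1")   # a remote subscriber exists
skipped = link.route("orders.us", "msg-2")  # no remote subscriber, stays local
```

Under this model a DR cluster with no active consumers would receive nothing at all, which matches why network connections did not give me a standing copy of every message.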

The documentation for JMS Grid does not clearly state that network connections do not fulfil the use case I intended, but I have verified this through trial and error. Given this intended use, it is difficult to see many scenarios where network connections can provide any value. I have never encountered a scenario where a business always wanted to enqueue in one geographic region, and then dequeue in a separate geographic region. I have however come across many situations where the usage I intended is required.

It does seem reasonable that network connections cannot fulfill the purpose I intended. Network connections use asynchronous links - but send data to active daemons. The JMS Grid network could not manage transactional dequeuing of messages across such a network, since two clients may attempt to dequeue in different clusters at the same time. This scenario can only be handled with synchronous communication (cluster connections) between daemons, as the dequeuing daemon needs to check with all other daemons that the message hasn't been dequeued already.
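The race described above is easy to reproduce in a toy simulation. The sketch below is not how JMS Grid behaves internally; `AsyncCluster` and its `flush` method are hypothetical and simply model two active clusters that exchange enqueue and dequeue events after the fact rather than agreeing on them first.

```python
from collections import deque


class AsyncCluster:
    """Hypothetical active cluster that replicates events asynchronously."""

    def __init__(self, name):
        self.name = name
        self.store = []
        self.outbox = deque()  # events not yet sent to the peer
        self.peer = None

    def enqueue(self, message):
        self.store.append(message)
        self.outbox.append(("enqueue", message))

    def dequeue(self):
        if not self.store:
            return None
        message = self.store.pop(0)
        # The peer only learns about this dequeue when flush() runs later.
        self.outbox.append(("dequeue", message))
        return message

    def flush(self):
        """Deliver pending events to the peer (the asynchronous step)."""
        while self.outbox:
            op, message = self.outbox.popleft()
            if op == "enqueue":
                self.peer.store.append(message)
            elif message in self.peer.store:
                self.peer.store.remove(message)


primary, dr = AsyncCluster("primary"), AsyncCluster("dr")
primary.peer, dr.peer = dr, primary

primary.enqueue("order-1")
primary.flush()                  # dr now holds a copy of order-1

got_primary = primary.dequeue()  # dequeue event is still sitting in the outbox
got_dr = dr.dequeue()            # dr has not heard about it yet
```

Both clients receive the same message because neither cluster checked with the other before handing it out. Avoiding that requires the synchronous check that cluster connections perform.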

My intended usage of network connections was more akin to the use of Dataguard with Oracle. This uses asynchronous messaging to keep two Oracle instances in sync. The reason Dataguard is able to accomplish this over an asynchronous channel is that only one of the Oracle instances is active at a time - and therefore data does not need to be synchronized in both directions simultaneously.
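The active/passive arrangement can be sketched in the same toy style. This is a simplified model of one-way log shipping, not a description of how Dataguard is implemented; the `Primary` and `Standby` classes are hypothetical. Because only the primary accepts operations, the standby can replay the log whenever it arrives and still converge on the same state.

```python
class Primary:
    """Hypothetical active site: the only place operations are accepted."""

    def __init__(self):
        self.store = []
        self.log = []  # ordered record of every operation

    def enqueue(self, message):
        self.store.append(message)
        self.log.append(("enqueue", message))

    def dequeue(self):
        if not self.store:
            return None
        message = self.store.pop(0)
        self.log.append(("dequeue", message))
        return message


class Standby:
    """Hypothetical passive site: never serves clients, only replays the log."""

    def __init__(self):
        self.store = []
        self.applied = 0  # number of log entries already replayed

    def apply(self, log):
        # Replay only the entries that have not been seen yet, in order.
        for op, message in log[self.applied:]:
            if op == "enqueue":
                self.store.append(message)
            else:
                self.store.remove(message)
        self.applied = len(log)


primary, standby = Primary(), Standby()
primary.enqueue("order-1")
primary.enqueue("order-2")
taken = primary.dequeue()

lagging = list(standby.store)  # nothing shipped yet, so the standby is behind
standby.apply(primary.log)     # the log arrives some time later
```

After the log is applied the standby holds exactly the messages still waiting at the primary, with no risk of a double dequeue, because no client was ever dequeuing at the standby.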

The only way JMS Grid allowed my required architecture to be implemented was by placing a clustered daemon at the DR location. This however limited my flexibility, since only two active daemons are recommended in any cluster. The ultimate solution to this would be providing database backed storage to JMS Grid. This would allow me to use Dataguard to ensure my DR site had a copy of the message store, and remove the responsibility from JMS Grid.

cisdal - simply integration

Sunday, April 12, 2009

Is ebXML dead?

I have always found Google Trends a useful tool for analyzing the popularity of various technologies over time. I have been working with a client who has adopted ebXML as their business to business gateway technology. Although I had no say in this decision, I did not necessarily think that it was a poor decision.

I did not have much feel for the strength of ebXML in the B2B marketplace, so I was interested in performing some research.

ebXML seems a decent choice when compared to other possible choices such as EDI and AS2. I was however quite surprised when I looked at the trends for these technologies. The diagram below shows a comparison of EDI, AS2, ebXML and RosettaNet from 2004 until 2009.




EDI (dark blue line) has been decreasing in popularity to some degree over time, as might be expected, but it is still overwhelmingly more prominent than the other B2B technologies.

Because EDI is so much more prominent, I removed it from the equation for the graph below in order to get a clearer picture of how the other technologies stack up:




In 2004, ebXML was the most prominent of these technologies (light blue line). AS2 (yellow line) and RosettaNet (red line) were essentially neck-and-neck. Since 2007, however, AS2 has been gaining substantially in popularity. Throughout the entire time frame, ebXML has been losing popularity.

A better picture of the health of ebXML can be seen by graphing it individually:



ebXML has been in a long downward spiral for the last 4 years, and through that time its relevance seems to have diminished by a factor of 25.

Is ebXML dead? The outlook clearly does not look good, but subsequent posts will examine ebXML and its competitors in more detail. Google Trends does not provide a definitive answer to the question, but it does provide strong indications.
