How is failover supported in the Borland EJB Container?

You can register the same sets of beans in each container to get a homogeneous clustering capability, or you can partition your beans across various containers to get a heterogeneous clustering capability.

You should then be able to kill instances of the containers, and your clients will fail over to the replicas. Stateless session beans and entity beans fail over immediately. Stateful session beans can be configured to fail over by sharing a single Stateful Session Storage Service among the containers. To do this, run one instance of the container with only the -jss flag, and start all the other containers without the -jss flag. This centralizes stateful session state in that single JSS instance, which should allow stateful session beans to fail over as well.
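
The client-visible effect for stateless beans can be pictured with a small sketch. This is a conceptual model only, not how the ORB or Smart Agent is implemented: the class names Replica and FailoverStub and the method price are hypothetical, and a boolean flag stands in for a killed container.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical replica of a stateless service running in one container.
class Replica {
    final String name;
    boolean up = true; // set to false to simulate a killed container

    Replica(String name) { this.name = name; }

    String price(String item) {
        if (!up) throw new IllegalStateException(name + " is down");
        return item + " priced by " + name;
    }
}

// Hypothetical client-side stub: tries each replica in order until one answers.
class FailoverStub {
    private final List<Replica> replicas;

    FailoverStub(List<Replica> replicas) { this.replicas = replicas; }

    String price(String item) {
        IllegalStateException last = null;
        for (Replica r : replicas) {
            try {
                return r.price(item);
            } catch (IllegalStateException e) {
                last = e; // this replica is unreachable; try the next one
            }
        }
        throw new IllegalStateException("No replica available", last);
    }
}

class StatelessFailoverDemo {
    public static void main(String[] args) {
        Replica vm1 = new Replica("VM 1");
        Replica vm2 = new Replica("VM 2");
        FailoverStub stub = new FailoverStub(Arrays.asList(vm1, vm2));

        System.out.println(stub.price("book")); // answered by VM 1
        vm1.up = false;                         // VM 1 is killed
        System.out.println(stub.price("book")); // answered by VM 2, caller unchanged
    }
}
```

The caller holds the same stub before and after the failure, which is the sense in which the client is unaware of the failover.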

Failover is supported by leveraging the Naming Service and the Smart Agent. We support failover of all types of EJBs. To summarize the behavior:

  • Stateless session beans. If you run multiple containers (for example, in separate VMs) with the same stateless session bean, they will act as failover replicas of each other. So call 1 to a stateless session bean will go to VM 1. If VM 1 shuts down and the session bean is installed in VM 2, then call 2 will go to VM 2. The client will be unaware of the failover. Since the beans are stateless, this is correct behavior.
  • Entity beans. From the Container standpoint, entity beans are similar to stateless session beans, except that a replica entity must be loaded before it can be used. The current transaction will fail, as it should. Both the resource and the synchronization objects associated with the original (failed) entity in VM 1 will be unavailable and the transaction manager, presuming rollback (as per the OTS/JTS specification), will roll back the current transaction. However, subsequent transactions will simply fail over to the replica entity in VM 2 and will continue correctly.
  • Stateful session beans. Stateful session beans are passivated into a JDatastore database, which is available via an IDL interface. This Session Storage service can run either in-process or out-of-process. If run out-of-process, multiple Containers can share a single Session Storage service. So, let's assume we have two Containers (one in VM 1, and one in VM 2) running replicated stateful session beans, and the Session Storage service running in VM 3. Let's say a client creates a shopping cart (a stateful session bean) in VM 1. The shopping cart will be automatically passivated every 5 seconds (a tunable parameter). Then let's say that the client puts a book into the shopping cart and then thinks for a while. The shopping cart will be stored persistently within 5 seconds in the Session Storage. Now, let's say VM 1 crashes. The shopping cart EJBObject will automatically fail over to VM 2. This container will see that it doesn't have the state for the user's shopping cart activated, and it loads the state from the storage server (VM 3). The client now continues using the shopping cart with its contents intact.
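
The shopping cart scenario can be sketched as a small simulation. This is a minimal model under stated assumptions, not the Borland implementation: SessionStorage, CartContainer, and passivateAll are hypothetical names, an in-memory map stands in for JDatastore, and passivation is triggered by hand rather than by the periodic timer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the shared Session Storage service (VM 3).
class SessionStorage {
    private final Map<String, List<String>> saved = new HashMap<>();

    void store(String sessionId, List<String> state) {
        saved.put(sessionId, new ArrayList<>(state)); // keep a snapshot copy
    }

    List<String> load(String sessionId) {
        List<String> state = saved.get(sessionId);
        return state == null ? null : new ArrayList<>(state);
    }
}

// Hypothetical stand-in for one container (VM 1 or VM 2).
class CartContainer {
    private final SessionStorage storage;
    private final Map<String, List<String>> active = new HashMap<>();

    CartContainer(SessionStorage storage) { this.storage = storage; }

    void create(String sessionId) { active.put(sessionId, new ArrayList<>()); }

    void addItem(String sessionId, String item) { activate(sessionId).add(item); }

    List<String> items(String sessionId) { return new ArrayList<>(activate(sessionId)); }

    // A real container would do this on a timer; here it is called by hand.
    void passivateAll() { active.forEach(storage::store); }

    // Use the in-memory state if present, otherwise load it from shared storage.
    private List<String> activate(String sessionId) {
        List<String> state = active.get(sessionId);
        if (state == null) {
            state = storage.load(sessionId);
            if (state == null) {
                throw new IllegalStateException("No such session: " + sessionId);
            }
            active.put(sessionId, state);
        }
        return state;
    }
}

class StatefulFailoverDemo {
    static List<String> run() {
        SessionStorage vm3 = new SessionStorage();
        CartContainer vm1 = new CartContainer(vm3);
        CartContainer vm2 = new CartContainer(vm3);

        vm1.create("cart-42");
        vm1.addItem("cart-42", "book");
        vm1.passivateAll();            // the cart reaches the storage service
        vm1.addItem("cart-42", "pen"); // added after the last passivation
        vm1 = null;                    // VM 1 crashes; its in-memory state is gone

        return vm2.items("cart-42");   // VM 2 activates the cart from VM 3
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [book]
    }
}
```

In this model the book survives because it was passivated before the crash, while the pen, added after the last passivation, does not reach VM 2.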

This implementation has some problems:

  • What happens to state that changed between the last passivation and the VM crash? It is lost. By default, you can lose up to 5 seconds of work. That is, you can lose all the items you put in your shopping cart within the last 5 seconds. This doesn't seem like a big problem. If need be, you can set passivation to occur more frequently.
  • What happens if VM 3 crashes? Your shopping cart will be unavailable until the Session Storage service is brought back up. Once the service is back up, you are back to where you were before VM 3 crashed. This is really the same issue that you would have if any database failed. Note, however, that since the Session Storage API is IDL, it is possible for you to write your own highly available service, possibly built on Oracle or something else extremely robust. That being said, note that VM 3 is in fact less likely to crash than either VM 1 or VM 2, because VM 3 is not running user code. VM 1 and VM 2 are both running Containers with user code in them. VM 3 is only running JDatastore and the ORB, which both tend to be quite stable, as far as Java code is concerned.

Note: This implementation does not imply that every bean gets published to the Smart Agent. This would be a horrendous (albeit typical) use of the Smart Agent. Rather, services representing the beans get published (one per type of bean). This is accomplished via some POA/VisiBroker 4.x magic under the covers. It is conceptually similar to the VisiBroker 3.x notion of service activators, but considerably more sophisticated. Only the bean's POA is registered.

Also, if a large number of clients are all using a primary container that goes down, all the clients will have to fail over to the replica simultaneously. This might be painful. That being said, if you have "a large number of clients", then typically you will be running through some sort of concentrator (probably a servlet in a web server), and in reality only a small number of clients connect to the container.

Are you assuming that the beans, to which you hope to failover, have already been created in the other container?

No assumptions are being made beyond those inherent in the EJB specification:
  • For stateless session beans, the container "magically" has as many beans available as needed. They don't need to be pre-created.
  • For entity beans, the container loads a bean as needed. Since they already exist in the database, they don't need to be pre-created.
  • For stateful session beans, yes, the object must be created before it is used (as per the EJB specification) and after a timeout it will be stored in a shared repository. Once the bean is created, and passivated, it can be activated into any other container sharing the repository. It does not need to be pre-created in the other container.