Bugzilla – Bug 1384
Transparent reinvocation does not occur if a system exception is thrown by a server (rather than generated by the client ORB).
Last modified: 2003-11-16 17:02:16
We are developing a fault-tolerant system using the TAO FaultTolerance service. We use IOGRs as references to replicated CORBA objects. In some circumstances, the primary instance decides it is not ready to handle requests and throws a TRANSIENT system exception so that the call is transparently redirected to a backup instance, just as if the primary instance were down. We expected the call to reach a backup instance, but found that the exception was thrown directly to the client application.

By looking at the code (TAO_GIOP_Synch_Invocation::invoke_i() in file tao/Invocation.cpp) and following the execution in the debugger, we see that if a system exception is received as a response, the exception is simply demarshalled and thrown to the client application. Transparent reinvocation does occur correctly (a backup instance is called) when the initial connection to the server fails or when an error occurs while sending the request (before a COMM_FAILURE is raised; in that particular case, the FaultTolerance service callback is called, if loaded).

I understand that TAO 1.2.1 is getting old, but the code of TAO 1.2.6 appears to behave the same way.

According to the CORBA specification (2.5 and above, as well as CORBA FaultTolerance Specification v1.0 - ptc/2000-04-04, section X.2.6 "Extensions to CORBA Failover Semantics"), transparent reinvocation should occur at a minimum for the system exceptions COMM_FAILURE, TRANSIENT, NO_RESPONSE and OBJ_ADAPTER, with completion status COMPLETED_NO and COMPLETED_MAYBE (the latter with mechanisms to ensure at-most-once semantics).

Are we wrong to assume that transparent reinvocation should take place if the server explicitly throws a SystemException, or is this something that TAO does not [yet] address?

REPEAT BY: Have the primary instance of the IOGR throw a system exception (e.g. TRANSIENT). The call never reaches a backup instance in the IOGR; instead, the exception is caught in the client code.
SAMPLE FIX/WORKAROUND: In GIOP_Synch_Invocation::invoke_i(...), insert a piece of code after the demarshalling of the system exception and before raising it. The code compares the type_id and the completion status of the exception to determine whether transparent reinvocation should occur. If so, it switches to the next profile (by calling "this->stub ()->next_profile_retry()") and, if there is one, returns "TAO_INVOKE_RESTART" instead of raising the exception.
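
The check described in the workaround could look roughly like the sketch below. This is not TAO code: the Completion enum and the function is_failover_condition() are hypothetical stand-ins for CORBA::CompletionStatus and for whatever helper the fix would add, and the sketch deliberately leaves out the at-most-once mechanisms that the specification requires for COMPLETED_MAYBE. The repository IDs use the standard form for CORBA system exceptions.

```cpp
#include <string>

// Hypothetical stand-in for CORBA::CompletionStatus; not TAO's own type.
enum class Completion { Yes, No, Maybe };

// Returns true when a system exception received in a reply should lead to
// transparent reinvocation on the next profile instead of being raised.
// Assumption: only the four exceptions named in the report qualify.
bool is_failover_condition(const std::string& repo_id, Completion status)
{
    // A completed request must never be retried: it could execute twice.
    if (status == Completion::Yes)
        return false;

    static const char* const failover_ids[] = {
        "IDL:omg.org/CORBA/COMM_FAILURE:1.0",
        "IDL:omg.org/CORBA/TRANSIENT:1.0",
        "IDL:omg.org/CORBA/NO_RESPONSE:1.0",
        "IDL:omg.org/CORBA/OBJ_ADAPTER:1.0",
    };
    for (const char* id : failover_ids)
        if (repo_id == id)
            return true;
    return false;
}
```

In the real invocation path, a true result would be followed by the call to next_profile_retry() mentioned above, and the exception would be raised only if no further profile is available.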
This is not a bug. An exception raised by the server is a valid (and complete) response for a request. You might be able to reach the desired effect by responding with an adequate LOCATION_FORWARD message from the server, for example, using interceptors.
The original bug report is correct (and well written). A quote from x.2.6 of the standards document (as referenced in the original bug report): Each time a client ORB attempts to invoke a method, it must not abandon the invocation and raise an exception to the client application until it has tried to invoke the server using all of the alternative IIOP addresses in the interoperable object group reference, or has received a "non-failover" condition, or the request duration has expired. A TRANSIENT exception is not a "non-failover" condition (see the same section of the standard), and therefore the request must be reinvoked.
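
As a rough illustration of the rule quoted from the specification, the client-side loop can be sketched as follows. Everything here is an assumption made for the example, not TAO's implementation: Outcome, invoke_with_failover() and the use of one callable per alternative address are hypothetical, and the request duration is modelled as a simple deadline.

```cpp
#include <chrono>
#include <functional>
#include <vector>

// Hypothetical result of one invocation attempt on one address.
enum class Outcome { Success, Failover, NonFailover };

// Tries each alternative address in turn. Stops on success, on a
// "non-failover" condition, or once the request duration has expired.
// A Failover result means every address was tried (or time ran out),
// so the caller may now raise the exception to the application.
Outcome invoke_with_failover(
    const std::vector<std::function<Outcome()>>& profiles,
    std::chrono::steady_clock::time_point deadline)
{
    for (const auto& attempt : profiles) {
        if (std::chrono::steady_clock::now() >= deadline)
            break;  // request duration expired
        const Outcome result = attempt();
        if (result != Outcome::Failover)
            return result;  // success or non-failover condition
    }
    return Outcome::Failover;
}
```

Under this model, a TRANSIENT reply from the primary maps to Outcome::Failover, so the loop moves on to the next address instead of returning to the application.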
Should be fixed now. Sun Nov 16 16:49:33 2003 Balachandran Natarajan <bala@dre.vanderbilt.edu>