Oracle7 Server Distributed Systems Volume I: Distributed Data

Contents Index Home Previous Next

Manually Overriding In-Doubt Transactions

A database administrator can manually force the COMMIT or ROLLBACK of a local in-doubt distributed transaction. However, a specific in-doubt transaction should be manually overridden only when the following situations exist:

Normally, a decision to locally force an in-doubt distributed transaction should be made in consultation with administrators at other locations. A wrong decision can lead to database inconsistencies which can be difficult to trace and that you must manually correct.

If the conditions above do not apply, always allow the automatic recovery features of Oracle7 to complete the transaction. However, if any of the above criteria are met, the administrator should consider a local override of the in-doubt transaction. If a decision is made to locally force the transaction to complete, the database administrator should analyze available information with the following goals in mind:

		COMMIT COMMENT 'Finance/Accts_pay/Trans_type 10B';

		ALTER SESSION ADVISE COMMIT;
		INSERT INTO emp@hq ... ;    /*advice to commit at HQ */
		ALTER SESSION ADVISE ROLLBACK;
		DELETE FROM emp@sales ... ; /*advice to roll back at SALES*/
		ALTER SESSION ADVISE NOTHING;

Manual Override Example

The following example shows a failure during the commit of a distributed transaction and how to go about gaining information before manually forcing the commit or rollback of the local portion of an in-doubt distributed transaction. Figure 5 - 7 illustrates the example.

Figure 5 - 7. An Example of an in-Doubt Distributed Transaction

In this failure case, the prepare phase completed. However, during the commit phase, the commit point site's commit message (the message telling the global coordinator that the transaction was committed at the commit point site) never made it back to the global coordinator, even though the commit point site committed the transaction.

You are the WAREHOUSE database administrator. The inventory data locked because of the in-doubt transaction is critical to other transactions. However, the data cannot be accessed because the locks must be held until the in-doubt transaction either commits or rolls back. Furthermore, you understand that the communication link between sales and headquarters cannot be resolved immediately. Therefore, you decide to manually force the local portion of the in-doubt transaction using the following steps:

The following sections explain each step in detail for this example.

Step 1: Record User Feedback

The users of the local database system that conflict with the locks of the in-doubt transaction get the following error message:

ORA-01591: lock held by in-doubt distributed transaction 1.21.17

Here, 1.21.17 is the local transaction ID of the in-doubt distributed transaction in this example. The local database administrator should request and record this ID number from the users that report problems to identify in-doubt transactions that should be forced.

Step 2: Query DBA_2PC_PENDING

Query the local DBA_2PC_PENDING (see also [*]) to gain information about the in-doubt transaction:

SELECT * FROM sys.dba_2pc_pending
   WHERE local_tran_id = '1.21.17';

For example, when the previous query is issued at WAREHOUSE, the following information is returned.

Column Name            Value
---------------------- --------------------------------------
LOCAL_TRAN_ID          1.21.17
GLOBAL_TRAN_ID         SALES.ACME.COM.55d1c563.1.93.29
STATE                  prepared
MIXED                  no
ADVICE
TRAN_COMMENT           Sales/New Order/Trans_type 10B
FAIL_TIME              31-MAY-91
FORCE_TIME
RETRY_TIME             31-MAY-91
OS_USER                SWILLIAMS
OS_TERMINAL            TWA139:
HOST                   system1
DB_USER                SWILLIAMS
COMMIT#

Figure 5 - 8. Results of Querying DBA_2PC_PENDING

The global transaction ID is the common transaction ID that is the same on every node for a distributed transaction. It is of the form:

global_database_name.hhhhhhhh.local_transaction_id

Here, global_database_name is the database name of the global coordinator (where the transaction originates), hhhhhhhh is an internal database ID at the global coordinator (8 hexadecimal digits), and local_tran_id is the corresponding local transaction ID assigned on the global coordinator. Therefore, the last portion of the global transaction ID and the local transaction ID match at the global coordinator. In the example, you can tell that WAREHOUSE is not the global coordinator because these numbers do not match.

The transaction on this node is in a prepared state. Therefore, WAREHOUSE awaits its coordinator to send either a commit or a rollback message.

The transaction's comment or advice may include information about this transaction. If so, use this comment to your advantage. In this example, the origin (the sales order entry application) and transaction type is in the transaction's comment. This information may reveal something that would help you decide whether to commit or rollback the local portion of the transaction.

If useful comments do not accompany an in-doubt transaction, you must complete some extra administrative work to trace the session tree and find a node that has resolved the transaction.

Step 3: Query DBA_2PC_NEIGHBORS

The purpose of this step is to climb the session tree so that you find coordinators, eventually reaching the global coordinator. Along the way, you might find a coordinator that has resolved the transaction. If not, you can eventually work your way to the commit point site, which will always have resolved the in-doubt transaction.

The DBA_2PC_NEIGHBORS view provides information about connections associated with an in-doubt transaction. Information for each connection is different, based on whether the connection is inbound or outbound:

Additionally, the INTERFACE column tells whether the local node or a subordinate node is the commit point site.

To trace the session tree, you can query the local DBA_2PC_NEIGHBORS view. In this case, you query this view on the WAREHOUSE database.

SELECT * FROM sys.dba_2pc_neighbors
   WHERE local_tran_id = '1.21.17'
   ORDER BY sess#, in_out;
Column Name            Value
---------------------- --------------------------------------
LOCAL_TRAN_ID          1.21.17
IN_OUT                 in
DATABASE               SALES.ACME.COM
DBUSER_OWNER           SWILLIAMS
INTERFACE              N
DBID                   000003F4
SESS#                  1
BRANCH                 0100

The columns of particular interest in this view are the IN_OUT, DATABASE, DBUSER_OWNER, and INTERFACE columns. In this example, the IN_OUT column reveals that the WAREHOUSE database is a server for the SALES database, as specified in the DATABASE column. The connection to WAREHOUSE was established through a database link from the SWILLIAMS account, as shown by the DB_OWNER column, and WAREHOUSE, nor any of its descendants, was the commit point site, as shown by the INTERFACE column.

At this point, you can contact the administrator at the located nodes and ask them to repeat Steps 2 and 3, using the global transaction ID.

Note: If you can directly connect to these nodes with another network, you can repeat Steps 2 and 3 yourself.

For example, the following results are returned when Steps 2 and 3 are performed at SALES and HQ, respectively.

Manually Checking the Status of Pending Transactions at SALES.ACME.COM

SELECT * FROM sys.dba_2pc_pending

   WHERE global_tran_id = 'SALES.ACME.COM.55d1c563.1.93.29';
Column Name            Value
---------------------- --------------------------------------
LOCAL_TRAN_ID          1.93.29
GLOBAL_TRAN_ID         SALES.ACME.COM.55d1c563.1.93.29
STATE                  prepared
MIXED                  no
ADVICE
TRAN_COMMENT           Sales/New Order/Trans_type 10B
FAIL_TIME              31-MAY-91
FORCE_TIME
RETRY_TIME             31-MAY-91
OS_USER                SWILLIAMS
OS_TERMINAL            TWA139:
HOST                   system1
DB_USER                SWILLIAMS
COMMIT#
SELECT * FROM dba_2pc_neighbors
   WHERE global_tran_id = 'SALES.ACME.COM.55d1c563.1.93.29'
   ORDER BY sess#, in_out;

At SALES, there are three rows for this transaction (one for the connection to WAREHOUSE, one for the connection to HQ, and one for the connection established by the user). Information corresponding to the rows for the SALES and HQ connections is listed below:

Column Name            Value
---------------------- --------------------------------------
LOCAL_TRAN_ID          1.93.29
IN_OUT                 OUT
DATABASE               WAREHOUSE.ACME.COM
DBUSER_OWNER           SWILLIAMS
INTERFACE              N
DBID                   55d1c563
SESS#                  1
BRANCH                 1
LOCAL_TRAN_ID          1.93.29
IN_OUT                 OUT
DATABASE               HQ.ACME.COM
DBUSER_OWNER           ALLEN
INTERFACE              C
DBID                   00000390
SESS#                  1
BRANCH                 1

The information from the previous query reveals several facts:

Manually Checking the Status of Pending Transactions at HQ.ACME.COM:

SELECT * FROM dba_2pc_pending

   WHERE global_tran_id = 'SALES.ACME.COM.55d1c563.1.93.29';
Column Name            Value
---------------------- --------------------------------------
LOCAL_TRAN_ID          1.45.13
GLOBAL_TRAN_ID         SALES.ACME.COM.55d1c563.1.93.29
STATE                  COMMIT
MIXED                  NO
ACTION
TRAN_COMMENT           Sales/New Order/Trans_type 10B
FAIL_TIME              31-MAY-91
FORCE_TIME
RETRY_TIME             31-MAY-91
OS_USER                SWILLIAMS
OS_TERMINAL            TWA139:
HOST                   SYSTEM1
DB_USER                SWILLIAMS
COMMIT#                129314

At this point, you have found a node that resolved the transaction. It has been committed. Therefore, you can force the in-doubt transaction to commit at your local database (see the following section for information on manually committing or rolling back in-doubt transactions). It is a good idea to contact any other administrators you know that could also benefit from your investigation.

Step 4: Check for Mixed Outcome

After you manually force a transaction to commit or roll back, the corresponding row in the pending transaction table remains. The STATE of the transaction is changed to forced commit or forced abort, depending on how you forced the transaction.

Furthermore, once connections between the instances resume, RECO checks the global outcome of the transaction. The MIXED column is changed to yes and the row for the transaction is not deleted if you forced the transaction the wrong way.

If you ever see a transaction forced the wrong way, you should be aware that some global data inconsistency may exist. Eventually, you can purge unnecessary rows from the pending transaction table.

The Pending Transaction Table (DBA_2PC_PENDING)

Every Oracle7 database has a pending transaction table which is a special table that stores information about distributed transactions as they proceed through the prepare/commit phases. You can query a database's pending transaction table by referencing the DBA_2PC_PENDING data dictionary view.

Each transaction with an entry in the pending transaction table is classified in one of the following categories (as indicated in DBA_2PC_PENDING.STATE):

collecting This category normally applies only to the global coordinator or local coordinators. The node is currently collecting information from other database servers before it can decide whether it can prepare.
prepared The node has prepared and may or may not have acknowledged this to its local coordinator with a prepared message. However, no commit message has been received. The node remains prepared, holding any local resource locks necessary for the transaction to commit.
committed The node (any type) has committed the transaction, but other nodes involved in the transaction may not have done the same. That is, the transaction is still pending at one or more nodes.
forced commit A pending transaction can be forced to commit at the discretion of a database administrator. This entry occurs if a transaction is manually committed at a local node by a database administrator.
forced abort (rollback) A pending transaction can be forced to roll back at the discretion of a database administrator. This entry occurs if this transaction is manually rolled back at a local node by a database administrator.
Also of particular interest in the pending transaction table is the mixed outcome flag (as indicated in DBA_2PC_PENDING.MIXED). The database administrator can make the wrong choice if a pending transaction is forced to commit or roll back (for example, the local administrator rolls back the transaction, but the other nodes commit it). Incorrect decisions are detected automatically, and the damage flag for the corresponding pending transaction's record is set (MIXED=yes).

The RECO (Recoverer) background process uses the information in the pending transaction table to finalize the status of in-doubt transactions. The information in the pending transaction table can also be used by a database administrator, who decides to manually override the automatic recovery procedures for pending distributed transactions.

All transactions automatically resolved by RECO are automatically removed from the pending transaction table. Additionally, all information about in-doubt transactions correctly resolved by an administrator (as checked when RECO reestablishes communication) are automatically removed from the pending transaction table. However, all rows resolved by an administrator that result in a mixed outcome across nodes remain in the pending transaction table of all involved nodes until they are manually deleted.


Contents Index Home Previous Next