Oracle7 Server Distributed Systems Volume II: Replicated Data
Developing a Conflict Resolution Strategy
Before selecting or writing a conflict resolution routine, you should first ensure that you have done everything possible to avoid the conflict in the first place. This section outlines how to
- identify and avoid conflicts
- select an appropriate conflict resolution routine
Define Functional Boundaries
When designing a replicated environment, the guidelines for good single-database schema design apply:
- the application should be modular, with functional boundaries and dependencies that are clearly defined (for example, order-entry, shipping, billing)
- data should be normalized to reduce the number of hidden dependencies between modules
In addition, to reduce the potential for conflicts, consider using
- a basic primary site model for data shared between modules, which allows only one module to update the data, while other modules read the data
- an advanced primary site model, where ownership of the data is horizontally partitioned: for example, the server in New York owns customers in New York, and the server in California owns customers in California
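The advanced primary site model above can be sketched as a simple ownership check. This is only an illustration: the mapping OWNER_BY_STATE, the site names, and the functions owning_site and can_update are hypothetical and are not part of any Oracle interface.

```python
# Hypothetical mapping from a customer's state to the site that owns the row.
OWNER_BY_STATE = {"NY": "new_york", "CA": "california"}


def owning_site(customer):
    """Return the site that is allowed to update this customer row."""
    return OWNER_BY_STATE[customer["state"]]


def can_update(local_site, customer):
    """Only the owning site may update; all other sites treat the row as read-only."""
    return owning_site(customer) == local_site


customer = {"id": 101, "name": "Acme", "state": "NY"}
print(can_update("new_york", customer))    # True
print(can_update("california", customer))  # False
```

Because each row has exactly one updating site under this scheme, two sites never update the same row concurrently, so update conflicts on that row cannot arise.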
Use Generated Primary Keys
Use generated sequence numbers for the primary key of each table. By using unique sequence numbers at each site, you can avoid uniqueness conflicts and determine ownership of rows based on the primary key. Although you could simply partition the sequence numbers among the sites, this can become problematic as the number of sites, or number of entries, grows. Instead, allow each site to use the full range of sequence values, and include a unique site identifier as part of the primary key.
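A minimal sketch of the site-qualified key described above, assuming the key is modeled as a (site identifier, sequence number) pair. The class SiteKeyGenerator and the helper owner_of are hypothetical names used only for illustration; in a real schema the sequence would come from the database.

```python
import itertools


class SiteKeyGenerator:
    """Generates composite keys of the form (site_id, sequence_number)."""

    def __init__(self, site_id, start=1):
        self.site_id = site_id
        # Each site draws from the full range of sequence values.
        self._sequence = itertools.count(start)

    def next_key(self):
        return (self.site_id, next(self._sequence))


def owner_of(key):
    """The site identifier embedded in the key identifies the owning site."""
    return key[0]


new_york = SiteKeyGenerator("NY")
california = SiteKeyGenerator("CA")

key_a = new_york.next_key()    # ("NY", 1)
key_b = california.next_key()  # ("CA", 1)

print(key_a != key_b)   # True: same sequence value, but the keys do not collide
print(owner_of(key_b))  # CA
```

Adding a site requires only a new site identifier; no sequence ranges have to be divided up again.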
Conflict Resolution Methods for Dynamic Ownership
If primary site ownership or distributed access to the data is not appropriate, consider dynamic ownership of data. Dynamic ownership permits only one database (the owner) to update the data at a time. Ownership of the data is allowed to move between sites, but only in a way that guarantees that the owner has the most recent data. Non-owners can have out-of-date data, and ordering conflicts can occur, but such conflicts are easily and correctly resolved using a method such as "priority group" or "maximum." Note that single-master methods, such as "overwrite", would result in inconsistencies.
Dynamic ownership is most useful in cases in which
- correctness of data is crucial (such as salary), and
- there is low contention of data, or
- there is reference locality (that is, the site that most recently updated the data is the most likely site to do the next update to the data).
Additional Information: See for more information on dynamic ownership.
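The following sketch illustrates how a "maximum"-style method can resolve ordering conflicts under dynamic ownership. It assumes, purely for illustration, that each row carries a version column that the current owner increments on every update; the column name and the function resolve_maximum are hypothetical and do not represent Oracle's implementation.

```python
def resolve_maximum(current_row, incoming_row, column="version"):
    """Keep whichever row has the larger value in the designated column."""
    if incoming_row[column] > current_row[column]:
        return incoming_row
    return current_row


# Updates made by successive owners, delivered to a non-owner out of order.
updates = [
    {"salary": 52000, "version": 3},
    {"salary": 50000, "version": 1},
    {"salary": 51000, "version": 2},
]

row = {"salary": 48000, "version": 0}
for update in updates:
    row = resolve_maximum(row, update)

print(row)  # {'salary': 52000, 'version': 3}
```

Because only the owner produces new versions, the highest version is always the most recent data, so the non-owner ends with the correct row regardless of the order in which updates arrive.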
Using the Timestamp Resolution Method
Dynamic ownership is unnecessarily restrictive for many types of data. Data such as "date-of-birth" or "address" are rarely crucial to the correct operation of an application. Once this information is inserted into the database, it is rarely updated, so the probability of a conflict is very low. Furthermore, the real world often has checks and balances (such as a forwarding address) that can compensate for slightly out-of-date information. Often, you can resolve conflicts with these types of data by using the "latest timestamp" method. (Designate a backup method, such as site priority, in case of identical timestamps.)
The timestamp method is particularly useful because the data will converge regardless of the number of sites, but special care must be taken:
- use consistent time zones
- ensure increasing timestamps for local updates
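The sketch below illustrates the "latest timestamp" idea with a site-priority backup, together with the two precautions listed above. It is an assumption-laden illustration, not Oracle's implementation: SITE_PRIORITY, LocalClock, and resolve_latest_timestamp are hypothetical names, all timestamps are kept in UTC to stand in for "consistent time zones", and a higher priority number is assumed to win ties.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical site priorities; the higher number wins when timestamps are equal.
SITE_PRIORITY = {"new_york": 2, "california": 1}


class LocalClock:
    """Hands out strictly increasing UTC timestamps for local updates."""

    def __init__(self):
        self._last = datetime.min.replace(tzinfo=timezone.utc)

    def next_timestamp(self, now=None):
        if now is None:
            now = datetime.now(timezone.utc)
        # If the clock has not advanced (or was set back), step past the last value.
        if now <= self._last:
            now = self._last + timedelta(microseconds=1)
        self._last = now
        return now


def resolve_latest_timestamp(current, incoming):
    """Keep the row with the later timestamp; break ties by site priority."""
    current_key = (current["updated_at"], SITE_PRIORITY[current["site"]])
    incoming_key = (incoming["updated_at"], SITE_PRIORITY[incoming["site"]])
    return incoming if incoming_key > current_key else current


stamp = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
from_ny = {"site": "new_york", "updated_at": stamp, "address": "1 Main St"}
from_ca = {"site": "california", "updated_at": stamp, "address": "9 Oak Ave"}

# Identical timestamps: the backup method decides, in either arrival order.
print(resolve_latest_timestamp(from_ny, from_ca)["address"])  # 1 Main St
print(resolve_latest_timestamp(from_ca, from_ny)["address"])  # 1 Main St
```

Because the comparison yields the same winner regardless of which row arrives first, every site settles on the same value, which is the convergence property mentioned above.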
User-Defined Methods
The timestamp method is not appropriate for all data with shared ownership. User-provided conflict resolution routines can be used when the semantics of how data is used do not match those provided by Oracle's predefined conflict resolution routines. User-provided routines can also be used for monitoring and notification in case of conflicts.
Note: The conflict resolution methods you assign need to ensure data convergence and provide results that are appropriate for how your business uses the data.
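As a hedged illustration of a user-provided routine, the sketch below resolves conflicts on an order status by business meaning rather than by time, and records every conflict for later notification. Everything here is hypothetical: the status values, STATUS_ORDER, conflict_log, and resolve_status are invented for the example, rows are reduced to the single status column, and the code does not reflect the interface that Oracle requires of user-defined routines.

```python
# Hypothetical business ordering: an order only ever moves forward.
STATUS_ORDER = {"ordered": 0, "shipped": 1, "billed": 2}

# Conflicts are recorded here so that an administrator can be notified.
conflict_log = []


def resolve_status(current, incoming):
    """Keep the row whose status is furthest along, and log the conflict."""
    conflict_log.append({"current": current, "incoming": incoming})
    if STATUS_ORDER[incoming["status"]] > STATUS_ORDER[current["status"]]:
        return incoming
    return current


site_a = {"status": "shipped"}
site_b = {"status": "billed"}

# The same winner is chosen in either arrival order, so the sites converge.
print(resolve_status(site_a, site_b))  # {'status': 'billed'}
print(resolve_status(site_b, site_a))  # {'status': 'billed'}
print(len(conflict_log))               # 2
```

The rule converges because it always picks the maximum under a fixed total order. A routine whose result depends on arrival order would not satisfy the convergence requirement in the note above.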
Avoid Deletes
The replicated application should not overuse deletes. Conflicts involving deletes are difficult to resolve because resolving them requires a history of deleted rows. Oracle symmetric replication does not maintain this history.
Instead, the application should mark a row as deleted (for example, by using a timestamp column that is filled in only upon delete). Periodically, the rows marked as deleted can be purged from the system using procedural replication, as described .
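The mark-then-purge approach can be sketched as follows. This is an in-memory illustration only: the column name deleted_at and the functions mark_deleted, visible_rows, and purge are hypothetical, and the sketch does not show how the purge itself would be propagated to other sites.

```python
from datetime import datetime, timedelta, timezone


def mark_deleted(row, when):
    """Logical delete: an ordinary update that fills in the delete timestamp."""
    row["deleted_at"] = when


def visible_rows(rows):
    """Rows the application should still show to users."""
    return [row for row in rows if row["deleted_at"] is None]


def purge(rows, cutoff):
    """Periodic cleanup: drop rows that were marked deleted at or before the cutoff."""
    return [
        row for row in rows
        if row["deleted_at"] is None or row["deleted_at"] > cutoff
    ]


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
rows = [
    {"id": 1, "deleted_at": None},
    {"id": 2, "deleted_at": None},
    {"id": 3, "deleted_at": None},
]

mark_deleted(rows[1], now - timedelta(days=30))  # marked long ago
mark_deleted(rows[2], now - timedelta(hours=1))  # marked recently

print([row["id"] for row in visible_rows(rows)])                     # [1]
print([row["id"] for row in purge(rows, now - timedelta(days=7))])   # [1, 3]
```

Keeping recently marked rows until a later purge leaves time for the delete marker to reach every site as a normal update.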
Setting the Propagation Interval
Where conflicts are possible, define a propagation interval that is less than the average interval between updates to the same row. Use small propagation intervals to minimize the probability of conflicts.
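A small worked check of this guideline, assuming you have a sample of the times (in seconds) at which one row was updated. The function names and the sample values are hypothetical.

```python
def average_update_interval(update_times):
    """Mean gap, in seconds, between consecutive updates to the same row."""
    gaps = [later - earlier for earlier, later in zip(update_times, update_times[1:])]
    return sum(gaps) / len(gaps)


def interval_is_acceptable(propagation_interval, update_times):
    """True when changes propagate faster than the row is typically updated."""
    return propagation_interval < average_update_interval(update_times)


# Hypothetical update times for one row: gaps of 600, 900, and 900 seconds.
update_times = [0, 600, 1500, 2400]

print(average_update_interval(update_times))      # 800.0
print(interval_is_acceptable(300, update_times))  # True
print(interval_is_acceptable(900, update_times))  # False
```

With a 300-second interval, a change has usually reached the other sites before the next update to the same row is made; with a 900-second interval, it often has not.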
Suggestion: Make a table (or a diagram) similar to that shown to analyze the implications of the conflict resolution methods you select.