Oracle7 Server Distributed Systems Volume I: Distributed Data

Contents Index Home Previous Next

Concepts and Terminology

The following sections outline some of the general terminology and concepts used to discuss distributed systems.

Nodes

A node in a distributed system can be a client, a server, or both. Every computer in a system is a node.

For example, in Figure 1 - 5, the HQ node acts as a server when the DELETE statement is issued against the table DEPT.

It acts as a client when it issues the INSERT and SELECT statements against remote data in the table EMP which resides in the SALES database.

Replication

The ability to insure reliable data replication is an extremely important (and potentially complex) factor in a distributed system. Data replication means that any given data object can have several stored representatives at several different sites and that, if each representative is potentially updatable, there must be a mechanism for insuring that all representatives reflect the changes.

Oracle7 Server provides a variety of mechanisms for replicating your data. The methods you select will depend on your specific needs:

If you simply need to view the data at multiple sites, without updating it, you might choose to use read-only snapshots. If you need to be able to update multiple copies of the same data, you will need to use Oracle's symmetric replication facility.

See [*] for a brief introduction to replication and how it fits into the scheme of a distributed system.

Oracle7 Server Distributed Systems, Volume II provides an introduction to the Oracle7 Server replication capabilities and detailed instructions on how to implement and maintain replication for your system.

Direct and Indirect Connections

A client can connect directly or indirectly to a server. In Figure 1 - 5, when the client application issues the first and third statements for each transaction, the client is connected directly to the intermediate HQ database and indirectly to the SALES database that contains the remote data.

Site Autonomy

Site autonomy means that each server participating in a distributed system is administered independently (for security and backup operations) from the other servers. Although all the servers can work together, they are distinct, separate repositories of data and are administered individually. Some of the benefits of site autonomy are:

Name Resolution

A schema object (for example, a table) is accessible from all nodes that form a distributed system. Therefore, just as a non-networked local database architecture must provide an unambiguous naming scheme to distinctly reference objects within the local database, so must a distributed system use a naming scheme to ensure that objects throughout the system can be uniquely identified and referenced.

To resolve references to objects (a process called name resolution) within a single database, the database usually forms object names using a hierarchical approach. For example, within a single database, each schema has a unique name, and that within a schema, each object has a unique name.

Because uniqueness is enforced at each level of the hierarchical structure, an object's local name is guaranteed to be unique within the database, and references to the object's local name can be easily resolved.

Distributed systems simply extend the hierarchical naming model by enforcing unique database names within a network. As a result, an object's global object name is guaranteed to be unique within the distributed system, and references to the object's global object name can be resolved among the nodes of the system.

See [*] for more information on global naming issues.

Remote/Distributed Queries and Updates

A remote query is a query that selects information from one or more remote tables, all of which reside at the same remote node.

A remote update is an update that modifies data in one or more tables, all of which are located at the same remote node.

Note: A remote update may include a subquery that retrieves data from one or more remote nodes. Because the update is performed at only a single remote node, however, the statement is classified as a remote update.

A distributed query retrieves information from two or more nodes.

A distributed update modifies data on two or more nodes. A distributed update is possible using a procedure or trigger, which includes two or more remote updates that access data on different nodes. Statements in the program unit are sent to the remote nodes, and the execution of the program succeeds or fails as a unit.

Remote and Distributed Transactions

A remote transaction is a transaction that contains one or more remote statements, all of which reference the same remote node. A distributed transaction is any transaction that includes one or more statements that, individually or as a group, update or query data on two or more distinct nodes of a distributed system. If all statements of a transaction reference only a single remote node, the transaction is remote, not distributed.

Remote Procedure Calls (RPCs)

Oracle7 supports RPCs with full PL/SQL datatypes as parameters and return values. Because PL/SQL datatypes are a superset of SQL datatypes, PL/SQL procedures and functions are ideal for managing Oracle7 services on remote servers. All parameters and return values provided by a remote server can be examined and stored by the calling server. This allows the local server to maintain the client's calling interface and semantics - even if the service is reimplemented, or the remote site decides to change the RPC's interface.

Access to Non-Oracle Data and Services through Oracle Open Gateway Technology

Oracle Open Gateway Technology provides access to non-Oracle data through gateway servers, which are a part of a distributed system like any other distributed server. Open Gateway technology is tightly integrated with the Oracle7 Server. This permits integration of both SQL and non-SQL data and services. For more information, see your Oracle Open Gateway documentation.

Gateway servers access the target system directly. Oracle7 client applications do not connect directly to a gateway server, but indirectly by first connecting to an integrating server. An integrating server communicates with a gateway server in the normal Oracle7 server-to-server manner using SQL*Net. See Figure 1 - 1.

A gateway server is a single process and does not start background processes. On some platforms, such as MVS, the gateway server starts once, and maintains multiple user sessions in memory, where one session handles the requests from a single Oracle7 client application. On other platforms, such as UNIX platforms, a gateway server starts for each user session.

Transparent Gateway Server

A transparent gateway server emulates an Oracle7 Server and usually resides in the target system environment. The database administrator creates database links and local synonyms at all Oracle7 Servers that require access to the data source. The gateway server is then transparent for Oracle7 client applications that access what appear to be Oracle7 tables or views.

A client application connects directly to an integrating Oracle7 Server, which is responsible for connecting to the transparent gateway server. A transparent gateway does not execute PL/SQL stored procedures, but a stored procedure on an Oracle7 Server can issue SQL statements that access the data source via the gateway.

An Oracle7 client application queries and modifies a data source using ANSI/ISO SQL via the transparent gateway. Transparent gateways always perform automatic data conversion. In some cases, this is driven by gateway data definition language (GDDL) for non-SQL data sources.

A target system is unlikely to have all Oracle7 functionality. For queries, missing functionality is often fulfilled by the gateway server or integrating server. For example, where a target system has a limited capability to conditionally retrieve data, the gateway can make up for this missing functionality by means of a post-filter.

Procedural Gateway Server

A procedural gateway server emulates Oracle7 Server's remote procedural capabilities and usually resides in the target system environment. The database administrator creates database links and local synonyms at all Oracle7 Servers that require access to the data source. The gateway server is then transparent for Oracle7 client applications, which access what appear to be Oracle7 PL/SQL stored procedures.

A client application connects directly to an integrating Oracle7 Server, which is responsible for connecting to the procedural gateway server.

A procedural gateway does not execute SQL requests. An Oracle7 client application executes calls at a target system using PL/SQL remote procedure calls. Automatic data conversion to and from the datatypes of the arguments in the procedure call and the call at the target system is driven by gateway data definition language.

Transaction Recovery Management

An efficient system, distributed or non-distributed, must guarantee that all statements in a transaction are either committed or rolled back as a unit, so that the data in the logical database can be kept consistent. The effects of a transaction should be either visible or invisible to all other transactions at all nodes. This should be true for transactions that include any type of operation, including queries, updates, or remote procedure calls.

The general mechanisms of transaction control in a non-distributed system are discussed[*]. In a distributed system, Oracle7 must coordinate transaction control over a network and maintain data consistency, even if a network or system failure occurs.

Transaction recovery management guarantees that all database servers participating in a distributed transaction either all commit or all roll back the statements in the transaction. Transaction recovery management also protects implicit DML operations performed by integrity constraints, remote procedure calls, and triggers. Transaction recovery management is described[*].

Transparency

The functionality of a distributed system must be provided in such a manner that the complexities of the system are transparent to both users and administrators.

For example, a distributed system should provide methods to hide the physical location of objects throughout the system from applications and users. Location transparency exists if a user can refer to the same table the same way, regardless of the node to which the user connects. Location transparency is beneficial for the following reasons:

A distributed system should also provide query, update, and transaction transparency. For example, standard SQL commands, such as SELECT, INSERT, UPDATE and DELETE, should allow users to access remote data without the requirement for any programming. Transaction transparency occurs when the DBMS provides the functionality described below using standard SQL COMMIT, SAVEPOINT, and ROLLBACK commands, without requiring complex programming or other special operations to provide distributed transaction control:

A distributed architecture should also provide facilities to transparently replicate data among the nodes of the system. Maintaining copies of a table across the databases in a distributed system is often desired so that:

A database server that manages a distributed system should make table replication transparent to users working with the replicated tables.

Finally, the functional transparencies explained above are not sufficient alone. The distributed system must also perform with acceptable speed.

National Language Support (NLS)

Oracle7 supports heterogeneous client/server environments where clients and servers use different character sets. The character set used by a client is defined by the value of the NLS_LANG parameter for the client session. The character set used by a server is its database character set. Data conversion is done automatically between these character sets if they are different. For more information about National Language Support features, see the Oracle7 Server Reference.


Contents Index Home Previous Next