Oracle Enterprise Manager Concepts Guide

CHAPTER 3. Jobs and Events

This chapter describes the following topics:

Job Control System

Event Management System

Job Control System

The Job Control System allows you to schedule and manage operations on remote sites.

1. You submit a job from the Console.

2. The communication daemon sends the job information to the appropriate intelligent agent(s).

3. The agent executes the job on schedule.

4. The agent returns any related job messages to the daemon for display in the Console.
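The four steps above can be sketched as a small simulation. This is only an illustration of the flow, not Enterprise Manager code: the `Agent` and `Daemon` classes, their methods, and the node and job names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical stand-in for an intelligent agent on one node."""
    node: str
    scheduled: list = field(default_factory=list)

    def receive(self, job_name):
        # Step 2: the agent receives the job information from the daemon.
        self.scheduled.append(job_name)

    def run_due_jobs(self):
        # Step 3: the agent executes its jobs (every scheduled job is
        # treated as due in this sketch) and produces job messages.
        messages = [f"{self.node}: {job} completed" for job in self.scheduled]
        self.scheduled.clear()
        return messages


@dataclass
class Daemon:
    """Hypothetical stand-in for the communication daemon."""
    agents: dict
    console_messages: list = field(default_factory=list)

    def submit(self, job_name, nodes):
        # Step 1: the job is submitted once, naming its destinations.
        for node in nodes:
            self.agents[node].receive(job_name)

    def collect(self):
        # Step 4: job messages return to the daemon for display.
        for agent in self.agents.values():
            self.console_messages.extend(agent.run_due_jobs())
        return self.console_messages


daemon = Daemon({"nodeA": Agent("nodeA"), "nodeB": Agent("nodeB")})
daemon.submit("backup_database", ["nodeA", "nodeB"])
messages = daemon.collect()
```

Note that the job is submitted once even though it runs on two nodes.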

Enterprise Manager includes a variety of pre-defined jobs for you to select from. Some examples of pre-defined jobs are:

backing up a database

starting up or shutting down databases and listeners

running scripts and programs

You can also submit your own custom jobs to the Job Control System.

Centralized Job Control

The Job Control System is simple to use, because the task of submitting and managing jobs is centralized in the Console. You only need to submit a job once, regardless of the number of nodes on which the job will run or how often it needs to be executed.

To schedule a job, you need not connect to the node on which the job will be run. Instead, you submit the job from the Console and specify the nodes or services on which it should run.

Job Scripts

Jobs are OraTcl scripts. OraTcl can be used to:

invoke operating system facilities, such as programs or shell scripts

execute SQL and PL/SQL scripts

start up and shut down Oracle databases

directly access an intelligent agent's cache of MIB variables

use SNMP to access host, network, and other products' MIBs

communicate with the intelligent agent

Any Tcl script can be submitted to the Job Control System. Therefore, any operation that can be implemented in OraTcl can be scheduled as a job.

Stored on Agent Nodes

Although you submit a job from the Console, the job scripts themselves reside on the agent nodes. Because the manner in which a job is implemented may depend on the platform, each agent keeps its own set of job scripts.

This allows you to submit a job (such as backing up a database) without worrying about specifics of the platform. For example, you can select a group of databases residing on UNIX and VMS machines, and send one backup job request to back up the databases. The agents on those nodes run backup job scripts that are specific to their platforms.
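As a rough illustration of why one request can serve several platforms, the sketch below has each agent look up a job name in its own script set. The table, the platform keys, and the script file names are hypothetical and do not come from the product.

```python
# Hypothetical per-agent script sets: the same job name maps to a
# different, platform-specific script on each kind of node.
AGENT_SCRIPTS = {
    "unix": {"backup_database": "backup_unix.tcl"},
    "vms": {"backup_database": "backup_vms.tcl"},
}


def resolve_script(platform, job_name):
    """Return the script an agent on `platform` would run for `job_name`."""
    return AGENT_SCRIPTS[platform][job_name]


# One backup request, sent to agents on two platforms.
request = "backup_database"
scripts_run = {platform: resolve_script(platform, request)
               for platform in ("unix", "vms")}
```

The submitter names only the job; the choice of script stays with the agent.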

Composite Jobs

Some DBA jobs involve more than one task. For example, before making schema changes to a database, you may want to back up the database. To accommodate these types of jobs, the Job Control System allows you to combine two or more pre-defined jobs into one job. This is called a composite job. Each of the pre-defined jobs contained in the composite job is called a task.

Composite jobs can contain test conditions based on the success of a task. For example, if a composite job consists of two tasks, starting up a database and then running a SQL script, you can specify that the script be run only if the database was successfully started.
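The startup-then-script example can be sketched as follows. The `run_composite` function and the tuple layout are assumptions made for illustration; they are not how the Job Control System represents composite jobs.

```python
def run_composite(tasks):
    """Run tasks in order. Each task is (name, action, requires).

    `requires` names an earlier task that must have succeeded, or is None.
    Returns a dict mapping task name to "succeeded", "failed", or "skipped".
    """
    results = {}
    for name, action, requires in tasks:
        # Test condition: skip this task unless its prerequisite succeeded.
        if requires is not None and results.get(requires) != "succeeded":
            results[name] = "skipped"
            continue
        results[name] = "succeeded" if action() else "failed"
    return results


def start_database():
    return False  # simulate a startup that fails


def run_sql_script():
    return True


outcome = run_composite([
    ("startup", start_database, None),
    ("run_script", run_sql_script, "startup"),
])
```

Because the simulated startup fails, the script task is skipped rather than run against a database that is not up.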

Scalability

The Job Control System allows you to run jobs efficiently on multiple remote nodes. When you submit a job to run on a remote node, all the information needed to run the job is transferred to the agent servicing the node. When the job is executed, it is run by the agent on that node. Therefore, network traffic between the remote node and the Console and daemon is minimized. The only communication between the agents and the Console and daemon is the initial transmission of job information and any subsequent messages about job status changes.

Because jobs are run independently by agents, you can submit any number of jobs on multiple nodes without affecting the Console. For example, you can submit several jobs and then immediately start another task without waiting for the agents to schedule the jobs.

In addition, because there is an intelligent agent residing on each managed node, jobs can be run on multiple nodes simultaneously. For example, you can submit a job to run a report on multiple databases worldwide. The job is scheduled and run independently by the agent that services each database. Therefore, the jobs can be executed by their respective agents at the same time.
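The idea of independent agents running the same job at the same time can be approximated with worker threads. In this sketch each thread stands in for one agent; the database names and the `run_report` function are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_report(database):
    """Hypothetical job body, run by the agent that services `database`."""
    return f"report for {database} done"


databases = ["sales_ny", "sales_london", "sales_tokyo"]

# One worker per database, so no run waits for another to finish.
with ThreadPoolExecutor(max_workers=len(databases)) as pool:
    results = list(pool.map(run_report, databases))
```

`pool.map` returns results in the order of its input, regardless of which worker finishes first.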

Job Queuing

When you submit a job to one or more destinations, it is possible that any one of those sites may be down. If a site or its agent is down, the communication daemon queues job requests that could not be delivered to the site. Once the site can be contacted, the daemon will submit the queued job to the agent.
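A minimal sketch of this queuing behavior, assuming a daemon that tracks which nodes are reachable. The `QueueingDaemon` class and its method names are hypothetical.

```python
from collections import deque


class QueueingDaemon:
    """Hypothetical daemon that holds jobs for unreachable nodes."""

    def __init__(self, reachable):
        self.reachable = set(reachable)
        self.pending = deque()   # (job, node) pairs not yet delivered
        self.delivered = []      # (job, node) pairs handed to an agent

    def submit(self, job, node):
        if node in self.reachable:
            self.delivered.append((job, node))
        else:
            # The site or its agent is down: queue the request.
            self.pending.append((job, node))

    def node_contacted(self, node):
        # The site is back: deliver its queued jobs, keep the rest queued.
        self.reachable.add(node)
        still_pending = deque()
        while self.pending:
            job, target = self.pending.popleft()
            if target == node:
                self.delivered.append((job, target))
            else:
                still_pending.append((job, target))
        self.pending = still_pending


queue_daemon = QueueingDaemon(reachable={"nodeA"})
queue_daemon.submit("report", "nodeA")
queue_daemon.submit("report", "nodeB")   # nodeB is down, so this is queued
queued_before = list(queue_daemon.pending)
queue_daemon.node_contacted("nodeB")
```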

Security and Jobs

Jobs are normally run with your preferred credentials. Therefore, jobs cannot be used to perform functions that you could not perform if logged into the machine directly.

Because jobs are categorized by the type of service they act on, the Job Control System knows which credentials to pass to the agent. If the job runs on a node, the Job Control System passes either your preferred credentials for the node or, if none are specified, the username and password you used when you logged into the Console. If the job runs on a service (such as a database), the Job Control System also passes your preferred credentials for the service.
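The selection rule described above can be written out as a small function. The function name, the dictionary layout, and the sample accounts are hypothetical. The text does not say what happens when no preferred credentials exist for a service, so this sketch simply omits them in that case; treat that as an assumption.

```python
def credentials_to_pass(node, service, preferred, console_login):
    """Return the credentials passed to the agent for one job.

    node:          name of the node the job runs on
    service:       name of the service (such as a database), or None
    preferred:     dict mapping a target name to (username, password)
    console_login: (username, password) used to log into the Console
    """
    # Node: preferred credentials if set, otherwise the Console login.
    creds = {"node": preferred.get(node, console_login)}
    # Service: preferred credentials are passed as well.
    # Assumption: nothing is added when none are set for the service.
    if service is not None and service in preferred:
        creds["service"] = preferred[service]
    return creds


preferred = {"nodeA": ("dba", "pw1"), "orcl": ("system", "pw2")}
console_login = ("admin", "pw0")

node_job = credentials_to_pass("nodeA", None, preferred, console_login)
fallback_job = credentials_to_pass("nodeB", None, preferred, console_login)
service_job = credentials_to_pass("nodeA", "orcl", preferred, console_login)
```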

A job can also be run with the agent's credentials. This flexibility allows a site to easily incorporate the Job Control System's authentication methods with existing security policies.

Event Management System

The Event Management System allows you to monitor for specified events throughout the network (such as problems on a node, database, or other service) and optionally to execute a job to fix the problem. The Event Management System, therefore, automates problem detection and correction.

1. From the Console, you register an event set.

2. The communication daemon sends the event information to the appropriate intelligent agent(s).

3. The agent does the monitoring and alerts you if the event occurs.

4. Optionally, you can specify a fix-it job to execute if the event occurs.

Event Sets

Pre-defined event sets come standard with Enterprise Manager. Advanced event sets are included with the Performance Pack. You can also submit your own custom events.

The standard pre-defined events are:

fault management events: database up/down, listener up/down, and node up/down

Some of the pre-defined event sets that come with the Performance Pack are:

space and resource management events, such as a disk becoming too full or a tablespace running out of extents

performance management events, such as CPU load abnormal or a database system statistic too high

Notification

When an event occurs, you can be notified in various ways, such as electronic mail or paging. Also, events are always logged in the repository and can be viewed in the Console.

Event Scripts

As with jobs, events are OraTcl scripts that are stored on the agent node. Refer to the section "Job Scripts" for a description of OraTcl scripts.

Event scripts can save state information. Saving a state between executions of an event script allows the agent to remember if it has detected a certain event already and eliminates redundant event messages to the Console. It also allows event scripts to keep a history of a database and adjust to behavior that is typical.
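A minimal sketch of how saved state can suppress redundant messages. Real event scripts are OraTcl; this Python version, with its `check_event` function and `state` dictionary, is only a hypothetical illustration of the idea.

```python
def check_event(detected, state):
    """Return an alert message only when the event is newly detected.

    `state` is a dict that persists between executions of the event test,
    so the test can remember whether it has already reported the event.
    """
    previously_detected = state.get("detected", False)
    state["detected"] = detected
    if detected and not previously_detected:
        return "alert"
    return None  # nothing new to report


state = {}
# Five successive executions: the condition appears, persists, clears,
# and then appears again.
observations = [False, True, True, False, True]
alerts = [check_event(seen, state) for seen in observations]
```

The third execution returns nothing because the event was already reported on the second.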

Note: Unlike job scripts, event scripts are run with the permissions of the agent.

Extensibility

The Event Management System does not require that the intelligent agent be the only mechanism for error detection. The Event Management System can include other tools and applications that detect events independent of the intelligent agents. These tools and applications can be integrated into the Event Management System and communicate directly with the intelligent agents.

For example, a third-party application can detect an event on a node and report that event to the intelligent agent on that node. The agent then sends the message back to the Console as usual.

Fix-it Jobs

When an event set is registered, you have the option of specifying a fix-it job for the event. A fix-it job is simply a job that the agent runs if it detects the event. Events and fix-it jobs used together allow you to automate problem detection and correction.
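The pairing of an event with a fix-it job can be sketched as follows. The `agent_check` function, the listener example, and the message strings are hypothetical and stand in for what an agent would do.

```python
def agent_check(event_name, test, fix_it_job=None):
    """Run one event test; on detection, alert and run the fix-it job."""
    log = []
    if test():
        log.append(f"alert: {event_name}")
        if fix_it_job is not None:
            fix_it_job()
            log.append(f"fix-it job run for {event_name}")
    return log


listener = {"up": False}  # simulated service state


def listener_down():
    return not listener["up"]


def restart_listener():
    listener["up"] = True


first = agent_check("listener down", listener_down, restart_listener)
second = agent_check("listener down", listener_down, restart_listener)
```

The second check finds nothing to report because the fix-it job corrected the problem after the first.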

Scalability

The Event Management System allows one person to monitor a large system. For example, if you are responsible for 100 databases, you cannot connect to each database every day to check on its performance. However, using the Event Management System you can effectively monitor all the databases 24 hours a day, and be alerted if a problem is detected.

The Event Management System also allows you to focus on select systems and events. This control is vital in a large system. Rather than monitor all sites or a large number of sites, you can pinpoint only those services you wish to monitor.

On the other hand, an administrator can monitor a large number of sites, with minimal performance impact on the Console. Because the intelligent agents perform the monitoring independent of the Console, an administrator has the option of monitoring many sites without slowing other tasks.

Minimal Performance Impact

The intelligent agent has been optimized to monitor large numbers of systems and events efficiently. Event tests are generally executed by the agent process directly, and can be run quickly.

The agent has direct access to the database system global area (SGA), so event tests based on SGA information execute quickly. In addition, the agent keeps a cache of database MIB variables, so any event tests using these variables can be run quickly as well.
