Default zztat Metrics

The following table lists the default metrics that zztat ships with. Everything you see here can be changed through the zztat administration user interface. The provided defaults are designed to suit most environments, but you may want to adjust them to better fit your exact needs. This includes thresholds (for example "react if a session was idle for 10 minutes" or "react if more than 50% of the memory is allocated") and any filters you add. It is quick and easy to restrict a gauge so that it only triggers if a condition you define is true (for example, only react if the session's PROGRAM='TOAD').
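
Conceptually, a gauge combines a threshold with optional filters. The following Python sketch only illustrates that logic; it is not zztat code, and the `Session` record, its field names and the `idle_gauge` function are hypothetical. In zztat itself, these settings are changed through the administration user interface.

```python
from dataclasses import dataclass


@dataclass
class Session:
    sid: int            # session identifier (hypothetical field)
    program: str        # client program name, e.g. "TOAD"
    idle_seconds: int   # how long the session has been idle


def idle_gauge(sessions, idle_threshold_s=600, program_filter=None):
    """Return the sessions that would trigger a reaction.

    A session triggers when it has been idle for at least the threshold
    and, if a filter is given, its program matches the filter.
    """
    hits = []
    for s in sessions:
        if s.idle_seconds < idle_threshold_s:
            continue  # below the idle threshold
        if program_filter is not None and s.program != program_filter:
            continue  # excluded by the filter
        hits.append(s)
    return hits


sessions = [
    Session(101, "TOAD", 900),
    Session(102, "sqlplus", 1200),
    Session(103, "TOAD", 60),
]

# Threshold of 10 minutes, restricted to PROGRAM='TOAD'
triggered = idle_gauge(sessions, idle_threshold_s=600, program_filter="TOAD")
print([s.sid for s in triggered])
```

Without the filter, both sessions that exceed the idle threshold would trigger; with it, only the TOAD session does.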

Many of the metrics zztat ships with are disabled by default, but can easily be enabled if you so choose.

Available Metrics

Group Metric Metric Type Snapshot Freq. Gauge Reaction Loc 1) Enabled by default Description
Options OPTUSE_MISC AUTOSYNC 4 hrs Yes The OPTUSE_MISC metric collects data directly about database objects that may be using features that require the Advanced Compression or Advanced Security Option. It does not use DBA_FEATURE_USAGE_STATISTICS but scans the data dictionary directly to find the details.
OPTUSE_CHECK STANDARD_ALERTING T Yes This gauge triggers a reaction when any of the following are found:
  • Table, partition or sub-partition compression other than basic is in use
  • Index, index partition or index sub-partition compression other than basic is in use
  • Transparent Data Encryption is used to encrypt data in table columns
  • Transparent Data Encryption is used to encrypt tablespaces
  • Securefile lob deduplication, encryption or compression is in use at table, partition or sub-partition level
  • RMAN compression other than basic is in use
  • Flashback Data Archives (Total Recall) with optimization is in use
OPTUSE_STAT AUTOSYNC 4 hrs n/a Yes Collects a snapshot of the DBA_FEATURE_USAGE_STATISTICS view and archives it on the repository.
Performance LATCH AUTOSYNC 1 min No Collects a snapshot of the v$latch dynamic performance view and archives it on the repository. This is a supplemental metric that can be enabled for more frequent collection. In a default zztat installation, the SES_WAIT metric and its gauges handle latch contention.
LOCK AUTOSYNC 5 mins No Collects a snapshot of the v$lock dynamic performance view and archives it on the repository.
OPEN_CURSOR AUTOSYNC 5 mins No Collects a snapshot of the v$open_cursor dynamic performance view and archives it on the repository.
SESSION AUTOSYNC 1 min Yes Collects the currently active sessions and processes.
SES_IDLE_TX SESSION_RO T Yes This gauge triggers a reaction when idle sessions with uncommitted transactions are discovered. By default, it only raises awareness by sending out an email. It can easily be adjusted to terminate such sessions automatically before they become potential blockers: edit the gauge to filter for sessions coming from certain hosts or certain programs and terminate those. If you find yourself frequently killing the sessions of developers who access production and leave their TOAD sessions open forever, this gauge is for you.
SES_POOLS SESSION_RO T No This gauge triggers a reaction when the number of sessions coming from known connection pools falls outside the specified range. The user names being watched can be configured in the gauge query.
SES_IDBLTX SESSION_RO T Yes This gauge triggers a reaction when an idle session is detected that has uncommitted transactions and is blocking at least two other sessions.
SESSTAT AUTOSYNC 1 min Yes Collects a snapshot of v$sesstat.
SES_PGA SESSION_MEMORY_RO T SES_PGA alerts if a session consumes more than 25% of the total system memory. If a run-away session exceeds 75% of total system memory, the SESSION_HIGH_MEMORY_CHAIN fires, sends an alert and produces a detailed report. You can configure this to automatically kill the session as well.
SES_WAIT AUTOSYNC 1 min Yes Collects a snapshot of v$session_wait.
SES_LATCH SESSION_LATCH_RO T Yes This gauge triggers a reaction when more than 50% of user sessions are waiting on a latching event.
SQL AUTOSYNC 5 mins Yes Collects the top SQL statements, based on the following criteria:
  • Top 20 most executed SQL
  • Top 20 SQL with the most buffer gets
  • Top 20 SQL with the most disk reads
  • Top 20 SQL with the highest elapsed time
  • Top 20 SQL with the most CPU time consumed
  • Top 20 SQL with the highest number of parse calls
  • SQL statements which were marked by zztat
SQL_WPLAN AUTOSYNC 5 mins No Collects the same top SQL statements as the SQL metric. In addition, it captures the v$sql_plan_statistics_all data for each SQL statement and stores it as an XMLTYPE in zz$sql.plan_stats.

Do not enable both SQL and SQL_WPLAN; use one or the other. Both collect the same top SQL, but SQL_WPLAN captures the plan details in addition.

SQL_GETS SESSION_RO T No Triggers a reaction when a SQL statement is reading more than 5000 buffers per row returned.
Storage ASM_DG AUTOSYNC 5 mins Yes Collects data from v$asm_diskgroup_stat.
ASM_DG_FREE STANDARD_ALERTING T Yes Alerts if ASM diskgroup space runs low. Default thresholds are 20/10% and 100G/50G for warning/critical respectively.
ASM_DG_STATE STANDARD_ALERTING T Yes Alerts if an ASM diskgroup reports a state other than CONNECTED or MOUNTED.
ASM_DG_DISKS STANDARD_ALERTING T Yes Alerts if an ASM diskgroup reports offline disks.
ASM_DISK AUTOSYNC 5 mins No Collects more detailed data on ASM disks and watches for offline disks in v$asm_disk_stat.
ASM_DISKS STANDARD_ALERTING T No Alerts if a disk goes offline.
ARCH_DEST AUTOSYNC 15 mins Yes Collects data from v$archive_dest.
ARCH_DEST STANDARD_ALERTING T Yes Triggers a reaction when an archive destination shows status ERROR.
ARCH_LOG AUTOSYNC 15 mins Yes Collects archive log data from v$archived_log.
ARCH_SIZE STANDARD_ALERTING R No Triggers a reaction when a redo thread has produced more than 10 GB in the last 15 minutes.
EXTENTS STANDARD 4 hrs Yes Collects a snapshot of the dba_extents view. This is retained for zztat reactions to avoid scanning dba_extents directly. Data is collected at low frequency by default to avoid an impact on larger databases. This is automatically replicated in the background to the repository, with only the latest snapshot remaining on the target.
FILESPACE AUTOSYNC 2 mins Yes Collects data on tablespace space usage.
TS_SPACE TABLESPACE_RO T Yes Triggers a reaction if a tablespace is at risk of running full. If you wish to automatically add datafiles to tablespaces that exceed a certain threshold, see here. Please note that the referenced post was written for a command-line only setup, but you can do all the steps shown through the administrative user interface as well.
FRA_USAGE AUTOSYNC 15 mins Yes Collects a snapshot of v$recovery_area_usage as well as the total size as configured by db_recovery_file_dest_size.
RP AUTOSYNC 1 hour No Collects a snapshot of v$restore_point. Please note that this metric requires the zz$sys_helper package to be installed. This is due to a privilege issue accessing v$restore_point (for details, see this blog post: external link).
TABLESPACE AUTOSYNC 5 mins Yes Collects details of v$tablespace and dba_tablespaces. Does not include space usage data. See the FILESPACE metric for space usage data.
System ALLINIPAR AUTOSYNC 1 hour No Collects a snapshot of all init.ora parameters, including hidden (underscore) parameters. All attributes (IS_DEFAULT, ISSES_MODIFIABLE, etc) are encoded to optimize this for long term storage. Requires the zz$sys_helper package.
ENQUEUE AUTOSYNC 5 mins
INIPAR AUTOSYNC 1 hour Yes Collects a snapshot of v$parameter. All attributes (IS_DEFAULT, ISSES_MODIFIABLE, etc) are encoded to optimize this for long term storage. See the metric query for details.
OSSTAT AUTOSYNC 5 mins Yes Collects a snapshot of v$osstat. To optimize storage, the stat_name is obtained from the on-demand metric OSSTAT_NAME and is not stored here.
SEGSTAT AUTOSYNC 20 mins Yes Collects a snapshot of v$segstat. To optimize storage, the statistic_name is obtained from the on-demand metric STATNAME and is not stored here.
SYSINIPAR AUTOSYNC 1 hour Yes Collects a snapshot of v$system_parameter. All attributes (IS_DEFAULT, ISSES_MODIFIABLE, etc) are encoded to optimize this for long term storage.
SYSSTAT AUTOSYNC 5 mins Yes Collects a snapshot of v$sysstat. Statistic names are not stored and can be obtained from the STATNAME metric.
Users DB_USERS AUTOSYNC 1 hour No Collects a snapshot of the dba_users view. Does not store the encrypted password. Requires the zz$sys_helper package to be installed.
SESS_CINFO AUTOSYNC 15 mins Yes Collects details on clients connected to the database obtained from v$session_connect_info. Includes client character set, client version, OCI library and driver details.

1) The Loc column shows where the gauge runs - T for Target database or R for Repository database.
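
The selection performed by the SQL metric (top 20 per criterion, plus any marked statements) can be pictured as a union of several top-N lists. The Python sketch below is an illustration under assumptions, not the actual metric query: the row layout, the column names and the `top_sql` function are hypothetical.

```python
# Hypothetical column names, one per criterion listed for the SQL metric.
CRITERIA = [
    "executions",
    "buffer_gets",
    "disk_reads",
    "elapsed_time",
    "cpu_time",
    "parse_calls",
]


def top_sql(rows, marked_ids, n=20):
    """Return the sorted sql_ids that appear in any top-n list or are marked."""
    # Start with marked statements that are present in the current data.
    selected = {r["sql_id"] for r in rows if r["sql_id"] in marked_ids}
    for criterion in CRITERIA:
        ranked = sorted(rows, key=lambda r: r[criterion], reverse=True)
        selected.update(r["sql_id"] for r in ranked[:n])
    return sorted(selected)


rows = [
    {"sql_id": "a", "executions": 100, "buffer_gets": 5, "disk_reads": 1,
     "elapsed_time": 2, "cpu_time": 1, "parse_calls": 1},
    {"sql_id": "b", "executions": 1, "buffer_gets": 500, "disk_reads": 50,
     "elapsed_time": 90, "cpu_time": 80, "parse_calls": 2},
    {"sql_id": "c", "executions": 2, "buffer_gets": 4, "disk_reads": 0,
     "elapsed_time": 1, "cpu_time": 0, "parse_calls": 0},
    {"sql_id": "d", "executions": 3, "buffer_gets": 3, "disk_reads": 0,
     "elapsed_time": 1, "cpu_time": 0, "parse_calls": 0},
]

# n=1 keeps the example small; statement "c" is included only because it is marked.
print(top_sql(rows, marked_ids={"c"}, n=1))
```

Because the lists overlap, the number of statements collected per snapshot is usually well below six times 20.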

On-Demand Metrics

On-Demand metrics are triggered by the framework when a certain condition arises.

Available Metrics

Group Metric Gauge Enabled by default Description
On-demand SQL_MARK n/a Yes Fired by the MARK_SQL reaction and causes the specified sql_id to be marked. Marked SQL is automatically collected as part of the top-SQL metric. Once a statement is marked and not seen for a configurable period of time, it will automatically age out and become un-marked. This time frame is configurable through zz$manage.reaction_config.
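
The age-out behaviour of marked SQL can be sketched as follows. This is a conceptual Python illustration only: the `age_out_marks` function, the data layout and the 7-day period are assumptions made for the example. The real period is whatever has been set through zz$manage.reaction_config.

```python
from datetime import datetime, timedelta


def age_out_marks(marks, now, max_unseen):
    """Keep only the marks whose statement was seen within max_unseen.

    marks maps a sql_id to the time the statement was last seen.
    """
    return {
        sql_id: last_seen
        for sql_id, last_seen in marks.items()
        if now - last_seen <= max_unseen
    }


now = datetime(2024, 1, 10, 12, 0)
marks = {
    "abc123": datetime(2024, 1, 10, 11, 0),  # seen an hour ago
    "def456": datetime(2024, 1, 1, 9, 0),    # not seen for over a week
}

# Hypothetical age-out period of 7 days
still_marked = age_out_marks(marks, now, max_unseen=timedelta(days=7))
print(sorted(still_marked))
```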

On-Upgrade Metrics

A special type of on-demand metric is the on-upgrade metric. These are triggered by the framework under specific conditions:

  • On a new zztat target database installation
  • When an Oracle version change is detected
  • When a zztat patch requires a refresh of the data
  • When a zztat patchset is applied
  • When a new on-upgrade metric is added

Most of these metrics are used to optimize the long term storage of standard metric data by mapping static identifiers to their respective names. They can also be used as an aid when developing gauges, to avoid accessing these data dictionary objects, and instead provide a static table to read the data from.
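
The storage optimization described above amounts to storing only a compact identifier in each snapshot row and resolving the name from a static mapping when the data is read. A minimal Python sketch of the idea, with made-up identifiers and values (the real mappings come from views such as v$statname, and the `resolve` helper is hypothetical):

```python
# Static mapping, refreshed rarely (the role of an on-upgrade metric).
# The identifiers here are illustrative, not real statistic# values.
STATNAME = {
    6: "user commits",
    9: "physical reads",
}

# Frequent snapshots store only the identifier, never the name.
snapshots = [
    ("2024-01-10 12:00", 6, 1500),
    ("2024-01-10 12:00", 9, 82000),
    ("2024-01-10 12:05", 6, 1575),
]


def resolve(snapshot_rows, name_map):
    """Attach the statistic name to each (time, id, value) row at read time."""
    return [
        (ts, name_map.get(stat_id, f"unknown#{stat_id}"), value)
        for ts, stat_id, value in snapshot_rows
    ]


for row in resolve(snapshots, STATNAME):
    print(row)
```

The name strings are stored once instead of once per snapshot row, which is where the long-term storage saving comes from.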

Note: These metrics are executed by zztat automatically and are part of the core framework. Do not disable or remove them. zztat manages these metrics itself and refreshes them only when needed.

Group Metric Gauge Enabled by default Description
On upgrade AUDITACTIONS n/a Yes Collect a snapshot of the audit_actions view. Data is used in reporting combined with aud$ if enabled.
On upgrade ENQ_NAME n/a Yes Collect data to map enqueue type codes to enqueue names and event#. Based on v$enqueue_statistics.
On upgrade EVENTNAME n/a Yes Collect a snapshot of v$event_name. Used to map event# or event_id to the event name.
On upgrade LATCHNAME n/a Yes Collect a snapshot of v$latchname. Used to map a latch# or hash to a latch name.
On upgrade LOCK_TYPE n/a Yes Collect a snapshot of v$lock_type to store static lock names and descriptions.
On upgrade METRICNAME n/a Yes Collect a snapshot of v$metricname. Used to map group and metric IDs to names.
On upgrade NLS_VALUES n/a Yes Collect a snapshot of v$nls_valid_values. Provides all known time zone names and other static NLS data.
On upgrade OBJ_TYPES n/a Yes Build a mapping of type# to object type names. Used to reduce storage where zztat stores objects.
On upgrade OSSTAT_NAME n/a Yes Build a mapping of osstat_id to stat_name in v$osstat. Used to dramatically reduce storage needs for v$osstat metrics.
On upgrade SQLCOMMAND n/a Yes Collect a snapshot of v$sqlcommand. Used to map command types to command names.
On upgrade SQL_CURS_MAP n/a Yes Collect a snapshot of the columns in v$sql_shared_cursor to enable zztat to store them in a single column (akin to x$kkscs).
On upgrade STATNAME n/a Yes Collect a snapshot of v$statname. Used to map statistic# and stat_id values to statistic names.
On upgrade SYS_OPTENV n/a Yes Collect a snapshot of v$sys_optimizer_env.
On upgrade WAIT_CLASS n/a Yes Collect a snapshot of v$system_wait_class. Used to map class# and class_id to the wait class name.

Hi-Speed Metrics

Another special type of on-demand metric is the high-speed sampling metric. These metrics are triggered by the framework when the corresponding reaction is fired.

Group Reaction Hi-Speed Metric Enabled by default Description
Performance LATCH_HISPEED_SAMPLE HS_LATCH Yes Collects latch holder data at high speeds when latch contention is detected. The data source used is x$ksuprlat. Number of samples is configurable through zz$manage.reaction_config.
Performance MUTEX_HISPEED_SAMPLE HS_MUTEX Yes Collects mutex wait data at high speeds when mutex contention is detected. The data source used is x$mutex_sleep. Number of samples is configurable through zz$manage.reaction_config.
Performance SESSION_WAIT_HISPEED_SAMPLE HS_SESWAIT Yes Collects session wait data at high speeds when high waits are observed. The data source used is x$kslwt. Number of samples is configurable through zz$manage.reaction_config.

Note: These metrics require the zz$sys_helper package to be installed in order to function correctly.