Insights and Strategies for DBA: LATCH

What Are Latches?

Latches are serialization mechanisms that protect areas of Oracle’s shared memory (the SGA). In simple terms, latches prevent two processes from simultaneously updating and possibly corrupting the same area of the SGA.

A latch is a type of a lock that can be very quickly acquired and freed. Latches are typically used to prevent more than one process from executing the same piece of code at a given time.

Oracle sessions need to update or read from the SGA for almost all database operations. For nstance:

• When a session reads a block from disk, it must modify a free block in the buffer cache and adjust the buffer cache LRU chain

• When a session reads a block from the SGA, it will modify the LRU chain.

• When a new SQL statement is parsed, it will be added to the library cache within the SGA.

• As modifications are made to blocks, entries are placed in the redo buffer.

• The database writer periodically writes buffers from the cache to disk (and must update their status from dirty to clean).

• The redo log writer writes entries from the redo buffer to the redo logs. Latches prevent any of these operations from colliding and possibly corrupting the SGA.

Latches vs Enqueues

Enqueues are another type of locking mechanism used in Oracle. An enqueue is a more sophisticated mechanism which permits several concurrent processes to have varying degree of sharing of "known" resources.

Any object which can be concurrently used, can be protected with enqueues. A good example is of locks on tables.

We allow varying levels of sharing on tables e.g. two processes can lock a table in share mode or in share update mode etc. One difference is that the enqueue is obtained using an OS specific locking mechanism.

An enqueue allows the user to store a value in the lock, i.e the mode in which we are requesting it.

The OS lock manager keeps track of the resources locked. If a process cannot be granted the lock because it is incompatible with the mode requested and the lock is requested with wait, the OS puts the requesting process on a wait queue which is serviced in FIFO. Another difference between latches and enqueues is that in latches there is no ordered queue of waiters like in enqueues. Latch waiters may either use timers to wakeup and retry or spin (only in multiprocessors). Since all waiters are concurrently retrying (depending on the scheduler), anyone might get the latch and conceivably the first one to try might be the last one to get

How Latches Work

Because the duration of operations against memory is very small (typically in the order of nanoseconds) and the frequency of latch requests very high, the latching mechanism needs to be very lightweight. If the latch is already in use, Oracle can assume that it will not be in use for long, so rather than go into a passive wait (e.g., relinquish the CPU and go to sleep) Oracle will retry the operation a number of times before giving up and going to passive wait. This algorithm is called acquiring a spinlock and the number of spins before sleeping is controlled by the Oracle initialization parameter _spin_count. The first time the session fails to acquire the latch by spinning, it will attempt to awaken after 10 milliseconds. Subsequent waits will increase in duration and in extreme circumstances may exceed one second. In a system suffering from intense contention for latches, these waits will have a severe impact on response time and throughput.

Causes of contention for specific latches

If a required latch is busy, the process requesting it spins, tries again and if still not available, spins again. The loop is repeated up to a maximum number of times determined by the initialization parameter _SPIN_COUNT. If after this entire loop, the latch is still not available, the process must yield the CPU and go to sleep. Initially is sleeps for one centisecond. This time is doubled in every subsequent sleep. This causes a slowdown to occur and results in additional CPU usage, until a latch is available. The CPU usage is a consequence of the "spinning" of the process. "Spinning" means that the process continues to look for the availability of the latch after certain intervals of time, during which it sleeps.

Latch statistics in AWR report contain following sections

Firstly Look at on AWR TOP wait event then take a action behalf of latches.

1. Latch Activity

2. Latch Sleep Breakdown

3. Latch Miss Sources

4. Parent Latch Statistics

5. Child Latch Statistics

----------------------------------------------------------------------------------------

Reducing Contention for internal latches

------------------------------------------------------------------

1. Buffer Cache Latches (Cache Buffer Chain Latch Family)

*. Cache buffer chain latch

- Reduce hot blocks (reorganize table or index / delete and reinsert the data into a table).

- Before incrementing the parameter DB_BLOCK_BUFFERS check that specific blocks are not causing the contention avoiding memory wasting.

- Avoid order clause

- Increase sequence cache size if use this.

- Poor Disk I/O.

*. Cache buffers LRU chain latch

- Multiple Buffer pools and adjusting the parameter DB_BLOCK_LRU_LATCHES to have multiple LRU latches will help on reducing latch contention.

- Lots of physical and logical read.

- Large full table scan & large index range scan.

- Unslective indexes.

- Increase the size of buffer cache.

---------------------------------------------------------------------------------------------------------------------

Identify the HOT Block / Detect Cache Buffer Chain Wait

Oracle Metalink Note # 163424.1

----------------------------------------------------------------------------------------

select count(*) child_count, sum(gets) sum_gets, sum(misses) sum_misses, sum(sleeps) sum_sleeps from v$latch_children where name = 'cache buffers chains';

@ this script to locate a hot block:

select /*+ RULE */ e.owner ||'.'|| e.segment_name segment_name, e.extent_id extent#, x.dbablk - e.block_id + 1 block#, x.tch, l.child# from sys.v$latch_children l, sys.x$bh x, sys.dba_extents e

where x.hladdr = 'ADDR' and e.file_id = x.file# and x.hladdr = l.addr and x.dbablk between e.block_id and e.block_id + e.blocks -1 order by x.tch desc;

-------------------------------------

Reduce Cache Buffer Chain

-------------------------------------

suggests using the _db_block_hash_buckets and _db_block_hash_latches undocumented parameters have been suggested as a remedy,

but always check with MOSC before using any undocumented parameters.

Depending on the TCH column (The number of times the block is hit by a SQL statement), you can identify a hotblock. The higher the value of the TCH column, the more frequent the block is accessed by

SQL statements. In order to reduce contention for this object the following mechanisms can be put in place:

1) Examine the application to see if the execution of certain DML and SELECT statements can be reorganized to eliminate contention on the object.

2) Decrease the buffer cache -although this may only help in a small amount of cases.

3) DBWR throughput may have a factor in this as well. If using multiple DBWR's then increase the number of DBWR's

4) Increase the PCTUSED / PCTFREE for the table storage parameters via ALTER TABLE or rebuild. This will result in less rows per block.

5) Consider implementing reverse key indexes (if range scans aren't commonly used against the segment)

2. Redo-log Buffer Latches

*. Redo allocation latch

- Contention for this latch in Oracle7 can be reduced by decreasing the value of LOG_SMALL_ENTRY_MAX_SIZE

on multi-cpu systems to force the use of the redo copy latch.

- In Oracle8i this parameter is obsolete, so you need to consider to increase the size of the LOG_BUFFER or

reduce the load of the log buffer using NOLOGGING features when possible.

*. Redo copy latch

- This latch is waited for on both single and multi-cpu systems. On multi-cpu systems, contention can be reduced

by increasing the value of LOG_SIMULTANEOUS_COPIES (Hidden in Oracle8i) and/or increasing
LOG_ENTRY_PREBUILD_THRESHOLD (undocumented in Oracle7).

- Increase the size of LOG_BUFFER.

3. Library Cache

*. Library cache latch

- Ensure that the application is reusing as much as possible SQL statement.

- If the application is already tuned, increase the SHARED_POOL_SIZE.

*. Library cache pin latch

- This latch is acquired when a statement in the library cache is reexecuted

4. Shared pool latches

*. Shared pool latches

- Specify the parameteres to reduce it CURSOR_SPACE_FOR_TIME, CURSOR_SHARING=FORCE

Use Bind variables instead of Literals.

- Ways to reduce the shared pool latch are, avoid hard parses when possible.

*. Row cache objects latch

- In order to reduce contention for this latch, we need to tune the data dictionary cache.

In Oracle7 this basically means increasing the size of the shared pool (SHARED_POOL_SIZE)

as the dictionary cache is a part of the shared pool.

References:

http://www.toadworld.com/platforms/oracle/w/wiki/1300.reducing-redo-copy-latch-contention

http://oracleinaction.com/latche-lock-pin-mutex/

http://oraclegurukul.blogspot.in/2016/02/latch-free-waits.html

Insights and Strategies for DBA

Labels

Tuesday, 23 August 2016

LATCH

No comments: