ORA-04030 AND ORA-04031

ORA-04030 AND ORA-04031

 

 ORA-04031 indicates that shared pool configuration is not sufficient to meet that instance requirements. So, shared pool is to be increased. ORA-04031 can be caused if the User profile is NOT as required by Oracle on Oracle Sun Solaris, AIX and HP. I have not used True Unix and hence I do not know.

But there are bugs that can cause too. The server CPU architecture is multi-threaded and Virtual CPU is beyond core, then also this ORA-04031 can be encountered. The reason is Virtual CPU and CPU makes no difference when you use “psrinfo” command with no operators/switches. That output is the input for cpu_count parameter at Oracle RDBMS level. And parallel_max_servers is based on an algorithm

The algorithm is based on

a. parallel_threads_per_cpu
b. cpu_count
c. pga_aggregate_target
d. sga_target
e. processes

parallel_max_servers = a*2b*5. Default value is = min(processes - 15, 160). Default processes = 150. min (150-15 ,160) = 135 is the value of parallel_max_servers. now on t4, t5 architectures are multithreaded. “psrinfo” on t5-2 lists 256 CPU. Per parallel server we are to provide 2MB shared memory. That means 256*2 should be provided in shared pool. If shared pool is configured 512 M, all that memory goes to provisioning for parallel_max_servers. so Oracle throws up ORA-04031. If you have 2GB earlier, when you migrate the database to T5-2 server, the shared pool is to be configured to 2.5 GB minimum.

From 11gR2, this algorithm has changed.

Derived from the values of CPU_COUNT, PARALLEL_THREADS_PER_CPU, and PGA_AGGREGATE_TARGET (2 MB per process)

How to set the Parameters parallel_max_servers and parallel_min_servers (Doc ID 124452.1)

parallel_max_servers - if you set some big number, for example 255, Oracle will limit effective (real) max number of processes taking into calculation number of processors on that machine and number of threads per processor.

parallel_min_servers - basic purpose of this parameter is to tell Oracle how many processes to start on instance startup and to keep in memory, in order to save time needed for process creation for each query.

For every process, one chunk of pga memory is allocated. If process is loaded in memory during instance startup because parallel_min_servers requested that, chunk remains allocated until instance shutdown. That chunk has attribute "permanent".

On the opposite, there are "temporary" processes and "temporary" chunks. Example: parallel_min_servers=10, parallel_max_servers=40, user requested for query parallelism: alter ... parallel 20. There will be 10 permanent processes and permanent chunks. Also, Oracle will dynamicaly start additional processes to satisfy user request for parallel execution. Let's say 10 temporary processes and 10 temporary chunks. If user requests more than allowed by parallel_max_servers or system limits, alter command will return an ORA error. If user requested alter ... parallel 6, no temporary processes will be created. Only permanent processes will be used.

Setting parallel_min_servers=0 will make all processes and chunks temporary. At the end of operation, the chunk used for one query is destroyed. The next query will use new, clean set of chunks.

This means to me that we need to configure User Limits of Unix to suit this changes. Oracle docs read that shm_mem_ids= 256. Means min of those chunks 256. These chunks vary as under:

a. 2MB
b. 4MB
c. 8MB
d. 16MB

Based on the requirement the chunks are spawned. Control file blocks are also like this and have different blocks based on the structures of control files we configure. (those structures is not our subject of discussion and hence left here). The default value for oracle is 128. This is now considered a bug. We need to set based on the T4 or T5 flavors bump up the value to 2048. And also provide for the parallel_max_servers. the value runs into 10s of GB. Then also we hit ORA-04031. So we are to define the value to max_parallel_servers based on instance requirements and then default will not be effective. And the instance can be started with no errors.

This was a proved case on bring up multiple instances on T5-2 for me. I warned the engineering on hosting OLTP databases on T5-2 with no testing and arriving at the value for parallel_max_servers. if the tables are configured with parallelism, say 6 for our discussion sake and we assume that that can be 10 concurrent SQL running. 60 are the parallel servers we are to set the value to avoid the default value to take precedence.

ORA-04030

I discussed ORA-04031 even though subject was ORA-04030. I intended to discuss both. But forgot to that in the subject as thought flow left behind the physical flow to write.

ORA-04030 event is spit when PGA target area is not optimally configured. PGA max is to be sized based on concurrent session and data sets for those concurrent sessions. It is not possible to set the value in one go the value and we need to study the data for the histograms for a period and then determine.

PGA is a combination of shared memory and volatile memory. And it is possible to swap the process and impact performance.

Automatic PGA Memory Management (Doc ID 223730.1)

Automatic PGA Memory Management
-------------------------------

Process Global Area, often known as the Program Global Area (PGA) resides in the process private memory of the server process. It contains global variables and data structures and control information for a server process. example of such information is the runtime area of a cursor. Each time a cursor is executed, a new runtime area is created for that cursor in the PGA memory region of the server process executing that cursor.

The performance of complex long running queries, typical in a DSS environment, depend to a large extent on the memory available in the Program Global Area (PGA), which is also called work area.

The size of a work area can be controlled and tuned. Generally, bigger work areas can significantly improve the performance of a particular operator at the cost of higher memory consumption. Ideally, the size of a work area is big enough that it can accommodate the input data and auxiliary memory structures allocated by its associated SQL operator. This is known as the optimal size of a work area (e.g. a memory sort). When the size of the work area is smaller than optimal (e.g. a disk sort), the response time increases, because an extra pass is performed over part of the input data. This is known as the one-pass size of the work area. Under the one-pass threshold, when the size of a work area is far too small compared to the input data size, multiple passes over the input data are needed. This could dramatically increase the response time of the operator. This is known as the multi-pass size of the work area.

In Oracle8i administrators sized the PGA by carefully adjusting a number of initialization parameters, such as, SORT_AREA_SIZE, HASH_AREA_SIZE, BITMAP_MERGE_AREA_SIZE, and CREATE_BITMAP_AREA_SIZE, etc.

Starting with Oracle9i, an option is provided to completely automate the management of PGA memory. Administrators merely need to specify the maximum amount of PGA memory available to an instance using a newly introduced initialization parameter PGA_AGGREGATE_TARGET.

The database server automatically distributes this memory among various active queries in an intelligent manner so as to ensure maximum performance benefits and the most efficient utilization of memory. Furthermore, Oracle9i and newer releases can adapt itself to changing workload thus utilizing resources efficiently regardless of the load on the system. The amount of the PGA memory available to an instance can be changed dynamically by altering the value of the PGA_AGGREGATE_TARGET parameter making it possible to add to and remove PGA memory from an active instance online. Since the database engine itself is better equipped to determine SQL execution memory requirements, database administrators should use this feature and not try to tune the PGA manually. This should translate to better throughput for large number of users on the system as well as improved response time for queries.

The automatic SQL execution memory management feature is enabled by setting the parameter WORKAREA_SIZE_POLICY to AUTO and by specifying a size of PGA_AGGREGATE_TARGET in the initialization file. These two parameters can also beset dynamically using the ALTER SYSTEM command. In the absence of either of these parameters, the database will revert to manual PGA management mode. In Oracle9iRelease 2, an advisory for PGA_AGGREGATE_TARGET was introduced. Just like in Buffer Cache Advisory, the PGA Advisory will suggest the appropriate size for PGA memory and thus make PGA tuning an even simpler task.

An important point to understand is that PGA_AGGREGATE_TARGET does not set a hard limit on PGA size. It is only a target value used to dynamically size the process work areas. It also does not affect other areas of the PGA that are allowed to grow beyond this limit. See Document:399497.1 - FAQ: ORA-4030 that has more discussion related to PGA.

Version specific notes:

Until 9iR2, PGA_AGGREGATE_TARGET parameter controls the sizing of work areas for all dedicated server connections, but it has no effect on shared servers (aka MTS) connections and the *_AREA_SIZE parameters will take precedence in this case.

In 10g, PGA_AGGREGATE_TARGET controls work areas allocated by both dedicated and shared connections.

As of 11g, Automatic Memory Management (AMM) expands to managing both SGA and PGA memory. Under memory pressure for PGA memory, SGA memory will be re-allocated for use by a process to accommodate workarea needs. On the flip-side, if PGA memory is under allocated, memory can be added to the auto-tuned components in the SGA beyond the original SGA configuration.

NOTE: With AMM, setting an explicit value for PGA_AGGREGATE_TARGET will act as a minimum setting that AMM will not shrink below. See Note:443746.1for more information.

How To Tune PGA_AGGREGATE_TARGET
----------------------------------

The first question we will have when we set this parameter is what is the best value for it?

To determine the appropriate setting for PGA_AGGREGATE_TARGET parameter were commend to follow the following steps

1- Make a first estimate for PGA_AGGREGATE_TARGET based on the following rule

- For OLTP systems

PGA_AGGREGATE_TARGET = (physical memory * 80%) * 20%

- For DSS systems

PGA_AGGREGATE_TARGET = (physical memory * 80%) * 50%

So for example, if we have an Oracle instance configured on system with 16G of Physical memory, then the suggested PGA_AGGREGATE_TARGET parameter value we should start with incase we have OLTP system is (16 G * 80%)*20% ~= 2.5G and incase we have DSS system is (16 G * 80%)* 50% ~= 6.5 G.

In the above equation, we assume that 20% of the memory will be used by the OS, and in OLTP system 20% of the remaining memory will be used for PGA_AGGREGATE_TARGET and the remaining memory is going for Oracle SGA memory and non-oracle processes memory. So make sure that you have enough memory for your SGA and also for non-oracle processes

2- A second step in tuning the PGA_AGGREGATE_TARGET is to monitor performance using available PGA statistics and see if PGA_AGGREGATE_TARGET is under sized or oversized.

Several dynamic performance views are available for this purpose:

V$SYSSTAT and V$SESSTAT
-----------------------

The following statistics have been introduced which would help in monitoring and tuning:

'workarea memory allocated' - the total amount of PGA memory dedicated to workareas allocated in Kb.

'workarea executions - optimal' -

the cumulative count of work areas which had an optimal size. For example optimal size is defined if the sort does not need to spill to the disk

'workarea executions - onepass' -

the cumulative count of work areas using the one pass size. One pass is generally used for big work areas where spilling to disk cannot be avoided.

'workarea executions - multipass' - the cumulative count of work areas running in more than one pass. This should be avoided and is the symptom of poorly tuned system.

A workarea execution is either optimal, onepass or multipass.

Monitoring Private Memory using V$SYSTAT
----------------------------------------

The following query returns a percentage of work areas used with optimal memory size.

select
trunc (
(sum(case when name like 'workarea executions - optimal'
then value else 0 end) *100) /
(
sum(case when name like 'workarea executions - optimal'
then value else 0 end) +
sum(case when name like 'workarea executions - one pass'
then value else 0 end) +
sum(case when name like 'workarea executions - multipass'
then value else 0 end)
)
) optimal_percent
from v$sysstat
where name like 'workarea executions - %'
/

Note: If this feature is not enabled the previous query will fail with ORA-01476: divisor is equal to zero

- V$PGASTAT

This view provides instance-level statistics on the PGA memory usage and the automatic PGA memory manager. For example:

SELECT * FROM V$PGASTAT;

NAME VALUE
--------------------------------------------------------
aggregate PGA target parameter 524288000 bytes
aggregate PGA auto target 463435776 bytes
global memory bound 25600 bytes
total PGA inuse 9353216 bytes
total PGA allocated 73516032 bytes
maximum PGA allocated 698371072 bytes
total PGA used for auto workareas 0 bytes
maximum PGA used for auto workareas 560744448 bytes
total PGA used for manual workareas 0 bytes
maximum PGA used for manual workareas 0 bytes
over allocation count 0 bytes
total bytes processed 4.0072E+10 bytes
total extra bytes read/written 3.1517E+10 bytes
cache hit percentage 55.97 percent

Main statistics to look at

(a) aggregate PGA auto target : This gives the amount of PGA memory Oracle can use for work areas running in automatic mode. This part of memory represent thetunable part of PGA memory,i.e. memory allocated for intensive memory SQL operators like sorts, hash-join, group-by, bitmap merge and bitmap index create. This memory part can be shrinked/expanded in function of the system load. Other parts of PGA memory are known as untunable, i.e. they require a size that can't be negociated (e.g. context information for each session, for each open/active cursor, PL/SQL or Java memory).

So, the aggregate PGA auto target should not be small compared to the value of PGA_AGGREGATE_TARGET. You must ensure that enough PGA memory is left for work areas running in automatic mode.

(b) total PGA used for auto workarea: This gives the actual tunable PGA memory used bythe system. The 'maximum PGA used for auto workareas' gives the maximum reachedby previous statistic since instance startup.

(c) total PGA in used: This gives the total PGA memory in use. The detail of this value can be found in the PGA_USED_MEM column of the v$process view.

Oracle92, 10g, 11g:

(d) over allocation count:

Over-allocating PGA memory can happen if the value of PGA_AGGREGATE_TARGET is too small to accommodate the untunable PGA memory part plus the minimum memory required to execute the work area workload. When this happens, Oracle cannot honor the initialization parameter PGA_AGGREGATE_TARGET, and extra PGA memory needs to be allocated. over allocation count is the number of time the system was detected in this state since database startup. This count should ideally be equal to zero.

Oracle92,10g, 11g:

(e) cache hit percentage: This metric is computed by Oracle to reflect the performance of the PGA memory component. It is cumulative from instancestart-up. A value of 100% means that all work areas executed by the system since instance start-up have used an optimal amount of PGA memory. This is, of course, ideal but rarely happens except maybe for pure OLTP systems. In reality, some work areas run one-pass or even multi-pass, depending on the overall size of the PGA memory. When a work area cannot run optimally, one or more extra passes is performed over the input data. This reduces the cache hit percentage in proportion to the size of the input data and the number of extra passes performed. this value if computed from the "total bytes processed"
and "total extra bytes read/written" statistics available in the same view using the following formula:

total bytes processed * 100
PGA Cache Hit Ratio = ------------------------------------------------------
(total bytes processed + total extra bytes read/written)

- V$SQL_WORKAREA_HISTOGRAM (Oracle92, 10g, 11g)

This view shows the number of work areas executed with optimal memory size, one-pass memory size, and multi-pass memory size since instance start-up. Statisticsin this view are subdivided into buckets that are defined by the optimal memory requirement of the work area. Each bucket is identified by a range of optimal memory requirements specified by the values of the columns LOW_OPTIMAL_SIZE andHIGH_OPTIMAL_SIZE.

Example :

The following query shows statistics for all nonempty buckets.

SELECT LOW_OPTIMAL_SIZE/1024 low_kb,(HIGH_OPTIMAL_SIZE+1)/1024 high_kb,
optimal_executions, onepass_executions, multipasses_executions
FROM v$sql_workarea_histogram
WHERE total_executions != 0;

The result of the query might look like the following:

LOW_KB HIGH_KB OPTIMAL_EXECUTIONS ONEPASS_EXECUTIONS MULTIPASSES_EXECUTIONS
------ ------- ------------------ ------------------ ----------------------
8 16 156255 0 0
16 32 150 0 0
32 64 89 0 0
64 128 13 0 0
128 256 60 0 0
256 512 8 0 0
512 1024 657 0 0
1024 2048 551 16 0
2048 4096 538 26 0
4096 8192 243 28 0
8192 16384 137 35 0
16384 32768 45 107 0
32768 65536 0 153 0
65536 131072 0 73 0
131072 262144 0 44 0
262144 524288 0 22 0

The query result shows that, in the 1024 KB to 2048 KB bucket, 551 work areas usedan optimal amount of memory, while 16 ran in one-pass mode and none ran inmulti-pass mode. It also shows that all work areas under 1 MB were able to run inoptimal mode.

You can also use V$SQL_WORKAREA_HISTOGRAM to find the percentage of times work areas were executed in optimal, one-pass, or multi-pass mode since start-up.

Example :

SELECT optimal_count, round(optimal_count*100/total, 2) optimal_perc,
onepass_count, round(onepass_count*100/total, 2) onepass_perc,
multipass_count, round(multipass_count*100/total, 2) multipass_perc
FROM
(SELECT decode(sum(total_executions), 0, 1, sum(total_executions)) total,
sum(OPTIMAL_EXECUTIONS) optimal_count,
sum(ONEPASS_EXECUTIONS) onepass_count,
sum(MULTIPASSES_EXECUTIONS) multipass_count
FROM v$sql_workarea_histogram
WHERE low_optimal_size > 64*1024); ---- for 64 K optimal size

- V$SQL_WORKAREA_ACTIVE

This view can be used to display the work areas that are active (or executing) in the instance. Small active sorts (under 64 KB) are excluded from the view. Use this view to precisely monitor the size of all active work areas and to determine if these active work areas spill to a temporary segment.

Example :

SELECT to_number(decode(SID, 65535, NULL, SID)) sid,
operation_type OPERATION,trunc(EXPECTED_SIZE/1024) ESIZE,
trunc(ACTUAL_MEM_USED/1024) MEM, trunc(MAX_MEM_USED/1024) "MAX MEM",
NUMBER_PASSES PASS, trunc(TEMPSEG_SIZE/1024) TSIZE
FROM V$SQL_WORKAREA_ACTIVE
ORDER BY 1,2;

SID OPERATION ESIZE MEM MAX MEM PASS TSIZE
--- ----------------- --------- --------- --------- ----- -------
8 GROUP BY (SORT) 315 280 904 0
8 HASH-JOIN 2995 2377 2430 1 20000
9 GROUP BY (SORT) 34300 22688 22688 0
11 HASH-JOIN 18044 54482 54482 0
12 HASH-JOIN 18044 11406 21406 1 120000

This output shows that session 12 (column SID) is running a hash-join having itswork area running in one-pass mode (PASS column). This work area is currentlyusing 11406 KB of memory (MEM column) and has used, in the past, up to 21406 KBof PGA memory (MAX MEM column). It has also spilled to a temporary segment ofsize 120000 KB. Finally, the column ESIZE indicates the maximum amount of memorythat the PGA memory manager expects this hash-join to use. This maximum is dynamicallycomputed by the PGA memory manager according to workload.

When a work area is deallocated, that is, when the execution of its associated SQL
operator is complete, the work area is automatically removed from the V$SQL_WORKAREA_ACTIVE view.

Note :

3- The Third and last step is tuning the PGA_AGGREGATE_TARGET. In Oracle 9i Release 2 we have 2 new views that help us in this task

- V$PGA_TARGET_ADVICE
- V$PGA_TARGET_ADVICE_HISTOGRAM

By examining these two views, you will be able to determine how key PGA statistics will be impacted if you change the value of PGA_AGGREGATE_TARGET.

To enable automatic generation of PGA advice performance views, make sure the following parameters are set:

- PGA_AGGREGATE_TARGET
- STATISTICS_LEVEL. Set this to TYPICAL (the default) or ALL; setting this
parameter to BASIC turns off generation of PGA performance advice views.

The content of these PGA advice performance views is reset at instance start-up or when PGA_AGGREGATE_TARGET is altered. NOTE: PGA_AGGREGATE can change automatically over time starting with 11g as part of the Automatic Memory Management enhancements available at 11g. See Note:443746.1 for more details.

V$PGA_TARGET_ADVICE view predicts how the statistics cache hit percentage and over allocation count in V$PGASTAT will be impacted if you change the value of the initialization parameter PGA_AGGREGATE_TARGET.

The following select statement can be used to find this information

SELECT round(PGA_TARGET_FOR_ESTIMATE/1024/1024) target_mb,
ESTD_PGA_CACHE_HIT_PERCENTAGE cache_hit_perc,
ESTD_OVERALLOC_COUNT
FROM v$pga_target_advice;

The output of this query might look like the following:

TARGET_MB CACHE_HIT_PERC ESTD_OVERALLOC_COUNT
---------- -------------- --------------------
63 23 367
125 24 30
250 30 3
375 39 0
500 58 0
600 59 0
700 59 0
800 60 0
900 60 0
1000 61 0
1500 67 0
2000 76 0
3000 83 0
4000 85 0

From the above results we should set the PGA_AGGREGATE_TARGET parameter to a value where we avoid any over allocation, so lowest PGA_AGGREGATE_TARGET value we can set is 375 ( where ESTD_OVERALLOC_COUNT is 0)

After eliminating over-allocations, the goal is to maximize the PGA cache hit percentage, based on your response-time requirement and memory constraints.

V$PGA_TARGET_ADVICE_HISTOGRAM view predicts how the statistics displayed by the performance view V$SQL_WORKAREA_HISTOGRAM will be impacted if you change the value of the initialization parameter PGA_AGGREGATE_TARGET. You canuse the dynamic view V$PGA_TARGET_ADVICE_HISTOGRAM to view detailedinformation on the predicted number of optimal, one-pass and multi-pass workarea executions for the set of PGA_AGGREGATE_TARGET values you use for the prediction.

Additional queries for monitoring of work areas:

 

Lightshot screenshot

3. Finding the top ten biggest work areas currently allocated in the system:

select c.sql_text,
w.operation_type, top_ten.wasize
From (
Select *
From (
Select workarea_address,
actual_mem_used wasize
from v$sql_workarea_active
Order by
actual_mem_used
)
Where ROWNUM <=10 ) top_ten, v$sql_workarea w, v$sql c Where w.workarea_address=top_ten.workarea_address And c.address=w.address And c.child_number = w.child_number And c.hash_value=w.hash_value; 4. Finding the percentage of memory that is over and under allocated: select total_used, under*100/(total_used+1) percent_under_use, over*100/(total_used+1) percent_over_used From ( Select sum(case when expected_size > actual_mem_used
then actual_mem_used else 0 end) under,
sum(case when expected_size < actual policy='AUTO'>