Oracle database automatically performs sorting operations on row data as requested by any of the following operations
- CREATE INDEX, ALTER INDEX ... REBUILD
- DISTINCT
- ORDER BY
- GROUP BY
- UNION
- INTERSECT
- MINUS
- IN, NOT IN
- Aggregation functions (MAX, MIN, AVG, SUM)
- Certain UN indexed joins
Sorts that can’t fit into the sort_area_size will be paged out into the TEMP table spaces for a disk sort. Disk sorts are about 14,000 times slower than the memory sorts.
Recursive calls can be generated by any or all of the following activities
· An object requiring an additional extent for storage (dynamic extension)
· Misses on the dictionary cache
· Firing of database triggers
· DDL statements
· Execution of SQL statements within stored procedures, packages, functions, and anonymous PL/SQL blocks
· Enforcement of referential integrity constraints
Performance Tuning
The below are some of the common methods to minimize the quantity of above discussed(Physical I/O, Recursive Calls, disk sorts) in the oracle database and there by improve the performance of the system
Index Suppression
Any function that modifies the column name in a WHERE clause will suppress the corresponding index. Many common functions that are used to suppress a standard index are
NOT / IS NULL / != or <>
Comparing a number field to a character field
Any modification to the Indexed Column Name
(TO_CHAR, TO_DATE, +0, || '', SUBSTR, DECODE...)
Suppression Example; despite the intended hint to use the index, the SUBSTR function will suppress the index on the EMP_ID column below:
select /*+ index(customer custidx) */ EMP_ID, ZIP_CODE
from EMPLOYEE
where SUBSTR(EMP_ID,1,4) = '2502';
Execution Time - 225 seconds
The SUBSTR function was re-written with a LIKE instead and part of the index is used and the performance is substantially increased:
select EMP_ID, ZIP_CODE
from EMPLOYEE
where EMP_ID LIKE '2502%';
Execution Time - 5 seconds
Comparing wrong data types
If you compare the wrong data types, your index may be suppressed internally. This is because Oracle will re-write the query so that the comparison is correct. This problem is at times difficult to track down.
Comparing Characters to Numbers:
where char_data = 123
could be rewritten to:
where To_Number(char_data) = 123
Comparing Numbers to Characters:
where num_data = ‘123’
could be rewritten lik:e
where To_Char(num_data) = ‘123’
Tip: Comparing mismatched data types could cause an internal index suppression that is difficult to track down. Oracle will often place a function on the column that fixes the mismatch, but suppresses the index.
Function-based Indexes
Function-based indexes allow you to create an index based on a function or expression. Function-based indexes can involve multiple columns, arithmetic expressions or may be a PL/SQL function or C callout. The following example shows an example of a function based index.
Creating the Function-based Index:
CREATE INDEX emp_idx ON emp (UPPER(ename));
An index has been created on the ename column when the UPPER function is used on this column.
Query the emp table using the Function-based Index:
select ename, job, deptno
from emp
where upper(ename) = ‘ELLISON’;
The function-based index (emp_idx) can be used for the query above. For large tables where the condition retrieves a small amount of records, the query yields substantial performance gains over a full table scan.
To comprehend the advantages of function-based indexes consider the following queries.
We run the query, executing a full table scan.
select count(*)
from sample
where ratio(balance,limit) >.5;
Elapse time: 20.1 minutes
from sample
where ratio(balance,limit) >.5;
Elapse time: 20.1 minutes
We create a functional index.
create index ration_idx on sample ( ratio(balance, limit));
We re-run the query using the function-based index.
select count(*)
from sample
where ratio(balance,limit) >.5;
Elapse time: 7 seconds!!!
from sample
where ratio(balance,limit) >.5;
Elapse time: 7 seconds!!!
Note that the function RATIO simply divides argument 1 by argument 2.
using the ‘minus’ operator
The MINUS operator, for example, can be much faster than using WHERE NOT EXISTS or WHERE NOT IN (SELECT). Following is an example of the power of the MINUS operator versus the NOT IN construct. First, the NOT IN approach:SELECT customer_id
FROM customers
WHERE area_code IN (402, 310)
AND zip_code NOT IN (68116, 68106);
Even if we have indexes on both the AREA_CODE and ZIP_CODE columns, the NOT IN predicate, to eliminate two zip codes from the result set, will necessitate a full table scan. On the other hand:SELECT customer_id
FROM customers
WHERE area_code IN (402, 310)
MINUS
(SELECT customer_id
FROM customers
WHERE zip_code IN (68116, 68106);
3.5 using the ‘union’ operator
The UNION operator, which is standard SQL and not peculiar to Oracle, is also a potential shortcut, especially for a self-join with two non contiguous index range values.Following is an example of the UNION operator retrieving two non contiguous result sets in a similar business situation:
SELECT customer_id
FROM customer
WHERE area_code IN (402, 310)
UNION
SELECT customer_id
FROM customers
WHERE zip_code IN (31326, 31327);
Bear in mind that this only helps if the AREA_CODE and ZIP_CODE columns are left-most in the indexes. Using the ‘ROWNUM’
Take advantage of ROWNUM. ROWNUM is a special pseudo-column that exists for every result set. It is quite useful for limiting a potential runaway query and avoiding application grief. It refers to the relative row for a given query, before any ORDER BY clause is applied. This is important to understand. If your statement looks like:SELECT COUNT(*)
FROM customers
WHERE ROWNUM < 100
Oracle will select and return the first 99 rows and the query will halt. If you have a name search on a large table, selecting WHERE NAME LIKE S% could easily return 100,000 rows or more. Rather than forcing your users to logically qualify the query, you can add this row-limit qualifier to end the search when the upper limit is reached:SELECT name, address, city
FROM customers
WHERE name LIKE 'S%' AND ROWNUM < 1000
will return no more than 999 rows. More important, the query will return when the upper limit is reached, before executing any sorts. This is a wonderful saving grace that you should use more often.
Try to avoid ‘OR’ if possible
Placing indexes on statements having an OR clause and multiple WHERE conditions can be difficult. While in previous versions it was essential to index at least one column in each clause OR’ed together, the merging of indexes in the later versions of Oracle (V8+) becomes hazardous to the performance. Experiment with potentially suppression all indexes except the most limiting (retrieves the least amount of rows). Consider the following examples:
Given: Indexes on EMPNO, ENAME and DEPTNO
select ENAME,DEPTNO,CITY,DIVISION
from EMP1
where EMPNO = 1
or ENAME = 'LONEY'
or DEPTNO = 10;
Execution Time: 4400 Seconds
Execution Plan:
TABLE ACCESS EMP1 FULL
The Solution:
SELECT /*+ INDEX(EMP EMP11) */
ENAME, DEPTNO, CITY, DIVISION
FROM EMP1
WHERE EMPNO = 1
OR ENAME = 'LONEY'
OR DEPTNO = 10;
Execution Time: 280 Seconds
Execution Plan:
TABLE ACCESS EMP1 ROWID
TABLE ACCESS EMP11 INDEX RS
Dealing with Inequalities
The cost-based optimizer tends to have problems with inequalities. Since Oracle records the high and low value for a column and assumes a linear distribution of data, problems occur when an inequality is used on a table with a non-linear distribution of data. This can be solved by overriding the optimizer or by using histograms
Given:
The ORDER_LINE Table has 10,000 rows between 1 and 10,000
There are 5000 records (half the table) with an item number > 9990
There is an index on item_no
The Optimizer chooses to use the index, since it believes there are only 10 rows to be retrieved:
SELECT SIZE, ITEM_NO
FROM ORDER_LINE
WHERE ITEM_NO > 9990;
Execution Time: 530 Seconds
The data and half the table will be retrieved by the query, and then we must suppress the index and substantially increase performance. We suppress the index (and override the optimizer) since the query retrieves 50% of the table (which is much more than the 5% or less rule for using the index)!
SELECT /*+ FULL(ORDER_LINE) */ SIZE, ITEM_NO
FROM ORDER_LINE
WHERE ITEM_NO > 9990;
Execution Time: 5 Seconds
Tip: Strongly consider using hints to override the optimizer when using the “<“ and “>“ when the distribution of data is not linear between the high & low values of a column. Histograms may also be employed.
Nested Sub queries
Using nested sub queries instead of joining tables in a single query can lead to dramatic performance gains. Only certain queries will meet the criteria for making this modification. When you find the right one, this trick will take performance improvement to an exponentially better height. The conditions for changing a query to a nested sub query occur when:
Tables are being joined to return the rows from ONLY one table.
Conditions from each table will lead to a reasonable percentage of the rows to be retrieved (more than 10%)
The original query:
SELECT A.COL1, A.COL2
FROM TABLE1 A, TABLE2 B
WHERE A.COL3 = VAR
AND A.COL4 = B.COL1
AND B.COL2 = VAR;
The new query:
SELECT A.COL1, A.COL2
FROM TABLE1 A
WHERE A.COL3 = VAR
AND EXISTS
(SELECT ‘X’
FROM TABLE B
WHERE A.COL4 = B.COL1
AND B.COL2 = VAR);
A real life example:
SELECT ORDER.ORDNO, ORDER.CUSTNO
FROM ORDER_LINE OL, ORDER
WHERE ORDER.ORDNO = OL.ORDNO
AND ORDER.CUSTNO = 5
AND OL.PRICE = 200;
Execution Time: 240 Minutes
The solution:
SELECT ORDNO, CUSTNO
FROM ORDER
WHERE CUSTNO = 5
AND EXISTS
(SELECT ‘X’
FROM ORDER_LINE OL
WHERE ORDER.ORDNO = OL.ORDNO
AND OL.PRICE = 200);
Execution Time: 9 Seconds
Join Methods
The following are the various ways to join row sources together.
Nested Loops Joins
In a nested loops join, Oracle reads the first row from the first row source and then checks the second row source for matches. All matches are then placed in the result set and Oracle goes on to the next row from the first row source. This continues until all rows in the first row source have been processed. The first row source is often called the outertable or driving table, while the second row source is called the innertable. This is one of the fastest methods of receiving the first records back from a join.
Nested loops joins are ideal when the driving row source (the records that you’re looking for) is small and the joined columns of the inner row source are uniquely indexed or have a highly selective non-unique index. Nested loops joins have an advantage over other join methods in that they can quickly retrieve the first few rows of the result set without having to wait for the entire result set to be determined.
Reducing SQL Parsing in Recursive Procedures
Reduce the parse-to-execution ratio in your applications.Description: Extensive SQL parsing can become a serious problem for a heavily loaded Oracle instance. This is especially true if a SQL statement executed many times is also parsed many times. There are a number of standard techniques and guidelines, which help reduce unnecessary parsing. In some special cases, though, even following all the guidelines does not save you from extensive parsing. Find out below how to apply a workaround in a specific situation to further reduce the parse-to-execute ratio of a statement.
The effect can be observed when a SQL cursor is used in a recursively-structured procedure. In such procedures, very often a cursor is opened and for each fetched row the same procedure is called, which opens the same cursor, etc. In most of the cases the recursive call is executed before the cursor is closed. As a result of this, for each new "cursor open" a parse call is executed resulting in as many parses as executions.
Take a look at a simplified recursive procedure using the SCOTT schema:
PROCEDURE recurs (p_mgr IN emp.mgr%TYPE)
IS
CURSOR emp_mgr IS
SELECT empno
FROM emp
WHERE mgr = p_mgr;
BEGIN
FOR c IN emp_mgr
LOOP
recurs(c.empno);
END LOOP;
END recurs;
As you can see the recursive call is executed before the (implicit) cursor is closed. The main idea for reducing the parse calls is to first collect the results of the cursor (for example in a PL/SQL table), then close the cursor and finally cycle through the results and perform the recursive procedure calls. See below and example of such procedure (In this procedure I have used a bulk bind select, but a normal cursor loop can be used too):
PROCEDURE recurs_close (p_mgr IN emp.mgr%TYPE)
IS
CURSOR emp_mgr IS
SELECT empno
FROM emp
WHERE mgr = p_mgr;
TYPE t_empno IS TABLE OF NUMBER(4) INDEX BY BINARY_INTEGER;
p_empno t_empno;
i PLS_INTEGER := 0;
BEGIN
OPEN emp_mgr;
FETCH emp_mgr BULK COLLECT INTO p_empno;
i := emp_mgr%ROWCOUNT;
CLOSE emp_mgr;
FOR j IN 1..i
LOOP
recurs_close(p_empno(j));
END LOOP;
END recurs_close;
In the excerpts of the trace files generated during the procedure execution can be seen that the first procedure has as many parses as executions (14), while the second has 1 parse only.
exec cursor_parse.recurs(7839);
SELECT empno
FROM emp
WHERE mgr = :b1
call count cpu elapsed disk query current rows
------- ------ ----- ---------- ------- ---------- ---------- ----------
Parse 14 0.02 0.15 0 0 0 0
Execute 14 0.00 0.00 0 0 0 0
Fetch 27 0.00 0.05 1 26 28 13
------- ------ ----- ---------- ------- ---------- ---------- ----------
total 55 0.02 0.20 1 26 28 13
exec cursor_parse.recurs_close(7839);
SELECT empno
FROM emp
WHERE mgr = :b1
call count cpu elapsed disk query current rows
------- ----- ---- ------- ---- ----- ------- --------
Parse 1 0.00 0.00 0 0 0 0
Execute 14 0.00 0.00 0 0 0 0
Fetch 14 0.00 0.00 0 14 28 13
------- ----- ---- ------- ----- ------ -------- --------
total 29 0.00 0.00 0 14 28 13
Most of the important statistics are better for the execution of the recurs_close than the recurs procedure.
Statistic name recurs recurs_close
opened cursors cumulative 26 12
recursive calls 89 50
session logical reads 84 72
consistent gets 41 29
no work - consistent read gets 32 20
cursor authentications 2 1
parse count (total) 26 12
Try to minimize the disk sorts
A sort operation can become problematic if it requires disk I/O for it to complete. The memory space that can be used to perform a sort operation is controlled by the system parameter SORT_AREA_SIZE and PGA_AGGREGATE_TARGET. If the sort is too large to be contained in the space determined by the above parameters, oracle will continue the sort on disk. This is where performance problems can begin to develop.
The below is the short list of some of the most commonly used SQL commands that can cause sorts
- CREATE INDEX, ALTER INDEX ... REBUILD
- DISTINCT
- ORDER BY
- GROUP BY
- UNION
- INTERSECT
- MINUS
- IN, NOT IN
- Aggregation functions (MAX, MIN, AVG, SUM)
- Certain unindexed joins
- Certain correlated subqueries
Since disk sort involves both physical reads and physical writes, try to avoid using distinct, order by, group by, union, intersect, minus and some aggregate functions if the application permits.
Watch Non-Indexed WHERE Conditions
Oracle evaluates Non-Indexed conditions linked by AND bottom upBad: select * from address where
areacode = 972 and
type_nr = (select seq_nr from code_table where type = ‘HOME’)
Good: select * from address where
type_nr = (select seq_nr from code_table where type = ‘HOME’) and
areacode = 972
Oracle evaluates Non-Indexed conditions linked by OR top down
Bad: select * from address where
type_nr = (select seq_nr from code_table where type = ‘HOME’) or
areacode = 972
Good: select * from address where
areacode = 972 or
type_nr = (select seq_nr from code_table where type = ‘HOME’)
Consider IN or UNION in place of OR
if columns are not indexed, stick with ORif columns are indexed, use IN or UNION in place of OR
IN example
Bad: select * from address where
state = 'TX‘ or
state = 'FL‘ or
state = 'OH‘
Good: select * from address where
state in ('TX','FL','OH')
UNION example
Bad: select * from address where
state = ‘TX’ or
areacode = 972
Good: select * from address where
state = ‘TX’
union
select * from address where
areacode = 972
Weigh JOIN versus EXISTS Sub-Query
use table JOIN instead of EXISTS sub-querywhen the percentage of rows returned from the outer sub-query is high
select e.name, e.phone, e.mailstop
from employee e, department d
where e.deptno = d.deptno
and d.status = ‘ACTIVE’
use EXISTS sub-query instead of table JOIN
when the percentage of rows returned from the outer sub-query is low
select e.name, e.phone, e.mailstop
from employee e
where e.deptno in (select d.deptno
from department d
where d.status != ‘ACTIVE’)
Consider EXISTS in place of DISTINCT
avoid joins that use DISTINCT, use EXISTS sub-query insteadBad: select distinct deptno, deptname from emp, dept where
emp.deptno = dept.deptno
Good: select deptno, deptname from dept where
exists (select ‘X’ from emp where
emp.deptno = dept.deptno)
Note – only has to find one match
Consider NOT EXISTS in place of NOT IN
avoid sub-queries that use NOT IN, use NOT EXISTS instead Bad: select * from emp where
deptno not in (select deptno from dept where
deptstatus = ‘A’)
Good: select * from emp where
not exists (select ‘X’ from dept where
deptstatus = ‘A’ and
dept.deptno = emp.deptno)
Note – only has to find one non-match
Use PL/SQL to reduce network traffic
Utilize PL/SQL to group related SQL commands and thereby reduce network trafficBad:
select city_name, state_code
into :v_city, :v_sate
from zip_codes where zip_code = ‘75022’;
insert into customer (‘Bert Scalzo’,’75022’, :v_city, v_state);
Good:
begin
select city_name, state_code
into :v_city, :v_sate
from zip_codes where zip_code = ‘75022’;
insert into customer (‘Bert Scalzo’,’75022’, :v_city, v_state);
end;