Parallel PL/SQL Functions and Global Temporary Tables... and Wrong Results

Datetime:2016-08-23 04:04:52          Topic: SQL           Share

Recently I got a question from a customer about a parallel query which produces wrong results. The query involves a global temporary table and a parallel enabled PL/SQL function. Before talking about this specific query I want to briefly show the effect of using PL/SQL functions in a parallel query.

PL/SQL functions in parallel queries

When you use a PL/SQL function, as a predicate for example, in a parallel query the function is executed by the query coordinator (QC). This can cause some parts of the query or the whole query to be serialized which means significantly worse performance. Here is an example.

create table s as 
select rownum id,rpad('X',1000) pad
from dual 
connect by level<=10000;

create or replace function f_wait(id in number) return number 
is
begin
    dbms_lock.sleep(0.01);
    return(id);
end;
/

I have a table with 10K rows and a PL/SQL function that takes an input, waits 0.01 seconds and returns the input back. Let's compare a serial and a parallel query using this function.

SQL> set timing on
SQL> select count(*) from s where id=f_wait(id);

  COUNT(*)
----------
     10000

Elapsed: 00:01:40.24

SQL> select /*+ parallel(4) */ count(*) from s where id=f_wait(id);

  COUNT(*)
----------
     10000

Elapsed: 00:01:40.28

Both the serial and the parallel query ran for around 100 seconds, this is because we called the function by using the column ID as input and the function was executed for every row of table S (10K rows) and we waited 0.01 seconds for each row. We can understand why the parallel query did not improve the response time by looking at the execution plan.

Even though there is a parallel hint in the query the execution plan is a serial plan. This is because the PL/SQL function can only be executed by the QC, this makes the whole query go serial.

Parallel enabled PL/SQL functions

How can we change this behavior? How can we make sure the function is executed in parallel so that the query runs faster? The way to tell Oracle that a function can be executed in parallel is to use the keywordPARALLEL_ENABLE in the function definition. This keyword tells Oracle that this function is safe to be executed by an individual PX server. Here is what happens when we add that keyword.

create or replace function f_wait(id in number) return number 
parallel_enable
is
begin
dbms_lock.sleep(0.01);
return(id);
end;
/

SQL> select /*+ parallel(4) */ count(*) from s where id=f_wait(id);

  COUNT(*)
----------
     10000

Elapsed: 00:00:25.81

The elapsed time dropped to a quarter of what it was before, from 100 seconds to 25 seconds. This is because we had 4 PX servers running the function concurrently. Here is the plan this time.

Now the plan is fully parallel and operation #6, which is the filter operation running the function, is executed in parallel.

If you have to use a PL/SQL function make sure to set it as PARALLEL_ENABLE if you know that it is safe to be executed by PX servers. This is required to prevent serialization points in the execution plan.

Now, to the customer question I mentioned before.

Parallel enabled PL/SQL functions and global temporary tables

As you may already know, the data in a global temporary table is private to a session. You can only see the data populated in your own session, you cannot see the data inserted by other sessions. So, what happens if you populate the temporary table in your session and then run a parallel query on it? As parallel queries use multiple PX servers and multiple sessions, can PX servers see the data in the temporary table?

create global temporary table ttemp (col1 number) on commit delete rows;

insert into ttemp 
select rownum from dual
connect by level<=10000;

select /*+ parallel(2) */ count(*) from ttemp; 

  COUNT(*)
----------
     10000

In this case we had two PX servers scanning the temporary table and the reported count is correct. This indicates individual PX servers were able to see the data populated before by the user session. Parallel queries are different in the sense that parallel sessions working on a temporary table can see the data populated by the QC before. When running a query against the temporary table, the QC is aware of the temporary table and sends the segment information to the PX servers so that they can read the data.

PL/SQL functions querying temporary tables change this behavior. Here is a simplified version of the customer problem.

create table t1 (id number);
insert into t1 values (1000);
commit;

create global temporary table tempstage (col1 number) 
on commit preserve rows;

create or replace function f_test
return number
parallel_enable
is
v_var number;
begin
select col1 into v_var from tempstage;
return v_var;
end;
/

Here we have a regular table T1 , a temporary table TEMPSTAGE , and a parallel enabled PL/SQL function F_TEST that queries the temporary table. Let's populate the temporary table, and compare the results of a parallel and a serial query using the function as a predicate.

SQL> insert into tempstage values (100);

1 row created.

SQL> commit;

Commit complete.

SQL> select /*+ parallel(2) */ * from t1 where id>f_test;

no rows selected

SQL> select * from t1 where id>f_test;

        ID
----------
      1000

Things do not look good for the parallel query here, it returned wrong results to the user. This is because the function is declared as safe to be executed by individual PX servers. Each PX server uses its own session and as a result they cannot see the data populated by the user session. This is different than the previous example where the query was running against the temporary table, the QC in that case knew a temporary table was involved, here it only sees a function call which is parallel enabled.

So, be careful when declaring functions as parallel enabled, be aware that the function will be executed by PX servers which can cause some unintended behavior.Think about how the function can behave when executed by multiple sessions and processes. Only declare it as parallel enabled when you are sure it is safe.





About List