PL/SQL Parallelism made easier in 11g

Today was the first day of the two day AIOUG Event Sangam 2010. Jonathan Lewis was the Key Speaker this year and I presented a session on “Developing a Scalable Application”. One of my slide mentioned about PL/SQL Parallelism. This is in continuation to my previous blog where I posted about Manual Parallelism that takes the ROWID’s of the table and distributes it across multiple chunks. The limitation with Manual Parallelism is that even though a pl/sql block needs to process 5% to 10% of the rows from a huge table, the Manual Parallelism will select and split the ROWIDs of the entire table. This means, many of the chunks may process very less data or almost zero data.

Oracle Database 11g Release 2 introduced a package dbms_parallel_execute, that makes it easier to split a table into multiple chunks, based on ROWID’s or SQL or by Number Column. During my AIOUG session, I demonstrated the use of this package to split a table into multiple chunks and schedule a batch process in parallel. The best part of this package is that the chunks can be split based on a condition and only the chunks that contain to-be-processed data will be created. Thus the rows to be processed by each of the chunks will have uniform distribution.

As mentioned in my earlier blog, I will reiterate the benefit of pl/sql parallelism, which is, optimal use of available resources. In this blog, I will post the test case that I executed during the AIOUG session and this should help you reproduce this in your environment.

In this example, consider a Nightly Batch Job NIGHTLY_JOB and this job processes the rows from VIVEK_TEST Table using an approach which is widely implemented by the developers or the dba’s. This is single job processing the required rows and thus is a very slow and an inefficient approach. There are 7200 rows in this table to be processed and assuming that the business logic in the pl/sql takes atleast 1 second and then moves on to another row. Which means, at a minimum, each row to be processed will take atleast 1 second, thus the entire process will take atleast 7200 second (2 Hours). This is the code of a batch process :

create table vivek_test pctfree 50 pctused 50 as
select * from all_objects a
where rownum=1000 then
    commit;
    l_limit:=0;
  end if;
  end loop;
end;
/
exec nightly_batch;

Once the job is executed, you can monitor the progress of this batch process from another session. V$SESSION will report the progress in MODULE column, as posted by dbms_application_info procedure.

The changes required to parallize the process is pasted below. To monitor the progress and maintain the log of each of the CHUNKS, I have created a process_status table. One thing worth noticing is the PCTFREE and PCTUSED setting, which is set to 50. The reason I have this setting is that the number of chunks to be created is dependant on the number of extents in the table and I wanted multiple extents for my table. In a production environment, number of extents should not be an issue.

create table process_status (
process_name		varchar2(30),
job_id			number,
s_rowid			rowid,
e_rowid			rowid,
start_time		date,
end_time		date,
processed_rows		number,
completed_yn		varchar2(1),
status			varchar2(1000));

create or replace procedure parallel_job(chks in number) is
l_statement	varchar2(1000);
l_jobno		number;
l_retry		number;
l_status	   number;
l_chunkid	   number;
l_srowid	   rowid;
l_erowid	   rowid;
l_anyrows   	boolean;
l_errcode	   pls_integer;
begin

  execute immediate 'truncate table process_status';
  /* CREATE A PARALLEL TASK */
  dbms_parallel_execute.create_task('NIGHT_JOB');

  dbms_parallel_execute.create_chunks_by_rowid('NIGHT_JOB','VIVEK','VIVEK_TEST',true, chks);
 /* INSERT THE CHUNKS INTO PROCESS_STATUS TO LOG THE STATUS OF EACH JOB */
  insert into process_status(process_name, job_id, s_rowid, e_rowid)
  select 'NIGHT_JOB', chunk_id, start_rowid, end_rowid from user_parallel_execute_chunks
  where task_name='NIGHT_JOB' order by 1;
  commit;
  /* Run the Procedure here */
  LOOP
      -- Get a chunk to process; if there is nothing to process, then exit the 
      -- loop;
      DBMS_PARALLEL_EXECUTE.GET_ROWID_CHUNK('NIGHT_JOB',l_chunkid, l_srowid, l_erowid,l_anyrows);
      IF (l_anyrows = false) THEN EXIT; END IF;
      BEGIN
	l_statement:='nightly_batch('||l_chunkid||','||''''||l_srowid||''''||','||''''||l_erowid||''''||');';
	dbms_job.submit(l_jobno,l_statement);
	update process_status set completed_yn='P', start_time=sysdate where job_id=l_chunkid;
	commit;
        DBMS_PARALLEL_EXECUTE.SET_CHUNK_STATUS('NIGHT_JOB',l_chunkid,DBMS_PARALLEL_EXECUTE.PROCESSED);
      EXCEPTION WHEN OTHERS THEN
        DBMS_PARALLEL_EXECUTE.SET_CHUNK_STATUS('NIGHT_JOB', l_chunkid,
          DBMS_PARALLEL_EXECUTE.PROCESSED_WITH_ERROR, SQLCODE, SQLERRM);
	  l_errcode:=sqlcode;
  	update process_status set completed_yn='F', end_time=sysdate, status='FAILED WITH ERROR '||l_errcode||'.' where job_id=l_chunkid;
      END;
  commit;
  END LOOP;
  dbms_parallel_execute.drop_task('NIGHT_JOB');
end;
/

The only change required in our NIGHTLY_BATCH is the introduction of INPUT parameters and update of the process_status at the completion or failure of a chunk.

create or replace procedure nightly_batch(jobid number, low_rowid rowid, high_rowid rowid) is
l_count		number:=0;
l_module	    varchar2(30);
l_limit		number:=0;
l_errcode	    pls_integer;
cursor c_main is
select rowid, owner, object_name, object_type from vivek_test
where nvl(processed_flag,'N')='N'
and	rowid between low_rowid and high_rowid;
begin
  for i in c_main
  loop
  l_count:=l_count+1;
  l_limit:=l_limit+1;
  l_module:=l_count||' out of 7200/40=180';
  dbms_application_info.set_module('PL_SQL',l_module);
  dbms_lock.sleep(1); /* Assuming Processing of a row takes 1 second */
  update vivek_test set processed_flag='Y' where rowid = i.rowid;
  if l_limit>=1000 then
    commit;
    l_limit:=0;
  end if;
  end loop;
  update process_status set completed_yn='Y', end_time=sysdate, processed_rows=l_count 
  where job_id = jobid;
  commit;
  exception when others then
  l_errcode:=sqlcode;
  update process_status set completed_yn='F', status='FAILED WITH '||l_errcode||'.' where job_id = jobid;
  commit;
end;
/
exec parallel_job(100);

Monitor the progress of this job and you would see a significant improvement as the job completes in less than 5 minutes, as against 2 Hours, and utilizes the hardware resources optimally.

Manual Parallelism – Rowid Based for Partitioned / Sub Partitioned Tables

Currently, I am working on an optimization project where few Scheduler Jobs are Scheduled and Run every night at 9:00 pm using dbms_scheduler. These jobs are then stopped every day morning 9:00 a:m, so that, these do not impact the OLTP sessions. Customer has implemented VPD and each of these job is scheduled from multiple users, so that, the records are processed based on the policy function and the access predicates for that user. The flaw with this logic is that the total rows to be processed across each of the users are not uniform, which means, if a job for all the users takes 45-50 minutes, one user processing takes almost 3-4 hours.

For optimizing these, the first thought came to the mind was to distribute the rows to be processed uniformly across each of the jobs and then to apply vpd predicates at runtime. This required splitting of the main processing cursor into multiple chunks. Therefore, the table involved in the main cursor query was taken out with an intention to use ROWID Manual Parallelism, introduced by Thomas Kyte. This would help us distribute the rows uniformly across each of the jobs with an efficient use of available resources. I demonstrated a case study on one of my previous blog, where I used this to purge data from a table. The link is as under :

I have come across many pl/sql procedures that processes data and take certain amount of time that can be brought down. Moreover, whenever these procedures are scheduled or run, the CPU Utilization is very less. I remember a discussion with a customer, wherein, they mentioned that one of my “Annual Interest Calculation” job is running very slow and the CPU is 90 to 95% IDLE. This is an expected behaviour as a job, when run as a single process, will be scheduled on only one Processor and will not take the advantage of available resources. Therefore, splitting the job into multiple chunks should help in reducing the process completion time by way of better resource utilization.

As an example, a pl/sql block, as mentioned below, can be optimized and run in parallel. If a single job process 10000 Rows and takes 30 minutes, running it into 10 parallel stream should bring it down to 3-5 minutes.

create procedure test_job as
cursor main is
select column1, column2, column3
from	test
where x=1
and   y=2;
begin
  for i in main
   loop
     some processing;
   end loop;
end;


create procedure test_job(job_id number, low_rid rowid, hi_rid rowid) as
cursor main is
select column1, column2, column3
from	test
where x=1
and   y=2
and   rowid between low_rid and hi_rid;
begin
  for i in main
   loop
     some processing;
   end loop;
end;

This change may require minor changes in the procedure which is worth implementing.

Back to the Original case study. As mentioned earlier, for splitting the rows into multiple chunks, I executed the Query on dba_extents and dba_objects, to get the Start and End ROWID based on the number of chunks. This Query usually takes 2-3 minutes, but was taking almost 45-50 minutes and had to be cancelled. I checked the plan of this Query and it was doing a Merge Join Cartesian. Further digging into the issue, I could notice that the table which was to be splitted was a partitioned and subpartitioned table. This was for the first time I faced this issue and therefore had to modify the logic of the Query to accomodate Partitioned and Sub-Partitioned Objects.

I created a pl/sql block to check whether the table is a Non-Partitioned or Partitioned or Sub-Partitioned and based on this, a loop is executed. Below is the code that accomplished this task for me. An explanation on this is as under :

  1. Input Values are Table_Owner, Process_name, Table_name.
  2. L_CHKS, in my case, is a Global Variable declared in Package specification and defaults to the number of chunks to be created.
  3. For a Partitioned or Sub-Partitioned Objects, the number of chunks will be more than L_CHKS, therefore, at the end of the loop, these are updated again using mod function.
  procedure manual_parallelism(t_owner varchar2, p_name varchar2, t_name varchar2) as
   /* l_chks is dynamic and should be based on available CPU's and is standard for all the jobs */
   /* This value can be changed at later stage if the hardware is upgraded */
   l_part	varchar2(3);
   l_subpart	varchar2(1);
   part_name	varchar2(30);
   l_dataobjectid number;
   cursor partitioned(t_owner varchar2, t_name varchar2) is
   select partition_name from dba_tab_partitions
   where table_owner = t_owner
   and   table_name = t_name
   order by partition_position desc;

   cursor sub_partitioned(t_owner varchar2, t_name varchar2, part_name varchar2) is
   select subpartition_name from dba_tab_subpartitions s
   where s.table_owner = t_owner
   and   s.table_name = t_name
   and   s.partition_name = part_name;
  begin
   /* Assumption in this case is that at any given point in time l_chks Jobs can be easily scheduled */
     select partitioned into l_part from dba_tables where owner=t_owner and table_name=t_name;
     if l_part='YES' then
       begin
	   select 'Y' into l_subpart from dba_tab_subpartitions 
	   where	table_owner=t_owner 
	   and		table_name=t_name
	   and		rownum=1;
       exception when no_data_found then
	   l_subpart:='N';
       end;
       if l_subpart='Y' then /* if Sub Partitioned */
	   for i in partitioned(t_owner, t_name)
	   loop
	     for j in sub_partitioned(t_owner, t_name, i.partition_name)
	     loop
	     insert into ast.process_status(process_name, job_id, s_rowid, e_rowid)
	     select p_name, grp,
  		   dbms_rowid.rowid_create( 1, data_object_id, lo_fno, lo_block, 0 ) min_rid,
		   dbms_rowid.rowid_create( 1, data_object_id, hi_fno, hi_block, 10000 ) max_rid
	     from (
		   select distinct grp,
		   first_value(relative_fno) over (partition by grp order by relative_fno, 
		   block_id rows between unbounded preceding and unbounded following) lo_fno,
		   first_value(block_id) over (partition by grp order by relative_fno, 
		   block_id rows between unbounded preceding and unbounded following) lo_block,
		   last_value(relative_fno) over (partition by grp order by relative_fno, 
		   block_id rows between unbounded preceding and unbounded following) hi_fno,
		   last_value(block_id+blocks-1) over (partition by grp order by relative_fno, 
		   block_id rows between unbounded preceding and unbounded following) hi_block,
		   sum(blocks) over (partition by grp) sum_blocks
		   from (
		   select	relative_fno,
				block_id,
				blocks,
				trunc( (sum(blocks) over (order by relative_fno, block_id)-0.01) /
				(sum(blocks) over ()/l_chks) ) grp
		   from		dba_extents
		   where	segment_name = t_name
		   and		owner = t_owner
		   and		partition_name=j.subpartition_name
		   order by block_id)
		   ),
	     (select data_object_id from dba_objects 
	     where	object_name = t_name 
	     and	subobject_name=j.subpartition_name 
	     and	owner=t_owner);
	     end loop; /* End loop of Sub Partitioned cursor */
	   end loop; /* End loop of Partitioned Cursor */
	   commit;
       else /* it means only Partitioned no Sub Partitions */
	   for i in partitioned(t_owner, t_name)
	   loop
	     insert into ast.process_status(process_name, job_id, s_rowid, e_rowid)
	     select p_name, grp,
		   dbms_rowid.rowid_create( 1, data_object_id, lo_fno, lo_block, 0 ) min_rid,
		   dbms_rowid.rowid_create( 1, data_object_id, hi_fno, hi_block, 10000 ) max_rid
	     from (
		   select distinct grp,
		   first_value(relative_fno) over (partition by grp order by relative_fno, 
		   block_id rows between unbounded preceding and unbounded following) lo_fno,
		   first_value(block_id) over (partition by grp order by relative_fno, 
		   block_id rows between unbounded preceding and unbounded following) lo_block,
		   last_value(relative_fno) over (partition by grp order by relative_fno, 
		   block_id rows between unbounded preceding and unbounded following) hi_fno,
		   last_value(block_id+blocks-1) over (partition by grp order by relative_fno, 
		   block_id rows between unbounded preceding and unbounded following) hi_block,
		   sum(blocks) over (partition by grp) sum_blocks
		   from (
		   select	relative_fno,
				block_id,
				blocks,
				trunc( (sum(blocks) over (order by relative_fno, block_id)-0.01) /
				(sum(blocks) over ()/l_chks) ) grp
		   from		dba_extents
		   where	segment_name = t_name
		   and		owner = t_owner
		   and		partition_name=i.partition_name
		   order by block_id)
		   ),
	     (select data_object_id from dba_objects 
	     where	object_name = t_name 
	     and	subobject_name=i.partition_name 
	     and	owner=t_owner);
	   end loop; /* End loop of Partitioned Cursor */
	   commit;
       end if; /* End if of l_Subpart='Y' */
     else /* Insert Statement for Non-Partitioned Object */ 
  	   insert into process_status(process_name, job_id, s_rowid, e_rowid)
	   select p_name, grp, 
		   dbms_rowid.rowid_create(1,data_object_id, lo_fno, lo_block,0) min_rid,
		   dbms_rowid.rowid_create(1,data_object_id, hi_fno, hi_block,0) max_rid
	   from (
	   select 	distinct grp,
			first_value(relative_fno) over (partition by grp order by relative_fno, 
			block_id rows between unbounded preceding and unbounded following) lo_fno,
			first_value(block_id) over (partition by grp order by relative_fno, 
			block_id rows between unbounded preceding and unbounded following) lo_block,
			last_value(relative_fno) over (partition by grp order by relative_fno, 
			block_id rows between unbounded preceding and unbounded following) hi_fno,
			last_value(block_id) over (partition by grp order by relative_fno, block_id 
			rows between unbounded preceding and unbounded following) hi_block,
			sum(blocks) over (partition by grp) sum_blocks from (
	   select 	relative_fno, block_id, blocks, 
			sum(blocks) over (order by relative_fno, block_id) cum_blocks,
			sum(blocks) over () tot_blocks,
			trunc((sum(blocks) over (order by relative_fno, block_id)-0.01) /
			(sum(blocks) over ()/l_chks)) grp
	   from		dba_extents
	   where 	segment_name=t_name
	   and		owner=t_owner
	   order by relative_fno, block_id)),
	   (select data_object_id from dba_objects where owner=t_owner and object_name=t_name);
	   commit;
     end if; /* END IF of L_PART='Y' */
     update ast.process_status set job_id=mod(rownum,l_chks) where process_name=p_name;
     commit;
  end manual_parallelism;

You can define the value of L_CHKS in the procedure and the procedure will split the table into multiple chunks with minimum group as 0 and maximum group as LCHKS-1. Once these chunks are created, additional intelligence can be built to schedule these L_CHKS jobs and spawn the new jobs based on the completion status of the running job. The Inputs to the Scheduled procedure will be the the Job_ID, which is GRP in our case, Starting Rowid, Ending Rowid, Process_Name. In my case, I scheduled 80 jobs at a time and wrote a logic at the end of the scheduled procedure wherein the process immediately starts the next job with the same job_id, which means, if a process with job_id 1 completes, it will spawn another job with job_id as 1 and so on. This will ensure that at any given point of time, there are 80 jobs running and no manual intervention is required.

Once this pl/sql parallelism was implemented, the total completion time of the process was around 30-35 minutes. Further, ROWID parallelism also ensured that there was no Block Level Contention between each of the jobs thus improving the performance.

Oracle 11g provides a package to accomplish this very easily using dbms_parallel_execute and developers may leverage the benefit of this readymade package to split a long running pl/sql block in parallel, thus improving the performance.

Rowid based Parallelism (Manual) v/s Oracle Parallelism !

This blog was one of my most appreciated writeup, with a good amount of hits. Since, my previous blog url is unaccessible and I want my readers to have access to this blog, thought of re-posting it again. Another reason, to re-post this blog, is to make the dba’s, and more importantly, the developers aware of a very effective and failure free strategy, when it comes to accessing a Huge Table.

A background on a recent Purging Activity (Reason behing re-posting my previous blog)
On 17th August, I wrote on “Efficient Purging”, which discussed on purging data from a 60GB table. This purging was achieved in a very efficient and timely manner during my visit to a Customer in July 2009. Recently, came across a similar purging activity from a partitioned table of 75 GB in size. The strategy that I used, in this case, was Approach 1, as mentioned in my previous blog. Reason being, rows from all the partitions were to be accessed and therefore, CTAS seemed to be a better approach here. Since, I worked on two purging activities, blogs on these two topic were worth posting.

The recent purging activity was for one of my customer. August 15th, being a National Holiday in India, and this year, it coincided with a long weekend, the customer took a downtime of 24 hours to purge the data from some set of tables. At a very last moment, changing the strategy seemed to be non-feasible, therefore, customer went ahead with the pre-decided strategy of NORMAL Delete. While 90% of the tables were small in size, 8-9 tables were huge, especially one table that was 75 GB in size. This was a partitioned table. The biggest challenge, in this activity was, maintaining Data Consistency. This means, that once the activity starts and data from some of the tables is purged and commited, then the activity is either to be completed 100% or rolledback. Therefore, before the activity, backup of the entire database was taken.

Once you know the strategy used by the application team to get the data purged, you will get the link between my previous blog and this.

Strategy used and planned by the Application Team
The strategy planned by the Application Team for the huge tables were :

  1. Populate 5 Tables with the ROWID of the rows to be Purged.
  2. Drop all the Indexes from the to-be-purged table.
  3. Delete from the Table from 5 different sessions.
  4. Re-Org the Table and Create all required Indexes.

Each of the 5 tables, as mentioned in Step 1 above, were populated with “Total Rows to be Purged” / 5 number of rows, with a motive of achieving parallelism. These tables were populated before the downtime. Once these tables were populated, the DELETE statement was executed from 5 different sessions. Clearly, this logic is very resource intensive, time consuming and an inefficient way of achieving the goal, reasone being :

  1. The to-be-purged table, a huge table, is scanned five times.
  2. Each of these scans were Full Table Scans.
  3. Delete generates heavy redo and is not an efficient way to delete large number of rows.
  4. Table re-org and Index Creation will be required after the activity.

The most inefficient step was Full Table Scan of a huge table, and this is where the activity was once at a stage, where it was on a verge of roll back. The problem occured for one table of 75 GB and at this stage, Approach 1 i.e. Creating a New Table with the data to be retained, was adopted. The DELETE statements, from 5 different sessions, that ran for almost 4-5 hours, completed in less than 2 hours using CTAS approach.

Once the activity was over, customer requested a so-called learnings meeting to discuss, Why and Where the activity went wrong ? While the conceptual issues with NORMAL Delete, as mentioned in my previous and this blog, was explained, one step that took most of the time and seems to be one of the reason for this failure, was the way parallelism was adopted by the Application Team. This was during this review discussion that the ROWID Parallelism was explained to the Application Team as an efficient way of spawning multiple sessions. The benefit of ROWID parallelism is that it splits huge table into smaller chunks and each session works on its own chunk. This would have been also a better choice, but needs to be adopted based on the resources available. If I am short of resources, then, instead of using Oracle Parallelism, manual parallelism would be a better choice as it eliminates the chances of any failure. This is well explained in my ROWID based Parallelism blog posted on my previous Blog URL.

For the benefit of my readers, I am again posting this on this site and it explains the way I efficiently used this on a huge table and on a low capacity server.

The Extract from my BLOG

Recently, I had come across an interesting challenge of purging data from a 500gb table. This was not a plain PURGE but was based on a join and filtering on two other tables. The reason I term it as a challenge is because the database was on a low end server with 12 CPU and 52 GB RAM.The Original Query used for purging is as under :

create table wia_values_new_1 nologging parallel 24
tablespace A_TS_IDX as
select a.*from wia_values a,
(select a.it_type ,a.it_key ,b.name from wi a, wia b
where a.begin_date > ’01-NOV-2007′ and a.it_type=b.it_type) vv
where a.it_type=vv.it_type
and a.it_key=vv.it_key
and a.name=vv.name;

To accomplish this task, it was obvious to run the queries with parallel processes. On a 12 CPU Machine, the Nested Loop Join took more than 10-12 hours and had to be aborted as this much time was unacceptable. With a HASH Join, the query used to fail with UNABLE TO Extent Temporary Segments after 3-4 hours of execution. This means, restarting of the entire process.

Oracle 9i onwards, one can use dbms_resumable to take care of these errors. Once fixed, the process will continue from the point where it failed.

Finally, I recollected a very good methodology, called as ROWID Parallelism, introduced by Thomas Kyte. I read this long back in his (and one of my favourite) book “Effective Oracle by Design”, but have practically used and implemented this for the first time. Believe me, with this parallelism, I got a the tremendous performance benefit and could accomplish this challenging task in less than 2 hours with great ease.

Unfortunately, I could not use these on production because due to the challenge involved, the customer provided us with additional CPU’s and Memory, and a NESTED Loop query did the purge in 3 hours. But, based on the comparision, these rowid parallelism could have achieved the task in less than an hour and half.

The steps that I performed were as under :

  1. Split the table into multiple chunks (based on the number of rowid parallelism) that would be used. In my case, the extents of 500 GB table were scattered across 208 datafiles, hence, I splitted the table into 200 chunks.
  2. Create a Job_Table, to store the low and high rowids of each chunk. This table will store some additional data that can be used later.
  3. Spawn and Schedule parallel jobs using dbms_job. Each job will process its own chunk.
  4. Merge the data into one single table.

The advantage of using this parallelism is :

  1. Easier to split the table into multiple chunks.
  2. Easier to manage, based on the CPU Utilization.
  3. If anyone of the script or job fails, it is easier to restart that single job rather than restarting the entire process all again.

These are the queries that were used to accomplish this purging task using ROWID Parallelism.

Job_Table

create table job_table(
job_id		int primary key,
lo_rid		rowid,
hi_rid		rowid,
completed_yn	varchar2(1),
rows_processed	number(20),
start_time	date,
end_time	date);

Spawning and storing of multiple chunks into job_table

insert into job_table(job_id, lo_rid, hi_rid)
select  grp, dbms_rowid.rowid_create(1,data_object_id, lo_fno, lo_block,0) min_rid, 
	dbms_rowid.rowid_create(1,data_object_id, hi_fno, hi_block,0) max_rid
from	(select  distinct grp, first_value(relative_fno) over (partition by grp order by relative_fno, 
	block_id rows  between unbounded preceding and unbounded following) lo_fno, 
	first_value(block_id) over (partition by grp order by relative_fno, 
	block_id rows  between unbounded preceding and unbounded following) lo_block, 
	last_value(relative_fno) over (partition by grp order by relative_fno, 
	block_id rows  between unbounded preceding and unbounded following) hi_fno, 
	last_value(block_id) over (partition by grp order by relative_fno, 
	block_id rows  between unbounded preceding and unbounded following) hi_block, 
	sum(blocks) over (partition by grp) sum_blocks 
	from	(select  relative_fno, block_id, blocks, sum(blocks) over (order by relative_fno, block_id) cum_blocks, 
		sum(blocks) over () tot_blocks, trunc((sum(blocks) over (order by relative_fno, block_id)-0.01) / (sum(blocks) over ()/200)) grp
		from	dba_extents
		where	segment_name='WIA_VALUES'
		and	owner=user
		order by relative_fno, block_id)),
		(select data_object_id from dba_objects
		where	owner=user
		and	object_name='WIA_VALUES');
commit;

Now I have 200 rows in this table with low and high rowids that take care of the entire table.

GRP MIN_RID            MAX_RID
0   AABDksAALAAAAAJAAA AABDksAB2AAB9f5AAA         
1   AABDksAB2AAB9gZAAA AABDksACkAACSFJAAA         
2   AABDksACkAACSFZAAA AABDksADSAAAsxJAAA         
3   AABDksADSAAAsx5AAA AABDksADzAACT6pAAA         
4   AABDksADzAACT7JAAA AABDksAEkAAAADJAAA         
5   AABDksAEkAAAADpAAA AABDksAFMAAAKFZAAA         
6   AABDksAFMAAAKFpAAA AABDksAGmAAAYVJAAA         
7   AABDksAGmAAAYV5AAA AABDksAHTAAAnsJAAA         
8   AABDksAHTAAAnsZAAA AABDksAH6AACS0pAAA         
9   AABDksAH6AACS1JAAA AABDksAIeAABo65AAA        
10  AABDksAIeAABo+5AAA AABDksAJRAAAwmZAAA        
11  AABDksAJRAAAwnZAAA AABDksAJ1AACDppAAA

I created 200 tables in which, a parallel job will insert required rows into.

declare
l_statement varchar2(200);
begin
  for i in 0..199
  loop
  l_statement:='create table wia_new_'i' tablespace A_TS_IDX as select * from wia_values where 1=2';
  execute immediate l_statement;
  end loop;
end;

Then I created a Database Procedure that will be executed in parallel and the input value will be the job_id. Based on this input job_id, the insert script will pickup the low and high rowids and will do a rowid range scan of the table. Based on the status (completion or failure), the job_table will be updated.

create or replace procedure p_job(p_job in number) as
l_job_table  job_table%rowtype;
l_completed varchar2(1);
l_statement varchar2(2000);
l_rows  number(20):=0;
l_date  date;
begin
  select * into l_job_table from job_table
  where job_id=p_job
  and   nvl(completed_yn,'N')!='P'
  and	(completed_yn!='Y' or completed_yn is null);
  l_rows:=sql%rowcount;
  if l_rows>0 then 
    execute immediate 'alter session set db_file_multiblock_read_count=128'; 
    select sysdate into l_date from dual; 
    update job_table set completed_yn='P', start_time=l_date where job_id=p_job; 
    commit; 
    l_statement:='insert /*+ append */ into wia_new_'||p_job||' nologging select  a.* from  wia_values a, (select a.it_type ,a.it_key  ,b.name  from  wi a,wia b  where  a.begin_date > ''''''01-NOV-2007''''''  and  a.it_type=b.it_type) vv where  a.it_type=vv.it_type and  a.it_key=vv.it_key and  a.name=vv.name and a.rowid between'''''l_job_table.lo_rid''''' and'''''l_job_table.hi_rid''''; 
    begin 
       execute immediate l_statement; 
       l_rows:=sql%rowcount; 
       select sysdate into l_date from dual; 
       update  job_table set  completed_yn='Y',  rows_processed=l_rows,  end_time=l_date where  job_id=p_job; 
       commit;  
       exception when others then  
       update  job_table  set  completed_yn='F',   start_time=null  where  job_id=p_job;  
       commit; 
    end;
  end if;
end;

DBMS_JOB is used to schedule jobs at regular interval and based on the CPU Utilization.

declare
l_jobno number;
BEGIN
  for i in 0..30
  loop   
  DBMS_JOB.SUBMIT(l_jobno,'p_job('i');', trunc(sysdate)+16/24+00/(24*60));
  end loop;   
  COMMIT;
END;

Based on the Utilization, I spawned 30 parallel jobs. Each Job took 5-7 minutes to complete a rowid range scan of wia_values. This I observed from v$session_longops. Once the rowid range scan was complete, hash join and insertion took another 10 minutes. Hence, one job took almost 20-25 minutes. Hence, I scheduled another set of 30 jobs at 10 minute interval.The overall start time and end time, as an example, I followed was
10:00 – 10:25 30 Jobs,
10:10 – 10:35 30 Jobs,
10:35 – 11:00 30 Jobs (After 100% Completion of Previous 60 Jobs).
10:45 – 11:10 30 Jobs,
11:10 – 11:35 30 Jobs (After 100% Completion of Previous 60 Jobs)
11:20 – 11:45 30 Jobs
11:45 – 12:10 20 Jobs (Remaining)

A Sample output after completion of all the scripts…

JOB_ID	LO_RID             HI_RID             C	PROCESSED	START_TIME		END_TIME
0	AABDksAALAAAAAJAAA AABDksAB2AAB9f5AAA Y	29429239	02-03-2008 16:00:06	02-03-2008 16:19:26         
1	AABDksAB2AAB9gZAAA AABDksACkAACSFJAAA Y	208119		02-03-2008 16:00:06	02-03-2008 16:15:51         
2	AABDksACkAACSFZAAA AABDksADSAAAsxJAAA Y	199466		02-03-2008 16:00:06	02-03-2008 16:13:23         
3	AABDksADSAAAsx5AAA AABDksADzAACT6pAAA Y	10776162	02-03-2008 16:00:06	02-03-2008 16:18:48         
4	AABDksADzAACT7JAAA AABDksAEkAAAADJAAA Y	18897287	02-03-2008 16:00:06	02-03-2008 16:19:03

Conclusion : Oracle Parallelism uses same rowid parallelism mechanism to accomplish a task, but manual parallelism works well when you are short of resources, like, in my case, CPU’s and Temporary Space. Though, these require some manual work, it makes a task easier and faster.

END OF THE BLOG