In-Memory Store – Push Down Optimization
October 23, 2015 Leave a comment
Last week, I concluded my first ever Event in Gurgaon. This was a full day event on Performance Tuning. During the session, on Query Transformation, I mentioned about “Testing rather than Believing”. The rationale behind this is : It is not always guaranteed that the test case and the transformation discussed will be reproduced in your environment. It is not that the transformation and the test cases are wrong. It is just that there are environmental differences that can cause this. In that case, you would be able to work on the solution to make the transformation happen and therefore, testing will give you good enough knowledge of various transformations or Database Features / Options. My current blog in on one such Database feature : In-Memory Store – Push Down Predicate.”
In-Memory Store is introduced in 12.1.0.2 and is intended to read the required data from a new memory area, which is also a part of the SGA. This memory is sized by way of INMEMORY_SIZE database parameter. In-Memory Store caches the data in a Columnar format as against Row format used by our traditional Buffer Cache. There are many benefits to it. Buffer Cache is still used for our OLTP applications, whereas, queries processing large data and few columns can benefit from In-Memory Store. Biggest advantage being, both these can be implemented on a Single Database thus requiring no complex Data Transformation.
While working on some of the features of In-Memory, I came across Push-Down Optimization. This optimization pushes the predicates, aggregations and group-bys to the access layer i.e.at the time of scanning the column or group of columns, returning just the small subset of data to the query layer. The number of rows returned to the Query layer depends on the number of In-Memory Compression Unit. Thus, the amount of data to be process by the Query layer is reduced drastically making the queries much more efficient and faster. At this point, I will direct you to In-Memory Blogs which is maintained by Oracle Development Folks. The explanation in this blog is self explanatory, therefore, I would not publish this in my blog. I don’t want to make a copy of well-written blog:). The idea behind this blog post is to let the readers know of the issue that I faced while testing this.
As mentioned earlier, I test whatever is published and this gives me a better understanding of the feature. While testing this, I could reproduce the results mentioned in this Blog Post.
For this, I created my own LINEORDERS table and executed the queries mentioned in the blog. The results are published as under :
## STATS table to hold the Statistics from V$MYSTAT
create global temporary table stats on commit delete rows as
select s.sid, n.name, s.value
from v$mystat s, v$statname n
where s.statistic#=n.statistic#
and 1=2;
CREATE TABLE LINEORDER
( LO_ORDERID NUMBER,
LO_PRODUCTNAME VARCHAR2(128) NOT NULL ENABLE,
LO_SHIPQTY NUMBER,
LO_ORDERVALUE NUMBER,
LO_SHIPMODE VARCHAR2(4)
)
TABLESPACE USERS
INMEMORY PRIORITY CRITICAL MEMCOMPRESS FOR QUERY LOW
DISTRIBUTE AUTO NO DUPLICATE
NO INMEMORY (LO_ORDERID)
NO INMEMORY (LO_PRODUCTNAME);
insert into lineorder
select rownum lo_orderid, a.object_name lo_productname,
round(dbms_random.value(1,100),0) lo_shipqty,
round(dbms_random.value(1000,10000),2) lo_ordervalue,
case when mod(rownum,3)=0 then 'AIR'
when mod(rownum,7)=0 then 'SHIP'
when mod(rownum,10)=0 then 'RAIL'
else 'ROAD' end lo_shipmode
from all_objects a, all_objects b
where rownum<=1e+7;
commit;
exec dbms_stats.gather_table_stats(user,'LINEORDER'); SQL> select num_rows, blocks, inmemory from dba_tables where table_name='LINEORDER'; NUM_ROWS BLOCKS INMEMORY ---------- ---------- -------- 10000000 41717 ENABLED SQL> select segment_name, bytes, inmemory_size, bytes_not_populated, populate_status from v$im_segments; 2 SEGMENT_NAME BYTES INMEMORY_SIZE BYTES_NOT_POPULATED POPULATE_ ------------------------------ ---------- ------------- ------------------- --------- LINEORDER 343932928 133562368 0 COMPLETED
On my laptop, I have 4 GB memory and therefore, had to restrict my SGA Size. The tablespace size is around 343MB. I have sized my In-Memory Store to 200MB and therefore, the table was created with PCFTREE as 1, so as to, have this table as small as possible. With default PCTFREE, the entire table is not populated into the Store.
Now, lets start our test, execute the queries as per the blog and validate the results.
SQL> insert into stats select s.sid, n.name, s.value from v$mystat s, v$statname n where s.statistic#=n.statistic# and n.name like 'IM%'; 211 rows created. SQL> select /*+ VIVEK_IMCU */ lo_shipmode, count(*) from lineorder group by lo_shipmode; LO_S COUNT(*) ---- ---------- RAIL 571429 SHIP 952381 ROAD 5142857 AIR 3333333 SQL>select a.name, m.value - a.value value from v$mystat m, stats a, v$statname b where a.name = b.name and m.statistic#=b.statistic# and m.value - a.value >0 order by 1; NAME VALUE ---------------------------------------------------------------- ---------- IM scan CUs columns accessed 21 IM scan CUs columns theoretical max 105 IM scan CUs memcompress for query low 21 IM scan CUs no cleanout 21 IM scan CUs split pieces 23 IM scan bytes in-memory 116692649 IM scan bytes uncompressed 126565381 IM scan rows 10000000 IM scan rows projected 10000000 IM scan rows valid 10000000
As per the Blog, the value for “IM scan rows projected” statistics should have been 84 (in my case). The statistics show that the query accessed 21 Compression Units (IM scan CUs memcompress for query low). I have 4 values for lo_shipmode i.e AIR, SHIP, ROAD & RAIL. So, this make 21 x 4 = 84. However, the statistics “IM scan rows projected”, from my testing, is the number of rows in the table (10 Million). Why is this discrepancy ? Is my test case wrong ?
I executed other queries as well and the results were the same i.e.I could not reproduce the statistics mentioned in the blog. I was sure that there is some mismatch in the configuration. Usually, for any such issues, customers are advised to be on latest Bundle Patch. Therefore, I applied Bundle Patch 10 as well, but the results were same. It took a while to investigate the cause of this.
I reported this to the author of the blog and they immediately started investigation on this. For analysis, I had sent the table creation script, along with the run time execution plan. Interestingly, the issue was drilled down to database parameter setting STATISTICS_LEVEL. This setting on my database was ALL. With default or TYPICAL, I was able to reproduce the results as per the blog.
SQL> show parameter statistics_level NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ statistics_level string ALL SQL> alter session set statistics_level=typical; Session altered. SQL> insert into stats select s.sid, n.name, s.value from v$mystat s, v$statname n where s.statistic#=n.statistic# and n.name like 'IM%'; 211 rows created. SQL> select /*+ VIVEK_IMCU */ lo_shipmode, count(*) from lineorder group by lo_shipmode; LO_S COUNT(*) ---- ---------- RAIL 571429 SHIP 952381 ROAD 5142857 AIR 3333333 SQL> select a.name, m.value - a.value value from v$mystat m, stats a, v$statname b where a.name = b.name and m.statistic#=b.statistic# and m.value - a.value >0 order by 1; NAME VALUE ---------------------------------------------------------------- ---------- IM scan CUs columns accessed 21 IM scan CUs columns theoretical max 105 IM scan CUs memcompress for query low 21 IM scan CUs no cleanout 21 IM scan CUs split pieces 23 IM scan bytes in-memory 116692649 IM scan bytes uncompressed 126565381 IM scan rows 10000000 IM scan rows projected 84 IM scan rows valid 10000000 10 rows selected.
To summarize, STATISTICS_LEVEL=ALL causes a problem for Push Down Optimization. BUG has been filed for this issue and the resolution/fix is expected soon. Will keep you posted.
The idea behind this blog was just to let the readers know the importance of Testing before concluding. Each Database setup is different. A minor change can cause a change in the behaviour and it will be in the interest of the readers to investigate this change.