Autonomous Database ! ADW and ATP. Why do I need both ?

Autonomous Data Warehouse (ADW) was announced in March 2018 and then Autonomous Transaction Processing (ATP) is the latest addition to the Autonomous Database family. ATP is for Mixed Workload and therefore provides 4 Service Names. These are :

HIGH :Highest level of CPU and IO Resources. Concurrency with this Service Name depends upon the number of OCPUs and will scale accordingly.

MEDIUM :Lowest level of CPU and IO Resources. Concurrency depends on the number of CPU’s and will scale accordingly.

LOW :Least amount of CPU and IO Resources. Again, Concurrency depends on the number of CPU’s and the Scaling is directly proportional to the number of OCPU’s.

ADW provides these three Service Names as well. However, unlike ADW (see my previous blog), the definition of these Services are different. In ADW, HIGH provides highest level of Resources but less concurrency, whereas, in ATP, HIGH do provide highest level of Resources but also provides Highest level of Concurrency. Since ATP is designed for Mixed Workload, it provides an additional Service Name PARALLEL. It provides high level of resources but lowest concurrency and is useful for Batch Processes or Heavy Reporting.

Now, many of the technical folks have raised this question on “Why do we need Both? Can’t we use ATP to take care of Data Warehouse kind of a load?”. ATP very well does that by providing an option to run Mixed Workload. However, there are many implementations that are purely Data Warehouse. The problem is that the amount of data that we generate, Warehouse Databases have grown massive in size supporting large amount of complex queries involving many nested joins and expecting sub-second response time. Further, these Databases house data from many different sources. Optimization Strategy for an OLTP and Data Warehouses are completely different. Therefore, an OLTP kind of a setup cannot (in most of the cases) provide the response time needed for DWH Queries.

Row Store v/s Columnar Store

Now, coming back to ATP v/s ADW argument. In an ADW, the data loaded is by default compressed to Oracle Hybrid Columnar Compression, whereas ATP uses a Row Store Format. What this means is that in an ADW, the data stored at disk is in a Columnar format, which is a proven format for speeding up Analytics kind of a Workload where we query few columns. All Data Warehouse Implementations, like Redshift, Google Big Query etc. store data in Columnar format. On the other hand, ATP stores the data in Row format which is ideal for OLTP that demands Single or few row(s) updates or deletes or queries small number of rows.

Since the data in ADW is stored Compressed in Columnar format, it allows scanning of Billions of Rows per CPU cycle. The data for a particular column is stored contiguously making it possible to scan through large number of rows in a single CPU cycles. This technique is called SIMD (Single Instruction Multiple Dataset) processing. To meet a sub-second response time, this technique is plays a significant role.

Such large scans (scanning billion of rows per second) is not possible with Row Store and one of the reasons for this can be found in one of my blog published in 2015 which is still very much relevant. Skipping of Columns consume CPU Cycles which is demonstrated in the blog post that can be found here.

Through this short blog, I wanted to clear the confusion around ADW and ATP. Both have their own use cases and thus are optimized to manage different workloads. Both of these Autonomous Databases have some exciting features. Stay tuned for more on these.

Autonomous Database Tech Day ! Gurgaon

 

#Autonomous #AIOUG Presenting a Full Day Event on 8th September 2018 for North India Oracle User Group in Gurgaon. This is on Oracle Autonomous Database. Would be covering some interesting technical capabilities of Autonomous Databases. I am covering the 2 offerings i.e. Autonomous Data Warehouse and Autonomous Transaction Processing. For Registration, click on the following link :

Meraevents Link

This being an Oracle User Group Session, focus would be on the Technical Capabilities of ADW / ATP, like Parallel Processing, Concurrency, Optimizer Enhancements and Behaviour and most importantly, Competition.

So, North India Folks : See you all on 8th September 2018.