Thursday, September 25, 2008
Smart Scan is the key to Oracle Exadata and 'X'
I googled Exadata last night after Larry's keynote and it returned 934 hits; this morning it was in the 3000s, and right now it's 3440. Sure enough, anything Larry Ellison says or does becomes instant hype...
So what is it that makes Oracle Exadata storage or the Oracle HP Database Platform unique? It is the Smart Scan technology, which reduces the amount of data sent from the storage system to the database server. For a general description of the new technology, I suggest reading the white paper:
Oracle Database 11g for Data Warehousing and Business Intelligence
The new Storage Server returns the result set of the SQL query rather than entire tables. This reduces network bottlenecks and frees up database server resources. Oracle claims that, when analyzing data stored in data warehouses, the performance improvement can be 10X or more. What if you are still not satisfied? Well, Oracle's Advanced Compression can typically reduce the data volumes on the storage servers by 2-3 times. This will further improve performance on Exadata or the 'X' machine.
It requires Oracle Database 11g Enterprise Edition 220.127.116.11 or later for the database accessing Exadata storage. Let us look at the visual at the top of this post to see how Smart Scan reduces the data blocks sent to the database server. As Larry explained, a SQL query in a data warehouse report, especially one on an unindexed column, will do a FULL SCAN and return all blocks to the database server, whereas a SMART SCAN will return only the result set, reducing the number of blocks transmitted to the database. This will reduce the I/O load substantially. So the keys to Smart Scan are:
>Offload predicate evaluation
>Only return relevant rows and columns to host
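To make the two points above concrete, here is a minimal Python sketch that only simulates the idea. It is not Exadata code: the `CallRecord` type, the sample rows, and the `traditional_scan` / `smart_scan` functions are all made up for illustration. The point is simply where the filtering and column selection happen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallRecord:
    # Hypothetical row layout, invented for this illustration.
    customer_id: int
    region: str
    call_cost: float


# Hypothetical table contents held by the storage tier.
TABLE = [
    CallRecord(1, "east", 12.50),
    CallRecord(2, "west", 250.00),
    CallRecord(3, "east", 3.75),
    CallRecord(4, "north", 410.10),
]


def traditional_scan(table):
    """Storage ships every row; the database server filters afterwards."""
    return list(table)


def smart_scan(table, predicate, columns):
    """Storage evaluates the predicate and returns only the requested columns."""
    return [
        {col: getattr(row, col) for col in columns}
        for row in table
        if predicate(row)
    ]


# Traditional path: all rows cross the wire, then the server filters.
shipped_full = traditional_scan(TABLE)
result_on_server = [
    {"customer_id": r.customer_id} for r in shipped_full if r.call_cost > 200
]

# Offloaded path: only matching rows and the needed column cross the wire.
shipped_smart = smart_scan(TABLE, lambda r: r.call_cost > 200, ["customer_id"])

print(len(shipped_full), "rows shipped without offload")
print(len(shipped_smart), "rows shipped with offload")
```

Both paths produce the same answer; what changes is how much data has to travel from storage to the database server.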
SMART SCAN is totally transparent to SQL writers, so all current applications will work as is. What a relief! Let's take the example of a telecom provider that wants to know which customers are spending over $200 in a single call. Suppose the huge customer table is 1 TB, while this small set of customers occupies, say, 2 MB of space. Traditionally, the 1 TB would have to be searched for these customers, even if partition pruning and the like narrow it down, and a lot of blocks would have to be sent to the DB. Using SMART SCAN, the data reduction takes place in the storage sub-system itself.
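A quick back-of-the-envelope calculation shows the scale of that example. The sketch below assumes binary units (1 TB = 1024^4 bytes) and the worst case where the whole table would otherwise be shipped; the variable and function names are just for illustration.

```python
# Sizes from the telecom example, assuming binary units.
TABLE_BYTES = 1 * 1024**4   # 1 TB customer table
RESULT_BYTES = 2 * 1024**2  # roughly 2 MB of matching customers


def reduction_factor(scanned_bytes, returned_bytes):
    """How many times less data crosses the wire when only results are returned."""
    return scanned_bytes / returned_bytes


factor = reduction_factor(TABLE_BYTES, RESULT_BYTES)
print(f"{factor:,.0f}x less data sent to the database server")  # 524,288x
```

Note that the storage cells still have to read the table from disk; the saving here is in what travels between storage and the database server.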
Some of the other uses of SMART SCAN are:
>Join filtering, where star join filtering is performed within the Exadata storage cells: dimension table predicates are transformed into filters that are applied to the scan of the fact table
With SMART SCAN - Exadata Way!
Without SMART SCAN (traditional storage)
>Backups: I/O for incremental backups is much more efficient since only changed blocks are returned
>Create Tablespace (file creation): formatting of tablespace extents is done in the storage cells, which eliminates the I/O associated with the creation and writing of tablespace blocks
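The join filtering item in the list above can be pictured with a small Python sketch. This is only a simulation under my own assumptions: the `STORES` and `SALES` tables, the column names, and both helper functions are hypothetical, and a plain Python set stands in for whatever filter structure the storage cells actually use.

```python
# Hypothetical dimension table (small) and fact table (large in real life).
STORES = [
    {"store_id": 1, "state": "CA"},
    {"store_id": 2, "state": "NY"},
    {"store_id": 3, "state": "CA"},
]
SALES = [
    {"store_id": 1, "amount": 10.0},
    {"store_id": 2, "amount": 99.0},
    {"store_id": 3, "amount": 45.5},
    {"store_id": 2, "amount": 12.0},
]


def build_key_filter(dimension_rows, predicate, key):
    """Turn a dimension predicate into a set of join keys that satisfy it."""
    return {row[key] for row in dimension_rows if predicate(row)}


def scan_fact_with_filter(fact_rows, key, allowed_keys):
    """Storage-side fact scan that drops rows whose join key cannot match."""
    return [row for row in fact_rows if row[key] in allowed_keys]


# The predicate "state = 'CA'" on the dimension becomes a filter on the fact scan.
ca_store_ids = build_key_filter(STORES, lambda s: s["state"] == "CA", "store_id")
shipped_sales = scan_fact_with_filter(SALES, "store_id", ca_store_ids)

print(shipped_sales)
```

Only the fact rows that can possibly join are shipped to the database server, which then completes the join as usual.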
Smart Scans correctly handle complex cases including
Uncommitted data and locked rows
Regular expression searches AND