Thursday, September 25, 2008

Smart Scan is the key to Oracle Exadata and 'X'

I googled Exadata last night after Larry's keynote and it returned 934 hits; this morning it was in the 3,000s, and right now it's 3,440. Sure enough, anything that Larry Ellison has said or done in the past becomes instant hype...

So what is it that makes Oracle Exadata storage or the Oracle HP Database Platform unique? It is the Smart Scan technology, which reduces the amount of data sent from the storage system to the database server. For a general description of the new technology, I suggest reading the white paper:
Oracle Database 11g for Data Warehousing and Business Intelligence

The new Storage Server returns the result set of a SQL query rather than entire tables. This reduces network bottlenecks and frees up database server resources. Oracle claims that when analyzing data stored in data warehouses, the performance improvement can be 10X or more. What if you are still not satisfied? Well, Oracle's Advanced Compression can typically reduce the data volumes on the storage servers by 2-3 times. This will further improve performance on Exadata or the 'X' machine.

The database accessing Exadata storage must be Oracle Database 11g Enterprise Edition or later. Let us look at the visual at the top of this post to see how Smart Scan reduces the data blocks sent to the database server. As Larry explained, a SQL query in a data warehouse report, especially on an unindexed column, will do a FULL SCAN and return all blocks to the database server, whereas a SMART SCAN will return only the result set, reducing the number of blocks transmitted to the database. This will reduce the I/O load substantially. So the keys to Smart Scan are:
>Offload predicate evaluation
>Only return relevant rows and columns to host
>Join filtering
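The difference between the two scan styles can be illustrated with a small Python sketch. It is a toy model only, not Oracle's implementation: the in-memory "blocks", the function names and the sample call records are all assumptions made for illustration. Both scans produce the same answer; what changes is how many rows cross the wire to the database server.

```python
from typing import Callable

Row = dict[str, object]

def make_blocks(rows: list[Row], rows_per_block: int) -> list[list[Row]]:
    """Hypothetical storage layout: split rows into fixed-size blocks."""
    return [rows[i:i + rows_per_block] for i in range(0, len(rows), rows_per_block)]

def traditional_scan(blocks: list[list[Row]], predicate: Callable[[Row], bool],
                     columns: list[str]) -> tuple[list[Row], int]:
    """Storage ships every row of every block; the database server filters afterwards."""
    shipped_rows = [row for block in blocks for row in block]
    result = [{c: row[c] for c in columns} for row in shipped_rows if predicate(row)]
    return result, len(shipped_rows)

def smart_scan(blocks: list[list[Row]], predicate: Callable[[Row], bool],
               columns: list[str]) -> tuple[list[Row], int]:
    """Storage evaluates the predicate and projects columns before shipping."""
    shipped_rows = [{c: row[c] for c in columns}
                    for block in blocks for row in block if predicate(row)]
    return shipped_rows, len(shipped_rows)

# Made-up call records; the charge cycles through 0..249.
calls = [{"call_id": i, "customer": f"c{i % 50}", "charge": float(i % 250)}
         for i in range(1000)]
blocks = make_blocks(calls, rows_per_block=100)

def over_200(row: Row) -> bool:
    return row["charge"] > 200

full_result, full_shipped = traditional_scan(blocks, over_200, ["customer", "charge"])
smart_result, smart_shipped = smart_scan(blocks, over_200, ["customer", "charge"])
print(f"traditional shipped {full_shipped} rows, smart scan shipped {smart_shipped} rows")
```

In this toy data set only the matching rows (and only two of the three columns) leave the storage layer in the smart scan case.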

SMART SCAN is totally transparent to SQL writers, so all current applications will work as they are. What a relief! Let's take the example of a telecom provider that wants to know which customers are spending over $200 in a single call. If the huge customer table is 1 TB, this small set of customers actually occupies, say, 2 MB of space. Traditionally, the 1 TB would have to be searched for these customers, even if partition pruning etc. narrows it down, and a lot of blocks have to be sent to the DB. Using SMART SCAN, the data reduction takes place in the storage subsystem itself.
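To put rough numbers on the telecom example, here is a quick back-of-the-envelope calculation in Python. It assumes decimal units (1 TB = 10^12 bytes), which the example does not specify, and it ignores partition pruning and any protocol overhead.

```python
TB = 10**12  # decimal terabyte (assumption; the unit is not specified)
MB = 10**6   # decimal megabyte (same assumption)

table_bytes = 1 * TB    # size of the customer table in the example
result_bytes = 2 * MB   # size of the rows that actually match

reduction_factor = table_bytes / result_bytes
fraction_shipped = result_bytes / table_bytes

print(f"data shipped shrinks by a factor of {reduction_factor:,.0f}")
print(f"fraction of the table shipped: {fraction_shipped:.6%}")
```

Under these assumptions the storage layer ships only a tiny fraction of the table, which is the whole point of doing the filtering next to the disks.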

Some of the other uses of SMART SCAN are:
>Join filtering, where star join filtering is performed within the Exadata storage cells: dimension table predicates are transformed into filters that are applied to the scan of the fact table
>Backups: I/O for incremental backups is much more efficient since only changed blocks are returned
>Create Tablespace (file creation): formatting of tablespace extents eliminates the I/O associated with the creation and writing of tablespace blocks

[Figures: With SMART SCAN - Exadata Way! / Without SMART SCAN (traditional storage)]
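The join filtering use listed above can be sketched in a few lines of Python. This is only an illustration of the idea (turn a dimension predicate into a key filter, then apply that filter while scanning the fact table). The table contents and function names are hypothetical, and a plain Python set stands in for whatever compact filter structure a real storage cell would use.

```python
# Hypothetical dimension and fact tables for a tiny star schema.
stores = [
    {"store_id": 1, "region": "East"},
    {"store_id": 2, "region": "West"},
    {"store_id": 3, "region": "East"},
]
sales = [
    {"sale_id": 10, "store_id": 1, "amount": 25.0},
    {"sale_id": 11, "store_id": 2, "amount": 40.0},
    {"sale_id": 12, "store_id": 3, "amount": 15.0},
    {"sale_id": 13, "store_id": 2, "amount": 60.0},
]

def build_key_filter(dimension_rows, predicate, key):
    """Turn a dimension-table predicate into a set of qualifying join keys."""
    return {row[key] for row in dimension_rows if predicate(row)}

def scan_fact_with_filter(fact_rows, key, key_filter):
    """Scan the fact table, keeping only rows whose join key passes the filter."""
    return [row for row in fact_rows if row[key] in key_filter]

# Predicate on the dimension table: region = 'East'.
east_keys = build_key_filter(stores, lambda s: s["region"] == "East", "store_id")

# The filter is applied during the fact scan, so non-matching rows are never shipped.
east_sales = scan_fact_with_filter(sales, "store_id", east_keys)
print([row["sale_id"] for row in east_sales])
```

The fact rows for the West store are dropped during the scan itself, before any join happens on the database server.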

Smart Scans correctly handle complex cases including
Uncommitted data and locked rows
Chained rows
Compressed tables
Date arithmetic
Regular expression searches
Partitioned tables

Wednesday, September 24, 2008

Oracle Exadata 'X' is out

Convergence of hardware and database software

Larry Ellison announced the database machine today in his keynote, a machine built on radical new ideas. He used a sailboat analogy for the out-of-the-box thinking needed to overcome the bandwidth limitation between the storage system and the database. Large DWs are tripling in size every two years and are plagued by the data bandwidth problem. There is a choke point between disk and database. Storage disks can easily store 100 TB, but moving the data is the biggest problem.

Most hardware configurations slow down at even smaller sizes, in the 1-10 TB range. The largest storage systems show an exponential increase in scan time at 10 TB. This problem can be solved in two ways. One is to treat it as a data problem and reduce the data through compression, indexing, partition pruning, etc. The other is to increase the amount of data travelling over the data pipes: make the pipes faster and increase their number.

This led to the announcement of Oracle's first hardware product, the Exadata programmable storage server, in partnership with HP. It is a combination of hardware and software. The idea is to locate the intelligence in the storage server, process the query on the h/w side and ship only the results. Typically, a query on a large table that is not indexed results in a full table scan, i.e. shipping all the blocks to the database, and this is the choke point. To reduce that, the intelligence is now in the storage system, which limits the results and ships only the result blocks to the database. This will also help parallel query. The idea now is to return results, not disk blocks. The query is moved to a lower layer. There will be three grids now: a grid of databases, a grid of Fusion Middleware and a grid of storage servers. The connection between the storage server and the DB is an InfiniBand pipe that can practically transfer data at 1 GB/sec.
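The bandwidth argument boils down to simple arithmetic: scan time is roughly data size divided by total pipe bandwidth. Here is a minimal Python sketch using the 1 GB/sec per-pipe figure quoted above. The function name, the decimal units, the choice of 10 pipes and the assumption that pipes scale perfectly linearly are all mine, for illustration only.

```python
GB = 10**9   # decimal gigabyte (assumption)
TB = 10**12  # decimal terabyte (assumption)

def scan_seconds(data_bytes: float, bytes_per_sec_per_pipe: float, pipes: int = 1) -> float:
    """Idealized time to move data_bytes over `pipes` parallel pipes.

    Assumes perfect linear scaling and no protocol overhead.
    """
    return data_bytes / (bytes_per_sec_per_pipe * pipes)

one_pipe = scan_seconds(1 * TB, 1 * GB, pipes=1)
ten_pipes = scan_seconds(1 * TB, 1 * GB, pipes=10)  # 10 is a hypothetical pipe count

print(f"1 TB over one pipe:  {one_pipe:,.0f} s")
print(f"1 TB over ten pipes: {ten_pipes:,.0f} s")
```

Shipping fewer bytes (results instead of blocks) shrinks the numerator, while more and faster pipes grow the denominator; the machine described here tries to do both.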

Exadata is available for immediate shipping on Linux and will be extended to all OSes. The Oracle database machine, or 'X', was announced as the competitor to the Teradata and Netezza platforms, i.e. the hardware database appliances. Three years of R&D have led to X as the fastest database machine on earth. It has 8 database servers with a total of 64 Intel cores, plus 112 Intel cores just for storage. It can hold 168 TB of data, which is 1,000X more than the largest iPod.

The pilot customers were Yahoo, NPD (the company I consulted at last summer when I was in Oracle's BI practice), Countrywide, the retailer Giant Eagle, Amazon, etc.
A European telecom provider saw 10X to 72X better query performance on 4.5 TB of CDRs, with an average of 28X (on half the configuration of X). This is compared to 2 IBM p Series servers and an EMC disk array. The largest improvement was in a CRM report, the customer discount report.
LGR Telecom, which builds DWs for phone companies, saw a 30X performance improvement compared to an HP Superdome based config with a Hitachi array.

Other Examples of Performance Improvement
Chicago Board of Trade 10-15 X perf
Giant Eagle Retail sales 16 X performance
Oracle's Internal financial data warehouse netapps 30X speedup

In a nutshell, X has intelligent storage and more bandwidth. Rather than relying on the conventional movement of disk blocks, X shows almost the same access time as DBs grow in size, making it very scalable as DBs and DWs grow into the petabyte range.

Market comparison: X is faster than a 5-rack Teradata 5550.
Larry said that there is no query intelligence in Teradata storage, though it is a very sophisticated DB, proven over many years. Teradata moves disk blocks.

Unlike Teradata, Netezza is a storage server; however, the comparison showed about a 2:1 improvement in data bandwidth for X (14 GB/sec compared to 7.5 GB/sec).
Besides, the Netezza h/w does not run the Oracle database.
On the one hand, Teradata is a very good DB; Netezza is not. X provides the balance between the two current market leaders in DW appliances.

Now let's look at the pricing model: $1,680,000 for the license for X and about $650K for the H/W, a cost of about $4K/TB of data at the undiscounted level. It's open technology, so Intel CPUs with 6 cores can be used in the future, and it will take advantage of improvements in CPU and RAM speed etc., reducing the cost to the end user.
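The per-terabyte figure can be checked with a few lines of Python. This assumes that the "650" quoted for the H/W means $650,000 and that the capacity is the 168 TB mentioned earlier in this post; both readings are my assumptions, and the prices are undiscounted list prices as quoted.

```python
license_usd = 1_680_000   # license price quoted in the keynote
hardware_usd = 650_000    # assumption: "about 650" read as $650K
capacity_tb = 168         # capacity quoted for X earlier in the post

hardware_per_tb = hardware_usd / capacity_tb
license_per_tb = license_usd / capacity_tb
total_per_tb = (license_usd + hardware_usd) / capacity_tb

print(f"hardware: ${hardware_per_tb:,.0f}/TB")
print(f"license:  ${license_per_tb:,.0f}/TB")
print(f"total:    ${total_per_tb:,.0f}/TB")
```

Under these assumptions the "about 4K/TB" figure matches the hardware cost alone; the license adds considerably more per terabyte on top of that.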

Currently, 90% of the load in OLTP systems is queries and reports. Therefore, it is important to make sure that both OLTP databases and data warehouses can be run on this platform.

HP and Oracle will together take orders for X. Mark Hurd of HP came "online".

Let's keep an eye on how the rest of the world will respond to this news...

Sunday, September 21, 2008

BIWA Sessions at OOW 2008

Dan Vlamis started the day...about 100 attendees, well received talk, was in parallel with Mark Rittman's talk. Mark had arrived at 4 AM and said he was jet lagged...

The next session was Ian's (his story is that he lost his wallet on the way to San Francisco); he spoke on Master Data Management. This was in parallel with Matt Vranikar's talk.

After lunch was my talk on the Retail Business Intelligence Accelerator. Carl Daniels from Oracle Development ran the demo during my talk. They will also be at demopod L23 in Moscone West.

Richard Solari and Teresa Wong's tag team talk was very powerful and very well received. It ran almost the full 90 mins. The final event of the day was the BI Panel, moderated by Joe Thomas, with participants Charlie Berger, Dan Vlamis, Jon Mead, Matt Vranikar, myself and Rich Solari. The Q&A was quite interactive at the end, followed by some post-Q&A discussions, one on one...

Sunday Openworld

I am sitting in the keynote session, Sunday evening, the first big event. Today BIWA had 7 sessions, ending in the BI panel. I barely made it in time for this keynote. This time I see reserved tables for bloggers, and I am at one of those.

Safra started the evening and announced >43K attendees, over 450 exhibitors and 300 Oracle demos. Safra joked that there is no way Oracle can show all its 9000 applications!

She said that Michael Phelps will be in tomorrow. Although the green theme is being emphasised this year, the conference book is only offered in print and not on a USB stick like last year!

Safra introduced the San Francisco mayor, the youngest mayor ever...Gavin Newsom.
Gavin said that SF's bond rating was increased recently, highest ever.

The next speaker was Ed Begley, an actor and activist. He talked about the "sustainable conference".

Friday, September 19, 2008

Last Working day before Oracle Openworld

Today was the last working day before Openworld kicks off. The stock market had a good 2 day run, and Oracle stock ORCL came back positive from its 52 week lows, so the stage is all set for a bigger, better OOW. BEA will be there on the red side of the fence for the first time. The last two years, the BI folks (Siebel and Hyperion) got the spotlight; this time it's middleware, so let's see what is new in Fusion. An interesting trend though is the "fusion" of BI and SOA. Oracle is well placed in both segments. The new paradigm is to think of BI components as "services" in the enterprise SOA architecture. This has a lot of potential, as SOA so far has lacked "content" and the growing popularity of BI can provide that missing "content" to the service oriented architecture.

So I will be airborne in 12 hrs, getting to the Bay Area in the middle of the day on Sat, so that I am all set for the marathon of 7 sessions for BIWA SIG on Sunday. Actually, two of the BIWA sessions will run concurrently with others, as one room can have only 5 sessions in a day...

BIWA SIG does not formally have any business meeting, but please feel free to stop by my session, as I will spend a few minutes on what BIWA has been up to lately. It will be at 1 PM PST in Room 2001 in Moscone West on Sunday, Sep 21. There is also a BI panel at 4 PM in the same room on "Why do BI projects Fail?" Joe Thomas, the Oracle veteran, will drive that session.

Wednesday, September 17, 2008

Oracle Openworld 2008 - Countdown begins

This is the last week before Openworld starts on Sep 21, 2008. Wall Street has really pushed the panic button, 500 and 400 points in the red in the last two days! Will the state of the economic union have any impact on an event like OOW?

Well, the Oracle stock ORCL was at its 52 week low today at $18.07, the earnings call is tomorrow and the guidance has been less than optimistic. However, Larry is known to announce new products and offerings every time at Openworld. So what is the speculation this time? Oracle database guru Mark Townsend says it's going to be the biggest one ever! Over 50,000 are expected to participate online, apart from the 43,000 attending in person. The Extreme Weekend starts on the weekend and will feature how to set up a RAC cluster, data warehousing, JD Edwards hands-on and other stuff, with pizza and beer in the room. The Oracle Develop event runs in the Hilton with a purely development focus. Google and others will talk at this event, and it will cover beginner to advanced level developers who will learn about .Net, Java, PHP etc.

Two large exhibit halls will house about 500 exhibitors. Chuck Rozwat will speak on Information Management, and Andy Mendelsohn will talk about the future of databases.
Tom Kyte will talk about database "worst practices" like he did recently at the NY Oracle Users Group. We hope to hear a lot about content management products like the Content DB and Records DB. Records DB helps with compliance and security through rules that can be implemented with these products. Database Vault will also be showcased for internal security of data, with role-based segregation of data access in the database.

Secure Backup and Secure Enterprise Search (SES) have been around for a while but will get some attention this time. It will be good to compare and contrast SES with the Google Search Appliance.

In the data warehousing space, Oracle Warehouse Builder is bundled with the database. With the OWB base product being free, it's likely to get more popular. The data quality and enterprise connectors will stay priced options.

Stop by the demogrounds in the Exhibition Hall for demos of all these products.

Those attending in person, if you will be driving in the San Francisco downtown, be ready to observe altered traffic patterns:

Monday, September 15, 2008

BIWA SIG Sessions at Oracle OpenWorld - Sep 21, 2008

Oracle BIWA SIG presents 7 sessions at SIG Day, Sunday Sep 21, at Oracle Openworld

Building Cubes and Analyzing Data with Oracle OLAP 11g (IOUG)
S301022, 8:30 AM, Dan Vlamis, Vlamis Software Solutions, Moscone West 2001

Be the Master of Your Domain: MDM Explained (IOUG)
S301156, 10:00 AM, Ian Abramson, IAS Inc., Jeremy Fitzgerald, Dimensional Strategies, Moscone West 2001

A Retail Business Intelligence Accelerator: Oracle Data Warehouse for Retail (IOUG)
S301009, 1:00 PM, Shyam Varan Nath, Deloitte Consulting, BIWA SIG President, Moscone West 2001

Case Studies: Implementing Oracle Business Intelligence Suite Enterprise Edition in Three Environments
S301190, 2:30 PM, Richard Solari, Teresa Wong, Deloitte Consulting, Moscone West 2001

The State of Oracle Business Intelligence and Data Warehousing: BI Discussion Panel (IOUG)
S301191, 4:00 PM, Joe Thomas, Oracle Corporation, other Panelists, Moscone West 2001

Note Different Room

Extending and Customizing the Oracle Business Intelligence Applications Data Warehouse
S301063, 8:30 AM, Mark Rittman, Rittman Mead Consulting, Moscone West 2008

2009: A BI Odyssey
S301163, 10:00 AM, Matt Vranikar, Piocon Technologies, Moscone West 2008