Enterprise data warehousing: External MySQL Storage Engines for Data Warehousing

There are three storage engines built and maintained by third-party storage providers that can be used to

meet a few of the most common data warehousing use cases. These are InnoDB, NitroEDB, and

BrightHouse

InnoDB

First is InnoDB, which is a transactional storage engine maintained by Oracle. InnoDB obviously supports

OLTP applications, but it can also serve analytic purposes too and therefore may be applicable to the

OLTP/analytic hybrid use case identified by Gartner.

NitroEDB

Nitrosecurity is a provider of security information management solutions, which are undergirded by a very

powerful database management system. The Nitrosecurity database has been designed with a number of

very unique characteristics so it can support the enormous amount of data volumes and concurrent queries

that come as part of a security management software package.

Nitrosecurity (www.nitrosecurity.com) and MySQL have partnered to produce the NitroEDB storage

engine, which has been created to handle the needs of real-time data warehousing customers. The

NitroEDB engine has the ability to receive huge volumes of inserts while supporting concurrent queries

against the same data at the same time, all with no performance penalty whatsoever. In addition, NitroEDB

contains specialized indexes (N-tree, Microcluster) that are designed to deliver extremely fast response

times for aggregate styled queries (SUM, AVG, etc.) due to the fact that the aggregate information is

actually stored within the index itself.

Brighthouse

The last third party storage engine applicable for MySQL data warehousing is Brighthouse that is

produced by a partnership between MySQL and Infobright (www.infobright.com). At it core, BrightHouse is

a highly compressed column-oriented datastore that incorporates MySQL technology, although the engine

uses its own load and unload utilities rather than MySQL’s mainly because (1) compression and

decompression are done on load and unload and (2) the BrightHouse Knowledge Grid is created on load.

In addition, BrightHouse also uses its own optimizer instead of the MySQL optimizer because the

BrightHouse optimizer knows how to use the information that is stored in its Knowledge Grid to optimize the

execution of queries against the BrightHouse engine.

The Brighthouse engine consists of 4 key layers:

1. BrightHouse is a column-oriented

data store. This means that

instead of the data being stored

row by row, it is stored column by

column. There are many

advantages to column-orientation,

not the least of which is the ability

to do more efficient data

compression because since each

column stored a single data type

(as opposed to rows that typically

contain several data types),

compression can be optimized for

each particular data type. The

data itself within the columns is

stored by 65K item groupings. We

refer to each of these groupings

as Data Packs. The use of Data

Packs improves data compression

and is also critical to how

Infobright resolves complex

queries.

2. Data Pack Nodes (DPNs) contain a set of statistics stored related to the data that is stored and

compressed in each of the Data Packs. There is always a 1 to 1 relationship between Data Packs and DPNs.

3. Knowledge Nodes are a further set of metadata related to Data Packs, columns or table combinations.

The set of these Knowledge Nodes taken together is called the Knowledge Grid.

4. The BrightHouse Optimizer uses the Knowledge Grid to determine the minimum set of Data Packs

which need to be decompressed in order to satisfy a given query. In some cases, the information

contained in the Knowledge Grid is sufficient to resolve the query, in which case nothing is

decompressed.

The Brighthouse storage engine’s design is perfect for the massive amounts of data and queries that are

the standard fare of traditional or historical data warehousing.

Enterprise data warehousing

Friday, April 2, 2010

External MySQL Storage Engines for Data Warehousing

No comments:

!!....Site Under Monitorizing ..!!