Open source Database Breakthrough: 10-80 times faster


The figure below shows the architecture of the new VectorWise engine. The
left part shows the system architecture (“X100” execution engine and
ColumnBM buffer manager) and how it maps on the computer resources
(CPU cache, RAM and disk). The right part shows a query in action, having
been decomposed into so-called relational operators (Aggregate, Project,
Select and Scan) and execution primitives (such as summation –
aggr_sum_flt_col).

A ground-breaking database kernel – is now being combined with the leading open source relational database from Ingres.

The Ingres VectorWise project team has worked with Intel to evaluate database performance on the new Intel Xeon processor 5500 series based platform. To date, the results of the project have demonstrated dramatic cost and performance capabilities as evidenced by nearly 80 fold speed up on a query modelled after the Q1 query of TPC-H3 suite on the Intel Xeon processor.

VectorWise next-generation database technology is based on a novel query processing architecture that allows modern microprocessors and memory architectures to reach their full potential. This is a unique achievement: in detail studies that compare common computing tasks such as scientific calculation, multi-media authoring, games, and databases have consistently shown that typical database engines do not benefit from new processor performance features such as SSE, out of order execution, chip multithreading, and increasingly larger L2/L3 caches due to their large, complex legacy code structure.

The computational power that database systems provide is known to be lower than the performance realized by hand-coding the same task in a (e.g. C++) program. However, the actual performance difference can be surprisingly large: a factor 100. VectorWise has created the first database system to revert that situation, with dramatic efficiency improvements as a result

.

Vectorized Execution
The VectorWise engine is designed for in-cache execution, which means that the only “randomly” accessible memory is the CPU cache, and main memory (by now inappropriately named “RAM” – Random Access Memory) is already considered part of secondary storage, used in the buffer manager for buffering I/O operations and their compressed large intermediate results. Queries are processed by passing on multiple tuples at a time between relational operators, called “vectors.” These vectors are at the heart of the execution engine:
• VectorWise has developed vectorized versions of relational operators, so there is vectorized selection, vectorized project, join, sort etc. It has even been possible to vectorize binary search.
• Vectors are the simplest possible data structure, an array of values. A tuple is represented as a set of vectors of the same length, one for each column.
• An optional selection vector contains the positions of the tuples currently taking part in processing, i.e. those that have passed a filter operation.
• A vector contains typically between 100-1000 values. The vector size is tuned such that all vectors in a query plan fit comfortably together in the CPU cache.
• Vectorized primitives have many performance advantages because methods perform 100-1000 times more work, function call overhead is dramatically reduced, and database code becomes much more local, improving instruction cache locality and CPU branch prediction.