Data cube: A relational aggregation operator generalizing group-by, cross-tab, and sub-totals


Author(s) : Adam Bosworth (Microsoft), Jim Gray (Microsoft), Hamid Pirahesh, Andrew Layman (Microsoft)
Publisher : N/A
Publication Date : 1996
ISSN : N/A
Abstract : Data analysis applications typically aggregate data across many dimensions looking for unusual patterns. The SQL aggregate functions and the GROUP BY operator produce zero-dimensional or one-dimensional answers. Applications need the N-dimensional generalization of these operators. This paper defines that operator, called the data cube or simply cube. The cube operator generalizes the histogram, cross-tabulation, roll-up, drill-down, and sub-total constructs found in most report writers. The cube treats each of the N aggregation attributes as a dimension of N-space. The aggregate of a particular set of attribute values is a point in this space. The set of points forms an N-dimensional cube. Super-aggregates are computed by aggregating the N-cube to lower dimensional spaces. Aggregation points are represented by an "infinite value", ALL. For example, the point (ALL,ALL,ALL,...,ALL, sum(*)) would represent the global sum of all items. Each ALL value actually represents the set of values contributing to that aggregation.
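
To make the abstract's description concrete, the following is a minimal Python sketch of the idea, not the paper's algorithm or its SQL syntax. It assumes a SUM aggregate over in-memory rows. The function name `cube`, the `ALL` marker string, and the sample `sales` rows are all hypothetical, chosen only for illustration. Each input row contributes to 2^N cells, one for every way of keeping a dimension's value or replacing it with ALL.

```python
from collections import defaultdict
from itertools import product

ALL = "ALL"  # hypothetical stand-in for the ALL value described in the abstract


def cube(rows, dims, measure):
    """Sum `measure` over every combination of concrete dimension values and ALL."""
    totals = defaultdict(int)
    for row in rows:
        # For N dimensions there are 2^N masks: each dimension is either
        # kept (False) or rolled up into ALL (True).
        for mask in product((False, True), repeat=len(dims)):
            key = tuple(ALL if rolled_up else row[d]
                        for d, rolled_up in zip(dims, mask))
            totals[key] += row[measure]
    return dict(totals)


# Made-up sample data, for illustration only.
sales = [
    {"model": "Chevy", "year": 1994, "color": "black", "units": 50},
    {"model": "Chevy", "year": 1994, "color": "white", "units": 40},
    {"model": "Chevy", "year": 1995, "color": "black", "units": 85},
    {"model": "Ford", "year": 1994, "color": "black", "units": 50},
    {"model": "Ford", "year": 1995, "color": "white", "units": 75},
]

result = cube(sales, ["model", "year", "color"], "units")

print(result[(ALL, ALL, ALL)])           # global sum over all rows
print(result[("Chevy", ALL, ALL)])       # sub-total for one model
print(result[("Chevy", 1994, "black")])  # ordinary GROUP BY cell
```

In this sketch, keys without any ALL correspond to the ordinary N-dimensional GROUP BY result, keys with some ALL entries are the lower-dimensional super-aggregates, and the key consisting only of ALL is the global sum. Enumerating all 2^N masks per row is the simplest approach and is only practical for a small number of dimensions.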