|
Abstract : |
E cient storage and retrieval of multi-attribute datasets have become one of the es-sential requirements for many data-intensive applications. The Cartesian product le has been known as an e ective multi-attribute le structure for partial-match and best-match queries. Several heuristic methods have been developed todecluster Cartesian product les across multiple disks to obtain high performance for disk accesses. Though the scalability of the declustering methods becomes increasingly important for systems equipped with a large number of disks, no analytic studies have been done so far. In this paper we derive formulas describing the scalability of two popular declustering methods Disk Modulo and Fieldwise Xor for range queries, which are the most common type of queries. These formulas disclose the limited scalability of the declustering methods and are corroborated by extensive simula-tion experiments. From the practical point of view, the formulas given in this paper provide a simple measure which can be used topredict the response time of a given range query and to guide the selection of a declustering method under various conditions. Index Terms: multi-attribute access methods, range query, le declustering, scalability, Disk Modulo, Fieldwise Xor, Hilbert curve allocation method., |