|
Abstract : |
The I/O subsystem is widely accepted as one of the principal bottlenecks for high-performance parallel database systems. The emergence of parallel I/O architectures has made the problem of data declustering, i.e., fragmenting a file of records and allocating the pieces to different disks, one of prime importance. This is evident from the growing activity in this area. In this study we focus only on multi-attribute declustering methods that are based on some type of grid-based partitioning of the data space. While a number of such declustering methods exist, we believe a good performance evaluation of their relative merits is lacking. Almost all performance analyses so far have been theoretical: they derive exact conditions on the number of disks, the sizes of attribute domains, and query shapes and sizes under which a certain declustering method is optimal. Moreover, most of these conditions have been derived for partial match queries. We believe that in practice restricting the size of attribute domains is debatable, and restricting the shape and size of queries is unacceptable. Thus, to answer the question of how various declustering schemes perform under a wide range of query and database scenarios (both relative to each other and to the optimal), we have carried out a detailed performance evaluation. Parameters that, |