Outline
- Abstract
- 1. Introduction
- 2. Cost-Based Access Method Selection
- 3. Path Indexes in the Optimizer
- 4. Conclusion
- References
Abstract
With the growing interest in native XML query processing comes an increased awareness of the lack of maturity in XML optimizers. We believe that there is a significant opportunity to adapt and extend mature relational optimization techniques in XML systems. In this paper we introduce a novel two-level approach to cost-based optimization. The higher level consists of traditional join order selection together with the cost-based selection of access methods. The lower-level cost-based optimization is performed entirely within an original access method that takes advantage of XML path indexes. A path index, also known as a structural index or a structural summary, summarizes the paths that actually occur in an XML document. Using path indexes in XML optimization helps to constrain the query plan search space and allows the exploitation of cost models based on XML-specific statistics. The optimization approach is described in the context of ToXop, a cost-based optimizer for ToX that seamlessly incorporates both streaming (single-pass) and path-index-based pattern matching evaluation strategies for XQuery.
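To make the notion of a path index concrete, the following is a minimal sketch of a structural summary: it records each distinct root-to-node label path that occurs in a document, together with an occurrence count. It is an illustration only and not the ToXin structure itself; the function name `build_path_summary` and the sample document are assumptions introduced here.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def build_path_summary(xml_text):
    """Count the element nodes found at each distinct root-to-node label path.

    Hypothetical helper for illustration; a real path index would also
    store references to the matching nodes, not only counts.
    """
    root = ET.fromstring(xml_text)
    counts = Counter()

    def visit(node, prefix):
        # Extend the label path with this element's tag and record it.
        path = prefix + "/" + node.tag
        counts[path] += 1
        for child in node:
            visit(child, path)

    visit(root, "")
    return counts


# Assumed sample document, not taken from the paper.
doc = (
    "<bib>"
    "<book><title/><author/><author/></book>"
    "<article><title/></article>"
    "</bib>"
)
summary = build_path_summary(doc)
for path in sorted(summary):
    print(path, summary[path])
```

Because the summary holds only paths that actually occur, a pattern such as `/bib/book/price` can be rejected without touching the document, which is one way such a structure can constrain the plan search space.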
Conclusions
We have presented ongoing work on a two-level approach to XML optimization used by ToXop, a cost-based XML query optimizer. The higher level consists of the cost-based selection between two XML access methods: ToXStream and ToXinScan. ToXStream supports single-pass evaluation over XML documents, while ToXinScan operates on ToXin trees, a path index structure for XML documents. Both access methods use ToXin trees as a system catalog. The higher optimization level also includes traditional join order selection. The lower-level optimization determines how the path index will be traversed. This choice is encapsulated in the second access method, ToXinScan, which also encapsulates the pruning of the patterns to be matched against the path index. Exploiting the path index in this lower-level optimization brings two crucial benefits: it constrains the search space of query plans, and it enables a cost model based on XML-specific statistics.
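The higher-level choice between the two access methods can be sketched as a comparison of estimated costs derived from path statistics. The cost formulas below are deliberately simplistic assumptions made for illustration (a streaming pass is charged one unit per document node; an index scan is charged one unit per summary path plus one per matching node); they are not the cost model used by ToXop. The function name `choose_access_method` and the statistics are likewise hypothetical.

```python
def choose_access_method(path_counts, target_path):
    """Pick the cheaper access method under an assumed toy cost model.

    path_counts maps each label path to its number of occurrences,
    standing in for statistics kept in a path index.
    """
    # Assumption: a single pass must read every node in the document.
    stream_cost = sum(path_counts.values())
    # Assumption: an index scan inspects the summary, then fetches matches.
    index_cost = len(path_counts) + path_counts.get(target_path, 0)
    method = "ToXinScan" if index_cost < stream_cost else "ToXStream"
    return method, stream_cost, index_cost


# Assumed statistics for a large bibliography document.
stats = {
    "/bib": 1,
    "/bib/book": 1000,
    "/bib/book/title": 1000,
    "/bib/book/author": 2500,
    "/bib/article": 10,
    "/bib/article/title": 10,
}
large_choice = choose_access_method(stats, "/bib/article/title")
small_choice = choose_access_method({"/a": 1, "/a/b": 1}, "/a/b")
print(large_choice)
print(small_choice)
```

Under these assumptions a selective pattern over a large document favors the index-based method, while a tiny document favors a single streaming pass, because consulting the summary costs more than reading the document directly.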
Our immediate plans for ToXop begin with an experimental evaluation of the optimizer’s behavior. Future work includes incorporating a wider variety of native XML indexing access methods (beyond path indexes) into a uniform framework such as the one described here. We are also interested in extending the use of XML-specific statistics to support better cost models and cost estimates.