×

On batch-constructing B\(^{+}\)-trees: Algorithm and its performance evaluation. (English) Zbl 1018.68023

Summary: Efficient construction of indexes is very important in bulk-loading a database or adding a new index to an existing database since both of them should handle an enormous volume of data. In this paper, we propose an algorithm for batch-constructing the B\(^{+}\)-tree, the most widely used index structure in database systems. The main characteristic of our algorithm is to simultaneously process all the key values to be placed on each B\(^{+}\)-tree page when accessing the page. This avoids the overhead due to accessing the same page multiple times, which results from applying the B+-tree insertion algorithm repeatedly. For performance evaluation, we have analyzed our algorithm in terms of the number of disk accesses. The results show that the number of disk accesses excluding those in the relocation process is identical to the number of pages belonging to the B\(^{+}\)-tree. Considering that the relocation process is an unavoidable preprocessing step for batch-constructing of B\(^{+}\)-trees, our algorithm requires just one disk access per B\(^{+}\)-tree page, and therefore turns out to be optimal. We also present the performance tendency in relation with different parameter values via simulation. Finally, we show the performance enhancement effect of our algorithm, compared with the one using repeated insertions through experiments.

MSC:

68P15 Database theory
68M20 Performance evaluation, queueing, and scheduling in the context of computer systems
68P05 Data structures

Software:

STR
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Aho, A. V.; Hopcroft, J. E.; Ullman, J. D., Data Structures and Algorithms (1987), Addison-Wesley: Addison-Wesley Reading, MA · Zbl 0307.68053
[2] Bayer, R.; McCreight, C., Organization and maintenance of large ordered indexes, Acta Informatica, 1, 173-189 (1972) · Zbl 0226.68008
[3] Bercken, J.; Seeger, B.; Widmayer, P., A generic approach to bulk loading multidimensional index structures, (Proceedings of the International Conference on Very Large Data Bases, VLDB (1997)), 406-415
[4] Comer, D., The ubiquitous B-trees, ACM Computing Surveys, 11, 2, 121-137 (1979) · Zbl 0419.68034
[5] Elmasri, R.; Navathe, S. B., Fundamentals of Database Systems (1994), Benjamin/Cummings: Benjamin/Cummings Menlo Park, CA · Zbl 0845.68030
[6] Exodus Project Implementation Team, Using the Exodus Storage Manager, Exodus Project Document, University of Wisconsin, 1992; Exodus Project Implementation Team, Using the Exodus Storage Manager, Exodus Project Document, University of Wisconsin, 1992
[7] Horowitz, E.; Sahni, S.; Anderson-Freed, S., Fundamentals of Data Structures in C (1993), Computer Science Press: Computer Science Press Rockville, MD · Zbl 0842.68017
[8] Jannink, J., Implementing deletion in \(B^+\)-trees, ACM SIGMOD Record, 24, 1, 33-38 (1995)
[9] Kamel, I.; Faloutsos, C., On packing R-trees, (Proceedings of the ACM International Conference on Information and Knowledge Management, ACM CIKM (1993)), 490-499
[10] Knuth, D., The Art of Computer Programming: Sorting and Searching, vol. 3 (1973), Addison-Wesley: Addison-Wesley Reading, MA · Zbl 0302.68010
[11] Leutenegger, S. T.; Edgington, J.; Lopez, M. A., STR: a simple and efficient algorithm for R-tree packing, (Proceedings of the IEEE International Conference on Data Engineering, IEEE ICDE (1997)), 497-506
[12] Lee, Y. K.; Kim, S. W.; Jang, J. W.; Whang, K. Y., KAOSS: general-purpose multiuser object-storage system supporting new database applications, Database Review, 11, 2, 91-110 (1995)
[13] Maelbrancke, R.; Olivie, H., Optimizing Jan Jannink’s implementation of \(B^+\)-tree deletion, ACM SIGMOD Record, 24, 3, 5-7 (1995)
[14] Ullman, J. D., Principles of Database and Knowledge-Base Systems (1988), Computer Science Press: Computer Science Press Rockville, MD
[15] Wiederhold, G., Database Design (1983), McGraw-Hill: McGraw-Hill New York
[16] Wirth, N., Algorithms+Data Structures=Programs (1976), Prentice-Hall: Prentice-Hall Englewood Cliffs, NJ · Zbl 0375.68005
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.