zbMATH — the first resource for mathematics

A bee colony based optimization approach for simultaneous job scheduling and data replication in grid environments. (English) Zbl 1348.90315
Summary: This paper presents a novel Bee Colony based optimization algorithm, named Job Data Scheduling using Bee Colony (JDS-BC). JDS-BC consists of two collaborating mechanisms to efficiently schedule jobs onto computational nodes and replicate datafiles on storage nodes in a system so that the two independent, and in many cases conflicting, objectives (i.e., makespan and total datafile transfer time) of such heterogeneous systems are concurrently minimized. Three benchmarks – varying from small- to large-sized instances – are used to test the performance of JDS-BC. Results are compared against other algorithms to show JDS-BC’s superiority under different operating scenarios. These results also provide invaluable insights into data-centric job scheduling for grid environments.

90B35 Deterministic scheduling theory in operations research
90C59 Approximation methods and heuristics in mathematical programming
Full Text: DOI
[1] Berman, F.; Fox, G.; Hey, A. J.G., Grid computing: making the global infrastructure a reality, (2003), John Wiley and Sons New York
[2] CERN. Compact muon solenoid (CMS). 〈http://public.web.cern.ch/public/en/lhc/CMS-en.html〉; 2011 [visited].
[3] CERN. Large hadron collider (LHC). 〈http://public.web.cern.ch/public/en/lhc/lhc-en.html〉; 2011 [visited].
[4] Holtman K. CMS data grid system overview and requirements. The compact muon solenoid (CMS) experiment note 2001/037. Switzerland: CERN; 2001.
[5] Holtman K. HEPGRID2001: a model of a virtual data grid application. In: Hertzberger LO, Hoekstra AG, Williams R, editors. HPCN Europe 2001: Proceedings of the ninth international conference on high-performance computing and networking. London, UK: Springer-Verlag; 2001, p. 711-20.
[6] Anjum, A.; McClatchey, R.; Ali, A.; Willers, I., Bulk scheduling with the DIANA scheduler, IEEE Transactions on Nuclear Science, 53, 3818-3829, (2006)
[7] Subrata, R.; Zomaya, A. Y.; Landfeldt, B., Cooperative power-aware scheduling in grid computing environments, Journal of Parallel and Distributed Computing, 70, 84-91, (2010) · Zbl 1233.68118
[8] McClatchey, R.; Anjum, A.; Stockinger, H.; Ali, A.; Willers, I.; Thomas, M., Data intensive and network aware (DIANA) grid scheduling, Journal of Grid Computing, 5, 43-64, (2007)
[9] Tang, M.; Lee, B.-S.; Tang, X.; Yeo, C.-K., The impact of data replication on job scheduling performance in the data grid, Future Generation Computer Systems, 22, 254-268, (2006)
[10] Frey, J.; Tannenbaum, T.; Livny, M.; Foster, I.; Tuecke, S., Condor-G: a computation management agent for multi-institutional grids, Cluster Computing, 5, 237-246, (2002)
[11] Andreetto P, Borgia S, Dorigo A, Gianelle A, Mordacchini M, Sgaravatto M et al. Practical approaches to grid workload and resource management in the EGEE project. In: Proceedings of the conference on computing in high energy and nuclear physics (CHEP’04); 2004. p. 899-902.
[12] Jin, H.; Shi, X.; Qiang, W.; Zou, D., An adaptive meta-scheduler for data-intensive applications, International Journal of Grid and Utility Computing, 1, 32-37, (2005)
[13] Kosar, T.; Livny, M., A framework for reliable and efficient data placement in distributed computing systems, Journal of Parallel and Distributed Computing, 65, 1146-1157, (2005)
[14] Thain D, Bent J, Arpaci-Dusseau A, Arpaci-Dusseau R, Livny M. Gathering at the well: creating communities for grid I/O. In: Proceedings of the 2001 ACM/IEEE conference on supercomputing; 2001. p. 58-78.
[15] Basney, J.; Livny, M.; Mazzanti, P., Utilizing widely distributed computational resources efficiently with execution domains, Computer Physics Communications, 140, 246-252, (2001) · Zbl 1033.68503
[16] Bode B, Halstead DM, Kendall R, Lei Z, Jackson D. The portable batch scheduler and the maui scheduler on linux clusters. In: Proceedings of the fourth annual showcase and conference (LINUX-00); 2000. p. 217-24.
[17] Cirne W, Da Silva DP, Costa L, Santos-Neto E, Brasileiro FV, Sauve J et al. Running bag-of-tasks applications on computational grids: the MyGrid approach. In: Proceedings of the 2003 international conference on parallel processing (ICPP’03); 2003. p. 407-16.
[18] Huedo, E.; Montero, R.e. S.; Llorente, I. M.i., The gridway framework for adaptive scheduling and execution on grids, Scalable Computing: Practice and Experience, 6, 1-8, (2005)
[19] Strazdins PE Uhlmann J. A comparison of local and gang scheduling on a beowulf cluster. In: Proceedings of 2004 IEEE international conference on cluster computing (CLUSTER’04); 2004. p. 55-62.
[20] Maheswaran, M.; Ali, S.; Siegel, H. J.; Hensgen, D.; Freund, R. F., Dynamic mapping of a class of independent tasks onto heterogeneous computing systems, Journal of Parallel and Distributed Computing, Special Issue on Software Support for Distributed Computing, 59, 107-131, (1999)
[21] Casanova H, Zagorodnov D, Berman F, Legrand A. Heuristics for scheduling parameter sweep applications in grid environments. In: Proceedings of the ninth heterogeneous computing workshop; 2000. p. 349-63.
[22] Ranganathan K Foster I. Decoupling computation and data scheduling in distributed data-intensive applications. In: Proceedings of 11th IEEE international symposium on high performance distributed computing (HPDC’02); 2002. p. 352-8.
[23] GriPhyN. Grid physics network in ATLAS. 〈http://www.usatlas.bnl.gov/computing/grid/griphyn/〉; 2011 [visited].
[24] Hoschek, W.; Jaen-Martinez, J.; Samar, A.; Stockinger, H.; Stockinger, K., Data management in an international data grid project, (Buyya, R.; Baker, M., Grid computing, vol. 1971, (2000), Springer-Verlag New York), 77-90
[25] SAM. 〈http://projects.fnal.gov/samgrid/〉; 2011 [visited].
[26] Chakrabarti, A.; Sengupta, S., Scalable and distributed mechanisms for integrated scheduling and replication in data grids, (Rao, S.; Chatterjee, M.; Jayanti, P.; Murthy, C.; Saha, S., Distributed computing and networking, vol. 4904, (2008), Springer Berlin/Heidelberg), 227-238 · Zbl 1131.68351
[27] GILDA. 〈https://gilda.ct.infn.it/〉; 2011 [visited].
[28] Feitelson, D.; Rudolph, L.; Schwiegelshohn, U.; Sevcik, K.; Wong, P., Theory and practice in parallel job scheduling, (Feitelson, D.; Rudolph, L., Job scheduling strategies for parallel processing, vol. 1291, (1997), Springer Berlin/Heidelberg), 1-34
[29] Mohamed HH, Epema DHJ. An evaluation of the close-to-files processor and data co-allocation policy in multiclusters. In: Proceedings of 2004 IEEE international conference on cluster computing; 2004. p. 287-98.
[30] Chakrabarti, A.; Dheepak, R.; Sengupta, S., Integration of scheduling and replication in data grids, (Bougé, L.; Prasanna, V., High performance computing—HiPC, vol. 3296, (2005), Springer Berlin/Heidelberg), 85-101
[31] Chang, R.-S.; Chang, J.-S.; Lin, S.-Y., Job scheduling and data replication on data grids, Future Generation Computer Systems, 23, 846-860, (2007)
[32] UniGrid. Taiwan unigrid environment. 〈http://www.unigrid.org.tw〉; 2011 [visited].
[33] Dang, N. N.; Lim, S. B., Combination of replication and scheduling in data grids, International Journal of Computer Science and Network Security (IJCSNS), 7, 304-308, (2007)
[34] Abdi, S.; Mohamadi, S., Two level job scheduling and data replication in data grid, International Journal of Grid Computing and Applications (IJGCA), 1, 23-37, (2010)
[35] Bell, W. H.; Cameron, D. G.; Capozza, L.; Millar, A. P.; Stockinger, K.; Zini, F., Optorsim: a grid simulator for studying dynamic data replication strategies, The International Journal of High Performance Computing Applications, 17, 403-416, (2003)
[36] Subrata, R.; Zomaya, A. Y.; Landfeldt, B., A cooperative game framework for QoS guided job allocation schemes in grids, IEEE Transactions on Computers, 57, 1413-1422, (2008) · Zbl 1390.91231
[37] Dick RP, Rhodes DL, Wolf W. TGFF: task graphs for free. In: Proceedings of the sixth international workshop on hardware/software codesign; 1998. p. 97-101.
[38] Hongzhang S, Oliker L, Biswas R. Job superscheduler architecture and performance in computational grid environments. In: Proceedings of the ACM/IEEE SC2003 conference (SC’03); 2003. p. 44-58.
[39] Karaboga D, Akay B, Ozturk C. Artificial bee colony (ABC) optimization algorithm for training feed-forward neural networks. In: Proceedings of the fourth international conference on modeling decisions for artificial intelligence; 2007. p. 318-29.
[40] Karaboga, D.; Basturk, B., A powerful and efficient algorithm for numerical function optimization: artificial bee colony (ABC) algorithm, Journal of Global Optimization, 39, 459-471, (2007) · Zbl 1149.90186
[41] Chin Soon C, Hean L Malcolm Yoke, Iyer S Appa, Leng G Kheng. A bee colony optimization algorithm to Job shop scheduling. In: Proceedings of the winter simulation conference (WSC 06); 2006. p. 1954-61.
[42] Tovey C. The honey bee algorithm: a biological inspired approach to internet server optimization. The alumni magazine for ISyE at Georgia Institute of technology; 2004. p. 13-5.
[43] Wong, L.-P.; Low, M. Y.H.; Chong, C. S., Bee colony optimization with local search for traveling salesman problem, International Journal on Artificial Intelligence Tools, 19, 305-334, (2010)
[44] Bitam S, Batouche M, Talbi EG. A survey on bee colony algorithms. In: Proceedings of 2010 IEEE international symposium on parallel and distributed processing, workshops and Ph.D. forum (IPDPSW); 2010. p. 1-8.
[45] Sang-Min P, Jair-Hoom K. Chameleon: a resource scheduler in a data grid environment. In: Proceedings of third IEEE international symposium on cluster computing and the grid (CCGRID’03); 2003. p. 258-65.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.