Decentralized list scheduling. (English) Zbl 1273.90086

Summary: Classical list scheduling is a very popular and efficient technique for scheduling jobs on parallel and distributed platforms. It is inherently centralized. However, as the number of processors increases, the cost of managing a single centralized list becomes prohibitive. A suitable approach to reduce the contention is to distribute the list among the computational units: each processor only has a local view of the work to execute. Thus, the scheduler is no longer greedy and standard performance guarantees are lost.

The objective of this work is to study the extra cost that must be paid when the list is distributed among the computational units. We first present a general methodology for computing the expected makespan based on the analysis of an adequate potential function which represents the load imbalance between the local lists. We obtain an equation giving the evolution of the potential by computing its expected decrease in one step of the schedule. Our main theorem shows how to solve such equations to bound the makespan. Then, we apply this method to several scheduling problems, namely unit independent tasks, weighted independent tasks and tasks with precedence constraints. More precisely, we prove that the time for scheduling a global workload \(W\) composed of independent unit tasks on \(m\) processors is equal to \(W/m\) plus an additional term proportional to \(\log_2 W\). We provide a lower bound which shows that this is optimal up to a constant. This result is extended to the case of weighted independent tasks. In the last setting, precedence task graphs, our analysis leads to an improvement on the bound of N. S. Arora et al. [Theory Comput. Syst. 34, No. 2, 115–144 (2001; Zbl 0978.68020)]. We end with some experiments using a simulator. The distribution of the makespan is shown to fit existing probability laws. Moreover, the simulations give a better insight into the additive term, whose value is shown to be around \(3 \log_2 W\), confirming the precision of our analysis.
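
The unit independent task setting described in the summary can be explored with a small simulation. The sketch below is not the authors' simulator; it is a minimal illustration under assumptions that the summary does not state: all \(W\) tasks start on one processor, time is slotted, an idle processor spends its step sending one steal request to a victim chosen uniformly at random, a victim serves at most one request per step, and a successful steal transfers half of the victim's remaining tasks. The names `simulate_makespan`, `total_work` and `num_procs` are hypothetical.

```python
import math
import random


def simulate_makespan(total_work, num_procs, seed=0):
    """Return the number of time steps needed to run `total_work` unit tasks."""
    rng = random.Random(seed)
    loads = [0] * num_procs
    loads[0] = total_work  # assumption: all work starts on processor 0
    steps = 0
    while any(loads):
        before = list(loads)  # state at the start of this time step
        requests = {}         # victim -> idle processors targeting it
        for p in range(num_procs):
            if before[p] > 0:
                loads[p] -= 1  # a busy processor executes one unit task
            else:
                # an idle processor picks a victim uniformly among the others
                victim = rng.randrange(num_procs - 1)
                if victim >= p:
                    victim += 1
                requests.setdefault(victim, []).append(p)
        for victim, thieves in requests.items():
            if before[victim] == 0:
                continue  # the victim was idle as well, so the steal fails
            stolen = loads[victim] // 2  # assumption: steal half of the rest
            if stolen > 0:
                thief = rng.choice(thieves)  # only one request is served
                loads[victim] -= stolen
                loads[thief] += stolen
        steps += 1
    return steps


if __name__ == "__main__":
    W, m = 1 << 14, 16
    runs = [simulate_makespan(W, m, seed=s) for s in range(20)]
    mean_overhead = sum(runs) / len(runs) - W / m
    # Compare the measured additive term with log2(W).
    print(mean_overhead, math.log2(W))
```

Subtracting \(W/m\) from the simulated makespan isolates the additive term, which can then be compared with \(\log_2 W\) for several values of \(W\). The constants observed depend on the modelling assumptions above and need not match those reported in the paper.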

MSC:

90B35 Deterministic scheduling theory in operations research

Citations:

Zbl 0978.68020

Software:

KAAPI

References:

[1] Adler, M., Chakrabarti, S., Mitzenmacher, M., & Rasmussen, L. (1995). Parallel randomized load balancing. In Proceedings of STOC (pp. 238–247). · Zbl 0968.68569
[2] Arora, N. S., Blumofe, R. D., & Plaxton, C. G. (2001). Thread scheduling for multiprogrammed multiprocessors. Theory of Computing Systems, 34(2), 115–144. · Zbl 0978.68020
[3] Azar, Y., Broder, A. Z., Karlin, A. R., & Upfal, E. (1999). Balanced allocations. SIAM Journal on Computing, 29(1), 180–200. · Zbl 0937.68053 · doi:10.1137/S0097539795288490
[4] Bender, M. A., & Rabin, M. O. (2002). Online scheduling of parallel programs on heterogeneous systems with applications to Cilk. Theory of Computing Systems, 35, 289–304. · Zbl 1017.68035 · doi:10.1007/s00224-002-1055-5
[5] Berenbrink, P., Friedetzky, T., & Goldberg, L. A. (2003). The natural work-stealing algorithm is stable. SIAM Journal on Computing, 32(5), 1260–1279. · Zbl 1027.60082 · doi:10.1137/S0097539701399551
[6] Berenbrink, P., Friedetzky, T., Goldberg, L. A., Goldberg, P. W., Hu, Z., & Martin, R. (2007). Distributed selfish load balancing. SIAM Journal on Computing, 37(4), 1163–1181. · Zbl 1141.68018 · doi:10.1137/060660345
[7] Berenbrink, P., Friedetzky, T., Hu, Z., & Martin, R. (2008). On weighted balls-into-bins games. Theoretical Computer Science, 409(3), 511–520. · Zbl 1155.68085 · doi:10.1016/j.tcs.2008.09.023
[8] Berenbrink, P., Friedetzky, T., & Hu, Z. (2009). A new analytical method for parallel, diffusion-type load balancing. Journal of Parallel and Distributed Computing, 69(1), 54–61. · Zbl 06521952 · doi:10.1016/j.jpdc.2008.05.005
[9] Blumofe, R. D., & Leiserson, C. E. (1999). Scheduling multithreaded computations by work stealing. Journal of the ACM, 46(5), 720–748. · Zbl 1065.68504 · doi:10.1145/324133.324234
[10] Chekuri, C., & Bender, M. (2001). An efficient approximation algorithm for minimizing makespan on uniformly related machines. Journal of Algorithms, 41(2), 212–224. · Zbl 1051.68150 · doi:10.1006/jagm.2001.1184
[11] Drozdowski, M. (2009). Scheduling for parallel processing. Berlin: Springer. · Zbl 1187.68090
[12] Frigo, M., Leiserson, C. E., & Randall, K. H. (1998). The implementation of the Cilk-5 multithreaded language. In Proceedings of PLDI.
[13] Gast, N., & Gaujal, B. (2010). A mean field model of work stealing in large-scale systems. In Proceedings of SIGMETRICS.
[14] Gautier, T. (2010). Personal communication.
[15] Gautier, T., Besseron, X., & Pigeon, L. (2007). KAAPI: a thread scheduling runtime system for data flow computations on cluster of multi-processors. In Proceedings of PASCO (pp. 15–23).
[16] Graham, R. L. (1969). Bounds on multiprocessing timing anomalies. SIAM Journal on Applied Mathematics, 17, 416–429. · Zbl 0188.23101 · doi:10.1137/0117039
[17] Hwang, J. J., Chow, Y. C., Anger, F. D., & Lee, C. Y. (1989). Scheduling precedence graphs in systems with interprocessor communication times. SIAM Journal on Computing, 18(2), 244–257. · Zbl 0677.68026 · doi:10.1137/0218016
[18] Kotz, S., & Nadarajah, S. (2001). Extreme value distributions: theory and applications. Singapore: World Scientific. · Zbl 0960.62051
[19] Leung, J. (2004). Handbook of scheduling: algorithms, models, and performance analysis. Boca Raton: CRC Press. · Zbl 1103.90002
[20] Lueling, R., & Monien, B. (1993). A dynamic distributed load balancing algorithm with provable good performance. In SPAA: annual ACM symposium on parallel algorithms and architectures.
[21] Mitzenmacher, M. (1998). Analyses of load stealing models based on differential equations. In Proceedings of SPAA (pp. 212–221).
[22] Robert, Y., & Vivien, F. (2009). Introduction to scheduling. London/Boca Raton: Chapman & Hall/CRC Press.
[23] Robison, A., Voss, M., & Kukanov, A. (2008). Optimization via reflection on work stealing in TBB. In Proceedings of IPDPS (pp. 1–8).
[24] Rudolph, L., Slivkin-Allalouf, M., & Upfal, E. (1991). A simple load balancing scheme for task allocation in parallel machines. In SPAA (pp. 237–245).
[25] Sanders, P. (1999). Asynchronous random polling dynamic load balancing. In A. Aggarwal & C. P. Rangan (Eds.), Lecture notes in computer science: Vol. 1741. ISAAC (pp. 37–48). Berlin: Springer. · Zbl 0970.68620
[26] Schwiegelshohn, U., Tchernykh, A., & Yahyapour, R. (2008). Online scheduling in grids. In Proceedings of IPDPS. · Zbl 1208.68102
[27] Tchiboukdjian, M., Gast, N., Trystram, D., Roch, J. L., & Bernard, J. (2010). A tighter analysis of work stealing. In The 21st international symposium on algorithms and computation (ISAAC). · Zbl 1310.68050
[28] Traoré, D., Roch, J. L., Maillard, N., Gautier, T., & Bernard, J. (2008). Deque-free work-optimal parallel STL algorithms. In Proceedings of Euro-Par (pp. 887–897).