Performance modeling of serial and parallel implementations of the fractional Adams-Bashforth-Moulton method.

*(English)*Zbl 1307.65105Summary: Numerical schemes for solving fractional differential equations are computationally heavy, due to the floating-point operations needed and, more importantly, the data flow within the entire memory system of a computer. We choose the fractional Adams-Bashforth-Moulton method as a representative numerical scheme, and review various code optimizations that can be applied to its serial and parallel implementations. As the most important contribution of this paper, we propose a simple methodology to analyze the achievable serial and parallel performance, based on quantifying the amount of data flow through various stages of the entire memory system, together with a small set of easily obtainable hardware parameters. This quantitative approach to performance modeling can in most cases pinpoint the real performance bottleneck, while also verifying the actual performance improvements due to various code optimizations. Moreover, the optimization techniques and performance modeling approach can both be applied to other convolution-intensive numerical methods for solving fractional differential equations.

##### MSC:

65L05 | Numerical methods for initial value problems |

65L06 | Multistep, Runge-Kutta and extrapolation methods for ordinary differential equations |

34A34 | Nonlinear ordinary differential equations and systems, general theory |

34A08 | Fractional ordinary differential equations and fractional differential inclusions |

65Y05 | Parallel numerical computation |

65Y20 | Complexity and performance of numerical algorithms |

##### Keywords:

fractional differential equation; fractional Adams-Bashforth-Moulton method; performance modeling; parallel computing; multicore CPU; numerical examples
PDF
BibTeX
XML
Cite

\textit{W. Zhang} et al., Fract. Calc. Appl. Anal. 17, No. 3, 617--637 (2014; Zbl 1307.65105)

Full Text:
DOI

##### References:

[1] | AMD Opteron™ 6200 Series Processors: Linux Tuning Guide, 2012. |

[2] | M. Caputo, Linear models of dissipation whose Q is almost frequency independent, II. Geophys. J. Int., 13, No 5 (1967), 529-539. http://dx.doi.org/10.1111/j.1365-246X.1967.tb02303.x |

[3] | B. Chapman, G. Jost and R. van der Pas, Using OpenMP: Portable Shared Memory Parallel Programming. MIT Press, Cambridge, MA (2007). |

[4] | K. Diethelm, The Analysis of Fractional Differential Equations. Springer, Berlin (2001). |

[5] | K. Diethelm, An efficient parallel algorithm for the numerical solution of fractional differential equations. Fract. Calc. Appl. Anal. 14, No 3 (2011), 475-490; DOI: 10.2478/s13540-011-0029-1; http://link.springer.com/article/10.2478/s13540-011-0029-1. http://dx.doi.org/10.2478/s13540-011-0029-1 |

[6] | K. Diethelm, N.J. Ford and A.D. Freed, A predictor-corrector approach for the numerical solution of fractional differential equations. Nonlinear Dynamics 29 (2002), 3-22. http://dx.doi.org/10.1023/A:1016592219341 · Zbl 1009.65049 |

[7] | S. Goedecker and A. Hoisie, Performance Optimization of Numerically Intensive Codes. SIAM, Philadelphia (2001). http://dx.doi.org/10.1137/1.9780898718218 · Zbl 0985.68006 |

[8] | W. Gropp and E. Lusk and A. Skjellum, Using MPI: Portable Parallel Programming with the Message-Passing Interface. MIT Press, Cambridge, MA (1994). · Zbl 0875.68206 |

[9] | Intel Xeon Processor E5-2670. http://ark.intel.com/products/64595/. |

[10] | Intel Xeon Processor E5504. http://ark.intel.com/products/40711/. |

[11] | E.D. Lazowska, J. Zahorjan, G.S. Graham and K.C. Sevcik, Quantitative System Performance: Computer System Analysis Using Queueing Network Models. Prentice-Hall, Upper Saddle River, NJ (1984). |

[12] | PAPI: Performance Application Programming Interface. http://icl.cs.utk.edu/papi/. |

[13] | Scalasca homepage. http://www.scalasca.org/. |

[14] | The STREAM2 Home Page. http://www.cs.virginia.edu/stream/stream2/. |

[15] | S. Williams, A. Waterman and D. Patterson, Roofline: an insightful visual performance model for multicore architectures. Commun. ACM 52, No 4 (2009), 65-76. http://dx.doi.org/10.1145/1498765.1498785 |

[16] | W. Zhang and X. Cai, Efficient implementations of the Adams-Bashforth-Moulton method for solving fractional differential equations. In: Proceedings of FDA’12, Nanjing (2012). |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.