Discovering workflow nets using integer linear programming.

*(English)*Zbl 1395.90166Summary: Process mining is concerned with the analysis, understanding and improvement of business processes. Process discovery, i.e. discovering a process model based on an event log, is considered the most challenging process mining task. State-of-the-art process discovery algorithms only discover local control flow patterns and are unable to discover complex, non-local patterns. Region theory based techniques, i.e. an established class of process discovery techniques, do allow for discovering such patterns. However, applying region theory directly results in complex, overfitting models, which is less desirable. Moreover, region theory does not cope with guarantees provided by state-of-the-art process discovery algorithms, both w.r.t. structural and behavioural properties of the discovered process models. In this paper we present an ILP-based process discovery approach, based on region theory, that guarantees to discover relaxed sound workflow nets. Moreover, we devise a filtering algorithm, based on
the internal working of the ILP-formulation, that is able to cope with the presence of infrequent, exceptional behaviour. We have extensively evaluated the technique using different event logs with different levels of exceptional behaviour. Our experiments show that the presented approach allows us to leverage the inherent shortcomings of existing region-based approaches. The techniques presented are implemented and readily available in the HybridILPMiner package in the open-source process mining tool-kits ProM (http://promtools.org) and RapidProM (http://rapidprom.org).

PDF
BibTeX
XML
Cite

\textit{S. J. van Zelst} et al., Computing 100, No. 5, 529--556 (2018; Zbl 1395.90166)

Full Text:
DOI

**OpenURL**

##### References:

[1] | Badouel E, Bernardinello L, Darondeau P (1995) Polynomial algorithms for the synthesis of bounded nets. In: Mosses PD, Nielsen M, Schwartzbach MI (eds) TAPSOFT’95: theory and practice of software development, 6th international joint conference CAAP/FASE, Aarhus, Denmark, May 22-26, 1995, Proceedings. Lecture notes in computer science, vol 915, pp 364-378. Springer, Berlin |

[2] | Bergenthum, R; Desel, J; Lorenz, R; Mauser, S, Synthesis of Petri nets from finite partial languages, Fundam Inf, 88, 437-468, (2008) · Zbl 1167.68037 |

[3] | Bergenthum R, Desel J, Lorenz R, Mauser S (2007) Process mining based on regions of languages. In: Alonso G, Dadam P, Rosemann M (eds) 5th international conference on business process management, BPM 2007, Brisbane, Australia, September 24-28, 2007, Proceedings. Lecture notes in computer science, 4714, pp 375-383. Springer |

[4] | Bernardinello L (1993) Synthesis of net systems. In: Marsan MA (ed) 14th International conference on application and theory of Petri nets 1993, Chicago, IL, USA, June 21-25, 1993, Proceedings. Lecture notes in computer science, vol 691, pp 89-105. Springer |

[5] | Bolt, A; Leoni, M; Aalst, WMP, Scientific workflows for process mining: building blocks, scenarios, and implementation, STTT, 18, 607-628, (2016) |

[6] | Buijs JCAM, van Dongen BF, van der Aalst WMP (2012) A genetic algorithm for discovering process trees. In: Proceedings of the IEEE congress on evolutionary computation, CEC 2012, Brisbane, Australia, June 10-15, 2012, pp 1-8. IEEE |

[7] | Buijs JCAM, van Dongen BF, van der Aalst WMP (2012) On the role of fitness, precision, generalization and simplicity in process discovery. In: Meersman R, Panetto H, Dillon TS, Rinderle-Ma S, Dadam P, Zhou X, Pearson S, Ferscha A, Bergamaschi S, Crux IF (eds) On the move to meaningful internet systems: OTM 2012, confederated international conferences: CoopIS, DOA-SVI, and ODBASE 2012, Rome, Italy, September 10-14, 2012. Proceedings, Part I. Lecture notes in computer science, vol 7565, pp 305-322. Springer |

[8] | Carmona, J; Cortadella, J, Process discovery algorithms using numerical abstract domains, IEEE Trans Knowl Data Eng, 26, 3064-3076, (2014) |

[9] | Conforti, R; Rosa, M; Hofstede, AHM, Filtering out infrequent behavior from business process event logs, IEEE Trans Knowl Data Eng, 29, 300-314, (2017) |

[10] | Darondeau P (1998) Deriving unbounded petri nets from formal languages. In: Sangiorgi D, de Simone R (eds) 9th international conference on CONCUR ’98: concurrency theory, Nice, France, September 8-11, 1998, Proceedings. Lecture notes in computer science, vol 1466, pp 533-548. Springer · Zbl 0940.68097 |

[11] | de Leoni M, Mannhardt F (2015) Road traffic fine management process. https://doi.org/10.4121/uuid:270fd440-1057-4fb9-89a9-b699b47990f5 · Zbl 1183.68710 |

[12] | Weerdt, J; Backer, M; Vanthienen, J; Baesens, B, A multi-dimensional quality assessment of state-of-the-art process discovery algorithms using real-life event logs, Inf Syst, 37, 654-676, (2012) |

[13] | Ehrenfeucht, A; Rozenberg, G, Partial (set) 2-structures. part I: basic notions and the representation problem, Acta Inf, 27, 315-342, (1990) · Zbl 0696.68082 |

[14] | Ehrenfeucht, A; Rozenberg, G, Partial (set) 2-structures. part II: state spaces of concurrent systems, Acta Inf, 27, 343-368, (1990) · Zbl 0696.68083 |

[15] | Leemans SJJ, Fahland D, van der Aalst WMP (2013) Discovering block-structured process models from event logs containing infrequent behaviour. In: Lohmann N, Song M, Wohed P (eds) Business process management workshops—BPM 2013 international workshops, Beijing, China, August 26, 2013, Revised papers. Lecture notes in business information processing, vol 171, pp 66-78. Springer |

[16] | Leemans SJJ, Fahland D, van der Aalst WMP (2013) Discovering block-structured process models from event logs—a constructive approach. In: Colom JM, Desel J (eds) Application and theory of Petri nets and concurrency—34th international conference, Petri Nets 2013, Milan, Italy, June 24-28, 2013. Proceedings. Lecture notes in computer science, vol 7927, pp 311-329. Springer · Zbl 1381.68211 |

[17] | Lorenz R, Juh G (2006) Towards synthesis of Petri nets from scenarios. In: Donatelli S, Thiagarajan PS (eds) Petri nets and other models of concurrency—ICATPN 2006, 27th international conference on applications and theory of Petri nets and other models of concurrency, Turku, Finland, June 26-30, 2006, Proceedings. Lecture notes in computer science, vol 4024, pp 302-321. Springer |

[18] | Lorenz R, Mauser S, Juh G (2007) How to synthesize nets from languages—a survey. In: Henderson SG, Biller B, Hsieh MH, Shortle J, Tew JD, Barton RR (eds) Proceedings of the winter simulation conference, WSC 2007, Washington, DC, USA, December 9-12, 2007, pp 637-647. WSC · Zbl 1382.68153 |

[19] | Mannhardt F (2016) Sepsis cases—event log. https://doi.org/10.4121/uuid:915d2bfb-7e84-49ad-a286-dc35f063a460 |

[20] | Maruster, L; Weijters, AJMM; Aalst, WMP; Bosch, A, A rule-based approach for process discovery: dealing with noise and imbalance in process logs, Data Min Knowl Discov, 13, 67-87, (2006) |

[21] | Munoz-Gama J (2016) Conformance checking and diagnosis in process mining—comparing observed and modeled processes. Lecture notes in business information processing, vol 270. Springer · Zbl 1375.68008 |

[22] | Murata, T, Petri nets: properties, analysis and applications, Proc IEEE, 77, 541-580, (1989) |

[23] | Reisig, W, The synthesis problem, Trans Petri Nets Other Models Concurr, 7, 300-313, (2013) · Zbl 1382.68153 |

[24] | Schrijver A (1999) Theory of linear and integer programming. Wiley-Interscience series in discrete mathematics and optimization. Wiley, London |

[25] | Solé M, Carmona J (2010) Process mining from a basis of state regions. In: Lilius J, Penczek W (eds) Applications and theory of Petri nets, 31st international conference, Petri nets 2010, Braga, Portugal, June 21-25, 2010, Proceedings. Lecture notes in computer science, vol 6128, pp 226-245. Springer · Zbl 0696.68083 |

[26] | van der Aalst WMP, Bolt A, van Zelst SJ (2017) RapidProM: mine your processes and not just your data. CoRR abs/1703.03740 · Zbl 0696.68082 |

[27] | Aalst, WMP, The application of Petri nets to workflow management, J Circuits Syst Comput, 8, 21-66, (1998) |

[28] | van der Aalst WMP (2016) Process mining—data science in action, 2nd edn. Springer, Berlin |

[29] | Aalst, WMP; Hofstede, AHM; Kiepuszewski, B; Barros, AP, Workflow patterns, Distrib Parallel Datab, 14, 5-51, (2003) |

[30] | Aalst, WMP; Weijters, AJMM; Maruster, L, Workflow mining: discovering process models from event logs, IEEE Trans Knowl Data Eng, 16, 1128-1142, (2004) |

[31] | Aalst, WMP; Rubin, V; Verbeek, HMW; Dongen, BF; Kindler, E; Günther, CW, Process mining: a two-step approach to balance between underfitting and overfitting, Softw Syst Model, 9, 87-111, (2010) |

[32] | Aalst, WMP; Hee, KM; Hofstede, AHM; Sidorova, N; Verbeek, HMW; Voorhoeve, M; Wynn, MT, Soundness of workflow nets: classification, decidability, and analysis, Formal Asp Comput, 23, 333-363, (2011) · Zbl 1225.68129 |

[33] | Aalst, WMP; Adriansyah, A; Dongen, BF, Replaying history on process models for conformance checking and performance analysis, Wiley Interdiscipl Rew Data Min Knowl Discov, 2, 182-192, (2012) |

[34] | Werf, JMEM; Dongen, BF; Hurkens, CAJ; Serebrenik, A, Process discovery using integer linear programming, Fundam Info, 94, 387-412, (2009) · Zbl 1183.68710 |

[35] | Dongen, BF; Medeiros, AKA; Wen, L, Process mining: overview and outlook of Petri net discovery algorithms, Trans Petri Nets Other Models Concurr, 2, 225-242, (2009) |

[36] | van Zelst SJ, van Dongen BF, van der Aalst WMP (2015) Avoiding over-fitting in ILP-based process discovery. In: Motahari-Nezhad HR, Recker J, Weidlich M (eds) Business process management—13th international conference, BPM 2015, Innsbruck, Austria, August 31-September 3, 2015, Proceedings. Lecture notes in computer science, vol 9253, pp 163-171. Springer |

[37] | van Zelst SJ, van Dongen BF, van der Aalst WMP (2015) ILP-based process discovery using hybrid regions. In: van der Aalst WMP, Bergenthum R, Carmona J (eds) Proceedings of the ATAED 2015 workshop, satellite event of Petri Nets/ACSD 2015, Brussels, Belgium, June 22-23, 2015. CEUR workshop proceedings, vol 1371 pp 47-61. CEUR-WS.org |

[38] | van Zelst SJ, van Dongen BF, van der Aalst WMP, Verbeek HMW (2017) Discovering relaxed sound workflow nets using integer linear programming. CoRR abs/1703.06733 |

[39] | Verbeek HMW, Buijs JCAM, van Dongen BF, van der Aalst WMP (2010) XES, XESame, and ProM 6. In: Soffer P, Proper E (eds) Information systems evolution—CAiSE Forum 2010, Hammamet, Tunisia, June 7-9, 2010, Selected extended papers. Lecture notes in business information processing, vol 72, pp 60-75. Springer |

[40] | Weijters AJMM, Ribeiro JTS (2011) Flexible heuristics miner (FHM). In: Proceedings of the IEEE symposium on computational intelligence and data mining, CIDM 2011, part of the IEEE symposium series on computational intelligence 2011, April 11-15, 2011, Paris, France, pp 310-317 |

[41] | Weijters, AJMM; Aalst, WMP, Rediscovering workflow models from event-based data using little thumb, Integr Comput-Aided Eng, 10, 151-162, (2003) |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.