Master's Thesis - Most Probable Alignment Computation
Most organizations, in a variety of fields such as banking, insurance and healthcare, execute several different (business) processes. Modern information systems allow us to track, store and retrieve data related to the execution of such processes in the form of event logs. Often, an organization has a global idea, or even a formal specification, of how a process is supposed to be executed. In other cases, laws and legislation dictate explicitly how a process ought to be executed. Hence, it is in the organization's interest to assess to what degree the execution of its processes is in line with the corresponding specification.
Conformance checking techniques, originating from the field of process mining, aim to solve the aforementioned problem. They allow us to quantify to what degree the actual execution of a process, as recorded in an event log, conforms to a corresponding process specification. Recently, alignments were introduced, which rapidly developed into the de facto standard in conformance checking. The major advantage of alignments over alternative conformance checking techniques is that deviations and/or mismatches are quantified in an exact, unambiguous manner.
When computing alignments, we convert a given process model, together with the behaviour recorded in an event log, into a synchronous product net and subsequently solve a shortest path problem on its state space. Typically, the well-known A* algorithm is used as the underlying solution to the shortest path problem. Moreover, several (in some cases alignment-specific) parametrization options are defined and applied on top of the basic A* solution.
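To illustrate the shortest-path view, the following is a minimal, hypothetical sketch (not the actual synchronous-product construction used in the literature): the model is simplified to a purely sequential list of activities, states (i, j) record how many trace events and model steps have been explained, and A* searches over synchronous, log and model moves.

```python
import heapq

def align(trace, model, move_cost=1):
    """Toy optimal-alignment computation via A*.

    The "model" here is a purely sequential process (a list of activity
    labels), a simplified stand-in for the state space of a synchronous
    product net. Legal moves from state (i, j):
      - synchronous move, cost 0, if trace[i] == model[j]
      - log move (event observed but not explained by the model), cost move_cost
      - model move (model step not observed in the log), cost move_cost
    """
    goal = (len(trace), len(model))

    def h(i, j):
        # Admissible (and consistent) heuristic: the difference between the
        # remaining trace and model lengths must be bridged by non-sync moves.
        return abs((len(trace) - i) - (len(model) - j)) * move_cost

    open_set = [(h(0, 0), 0, (0, 0), [])]  # entries: (f, g, state, moves)
    closed = set()
    while open_set:
        _, g, (i, j), moves = heapq.heappop(open_set)
        if (i, j) == goal:
            return g, moves
        if (i, j) in closed:
            continue
        closed.add((i, j))
        successors = []
        if i < len(trace) and j < len(model) and trace[i] == model[j]:
            successors.append((g, i + 1, j + 1, ("sync", trace[i])))
        if i < len(trace):
            successors.append((g + move_cost, i + 1, j, ("log", trace[i])))
        if j < len(model):
            successors.append((g + move_cost, i, j + 1, ("model", model[j])))
        for g2, i2, j2, move in successors:
            heapq.heappush(open_set, (g2 + h(i2, j2), g2, (i2, j2), moves + [move]))
    return None
```

For example, aligning the trace ⟨a, b, d⟩ against the sequential model ⟨a, b, c, d⟩ yields cost 1 with a single model move on c; real implementations search the (much larger) reachability graph of the synchronous product net instead.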
Solving a shortest path problem on the state space of the aforementioned synchronous product net usually yields the notion of an optimal alignment, i.e. an alignment that minimizes a cost function defined on top of the corresponding state space. Usually, a static cost function is used that treats every possible deviation as equally likely. Recently, several researchers have noted that this is rather unnatural: if we observe that some part of process behaviour is missing, why would very rare behaviour be equally likely to be inserted into an alignment as very dominant, omnipresent behaviour?
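The intuition behind a frequency-sensitive cost function can be sketched as follows (a hypothetical helper, not the definition from the cited paper on most probable alignments): rather than charging a uniform cost, each activity's deviation cost is derived from its relative frequency in the event log, e.g. as a negative log-likelihood, so that inserting rare behaviour into an alignment is more expensive than inserting dominant behaviour.

```python
import math
from collections import Counter

def likelihood_costs(log_traces):
    """Derive per-activity move costs from observed frequencies.

    cost(a) = -log p(a), where p(a) is the relative frequency of
    activity a in the event log. Frequent (dominant) activities thus
    receive low deviation costs, rare activities high ones.
    """
    counts = Counter(a for trace in log_traces for a in trace)
    total = sum(counts.values())
    return {a: -math.log(c / total) for a, c in counts.items()}
```

Plugging such costs into the shortest-path search biases it towards explaining deviations with likely behaviour, which is one possible reading of a "most likely" alignment.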
In recent work, researchers have therefore proposed an alternative definition, i.e. the most likely alignment. Such an alignment strives to explain deviations from the reference model in terms of the most likely alternative behaviour, rather than in terms of optimality. A concrete implementation and an experimental evaluation of the behaviour of such most likely alignments w.r.t. conventional alignments are, however, missing. In this MSc project, the student is therefore asked to implement, develop and compare several strategies for most likely alignment computation. Note that initially a set of existing definitions needs to be implemented, yet the student is explicitly encouraged to define alternative solutions as well.
Knowledge of basic computer science concepts, good programming skills (Java/Python) and an interest in the theoretical and practical aspects of process mining (i.e. conformance checking) are recommended.
- Process Mining Book - Data Science in Action
- Coursera Process Mining Course
- Paper: Replaying History on Process Models for Conformance Checking and Performance Analysis
- Paper: Tuning Alignment Computation: An Experimental Evaluation
- Paper: Automatic Root Cause Identification Using Most Probable Alignments
Prof. Wil van der Aalst
ir. Sebastiaan van Zelst
For more information
Send an e-mail to Sebastiaan van Zelst. Make sure to include detailed information about your background and scores for completed courses.