Process mining is a data science technique that can help with discovering, monitoring, improving or predicting processes. It lets you exploit the data in your information systems in a useful way. As a research discipline, process mining sits between process modelling and analysis on one side and computational intelligence and data mining on the other. Process mining starts by combining data from your information systems into an event log. In this event log, every recorded event refers to an activity that is related to a case. Each event is executed by an originator and has a timestamp, and the events are ordered [1]. The event log can also store additional information added by the person or device that executes the process [2].
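To make the structure of an event log concrete, here is a minimal sketch in Python. The `Event` class, the `traces_by_case` helper and the order-handling data are all hypothetical and invented for illustration; real event logs usually come from an information system export and carry many more attributes.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    case_id: str         # the case the event belongs to
    activity: str        # the activity that was executed
    originator: str      # the person or device that executed it
    timestamp: datetime  # when the event happened


def traces_by_case(events):
    """Group events per case and order each group by timestamp."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event.case_id].append(event)
    return {
        case_id: [e.activity for e in sorted(case_events, key=lambda e: e.timestamp)]
        for case_id, case_events in grouped.items()
    }


# Hypothetical order-handling log, invented for illustration only.
event_log = [
    Event("order-1", "register", "alice", datetime(2024, 1, 1, 9, 0)),
    Event("order-2", "register", "alice", datetime(2024, 1, 1, 9, 5)),
    Event("order-1", "check", "bob", datetime(2024, 1, 1, 10, 0)),
    Event("order-1", "ship", "carol", datetime(2024, 1, 2, 8, 30)),
    Event("order-2", "reject", "bob", datetime(2024, 1, 1, 11, 0)),
]

traces = traces_by_case(event_log)
print(traces)
```

Each case ends up as an ordered list of activities, often called a trace. Here `order-1` becomes `register, check, ship` and `order-2` becomes `register, reject`.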
Process mining process
Process mining starts with the planning of a process mining project. After the planning, the input and output of the project need to be determined, for instance the models, objectives, data, and questions that are needed from the information systems. Questions that need to be answered to retrieve this data are “what can be used for analysis?” and “what are important questions?”. In the next stage of the process, a model is built and linked to the event log. Once the model is created, it can be extended with other perspectives according to the relations established in the previous stage. In the last step, the established models can be used for operational support [2].
Process mining types
As mentioned in the introduction, process mining can be used for different purposes. The literature mentions different types of process mining [3][4][5][6]. The main types of process mining are:
- Process discovery makes a model out of the event log without using any additional information. It is used to learn how the process flows.
- Process conformance compares an existing process model with the event log of that process. With this technique, it is possible to check whether reality and the model correspond.
- Process enhancement tries to extend or improve an already existing process. While process conformance only checks the existing model, process enhancement changes this model. Van der Aalst [6] calls this re-engineering.
- The fourth type that is mentioned by Van der Aalst [6] is operational support. For example, predictions could help with deadlines and with predicting remaining costs.
Figure 1 [6] positions these four types of process mining. Figure 2 is an adapted figure from van der Aalst et al. [2] that describes the needed input and delivered output of the different process mining types. The operational support part of the figure is added to the original figure.
Figure 1: Overview positioning the different types of process mining and the role of log abstractions and model abstractions. Reprinted from [6].
Figure 2: The basic types of process mining explained in terms of input and output. Adapted from [2].
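As a rough illustration of discovery and conformance, the sketch below derives a directly-follows relation from a small log and then checks a new trace against it. This is a deliberately simple stand-in, not one of the discovery algorithms from the cited literature. The function names and the example log are assumptions made for this example.

```python
from collections import Counter


def discover_directly_follows(traces):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def deviations(trace, allowed_pairs):
    """Return the steps in a trace that the model does not allow."""
    return [(a, b) for a, b in zip(trace, trace[1:]) if (a, b) not in allowed_pairs]


# Hypothetical log: each inner list is the ordered activities of one case.
log = [
    ["register", "check", "ship"],
    ["register", "check", "ship"],
    ["register", "reject"],
]

# Discovery: the model is learned from the log alone.
dfg = discover_directly_follows(log)
model = set(dfg)

# Conformance: compare a new case with the discovered model.
new_trace = ["register", "ship"]
skipped = deviations(new_trace, model)
print(dfg)
print(skipped)
```

The new trace skips the check step, so the pair `("register", "ship")` is reported as a deviation. Enhancement would then be the decision to change the model, for example by allowing that shortcut.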
Process mining perspectives
How a process is used, by whom, and for what reason is valuable information for an auditor. Process mining can help to determine these aspects, which correspond to the following process mining perspectives [1][7][8]:
- The process perspective focuses on the flow of activities. The goal in this perspective is to find good descriptions of all possible paths, usually resulting in a process model.
- The organisational perspective focuses on what the organisational structure looks like. The goal of this perspective is to determine which roles are present and how they interact.
- The case perspective focuses on the characteristics of a case, like its path and actors.
- Van der Aalst et al. [2] mentioned a fourth process mining perspective in the process mining manifesto, the time perspective. This perspective focuses on the timing of the events, which can help to discover bottlenecks, monitor resources, predict the remaining time and measure service levels.
These perspectives help to determine how a process was executed, who was involved and what happened [7]. So, the perspectives indicate how a user looks at a process model, while the process mining types determine how process mining can be used.
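The time and organisational perspectives can be sketched on the same kind of data. The example below computes the mean waiting time between consecutive activities and counts handovers of work between originators. The events and variable names are hypothetical, and a real analysis would also have to handle concurrency and working hours.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical events: (case id, activity, originator, timestamp).
events = [
    ("c1", "register", "alice", datetime(2024, 1, 1, 9, 0)),
    ("c1", "check", "bob", datetime(2024, 1, 1, 11, 0)),
    ("c1", "ship", "carol", datetime(2024, 1, 1, 12, 0)),
    ("c2", "register", "alice", datetime(2024, 1, 2, 9, 0)),
    ("c2", "check", "bob", datetime(2024, 1, 2, 13, 0)),
    ("c2", "ship", "carol", datetime(2024, 1, 2, 14, 0)),
]

# Group the events per case, with the timestamp first so sorting orders by time.
cases = defaultdict(list)
for case_id, activity, originator, timestamp in events:
    cases[case_id].append((timestamp, activity, originator))

waits = defaultdict(list)  # time perspective: hours between consecutive steps
handovers = Counter()      # organisational perspective: who hands work to whom
for case_events in cases.values():
    case_events.sort()
    for (t1, a1, o1), (t2, a2, o2) in zip(case_events, case_events[1:]):
        waits[(a1, a2)].append((t2 - t1).total_seconds() / 3600)
        if o1 != o2:
            handovers[(o1, o2)] += 1

mean_wait_hours = {step: sum(v) / len(v) for step, v in waits.items()}
bottleneck = max(mean_wait_hours, key=mean_wait_hours.get)
print(mean_wait_hours, bottleneck, handovers)
```

In this invented data the step from register to check takes three hours on average, against one hour from check to ship, so it is flagged as the likely bottleneck.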
Process model quality
It is not simple to determine whether a process model is useful. It also depends on the perspective from which the user looks at the model. The quality of a model can depend on many dimensions. The following four main quality dimensions can be distinguished [5][9]:
- Fitness is the ability of the process model to reproduce all the cases from the event log. An activity in a process needs a trigger to start and has an output that is the trigger for another activity to start. The trigger output could be called a token, so every activity consumes and produces one or several tokens.
- Precision should avoid underfitting of the model. The model should not allow more behaviour than occurred in the event log. Models with a loop are not precise because, in theory, they could have an infinite number of routes.
- Simplicity captures the complexity of a model. A process model should be as simple as possible, with not too many nodes. A model that has many nodes could be seen as complex. A spaghetti model (see Figure 3 [10]) is an example of a model that is complex. Occam’s Razor [11], which states that one should choose the option with the fewest assumptions, is often mentioned with this criterion.
- Generalization should help to avoid overfitting. All cases in the data set on which the process model is based should fit into the model, but the model should not be limited to these cases. It should be possible for other behaviour, outside the data set, to fit into the model.
Algorithms should find a balance between precision and generalization. In practice, it is hard to balance all four of the criteria because they can contradict each other. That is why algorithms mostly focus on only one or two of these quality criteria. Which criteria are chosen depends on the viewpoint from which the user looks at the process model.
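The tension between fitness and precision can be shown with a deliberately simplified sketch. It assumes a model can be written as a finite set of allowed traces, which does not hold for models with loops. The two functions are illustrative ratios, not the token-based or alignment-based metrics used in the literature. All names and data are hypothetical.

```python
def fitness(log, model_traces):
    """Share of logged cases that the model can reproduce."""
    replayable = sum(1 for trace in log if tuple(trace) in model_traces)
    return replayable / len(log)


def precision(log, model_traces):
    """Share of the modelled behaviour that was actually observed."""
    observed = {tuple(trace) for trace in log}
    return len(model_traces & observed) / len(model_traces)


# Hypothetical log with four cases.
log = [
    ["register", "check", "ship"],
    ["register", "check", "ship"],
    ["register", "reject"],
    ["register", "ship"],
]

# A strict model that only allows the most frequent path.
strict_model = {("register", "check", "ship")}

# A loose model that allows every observed path plus one that never occurred.
loose_model = {
    ("register", "check", "ship"),
    ("register", "reject"),
    ("register", "ship"),
    ("register", "check", "reject"),
}

print(fitness(log, strict_model), precision(log, strict_model))
print(fitness(log, loose_model), precision(log, loose_model))
```

The strict model scores 0.5 on fitness and 1.0 on precision, while the loose model scores 1.0 on fitness and 0.75 on precision. Improving one dimension costs something on the other, which is the balancing problem described above.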
Figure 3: Spaghetti process describing the diagnosis and treatment [10].
Process mining is a data science technique that can help with discovering, monitoring, improving or predicting processes. There are different ways in which process mining can be used and in which you can look at a process model. Which quality criteria of a process model are important differs per user and depends on how he or she wants to use the model. Therefore, users should always determine whether the model is of good enough quality to reach their goal.
[1] van Dongen, B. F., de Medeiros, A. K., Verbeek, H. M., Weijters, A. J., & van der Aalst, W. M. (2005). The prom framework: A new era in process mining tool support. ICATPN (3536), 444-454.
[2] van der Aalst, W. M., Adriansyah, A., de Medeiros, A. K., Arcieri, F., Baier, T., Blickle, T., et al. (2011). Process mining manifesto. International Conference on Business Process Management, 169-194, Springer, Berlin, Heidelberg.
[3] Hakvoort, R., & Sluiter, A. (2008). Process Mining: Conformance analysis from a financial audit perspective. Int. J. Business Process Integration and Management, 1-26.
[4] Rojas, E., Munoz-Gama, J., Sepúlveda, M., & Capurro, D. (2016). Process mining in healthcare: A literature review. Journal of biomedical informatics (61), 224-236.
[5] van der Aalst, W. M. (2011). Process mining: Discovery, Conformance and Enhancement of Business Processes. Springer: Dordrecht.
[6] van der Aalst, W. M. (2017). Process Discovery from Event Data: Relating Models and Logs Through Abstractions. Eindhoven: Technische Universiteit Eindhoven.
[7] Jans, M., Alles, M., & Vasarhelyi, M. (2013). The case for process mining in auditing: Sources of value added and areas of application. International Journal of Accounting Information Systems, 14(1), 1-20.
[8] van der Aalst, W. M., Reijers, H. A., Weijters, A. J., van Dongen, B. F., Alves de Medeiros, A. K., Song, M., & Verbeek, H. M. (2007). Business process mining: An industrial application. Information Systems, 32(5), 713-732.
[9] Buijs, J. C., Van Dongen, B. F., & van der Aalst, W. M. (2012). On the role of fitness, precision, generalization and simplicity in process discovery. OTM Confederated International Conferences "On the Move to Meaningful Internet Systems", 305-322.
[10] van der Aalst, W. M. (2011a). Process mining: discovering and improving Spaghetti and Lasagna processes. Computational Intelligence and Data Mining (CIDM), 2011 IEEE Symposium on, 1-7.
[11] Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. K. (1987). Occam’s razor. Information Processing Letters, 24(6), 377-380.