[22]:
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
warnings.filterwarnings("ignore", category=DeprecationWarning)

Synthetic Log Generation from Declare Positional Models

What is a Declare Positional Based Model

A Declare positional-based model is essentially a Declare model with a different set of constraint types, which refer to the position and time of events within a trace.

Defining an activity

Activities are defined with the keyword activity. Several activities can be defined on the same line, separated by commas.

When defining an activity, remember that the colon character ``:`` followed by a space cannot be used in a name, because the parser uses that sequence to separate activities from attributes and attributes from values.

Example : activity activity_name_1, ..., activity_name_n or simply activity activity_name_1

Assigning attributes to activities

Activities have attributes. To assign attributes to activities, use the keyword bind.

Example : bind activity_name_1, ..., activity_name_n: attribute_name_1, ... , attribute_name_n

A definition line of this type assigns several attributes to several activities at once: every activity on the line receives the same attributes as the others on that line.

Otherwise, the binding can also be done singularly: bind activity_name_1: attribute_name_1, ... , attribute_name_n

Note: If a previous bind already assigned attributes to an activity, a later bind adds the new attributes to it, so the activity ends up with the attributes of both binds.

Note: If an activity has not been defined, the parser of the PositionalBasedModel defines it for you. The activity line can therefore be omitted, and a single bind line defines both the activities and their attributes:

Example : bind activity_name_1, ..., activity_name_n: attribute_name_1, ... , attribute_name_n
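The rules above (comma-separated names, a colon followed by a space as the separator, and binds that accumulate) can be illustrated with a small sketch. The helpers split_declaration and apply_binds are hypothetical names written for this example; they are not the parser used by PositionalBasedModel and only mimic the behaviour described in this section.

```python
from __future__ import annotations


def split_declaration(line: str) -> tuple[list[str], list[str]]:
    """Split a line such as 'bind a, b: x, y' into (['a', 'b'], ['x', 'y'])."""
    body = line.strip()
    # Drop the leading keyword, if there is one.
    for keyword in ("activity ", "bind "):
        if body.startswith(keyword):
            body = body[len(keyword):]
            break
    # A colon followed by a space separates the left side from the right side,
    # which is why that sequence cannot appear inside a name.
    left, separator, right = body.partition(": ")
    names = [name.strip() for name in left.split(",") if name.strip()]
    values = [value.strip() for value in right.split(",") if value.strip()] if separator else []
    return names, values


def apply_binds(lines: list[str]) -> dict[str, list[str]]:
    """Accumulate attributes per activity; a later bind adds to earlier ones."""
    bindings: dict[str, list[str]] = {}
    for line in lines:
        activities, attributes = split_declaration(line)
        for activity in activities:
            known = bindings.setdefault(activity, [])
            for attribute in attributes:
                if attribute not in known:
                    known.append(attribute)
    return bindings


model_lines = [
    "bind terapia_A, terapia_B: resource, qt",
    "bind terapia_A: age",
]
print(apply_binds(model_lines))
```

In this sketch terapia_A ends up with the attributes of both bind lines, while terapia_B keeps only those of the first.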

Assigning values to attributes

Remember to assign values to the attributes created before, otherwise the PositionalBasedModel will raise an error!

No keyword is needed to assign values to attributes. The following line assigns several values to several attributes:

Example: attribute_name_1, ..., attribute_name_n: attribute_value_1, ... , attribute_value_n

This type of definition line creates attributes of type Enumeration.

To define a range of integers or floats, use the following definitions instead:

Integer Example: attribute_name_1, ..., attribute_name_n: integer between x and y

Float Example: attribute_name_1, ..., attribute_name_n: float between x and y

These definition lines create attributes of type integer between or float between.

Note: In a float between definition, numbers must be written with a point as the decimal separator. The precision of the floating point number is given by the number of digits after the point:

Float Example: attribute_name_1: float between 10.0 and 15.0 (1 decimal digit)

Float Example: attribute_name_1: float between 10.00 and 15.00 (2 decimal digits)

Float Example: attribute_name_1: float between 10.0 and 15.000 (3 decimal digits; the bound with the higher precision determines the number of digits)

Note: Redefining an attribute with its values will overwrite the current definition!
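The precision rule can be sketched in a few lines. The ASP export shown later in this notebook lists qt as 2..149 for float between 0.2 and 14.9, which suggests that float ranges are scaled to integers by a power of ten. The helpers float_precision and to_integer_range below are hypothetical and written under that assumption; they are not part of the library.

```python
from __future__ import annotations


def float_precision(lower: str, upper: str) -> int:
    """Number of decimal digits, taken from the more precise of the two bounds."""

    def digits(text: str) -> int:
        # Everything after the decimal point counts as a digit of precision.
        _, _, fraction = text.partition(".")
        return len(fraction)

    return max(digits(lower), digits(upper))


def to_integer_range(lower: str, upper: str) -> tuple[int, int]:
    """Scale a float range to integers (assumed encoding, e.g. 0.2..14.9 -> 2..149)."""
    scale = 10 ** float_precision(lower, upper)
    return round(float(lower) * scale), round(float(upper) * scale)


print(float_precision("10.0", "15.000"))
print(to_integer_range("0.2", "14.9"))
```

The bounds are passed as strings because the number of written digits, not the numeric value, determines the precision.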

Creating constraints

The following two tables present the constraints and their arguments, together with the features that each argument supports.

| Declare Function | Argument 1 | Type | Supports Variables | Supports Conditional Operators | Arg can be empty | Argument 2 | Type | Supports Variables | Supports Conditional Operators | Arg can be empty |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pos(activity a, position p, time t) | activity a | encode | yes | yes (only !=) | no | position p | int | yes | yes | yes |
| payload(attribute a, value v, position p) | attribute a | encode | yes | yes (only !=) | no | value v | any | yes | yes | no |
| payload_range(attribute a, min_value min, max_value max, position p) | attribute a | encode | yes | no | no | min_value min | int / float | no | no | yes |
| absolute_pos(activity a, position p, time t) | activity a | encode | yes | no | no | position p | int | yes | no | no |
| pos_not_greater_than(activity a, position p, time t) | activity a | encode | yes | no | no | position p | int | yes | no | no |
| pos_not_lower_than(activity a, position p, time t) | activity a | encode | yes | no | no | position p | int | yes | no | no |
| absolute_payload(attribute a, value v) | attribute a | encode | yes | no | no | value v | any | yes | no | no |

| Declare Function | Argument 3 | Type | Supports Variables | Supports Conditional Operators | Arg can be empty | Argument 4 | Type | Supports Variables | Supports Conditional Operators | Arg can be empty |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pos(activity a, position p, time t) | time t | int | yes | yes | yes | - | - | - | - | - |
| payload(attribute a, value v, position p) | position p | int | yes | yes | yes | - | - | - | - | - |
| payload_range(attribute a, min_value min, max_value max, position p) | max_value max | int / float | no | no | yes | position p | int | yes | yes | yes |
| absolute_pos(activity a, position p, time t) | time t | int | yes | no | yes | - | - | - | - | - |
| pos_not_greater_than(activity a, position p, time t) | time t | int | yes | no | yes | - | - | - | - | - |
| pos_not_lower_than(activity a, position p, time t) | time t | int | yes | no | yes | - | - | - | - | - |

Typing

Arguments can have 4 different types:

  • int : The argument accepts integer numbers.

  • float : The argument accepts floating point numbers.

  • encode : The argument is the name of an Activity, the name of an Attribute or an Enumeration value.

  • any : The argument can be an int, a float or an encode.

Variables

Arguments that support this feature can have their value replaced by a character sequence that defines a variable.

To define a variable, write : followed by upper-case letters or digits, with no spaces. Example: :VAR1

Example: pos(activity a, :P1, :T1) or payload(attribute a, :V2, :P2)

Conditional constraints

Conditional constraints are implemented with the use of variables. They relate the arguments of one or more Declare functions to each other or to constant values, which makes the functions more expressive.

Structure of a conditional constraint: value or variable (+,-) value or variable (==,!=,<=,<,>,>=) value or variable (+,-) value or variable

Example of Conditional constraints: :Var1 != 7, :Var1 == :Var2, :Var1 == :Var2 + 10, :Var1 >= :Var2 - 10
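The structure given above (an operand, optionally followed by + or - and a second operand, on each side of a comparison operator) can be checked with a regular expression. This is a minimal sketch: parse_condition is a hypothetical helper, not the library's parser, and the operand pattern (a variable, a number or a plain name) is an assumption made for the example.

```python
import re

# An operand is a variable (":" + letters/digits), a number, or a plain name.
OPERAND = r"(?::[A-Za-z0-9]+|-?\d+(?:\.\d+)?|[A-Za-z_]\w*)"
# One side of the comparison: an operand, optionally followed by +/- and another operand.
SIDE = rf"{OPERAND}(?:\s*[+-]\s*{OPERAND})?"
# Two-character operators come first so that "<=" is not read as "<".
CONDITION = re.compile(rf"^\s*({SIDE})\s*(==|!=|<=|>=|<|>)\s*({SIDE})\s*$")


def parse_condition(text: str) -> tuple:
    """Return (left_side, operator, right_side) or raise ValueError."""
    match = CONDITION.match(text)
    if match is None:
        raise ValueError(f"not a conditional constraint: {text!r}")
    left, operator, right = match.groups()
    return left, operator, right


for example in (":P1 == :P2 + 2", ":T1 + 10 <= :T2", ":ACT1 != a"):
    print(parse_condition(example))
```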

Variables and Conditional constraints

By combining variables and conditional constraints, constraints can be written as follows:

Example: pos(activity a, :P1, :T1), pos(activity b, :P2, :T2), :P1 == :P2 + 2, :T1 + 10 <= :T2

Example: pos(:ACT1, position p, time t), :ACT1 != a

Conditional Operators

Arguments that support conditional operators can be written in the following way:

Example: pos(a, :P1, >=5), pos(a, >=:P1, 10!=) which corresponds to pos(a, :P1, :T1), :T1 >= 5, pos(a, :P2, :T2), :P2 >= :P1, 10 != :T2

This makes it possible to bound variables or values in different ways.

Note: Conditional operators can be placed in front or at the end of a value or variable. The parsing is done completely by the Positional Based model.
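The correspondence between the short form and the explicit form can be sketched as a small rewriting step. The function desugar is hypothetical and only reproduces the equivalence shown in the example above; the actual parsing is done inside the Positional Based model. The name of the fresh variable is supplied by the caller.

```python
import re

_OPERATORS = r"==|!=|<=|>=|<|>"
_PREFIX = re.compile(rf"^({_OPERATORS})\s*(.+)$")   # e.g. ">=5"
_SUFFIX = re.compile(rf"^(.+?)\s*({_OPERATORS})$")  # e.g. "10!="


def desugar(argument: str, fresh_variable: str) -> tuple:
    """Return (plain_argument, extra_condition); the condition is None if no operator is present."""
    text = argument.strip()
    prefix = _PREFIX.match(text)
    if prefix:
        operator, operand = prefix.groups()
        # Operator in front: the argument is compared against the operand.
        return fresh_variable, f"{fresh_variable} {operator} {operand}"
    suffix = _SUFFIX.match(text)
    if suffix:
        operand, operator = suffix.groups()
        # Operator at the end: the operand is compared against the argument.
        return fresh_variable, f"{operand} {operator} {fresh_variable}"
    return text, None


print(desugar(">=5", ":T1"))
print(desugar("10!=", ":T2"))
print(desugar(":P1", ":UNUSED"))
```

An argument without an operator is returned unchanged, so the same function can be applied to every argument of a constraint.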

Empty Arguments

Arguments can be left empty using _. This means that it doesn’t matter what value is there. The generator will then choose any value for that argument that fits.

Some more examples:

| Declare Function Example | Function |
| --- | --- |
| pos(activity a, position p, time t) | The activity “a” will be placed in position “p” at time “t” |
| pos(activity a, position p, _) | The activity “a” will be placed in position “p” at any time “t” |
| pos(activity a, _, _) | The activity “a” will be present at least 1 time in the events at any position “p” and at any time “t” |
| payload(attribute a, value v, position p) | The attribute “a” in position “p” will have a value “v” |
| payload(attribute a, value v, _) | One attribute “a” in any position “p” will have a value “v” |
| payload_range(attribute a, int min, int max, position p) | The attribute “a” in position “p” will have a value in between “min” and “max” |
| payload_range(attribute a, int min, _, position p) | The attribute “a” in position “p” will have a value in between “min” and the maximum value that the attribute can have |
| payload_range(attribute a, _, int max, position p) | The attribute “a” in position “p” will have a value in between the minimum value that the attribute can have and “max” |
| payload_range(attribute a, _, _, position p) | The attribute “a” in position “p” will have a value in between the minimum value that the attribute can have and the maximum value that the attribute can have |
| payload_range(attribute a, int min, int max, _) | One attribute “a” in any position “p” will have a value in between “min” and “max” |
| absolute_pos(activity a, position p, time t) | The activity “a” will be placed in position “p” at time “t” and there will not be any more occurrences of that activity in the trace events |
| pos_not_greater_than(activity a, position p, time t) | The activity “a” will be placed in a position not greater than “p” at a time not greater than “t” and there will not be any more occurrences of that activity after position “p” at time “t” |
| pos_not_greater_than(activity a, position p, _) | The activity “a” will be placed in a position not greater than “p” at any time “t” and there will not be any more occurrences of that activity after position “p” |
| pos_not_lower_than(activity a, position p, time t) | The activity “a” will be placed in a position not lower than “p” at a time not lower than “t” and there will not be any more occurrences of that activity before position “p” at time “t” |
| pos_not_lower_than(activity a, position p, _) | The activity “a” will be placed in a position not lower than “p” at any time “t” and there will not be any more occurrences of that activity before position “p” |
| absolute_payload(attribute a, value v) | The attribute “a” will have value “v” in any activity that contains the attribute “a” in the trace |
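To make the difference between pos and absolute_pos concrete, the sketch below checks both against a hand-written trace. It is an illustration of the semantics described above, not library code: the functions pos and absolute_pos here are plain Python predicates, a trace is assumed to be a list of (activity, time) pairs with positions counted from 1, and None plays the role of the empty argument _.

```python
from typing import List, Optional, Tuple

Trace = List[Tuple[str, int]]  # events as (activity, time); position = index + 1


def pos(trace: Trace, activity: str, position: Optional[int] = None, time: Optional[int] = None) -> bool:
    """True if some event of the activity matches the given position and time."""
    return any(
        act == activity
        and (position is None or index == position)
        and (time is None or event_time == time)
        for index, (act, event_time) in enumerate(trace, start=1)
    )


def absolute_pos(trace: Trace, activity: str, position: int, time: Optional[int] = None) -> bool:
    """Like pos, but the activity must not occur anywhere else in the trace."""
    occurrences = [index for index, (act, _) in enumerate(trace, start=1) if act == activity]
    return occurrences == [position] and pos(trace, activity, position, time)


trace = [("analisi_sangue", 1), ("terapia_A", 2), ("analisi_sangue", 4)]
print(pos(trace, "analisi_sangue"))              # any position, any time
print(absolute_pos(trace, "terapia_A", 2, 2))    # single occurrence at position 2, time 2
print(absolute_pos(trace, "analisi_sangue", 1))  # fails: the activity occurs twice
```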

Functioning of the positional model

Let’s see how the model works in practice. First import the model.

[2]:
from Declare4Py.ProcessMiningTasks.LogGenerator.PositionalBased.PositionalBasedModel import PositionalBasedModel

A PositionalBasedModel is instantiated with the function parse_from_file or parse_from_string. Both functions read a positional-based Declare specification and create a model accordingly.

For our model, the file experimental_model.decl will be used.

When the PositionalBasedModel is created, the following parameters can be passed to the constructor:

  • positional_time_start: int = Indicates the starting time unit for the positional based model. Standard value: 1

  • positional_time_end: int = Indicates the ending time unit for the positional based model. Standard value: 100

  • time_unit_in_seconds_min: int = Indicates the minimum value in seconds that 1 positional_time_unit can have. Standard value: 240

  • time_unit_in_seconds_max: int = Indicates the maximum value in seconds that 1 positional_time_unit can have. Standard value: 300

  • verbose: bool = Indicates if the user wants to see debug messages.

Each generated trace has its own positional time unit value: one positional time unit corresponds to a number of seconds between time_unit_in_seconds_min and time_unit_in_seconds_max. When the sequence of events is generated, each timestamp is calculated from the time unit assigned to the event.

NOTE: positional_time_end cannot be lower than the maximum number of events, otherwise the problem becomes UNSAT.

NOTE: The smaller the range between positional_time_start and positional_time_end, the less time trace generation takes; the larger the range, the longer it takes.
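The relation between positional time units and timestamps can be sketched as follows. How the library draws the duration of a time unit is not detailed here, so the single random draw per trace in event_timestamps is an assumption, and both helper names are hypothetical. The feasibility check reflects the rule in the ASP export shown later in this notebook, where consecutive positions must have strictly increasing times.

```python
import random
from datetime import datetime, timedelta
from typing import List, Optional


def event_timestamps(time_units: List[int], start: datetime,
                     unit_min_s: int = 240, unit_max_s: int = 300,
                     seed: Optional[int] = None) -> List[datetime]:
    """Map positional time units to timestamps (assumed: one unit length per trace)."""
    rng = random.Random(seed)
    seconds_per_unit = rng.randint(unit_min_s, unit_max_s)
    # Time unit 1 corresponds to the start timestamp.
    return [start + timedelta(seconds=(unit - 1) * seconds_per_unit) for unit in time_units]


def time_range_is_feasible(time_start: int, time_end: int, max_events: int) -> bool:
    """A trace of n events needs n distinct, strictly increasing time units."""
    return time_end - time_start + 1 >= max_events


stamps = event_timestamps([2, 5, 9], datetime(2024, 1, 1), seed=0)
print(stamps)
print(time_range_is_feasible(1, 100, 20))
print(time_range_is_feasible(1, 10, 20))
```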

[3]:
model_name = "experimental_model"
model_path: str = f"../../../Declare4Py/ProcessMiningTasks/LogGenerator/PositionalBased/DeclareFiles/{model_name}.decl"
model: PositionalBasedModel = PositionalBasedModel().parse_from_file(model_path)

The positional time range and the time unit range in seconds can be changed at any time with the functions set_positional_based_time_range and set_time_unit_in_seconds_range. Calling either function without arguments resets the values.

[4]:
model.set_positional_based_time_range()
model.set_time_unit_in_seconds_range()

The Positional Based Model can be exported to ASP or Declare strings using:

[5]:
# Returns the ASP string of the model for positive traces
print(model.to_asp())
time(1..100).

% p = number of events in each trace
pos(1..p).
% Generating part
{timed_event(A,P,T) : activity(A), time(T)} = 1 :- pos(P).
{assigned_value(K,V,P) : attribute_value(K,V)} = 1 :- timed_event(A,P,_), has_attribute(A,K).
% time event rule
:- timed_event(_,P1,T1), timed_event(_,P1+1,T2), T1 >= T2.
% traces length filter
:- p != #count{P : timed_event(_, P, _)}.
%returns
#show timed_event/3.
#show assigned_value/3.


activity(terapia_A).
has_attribute(terapia_A, resource).
has_attribute(terapia_A, qt).
has_attribute(terapia_A, age).

activity(terapia_B).
has_attribute(terapia_B, resource).
has_attribute(terapia_B, qt).
has_attribute(terapia_B, age).

activity(controllo).
has_attribute(controllo, resource).
has_attribute(controllo, age).

activity(analisi_sangue).
has_attribute(analisi_sangue, resource).
has_attribute(analisi_sangue, valore).
has_attribute(analisi_sangue, age).

attribute_value(resource, Y).
attribute_value(resource, C).
attribute_value(resource, K).
attribute_value(resource, O).
attribute_value(resource, X).
attribute_value(resource, R).
attribute_value(resource, Q).
attribute_value(resource, G).
attribute_value(resource, F).
attribute_value(resource, J).
attribute_value(resource, B).
attribute_value(resource, M).
attribute_value(resource, U).
attribute_value(resource, T).
attribute_value(resource, L).
attribute_value(resource, P).
attribute_value(resource, H).
attribute_value(resource, S).
attribute_value(resource, N).
attribute_value(resource, A).
attribute_value(resource, D).
attribute_value(resource, W).
attribute_value(resource, I).
attribute_value(resource, V).
attribute_value(resource, E).

attribute_value(qt, 2..149).

attribute_value(age, 20..90).

attribute_value(valore, 50..700).

c1r1_fixed_timed_event_rule :- timed_event(terapia_A, POS, TIME), POS != 2, TIME != 2.
c1r2_fixed_timed_event_rule :- timed_event(terapia_B, POS, TIME), POS != 5, TIME != 5.
c1r3_fixed_payload_rule :- assigned_value(age, ATTR_VALUE, _), ATTR_VALUE != 30.

rule :- timed_event(terapia_A, 2, 2), not c1r1_fixed_timed_event_rule, timed_event(terapia_B, 5, 5), not c1r2_fixed_timed_event_rule, timed_event(terapia_A, 2, 2), timed_event(terapia_B, 5, 5), timed_event(analisi_sangue, 1, _), timed_event(analisi_sangue, 3, _), timed_event(analisi_sangue, 6, _), timed_event(analisi_sangue, 9, _), timed_event(analisi_sangue, 11, _), timed_event(analisi_sangue, 15, _), timed_event(analisi_sangue, 17, _), timed_event(analisi_sangue, 19, _), assigned_value(age, 30, _), not c1r3_fixed_payload_rule, assigned_value(valore, @range(50,100), 1), assigned_value(valore, @range(150,400), 3), assigned_value(valore, @range(500,700), 6), assigned_value(valore, @range(500,700), 9), assigned_value(valore, @range(500,700), 11), assigned_value(valore, @range(500,700), 15), assigned_value(valore, @range(300,400), 17), assigned_value(valore, @range(50,150), 19).

c2r1_fixed_payload_rule :- assigned_value(age, ATTR_VALUE, _), ATTR_VALUE != 60.

rule :- not timed_event(terapia_A, _, _), not timed_event(terapia_B, _, _), timed_event(analisi_sangue, 1, _), timed_event(analisi_sangue, 3, _), timed_event(analisi_sangue, 6, _), timed_event(analisi_sangue, 9, _), timed_event(analisi_sangue, 11, _), timed_event(analisi_sangue, 15, _), timed_event(analisi_sangue, 17, _), timed_event(analisi_sangue, 19, _), assigned_value(age, 60, _), not c2r1_fixed_payload_rule, assigned_value(valore, @range(50,80), 1), assigned_value(valore, @range(100,350), 3), assigned_value(valore, @range(500,700), 6), assigned_value(valore, @range(500,700), 9), assigned_value(valore, @range(500,700), 11), assigned_value(valore, @range(500,700), 15), assigned_value(valore, @range(500,700), 17), assigned_value(valore, @range(500,700), 19).

:- not rule.

[6]:
# Returns the ASP string of the model for negative traces
print(model.to_asp(generate_negatives=True))
time(1..100).

% p = number of events in each trace
pos(1..p).
% Generating part
{timed_event(A,P,T) : activity(A), time(T)} = 1 :- pos(P).
{assigned_value(K,V,P) : attribute_value(K,V)} = 1 :- timed_event(A,P,_), has_attribute(A,K).
% time event rule
:- timed_event(_,P1,T1), timed_event(_,P1+1,T2), T1 >= T2.
% traces length filter
:- p != #count{P : timed_event(_, P, _)}.
%returns
#show timed_event/3.
#show assigned_value/3.


activity(terapia_A).
has_attribute(terapia_A, resource).
has_attribute(terapia_A, qt).
has_attribute(terapia_A, age).

activity(terapia_B).
has_attribute(terapia_B, resource).
has_attribute(terapia_B, qt).
has_attribute(terapia_B, age).

activity(controllo).
has_attribute(controllo, resource).
has_attribute(controllo, age).

activity(analisi_sangue).
has_attribute(analisi_sangue, resource).
has_attribute(analisi_sangue, valore).
has_attribute(analisi_sangue, age).

attribute_value(resource, M).
attribute_value(resource, L).
attribute_value(resource, I).
attribute_value(resource, Q).
attribute_value(resource, Y).
attribute_value(resource, C).
attribute_value(resource, W).
attribute_value(resource, F).
attribute_value(resource, U).
attribute_value(resource, T).
attribute_value(resource, O).
attribute_value(resource, N).
attribute_value(resource, J).
attribute_value(resource, V).
attribute_value(resource, S).
attribute_value(resource, G).
attribute_value(resource, H).
attribute_value(resource, K).
attribute_value(resource, B).
attribute_value(resource, X).
attribute_value(resource, R).
attribute_value(resource, D).
attribute_value(resource, E).
attribute_value(resource, P).
attribute_value(resource, A).

attribute_value(qt, 2..149).

attribute_value(age, 20..90).

attribute_value(valore, 50..700).

c1r1_fixed_timed_event_rule :- timed_event(terapia_A, POS, TIME), POS != 2, TIME != 2.
c1r2_fixed_timed_event_rule :- timed_event(terapia_B, POS, TIME), POS != 5, TIME != 5.
c1r3_fixed_payload_rule :- assigned_value(age, ATTR_VALUE, _), ATTR_VALUE != 30.

rule :- timed_event(terapia_A, 2, 2), not c1r1_fixed_timed_event_rule, timed_event(terapia_B, 5, 5), not c1r2_fixed_timed_event_rule, timed_event(terapia_A, 2, 2), timed_event(terapia_B, 5, 5), timed_event(analisi_sangue, 1, _), timed_event(analisi_sangue, 3, _), timed_event(analisi_sangue, 6, _), timed_event(analisi_sangue, 9, _), timed_event(analisi_sangue, 11, _), timed_event(analisi_sangue, 15, _), timed_event(analisi_sangue, 17, _), timed_event(analisi_sangue, 19, _), assigned_value(age, 30, _), not c1r3_fixed_payload_rule, assigned_value(valore, @range(50,100), 1), assigned_value(valore, @range(150,400), 3), assigned_value(valore, @range(500,700), 6), assigned_value(valore, @range(500,700), 9), assigned_value(valore, @range(500,700), 11), assigned_value(valore, @range(500,700), 15), assigned_value(valore, @range(300,400), 17), assigned_value(valore, @range(50,150), 19).

c2r1_fixed_payload_rule :- assigned_value(age, ATTR_VALUE, _), ATTR_VALUE != 60.

rule :- not timed_event(terapia_A, _, _), not timed_event(terapia_B, _, _), timed_event(analisi_sangue, 1, _), timed_event(analisi_sangue, 3, _), timed_event(analisi_sangue, 6, _), timed_event(analisi_sangue, 9, _), timed_event(analisi_sangue, 11, _), timed_event(analisi_sangue, 15, _), timed_event(analisi_sangue, 17, _), timed_event(analisi_sangue, 19, _), assigned_value(age, 60, _), not c2r1_fixed_payload_rule, assigned_value(valore, @range(50,80), 1), assigned_value(valore, @range(100,350), 3), assigned_value(valore, @range(500,700), 6), assigned_value(valore, @range(500,700), 9), assigned_value(valore, @range(500,700), 11), assigned_value(valore, @range(500,700), 15), assigned_value(valore, @range(500,700), 17), assigned_value(valore, @range(500,700), 19).

:- rule.

[7]:
# Returns the Declare string of the model
print(model.to_declare())
activity terapia_A
bind terapia_A: resource, qt, age
activity terapia_B
bind terapia_B: resource, qt, age
activity controllo
bind controllo: resource, age
activity analisi_sangue
bind analisi_sangue: resource, valore, age
resource: M, L, I, Q, Y, C, W, F, U, T, O, N, J, V, S, G, H, K, B, X, R, D, E, P, A
qt: float between 0.2 and 14.9
age: integer between 20 and 90
valore: float between 5.0 and 70.0
absolute_pos(terapia_A, 2, 2), absolute_pos(terapia_B, 5, 5), pos(analisi_sangue, 1, _), pos(analisi_sangue, 3, _), pos(analisi_sangue, 6, _), pos(analisi_sangue, 9, _), pos(analisi_sangue, 11, _), pos(analisi_sangue, 15, _), pos(analisi_sangue, 17, _), pos(analisi_sangue, 19, _), absolute_payload(age, 30), payload_range(valore, 5.0, 10.0, 1), payload_range(valore, 15.0, 40.0, 3), payload_range(valore, 50.0, 70.0, 6), payload_range(valore, 50.0, 70.0, 9), payload_range(valore, 50.0, 70.0, 11), payload_range(valore, 50.0, 70.0, 15), payload_range(valore, 30.0, 40.0, 17), payload_range(valore, 5.0, 15.0, 19)
!pos(terapia_A, _, _), !pos(terapia_B, _, _), pos(analisi_sangue, 1, _), pos(analisi_sangue, 3, _), pos(analisi_sangue, 6, _), pos(analisi_sangue, 9, _), pos(analisi_sangue, 11, _), pos(analisi_sangue, 15, _), pos(analisi_sangue, 17, _), pos(analisi_sangue, 19, _), payload_range(valore, 5.0, 8.0, 1) payload_range(valore, 10.0, 35.0, 3), payload_range(valore, 50.0, 70.0, 6), payload_range(valore, 50.0, 70.0, 9), payload_range(valore, 50.0, 70.0, 11), payload_range(valore, 50.0, 70.0, 15), payload_range(valore, 50.0, 70.0, 17), payload_range(valore, 50.0, 70.0, 19), absolute_payload(age, 60)

[8]:
# Returns the String of the parsed Declare model
print(model.get_parsed_model())
bind terapia_A, terapia_B: resource, qt, age
bind controllo: resource, age
bind analisi_sangue: resource, valore, age
# Cambiato 5 per problemi nel constraints
valore: float between 5.0 and 70.0
age: integer between 20 and 90
qt: float between 0.2 and 14.9
resource: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y

# terapia_A deve stare SOLO in pos 2, terapia_B deve stare SOLO in pos 5
# Ci sono regole su posizione analisi_sangue e relativo valore (cresce, rimane stabile e poi decresce)
# Età paziente == 30

absolute_pos(terapia_A, 2, 2), absolute_pos(terapia_B, 5, 5),  pos(analisi_sangue, 1, _), pos(analisi_sangue, 3, _), pos(analisi_sangue, 6, _), pos(analisi_sangue, 9, _), pos(analisi_sangue, 11, _), pos(analisi_sangue, 15, _), pos(analisi_sangue, 17, _), pos(analisi_sangue, 19, _), absolute_payload(age, 30), payload_range(valore, 5.0, 10.0, 1), payload_range(valore, 15.0, 40.0, 3),  payload_range(valore, 50.0, 70.0, 6), payload_range(valore, 50.0, 70.0, 9), payload_range(valore, 50.0, 70.0, 11), payload_range(valore, 50.0, 70.0, 15), payload_range(valore, 30.0, 40.0, 17), payload_range(valore, 5.0, 15.0, 19)

# terapia_A e terapia_B NON vengono eseguite
# Ci sono regole su posizione analisi_sangue e relativo valore (cresce e rimane stabile)
# Età paziente == 60
!pos(terapia_A, _, _), !pos(terapia_B, _, _), pos(analisi_sangue, 1, _), pos(analisi_sangue, 3, _), pos(analisi_sangue, 6, _), pos(analisi_sangue, 9, _), pos(analisi_sangue, 11, _), pos(analisi_sangue, 15, _), pos(analisi_sangue, 17, _), pos(analisi_sangue, 19, _), payload_range(valore, 5.0, 8.0, 1) payload_range(valore, 10.0, 35.0, 3), payload_range(valore, 50.0, 70.0, 6), payload_range(valore, 50.0, 70.0, 9), payload_range(valore, 50.0, 70.0, 11), payload_range(valore, 50.0, 70.0, 15), payload_range(valore, 50.0, 70.0, 17), payload_range(valore, 50.0, 70.0, 19), absolute_payload(age, 60)

The Positional Based Model can be exported to ASP or Declare files using:

[9]:
# Export both the Declare and the ASP file

export_path = "../../../output/" + model_name

# to_file writes to disk the same content that the previous cells printed.
# It exports both the asp and the decl file; it can also export only one of them, depending on the attributes asp_file and Decl_file
model.to_file(export_path)

# Alternatively, each file can be exported on its own
# The parameters are the same
model.to_asp_file(export_path)
model.to_decl_file(export_path)

The ASP string can also be exported in other ways:

[10]:
# Exports the model without constraints
model.to_asp_file_without_constraints(export_path + "_no_constraints")

# Creates one ASP file per constraint rule
# If generate_also_negatives is True, the negatives of the rules are exported as well
model.to_one_asp_file_per_constraints(export_path, generate_also_negatives=True)

NOTE: To use an exported file, remember to add the constant p to the ASP program manually with the line #const p = insert_value.
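Adding the constant can be scripted. The sketch below prepends the #const line to the text of an exported program; with_trace_length is a hypothetical helper, and the file name in the commented usage is only an example, since the exact names written by the export functions are not shown here.

```python
def with_trace_length(asp_program: str, num_events: int) -> str:
    """Prepend the constant p (number of events per trace) to an ASP program."""
    if num_events < 1:
        raise ValueError("num_events must be at least 1")
    # ASP statements end with a period.
    return f"#const p = {num_events}.\n{asp_program}"


program = "pos(1..p).\n#show timed_event/3."
print(with_trace_length(program, 20))

# Possible usage on an exported file (hypothetical path):
# from pathlib import Path
# path = Path("experimental_model.asp")
# path.write_text(with_trace_length(path.read_text(), 20))
```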

Synthetic Positional Based Log Generation from DECLARE Models

Declare4Py generates synthetic logs from DECLARE positional models with a positional approach based on Answer Set Programming, using the Clingo solver.

[11]:
from Declare4Py.ProcessMiningTasks.LogGenerator.PositionalBased.PositionalBasedLogGenerator import PositionalBasedLogGenerator

The generator is instantiated from the model initialized above and a few general settings.

[12]:
# Number of cases to be generated
num_of_cases = 40

# Minimum and maximum number of events a case can contain
(num_min_events, num_max_events) = (20, 20)

# Shows some feedback from the generator (set it to False to suppress all debug messages)
verbose = False

generator: PositionalBasedLogGenerator = PositionalBasedLogGenerator(num_of_cases, num_min_events, num_max_events, model, verbose=verbose)

# To change the number of traces use:
# generator.set_total_traces()
# To change the minimum and maximum number of events use:
# generator.set_min_max_events()
# To change the model use:
# generator.set_positional_based_model()

To run the generator, call the method run. It accepts the following parameters:

  • equal_rule_split: bool = Indicates whether an equal number of traces is generated for each rule. If not, the traces are generated randomly.

  • high_variability: bool = If True, the traces are generated one at a time; otherwise they are generated together, with low variability.

  • generate_negatives_traces: bool = Indicates whether negative traces are generated as well. If True, the number of traces is doubled: half are positive and half are negative.

  • positive_noise_percentage: int = Percentage of noise for positive traces: x percent of the positive traces are falsely given the negative label.

  • negative_noise_percentage: int = Percentage of noise for negative traces: x percent of the negative traces are falsely given the positive label.

  • append_results: bool = If True, the results of the current run are appended to the previous results; otherwise the previous results are deleted and replaced by the new ones.
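As an illustration of what an equal split across rules means, the sketch below divides a number of traces among the rules of a model. The helper split_equally is hypothetical, and the way it hands out a remainder (one extra trace to the first rules) is an assumption; the generator may treat that case differently.

```python
from typing import List


def split_equally(total_traces: int, num_rules: int) -> List[int]:
    """Number of traces per rule, as even as possible."""
    if num_rules < 1:
        raise ValueError("num_rules must be at least 1")
    base, remainder = divmod(total_traces, num_rules)
    # The first `remainder` rules receive one extra trace (assumed policy).
    return [base + (1 if index < remainder else 0) for index in range(num_rules)]


print(split_equally(40, 2))
print(split_equally(40, 3))
```

With the 40 cases and the two constraint rules of the model used in this notebook, each rule would receive 20 traces.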

[13]:
%%time
generator.run(generate_negatives_traces=True)
CPU times: total: 10.6 s
Wall time: 2.26 s

The results of the PositionalBasedLogGenerator can then be exported with the to_xes method, which saves them in a .xes event log, or with the to_csv method, which saves them in a .csv file.

[23]:
generator.to_csv(export_path)
generator.to_xes(export_path)

Generating traces with noise, equal split rule or high variability

Noise can be applied to the traces in order to generate falsely labelled traces. Given a value between 0 and 100, the generator produces the requested traces and then mislabels that percentage of them.

[15]:
%%time
generator.run(generate_negatives_traces=True, negative_noise_percentage=5, positive_noise_percentage=7)
generator.to_csv(f'{export_path}_Noise_Test.csv')
CPU times: total: 10.3 s
Wall time: 2.27 s
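The effect of the two noise percentages on the labels can be sketched as follows. The function apply_label_noise is a hypothetical stand-in, not the generator's implementation: the label strings, the rounding of the number of flipped traces and the random choice of which traces to flip are all assumptions.

```python
import random
from typing import List, Optional


def apply_label_noise(labels: List[str], positive_noise_percentage: int = 0,
                      negative_noise_percentage: int = 0,
                      seed: Optional[int] = None) -> List[str]:
    """Flip the given percentage of positive and negative labels."""
    rng = random.Random(seed)
    noisy = list(labels)
    plan = (
        ("positive", "negative", positive_noise_percentage),
        ("negative", "positive", negative_noise_percentage),
    )
    for label, flipped, percentage in plan:
        # Indices come from the original labels, so no trace is flipped twice.
        indices = [i for i, current in enumerate(labels) if current == label]
        how_many = round(len(indices) * percentage / 100)
        for i in rng.sample(indices, how_many):
            noisy[i] = flipped
    return noisy


labels = ["positive"] * 40 + ["negative"] * 40
noisy = apply_label_noise(labels, positive_noise_percentage=7, negative_noise_percentage=5, seed=1)
print(noisy[:40].count("negative"), noisy[40:].count("positive"))
```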

With equal_rule_split set to False, as in this case, the generator produces the traces without tracking which rule each trace is related to.

[16]:
%%time
generator.run(equal_rule_split=False, generate_negatives_traces=True, negative_noise_percentage=6, positive_noise_percentage=3)
generator.to_csv(f'{export_path}_No_Equal_Split_Rule_Test.csv')
CPU times: total: 5.64 s
Wall time: 1.54 s

With high_variability set to False, the generator quickly produces all traces together, but their variability is low. With high_variability set to True, as in the cell below, the traces are generated one at a time, which takes longer.

[17]:
%%time
generator.run(high_variability=True, equal_rule_split=True, generate_negatives_traces=True, negative_noise_percentage=6, positive_noise_percentage=3)
generator.to_csv(f'{export_path}_High_Variability_Test.csv')
CPU times: total: 2min 27s
Wall time: 33.4 s

Setting up the Length Distribution of the Cases

Users can specify a probability distribution over the lengths of the generated traces. The method set_distribution_type takes as parameter the distribution_type. By setting this parameter with the uniform value, a uniform distribution in [num_min_events, num_max_events] is chosen.

Also, the number of positive traces can be changed with the method set_total_traces.

[18]:
%%time
# Default is uniform
generator.set_distribution_type("uniform")

generator.run(high_variability=True, generate_negatives_traces=True)
generator.to_csv(f'{export_path}_Distribution_Test_1.csv')
CPU times: total: 2min 24s
Wall time: 29.9 s

A gaussian distribution requires a location (the mean, mu) and a scale (sigma, the standard deviation).

[19]:
%%time
generator.change_distribution_settings(min_num_events_or_mu=25.5, max_num_events_or_sigma=2.0, dist_type="gaussian")
generator.run(high_variability=True, generate_negatives_traces=True)
generator.to_csv(f'{export_path}_Distribution_Test_2.csv')
CPU times: total: 4min 59s
Wall time: 45.4 s
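To get a feeling for the trace lengths such a setting produces, the sketch below samples lengths from a gaussian with the same mu and sigma. It is not the generator's sampling code: sample_gaussian_lengths is a hypothetical helper, and rounding to the nearest integer and clipping at a minimum length are assumptions.

```python
import random
from typing import List, Optional


def sample_gaussian_lengths(n: int, mu: float, sigma: float,
                            seed: Optional[int] = None, min_length: int = 1) -> List[int]:
    """Draw n trace lengths from a gaussian, rounded to integers."""
    rng = random.Random(seed)
    return [max(min_length, round(rng.gauss(mu, sigma))) for _ in range(n)]


lengths = sample_gaussian_lengths(1000, mu=25.5, sigma=2.0, seed=42)
print(min(lengths), max(lengths), sum(lengths) / len(lengths))
```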

A custom distribution requires the user to set the probability for each length in [num_min_events, num_max_events]

[20]:
%%time
generator.set_distribution_type("custom")

# Let's change the minimum and maximum number of events
generator.set_min_max_events(19,26)
generator.set_custom_probabilities([0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.3])

generator.run(high_variability=True, generate_negatives_traces=True)
generator.to_csv(f'{export_path}_Distribution_Test_3.csv')
CPU times: total: 3min 54s
Wall time: 37.3 s
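A custom distribution needs one probability per admissible length, and the probabilities should sum to 1. The sketch below checks both conditions for the values used above and samples lengths accordingly. The helper sample_custom_lengths is hypothetical and uses the standard library rather than the generator.

```python
import math
import random
from typing import List, Optional


def sample_custom_lengths(n: int, min_events: int, max_events: int,
                          probabilities: List[float], seed: Optional[int] = None) -> List[int]:
    """Draw n trace lengths in [min_events, max_events] with the given probabilities."""
    lengths = list(range(min_events, max_events + 1))
    if len(probabilities) != len(lengths):
        raise ValueError(f"expected {len(lengths)} probabilities, got {len(probabilities)}")
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities must sum to 1")
    rng = random.Random(seed)
    return rng.choices(lengths, weights=probabilities, k=n)


# Lengths 19..26 are eight values, so eight probabilities are required.
probabilities = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.3]
sampled = sample_custom_lengths(40, 19, 26, probabilities, seed=7)
print(sampled)
```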

Setting up the Personalized Clingo configuration

More information

For more information on clingo and its functionalities consult: https://potassco.org/

For more information on the option commands consult the documentation of Clingo (Potassco) at: https://github.com/potassco/guide/releases/ or https://github.com/potassco/asprin/blob/master/asprin/src/main/clingo_help.py

Or download directly the documentation from here: https://github.com/potassco/guide/releases/download/v2.2.0/guide.pdf

Setting up the configuration

Clingo offers various options to personalize the solver's range of action, probabilistic reasoning and decision-making.

At the moment, the solver can be personalized through the method use_custom_clingo_configuration, with the following options:

  • The Configuration of clingo can be: “frumpy”, “tweety”, “crafty”, “jumpy”, “trendy” or “handy”. (Default is trendy)

  • The number of Threads used by clingo to speed up the process. (Default uses all available cores)

  • The Time limit is the maximum time that the solver can use in order to search for a satisfiable answer

  • The Random Frequency used by clingo in the decision-making is a float number between 0 and 1 included. Where 0 means: No random decisions and 1 means: Every decision is random. (Default is 0.3)

  • The Mode configures the optimization of the algorithm and can be either “optN” or “ignore”. (Default is optN)

  • The Sign of the operation, which can be “asp”, “pos”, “neg” or “rnd”. (Default is asp)

  • The Strategy configures the optimization of the strategy and can be “bb” or “usc”. (This functionality is not used in the default configuration)

  • The Heuristic used by clingo configures the decision heuristic and can be “Berkmin”, “Vmtf”, “Vsids”, “Domain”, “Unit” or “None”. (This functionality is not used in the default configuration)
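The options above can be validated and turned into solver arguments with a short sketch. The function clingo_arguments is hypothetical and is not how the generator is implemented: the allowed values and defaults come from the list above, the flag names follow clingo's command-line options, and whether the library passes them in exactly this form is an assumption.

```python
from typing import List, Optional

ALLOWED_VALUES = {
    "config": {"frumpy", "tweety", "crafty", "jumpy", "trendy", "handy"},
    "mode": {"optN", "ignore"},
    "sign_def": {"asp", "pos", "neg", "rnd"},
    "strategy": {"bb", "usc"},
    "heuristic": {"Berkmin", "Vmtf", "Vsids", "Domain", "Unit", "None"},
}


def clingo_arguments(config: str = "trendy", threads: Optional[int] = None,
                     time_limit: Optional[int] = None, frequency: float = 0.3,
                     mode: str = "optN", sign_def: str = "asp",
                     strategy: Optional[str] = None,
                     heuristic: Optional[str] = None) -> List[str]:
    """Validate the options and build a list of clingo command-line arguments."""
    chosen = {"config": config, "mode": mode, "sign_def": sign_def,
              "strategy": strategy, "heuristic": heuristic}
    for name, value in chosen.items():
        if value is not None and value not in ALLOWED_VALUES[name]:
            raise ValueError(f"{name}={value!r} is not one of {sorted(ALLOWED_VALUES[name])}")
    if not 0 <= frequency <= 1:
        raise ValueError("frequency must be between 0 and 1 included")

    arguments = [f"--configuration={config}"]
    if threads is not None:
        arguments.append(f"--parallel-mode={threads}")
    if time_limit is not None:
        arguments.append(f"--time-limit={time_limit}")
    arguments += [f"--rand-freq={frequency}", f"--opt-mode={mode}", f"--sign-def={sign_def}"]
    # Strategy and heuristic are only passed when explicitly requested.
    if strategy is not None:
        arguments.append(f"--opt-strategy={strategy}")
    if heuristic is not None:
        arguments.append(f"--heuristic={heuristic}")
    return arguments


print(clingo_arguments(config="frumpy", frequency=0.9, strategy="bb", heuristic="Domain"))
```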

[21]:
%%time

generator.use_default_clingo_configuration()
# The default configuration can be obtained using the following command
print(generator.get_current_clingo_configuration())

# To enable the custom configuration:
generator.use_custom_clingo_configuration(config="frumpy", threads=None, frequency=0.9, sign_def="asp", strategy="bb", heuristic="Domain")

# The current configuration then becomes the custom one
print(generator.get_current_clingo_configuration())

# this command tells the generator to use the default configuration again
# generator.use_default_clingo_configuration()
# It does not delete the old custom configuration, in fact the custom configuration can be re-enabled by calling
# generator.use_custom_clingo_configuration()

generator.run(high_variability=True, generate_negatives_traces=True)
generator.to_csv(f'{export_path}_Custom_Configuration_Test.csv')
{'CONFIG': 'jumpy', 'TIME-LIMIT': '120', 'THREADS': '16', 'FREQUENCY': '1', 'SIGN-DEF': 'rnd', 'MODE': 'optN', 'STRATEGY': 'bb', 'HEURISTIC': 'Vsids'}
{'CONFIG': 'frumpy', 'TIME-LIMIT': '120', 'THREADS': '16', 'FREQUENCY': '0.9', 'SIGN-DEF': 'asp', 'MODE': 'optN', 'STRATEGY': 'bb', 'HEURISTIC': 'Domain'}
CPU times: total: 6min 51s
Wall time: 45.8 s