Pascal Manual¶
Introduction¶
Pasccal is an algorithm for learning probabilistic integrity constraints. It was proposed in [RBZ+21]. It contains modules for both structure and parameter learning.
Pascal is also available in the cplint on SWISH web application at http://cplint.eu.
Installation¶
Pascal is distributed as a pack of SWI-Prolog. To install it, use
?- pack_install(pack).
Requirements¶
It requires the pack
It is installed automatically when installing pack pascal or can be installed manually as
$ swipl
?- pack_install(lbfgs).
lbfgs uses a foreign library and contains the library binaries for 32 and 64 bits Linux. If you want to recompile the foreign library you can use
?- pack_rebuild(lbfgs).
On 32 and 64 bits Linux this should work out of the box.
You can upgrade the pack with
$ swipl
?- pack_upgrade(pack).
Note that the pack on which pascal depends is not upgraded automatically in this case so it needs to be upgraded manually.
Example of use¶
$ cd <pack>/pascal/prolog/examples
$ swipl
?- [bongardkeys].
?- induce_pascal([train]),T).
Testing the installation¶
$ swipl
?- [library(test_pascal)].
?- test_pascal.
Support¶
Use the Google group https://groups.google.com/forum/#!forum/cplint.
Language¶
A Probabilistic Constraint Logic Theory (PCLT) is a set of Probabilistic Integrity Constraints (PIC) of the form
where \(p\) is a probability, each \(L_i\) is a literal and each \(P_j\) and \(N_j\) is a conjunction of literals. We call each \(P_j\) a P conjunction and each \(N_k\) an N conjunction. We call each \(\exists(P_j)\) a P disjunct and each \(\forall\neg(N_k)\) an N disjunct.
The variables that occur in the body are quantified universally with scope the PIC. The variables in the head that do not occur in the body are quantified existentially if they occur in a P disjunct and universally if they occur in an N disjunct, with scope the disjunct they occur in.
An example of a PIC for the Bongard problems of [DRVL95]
which states that if there is a triangle inside a square then either there exists a circle inside the square or there doesn’t exist a circle inside the triangle. This constraint has probability 0.5.
Use¶
The following learning algorithms are available:
Parameter learning
Structure learning
Input¶
To execute the learning algorithms, prepare a Prolog file divided in five parts
preamble
background knowledge, i.e., knowledge valid for all interpretations
PCLT for you which you want to learn the parameters (optional)
language bias information
example interpretations
The preamble must come first, the order of the other parts can be changed.
For example, consider the Bongard problems of [DRVL95]. bongardkeys.pl represents a Bongard problem for Pascal.
Preamble¶
In the preamble, the Pascal library is loaded with (bongardkeys.pl):
:- use_module(library(pascal)).
Now you can initialize pascal with
:- pascal.
At this point you can start setting parameters for Pascal such as for example
:-set_pascal(examples,keys(pos)).
:-set_pascal(learning_algorithm,gradient_descent).
:-set_pascal(learning_rate,fixed(0.5)).
:-set_pascal(verbosity,1).
We will see later the list of available parameters.
A parameter that is particularly important for Pascal is verbosity
: if set to 1,
nothing is printed and learning is fastest,
if set to 3 much information is printed and learning is slowest, 2 is in between. This ends the preamble.
Background and Initial PCLT¶
Now you can specify the background knowledge by including a set of Prolog clauses
in a section between :- begin_bg.
and :- end_bg.
For example
:- begin_bg.
in(A,B) :- inside(A,B).
in(A,D) :- inside(A,C),in(C,D).
:- end_bg.
Moreover, you can specify an initial PCLT in a section between :- begin_in.
and :- end_in.
.
The initial program is used in parameter learning for providing the structure.
In the section, facts for the predicates rule/2
or ic/1
can be given.
Facts for rule/2
take the form rule(ic,prob)
where prob
is a probability and
ic
is a term of the form head:-body
. In it, body
is a list of literals and head
is a list of head disjuncts. Each head disjunct is couple (sign,conjunction)
where sign is
either (+)
for P disjuncts or (-)
for N disjuncts and conjunction
is a list of literals.
The example of a PIC above can be expressed as
:- begin_in.
rule(([((+),[circle(C),in(C,S)]),((-),[circle(C),in(C,T)])]:-
[triangle(T),square(S),in(T,S)]),0.5).
:- end_in.
Facts for the predicate ic/1
take the form ic(string)
where string
is a Prolog
string wher a constraint is encoded as prob::body--->head
.
body
is a conjuntion of literals where the conjunction symbol is /\
.
head
is a disjunction where the disjunction symbol is \/
.
Each disjunct is either a conjunction of literals, in the case of a P disjunct, or of the form
not(conjunction)
where conjunction
is a conjunction of literals.
The example of a PIC above can be expressed as
:- begin_in.
ic("0.5 :: triangle(T)/\\square(S)/\\in(T,S)
--->
circle(C)/\\in(C,S)
\\/
not(circle(C)/\\in(C,T)).").
:- end_in.
Language Bias¶
The language bias part is specified by means of mode declarations in the style of Progol.
modeh(<recall>,<predicate>(<arg1>,...)).
specifies the atoms that can appear in the head of clauses, while
modeb(<recall>,<predicate>(<arg1>,...)).
specifies the atoms that can appear in the body of clauses. <recall>
can be an integer or *
.
<recall>
indicates how many atoms for the predicate specification are considered.
*
stands for all those that are found.
Otherwise the indicated number is randomly chosen.
Arguments of the form
+<type>
specifies that the argument should be an input variable of type <type>
, i.e., a variable replacing a +<type>
argument in the head or a -<type>
argument in a preceding literal in the current hypothesized clause.
Another argument form is
-<type>
for specifying that the argument should be a output variable of type <type>
.
Any variable can replace this argument, either input or output.
The only constraint on output variables is that those in the head of the current hypothesized clause must appear as output variables in an atom of the body.
Other forms are
#<type>
for specifying an argument which should be replaced by a constant of type <type>
<constant>
for specifying a constant.
An example of language bias for the Bongard domain is
modeh(*,triangle(+obj)).
modeh(*,square(+obj)).
modeh(*,circle(+obj)).
modeh(*,in(+obj,-obj)).
modeh(*,in(-obj,+obj)).
modeh(*,in(+obj,+obj)).
modeh(*,config(+obj,#dir)).
modeb(*,triangle(-obj)).
modeb(*,square(-obj)).
modeb(*,circle(-obj)).
modeb(*,in(+obj,-obj)).
modeb(*,in(-obj,+obj)).
modeb(*,config(+obj,#dir)).
Example Interpretations¶
The last part of the file contains the data.
You can specify data with two modalities: models and keys.
In the models type, you specify an example model (or interpretation or megaexample) as a list of Prolog facts initiated by begin(model(<name>)).
and terminated by end(model(<name>)).
as in
begin(model(2)).
pos.
triangle(o5).
config(o5,up).
square(o4).
in(o4,o5).
circle(o3).
triangle(o2).
config(o2,up).
in(o2,o3).
triangle(o1).
config(o1,up).
end(model(2)).
The facts in the interpretation are loaded in SWI-Prolog database by adding an extra initial argument equal to the name of the model.
After each interpretation is loaded, a fact of the form int(<id>)
is asserted, where id
is the name of the interpretation.
This can be used in order to retrieve the list of interpretations.
Alternatively, with the keys modality, you can directly write the facts and the first argument will be interpreted as a model identifier. The above interpretation in the keys modality is
pos(2).
triangle(2,o5).
config(2,o5,up).
square(2,o4).
in(2,o4,o5).
circle(2,o3).
triangle(2,o2).
config(2,o2,up).
in(2,o2,o3).
triangle(2,o1).
config(2,o1,up).
which is contained in the bongardkeys.pl.
This is also how model 2
above is stored in SWI-Prolog database.
The two modalities, models and keys, can be mixed in the same file.
Facts for int/1
are not asserted for interpretations in the key modality but can be added by the user explicitly.
In order to specify if an interpretation is positive or negative, you should include in the interpretation a
fact. The form of the fact dependes on the parameter examples
whose values are {auto, keys(pred)}
. If set to auto
, positive examples in the models format should contain a pos
fact and in the keys format a pos(id)
fact, where id
is the identifier of the interpretation. If set to keys(pred)
, pred
or pred(pos)
(pred(id)
or pred(id,pos)
in the keys format) is used instead of pos
to determine positive exmples.
Then you must indicate how the examples are divided in folds with facts of the form: fold(<fold_name>,<list of model identifiers>)
, as for example
fold(train,[2,3,...]).
fold(test,[490,491,...]).
As the input file is a Prolog program, you can define intensionally the folds as in
fold(all,F):-
findall(I,int(I),F).
which however must be inserted after the input interpretations otherwise the facts for int/1
will not be available and fold all
would be empty.
Commands¶
Parameter Learning¶
To execute parameter learning, prepare an input file as indicated above and call
?- induce_par_pascal(<list of folds>,T).
where <list of folds>
is a list of the folds for training and T
will contain the input theory with updated parameters.
For example bongardkeys.pl, you can perform parameter learning on the train
fold with
?- induce_par_pascal([train],P).
Structure Learning¶
To execute structure learning, prepare an input file in the editor panel as indicated above and call
induce(+List_of_folds:list,-T:list) is det
where List_of_folds
is a list of the folds for training and T
will contain the learned PCLT.
For example bongardkeys.pl, you can perform structure learning on the train
fold with
?- induce([train],P).
A PCLT can also be tested on a test set with test_pascal/7
or test_prob_pascal/6
as described below.
Testing¶
A PCLT can also be tested on a test set with
test_pascal(+T:list,+List_of_folds:list,-LL:float,-AUCROC:float,-ROC:list,-AUCPR:float,-PR:list) is det
or
test_prob_pascal(+T:list,+List_of_folds:list,-NPos:int,-NNeg:int,-LL:float,-ExampleList:list) is det
where T
is a list of terms representing clauses and List_of_folds
is a list of folds.
test_pascal/7
returns the log likelihood of the test examples in LL
, the Area Under the ROC curve in AUCROC
, a dictionary containing the list of points (in the form of Prolog pairs x-y
) of the ROC curve in ROC
, the Area Under the PR curve in AUCPR
, a dictionary containing the list of points of the PR curve in PR
.
test_prob_pascal/6
returns the log likelihood of the test examples in LL
, the numbers of positive and negative examples in NPos
and NNeg
and the list ExampleList
containing couples Prob-Ex
where Ex
is a
for a
a positive example and \+(a)
for a
a negative example and Prob
is the probability of example a
.
Then you can draw the curves in cplint
on SWISH using C3.js using
compute_areas_diagrams(+ExampleList:list,-AUCROC:float,-ROC:dict,-AUCPR:float,-PR:dict) is det
(from pack auc.pl) that takes as input a list ExampleList
of pairs probability-literal of the form that is returned by test_prob_pascal/6
.
For example, to test on fold test
the program learned on fold train
you can run the query
?- induce_par([train],P),
test(P,[test],LL,AUCROC,ROC,AUCPR,PR).
Or you can test the input program on the fold test
with
.. code:: prolog
?- in(P),test(P,[test],LL,AUCROC,ROC,AUCPR,PR).
In cplint
on SWISH, by including
.. code:: prolog
:- use_rendering(c3). :- use_rendering(lpad).
in the code before :- pascal.
the curves will be shown as graphs using C3.js and the output program will be pretty printed.
Parameters for Learning¶
Parameters are set with commands of the form
:- set_pascal(<parameter>,<value>).
The available parameters are:
examples
: (values:{auto, keys(pred)}
, default value:auto
) if set toauto
, positive examples in the models format should contain apos
fact and in the keys format apos(id)
fact, whereid
is the identifier of the interpretation. If set tokeys(pred)
,pred
is used instead ofpos
to determine positive exmplesbeamsize
(values: integer, default value: 10): size of the beamverbosity
(values: integer in [1,3], default value: 1): level of verbosity of the algorithms.max_nodes
(values: integer, defualt value: 10): maximum number of iteration of beam searchoptimal
(values:{yes, no}
, default value:no
): whether ther refinement operator is optimal or notmax_length
(values: integer, defualt value: 4): maximum number of body literals and head disjunctsmax_lengths
(values: list of integers[Body,Disjuncts,LitIn+,LitIn-]
, defualt value:[1,1,1,0]
): maximum number of, respectively, body literals, head disjuncts, literals in P disjuncts and literals in N disjunctsmax_initial_weight
(values: real, default value \(0.1)\): absolute value of the maximum of the initial weights in weight learning.learning_algorithm
(values:{gradient_descent, lbfgs}
, default value:gradient_descent
): algorithm for parameter learningrandom_restarts_number
(values: integer, default value: 1): number of random restarts for gradient descent parameter learninglearning_rate
(values:{fixed(value),decay(eta_0,eta_tau,tau)}
, default value:fixed(0.01)
): value of the learning rate, either fixed to a value or set with a decay strategygd_iter
(values: integer, default value: 1000): maximum number of gradient descent iterationsepsilon
(values: real, default value: 0.0001): if the difference in the log likelihood in two successive parameter gradient descent iterations is smaller thanepsilon
, then the algorithm stopsepsilon_fraction
(values: real, default value: 0.00001): if the difference in the log likelihood in two successive parameter gradient descent iterations is smaller thanepsilon_fraction*(-current log likelihood)
, then the algorithm stopsregularization
(values:{1,2}
, default value:2
): either L1 or L2 regularization in gradient descent and lbfgsregularizing_constant
(values: real, default value: 5): value of the regularizatiom constant in gradient descent and lbfgsmax_rules
(values: integer, default value: 10): maximum number of PIC in the final theorylogzero
(values: negative real, default value \(\log(0.01)\): value assigned to \(\log(0)\)zero
(values: positive real, default value0.0001
: value assigned to \(0\) when computing denominators that are countsminus_infinity
(values: negative real, default value-1.0e20
: value assigned to \(-\infty\), used as the intial value of the log likelihood in parameter learning
Example Files¶
The pack/pascal/prolog/examples
folder in SWI-Prolog home contains some example programs.
The pack/pascal/docs
folder contains this manual in html and pdf.
Manual in PDF¶
A PDF version of the manual is available at http://friguzzi.github.io/pascal/_build/latex/pascal.pdf.
License¶
Pascal follows the BSD 2-Clause License that you can find in the root folder. The copyright is by Fabrizio Riguzzi.
References¶
- DRVL95(1,2)
L. De Raedt and W. Van Laer. Inductive constraint logic. In Proceedings of the 6th Conference on Algorithmic Learning Theory (ALT 1995), volume 997 of LNAI, 80–94. Fukuoka, Japan, 1995. Springer.
- RBZ+21
Fabrizio Riguzzi, Elena Bellodi, Riccardo Zese, Marco Alberti, and Evelina Lamma. Probabilistic inductive constraint logic. Machine Learning, 110:723–754, 2021. doi:10.1007/s10994-020-05911-6.