Probabilistic Logic Programming (PLP) introduces probabilistic reasoning into Logic Programs in order to represent uncertain information. It is receiving increased attention due to its applications, in particular in the field of Machine Learning.
In this tutorial we will show how to use
cplint on SWISH,
a web application for performing inference and learning
on user-defined probabilistic logic programs. You will
learn how to write a probabilistic
logic program processable by cplint on SWISH,
how to execute the different types of queries allowed by
this application and how to perform learning.
cplint on SWISH is based on SWISH, a web framework for Logic Programming, and on cplint, a suite of programs for inference and learning with Logic Programs with Annotated Disjunctions (LPADs). It keeps the same syntax as cplint and, like cplint, uses LPADs as the formalism to represent probabilistic logic programs.
The tutorial is structured as follows. First we give an introduction to LPADs and show how to write an LPAD processable by cplint on SWISH and how to perform simple and conditional exact inference over LPADs. The next sections illustrate how to perform approximate inference over LPADs, and how to perform simple and conditional approximate inference over hybrid programs, i.e., programs where some of the random variables are continuous. We then describe how to learn the parameters of an LPAD and how to perform structure learning. Finally, we show how to test a learned program.
A Logic Program with Annotated Disjunction (LPAD) consists of a set of rules of the following form:
h_1 : a_1 ; ... ; h_n : a_n :- b_1, ..., b_m.
where the h_i are atoms, the b_i are literals and the a_i are real numbers between 0 and 1 such that the sum of all a_i is 1. The elements h_i : a_i compose the head of the rule, while the set of b_i forms the body. Disjunction in the head is represented with a semicolon and atoms in the head are separated from probabilities by a colon.
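For instance, the rule
heads(Coin) : 0.5 ; tails(Coin) : 0.5 :- toss(Coin).
(a classic LPAD example, not part of the program developed below) states that, if a coin is tossed, it lands heads with probability 0.5 and tails with probability 0.5.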
If the head of a rule contains only one element h : 1, we can simply write this element as h, i.e., the clause takes the form of a normal Prolog clause. Therefore
h : 1 :- b_1, ..., b_m.
is equivalent to
h :- b_1, ..., b_m.
If the clause has an empty body, it can be represented like this:
h_1 : a_1 ; ... ; h_n : a_n.
If the sum of all the a_i
is smaller than 1, an extra disjunct null
is assumed with probability 1 - sum(a_i). Therefore
h_1 : 0.5 ; h_2 : 0.2 :- b_1, ..., b_m.
is equivalent to
null : 0.3 ; h_1 : 0.5 ; h_2 : 0.2 :- b_1, ..., b_m.
As an example, we consider a program which models the fact that, if somebody has the flu and the climate is cold, an epidemic or a pandemic may arise. We are uncertain about whether the climate is cold, but we know for sure that David and Robert have the flu.
The rule that we want to write is the one which states that, if somebody has the flu and the climate is cold, an epidemic arises with 60% probability, a pandemic arises with 30% probability, whereas we have a 10% probability that neither an epidemic nor a pandemic arises. We can write
epidemic : 0.6; pandemic : 0.3; null: 0.1 :- flu(_), cold.
As we said in the section on Logic Programs with Annotated Disjunctions, the null atom can be left implicit. Therefore the previous rule can be written, without changing its meaning, as
epidemic : 0.6; pandemic : 0.3 :- flu(_), cold.
The following probabilistic fact says that the weather is cold with a 70% probability. Note that the null
atom is implicit here as well.
cold : 0.7.
Now we assert that David and Robert have the flu.
flu(david).
flu(robert).
The program is almost complete; what we need now is to load the library pita in order to perform exact inference (for further information about PITA see the references in the About page). We import this library with the built-in predicate use_module/1, so we need to write
:- use_module(library(pita)).
Moreover, after :- use_module(library(pita)) we need to write :- pita. in order to initialize the library, and the program should be enclosed by :- begin_lpad. and :- end_lpad. (at the beginning and at the end of the program, respectively). These goals are mandatory to initialize the inference system.
The full LPAD of this example is shown below.
% load the 'pita' library to perform inference
:- use_module(library(pita)).
:- pita.
% to be written before the program
:- begin_lpad.
epidemic : 0.6; pandemic : 0.3 :- flu(_), cold.
cold : 0.7.
flu(david).
flu(robert).
% to be written after the program
:- end_lpad.
Ok, we have our program, now what?! It's time to submit some queries!
To query a program you must use the prob/2
predicate with the following syntax
prob(:Query:atom,-Probability:float).
Query is an atom representing the query we want to perform, while Probability is the variable that will contain the probability that Query is true (a float between 0 and 1).
For instance, let us ask for the probability that an epidemic arises. To know it we just have to submit the following query.
prob(epidemic, P).
Clicking the icon opens a frame containing cplint on SWISH, where you can submit the query by clicking on the "Run" button; clicking the icon again closes the frame.
cplint on SWISH also allows conditional queries with the predicate prob/3
prob(:Query:atom,:Evidence:atom,-Probability:float).
For instance, we can ask for the probability that an epidemic arises given that it is cold:
prob(epidemic, cold, P).
cplint on SWISH can show the probabilistic results of a query as histograms, using the predicate bar/2.
The syntax is
bar(+Probability:float,-Chart:dict).
where Probability is the probability of the query and Chart is a dictionary containing a bar chart with two bars, one for the probability of the atom being true and one for the probability of the atom being false (1 - the probability of being true). After the submission of the query, a bar chart of the two values will be plotted.
However, before submitting this kind of query, we need to
specify that we want to use the renderer c3
by adding the following line before the
:- begin_lpad.
goal
:- use_rendering(c3).
Moreover, cplint on SWISH allows drawing the Binary Decision Diagram (BDD) used to answer queries with exact inference. A BDD represents the explanations of a query.
The syntax is the following:
bdd_dot_string(:Query:atom,-DotString:string,-Var:list).
A solid edge indicates a 1-child, a dashed edge
indicates a 0-child and a dotted edge indicates a
negated 0-child. Each level of the BDD is associated with a variable of the form XI_J, indicated on the left:
I indicates the multivalued variable index and J the
index of the Boolean variable of I. The table
Var
contains the associations between the rule groundings
and the multivalued variables: the first column
contains the multivalued variable index, the second
column contains the rule index, corresponding to its
position in the program, and the last column contains
the list of constants grounding the rule, each replacing
a variable in the order of appearance in the rule.
However, before submitting bdd_dot_string/3
,
we have to specify that we want to use the renderers graphviz
and table
.
:- use_rendering(graphviz).
:- use_rendering(table,[header(['Multivalued variable index','Rule index','Grounding substitution'])]).
These rendering plugins are necessary to plot the BDD and to show the contents of Var as a table.
Therefore our program becomes
% load the 'pita' library to perform inference
:- use_module(library(pita)).
:- pita.
% allows to create graphical results
:- use_rendering(c3).
% the following renderers allow to correctly show the BDDs and the Var table
:- use_rendering(graphviz).
:- use_rendering(table,[header(['Multivalued variable index','Rule index','Grounding substitution'])]).
% to be written before the program
:- begin_lpad.
epidemic : 0.6; pandemic : 0.3 :- flu(_), cold.
cold : 0.7.
flu(david).
flu(robert).
% to be written after the program
:- end_lpad.
Let us see the histogram of the first query of the example (simple inference)
prob(epidemic, P),bar(P,G).
If we want to see the histogram of the second query (conditional query)
prob(epidemic, cold, P),bar(P,G).
If we want to see the BDD of the first query of the example (simple inference)
bdd_dot_string(epidemic, BDD, Var).
This example shows that conclusions from different groundings of a rule are combined with a noisy-or rule: the probability of an epidemic is obtained by combining with noisy-or the conclusions of the two groundings of the rule, where the only variable is replaced by David or Robert. So the probability of an epidemic when cold is true is 0.6+0.6-0.6*0.6=0.84. Since cold is also uncertain, the overall probability is 0.84*0.7=0.588.
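We can check these values directly: submitting again
prob(epidemic, P).
returns P = 0.588, while the conditional query prob(epidemic, cold, P). returns P = 0.84.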
Complete example: epidemic.pl
In this example we will show how to perform approximate inference with Monte Carlo sampling. In addition, as in the exact inference setting, it is possible to plot a histogram representing the probabilistic values.
To show these features we exploit the Markov chain example: what is the likelihood that an execution of a Markov chain starting from a start state 's' reaches a final state 't'? The chains may be infinite, so the query may have an infinite number of explanations, and if we want exact inference and use PITA, the inference may not terminate. To ensure termination, we have two solutions. We may either fix a bound on the depth of the derivations of PITA by setting the parameters
:- set_pita(depth_bound,true).
:- set_pita(depth,<level of depth (integer)>).
(see exact inference variant of this example). Alternatively, MCINTYRE can be used.
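For reference, the depth-bounded exact-inference variant would use a preamble along these lines (a sketch; the depth value 20 is an arbitrary choice):
:- use_module(library(pita)).
:- pita.
% bound the depth of derivations so that exact inference terminates
:- set_pita(depth_bound,true).
:- set_pita(depth,20).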
Here we will use the latter approach.
Below the full program of this example is shown
% load the library 'mcintyre' to perform approximate inference
:- use_module(library(mcintyre)).
% load the renderer 'c3' for graphical results
:- use_rendering(c3).
% initialize the library 'mcintyre'
:- mc.
% to be written before the program
:- begin_lpad.
reach(S, I, T) :-
trans(S, I, U),
reach(U, next(I), T).
reach(S, _, S).
trans(s0,S,s0):0.5; trans(s0,S,s1):0.3; trans(s0,S,s2):0.2.
trans(s1,S,s1):0.4; trans(s1,S,s3):0.1; trans(s1,S,s4):0.5.
trans(s4,_,s3).
% to be written after the program
:- end_lpad.
To execute queries we must use the predicate mc_prob/2.
mc_prob(:Query:atom,-Probability:float).
We ask for the probability that, starting at state 's0' at instance 0, state 's3' is reachable:
mc_prob(reach(s0,0,s3),P).
If we want to see the probability histogram:
mc_prob(reach(s0,0,s3),P),bar(P,G).
With MCINTYRE, you can also take a given number of samples with
mc_sample(:Query:atom,+Samples:int,-Probability:float).
For example this query
mc_sample(reach(s0,0,s3),1000,P).
samples reach(s0,0,s3)
1000 times and returns in P the estimated probability.
We can obtain a bar chart of the samples with the predicate bar/2
(note: remember to load the renderer c3
):
mc_sample(reach(s0,0,s3),1000,P),bar(P,G).
We can also sample arguments of queries with the predicate mc_sample_arg/4
mc_sample_arg(:Query:atom,+Samples:int,?Arg:var,-Values:list).
The predicate samples Query
a number of Samples
times. Arg
should be a variable in Query
.
The predicate returns in Values
a list of couples L-N
where L
is the list of
values of Arg
for which Query
succeeds in a world sampled at random and
N
is the number of samples. If L
is the empty list, it means that for that sample the query failed.
If L
is a list with a
single element, it means that for that sample the query is
determinate.
If, in all couples L-N
, L
is a list with a
single element, it means that the clauses in the program
are mutually exclusive, i.e., that in every sample,
only one clause for each subgoal has the body true.
So for example we may sample the argument S
of
reach(s0,0,S)
with
mc_sample_arg(reach(s0,0,S),50,S,Values).
If we want to see the bar graph of this sampling we use the predicate argbar/2
argbar(+List:list,-Chart:dict).
For example
mc_sample_arg(reach(s0,0,S),50,S,List),argbar(List,Chart).
Moreover, we can sample arguments of queries with the predicate mc_sample_arg_first/4
mc_sample_arg_first(:Query:atom,+Samples:int,?Arg:var,-Values:list)
that returns in Values
a list of couples V-N
where
V
is the value of Arg
returned as the first answer by Query
in
a world sampled at random and N
is the number of samples
returning that value.
V
is failure
if the query fails.
mc_sample_arg_first/4 differs from mc_sample_arg/4 because it computes only the first answer of the query for each sampled world.
So for example we may sample 50 times the first answer for S
in reach(s0,0,S)
with
mc_sample_arg_first(reach(s0,0,S),50,S,Values).
Complete example: markov_chain.pl
With cplint on SWISH it is possible to
compute the expectation of a random variable.
In this example we want to perform model checking of the
Synchronous Leader Election Protocol
expressed in Probabilistic
Computation Tree Logic (PCTL).
Given a synchronous ring of N processes the Synchronous
Leader Election Protocol is used to elect a leader
(a uniquely designated processor) by sending messages around the ring.
The protocol proceeds in rounds and is parametrised by a
constant K. Each round begins by all processors (independently)
choosing a random number (uniformly) from {1,…,K} as an
id. The processors then pass their ids around the ring.
If there is a unique id, then the processor with the
maximum unique id is elected the leader, and otherwise
the processors begin a new round.
The full program of this example is
:- use_module(library(mcintyre)).
:- if(current_predicate(use_rendering/1)).
:- use_rendering(c3).
:- endif.
:- dynamic kr/1,num/1.
:- mc.
:- begin_lpad.
% State Formulae
models(_S, tt,_Hist,_Limit,_Time).
models(S, prop(P),_Hist,_Limit,_Time) :-
proposition(P, S).
models(S, and(F1, F2),Hist,Limit,Time) :-
models(S, F1,Hist,Limit,Time), models(S, F2,Hist,Limit,Time).
models(S, or(F1, _F2),Hist,Limit,Time) :-
models(S, F1,Hist,Limit,Time).
models(S, or(F1, F2),Hist,Limit,Time) :-
\+ models(S, F1,Hist,Limit,Time),
models(S, F2,Hist,Limit,Time).
models(S, not(F), Hist,Limit,Time) :-
\+ models(S, F,Hist,Limit,Time).
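% probabilistic operators: the probability of the path formula is
% estimated by taking 20 samples and is compared with the bound P
% through comp/3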
models(S, prob_until(comp(Op, P), F1, F2),Hist,Limit,Time) :-
mc_sample(pmodels(S, until(F1, F2),Hist,Limit,Time),20, Q),
comp(Q, Op, P).
models(S, prob_next(comp(Op, P), F),Hist,Limit,Time) :-
mc_sample(pmodels(S, next(F),Hist,Limit,Time),20, Q),
comp(Q, Op, P).
comp(Q,>,P):-
Q>P.
comp(Q,>=,P):-
Q>=P.
comp(Q,<,P):-
Q<P.
comp(Q,=<,P):-
Q=<P.
% Path Formulae
pmodels(S,F):-
pmodels(S,F,[],nolimit,0,_Time).
pmodels(S,F,Hist,Limit,Time):-
pmodels(S,F,Hist,Limit,Time,_Time).
pmodels(S, until(_F1, F2),Hist,Limit,Time,Time) :-
models(S, F2,Hist,Limit,Time),!.
pmodels(S, until(F1, F2),Hist0,Limit,Time0,Time) :-
within_limit(Time0,Limit),
models(S, F1,Hist0,Limit,Time0),
ctrans(S, _, T, Hist0, Hist),!,
Time1 is Time0+1,
pmodels(T, until(F1,F2),Hist,Limit,Time1,Time).
pmodels(S, next(F), Hist,Limit,Time0,Time) :-
within_limit(Time0,Limit),
ctrans(S, _, T, Hist, _),!,
Time is Time0+1,
models(T, F,Hist,Limit,Time).
within_limit(_Time,nolimit):-!.
within_limit(Time,Limit):-
Time<Limit.
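% bounded_eventually(Prop,Rounds): Prop holds within Rounds protocol
% rounds; a round (one pick, N-1 reads, one done/retry) takes N+1
% time steps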
bounded_eventually(Prop,Rounds):-
num(N),
B is Rounds*(N+1),
eventually(Prop,B,_T).
eventually(Prop):-
eventually(Prop,_T).
eventually(Prop,Rounds):-
eventually(Prop,nolimit,T),
num(N),
Rounds is T/(N+1).
eventually(Prop,Limit,T) :-
init(S),
pctlspec(Prop, F),
pmodels(S, F,[],Limit,0,T).
pctlspec(X, until(tt, prop(X))).
proposition(P, S) :- final(P, S).
final(elect, [_|L]) :-
num(N),
gen_elected_state(N, L).
gen_elected_state(J, L) :-
(J==0
-> L=[]
; L = [state(3,_,_,_)|Rest],
J1 is J-1,
gen_elected_state(J1,Rest)
).
% transitions
% module counter
% [read] c<N-1 -> (c'=c+1);
% reading
trans(counter, counter(C), read, counter(D),_S,H,H) :-
num(N),
C < N-1,
D is C+1.
% [read] c=N-1 -> (c'=c);
% finished reading
trans(counter, counter(C), read, counter(C),_S,H,H) :-
num(N),
C =:= N-1.
% [done] u1 | u2 | u3 | u4 -> (c'=c);
% done
trans(counter, counter(C), done, counter(C),S,H,H) :-
get_processid(P),
nonlocal(process(P,_), uniqueid, 1,S).
% [retry] !(u1 | u2 | u3 | u4) -> (c'=1);
% pick again reset counter
trans(counter, counter(_C), retry, counter(1),S,H,H) :-
findall(P,get_processid(P),PL),
maplist(nl(S),PL).
% [loop] s1=3 -> (c'=c);
% loop (when finished to avoid deadlocks)
trans(counter, counter(C), loop, counter(C),S,H,H) :-
nonlocal(process(1,_), state, 3,S).
% module process
% local state
% s1=0 make random choice
% s1=1 reading
% s1=2 deciding
% s1=3 finished
% [pick] s1=0 -> 1/K : (s1'=1) & (p1'=0) & (v1'=0) & (u1'=true) + ...;
% pick value
trans(process(_N,_Next), state(0,_,_,_), pick, state(1,1,R,R),_S,H,[pick(R)|H]) :-
pick(H,R).
%read
% [read] s1=1 & u1 & !p1=v2 & c<N-1 -> (u1'=true) & (v1'=v2);
trans(process(_N,Next), state(1,1,_,P), read, state(1,1,V,P),S,H,H) :-
nonlocal(counter, counter, C,S),
num(CN),
C < CN - 1,
nonlocal(process(Next,_), value, V,S),
P \== V.
% [read] s1=1 & u1 & p1=v2 & c<N-1 -> (u1'=false) & (v1'=v2) & (p1'=0);
trans(process(_N,Next), state(1,1,_,P), read, state(1,0,V,0),S,H,H) :-
nonlocal(counter, counter, C,S),
num(CN),
C < CN - 1,
nonlocal(process(Next,_), value, V,S),
P == V.
% [read] s1=1 & !u1 & c<N-1 -> (u1'=false) & (v1'=v2);
trans(process(_N,Next), state(1,0,_,P), read, state(1,0,V,P),S,H,H) :-
nonlocal(counter, counter, C,S),
num(CN),
C < CN - 1,
nonlocal(process(Next,_), value, V,S).
% read and move to decide
% [read] s1=1 & u1 & !p1=v2 & c=N-1 -> (s1'=2) & (u1'=true) & (v1'=0) & (p1'=0);
trans(process(_N,Next), state(1,1,_,P), read, state(2,1,0,0),S,H,H) :-
nonlocal(counter, counter, C,S),
num(CN),
C =:= CN - 1,
nonlocal(process(Next,_), value, V,S),
P \== V.
% [read] s1=1 & u1 & p1=v2 & c=N-1 -> (u1'=false) & (v1'=0) & (p1'=0);
trans(process(_N,Next), state(1,1,_,P), read, state(2,0,0,0),S,H,H) :-
nonlocal(counter, counter, C,S),
num(CN),
C =:= CN - 1,
nonlocal(process(Next,_), value, V,S),
P == V.
% [read] s1=1 & !u1 & c=N-1 -> (s1'=2) & (u1'=false) & (v1'=0);
trans(process(_N,_Next), state(1,0,_,P), read, state(2,0,0,P),S,H,H) :-
nonlocal(counter, counter, C,S),
num(CN),
C =:= CN - 1.
% done
% [done] s1=2 -> (s1'=3) & (u1'=false) & (v1'=0) & (p1'=0);
trans(process(_N,_Next), state(2,_,_,_), done, state(3,0,0,0),_S,H,H).
% retry
% [retry] s1=2 -> (s1'=0) & (u1'=false) & (v1'=0) & (p1'=0);
trans(process(_N,_Next), state(2,_,_,_), retry, state(0,0,0,0),_S,H,H).
% loop (when finished to avoid deadlocks)
% [loop] s1=3 -> (s1'=3);
trans(process(_N,_Next), state(3,U,V,P), loop, state(3,U,V,P),_S,H,H).
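% pick(H,V): V is sampled uniformly from {0,...,K-1} by chaining one
% binary probabilistic choice (pick_fact/3) per value; PH=1/K and the
% conditional probability PF of each choice is renormalized at each step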
pick(H,V):-
kr(K),
K1 is K-1,
PH is 1/K,
findall(I,between(0,K1,I),L),
foldl(pick_value(H,PH),L,(1,_),(_,V)).
pick_value(_H,_PH,_I,(P0,V0),(P0,V0)):-
nonvar(V0).
pick_value(H,PH,I,(P0,V0),(P1,V1)):-
var(V0),
PF is PH/P0,
(pick_fact(H,V0,PF)->
P1=PF,
V1=I
;
P1 is P0*(1-PF),
V1=V0
).
pick_fact(_,_,P):P.
%pick(H,0):0.5; pick(H,1):0.5.
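% ctrans(S,A,T,Hi,Ho): synchronous transition of the whole system from
% global state S to T on action A; Hi/Ho thread the history of picks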
ctrans(S, A, T, Hi, Ho) :-
config(P),
find_matching_trans(P,S,S,[],A,T,Hi,Ho).
find_matching_trans([], [], _CS, _PA, A, [], H,H) :- nonvar(A).
find_matching_trans([P|Ps], [S|Ss], CS, PA, A, [T|Ts],Hi,Ho) :-
pick_trans(P, S, CS, PA, A, T, Hi,H1),
find_matching_trans(Ps, Ss, CS, PA, A, Ts,H1,Ho).
find_matching_trans([P|Ps], [S|Ss], CS, PA, A, [S|Ts], Hi,Ho) :-
% skip current process; but then all transitions enabled in the current
% process will be prohibited for selection in later processes.
enabled_trans(P,L),
(nonvar(A) -> \+ member(A,L); true),
append(L, PA, PA1),
find_matching_trans(Ps, Ss, CS, PA1, A, Ts, Hi, Ho).
pick_trans(P, S, CS, PA, A, T, H0,H) :-
etrans(P, S, PA, A, T,CS, H0,H).
etrans(P, S, PA, A, T, CS,H0,H) :-
trans(P, S, A, T,CS,H0,H),
A \= epsilon,
\+ member(A, PA).
enabled_trans(P, L) :-
setof(A, enabled_trans_in_process(P,A), L).
enabled_trans_in_process(P,A) :-
clause(trans(P,_,A,_,_,_,_),_),
A \= epsilon.
nonlocal(Proc, Var, Val,CS) :-
getpid(Proc, Var, Pid, Idx),
nth1(Pid, CS, State),
arg(Idx, State, Val).
% writeln(nonlocal_read(Proc, State, Var, Val)).
getpid(Proc, Var, Pid, Idx) :-
config(Config),
nth1(Pid, Config, Proc),
nonlocal_access(Proc, Var, Idx).
get_processid(P):-
num(N),
between(1,N,P).
config([counter| L]) :-
findall(P,get_processid(P),PL),
maplist(neighbor,PL,L).
neighbor(I,process(I,J)) :-
num(N),
(I<N
->
J is I+1
; J = 1
).
%config([counter, process(1,2), process(2,3), process(3,4), process(4,1)]).
init(S) :-
config(P),
maplist(init_state,P,S).
init_state(counter, counter(1)).
init_state(process(_,_), state(0,0,0,0)).
nonlocal_access(counter, counter, 1).
nonlocal_access(process(_,_), state, 1).
nonlocal_access(process(_,_), uniqueid, 2).
nonlocal_access(process(_,_), value, 3).
nl(S,P):-
nonlocal(process(P, _), uniqueid, 0,S).
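% parameters: num/1 is the number of processes N, kr/1 is the range K
% of the random ids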
num(4).
kr(4).
:- end_lpad.
graph_prob(G):-
retract(num(N)),
retract(kr(K)),
assert(num(4)),
assert(kr(2)),
findall(L-P,
(between(1,6,L),mc_sample(bounded_eventually(elect,L),100,P)),LV),
G=c3{data:_{x:x, rows:[x-'Probability of leader elected within L rounds (N=4, K=2)'|LV]},%legend:_{show: false},
axis:_{x:_{min:1,max:6,label:'L',padding:_{bottom:0.0,top:0.0}},
y:_{label:'Probability',padding:_{bottom:0.0,top:0.0}}}},
retract(num(4)),
retract(kr(2)),
assert(num(N)),
assert(kr(K)).
graph_exp_rounds_n(G):-
retract(num(NI)),
retract(kr(KI)),
assert(kr(3)),
findall(N-E,
(between(3,8,N),
assert(num(N)),
mc_expectation(eventually(elect,T),100,T,E),
retract(num(N))),
LV),
G=c3{data:_{x:x, rows:[x-'Expected rounds to elect a leader (K=3)'|LV]},%legend:_{show: false},
axis:_{x:_{min:3,max:8,label:'N',padding:_{bottom:0.0,top:0.0}},
y:_{label:'Expected rounds',padding:_{bottom:0.0,top:0.0}}}},
retract(kr(3)),
assert(num(NI)),
assert(kr(KI)).
graph_exp_rounds_k(G):-
retract(num(NI)),
retract(kr(KI)),
assert(num(3)),
findall(K-E,
(between(3,8,K),
assert(kr(K)),
mc_expectation(eventually(elect,T),500,T,E),
retract(kr(K))),
LV),
G=c3{data:_{x:x, rows:[x-'Expected rounds to elect a leader (N=3)'|LV]},%legend:_{show: false},
axis:_{x:_{min:3,max:8,label:'K',padding:_{bottom:0.0,top:0.0}},
y:_{label:'Expected rounds',padding:_{bottom:0.0,top:0.0}}}},
retract(num(3)),
assert(num(NI)),
assert(kr(KI)).
We consider the problem of computing expectations.
We would like to compute the expected number of rounds to elect a leader.
We can compute expectations with
mc_expectation(:Query:atom,+N:int,?Arg:var,-Exp:float).
that computes the expected value of Arg
in Query
by
sampling.
It takes N
samples of Query
and sums up the value of Arg
for
each sample. The overall sum is divided by N
to give Exp
.
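In other words, \(Exp = \frac{1}{N}\sum_{i=1}^{N} v_i\), where \(v_i\) is the value of Arg in the i-th sample.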
An example of use of the above predicate is
mc_expectation(eventually(elect,T),1000,T,E).
that returns in E the expected number of rounds necessary to elect a leader, computed by taking 1000 samples.
Complete example: pctl_slep.pl
In this example we want to show how to perform conditional inference in an approximate way using sampling. In particular, we will show how to use rejection sampling and Metropolis-Hastings.
This example generatively defines a random arithmetic function. The problem is to predict the value returned by the function given one or two input-output pairs, i.e., to compute a conditional probability. This program is translated from an example in the Church functional probabilistic programming language. Sampling is necessary as queries have an infinite number of explanations.
The full program of this example is
:- use_module(library(mcintyre)).
:- if(current_predicate(use_rendering/1)).
:- use_rendering(c3).
:- endif.
:- mc.
:- begin_lpad.
eval(X,Y):-
random_fn(X,0,F),
Y is F.
op(+):0.5;op(-):0.5.
random_fn(X,L,F):-
comb(L),
random_fn(X,l(L),F1),
random_fn(X,r(L),F2),
op(Op),
F=..[Op,F1,F2].
random_fn(X,L,F):-
\+ comb(L),
base_random_fn(X,L,F).
comb(_):0.3.
base_random_fn(X,L,X):-
identity(L).
base_random_fn(_X,L,C):-
\+ identity(L),
random_const(L,C).
identity(_):0.5.
random_const(L,0):0.1;random_const(L,1):0.1;random_const(L,2):0.1;
random_const(L,3):0.1;random_const(L,4):0.1;random_const(L,5):0.1;
random_const(L,6):0.1;random_const(L,7):0.1;random_const(L,8):0.1;
random_const(L,9):0.1.
:- end_lpad.
We know that the random function returns 3 for input 1 and we want to
compute the probability that it returns 4 for input 2.
We thus need to ask a conditional query and sampling
is necessary as queries have an infinite number of explanations.
The simplest approach is to use rejection sampling: you first query the evidence and, if the query is successful, query the goal in the same sample; otherwise the sample is discarded.
This can be done with
mc_rejection_sample(:Query:atom,:Evidence:atom,+Samples:int,
-Probability:float,+Options:list).
that takes Samples
samples of Query
given that Evidence
is true.
Options
is a list of options; the following are recognised:
successes(-successes:int)
Number of successes
failures(-failures:int)
Number of failures
An example of use of the above predicate is
mc_rejection_sample(eval(2,4),eval(1,3),1000,P,[]).
that performs rejection sampling of eval(2,4)
given that eval(1,3)
is true.
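The options can be used to retrieve the counts behind the estimate; for instance
mc_rejection_sample(eval(2,4),eval(1,3),1000,P,[successes(S),failures(F)]).
also returns in S and F the number of samples in which the query succeeded and failed, respectively.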
Differently from exact inference, in approximate inference
the evidence can be a conjunction of atoms, so if
you also know that eval(0,2)
is true, you can use
mc_rejection_sample(eval(2,4),(eval(0,2),eval(1,3)),1000,P,[]).
and, as you can see, the query with more evidence is now almost certainly true.
In Metropolis-Hastings MCMC, a Markov chain is produced as follows: after a sample, a number of sampled probabilistic choices are deleted and the others are retained for the next sample. The sample is accepted with probability \(\min\{1, N_0/N_1\}\), where \(N_0\) is the number of choices sampled in the previous sample and \(N_1\) is the number of choices sampled in the current sample. Metropolis-Hastings is usually much faster than rejection sampling because fewer samples are discarded.
To use Metropolis-Hastings, the following predicate is available
mc_mh_sample(:Query:atom,:Evidence:atom,+Samples:int,
-Probability:float,+Options:list).
where Options
is a list of options; the following are recognised:
mix(+Mix:int)
The first Mix
samples are discarded (mixing time), default value 0
lag(+lag:int)
lag between each sample, Lag
sampled choices are forgotten, default value 1
successes(-successes:int)
Number of successes
failures(-failures:int)
Number of failures
bar(-BarChart:dict)
BarChart is a bar chart of the possible values.
Lag (set with the lag option above) is the number of sampled choices to forget before taking a new sample.
For example
mc_mh_sample(eval(2,4),eval(1,3),10000,P,[]).
takes 10000 accepted samples and returns in P
the
estimated probability.
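The options can be used to tune the chain; for instance (the values below are arbitrary illustrative choices)
mc_mh_sample(eval(2,4),eval(1,3),10000,P,[mix(100),lag(3)]).
discards the first 100 samples and keeps one sample every 3.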
Arguments of queries can also be sampled given evidence, with the predicates
mc_rejection_sample_arg(:Query:atom,:Evidence:atom,+Samples:int,
?Arg:var,-Values:list,+Options:list).
mc_mh_sample_arg(:Query:atom,:Evidence:atom,+Samples:int,
?Arg:var,-Values:list,+Options:list).
that return the distribution of values for
Arg
in Query
in
Samples
of Query
given that
Evidence
is true. The predicate returns in Values
a list of couples L-N
where L
is the list of values of Arg
for which Query
succeeds in a world sampled at random where Evidence
is true and N
is the number of samples returning that list of values.
For example, if we want to sample argument Y
of eval(2,Y)
given that
eval(0,2)
and eval(1,3)
are true with Metropolis-Hastings MCMC
mc_mh_sample_arg(eval(2,Y),(eval(0,2),eval(1,3)),1000,Y,V,[]).
The printed results are pretty ugly; to visualize the results of sampling arguments as a bar chart you can use argbar/2.
If we want to plot the bar chart of the previous query:
mc_mh_sample_arg(eval(2,Y),(eval(0,2),eval(1,3)),1000,Y,L,[]),argbar(L,C).
You can also compute conditional expectations with
mc_mh_expectation(:Query:atom,:Evidence:atom,+N:int,?Arg:var,-Exp:float,+Options:list).
as in
mc_mh_expectation(eval(2,Y),eval(1,3),1000,Y,E,[]).
that computes the expectation of argument Y
of eval(2,Y)
given that
eval(1,3)
is true by taking 1000 samples using Metropolis-Hastings MCMC.
Complete example: arithm.pl
Up to now we have considered only discrete random variables and discrete probability distributions. Here we consider Hybrid Probabilistic Logic Programs, where some of the random variables are continuous.
How can we consider continuous random variables and probability density functions, for example real variables following a Gaussian distribution?
cplint on SWISH with its sampling inference module
allows the specification of density functions over arguments
of atoms in the head of rules. To specify a probability
density on an argument Var
of an atom
A
you can use rules of the form
A : Density :- Body.
where Density
is a special atom identifying
a probability density on variable Var
and Body
(optional) is a regular clause body.
Allowed Density
atoms are:
uniform(Var,L,U)
: Var
is uniformly distributed in \([L,U]\)
gaussian(Var,Mean,Variance)
: Var
follows a Gaussian distribution with mean Mean
and variance Variance
dirichlet(Var,Par)
: Var
is a list of real numbers following a Dirichlet distribution with
parameters \(\alpha\)
specified by the list Par
gamma(Var,Shape,Scale)
: Var
follows a gamma distribution with shape parameter
Shape
and scale parameter
Scale
.
beta(Var,Alpha,Beta)
: Var
follows a beta distribution with parameters
Alpha
and Beta
.
This syntax can also be used to describe discrete distributions, with either a finite or countably infinite support:
discrete(Var,D)
or finite(Var,D)
:
A
is an atom containing
Var
and D
is a list
of couples Value:Prob
assigning probability
Prob
to Value
uniform(Var,D)
: A
is an atom containing variable Var
and D
is a list of values each
taking the same probability (1 over the length
of D
).
poisson(Var,Lambda)
: Var
follows a Poisson distribution with parameter
Lambda
.
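As a small illustration of the discrete syntax (a sketch, with hypothetical predicates die/1 and n/1):
die(X) : uniform(X,[1,2,3,4,5,6]).
n(X) : poisson(X,2).
Here X in die(X) takes one of the six faces with equal probability, and X in n(X) follows a Poisson distribution with rate 2. Returning to the continuous case, the fact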
g(X) : gaussian(X,0,1).
states that argument X
of g(X)
follows a Gaussian distribution with mean 0 and variance 1.
In this section we will see how to perform simple
queries over hybrid programs.
If an atom encodes a continuous random variable (such
as g(X)
above),
asking the probability that a ground instantiation,
such as g(0.3)
,
is true is
not meaningful, as the probability that a continuous
random variable takes a specific value is always 0. In this case you are more interested in computing the distribution of X in a goal g(X),
possibly after having observed some evidence.
If the query is unconditional, we can use approximate inference with Monte Carlo sampling as described in the previous sections.
When you have continuous random variables, you may be
interested in sampling arguments of goals representing
continuous random variables. In this way you can build a
probability density of the sampled argument. To do that
you can use the predicate mc_sample_arg/4
.
As an example, let us consider the following program.
A biased coin is thrown, if it lands heads, X
in mix(X)
is sampled from a Gaussian with mean 0 and variance 1.
If it lands tails, X
is sampled from a Gaussian with
mean 5 and variance 2.
:- use_module(library(mcintyre)).
:- use_rendering(c3).
:- mc.
:- begin_lpad.
% a coin is thrown. The coin is biased: with probability 0.6 it lands heads,
% with probability 0.4 it lands tails
heads:0.6;tails:0.4.
% X in g(X) follows a Gaussian distribution with mean 0 and variance 1
g(X): gaussian(X,0,1).
% X in h(X) follows a Gaussian distribution with mean 5 and variance 2
h(X): gaussian(X,5,2).
% if the coin lands heads, X in mix(X) is given by g(X)
mix(X) :- heads, g(X).
% if the coin lands tails, X in mix(X) is given by h(X)
mix(X) :- tails, h(X).
:- end_lpad.
For example we can now perform the query
mc_sample_arg(mix(X),1000,X,Values).
Values
will contain a list of
couples L-N
where L
is the list of values
of X
for which query succeeds in a world
sampled at random and N
is the number of
samples returning L.
Notice that, in every couple L-N, L will contain just one element and N will always be 1. This is because the random variable X is continuous and mix(X) always succeeds exactly once; therefore the predicate mc_sample_arg/4 will sample 1000 different worlds and every world will have a different value for X.
Of course, looking at the resulting values as a plain list of numbers could make you feel the pesky sensation of your eyes melting. It would be better to use some kind of graphical representation of the results; in other words, we want to draw the probability density function of the argument.
cplint on SWISH
provides the predicate histogram/3
that comes
in handy for these situations.
histogram(+List:list,-Chart:dict,+Options:list).
where Options
is a list of options, the following are recognised:
min(+Min:float)
the minimum value of domain, default value the minimum in List
max(+Max:float)
the maximum value of domain, default value the maximum in List
nbins(+NBins:int)
the number of bins for dividing the domain, default value 40
This predicate draws a vertical histogram of the samples in
List
dividing the domain in NBins
bins.
List
must be a list of couples of the
form L-W
where L
is a
list of sampled values and W
is its weight.
So we can use the output list Values
of
mc_sample_arg/4
as input list List
for histogram/3
. In our example, if we
want to plot the probability density function of X
,
we can use
mc_sample_arg(mix(X),1000,X,_Values), histogram(_Values,Chart,[]).
Note: we wrote _Values
instead of Values
because we are not interested
in printing the results of the sampling.
Complete examples: gaussian_mixture.pl
cplint on SWISH also allows executing conditional queries over hybrid programs.
As in the previous section
we are interested in
sampling arguments of goals representing
continuous random variables and drawing a
probability density of the sampled argument.
To perform this kind of query we must distinguish three cases, depending on the type of evidence we have: evidence on discrete atoms, evidence expressed as a constraint over a continuous argument, and evidence on ground atoms with continuous values as arguments. The first two cases can be handled with mc_rejection_sample_arg/6 and mc_mh_sample_arg/6, the predicates for rejection sampling and Metropolis-Hastings MCMC; the third requires likelihood weighting, described below.
Let us consider the same program as in the previous section.
We want to take 1000 samples of X
in mix(X)
given that heads
was true
using rejection sampling and Metropolis-Hastings MCMC
mc_rejection_sample_arg(mix(X),heads,1000,X,Values,[]).
mc_mh_sample_arg(mix(X),heads,1000,X,Values,[lag(2)]).
To plot the distribution of the arguments we can use
histogram/3
.
mc_rejection_sample_arg(mix(X),heads,1000,X,_V), histogram(_V,Chart,[]).
mc_mh_sample_arg(mix(X),heads,1000,X,_V,[lag(2)]), histogram(_V,Chart,[]).
Let us consider again the same program.
We want to take 1000 samples of X
in mix(X)
given that X>2
was true, using rejection sampling, and draw a histogram with 40 bins representing the probability
density of X
. To do that we can ask
the query
mc_rejection_sample_arg(mix(X),(mix(Y),Y>2),1000,X,_V,[]), histogram(_V,Chart,[]).
We can do the same by using Metropolis-Hastings MCMC
mc_mh_sample_arg(mix(X),(mix(Y),Y>2),1000,X,_Values,[lag(2)]),
histogram(_Values,Chart,[]).
When you have evidence on ground atoms that have continuous values as arguments, you can use neither rejection sampling nor Metropolis-Hastings, as the probability of the evidence is 0; you need likelihood weighting to obtain samples of the continuous arguments of a goal.
Let us take the following program as an example. We are trying to estimate the true value of a Gaussian distributed random variable, given some observed data. The variance is known (2) and we suppose that the mean has a Gaussian distribution with mean 1 and variance 5. We take different measurements (e.g. at different times), indexed by an integer. Given that we observe 9 and 8 at indexes 1 and 2, how does the distribution of the random variable (the value at index 0) change with respect to the case of no observations?
:- use_module(library(mcintyre)).
:- use_rendering(c3).
:- mc.
:- begin_lpad.
% at time I we see X sampled from a Gaussian with mean M and variance 2.0
val(I,X) :-
mean(M),
measurement(I,M,X).
% Gaussian distribution of the mean of the Gaussian of the variable
mean(M): gaussian(M,1.0, 5.0).
% Gaussian distribution of the variable
measurement(_,M,X): gaussian(X,M, 2.0).
:- end_lpad.
As in the previous cases we are interested in finding the posterior probability density function of an argument. To do that we can use the predicate
mc_lw_sample_arg(:Query:atom,:Evidence:atom,+N:int,?Arg:var,-ValList).
This predicate returns in ValList
a list of couples
V-W
where V
is a value
of Arg
for which Query
succeeds and W
is the weight computed
by likelihood weighting according to Evidence
(a conjunction of atoms is allowed here).
For example we may ask: given that we observe 9 and 8 at indexes 1 and 2, what is the distribution of the random variable (the value at index 0)?
mc_lw_sample_arg(val(0,X),(val(1,9),val(2,8)),1000,X,V).
Of course it would be better if we could plot the
results in a graph, instead of just printing them (aah! my eyes!).
We cannot use
histogram/3
because it takes as input
a list of couples L-W
where L
is a list
of values, whereas mc_lw_sample_arg/5
returns
a list of couples V-W
where V
is a value.
In this case you can use densities/4
densities(+PriorList:list,+PostList:list,-Chart:dict,+Options:list)
This predicate draws a line chart of the densities of two sets of samples, usually prior and posterior to the observations.
The samples from the prior are in PriorList
as couples L-W
, where L
is a
list of values and W
is its weight,
while the samples from the posterior are in
PostList
as couples V-W
where V
is a value and W
its weigth.
The same options as histogram/3
are recognized.
The lines
are drawn dividing the domain in the number of bins provided as an option.
If we want to plot the densities of the random variable before and after
observing 9 and 8 by taking 1000 samples and by dividing the domain
in 40 bins, we can run
mc_sample_arg(val(0,Y),1000,Y,_V0),
mc_lw_sample_arg(val(0,X),(val(1,9),val(2,8)),1000,X,_V),
densities(_V0,_V,Chart,[]).
Complete examples: gaussian_mixture.pl, gauss_mean_est.pl
In this section we will see how to learn the parameters of a program, given a background knowledge and an initial program. We use the Machines dataset.
To execute a learning algorithm the input source, a .pl file, must be divided in five parts: the preamble, the background knowledge, the initial program, the language bias and the example interpretations (the data).
Here we will define a program step by step and then execute the algorithm EMBLEM which learns the parameters of a given initial LPAD program.
For more information on how to perform learning see the cplint on SWISH manual (PDF version).
In order to perform both EMBLEM and SLIPCOVER you need to load the library slipcover
with the command
:- use_module(library(slipcover)).
After that you have to initialize slipcover
with the command
:- sc.
Now that we have defined the preamble, we can specify the background knowledge with a fact of the form
bg(<list of terms representing clauses>).
Alternatively, we can specify a set of clauses by including them in a section between the goals :- begin_bg. and :- end_bg. We will use the latter approach.
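For reference, with the first modality the same knowledge would be written as a list of clause terms, along these lines (a partial sketch):
bg([(component(C):-replaceable(C)),
(component(C):-not_replaceable(C)),
replaceable(gear)]).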
:- begin_bg.
component(C):-
replaceable(C).
component(C):-
not_replaceable(C).
replaceable(gear).
replaceable(wheel).
replaceable(chain).
not_replaceable(engine).
not_replaceable(control_unit).
not_worn(C):-
component(C),
\+ worn(C).
one_worn:-
worn(_).
none_worn:-
\+ one_worn.
:- end_bg.
At this point we can define the initial program whose parameters we want to learn. We can do it with a fact of the form
in(<list of terms representing clauses>).
Alternatively, you can specify the input program in a section between :- begin_in. and :- end_in. We will use the latter method. Therefore, in our example:
:- begin_in.
class(sendback):0.5 :-
worn(A),
not_replaceable(A).
class(fix):0.6 :-
worn(A),
replaceable(A).
class(ok):0.5 :-
not_worn(_A).
:- end_in.
The language bias part contains the declarations of the input and output predicates.
The typical input for EMBLEM is a set of interpretations, i.e., sets of ground facts. Among the predicates of the input facts, the user has to indicate which are the output predicates. Output predicates are declared as
output(<predicate>/<arity>).
In our example
output(class/1).
Input predicates are those whose atoms you are not interested in predicting.
You can declare closed world input predicates with
input_cw(<predicate>/<arity>).
For these predicates, the only true atoms are those in the interpretations and those derivable from them using the background knowledge; the clauses in the input/hypothesized program are not used to derive atoms for these predicates. Moreover, clauses in the background knowledge that define closed world input predicates and that call an output predicate in the body will not be used for deriving examples. In our example
input_cw(replaceable/1).
input_cw(not_replaceable/1).
input_cw(worn/1).
input_cw(not_worn/1).
input_cw(none_worn/0).
Besides closed world input predicates we can declare open world input predicates with
input(<predicate>/<arity>).
In our example we do not have open world input predicates.
The last part of the input file contains the data.
You can specify data in two modalities: models and keys. In the models modality, you specify an example model (or interpretation) as a list of Prolog facts initiated by begin(model(<name>)). and terminated by end(model(<name>)). Alternatively, with the keys modality, you can directly write the facts, and the first argument is interpreted as a model identifier. The two modalities, models and keys, can be mixed in the same source.
If we use the models modality for the example/interpretation 1, we get
begin(model(1)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(gear).
worn(engine).
end(model(1)).
If we use this modality, the system asserts an int(<name>). fact for each model enclosed in begin(model(<name>)). and end(model(<name>)).
Instead, if we use the keys modality, our example will be (note the first argument of each fact):
class(1,sendback).
neg(1,class(fix)).
neg(1,class(ok)).
worn(1,gear).
worn(1,engine).
If we use this modality, facts for int/1 are not asserted for interpretations, but they can be explicitly added by the user.
After defining the examples/interpretations we must indicate how the examples are divided in folds, with facts of the form:
fold(<fold_name>,<list of model identifiers>).
as for example
fold(train1,[1,2,3]).
fold(train2,[4,5,6,7,8,9,10]).
We can also define the folds intensionally, as in
fold(all,F) :- findall(I,int(I),F).
The complete Machines input file is
% PREAMBLE %
:- use_module(library(slipcover)).
% use the renderer 'lpad'. It is not mandatory to use it, but it prints the learned clauses in a more readable way
:- use_rendering(lpad).
:- sc.
:- set_sc(depth_bound,false).
:- set_sc(neg_ex,given).
:- set_sc(megaex_bottom,15).
:- set_sc(max_iter,10).
:- set_sc(max_iter_structure,50).
:- set_sc(verbosity,1).
% BACKGROUND KNOWLEDGE %
:- begin_bg.
component(C):-
replaceable(C).
component(C):-
not_replaceable(C).
replaceable(gear).
replaceable(wheel).
replaceable(chain).
not_replaceable(engine).
not_replaceable(control_unit).
not_worn(C):-
component(C),
\+ worn(C).
one_worn:-
worn(_).
none_worn:-
\+ one_worn.
:- end_bg.
% INITIAL PROGRAM %
:- begin_in.
class(sendback):0.5 :-
worn(A),
not_replaceable(A).
class(fix):0.6 :-
worn(A),
replaceable(A).
class(ok):0.5 :-
not_worn(_A).
:- end_in.
% LANGUAGE BIAS %
% output predicates
output(class/1).
% input closed world predicates
input_cw(replaceable/1).
input_cw(not_replaceable/1).
input_cw(worn/1).
input_cw(not_worn/1).
input_cw(none_worn/0).
% EXAMPLES (interpretations) %
begin(model(1)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(gear).
worn(engine).
end(model(1)).
begin(model(2)).
class(ok).
neg(class(sendback)).
neg(class(fix)).
end(model(2)).
begin(model(3)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(gear).
end(model(3)).
begin(model(4)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(engine).
end(model(4)).
begin(model(5)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(gear).
worn(chain).
end(model(5)).
begin(model(6)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(wheel).
end(model(6)).
begin(model(7)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(wheel).
worn(control_unit).
end(model(7)).
begin(model(8)).
class(ok).
neg(class(sendback)).
neg(class(fix)).
end(model(8)).
begin(model(9)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(wheel).
worn(chain).
end(model(9)).
begin(model(10)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(engine).
worn(chain).
end(model(10)).
begin(model(11)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(engine).
worn(control_unit).
end(model(11)).
begin(model(12)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(chain).
worn(wheel).
worn(gear).
end(model(12)).
begin(model(13)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(chain).
worn(wheel).
worn(gear).
worn(engine).
end(model(13)).
begin(model(14)).
class(ok).
neg(class(sendback)).
neg(class(fix)).
end(model(14)).
begin(model(15)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(wheel).
worn(gear).
end(model(15)).
fold(all, F) :- findall(I,int(I),F).
To execute the parameter learning algorithm EMBLEM, we need to ask a query of the form
induce_par(<list of folds>,P).
where <list of folds>
is a list of the folds for training and P
will contain the
input program with updated parameters.
In our example we want to learn the parameters by using the fold which contains all the examples (all
). Therefore
induce_par([all],P).
Complete example: mach.pl
For more information about how to perform learning and EMBLEM, see the manual (or PDF version) and the references in the About page.
cplint on SWISH also allows the user to perform structure learning. In this tutorial section we will learn how to do that.
We consider the Machines dataset already illustrated in the previous section.
Note: the learning algorithms are available only if you use the Prolog editor.
To execute a learning algorithm the input source must be divided in five parts: the preamble, the background knowledge, the initial program, the language bias and the example interpretations.
Here we will define a program step by step and then
execute the algorithm SLIPCOVER
which learns the structure and the parameters given a background knowledge.
We keep the background knowledge, the initial LPAD,
and the interpretations unchanged. However, in order to
perform structure learning, we need to
modify the preamble and the language bias.
For more information of how to perform learning see the cplint on SWISH manual (PDF version).
We load the library slipcover, initialize it and then set some parameters. We use the same preamble as in the previous section; note in particular the settings :- set_sc(max_iter_structure,50). and :- set_sc(megaex_bottom,15)., which matter for structure learning. So the preamble becomes
:-use_module(library(slipcover)).
:-sc.
:- set_sc(depth_bound,false).
:- set_sc(neg_ex,given).
:- set_sc(max_iter,10).
:- set_sc(max_iter_structure,50).
:- set_sc(megaex_bottom,15).
:- set_sc(verbosity,1).
In the previous example we already defined the set of output and input predicates for the EMBLEM algorithm. However, now we want to learn new clauses (the structure) of the program; to do that we need to add mode declarations in the style of Progol.
To specify the atoms that can appear in the head of clauses you must use facts of the form
modeh(<recall>,<predicate>(<arg1>,...)).
To specify the atoms that can appear in the body of clauses you must use facts of the form
modeb(<recall>,<predicate>(<arg1>,...)).
where <recall> can be an integer or *. <recall> indicates how many atoms for the predicate specification are retained in the bottom clause during a saturation step, while * stands for all those that are found.
We refer to the cplint on SWISH manual for further details.
In our example we use the following mode declarations
modeh(*,class(fix)).
modeh(*,class(ok)).
modeh(*,class(sendback)).
modeb(*,not_replaceable(-comp)).
modeb(*,replaceable(-comp)).
modeb(*,worn(-comp)).
modeb(*,not_worn(-comp)).
modeb(*,none_worn).
SLIPCOVER also requires Aleph-style facts for the determination/2 predicate that indicate which predicates can appear in the body of clauses. For our program we have
determination(class/1,replaceable/1).
determination(class/1,not_replaceable/1).
determination(class/1,worn/1).
determination(class/1,not_worn/1).
determination(class/1,none_worn/0).
For example the first determination/2
fact
states that the predicate replaceable/1
can appear in the body of hypothesized clauses having
class/1
in the head.
Below is the complete program.
% PREAMBLE %
:- use_module(library(slipcover)).
% use the renderer 'lpad'. It is not mandatory to use it, but it prints the learned clauses in a more readable way
:- use_rendering(lpad).
:- use_rendering(c3).
:- sc.
:- set_sc(depth_bound,false).
:- set_sc(neg_ex,given).
:- set_sc(megaex_bottom,15).
:- set_sc(max_iter,10).
:- set_sc(max_iter_structure,50).
:- set_sc(verbosity,1).
% BACKGROUND KNOWLEDGE %
:- begin_bg.
component(C):-
replaceable(C).
component(C):-
not_replaceable(C).
replaceable(gear).
replaceable(wheel).
replaceable(chain).
not_replaceable(engine).
not_replaceable(control_unit).
not_worn(C):-
component(C),
\+ worn(C).
one_worn:-
worn(_).
none_worn:-
\+ one_worn.
:- end_bg.
% INITIAL PROGRAM %
:- begin_in.
class(sendback):0.5 :-
worn(A),
not_replaceable(A).
class(fix):0.6 :-
worn(A),
replaceable(A).
class(ok):0.5 :-
not_worn(_A).
:- end_in.
% LANGUAGE BIAS %
% output predicates
output(class/1).
% input closed world predicates
input_cw(replaceable/1).
input_cw(not_replaceable/1).
input_cw(worn/1).
input_cw(not_worn/1).
input_cw(none_worn/0).
modeh(*,class(fix)).
modeh(*,class(ok)).
modeh(*,class(sendback)).
modeb(*,not_replaceable(-comp)).
modeb(*,replaceable(-comp)).
modeb(*,worn(-comp)).
modeb(*,not_worn(-comp)).
modeb(*,none_worn).
determination(class/1,replaceable/1).
determination(class/1,not_replaceable/1).
determination(class/1,worn/1).
determination(class/1,not_worn/1).
determination(class/1,none_worn/0).
% EXAMPLES (interpretations) %
begin(model(1)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(gear).
worn(engine).
end(model(1)).
begin(model(2)).
class(ok).
neg(class(sendback)).
neg(class(fix)).
end(model(2)).
begin(model(3)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(gear).
end(model(3)).
begin(model(4)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(engine).
end(model(4)).
begin(model(5)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(gear).
worn(chain).
end(model(5)).
begin(model(6)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(wheel).
end(model(6)).
begin(model(7)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(wheel).
worn(control_unit).
end(model(7)).
begin(model(8)).
class(ok).
neg(class(sendback)).
neg(class(fix)).
end(model(8)).
begin(model(9)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(wheel).
worn(chain).
end(model(9)).
begin(model(10)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(engine).
worn(chain).
end(model(10)).
begin(model(11)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(engine).
worn(control_unit).
end(model(11)).
begin(model(12)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(chain).
worn(wheel).
worn(gear).
end(model(12)).
begin(model(13)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(chain).
worn(wheel).
worn(gear).
worn(engine).
end(model(13)).
begin(model(14)).
class(ok).
neg(class(sendback)).
neg(class(fix)).
end(model(14)).
begin(model(15)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(wheel).
worn(gear).
end(model(15)).
fold(all,F):- findall(I,int(I),F).
:- fold(all,F),
sample(4,F,FTr,FTe),
assert(fold(train,FTr)),
assert(fold(test,FTe)).
To execute the structure learning algorithm SLIPCOVER (which also learns the parameters of the learned program), we need to execute a query of the form
induce(<list of folds>,P).
where <list of folds>
is a list of the folds for training and P
will contain the learned program.
In our example we want to learn the structure (and the parameters) by using the fold which contains all the examples (all
). Therefore
induce([all],P).
Complete example: mach.pl
For more information about how to perform learning and SLIPCOVER, see the cplint on SWISH manual (or PDF version).
In this tutorial section we will see how to execute a test on a program. We consider the Machines dataset already seen in the previous sections (Parameter Learning and Structure Learning).
A program can also be tested on a test set with a query of the form
?- test(<program>,<list_of_folds>,LL,AUCROC,ROC,AUCPR,PR).
where <program>
is a list of terms representing clauses and <list_of_folds>
is a list of folds. This returns the log likelihood of the test examples in LL
, the Area Under the ROC curve in AUCROC
, a dictionary containing the list of points (in the form of Prolog pairs x-y) of the ROC curve in ROC
, the Area Under the PR curve in AUCPR, and a dictionary containing the list of points of the PR curve in PR.
Let us suppose now that we have two disjoint folds of examples named train
and test
. We will now see how to test a (learned) program.
We can test the input program on the fold test with a query of the form
?- in(P), test(P,[test],LL,AUCROC,ROC,AUCPR,PR).
Suppose we want to perform parameter learning on the initial program by using the train
fold and then test the resulting program by using the test
fold. Then we have just to run the query
?- induce_par([train],P), test(P,[test],LL,AUCROC,ROC,AUCPR,PR).
Suppose we want to learn new clauses (i.e. we perform structure learning) by using the train
fold and then test the resulting program by using the test
fold. Then we have just to run the query
?- induce([train],P), test(P,[test],LL,AUCROC,ROC,AUCPR,PR).
It is possible to see the ROC and PR curves as graphs by including the renderer c3 before :- sc. Moreover, we include the renderer lpad to have the output program pretty printed. Therefore we add the following commands in the preamble, before :- sc.:
:- use_rendering(c3).
:- use_rendering(lpad).
We can intensionally create the fold containing all the examples with
fold(all,F):- findall(I,int(I),F).
We can dynamically create the folds train
and test
with the following command
:- fold(all,F),
sample(4,F,FTr,FTe),
assert(fold(train,FTr)),
assert(fold(test,FTe)).
This last command should however be inserted after the input interpretations. As can be seen, it uses sample(N,List,Sampled,Rest), exported from the library slipcover, which samples N elements from List and returns the sampled elements in Sampled and the rest in Rest. If List has N elements or fewer, Sampled is equal to List and Rest is empty.
If we want to learn the parameters of the initial program and then test the resulting program, we can use the following query
induce_par([train],P), test(P,[test],LL,AUCROC,ROC,AUCPR,PR).
If we want to learn a program and then test it, we can use the following query
induce([train],P), test(P,[test],LL,AUCROC,ROC,AUCPR,PR).
Complete example: mach.pl
For more information about how to perform learning see the cplint on SWISH manual (or PDF version).