cplint on SWISH Manual

cplint permits the definition of discrete probability distributions and continuous probability densities.

Discrete Probability Distributions

LPAD and CP-logic programs consist of a set of annotated disjunctive clauses. Disjunction in the head is represented with a semicolon and atoms in the head are separated from probabilities by a colon. For the rest, the usual syntax of Prolog is used. A general CP-logic clause has the form

where Body is a conjunction of goals as in Prolog. No parentheses are necessary. The pi are numeric expressions. It is up to the user to ensure that the numeric expressions are legal, i.e. that they sum to at most one.

If the clause has a single head with probability 1, the annotation can be omitted and the clause takes the form of a normal Prolog clause, i.e.

The first clause states that if we toss a coin that is not biased it has equal probability of landing heads and tails. The second states that if the coin is biased it has a slightly higher probability of landing heads. The third states that the coin is fair with probability 0.9 and biased with probability 0.1 and the last clause states that we toss a coin with certainty.

Moreover, the bodies of rules may contain built-in predicates, predicates from the libraries lists, apply and clpr/nf_r plus the predicate

The body of rules may also contain the predicate prob/2 that computes the probability of an atom, thus allowing nested probability computations. For example (meta.pl)

Moreover, the probabilistic annotations can be variables, as in (flexprob.pl)

Variables in probabilistic annotations must be ground when resolution reaches the end of the body, otherwise an exception is raised.

where A is an atom containing variable Var and D is a list of couples Value:Prob assigning probability Prob to Value. Moreover, you can use

where A is an atom containing variable Var and D is a list of values each taking the same probability (1 over the length of D).

ProbLog Syntax

PRISM Syntax

You can also use PRISM [19] syntax, so a program is composed of a set of regular Prolog rules whose body may contain calls to the msw/2 predicate (multi-ary switch). A call msw(term,value) means that a random variable associated to term assumes value value. The admissible values for a discrete random variable are specified using facts for the values/2 predicate of the form

where T is a term (possibly containing variables) and L is a list of values. The distribution over values is specified using directives for set_sw/2 of the form

where T is a term (possibly containing variables) and LP is a list of probability values. Remember that in PRISM each call to msw/2 refers to a different random variable, i.e., no memoing is performed, differently from the case of LPAD/CP-Logic/ProbLog.
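The difference between the two behaviours can be illustrated outside Prolog. The following Python sketch is not cplint or PRISM code, and the class names Switch and MemoedSwitch are hypothetical: the first draws a fresh value at every call, as described for msw/2, while the second stores its first sampled value and reuses it, as happens with memoing.

```python
import random


class Switch:
    """A discrete random variable; every draw() is an independent sample."""

    def __init__(self, values, probs, rng):
        self.values = values
        self.probs = probs
        self.rng = rng

    def draw(self):
        return self.rng.choices(self.values, weights=self.probs, k=1)[0]


class MemoedSwitch(Switch):
    """Same variable, but the first sampled value is stored and reused."""

    def __init__(self, values, probs, rng):
        super().__init__(values, probs, rng)
        self._value = None

    def draw(self):
        if self._value is None:
            self._value = super().draw()
        return self._value


rng = random.Random(0)
fresh = Switch(["heads", "tails"], [0.5, 0.5], rng)
memoed = MemoedSwitch(["heads", "tails"], [0.5, 0.5], rng)

fresh_draws = [fresh.draw() for _ in range(100)]
memoed_draws = [memoed.draw() for _ in range(100)]
print(len(set(memoed_draws)))  # 1: the stored value is reused
```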

Continuous Probability Densities

cplint handles continuous random variables as well with its sampling inference module. To specify a probability density on an argument Var of an atom A you can use rules of the form

where Density is a special atom identifying a probability density on variable Var and Body (optional) is a regular clause body. Allowed Density atoms are

states that argument X of g(X) follows a Gaussian distribution with mean 0 and variance 1, while

states that argument X of g(X) follows a Gaussian multivariate distribution with mean vector \([0,0]\) and covariance matrix \[\left[\begin{array}{rr} 1&0\\ 0&1 \end{array}\right]\].

The argument X of mix(X) follows a distribution that is a mixture of two Gaussians, one with mean 0 and variance 1 with probability 0.6 and one with mean 5 and variance 2 with probability 0.4.
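To make the meaning of such a mixture concrete, here is a minimal Python sketch (not cplint code; the function name sample_mix is hypothetical). It first picks a component with probability 0.6 or 0.4 and then draws from the chosen Gaussian. Since random.gauss expects a standard deviation, the variances 1 and 2 are converted with a square root.

```python
import math
import random


def sample_mix(rng):
    """Draw one value from 0.6 * N(0, variance 1) + 0.4 * N(5, variance 2)."""
    if rng.random() < 0.6:
        return rng.gauss(0.0, math.sqrt(1.0))
    return rng.gauss(5.0, math.sqrt(2.0))


rng = random.Random(0)
samples = [sample_mix(rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(round(mean, 2))  # close to the mixture mean 0.6*0 + 0.4*5 = 2.0
```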

The parameters of the distribution atoms can be taken from the probabilistic atom. The example gauss_mean_est.pl

states that for an index I the continuous variable X is sampled from a Gaussian whose variance is 2 and whose mean is sampled from a Gaussian with mean 1 and variance 5.

Any operation is allowed on continuous random variables. The example below (kalman_filter.pl) encodes a Kalman filter:

Continuous random variables are involved in arithmetic expressions (in trans/3 and emit/3). It is often convenient, as in this case, to use CLP(R) constraints (by including the directive :- use_module(library(clpr)).) as in this way the expressions can be used in multiple directions and the same clauses can be used both to sample and to evaluate the weight of the sample on the basis of evidence, otherwise different clauses have to be written. In case random variables are not sufficiently instantiated to exploit expressions for inferring the values of other variables, inference will return an error.

Distributional Clauses Syntax

You can also use the syntax of Distributional Clauses (DC) [13]. Continuous random variables are represented in this case by terms whose distribution can be specified with density atoms as in

Here := replaces the implication symbol, T is a term and Density' is one of the density atoms above without the Var argument, because T itself represents a random variable. In the body of clauses you can use the infix operator ~= to equate a term representing a random variable with a logical variable or a constant as in T ~= X. Internally cplint transforms the terms representing random variables into atoms with an extra argument for holding the variable. DC can also be used to represent discrete distributions using

where L is a list of values and D is a list of couples P:V with P a probability and V a value. If Body is empty, as in regular Prolog, the implication symbol := can be omitted.

Semantics

The semantics of LPADs for the case of programs without function symbols can be given as follows. An LPAD defines a probability distribution over normal logic programs called worlds. A world is obtained from an LPAD by first grounding it, by selecting a single head atom for each ground clause and by including in the world the clause with the selected head atom and the body. The probability of a world is the product of the probabilities associated to the heads selected. The probability of a ground atom (the query) is given by the sum of the probabilities of the worlds where the query is true.
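The definition can be checked by hand on the coin program. The Python sketch below is only an illustration of the semantics, not how cplint computes probabilities: it enumerates the worlds, multiplies the probabilities of the selected heads and sums over the worlds where heads(coin) is true. The 0.6/0.4 split for the biased coin is an assumed value, since the text only says the biased coin lands heads slightly more often.

```python
from itertools import product

# Head choices (atom, probability) for each ground probabilistic clause.
# The 0.6/0.4 split for the biased coin is an assumed value for illustration.
clause_fair_toss = [("heads", 0.5), ("tails", 0.5)]    # body: toss, not biased
clause_biased_toss = [("heads", 0.6), ("tails", 0.4)]  # body: toss, biased
clause_bias = [("fair", 0.9), ("biased", 0.1)]         # empty body


def heads_true(fair_toss, biased_toss, bias):
    """Evaluate heads(coin) in the world fixed by the three head choices."""
    if bias == "biased":
        return biased_toss == "heads"
    return fair_toss == "heads"


prob_heads = 0.0
total = 0.0
for (a, pa), (b, pb), (c, pc) in product(
    clause_fair_toss, clause_biased_toss, clause_bias
):
    world_prob = pa * pb * pc      # product of the selected head probabilities
    total += world_prob
    if heads_true(a, b, c):
        prob_heads += world_prob   # sum over the worlds where the query holds

print(round(prob_heads, 4))  # 0.51
```

Under these assumptions the result is 0.9 * 0.5 + 0.1 * 0.6 = 0.51, and the eight world probabilities sum to 1.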

If the LPAD contains function symbols, the definition is more complex, see [15,18,20].

For the semantics of programs with continuous random variables, see [10] that defines the probability space for \(N\) continuous random variables by considering the Borel \(\sigma\)-algebra over \(\mathbb{R}^N\) and defines a Lebesgue measure on this set as the probability measure. The probability space is lifted to cover the entire program using the least model semantics of constraint logic programs. Alternatively, [13] defines the semantics of distributional clauses by resorting to a stochastic \(Tp\) operator. cplint allows more freedom than distributional clauses in the use of continuous random variables in expressions, for example kalman_filter.pl would not be allowed by distributional clauses.

Inference

cplint answers queries using the module pita or mcintyre. The first performs the program transformation technique of [16]. Differently from that work, techniques alternative to tabling and answer subsumption are used. The latter performs approximate inference by sampling using a different program transformation technique and is described in [17]. Only mcintyre is able to handle continuous random variables.

For answering queries, you have to prepare a Prolog file where you first load the inference module (for example pita), initialize it with a directive (for example :- pita) and then enclose the LPAD clauses in :-begin_lpad. or :-begin_plp. and :-end_lpad. or :-end_plp. For example, the coin program above can be stored in coin.pl for performing inference with pita as follows

You can have also (non-probabilistic) clauses outside :-begin/end_lpad. These are considered as database clauses. In pita subgoals in the body of probabilistic clauses can query them by enclosing the query in db/1. For example (testdb.pl)

You can also use findall/3 on subgoals defined by database clauses (persons.pl)

Aggregate predicates on probabilistic subgoals are not implemented due to their high computational cost (if the aggregation is over \(n\) atoms, the values of the aggregation are potentially \(2^n\)). The Yap version of cplint includes reasoning algorithms that allow aggregate predicates on probabilistic subgoals, see http://ds.ing.unife.it/~friguzzi/software/cplint/manual.html.

In mcintyre you can query database clauses in the body of probabilistic clauses without any special syntax. You can also use findall/3.

Unconditional Queries

The unconditional probability of an atom can be asked using pita with the predicate

If the query is non-ground, prob/2 returns in backtracking the successful instantiations together with their probability.

that samples heads(coin) 1000 times and returns in S the number of successes, in F the number of failures and in P the estimated probability (S/1000).

Differently from exact inference, in approximate inference the query can be a conjunction of atoms.

that samples heads(coin) 1000 times and returns the estimated probability that a sample is true (i.e., that a sample succeeds).
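The estimate produced by sampling can be reproduced with a short Python sketch. This is not the mcintyre implementation: the function names are hypothetical and the probability 0.6 for heads on a biased coin is an assumed value. Each sample draws the probabilistic choices of the coin program and checks whether heads(coin) holds.

```python
import random


def sample_heads(rng):
    """Sample one world of the coin program; return True if heads(coin) holds."""
    biased = rng.random() < 0.1        # fair with 0.9, biased with 0.1
    p_heads = 0.6 if biased else 0.5   # 0.6 is an assumed value
    return rng.random() < p_heads


def estimate(n_samples, seed=0):
    """Return (successes, failures, estimated probability)."""
    rng = random.Random(seed)
    successes = sum(sample_heads(rng) for _ in range(n_samples))
    failures = n_samples - successes
    return successes, failures, successes / n_samples


S, F, P = estimate(1000)
print(S, F, P)
```

With 1000 samples the estimate P typically falls within a few hundredths of the exact value 0.51 obtained under the same assumptions.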

The predicate samples Query a number of Samples times. Arg should be a variable in Query. The predicate returns in Values a list of couples L-N where L is the list of values of Arg for which Query succeeds in a world sampled at random and N is the number of samples returning that list of values. If L is the empty list, it means that for that sample the query failed. If L is a list with a single element, it means that for that sample the query is determinate. If, in all couples L-N, L is a list with a single element, it means that the clauses in the program are mutually exclusive, i.e., that in every sample, only one clause for each subgoal has the body true. This is one of the assumptions taken for programs of the PRISM system [20]. For example pfcglr.pl and plcg.pl satisfy this constraint while markov_chain.pl and var_obj.pl don’t.

of markov_chain.pl that takes 50 samples of L in findall(S,reach(s0,0,S),L).

that samples Query a number of Samples times. The predicate returns in Values a list of values of Arg returned as the first answer by Query in a world sampled at random. The value is failure if the query fails.

samples Query a number of Samples times and returns in Values a list of couples V-N where V is the value of Arg returned as the first answer by Query in a world sampled at random and N is the number of samples returning that value. V is failure if the query fails. mc_sample_arg_first/4 differs from mc_sample_arg/4 because the first just computes the first answer of the query for each sampled world.

samples Query a number of Samples times and returns in Values a list of couples V-N where V is a value sampled with uniform probability from those returned by Query in a world sampled at random and N is the number of samples returning that value. V is failure if the query fails.

that computes the expected value of Arg in Query by sampling. It takes N samples of Query and sums up the value of Arg for each sample. The overall sum is divided by N to give Exp.

of pctl_slep.pl that returns in E the expected value of T by taking 1000 samples.

Drawing BDDs

The first writes the BDD to a file, the latter returns it as a string. The BDD is represented in the dot format of graphviz. Solid edges indicate 1-children, dashed edges indicate 0-children and dotted edges indicate 0-children with negation applied to the sub BDD. Each level of the BDD is associated to a variable of the form XI_J indicated on the left: I indicates the multivalued variable index and J the index of the Boolean variable of rule I. The hexadecimal number in each node is part of its address in memory and is not significant. The table Var contains the associations between the rule groundings and the multivalued variables: the first column contains the multivalued variable index, the second column contains the rule index, corresponding to its position in the program, and the last column contains the list of constants grounding the rule, each replacing a variable in the order of appearance in the rule.

Conditional Queries on Discrete Variables

The conditional probability of an atom query given another atom evidence can be asked using pita with the predicate

If the query/evidence are non-ground, prob/3 returns in backtracking ground instantiations together with their probability.

The query and the evidence can be conjunctions of literals (positive or negative).

When using mcintyre, you can ask conditional queries with rejection sampling or with Metropolis-Hastings Markov Chain Monte Carlo. In rejection sampling [22], you first query the evidence and, if the query is successful, query the goal in the same sample, otherwise the sample is discarded. In Metropolis-Hastings MCMC, mcintyre follows the algorithm proposed in [12] (the non adaptive version). A Markov chain is built by building an initial sample and by generating successor samples.

The initial sample is built by randomly sampling choices so that the evidence is true. This is done with a backtracking meta-interpreter that starts with the goal and randomizes the order in which clauses are selected during the search so that the initial sample is unbiased. Each time the meta-interpreter encounters a probabilistic choice, it first checks whether a value has already been sampled, if not, it takes a sample and records it. If a failure is obtained, the meta-interpreter backtracks to other clauses but without deleting samples. Then the goal is queried using regular MCINTYRE.

A successor sample is obtained by deleting a fixed number (parameter Lag) of sampled probabilistic choices. Then the evidence is queried using regular MCINTYRE starting with the undeleted choices. If the query succeeds, the goal is queried using regular MCINTYRE. The sample is accepted with a probability of \(\min\{1,\frac{N_0}{N_1}\}\) where \(N_0\) is the number of choices sampled in the previous sample and \(N_1\) is the number of choices sampled in the current sample. In [12] the lag is always 1 but the proof in [12] that the above acceptance probability yields a valid Metropolis-Hastings algorithm holds also when forgetting more than one sampled choice, so the lag is user defined in cplint.

Then the number of successes of the query is increased by 1 if the query succeeded in the last accepted sample. The final probability is given by the number of successes over the total number of samples.
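The acceptance rule described above is simple to state in code. The following Python sketch only illustrates the formula \(\min\{1,\frac{N_0}{N_1}\}\); the function names are hypothetical and it assumes that the current sample contains at least one sampled choice.

```python
import random


def acceptance_probability(n_prev, n_curr):
    """min{1, N0/N1}: N0 choices in the previous sample, N1 in the current one.

    Assumes n_curr >= 1.
    """
    return min(1.0, n_prev / n_curr)


def accept(n_prev, n_curr, rng):
    """Accept the successor sample with the probability above."""
    return rng.random() < acceptance_probability(n_prev, n_curr)


print(acceptance_probability(3, 6))  # 0.5
print(acceptance_probability(6, 3))  # 1.0
```

A successor sample with fewer sampled choices than its predecessor is therefore always accepted, while one with more choices is accepted only some of the time.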

that takes 1000 samples where biased(coin) is true and returns in S the number of samples where heads(coin) is true, in F the number of samples where heads(coin) is false and in P the estimated probability (S/1000).
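Rejection sampling on this query can be sketched in Python as follows. This is an illustration of the technique rather than the mcintyre code; the function names are hypothetical and the probability 0.6 for heads on a biased coin is an assumed value. Samples where the evidence biased(coin) is false are discarded and sampling continues until the requested number of samples has been kept.

```python
import random


def sample_world(rng):
    """Sample the coin program; return (biased, heads) as booleans."""
    biased = rng.random() < 0.1        # fair with 0.9, biased with 0.1
    p_heads = 0.6 if biased else 0.5   # 0.6 is an assumed value
    heads = rng.random() < p_heads
    return biased, heads


def rejection_sample(n_kept, seed=0):
    """Keep n_kept samples where the evidence holds; count query outcomes."""
    rng = random.Random(seed)
    successes = failures = 0
    while successes + failures < n_kept:
        biased, heads = sample_world(rng)
        if not biased:
            continue                   # evidence is false: discard the sample
        if heads:
            successes += 1
        else:
            failures += 1
    return successes, failures, successes / n_kept


S, F, P = rejection_sample(1000)
print(S, F, P)  # P should be near 0.6 under the assumed parameters
```

Because the evidence has probability 0.1, roughly nine samples out of ten are thrown away, which shows why rejection sampling becomes expensive when the evidence is unlikely.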

where Lag is the number of sampled choices to forget before taking a new sample. For example (arithm.pl)

takes 10000 accepted samples and returns in T the number of samples where eval(2,4) is true, in F the number of samples where eval(2,4) is false and in P the estimated probability (T/10000).

Moreover, you can sample arguments of queries with rejection sampling and Metropolis-Hastings MCMC using

that return the distribution of values for Arg in Query in Samples of Query given that Evidence is true. Mix indicates the number of mixing samples. The predicate returns in Values a list of couples L-N where L is the list of values of Arg for which Query succeeds in a world sampled at random where Evidence is true and N is the number of samples returning that list of values.

that computes the expected value of Arg in Query by sampling. It takes N samples of Query and sums up the value of Arg for each sample. The overall sum is divided by N to give Exp.

of pctl_slep.pl that returns in E the expected value of T by taking 1000 samples.

of arithm.pl that computes the expectation of argument Y of eval(2,Y) given that eval(1,3) is true by taking 1000 samples using Metropolis-Hastings MCMC.

Conditional Queries on Continuous Variables

When you have continuous random variables, you may be interested in sampling arguments of goals representing continuous random variables. In this way you can build a probability density of the sampled argument. When you do not have evidence or you have evidence on atoms not depending on continuous random variables, you can use the above predicates for sampling arguments.

from gauss_mean_est.pl samples 1000 values for X in value(0,X) and returns them in L.

When you have evidence on ground atoms that have continuous values as arguments, you cannot use rejection sampling or Metropolis-Hastings, as the probability of the evidence is 0. For example, the probability of sampling a specific value from a Gaussian is 0. Continuous variables have probability densities rather than probability distributions, unlike discrete variables. In this case, you can use likelihood weighting or particle filtering [9,11,13] to obtain samples of continuous arguments of a goal.

For each sample to be taken, likelihood weighting uses a meta-interpreter to find a sample where the goal is true, randomizing the choice of clauses when more than one resolves with the goal in order to obtain an unbiased sample. This meta-interpreter is similar to the one used to generate the first sample in Metropolis-Hastings.

Then a different meta-interpreter is used to evaluate the weight of the sample. This meta-interpreter starts with the evidence as the query and a weight of 1. Each time the meta-interpreter encounters a probabilistic choice over a continuous variable, it first checks whether a value has already been sampled. If so, it computes the probability density of the sampled value and multiplies the weight by it. If the value has not been sampled, it takes a sample and records it, leaving the weight unchanged. In this way, each sample in the result has a weight that is 1 for the prior distribution and that may be different from 1 for the posterior distribution, reflecting the influence of evidence.

In particle filtering, the evidence is a list of atoms. Each sample is weighted by the likelihood of an element of the evidence and constitutes a particle. After weighting, particles are resampled and the next element of the evidence is considered.
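The weight-then-resample loop can be illustrated with a small bootstrap particle filter in Python. This is a sketch of the general technique, not the mcintyre implementation. The model is an assumption chosen for illustration: an initial state drawn from a Gaussian with mean 0 and variance 1, a random-walk transition with noise variance 2 and an observation with noise variance 1. The observation values are also hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, variance):
    """Density of a Gaussian with the given mean and variance at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * variance)) / math.sqrt(
        2.0 * math.pi * variance
    )


def particle_filter(observations, n_particles, seed=0):
    """Return the weighted mean of the state after each observation."""
    rng = random.Random(seed)
    # Assumed model: x0 ~ N(0, 1), x_t = x_{t-1} + N(0, var 2), y_t = x_t + N(0, var 1).
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    filtered_means = []
    for y in observations:
        # Propagate every particle through the transition model.
        particles = [x + rng.gauss(0.0, math.sqrt(2.0)) for x in particles]
        # Weight each particle by the likelihood of the current evidence element.
        weights = [gaussian_pdf(y, x, 1.0) for x in particles]
        total = sum(weights)
        filtered_means.append(
            sum(x * w for x, w in zip(particles, weights)) / total
        )
        # Resample before moving on to the next element of the evidence.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return filtered_means


means = particle_filter([1.0, 1.5, 2.0, 2.5], n_particles=20_000)
print([round(m, 2) for m in means])
```

For this linear Gaussian model the exact filtered means can be computed with the Kalman recursion, which gives 0.75, 1.3, 1.8125 and about 2.316 for the four observations, so the particle estimates should land close to these values.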

samples Query a number of Samples times given that Evidence (a conjunction of atoms is allowed here) is true. The predicate returns in Prob the probability that the query is true. It performs likelihood weighting: each sample is weighted by the likelihood of evidence in the sample. For example

from indian_gpa.pl samples the query nation(a) 1000 times given that student_gpa(4.0) has been observed.

returns in ValList a list of couples V-W where V is a value of Arg for which Query succeeds and W is the weight computed by likelihood weighting according to Evidence (a conjunction of atoms is allowed here). For example

from gauss_mean_est.pl samples 100 values for X in value(0,X) given that value(1,9) and value(2,8) have been observed.
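The effect of likelihood weighting on this model can be reproduced with a Python sketch. It is not the mcintyre implementation and the function names are hypothetical. It uses the parameters stated for gauss_mean_est.pl (mean drawn from a Gaussian with mean 1 and variance 5, values drawn from a Gaussian with that mean and variance 2) and weights each sample by the density of the observations 9 and 8. More samples than in the query above are used so that the estimate is stable.

```python
import math
import random


def gaussian_pdf(x, mean, variance):
    """Density of a Gaussian with the given mean and variance at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * variance)) / math.sqrt(
        2.0 * math.pi * variance
    )


def likelihood_weighting(observations, n_samples, seed=0):
    """Return a list of (x, weight) pairs for the query value(0, X)."""
    rng = random.Random(seed)
    weighted = []
    for _ in range(n_samples):
        mean = rng.gauss(1.0, math.sqrt(5.0))   # shared mean, prior N(1, var 5)
        x = rng.gauss(mean, math.sqrt(2.0))     # sampled value for the query
        weight = 1.0
        for obs in observations:
            # The mean is already sampled, so each observation multiplies
            # the weight by its density under N(mean, var 2).
            weight *= gaussian_pdf(obs, mean, 2.0)
        weighted.append((x, weight))
    return weighted


samples = likelihood_weighting([9.0, 8.0], 200_000)
total_weight = sum(w for _, w in samples)
posterior_mean = sum(x * w for x, w in samples) / total_weight
print(round(posterior_mean, 2))  # close to the analytic value 7.25
```

The analytic value follows from the conjugate Gaussian update: the posterior precision of the mean is 1/5 + 2/2 = 1.2 and its posterior mean is (1/5 + 17/2) / 1.2 = 7.25, which is also the expected value of X.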

samples Query a number of Samples times given that Evidence is true using particle filtering. Evidence is a list of goals. The predicate returns in Prob the probability that the query is true. For each goal of Evidence, Samples samples of Query are taken and weighted by the likelihood of the evidence goal. Then particles are resampled and the next element of Evidence is considered.

samples argument Arg of Query using particle filtering given that Evidence is true. Evidence is a list of goals and Query can be either a single goal or a list of goals. When Query is a single goal, the predicate returns in Values a list of couples V-W where V is a value of Arg for which Query succeeds in a particle in the last set of particles and W is the weight of the particle. For each element of Evidence, the particles are obtained by sampling Query in each current particle and weighting the particle by the likelihood of the evidence element.

When Query is a list of goals, Arg is a list of variables, one for each query of Query, and Arg and Query must have the same length as Evidence. Values is then a list of the same length as Evidence and each of its elements is a list of couples V-W where V is a value of the corresponding element of Arg for which the corresponding element of Query succeeds in a particle and W is the weight of the particle. For each element of Evidence, the particles are obtained by sampling the corresponding element of Query in each current particle and weighting the particle by the likelihood of the evidence element.

from kalman_filter.pl performs particle filtering for a Kalman filter with four observations. For each observation, the value of the state at the same time point is sampled. The list of samples is returned in [F1,F2,F3,F4], with each element being the sample for a time point.

Causal Inference

pita and mcintyre support causal reasoning, i.e., computing the effect of actions using the do-calculus [14].

Actions in this setting are represented as literals of action predicates, that must be declared as such with the directive

When performing causal reasoning, action literals must be enclosed in the do/1 functor and included in the evidence conjunction. More than one action can be included (each in a separate do/1 term) and actions and observations can be freely mixed. All conditional inference goals can be used except those for particle filtering.

from simpson.swinb computes the probability of recovery of a patient given that the action of administering a drug has been performed.

Graphing the Results

In cplint on SWISH you can draw graphs for visualizing the results either with C3.js or with R. Similar predicates are available for the two methods. There are two types of graphs: those that represent individual probability values with a bar chart and those that visualize the results of sampling arguments.

Using C3.js

You can draw the probability of a query being true and being false as a bar chart with prob_bar(:Query:atom,-Probability:dict) as in

before :- pita. P will be instantiated with a dict for rendering with c3. It will be shown as a bar chart with a bar for the probability of heads(coin) true and a bar for the probability of heads(coin) false.

before :- pita. Solid edges indicate 1-children, dashed edges indicate 0-children and dotted edges indicate 0-children with negation applied to the sub BDD. Each level of the BDD is associated to a variable of the form XI_J indicated on the left: I indicates the multivalued variable index and J the index of the Boolean variable of rule I. The hexadecimal number in each node is part of its address in memory and is not significant. The table Var contains the associations between the rule groundings and the multivalued variables.

returns the BDD for the query heads(coin) and the list of associations between rule groundings and multivalued variables.

that returns in Chart a diagram with one bar for the number of successes and one bar for the number of failures.

that return in Chart a bar chart with a bar for each possible sampled value whose size is the number of samples returning that value.

Drawing a graph is particularly interesting when sampling values for continuous arguments of goals. In this case, you can use the samples to draw the probability density function of the argument. The predicate

draws a histogram of the samples in List dividing the domain in NBins bins. List must be a list of couples of the form [V]-W or V-W where V is a sampled value and W is its weight. This is the format of the list of samples returned by argument sampling predicates.
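The binning step can be sketched in Python as follows. This is only an illustration of dividing the domain into equal-width bins and accumulating weights. The function name is hypothetical, samples are represented as (value, weight) tuples rather than V-W couples, and the binning actually used by cplint may differ in its details.

```python
def weighted_histogram(samples, n_bins):
    """Split [min, max] into n_bins equal bins and sum the weights per bin.

    samples is a list of (value, weight) pairs; returns (edges, totals).
    """
    values = [v for v, _ in samples]
    low, high = min(values), max(values)
    # Fall back to a unit width when all values coincide.
    width = (high - low) / n_bins or 1.0
    totals = [0.0] * n_bins
    for value, weight in samples:
        # The maximum value is placed in the last bin instead of overflowing.
        index = min(int((value - low) / width), n_bins - 1)
        totals[index] += weight
    edges = [low + i * width for i in range(n_bins + 1)]
    return edges, totals


edges, totals = weighted_histogram(
    [(0.0, 1.0), (1.0, 2.0), (2.0, 1.0), (4.0, 0.5)], n_bins=2
)
print(edges)   # [0.0, 2.0, 4.0]
print(totals)  # [3.0, 1.5]
```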

draws a line chart of the density of the samples in List dividing the domain in NBins bins. List must be as for histogram/3.

is similar to density/3 except that you can specify the limits of the \(X\) axis.

draws a line chart of the density of two sets of samples, usually prior and post observations. The samples in PriorList and PostList can be either couples [V]-W or V-W where V is a value and W its weight. The lines are drawn dividing the domain in NBins bins.

from gauss_mean_est.pl takes 1000 samples of argument X of value(0,X) and draws the density of the samples using a histogram.

from gauss_mean_est.pl takes 1000 samples of argument X of value(0,X) before and after observing (value(1,9),value(2,8)) and draws the prior and posterior densities of the samples using a line chart.

Using R

that works as histogram/3 but does not return the graph as an argument as the graph is printed with a different mechanism.

is like density/3 but does not require the number of bins as input: they are determined by R.

Parameters

The inference modules have a number of parameters in order to control their behavior. They can be set with the directive

after initialization (:-pita. or :-mc.) but outside :-begin/end_lpad. The current value can be read with

from the top-level. The available parameters common to both pita and mcintyre are:

If depth_bound is set to true, derivations are depth-bounded so you can query also programs containing infinite loops, for example programs where queries have an infinite number of explanations. However the probability that is returned is guaranteed only to be a lower bound, see for example markov_chaindb.pl

The example markov_chain.pl shows that mcintyre can perform inference in presence of an infinite number of explanations for the goal. Differently from pita, no depth bound is necessary, as the probability of selecting the infinite computation branch is 0. However, also mcintyre may not terminate if loops not involving probabilistic predicates are present.

If you want to set the seed of the random number generator, you can use SWI-Prolog predicates setrand/1 and getrand/1, see SWI-Prolog manual.

Tabling

You can also use tabling in inference to speed up the computation and/or avoid loops, see the SWI-Prolog manual.

To do so you have to use the tabling library module and declare some of the predicates as tabled. The tabling declarations go after the :-pita. or :- mc. directives.

For example, to compute the probability of paths in undirected graphs you can use the program (path_tabling.swinb)

This program has loops, so if you run the above query without tabling, pita would loop forever.

Learning

Input

Preamble

At this point you can start setting parameters for SLIPCOVER such as for example

A parameter that is particularly important for both SLIPCOVER and LEMUR is verbosity: if set to 1, nothing is printed and learning is fastest, if set to 3 much information is printed and learning is slowest, 2 is in between. This ends the preamble.

Background and Initial LPAD/CPL-program

where the clauses must currently be deterministic. Alternatively, you can specify a set of clauses by including them in a section between :- begin_bg. and :- end_bg. For example

from the mach.pl example. If you specify both a bg/1 fact and a section, the clauses of the two will be combined.

The initial program is used in parameter learning for providing the structure. The indicated parameters do not matter as they are first randomized. Remember to enclose each clause in parentheses because :- has the highest precedence.

Alternatively, you can specify an input program in a section between :- begin_in. and :- end_in., as for example

If you specify both an in/1 fact and a section, the clauses of the two will be combined.

Language Bias

The language bias part contains the declarations of the input and output predicates. Output predicates are declared as

and indicate the predicate whose atoms you want to predict. Derivations for the atoms for these predicates in the input data are built by the system. These are the predicates for which new clauses are generated.

Input predicates are those whose atoms you are not interested in predicting. You can declare closed world input predicates with

For these predicates, the only true atoms are those in the interpretations and those derivable from them using the background knowledge, the clauses in the input/hypothesized program are not used to derive atoms for these predicates. Moreover, clauses of the background knowledge that define closed world input predicates and that call an output predicate in the body will not be used for deriving examples.

In this case, if a subgoal for such a predicate is encountered when deriving a subgoal for the output predicates, the facts in the interpretations, those derivable from them using the background knowledge, the background clauses and the clauses of the input program are all used.

Then, you have to specify the language bias by means of mode declarations in the style of Progol.

specifies the atoms that can appear in the body of clauses. <recall> can be an integer or *. <recall> indicates how many atoms for the predicate specification are retained in the bottom clause during a saturation step. * stands for all those that are found. Otherwise the indicated number is randomly chosen.

For SLIPCOVER, two specialization modes are available: bottom and mode. In the first, a bottom clause is built and the literals to be added during refinement are taken from it. In the latter, no bottom clause is built and the literals to be added during refinement are generated directly from the mode declarations. LEMUR supports only specialization mode mode.

specifies that the argument should be an input variable of type <type>, i.e., a variable replacing a +<type> argument in the head or a -<type> argument in a preceding literal in the current hypothesized clause.

for specifying that the argument should be an output variable of type <type>. Any variable can replace this argument, either input or output. The only constraint on output variables is that those in the head of the current hypothesized clause must appear as output variables in an atom of the body.

for specifying an argument which should be replaced by a constant of type <type> in the bottom clause but should not be used for replacing input variables of the following literals when building the bottom clause or

for specifying an argument which should be replaced by a constant of type <type> in the bottom clause and that should be used for replacing input variables of the following literals when building the bottom clause.

Note that arguments of the form #<type> and -#<type> are not available in specialization mode mode. If you want constants to appear in the literals, you have to indicate them one by one in the mode declarations.

SLIPCOVER and LEMUR also require facts for the determination/2 Aleph-style predicate that indicate which predicates can appear in the body of clauses. For example

These mode declarations are used to generate clauses with more than two head atoms. In them, <s1>,...,<sn> are schemas, <a1>,...,<an> are atoms such that <ai> is obtained from <si> by replacing placemarkers with variables, <Pi/Ari> are the predicates admitted in the body. <a1>,...,<an> are used to indicate which variables should be shared by the atoms in the head. An example of such a mode declaration (from uwcselearn.pl) is

If you want to specify negative literals for addition in the body of clauses, you should define a new predicate in the background as in

Note that successful negative literals do not instantiate the variables, so if you want a variable appearing in a negative literal to be an output variable you must instantiate it before calling the negative literal. The new predicates must also be declared as input

In this case, when a literal matching <literal> is added to the body of a clause during refinement, the literals matching <list of literals> are added as well. An example of such a declaration (from muta.pl) is

Note that <list of literals> is copied with copy_term/2 before matching, so variables in common between <literal> and <list of literals> may not be in common in the refined clause.

It is also possible to specify that a literal can only be added together with other literals with facts of the form

In this case <literal> is added to the body of a clause during refinement only together with literals matching <list of literals>. An example of such a declaration is

Also here <list of literals> is copied with copy_term/2 before matching, so variables in common between <literal> and <list of literals> may not be in common in the refined clause.

In this case <literal> is added to the body of a clause during refinement only together with literals matching <list of literals>, and <list of literals> is not copied before matching, so variables in common between <literal> and <list of literals> are also in common in the refined clause. This is allowed only with specialization set to bottom. An example of such a declaration is

Example Interpretations

The last part of the file contains the data. You can specify data with two modalities: models and keys. In the models type, you specify an example model (or interpretation or megaexample) as a list of Prolog facts initiated by begin(model(<name>)). and terminated by end(model(<name>)). as in
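As a sketch of the models modality, an interpretation named 2 could be written as follows. The facts are loosely modeled on the Bongard example and their exact contents are assumed for illustration.

```prolog
% Interpretation named 2: every fact between begin and end
% belongs to this model.
begin(model(2)).
pos.
triangle(o5).
config(o5,up).
square(o4).
in(o4,o5).
circle(o3).
triangle(o2).
config(o2,up).
in(o2,o3).
triangle(o1).
config(o1,up).
end(model(2)).
```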

An interpretation may also contain a fact for prob/1, such as prob(0.3), assigning a probability (0.3 in this case) to the interpretation. If this is omitted, the probability of each interpretation is considered equal to \(1/n\) where \(n\) is the total number of interpretations. prob/1 can be used to set a different multiplicity for the interpretations.

The facts in the interpretation are loaded into the SWI-Prolog database by adding an extra initial argument equal to the name of the model. After each interpretation is loaded, a fact of the form int(<id>) is asserted, where <id> is the name of the interpretation. This can be used to retrieve the list of interpretations.

Alternatively, with the keys modality, you can directly write the facts and the first argument will be interpreted as a model identifier. The above interpretation in the keys modality is
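A sketch of an interpretation named 2 in the keys modality is shown here, with the model identifier as the first argument of every fact. The facts are loosely modeled on the Bongard example and their exact contents are assumed for illustration.

```prolog
% Every fact carries the model identifier 2 as its first argument.
pos(2).
triangle(2,o5).
config(2,o5,up).
square(2,o4).
in(2,o4,o5).
circle(2,o3).
triangle(2,o2).
config(2,o2,up).
in(2,o2,o3).
triangle(2,o1).
config(2,o1,up).
```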

The keys version of this interpretation is contained in bongardkeys.pl, and it is also how model 2 above is stored in the SWI-Prolog database. The two modalities, models and keys, can be mixed in the same file. Facts for int/1 are not asserted for interpretations in the keys modality but can be added explicitly by the user.

Note that you can add non-probabilistic background knowledge directly to the file, writing the clauses so that they take the model argument into account. For example (carc.pl) contains

that defines intensionally the target predicate party/1. Here M is the model and participant/4 is defined in the interpretations. You can also define intensionally the negative examples with
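Both kinds of intensional definition can be sketched as follows. This is an illustration under assumptions, not the contents of carc.pl: the participant facts, the position of the party value among the arguments, and the use of neg/1 to wrap negative examples are all assumed here.

```prolog
% Hypothetical interpretation facts in the keys modality.
% The first argument is the model; the argument holding the
% party value (the fourth here) is an assumption.
participant(m1, researcher, scuf, no, 23).
participant(m2, president, jvt, yes, 18).

% Target predicate defined intensionally; M is the model.
party(M, P) :-
    participant(M, _, _, P, _).

% Negative examples defined intensionally (assumed form):
% each value of party is a negative example for the other one.
neg(party(M, yes)) :- party(M, no).
neg(party(M, no))  :- party(M, yes).
```

With these facts, party(m1, no) holds and neg(party(m1, yes)) is derived, while neg(party(m1, no)) is not.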

Then you must indicate how the examples are divided in folds with facts of the form: fold(<fold_name>,<list of model identifiers>), as for example
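A sketch of such fold facts is given here. The fold names and the model identifiers are hypothetical and must match the interpretations actually present in the file.

```prolog
% Each fold lists the identifiers of the models it contains.
fold(train, [2, 3, 5]).
fold(test,  [490, 491, 494]).
```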

As the input file is a Prolog program, you can define intensionally the folds as in

which however must be inserted after the input interpretations otherwise the facts for int/1 will not be available and the fold all would be empty. This command uses sample(N,List,Sampled,Rest) exported from slipcover that samples N elements from List and returns the sampled elements in Sampled and the rest in Rest. If List has N elements or less, Sampled is equal to List and Rest is empty.
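The dependence on int/1 can be seen in a small plain-Prolog sketch. The int/1 facts below are hypothetical stand-ins for those asserted when interpretations are loaded, and only the fold named all is defined.

```prolog
:- dynamic int/1.

% Hypothetical int/1 facts; in a real input file they are
% asserted as each interpretation is loaded.
int(1).
int(2).
int(3).
int(4).

% The fold named all collects every interpretation identifier.
fold(all, F) :-
    findall(I, int(I), F).
```

Here fold(all, F) binds F to [1,2,3,4]. If no int/1 fact were present yet, findall/3 would return the empty list, which is why the definition must come after the interpretations.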

Commands

Parameter Learning

To execute EMBLEM, prepare an input file in the editor panel as indicated above and call

where <list of folds> is a list of the folds for training and P will contain the input program with updated parameters.

For example, with bongard.pl you can perform parameter learning on the train fold with
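A sketch of such a query follows. It assumes that the EMBLEM entry point is induce_par/2, as in the cplint distribution, and that the loaded file defines a fold named train.

```prolog
% Requires bongard.pl (or a similar input file) loaded in the editor panel.
% P is bound to the input program with updated parameters.
?- induce_par([train], P).
```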

Structure Learning

To execute SLIPCOVER, prepare an input file in the editor panel as indicated above and call

where <list of folds> is a list of the folds for training and P will contain the learned program.

For example, with bongard.pl you can perform structure learning on the train fold with
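A sketch of the corresponding query with induce/2 follows, assuming the loaded file defines a fold named train.

```prolog
% Requires bongard.pl (or a similar input file) loaded in the editor panel.
% P is bound to the learned program.
?- induce([train], P).
```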

A program can also be tested on a test set with test/7 or test_prob/6 as described below.

Between two executions of induce/2 you should exit SWI-Prolog to have a clean database.

To execute LEMUR, prepare an input file in the editor panel as indicated above and call

where <list of folds> is a list of the folds for training and P will contain the learned program.

For example, with bongard.pl you can perform structure learning on the train fold with
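A sketch of the corresponding LEMUR query with induce_lm/2 follows, again assuming the loaded file defines a fold named train.

```prolog
% Requires an input file prepared for LEMUR loaded in the editor panel.
% P is bound to the learned program.
?- induce_lm([train], P).
```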

A program can also be tested on a test set with test_lm/7 or test_prob_lm/6 that are LEMUR versions of the SLIPCOVER test predicates described below.

Between two executions of induce_lm/2 you should exit SWI-Prolog to have a clean database.

Testing

where <program> is a list of terms representing clauses and <list of folds> is a list of folds.

test/7 returns the log likelihood of the test examples in LL, the Area Under the ROC curve in AUCROC, a dictionary containing the list of points (in the form of Prolog pairs x-y) of the ROC curve in ROC, the Area Under the PR curve in AUCPR, a dictionary containing the list of points of the PR curve in PR.

test_prob/6 returns the log likelihood of the test examples in LL, the numbers of positive and negative examples in NPos and NNeg, and the list ExampleList containing pairs Prob-Ex, where Ex is a for a positive example a and \+(a) for a negative example a, and Prob is the probability of example a.

(from pack auc) that takes as input a list ExampleList of pairs probability-literal of the form that is returned by test_prob/6.

For example, to test on fold test the program learned on fold train you can run the query
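A sketch of such a query is shown here. It assumes that parameter learning is done with induce_par/2, that folds named train and test are defined in the loaded file, and that test/7 takes the program and the list of folds followed by the output arguments in the order described above.

```prolog
% Learn parameters on the train fold, then evaluate on the test fold.
% LL is the log likelihood; AUCROC and AUCPR are the areas under
% the ROC and PR curves; ROC and PR hold the curve points.
?- induce_par([train], P),
   test(P, [test], LL, AUCROC, ROC, AUCPR, PR).
```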

in the code before :- sc. the curves will be shown as graphs using C3.js and the output program will be pretty printed.

You can also draw the curves in cplint on SWISH using R by loading library cplint_r with

that takes as input a list ExampleList of pairs probability-literal of the form that is returned by test_prob/6.

Parameters

Download Query Results through an API

The results of queries can also be downloaded programmatically by accessing the Pengine API directly. Example client code is available. For example, the swish-ask.sh client can be used with bash to download the results of a query in CSV. The call below downloads a CSV file for the coin example.

Results can be downloaded in JSON using the option --json-s or --json-html. With the first, the output is in a simple string format where Prolog terms are sent using quoted write; with the second, responses are serialized as HTML strings. E.g.

Prolog can exploit the Pengine API directly. For example, the above can be called as:

Manual in PDF

Bibliography

1. Elena Bellodi and Fabrizio Riguzzi. 2011. EM over binary decision diagrams for probabilistic logic programs. Proceedings of the 26th italian conference on computational logic (CILC2011), pescara, italy, 31 august 31-2 september, 2011. Retrieved from http://www.ing.unife.it/docenti/FabrizioRiguzzi/Papers/BelRig-CILC11.pdf

2. Elena Bellodi and Fabrizio Riguzzi. 2011. EM over binary decision diagrams for probabilistic logic programs. Dipartimento di Ingegneria, Università di Ferrara, Italy. Retrieved from http://www.unife.it/dipartimento/ingegneria/informazione/informatica/rapporti-tecnici-1/CS-2011-01.pdf/view

3. Elena Bellodi and Fabrizio Riguzzi. 2013. Expectation Maximization over binary decision diagrams for probabilistic logic programs. Intelligent Data Analysis 17, 2: 343–363. Retrieved from http://ds.ing.unife.it/~friguzzi/Papers/BelRig13-IDA-IJ.pdf

4. Elena Bellodi and Fabrizio Riguzzi. 2015. Structure learning of probabilistic logic programs by searching the clause space. Theory and Practice of Logic Programming 15, 2: 169–212. Retrieved from http://arxiv.org/abs/1309.2080

5. William W. Cohen. 1995. Pac-learning non-recursive prolog clauses. Artif. Intell. 79, 1: 1–38.

6. L. De Raedt and W. Van Laer. 1995. Inductive constraint logic. Proceedings of the 6th conference on algorithmic learning theory (alt 1995), Springer, 80–94.

7. L. De Raedt, A. Kimmig, and H. Toivonen. 2007. ProbLog: A probabilistic Prolog and its application in link discovery. International joint conference on artificial intelligence, 2462–2467.

8. Nicola Di Mauro, Elena Bellodi, and Fabrizio Riguzzi. 2015. Bandit-based Monte-Carlo structure learning of probabilistic logic programs. Mach. Learn. 100, 1: 127–156. http://doi.org/10.1007/s10994-015-5510-3

9. Robert M Fung and Kuo-Chu Chang. 1990. Weighing and integrating evidence for stochastic simulation in bayesian networks. Fifth annual conference on uncertainty in artificial intelligence, North-Holland Publishing Co., 209–220.

10. Muhammad Asiful Islam, CR Ramakrishnan, and IV Ramakrishnan. 2012. Inference in probabilistic logic programs with continuous random variables. tplp_j 12, Special Issue 4-5: 505–523. http://doi.org/10.1017/S1471068412000154

11. D. Koller and N. Friedman. 2009. Probabilistic graphical models: Principles and techniques. MIT Press, Cambridge, MA.

12. Arun Nampally and CR Ramakrishnan. 2014. Adaptive mcmc-based inference in probabilistic logic programs. arXiv preprint arXiv:1403.6036. Retrieved from http://arxiv.org/pdf/1403.6036.pdf

13. Davide Nitti, Tinne De Laet, and Luc De Raedt. 2016. Probabilistic logic programming for hybrid relational domains. Mach. Learn. 103, 3: 407–449. http://doi.org/10.1007/s10994-016-5558-8

14. J. Pearl. 2000. Causality. Cambridge University Press.

15. David Poole. 1997. The independent choice logic for modelling multiple agents under uncertainty. Artificial Intelligence 94, 1-2: 7–56.

16. Fabrizio Riguzzi and Terrance Swift. 2010. Tabling and Answer Subsumption for Reasoning on Logic Programs with Annotated Disjunctions. Technical communications of the international conference on logic programming, Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 162–171. http://doi.org/10.4230/LIPIcs.ICLP.2010.162

17. Fabrizio Riguzzi. 2013. MCINTYRE: A Monte Carlo system for probabilistic logic programming. Fundamenta Informaticae 124, 4: 521–541. Retrieved from http://ds.ing.unife.it/~friguzzi/Papers/Rig13-FI-IJ.pdf

18. Fabrizio Riguzzi. 2015. The distribution semantics is well-defined for all normal programs. Proceedings of the 2nd international workshop on probabilistic logic programming (plp), Sun SITE Central Europe, 69–84. Retrieved from http://ceur-ws.org/Vol-1413/#paper-06

19. Taisuke Sato and Yoshitaka Kameya. 1997. PRISM: A language for symbolic-statistical modeling. International joint conference on artificial intelligence, 1330–1339.

20. Taisuke Sato and Yoshitaka Kameya. 2001. Parameter learning of logic programs for symbolic-statistical modeling. J. Artif. Intell. Res. 15: 391–454.

21. J. Vennekens, S. Verbaeten, and M. Bruynooghe. 2004. Logic programs with annotated disjunctions. International conference on logic programming, Springer, 195–209.

22. John Von Neumann. 1951. Various techniques used in connection with random digits. Nat. Bureau Stand. Appl. Math. Ser. 12: 36–38.