\documentclass[11pt]{article}
\usepackage{times}
\usepackage{pl}
\usepackage{html}
\usepackage{plpage}
\sloppy
\makeindex

\onefile
\htmloutput{.}					% Output directory
\htmlmainfile{plunit}				% Main document file
\bodycolor{white}				% Page colour

\renewcommand{\runningtitle}{Prolog Unit Tests}

\begin{document}

\title{Prolog Unit Tests}
\author{Jan Wielemaker \\
	University of Amsterdam \\
	VU University Amsterdam \\
	The Netherlands \\
	E-mail: \email{jan@swi-prolog.org}}

\maketitle

\begin{abstract}
This document describes a Prolog unit-test framework. This framework was
initially developed for \href{http://www.swi-prolog.org}{SWI-Prolog}.
The current version also runs on
\href{http://www.sics.se/sicstus/}{SICStus Prolog}, providing a portable
testing framework. See \secref{sicstus}.
\end{abstract}

\pagebreak
\tableofcontents

\vfill
\vfill

\newpage

\section{Introduction}
\label{sec:plunit-intro}

There is really no excuse not to write tests!

Automatic testing of software during development is probably the most
important Quality Assurance measure. Tests can validate the final
system, which is nice for your users.  However, most (Prolog) developers
forget that testing is not just a burden during development:

\begin{itemize}
    \item Tests document how the code is supposed to be used.
    \item Tests can validate claims you make on the Prolog
          implementation.  Writing a test makes the claim
	  explicit.
    \item Tests avoid big applications saying `No' after
          modifications.  This saves time during development,
	  and it saves \emph{a lot} of time if you must return
	  to the application a few years later or you must
	  modify and debug someone else's application.
\end{itemize}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{A Unit Test box}
\label{sec:unitbox}

Tests are written in pure Prolog and enclosed within the directives
begin_tests/1,2 and end_tests/1. They can be embedded inside a normal
source module, or be placed in a separate test-file that loads the files
to be tested. Code inside a test box is normal Prolog code. The
entry points are defined by rules using the head \term{test}{Name} or
\term{test}{Name, Options}, where \arg{Name} is a ground term and
\arg{Options} is a list describing additional properties of the test.
Here is a very simple example:

\begin{code}
:- begin_tests(lists).
:- use_module(library(lists)).

test(reverse) :-
	reverse([a,b], [b,a]).

:- end_tests(lists).
\end{code}

The optional second argument of the test-head defines additional processing
options.  Defined options are:

\begin{description}
    \termitem{blocked}{+Reason:atom}
The test is currently disabled.   Tests are flagged as blocked if they
cannot be run for some reason, e.g.\ they crash Prolog, they rely on
some service that is not available or they take too many resources.
Tests that fail but do not crash should be flagged using
\term{fixme}{Reason} instead.

    \termitem{fixme}{+Reason:atom}
Similar to \term{blocked}{Reason}, but the test is executed anyway.  If
it fails, a \const{-} is printed instead of the \const{.} character.  If
it passes, a \const{+} is printed, and if it passes with a choicepoint,
a \const{!}.
A summary is printed at the end of the test run and the goal
\term{test_report}{fixme} can be used to get details.

    \termitem{condition}{:Goal}
Pre-condition for running the test.  If the condition fails
the test is skipped.  The condition can be used as an alternative
to the \const{setup} option.  The only difference is that failure
of a \const{condition} goal skips the test, whereas failure of a
\const{setup} goal is considered an error.

    \termitem{cleanup}{:Goal}
\arg{Goal} is always called after completion of the test-body,
regardless of whether it fails, succeeds or throws an exception.  This
option or call_cleanup/2 must be used by tests that require side-effects
that must be reverted after the test completes.  \arg{Goal} may share
variables with the test body.

\begin{code}
create_file(Tmp) :-
	tmp_file(plunit, Tmp),
	open(Tmp, write, Out),
	write(Out, 'hello(World).\n'),
	close(Out).

test(read, [ setup(create_file(Tmp)),
	     cleanup(delete_file(Tmp))
	   ]) :-
	read_file_to_terms(Tmp, Terms, []),
	Terms = [hello(_)].
\end{code}

    \termitem{setup}{:Goal}
\arg{Goal} is run before the test-body.  Typically used together with
the \const{cleanup} option to create and destroy the required execution
environment.

    \termitem{forall}{:Generator}
Run the same test for each solution of \arg{Generator}. Each run invokes
the setup and cleanup handlers. This can be used to run the same test
with different inputs.  If an error occurs, the test is reported as
\mbox{\texttt{name (forall bindings = } <vars> \texttt{)}}, where
<vars> indicates the bindings of variables in \arg{Generator}.

    \termitem{true}{AnswerTerm Cmp Value}
Body should succeed deterministically. If a choicepoint is left open, a
warning is printed to STDERR ("Test succeeded with choicepoint"). That
warning can be suppressed by adding the \const{nondet} keyword. \arg{AnswerTerm}
is compared to \arg{Value} using the comparison operator \arg{Cmp}. \arg{Cmp}
is typically one of =/2, ==/2, =:=/2 or =@=/2,%
    \footnote{The =@= predicate (denoted \emph{structural equivalence})
	      is the same as variant/2 in SICStus.}
but any test can be used. This is the same as inserting the test at the
end of the conjunction, but it allows the test engine to distinguish
between failure of the body (copy_term/2 in the example below) and the
body producing the wrong value. Multiple variables must be combined in
an arbitrary compound term, e.g.\ \verb$A1-A2 == v1-v2$.

\begin{code}
test(copy, [ true(Copy =@= hello(X,X))
	   ]) :-
	copy_term(hello(Y,Y), Copy).
\end{code}

    \termitem{AnswerTerm Cmp Value}{}
Equivalent to \term{true}{AnswerTerm Cmp Value} if \arg{Cmp} is one
of the comparison operators given above.

    \termitem{fail}{}
Body must fail.

    \termitem{throws}{Error}
Body must throw \arg{Error}. The thrown error term is matched against
term \arg{Error} using \term{subsumes_term}{Error, ThrownError}. I.e., the
thrown error must be more specific than the specified \arg{Error}.  See
subsumes_term/2.

    \termitem{error}{Error}
Body must throw \term{error}{Error, _Context}.  See keyword \const{throws}
(as well as predicate throw/1 and library(error)) for details.

    \termitem{all}{AnswerTerm Cmp Instances}
Similar to \term{true}{AnswerTerm Cmp Values}, but used for non-deterministic
predicates. Each element is compared using \arg{Cmp}. Order matters. For
example:

\begin{code}
test(or, all(X == [1,2])) :-
	( X = 1 ; X = 2 ).
\end{code}

    \termitem{set}{AnswerTerm Cmp Instances}
Similar to \term{all}{AnswerTerm Cmp Instances}, but before testing both
the bindings of \arg{AnswerTerm} and \arg{Instances} are sorted using
sort/2. This removes duplicates and places both sets in the same
order.\footnote{The result is only well-defined if \arg{Cmp} is
\texttt{==}.}

    \termitem{nondet}{}
If this keyword appears in the option list, non-deterministic success
of the body is not considered an error.

    \termitem{occurs_check}{Mode}
Run the test in a particular \jargon{occurs check mode}.  Mode is
one of \const{false} (default), \const{true} or \const{error}.  See
the Prolog flag \prologflag{occurs_check} for details.

\end{description}
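
The sketch below combines a few of the options described above in one
test unit.  The unit name \const{optdemo} and the test names are made up
for illustration, and the \const{condition} example assumes the Prolog
flag \const{unix} that SWI-Prolog defines on Unix-like systems.

\begin{code}
:- use_module(library(plunit)).

:- begin_tests(optdemo).
:- use_module(library(lists)).

% forall/1: the body runs once for each solution of member/2.
test(reverse_twice, [forall(member(L, [[], [a], [a,b,c]]))]) :-
        reverse(L, R),
        reverse(R, L).

% condition/1: the test is skipped unless the condition succeeds.
test(root_dir, [condition(current_prolog_flag(unix, true))]) :-
        exists_directory('/').

% blocked/1: the test is not run, only reported as blocked.
test(pending, [blocked('not implemented yet')]) :-
        fail.

:- end_tests(optdemo).
\end{code}

A failure in the \const{forall} test would be reported together with the
binding of \arg{L} that caused it.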

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Test Unit options}
\label{sec:unitoptions}

\begin{description}
    \predicate{begin_tests}{1}{+Name}
Start named test-unit.  Same as \verb$begin_tests(Name, [])$.

    \predicate{begin_tests}{2}{+Name, +Options}
Start named test-unit with options. Options provide conditional
processing, setup and cleanup similar to individual tests (second
argument of test/2 rules).

Defined options are:
    \begin{description}
	\termitem{blocked}{+Reason}
Test-unit has been blocked for the given \arg{Reason}.

        \termitem{condition}{:Goal}
Executed before executing any of the tests.  If \arg{Goal} fails,
the tests of this unit are skipped.

        \termitem{setup}{:Goal}
Executed before executing any of the tests.

        \termitem{cleanup}{:Goal}
Executed after completion of all tests in the unit.

	\termitem{occurs_check}{+Mode}
Specify default for subject-to-occurs-check mode. See \secref{unitbox}
for details on the \const{occurs_check} option.
    \end{description}
\end{description}
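
As an illustration, the sketch below uses unit-level \const{setup} and
\const{cleanup} options to prepare and remove a dynamic fact.  The unit
name \const{unitdemo} and the predicate demo_fact/1 are hypothetical.
The fact is qualified with the module \const{user} so that the example
does not depend on the module in which the unit is compiled.

\begin{code}
:- use_module(library(plunit)).

:- dynamic user:demo_fact/1.

:- begin_tests(unitdemo,
               [ setup(assertz(user:demo_fact(1))),
                 cleanup(retractall(user:demo_fact(_)))
               ]).

% Relies on the fact asserted by the unit-level setup.
test(fact_present) :-
        user:demo_fact(1).

:- end_tests(unitdemo).
\end{code}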


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Writing the test body}
\label{sec:testbody}

The test-body is ordinary Prolog code. Without any options, the body
must be designed to succeed \emph{deterministically}. Any other result
is considered a failure. One of the options \const{fail}, \const{true},
\const{throws}, \const{all} or \const{set} can be used to specify a
different expected result. See \secref{unitbox} for details.  In this
section we illustrate typical test-scenarios by testing SWI-Prolog
built-in and library predicates.

\subsubsection{Testing deterministic predicates}
\label{sec:testdet}

Deterministic predicates are predicates that must succeed exactly once
and, for well behaved predicates, leave no choicepoints. Typically they
have zero or more input- and zero or more output arguments. The test
goal supplies proper values for the input arguments and verifies the
output arguments. Verification can use test-options or be explicit in
the body.  The tests in the example below are equivalent.

\begin{code}
test(add) :-
	A is 1 + 2,
	A =:= 3.

test(add, [true(A =:= 3)]) :-
	A is 1 + 2.
\end{code}
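
When several outputs must be verified, they can be combined into one
compound term, as described in \secref{unitbox}.  A minimal sketch, to
be placed inside a test unit (the test name is made up):

\begin{code}
% Check quotient and remainder in a single comparison.
test(divmod, [true(Q-R == 3-1)]) :-
        Q is 10 // 3,
        R is 10 mod 3.
\end{code}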

The test engine verifies that the test-body does not leave a
choicepoint.  We illustrate that using the test below:

\begin{code}
test(member) :-
	member(b, [a,b,c]).
\end{code}

Although this test succeeds, member/2 leaves a choicepoint which is
reported by the test subsystem.  To make the test silent, use one of
the alternatives below.

\begin{code}
test(member) :-
	member(b, [a,b,c]), !.

test(member, [nondet]) :-
	member(b, [a,b,c]).
\end{code}

\subsubsection{Testing semi-deterministic predicates}
\label{sec:testsemidet}

Semi-deterministic predicates are predicates that either fail or succeed
exactly once and, for well behaved predicates, leave no choicepoints.
Testing such predicates is the same as testing deterministic
predicates.  Negative tests must be specified using the option
\const{fail} or by negating the body using \verb$\+/1$.

\begin{code}
test(is_set) :-
	\+ is_set([a,a]).

test(is_set, [fail]) :-
	is_set([a,a]).
\end{code}


\subsubsection{Testing non-deterministic predicates}
\label{sec:testnondet}

Non-deterministic predicates succeed zero or more times.  Their results
are tested either using findall/3 or setof/3 followed by a value-check
or using the \const{all} or \const{set} options.  The following are
equivalent tests:

\begin{code}
test(member) :-
	findall(X, member(X, [a,b,c]), Xs),
	Xs == [a,b,c].

test(member, all(X == [a,b,c])) :-
	member(X, [a,b,c]).
\end{code}
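
If the order of the solutions and duplicates are not relevant, the
\const{set} option can be used instead of \const{all}.  The sketch
below (with a made-up test name, to be placed inside a test unit) passes
although member/2 produces the solutions in a different order and
produces \const{a} twice:

\begin{code}
% Both the solutions and [a,b,c] are sorted with sort/2
% before they are compared.
test(member_set, set(X == [a,b,c])) :-
        member(X, [c,a,b,a]).
\end{code}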

\subsubsection{Testing error conditions}
\label{sec:testerror}

Error-conditions are tested using the option \term{throws}{Error} or
by wrapping the test in a catch/3.  The following tests are equivalent:

\begin{code}
test(div0) :-
     catch(A is 1/0, error(E, _), true),
     E =@= evaluation_error(zero_divisor).

test(div0, [error(evaluation_error(zero_divisor))]) :-
     A is 1/0.
\end{code}
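
The same check can also be written with the \const{throws} option, which
matches against the complete exception term rather than only the first
argument of \term{error}{Error, Context}.  A minimal sketch, to be placed
inside a test unit (the test name is made up):

\begin{code}
% The context argument is left unbound, so any context is accepted.
test(div0_throws,
     [throws(error(evaluation_error(zero_divisor), _))]) :-
        _ is 1/0.
\end{code}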


\subsubsection{One body with multiple tests using assertions}
\label{sec:testassertion}

PlUnit is designed to cooperate with the assertion/1 test provided by
library(debug).\footnote{This integration was suggested by G\"unter
Kniesel.} If an assertion fails in the context of a test, the test
framework reports this and considers the test failed, but does not trap
the debugger. Using assertion/1 in the test-body is attractive for two
scenarios:

\begin{itemize}
    \item Confirm that multiple claims hold.  Where multiple claims
	  about variable bindings can be tested using the == option
	  in the test header, arbitrary boolean tests, notably about
	  the state of the database, are harder to combine.  Simply
	  adding them in the body of the test has two disadvantages:
	  it is less obvious to distinguish the tested code from the
	  test and if one of the tests fails there is no easy way to
	  find out which one.
    \item Testing `scenarios' or sequences of actions.  If one step
          in such a sequence fails there is again no easy way to find
	  out which one.  By inserting assertions into the sequence
	  this becomes obvious.
\end{itemize}

Below is a simple example, showing two failing assertions. The first
line of the failure message gives the test. The second reports the
location of the assertion.\footnote{If known. The location is determined
by analysing the stack. The second failure shows a case where this does
not work because last-call optimization has already removed the context
of the test-body.} If the assertion call originates from a different
file this is reported appropriately. The last line gives the actually
failed goal.

\begin{code}
:- begin_tests(test).

test(a) :-
	A is 2^3,
	assertion(float(A)),
	assertion(A == 9).

:- end_tests(test).
\end{code}

\begin{code}
?- run_tests.
% PL-Unit: test
ERROR: /home/jan/src/pl-devel/linux/t.pl:5:
	test a: assertion at line 7 failed
	Assertion: float(8)
ERROR: /home/jan/src/pl-devel/linux/t.pl:5:
	test a: assertion failed
	Assertion: 8==9
. done
% 2 assertions failed
\end{code}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Using separate test files}
\label{sec:testfiles}

Test-units can be embedded in normal Prolog source-files. Alternatively,
tests for a source-file can be placed in another file alongside the file
to be tested. Test files use the extension \fileext{plt}. The predicate
load_test_files/1 can load all files that are related to source-files
loaded into the current project.
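
As a sketch, assume a hypothetical module file \file{mylib.pl} with its
tests in \file{mylib.plt} in the same directory.  The file names and the
predicate double/2 are made up for illustration; the test file loads the
module under test explicitly.

\begin{code}
% --- mylib.pl (hypothetical module under test) ---
:- module(mylib, [double/2]).

double(X, Y) :- Y is 2*X.

% --- mylib.plt (tests, stored next to mylib.pl) ---
:- use_module(library(plunit)).

:- begin_tests(mylib).
:- use_module(mylib).

test(double, [true(Y == 4)]) :-
        double(2, Y).

:- end_tests(mylib).
\end{code}

With both files in place, the tests could then be loaded and run as
follows:

\begin{code}
?- use_module(library(plunit)).
?- [mylib].
?- load_test_files([]).
?- run_tests(mylib).
\end{code}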


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Running the test-suite}
\label{sec:pldoc-running}

\subsection{Running the test suite from Prolog}
\label{sec:pldoc-running-repl}

To run tests from the Prolog prompt, first load the program and
then run run_tests/0 or \term{run_tests}{+Unit}.

\begin{description}
    \predicate{run_tests}{0}{}
Run all test-units.

    \predicate{run_tests}{1}{+Spec}
Run only the specified tests.  \arg{Spec} can be a list to run multiple
tests.  A single specification is either the name of a test unit or
a term <Unit>:<Tests>, running only the specified test.  <Tests> is
either the name of a test or a list of names. Running particular
tests is particularly useful for tracing a test:%
\footnote{Unfortunately the body of the test is called through
meta-calling, so it cannot be traced. The called user-code can be traced
normally though.}

\begin{code}
?- gtrace, run_tests(lists:member).
\end{code}
\end{description}
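
The queries below sketch the different forms of \arg{Spec} described
above.  The unit names \const{lists} and \const{apply} and the test names
are placeholders for units and tests defined in your own program.

\begin{code}
?- run_tests(lists).                    % one unit
?- run_tests([lists, apply]).           % several units
?- run_tests(lists:reverse).            % one test of a unit
?- run_tests(lists:[reverse, member]).  % several tests of a unit
\end{code}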

To identify nonterminating tests, interrupt the looping process with
\emph{Control-C}. The test name and location will be displayed.

\subsection{Running the test suite from the command line}
\label{sec:pldoc-running-cli}

To run a file's tests from the command line, run the following command,
replacing \file{your/file.pl} with the path to your file.

\begin{code}
swipl -g run_tests -t halt your/file.pl
\end{code}

Prolog will (1) load the file you specify, as well as any modules it
depends on; (2) run all tests in those files; and (3) exit with status 0
or 1 depending on whether the test suite succeeds or fails.

If you want to test multiple files, you can pass multiple \fileext{pl}
files.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Tests and production systems}
\label{sec:state}

Most applications do not want the test-suite to end up in the
final application.  There are several ways to achieve this.  One
is to place all tests in separate files and not to load the tests
when creating the production environment.  Alternatively, use the
directive below before loading the application.

\begin{code}
:- set_test_options([load(never)]).
\end{code}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Controlling the test suite}
\label{sec:options}

\begin{description}
    \predicate{set_test_options}{1}{+Options}
Defined options are:

\begin{description}
    \termitem{load}{+Load}
Determines whether or not tests are loaded. When \const{never},
everything between begin_tests/1 and end_tests/1 is simply ignored.
When \const{always}, tests are always loaded.  Finally, when using
the default value \const{normal}, tests are loaded if the code is
not compiled with optimisation turned on.

    \termitem{run}{+Run}
Specifies when tests are run. Using \const{manual}, tests can only be
run using run_tests/0 or run_tests/1. Using \const{make}, tests will be
run for reloaded files, but not for files loaded the first time. Using
\const{make(all)}, make/0 will run all test-suites, not only those that
belong to files that are reloaded.

    \termitem{format}{+Format}
Currently one of \const{tty} or \const{log}. \const{tty} uses terminal
control to overwrite successful tests, allowing the user to see the
currently running tests and output from failed tests. This is the
default if the output is a tty. \const{log} prints a full log of the
executed tests and their results and is intended for non-interactive
usage.

    \termitem{output}{+When}
If \const{always}, emit all output as it is produced, if \const{never},
suppress all output and if \const{on_failure}, emit the output if the
test fails.

    \termitem{show_blocked}{+Bool}
Show individual blocked tests during the report.

    \termitem{occurs_check}{+Mode}
Defines the default for the \const{occurs_check} flag during testing.

    \termitem{cleanup}{+Bool}
If \const{true} (default \const{false}), clean up the report at the end
of run_tests/1. Used to improve cooperation with memory debuggers such
as dmalloc.

    \termitem{jobs}{Num}
Number of jobs to use for concurrent testing. Default is one, implying
sequential testing.

    \termitem{timeout}{+Seconds}
Set timeout for each individual test. This acts as a default that may be
overruled at the level of units or individual tests. A timeout of 0 or
negative is handled as \emph{infinite}.
\end{description}

    \predicate{load_test_files}{1}{+Options}
Load \fileext{plt} test-files that belong to the currently loaded
sources.

    \predicate{running_tests}{0}{}
Print all currently running tests to the terminal.  It can be used
to find running threads in multi-threaded test operation or to find the
currently running test if a test appears to be blocking.

    \predicate{test_report}{1}{+What}
Print report on the executed tests.  \arg{What} defines the type
of report.  Currently this only supports \const{fixme}, providing
details on how the fixme-flagged tests proceeded.
\end{description}
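
As an illustration, the directive below selects settings that might suit
a non-interactive run.  The option names are the ones documented above;
the values, in particular the timeout of 10 seconds, are arbitrary.

\begin{code}
:- use_module(library(plunit)).

:- set_test_options([ format(log),         % full log, no terminal control
                      output(on_failure),  % show output of failed tests only
                      timeout(10)          % default limit per test, in seconds
                    ]).
\end{code}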


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Auto-generating tests}
\label{sec:wizard}

Prolog is an interactive environment. Where users of non-interactive
systems tend to write tests as code, Prolog developers tend to run
queries interactively during development. This interactive testing is
generally faster, but the disadvantage is that the tests are lost at the
end of the session. The test-wizard tries to combine the advantages. It
collects toplevel queries and saves them to a specified file.  Later,
it extracts these queries from the file and locates the predicates that
are tested by the queries.  It runs the query and creates a test clause
from the query.

Auto-generating test cases is experimentally supported through the
library \pllib{test_wizard}. We briefly introduce the functionality
using examples. The first step is to log the queries into a file. This is
accomplished with the commands below. \file{Queries.pl} is the name of
the file in which to store all queries. The user can choose any filename
for this purpose.   Multiple Prolog instances can share the same file, as
data is appended to this file and writes are properly locked to avoid
file corruption.

\begin{code}
:- use_module(library(test_wizard)).
:- set_prolog_flag(log_query_file, 'Queries.pl').
\end{code}

Next, we will illustrate using the library by testing the predicates
from library \pllib{lists}.  To generate test cases we just make calls
on the terminal.  Note that all queries are recorded and the system will
select the appropriate ones when generating the test unit for a
particular module.

\begin{code}
?- member(b, [a,b]).
Yes
?- reverse([a,b], [b|A]).
A = [a] ;
No
\end{code}

Now we can generate the test-cases for the module \const{lists} using
make_tests/3:

\begin{code}
?- make_tests(lists, 'Queries.pl', current_output).
:- begin_tests(lists).

test(member, [nondet]) :-
        member(b, [a, b]).
test(reverse, [true(A==[a])]) :-
        reverse([a, b], [b|A]).

:- end_tests(lists).
\end{code}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\input{testcover}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Portability of the test-suite}
\label{sec:porting}

One of the reasons to have tests is to simplify migrating code between
Prolog implementations. Unfortunately creating a portable test-suite
implies a poor integration into the development environment. Luckily,
the specification of the test-system proposed here can be ported quite
easily to most Prolog systems sufficiently compatible with SWI-Prolog to
consider porting your application. The most important requirement is
support for term_expansion/2.

In the current system, test units are compiled into sub-modules of the
module in which they appear.  Few Prolog systems allow for sub-modules
and therefore ports may have to fall back to injecting the code into the
surrounding module.  This implies that support predicates used inside
the test unit should not conflict with predicates of the module being
tested.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{PlUnit on SICStus}
\label{sec:sicstus}

The directory of \file{plunit.pl} and \file{swi.pl} must be in the
\const{library} search-path.  With PLUNITDIR replaced accordingly,
add the following into your \file{.sicstusrc} or \file{sicstus.ini}.

\begin{code}
:- set_prolog_flag(language, iso). % for maximal compatibility
library_directory('PLUNITDIR').
\end{code}

The current version runs under SICStus 3.  Open issues:

\begin{itemize}

    \item Some messages are unformatted because SICStus 3 reports
          all ISO errors as instantiation errors.

    \item Only \file{plunit.pl}.  Both coverage analysis and the test
	  generation wizard currently require SWI-Prolog.

    \item The \const{load} option \const{normal} is the same as \const{always}.
	  Use \exam{set_test_options([load(never)])} to avoid loading the
	  test suites.

    \item The \const{run} option is not supported.

    \item Tests are loaded into the enclosing module instead of a separate
          test module. This means that predicates in the test module must
	  not conflict with the enclosing module, nor with other test
	  modules loaded into the same module.
\end{itemize}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Motivation of choices}
\label{sec:plunit-motivation}

\subsection*{Easy to understand and flexible}

There are two approaches for testing. In one extreme the tests are
written using declarations dealing with setup, cleanup, running and
testing the result. In the other extreme a test is simply a Prolog goal
that is supposed to succeed. We have chosen to allow for any mixture of
these approaches. Written down as test/1, a test follows the simple
succeeding-goal approach. Using options to the test, the user can opt
for a more declarative specification.  The user can mix both approaches.

The body of the test appears at the position of a clause-body. This
simplifies identification of the test body and ensures proper layout and
colouring support from the editor without the need for explicit support
of the unit test module. Only clauses of test/1 and test/2 may be marked
as non-called in environments that perform cross-referencing.

%\subsection*{Well integrated}

\printindex

\end{document}