% This LaTeX document was generated using the LaTeX backend of PlDoc,
% The SWI-Prolog documentation system

\subsection{library(pure_input): Pure Input from files and streams}
\label{sec:pureinput}

\begin{tags}
    \tag{To be done}
Provide support for alternative input readers, e.g. reading terms, tokens, etc.
\end{tags}

This module is part of \file{pio.pl}, dealing with \textit{pure} \textit{input}: processing input streams from the outside world using pure predicates, notably grammar rules (DCG). Using pure predicates makes non-deterministic processing of input much simpler.

Pure input uses attributed variables to read input from the external source into a list \textit{on demand}. The overhead of lazy reading is more than compensated for by using block reads based on \predref{read_pending_codes}{3}.

Ulrich Neumerkel came up with the idea to use coroutining for creating a \textit{lazy list}. His implementation repositioned the file to deal with re-reading that can be necessary on backtracking. The current implementation uses destructive assignment together with more low-level attribute handling to realise pure input on any (buffered) stream.\vspace{0.7cm}

\begin{description}
    \predicate[nondet]{phrase_from_file}{2}{:Grammar, +File}
Process the content of \arg{File} using the DCG rule \arg{Grammar}. The space usage of this mechanism depends on the length of the not committed part of \arg{Grammar}. Committed parts of the temporary list are reclaimed by the garbage collector, while the list is extended on demand due to unification of the attributed tail variable. Below is an example that counts the number of times a string appears in a file. The library dcg/basics provides \dcgref{string}{1} matching an arbitrary string and \dcgref{remainder}{1} which matches the remainder of the input without parsing.

\begin{code}
:- use_module(library(dcg/basics)).

file_contains(File, Pattern) :-
        phrase_from_file(match(Pattern), File).

match(Pattern) -->
        string(_),
        string(Pattern),
        remainder(_).

match_count(File, Pattern, Count) :-
        aggregate_all(count, file_contains(File, Pattern), Count).
\end{code}

This can be called as (note that the pattern must be a string (code list)):

\begin{code}
?- match_count('pure_input.pl', `file`, Count).
\end{code}

    \predicate[nondet]{phrase_from_file}{3}{:Grammar, +File, +Options}
As \predref{phrase_from_file}{2}, providing additional \arg{Options}. \arg{Options} are passed to \predref{open}{4}.

    \predicate{phrase_from_stream}{2}{:Grammar, +Stream}
Run \arg{Grammar} against the character codes on \arg{Stream}. \arg{Stream} must be buffered.

    \dcg{syntax_error}{1}{+Error}
Throw the syntax error \arg{Error} at the current location of the input. This predicate is designed to be called from the handler of \predref{phrase_from_file}{3}.

\begin{tags}
    \tag{throws}
\verb$error(syntax_error(Error), Location)$
\end{tags}

    \dcg[det]{lazy_list_location}{1}{-Location}
Determine the current (error) location in a lazy list. True when \arg{Location} is an (error) location term that represents the current location in the DCG list.

\begin{arguments}
\arg{Location} & is a term \verb$file(Name, Line, LinePos, CharNo)$ or \verb$stream(Stream, Line, LinePos, CharNo)$ if no file is associated with the stream RestLazyList. Finally, if the lazy list is fully materialized (ends in \verb$[]$), \arg{Location} is unified with \verb$end_of_file-CharCount$. \\
\end{arguments}

\begin{tags}
    \tag{See also}
\dcgref{lazy_list_character_count}{1} only provides the character count.
\end{tags}

    \dcg{lazy_list_character_count}{1}{-CharCount}
True when \arg{CharCount} is the current character count in the lazy list. The character count is computed by finding the distance to the next frozen tail of the lazy list.
\arg{CharCount} is one of:

\begin{shortlist}
    \item An integer
    \item A term \verb$end_of_file-Count$
\end{shortlist}

\begin{tags}
    \tag{See also}
\dcgref{lazy_list_location}{1} provides full details of the location for error reporting.
\end{tags}

    \predicate[det]{stream_to_lazy_list}{2}{+Stream, -List}
Create a lazy list representing the character codes in \arg{Stream}. \arg{List} is a partial list ending in an attributed variable. Unifying this variable reads the next block of data. The block is stored with the attribute value such that there is no need to re-read it.

\begin{tags}
    \tag{Compatibility}
Unlike the previous version of this predicate this version does not require a repositionable stream. It does require a buffer size of at least the maximum number of bytes of a multi-byte sequence (6).
\end{tags}

\end{description}
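
The predicates above can be combined as in the following minimal sketch. It is an illustration under stated assumptions, not part of the library: the names \verb$numbers//1$, \verb$read_numbers/2$ and \verb$first_codes/3$ are hypothetical, and the input file is assumed to hold one integer per line. The grammar uses \dcgref{syntax_error}{1} to report the position of the first line that cannot be parsed, and \verb$first_codes/3$ shows how unifying against the result of \predref{stream_to_lazy_list}{2} forces just enough of the input to be read.

\begin{code}
:- use_module(library(pure_input)).
:- use_module(library(dcg/basics)).

% numbers(-Ns): hypothetical grammar for a file with one integer per line.
% The cut commits after each line, so the consumed prefix of the lazy
% list can be reclaimed by the garbage collector.
numbers([N|Ns]) -->
        integer(N), blanks_to_nl, !,
        numbers(Ns).
numbers([]) -->
        eos, !.
numbers(_) -->
        syntax_error(integer_expected).

% read_numbers(+File, -Ns): hypothetical wrapper around phrase_from_file/2.
read_numbers(File, Ns) :-
        phrase_from_file(numbers(Ns), File).

% first_codes(+File, +N, -Codes): hypothetical helper that materializes
% only the first N codes of File.  Fails if the file has fewer than N codes.
% The remainder of the lazy list must not be used after the stream is closed.
first_codes(File, N, Codes) :-
        setup_call_cleanup(
            open(File, read, In),
            ( stream_to_lazy_list(In, List),
              length(Codes, N),
              append(Codes, _, List)
            ),
            close(In)).
\end{code}

On malformed input, \verb$read_numbers/2$ is expected to raise \verb$error(syntax_error(integer_expected), Location)$, where \arg{Location} has one of the forms described for \dcgref{lazy_list_location}{1}. A caller can catch this term to report the file, line and column of the offending input.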