This library deals with the analysis and construction of a URL (Uniform Resource Locator). URLs are the basis for communicating the locations of resources (data) on the web. A URL consists of a protocol identifier (e.g. HTTP, FTP) and a protocol-specific syntax further defining the location. URLs are standardized in RFC-1738.
The implementation in this library covers only a small portion of the defined protocols. Though the initial implementation followed RFC-1738 strictly, the current implementation is more relaxed, to deal with the frequent violations of the standard encountered in practical use.
<Action> <Location> HTTP/<version>
Location: Atom or list of character codes.
url:, an identifier separated from the remainder of the URL using :. parse_url/2 assumes the http protocol if no protocol is specified and the URL can be parsed as a valid HTTP URL. In addition to the RFC-1738 specified protocols, the file protocol is supported as well.
Host. This only appears if the port is explicitly specified in the URL. Implicit default ports (e.g., 80 for HTTP) do not appear in the part-list.
ftp, http and file protocols. If no path appears, the library generates the path /.
?, normally used to transfer data from HTML forms that use the HTTP GET method. In the URL it consists of a www-form-encoded list of Name=Value pairs. This is mapped to a list of Prolog Name=Value terms with decoded names and values.
# character.
The example below illustrates all of this for an HTTP URL.
    ?- parse_url('http://www.xyz.org/hello?msg=Hello+World%21#x', P).
    P = [ protocol(http),
          host('www.xyz.org'),
          fragment(x),
          search([ msg = 'Hello World!' ]),
          path('/hello')
        ]
By instantiating the part-list, this predicate can be used to create a URL.
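A minimal sketch of this reverse mode: the URL argument is left unbound and the parts are supplied instead. The helper name build_url/1 and the host, path and search values are illustrative only. How the search value is escaped in the resulting atom can depend on the library version, so no output is shown.

```prolog
:- use_module(library(url)).

% build_url(-URL): hypothetical helper that creates a URL atom from a
% part-list by calling parse_url/2 with an unbound first argument.
build_url(URL) :-
    parse_url(URL,
              [ protocol(http),
                host('www.xyz.org'),
                path('/hello'),
                search([ msg = 'Hello World!' ])
              ]).
```

Parsing the result again, as in `?- build_url(URL), parse_url(URL, Parts).`, should give back the parts that were supplied.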
alnum (see code_type/2)), and one of "-._~" using percent encoding. Newline is mapped to %0D%0A. When decoding, newlines appear as a single newline (10) character. Note that a space is encoded as %20 instead of +. Decoding decodes both to a space.
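These encoding rules match the predicate that SWI-Prolog's library(url) exports as www_form_encode/2. Assuming that is the predicate described here, the sketch below wraps its two directions. The helper names form_encode/2 and form_decode/2 are illustrative and not part of the library.

```prolog
:- use_module(library(url)).

% form_encode(+Text, -Encoded): hypothetical wrapper; percent-encodes
% the atom Text using www_form_encode/2 in its encoding direction.
form_encode(Text, Encoded) :-
    www_form_encode(Text, Encoded).

% form_decode(+Encoded, -Text): hypothetical wrapper; calls the same
% predicate with the first argument unbound, which decodes Encoded.
form_decode(Encoded, Text) :-
    www_form_encode(Text, Encoded).
```

Per the description above, `form_decode('a%20b', T)` and `form_decode('a+b', T)` should both bind T to the same atom containing a space.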
utf8. The only other defined value is iso_latin_1.
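Assuming this passage describes set_url_encoding/2 from SWI-Prolog's library(url), with the previous encoding as its first argument and the new one as its second, the following sketch switches the encoding for the duration of a goal and restores it afterwards. The helper with_url_encoding/2 is hypothetical.

```prolog
:- use_module(library(url)).

:- meta_predicate with_url_encoding(+, 0).

% with_url_encoding(+Encoding, :Goal): hypothetical helper that runs
% Goal with the URL encoding set to Encoding (utf8 or iso_latin_1) and
% restores the previous encoding when Goal finishes.
with_url_encoding(Encoding, Goal) :-
    setup_call_cleanup(
        set_url_encoding(Old, Encoding),
        Goal,
        set_url_encoding(_, Old)).
```

Using setup_call_cleanup/3 means the old value is restored even if Goal fails or raises an exception.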
application/x-www-form-urlencoded as used in HTTP GET requests.
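Assuming the predicate described here is parse_url_search/2 from SWI-Prolog's library(url), a query string can be decoded without parsing a complete URL. The helper name query_fields/2 is illustrative; the query is passed as a list of character codes and without the leading ?.

```prolog
:- use_module(library(url)).

% query_fields(+Query, -Fields): hypothetical helper that decodes the
% atom Query (e.g. 'msg=Hello+World%21') into a list of Name=Value
% terms using parse_url_search/2.
query_fields(Query, Fields) :-
    atom_codes(Query, Codes),
    parse_url_search(Codes, Fields).
```

For example, `?- query_fields('msg=Hello+World%21', Fields).` should give the same Name=Value list that parse_url/2 reports in its search part for that query.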
URL.