# The Scrape data source
The `scrape` data source provides access to data from HTML or XML web pages. It takes the following arguments:

  1. The *url* to scrape from.
  2. The *projection*, a _dict_ that binds column names to variables.
  3. A variable that is bound to the *DOM* of the page.
  4. A Prolog _body term_ enclosed in `{}` that is used to translate the
     DOM into a series of facts. This often uses xpath/3.
  5. An option list that is passed to http_open/3 for the HTTP request and
     to load_structure/3, the predicate used to realise the DOM.

The data source contains all answers of the _body term_, which is typically applied to `DOM`. Each answer becomes one row, with the column names and values defined by the _projection_.
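
To see how these arguments fit together, the sketch below mimics the same steps in plain SWI-Prolog: fetch the page, parse it into a DOM, and run a body over the DOM to produce rows. It is only an approximation under stated assumptions, not how SWISH implements `scrape`. The names `fetch_dom/2`, `pack_row/2` and `pack/1` are hypothetical, and load_html/3 (a convenience wrapper around load_structure/3 from `library(sgml)`) stands in for whatever the data source does internally.

```prolog
:- use_module(library(http/http_open)).
:- use_module(library(sgml)).
:- use_module(library(xpath)).

% fetch_dom(+URL, -DOM): hypothetical helper (not part of SWISH).
% The option list of the data source (argument 5) would be passed
% to calls like the two below; here both option lists are empty.
fetch_dom(URL, DOM) :-
    setup_call_cleanup(
        http_open(URL, In, []),
        load_html(stream(In), DOM, []),
        close(In)).

% pack_row(+DOM, -Row): plays the role of the body term (argument 4)
% combined with the projection (argument 2). Each solution is one row.
pack_row(DOM, _{name:Name, version:Version}) :-
    xpath(DOM, //table(@class=packlist), Table),
    xpath(Table, //tr, Row),
    xpath(Row, td/a(text), Name),
    xpath(Row, td(@class='pack-version', text), Version).

% pack(-Row): hypothetical stand-in for the data source as a whole.
pack(Row) :-
    fetch_dom('http://www.swi-prolog.org/pack/list', DOM),
    pack_row(DOM, Row).
```

Running the body as an ordinary predicate like this can be a convenient way to debug the xpath/3 calls before wrapping them in a `scrape` declaration.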
## Example
Below is a data source that scrapes an HTML table, followed by a query that reproduces the add-on download table.

```prolog
:- use_module(library(xpath)).

:- data_source(addon,
       scrape('http://www.swi-prolog.org/pack/list',
              _{name:Name, version:Version, downloads:Downloads,
                title:Title},
              DOM,
              { % Select the pack table and enumerate its rows
                xpath(DOM, //table(@class=packlist), Table),
                xpath(Table, //tr, Row),
                xpath(Row, td/a(text), Name),
                xpath(Row, td(@class='pack-version', text), Version),
                % Take the first content item of the cell and convert it to a number
                xpath(Row, td(@class='pack-downloads', self),
                      element(td, _, [DownloadsAtom|_])),
                atom_number(DownloadsAtom, Downloads),
                xpath(Row, td(@class='pack-title', text), Title)
              },
              [])).
```

```prolog
order_by([desc(Downloads)],
         addon{name:Name, version:Version, downloads:Downloads, title:Title}).
```
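
As a variation, the query can be restricted to the most downloaded packs. The sketch below assumes the `addon` data source from the example above is loaded and wraps the query in limit/2 from `library(solution_sequences)`. The cut-off of 10 is an arbitrary choice for illustration.

```prolog
% order_by/2 sorts all rows first, then limit/2 keeps the first 10 answers.
limit(10,
      order_by([desc(Downloads)],
               addon{name:Name, version:Version, downloads:Downloads, title:Title})).
```

Placing limit/2 outside order_by/2 matters: the other way around would sort only the first 10 rows scraped from the page.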