diff --git a/03-discovery-specification.bs b/03-discovery-specification.bs
index 49754e7..5b41bd7 100644
--- a/03-discovery-specification.bs
+++ b/03-discovery-specification.bs
@@ -11,18 +11,18 @@ Mailing List: public-treecg@w3.org
Mailing List Archives: https://lists.w3.org/Archives/Public/public-treecg/
Editor: Pieter Colpaert, https://pietercolpaert.be
Abstract:
- This specification defines how a client can find specific search trees of interest, as well as list the context information.
+ This specification defines how a client selects a specific dataset and search tree, as well as extracts relevant context information.
-# The overview # {#overview}
+# Definitions # {#overview}
-A `tree:Collection` is a subclass of `dcat:Dataset` ([[!vocab-dcat-3]]).
+A `tree:Collection` is a subclass of `dcat:Dataset` ([[!vocab-dcat-3]]).
The specialization being that this particular dataset is a collection of _members_.
-A `tree:SearchTree` is a subClassOf `dcat:Distribution`.
+A `tree:SearchTree` is a subclass of `dcat:Distribution`.
The specialization being that it uses the main TREE specification to publish a search tree.
-A node from which all other nodes can be found is a `tree:RootNode`, which MAY be explicitely typed as such.
+A node from which all other nodes can be found is a `tree:RootNode`.
Note: The `tree:SearchTree` and the `tree:RootNode` MAY be identified by the same IRI when no disambiguation is needed.
@@ -30,70 +30,95 @@ A TREE client MUST be provided with a URL to start from, which we call the _entr
# Initializing a client with a url # {#starting-from}
-The goal of the client is to understand what `tree:Collection` it is using, and to find a `tree:RootNode` or search form to start the traversal phase from.
+The goal of the client is to understand what `tree:Collection` it is using, and to find a `tree:RootNode` to start the traversal phase from.
+This discovery specification extends the initialization step in the TREE specification for the cases in which multiple options are possible.
-```
-IN: E: a URL of the entrypoint
-OUT: N: tree:RootNode IRI and/or S: search form
- ```
+The client MUST dereference the URL, which will result in a set of quads. The client MUST then first perform the initialization step from the main specification.
+If that did not return any result, then the client MUST check whether the URL before redirects (`E`) has been used in one of the following discovery patterns described in the subsections:
+ 1. `E` is a `tree:Collection`: then the client needs to [select the right search tree](#tree-search-trees)
+ 2. `E` is a `dcat:Dataset`: then the client needs to [select the right distribution or dataservice from a catalog](#dcat-dataset)
+ 3. `E` is an `ldes:EventStream`: then the client MAY take into account [LDES specific properties](#ldes)
+ 4. `E` is a `dcat:Distribution`: then the client needs to [process it accordingly](#dcat-distribution)
+ 5. `E` is a `dcat:DataService`: then the client needs to [process it accordingly](#dcat-dataservice)
+ 6. `E` is a catalog or is not explicitly mentioned: then the client needs to select a dataset based on [shape information](#tree-collection-shapes) and [DCAT Catalog information](#dcat-catalog)
-The client MUST dereference the URL, which will result in a set of quads.
-When the URL given to the TREE client, after all redirects, is used in a triple `ex:C tree:view <> .`, a client MUST assume the URL after redirects (`E'`) is an identifier of the intended `tree:RootNode` of the collection `ex:C`.
-The client MUST check for this `tree:view` property and return the result of the discovery algorithm with `<> → N`.
+## Selecting a collection via shapes ## {#tree-collection-shapes}
-If there is no such triple, then the client MUST check whether the URL before redirects (`E`) has been used in one of the following patterns:
- * `E tree:view ?N.` where there’s exactly one `?N`, then the algorithm MUST return `?N → N`.
- * `E tree:rootNode ?N ; tree:search ?S .` then the algorithm MUST return `?N → N` and `?S → S`.
- * `?DS dcat:servesDataset E ; dcat:endpointURL ?U` or `E dcat:endpointURL ?U`, then the algorithm MUST repeat the algorithm with `?U` as the entrypoint.
+When multiple collections are found by a client, it can choose to prune the collections based on the `tree:shape` property.
+The `tree:shape` property will refer to a first `sh:NodeShape`.
+The collection MAY be pruned in case there is no overlap in properties the client needs.
-Note: When data about the dataset, data service or search tree is found, it is a good idea to also pass this on to the client.
+Issue: Will we document the precise algorithm to use? Should we extend shapes with cardinality approximations as well?
-## tree:Collection ## {#collection}
+## Selecting a collection via a catalog ## {#dcat-catalog}
-In order to prioritize a specific view link, the relations and search forms in the entry nodes can be studied for their relation types, path or remaining items.
-The class `tree:ViewDescription` indicates a specific TREE structure on a `tree:Collection`.
-Through the property `tree:viewDescription` a `tree:Node` can link to an entity that describes the view, and can be reused in data portals as the `dcat:DataService`.
+A DCAT Catalog is an overview of datasets, data services and distributions.
+As TREE clients first need to select a dataset, and then a search tree to use, this aligns well with how DCAT-AP works.
+DCAT discovery extends upon the previous section in which a collection or dataset can be selected based on the `tree:shape` property.
-
tree:viewDescription
property in this page, a client either already discovered the description of this view in an earlier tree:Node
, either the current tree:Node
is implicitly the ViewDescription. Therefore, when the property path tree:view → tree:viewDescription
does not yield a result, the view properties MUST be extracted from the object of the tree:view
triple.
-A `tree:Node` can also be double typed as the `tree:ViewDescription`. A client must thus check for ViewDescriptions on both the current node without the `tree:viewDescription` qualification, as on the current node with the `tree:viewDescription` link.
+### Selecting a search tree via DCAT Distribution ### {#dcat-distribution}
-## dcat:Catalog ## {#collection}
+If the pattern `E dcat:distribution ?D . ?D dcat:downloadURL ?N .` matches, then `?N` is a root node of `E`.
-When multiple collections are found by a client, it can choose to prune the collections based on the `tree:shape` property.
-Therefore a data publisher SHOULD annotate a `tree:Collection` instance with a SHACL shape.
-The `tree:shape` points to a SHACL description of the shape (`sh:NodeShape`).
+Issue: This is yet to be done
-Note: the shape can be a blank node, or a named node on which you should follow your nose when it is defined at a different HTTP URL.
+### Selecting a search tree from a DCAT data service ### {#dcat-dataservice}
-# Context data # {#context}
+ * If `?DS dcat:servesDataset E ; dcat:endpointURL ?U` or `E dcat:endpointURL ?U` matches, then the client MUST repeat the algorithm with `?U` as the entrypoint.
+
+Issue: This is yet to be done
+
+## Linked Data Event Streams ## {#ldes}
+
+When a client is not made for query answering, but only for setting up a replication and synchronization system, a special type indicates that a search tree is made for this purpose: the `ldes:EventSource`.
+Clients that want to prioritize taking a _full_ copy MAY give full priority to this server hint.
+
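
The dispatch on the type of `E` and the two DCAT patterns introduced in this change can be sketched as below. This is only an illustrative reading of the text, not part of the specification: triples are modelled as plain `(subject, predicate, object)` tuples of prefixed names, the function names and the returned step labels are hypothetical, and checking the classes in the listed order is an assumption, since the text does not state a priority.

```python
# Hypothetical sketch: triples are (subject, predicate, object) tuples of
# prefixed names; a real client would work on dereferenced quads instead.
RDF_TYPE = "rdf:type"

# Assumption: classes are checked in the order the list in the text gives them.
PATTERNS = [
    ("tree:Collection", "select-search-tree"),
    ("dcat:Dataset", "select-distribution-or-dataservice"),
    ("ldes:EventStream", "ldes-properties"),
    ("dcat:Distribution", "process-distribution"),
    ("dcat:DataService", "process-dataservice"),
]


def classify_entrypoint(triples, entrypoint):
    """Return a label for the discovery step that applies to the entrypoint E."""
    types = {o for (s, p, o) in triples if s == entrypoint and p == RDF_TYPE}
    for rdf_class, step in PATTERNS:
        if rdf_class in types:
            return step
    # Case 6: E is a catalog or is not explicitly typed.
    return "select-dataset-from-catalog"


def root_nodes_via_distribution(triples, entrypoint):
    """Match `E dcat:distribution ?D . ?D dcat:downloadURL ?N .` and return every ?N."""
    distributions = {
        o for (s, p, o) in triples
        if s == entrypoint and p == "dcat:distribution"
    }
    return sorted(
        o for (s, p, o) in triples
        if s in distributions and p == "dcat:downloadURL"
    )


def endpoints_via_dataservice(triples, entrypoint):
    """Match `?DS dcat:servesDataset E ; dcat:endpointURL ?U` or `E dcat:endpointURL ?U`."""
    services = {
        s for (s, p, o) in triples
        if p == "dcat:servesDataset" and o == entrypoint
    }
    services.add(entrypoint)  # covers the `E dcat:endpointURL ?U` variant
    return sorted(
        o for (s, p, o) in triples
        if s in services and p == "dcat:endpointURL"
    )
```

Each IRI returned by `endpoints_via_dataservice` would then be fed back into the algorithm as a new entrypoint, as the data service subsection requires.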
+