This document explains the structure and content of a STAC Catalog. A STAC Catalog is a collection of STAC Items. These Items can be linked to directly from a Catalog, or the Catalog can link to other Catalogs (often called sub-Catalogs) that contain links to Items. How a Catalog is divided into sub-catalogs is up to the implementor, but the division is generally chosen to make online browsing easier for people.
Catalogs are not intended to be queried. Their purpose is discovery: to be browsed by people and crawled by machines to build a search index. A Catalog can be represented in JSON format. Any JSON object that contains all the required fields is a valid STAC Catalog.
- Examples and Implementations
- JSON Schema - please see the validation instructions
This Catalog specification primarily defines a structure that makes information discoverable. Anyone publishing a set of related spatiotemporal assets is strongly encouraged to also use the STAC Collection specification, which provides additional, contextual information about the set of Items contained in a catalog to aid discovery. Every STAC Collection is also a valid STAC Catalog.
This is still an early version of the STAC spec, so expect some changes before everything is finalized.
Implementations are encouraged, however, as every effort will be made not to change anything too drastically. Implementing the specification now helps ensure that needed changes can be made before everything is locked in, so this is an ideal time to implement: your feedback will be directly incorporated.
There are two required element types of a Catalog: Catalog and Item. A STAC Catalog points to STAC Items, or to other STAC catalogs. It provides a simple linking structure that can be used recursively so that many Items can be included in a single Catalog, organized however the implementor desires.
STAC makes no formal distinction between a "root" catalog and the "child" catalogs. A root catalog is simply a top-most catalog (which has no parent). A nested catalog structure is useful (and recommended) for breaking up massive numbers of catalog items into logical groupings. For example, it might make sense to organize a catalog by date (year, month, day), or geography (continent, country, state/prov). Any scheme may be used, but it's considered a best practice to keep the size of each catalog under a megabyte.
A simple Catalog structure might look like this:
catalog (root)
  - catalog
    - catalog
      - item
        - asset
      - item
        - asset
    - item
      - asset
      - asset
This example might be considered a somewhat "typical" structure. However, catalogs and items can describe a number of different relationships. The following shows various relationships between catalogs and items:
- Catalog -> Item (this is a common structure for a catalog to list links to items)
- Catalog -> Catalog (this is a common tree structure to group sets of items. Each catalog in this relationship may also include item links as well as catalog links)
- Item -> Catalog (example: an item may point to a catalog to describe a set of derived assets, where it may be desirable to have the origin asset as a "parent", such as NDVI generated from RGB/IR)
- Item -> Item (example: this relationship may be used to describe a 1-1 parent-child relationship, such as a single derived item from one parent item)
As all STAC Collections are also valid STAC Catalogs, all Catalogs described here could also be Collections.
The relationships are all described by a common links object structure, making use of the rel attribute to further describe the relationship.
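As a rough illustration of how a crawler might use the links and rel fields, the sketch below walks child links recursively and collects the URLs of item links. The example.com URLs, the in-memory DOCUMENTS mapping, and the crawl_items function are all hypothetical and stand in for real HTTP fetches; this is one possible approach, not a reference implementation.

```python
from urllib.parse import urljoin

# Hypothetical in-memory "web": URL -> parsed JSON document.
# A real crawler would fetch and parse each URL instead.
DOCUMENTS = {
    "https://example.com/catalog.json": {
        "stac_version": "0.8.0",
        "id": "root",
        "description": "Root catalog",
        "links": [
            {"rel": "self", "href": "https://example.com/catalog.json"},
            {"rel": "child", "href": "2019/catalog.json"},
        ],
    },
    "https://example.com/2019/catalog.json": {
        "stac_version": "0.8.0",
        "id": "2019",
        "description": "Items from 2019",
        "links": [
            {"rel": "parent", "href": "../catalog.json"},
            {"rel": "item", "href": "scene-1.json"},
            {"rel": "item", "href": "scene-2.json"},
        ],
    },
}


def crawl_items(url, fetch, seen=None):
    """Return absolute URLs of all Items reachable through child/item links."""
    seen = set() if seen is None else seen
    if url in seen:  # guard against link cycles
        return []
    seen.add(url)
    items = []
    for link in fetch(url).get("links", []):
        # Relative hrefs are resolved against the URL of the current document.
        target = urljoin(url, link["href"])
        if link["rel"] == "item":
            items.append(target)
        elif link["rel"] == "child":
            items.extend(crawl_items(target, fetch, seen))
    return items


item_urls = crawl_items("https://example.com/catalog.json", DOCUMENTS.__getitem__)
print(item_urls)
```

Only child and item links are followed here; parent, root and self links are ignored so that the walk moves strictly down the tree.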
There are a few types of catalogs that implementors occasionally refer to. These get defined by the links structure.
- A sub-catalog is a Catalog that is linked to from another Catalog that is used to better organize data. For example a Landsat collection might have sub-catalogs for each Path and Row, so as to create a nice tree structure for users to follow.
- A root catalog is a Catalog that only links to sub-catalogs. These are typically entry points for browsing data. Often they will contain the STAC Collection definition, but in implementations that publish diverse information it may contain sub-catalogs that provide a variety of collections.
- A parent catalog is the Catalog that sits directly above a sub-catalog. Following parent catalog links continuously will naturally end up at a root catalog definition.
It should be noted that a Catalog does not have to link back to all the other Catalogs that point to it. Thus a published root catalog might be a sub-catalog of someone else's structure. The goal is for data providers to publish all the information and links they want to, while also encouraging a natural web of information to arise as Catalogs and Items are linked to across the web.
There are a number of emerging 'best practices' for how to organize and implement good catalogs. These can be found in the best practices document, and include things like catalog layout, use of self links, publishing catalogs, and more. This specification is designed for maximum flexibility, and the best practices provide recommendations to guide implementations.
Though it is technically an implementation detail outside the scope of the core specification, it is worth mentioning that implementations generally fall into two different 'types':
- Static Catalogs can be implemented simply as files online, often stored in a cloud storage service like Amazon S3 or Google Cloud Storage. The core JSON documents and link structures are encoded in the files, and work as long as things are structured properly.
- Dynamic Catalogs are implemented in software, returning the JSON documents and links dynamically. This is mostly used when data holdings are already exposed through a dynamic interface, and STAC can be an alternate facade on the same core database or search cluster.
The two catalog types both implement the same fields and links, and can be treated as the same by clients. For more details on the two types and how you might use them see the Static and Dynamic Catalogs section of the best practices document.
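To make the static case more concrete, here is a minimal sketch that writes a root catalog and its sub-catalogs as plain JSON files connected by relative links. The write_static_catalog function, the directory layout (one folder per sub-catalog id) and the sample ids are assumptions made for illustration; self links are left out because they need the absolute URL the files will eventually be published at.

```python
import json
import tempfile
from pathlib import Path


def write_static_catalog(out_dir, root, children):
    """Write `root` to catalog.json and each child to <child id>/catalog.json."""
    out_dir = Path(out_dir)
    root_links = [{"rel": "root", "href": "./catalog.json"}]
    for child in children:
        child_dir = out_dir / child["id"]
        child_dir.mkdir(parents=True, exist_ok=True)
        child_doc = dict(child)
        # Each sub-catalog points back up to the root, which is also its parent.
        child_doc["links"] = [
            {"rel": "root", "href": "../catalog.json"},
            {"rel": "parent", "href": "../catalog.json"},
        ] + list(child.get("links", []))
        (child_dir / "catalog.json").write_text(json.dumps(child_doc, indent=2))
        root_links.append({"rel": "child", "href": f"./{child['id']}/catalog.json"})
    root_doc = dict(root)
    root_doc["links"] = root_links
    root_path = out_dir / "catalog.json"
    root_path.write_text(json.dumps(root_doc, indent=2))
    return root_path


with tempfile.TemporaryDirectory() as tmp:
    root_path = write_static_catalog(
        tmp,
        {"stac_version": "0.8.0", "id": "example", "description": "Example root"},
        [
            {
                "stac_version": "0.8.0",
                "id": "2019",
                "description": "Items from 2019",
                "links": [{"rel": "item", "href": "./scene-1.json"}],
            }
        ],
    )
    written_root = json.loads(root_path.read_text())
    written_child = json.loads((Path(tmp) / "2019" / "catalog.json").read_text())
```

The resulting directory could then be uploaded as-is to any file or object store; a dynamic catalog would return the same documents from an API instead.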
Element | Type | Description |
---|---|---|
stac_version | string | REQUIRED. The STAC version the catalog implements. |
stac_extensions | [string] | A list of extensions the Catalog implements. |
id | string | REQUIRED. Identifier for the catalog. |
title | string | A short descriptive one-line title for the catalog. |
description | string | REQUIRED. Detailed multi-line description to fully explain the catalog. CommonMark 0.28 syntax MAY be used for rich text representation. |
summaries | Map<string, [*]\|Stats Object> | A map of property summaries, either a set of values or statistics such as a range. |
links | [Link Object] | REQUIRED. A list of references to other documents. |
stac_version: In general, STAC versions can be mixed, but please keep the recommended best practices in mind.
stac_extensions: A list of extensions the Catalog implements. This does NOT declare the extensions of children or Items. The list contains URLs to the JSON Schema files it can be validated against. For official extensions, a "shortcut" can be used. This means you can specify the folder name of the extension, for example pointcloud for the Point Cloud extension. If the versions of the extension and the catalog diverge, you can specify the URL of the JSON schema file.
summaries: You can optionally summarize the potential values that are available as part of the properties in STAC Items.
Summaries are used to inform users about values they can expect from items without having to crawl through them. They also help to fully define collections, especially if they don't link to any Items.
Summaries are either a unique set of all values or statistics. Statistics by default only specify the range (minimum and maximum values), but can optionally be accompanied by additional statistical values.
The range can specify the potential range of values, but it is recommended to be as precise as possible. The set of values must contain at least one element and it is strongly recommended to list all values.
It is recommended to list as many properties as reasonable so that consumers get a full overview about the properties included in the Items. Nevertheless, it is not very useful to list all potential title values of the Items. Also, a range for the datetime property may be better suited to be included in the STAC Collection. In general, properties that are covered by the Collection specification (e.g. providers and license) may not be repeated in the summaries.
This object describes a relationship with another entity. Data providers are advised to be liberal with links.
Field Name | Type | Description |
---|---|---|
href | string | REQUIRED. The actual link in the format of a URL. Relative and absolute links are both allowed. |
rel | string | REQUIRED. Relationship between the current document and the linked document. See chapter "Relation types" for more information. |
type | string | Media type of the referenced entity. |
title | string | A human readable title to be used in rendered displays of the link. |
A more complete list of possible 'rel' types can be seen at the IANA page of Link Relation Types.
Please see the chapter 'relative vs absolute links' in the Item spec for a discussion on that topic, as well as the use of links section of the catalog best practices document.
The following types are commonly used as rel types in the Link Object of a STAC Catalog:
Type | Description |
---|---|
self | STRONGLY RECOMMENDED. Absolute URL to the location that the catalog file can be found online, if available. This is particularly useful when in a download package that includes metadata, so that the downstream user can know where the data has come from. |
root | STRONGLY RECOMMENDED. URL to the root STAC Catalog or Collection. Catalogs should include a link to their root, even if it's the root and points to itself. |
parent | URL to the parent STAC Catalog or Collection. Non-root catalogs should include a link to their parent. |
child | URL to a child STAC Catalog or Collection. |
item | URL to a STAC Item. |
Note: A link to at least one item or child catalog is REQUIRED.
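The required fields and the link requirement above can be checked with a few lines of code. The following is a minimal sketch, not a replacement for validating against the JSON Schema; the catalog_problems function name and the wording of its messages are made up for this example.

```python
REQUIRED_FIELDS = ("stac_version", "id", "description", "links")
REQUIRED_LINK_FIELDS = ("href", "rel")


def catalog_problems(catalog):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in catalog
    ]
    links = catalog.get("links", [])
    for index, link in enumerate(links):
        for name in REQUIRED_LINK_FIELDS:
            if name not in link:
                problems.append(f"links[{index}] is missing required field: {name}")
    # At least one link must point to an item or a child catalog.
    if not any(link.get("rel") in ("item", "child") for link in links):
        problems.append("no link with rel 'item' or 'child'")
    return problems


naip = {
    "stac_version": "0.8.0",
    "id": "NAIP",
    "description": "Catalog of NAIP Imagery",
    "links": [
        {"rel": "self", "href": "https://www.fsa.usda.gov/naip/catalog.json"},
        {"rel": "child", "href": "https://www.fsa.usda.gov/naip/30087/catalog.json"},
        {"rel": "root", "href": "https://www.fsa.usda.gov/catalog.json"},
    ],
}
print(catalog_problems(naip))
print(catalog_problems({"id": "incomplete", "links": []}))
```

The sketch only checks for the presence of fields. It does not check their types, and it does not look at optional fields such as title or summaries.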
For a good understanding of the summarized field, statistics can be added. By default, only ranges with a minimum and a maximum value can be specified.
Ranges can be specified for ordinal values only, which means they need to have a rank order.
Therefore, ranges can only be specified for numbers and some special types of strings. Examples: grades (A to F), dates or times.
Implementors are free to add other derived statistical values to the object, for example mean or stddev.
Field Name | Type | Description |
---|---|---|
min | number\|string | REQUIRED. Minimum value. |
max | number\|string | REQUIRED. Maximum value. |
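One way a summaries map could be derived from existing Items is sketched below: numeric properties become a Stats Object with min and max, and all other properties become a set of unique values. The summarize function, the rule for choosing between the two forms, and the property names in the sample Items are assumptions for illustration. The sketch also assumes non-numeric values are strings, so that they can be de-duplicated and sorted.

```python
from numbers import Number


def summarize(items, property_names):
    """Build a summaries map from the `properties` of a list of Item dicts."""
    summaries = {}
    for name in property_names:
        values = [
            item["properties"][name]
            for item in items
            if name in item.get("properties", {})
        ]
        if not values:
            continue  # a set of values must contain at least one element
        if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
            # Numbers are ordinal, so a range (Stats Object) can be given.
            summaries[name] = {"min": min(values), "max": max(values)}
        else:
            # Otherwise list every distinct value once.
            summaries[name] = sorted(set(values))
    return summaries


# Illustrative Items; only the `properties` field matters for this sketch.
items = [
    {"properties": {"gsd": 1.0, "platform": "aircraft-a", "cloud_cover": 3}},
    {"properties": {"gsd": 0.6, "platform": "aircraft-b", "cloud_cover": 12}},
    {"properties": {"gsd": 0.6, "platform": "aircraft-a"}},
]
summaries = summarize(items, ["gsd", "platform", "cloud_cover", "title"])
print(summaries)
```

Properties that no Item carries (title in this sample) are simply left out of the result.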
A catalog of NAIP imagery might look something like this:
{
  "stac_version": "0.8.0",
  "id": "NAIP",
  "description": "Catalog of NAIP Imagery",
  "links": [
    { "rel": "self", "href": "https://www.fsa.usda.gov/naip/catalog.json" },
    { "rel": "child", "href": "https://www.fsa.usda.gov/naip/30087/catalog.json" },
    { "rel": "root", "href": "https://www.fsa.usda.gov/catalog.json" }
  ]
}
In addition, the catalog shown above is strongly recommended to also follow the STAC Collection specification to add more information about the NAIP imagery such as the spatial and temporal extents, a license and more.
A typical 'child' sub-catalog could look similar:
{
  "stac_version": "0.8.0",
  "id": "NAIP",
  "description": "Catalog of NAIP Imagery - 30087",
  "links": [
    { "rel": "self", "href": "https://www.fsa.usda.gov/naip/30087/catalog.json" },
    { "rel": "parent", "href": "../catalog.json" },
    { "rel": "root", "href": "https://www.fsa.usda.gov/catalog.json" },
    { "rel": "item", "href": "https://www.fsa.usda.gov/naip/30087/m_3008718_sw_16_1_20130805.json" },
    { "rel": "item", "href": "https://www.fsa.usda.gov/naip/30087/m_3008718_sw_16_1_20130806.json" }
  ]
}
The root catalog in this example could hold a set of sub-catalogs with different STAC collections, e.g. data from other satellites or processed variants of the NAIP imagery.
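The sub-catalog example mixes absolute links with a relative parent link. A client might resolve all of them to absolute URLs as in the sketch below, which uses the catalog's own self link as the base. The resolved_links function is hypothetical, and using the self link as the base URL is an assumption; a client that fetched the document itself could use the URL it was fetched from instead.

```python
from urllib.parse import urljoin


def resolved_links(catalog, base_url=None):
    """Return (rel, absolute_href) pairs for every link in the catalog."""
    if base_url is None:
        # Fall back to the self link, which is expected to be absolute.
        base_url = next(
            (link["href"] for link in catalog["links"] if link["rel"] == "self"),
            None,
        )
    if base_url is None:
        raise ValueError("need a self link or an explicit base_url")
    return [(link["rel"], urljoin(base_url, link["href"])) for link in catalog["links"]]


child_catalog = {
    "stac_version": "0.8.0",
    "id": "NAIP",
    "description": "Catalog of NAIP Imagery - 30087",
    "links": [
        {"rel": "self", "href": "https://www.fsa.usda.gov/naip/30087/catalog.json"},
        {"rel": "parent", "href": "../catalog.json"},
        {"rel": "root", "href": "https://www.fsa.usda.gov/catalog.json"},
    ],
}
for rel, href in resolved_links(child_catalog):
    print(rel, href)
```

Under this assumption the relative parent link resolves to https://www.fsa.usda.gov/naip/catalog.json, while the absolute links are returned unchanged.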
There are emerging best practices, which in time will evolve into specification extensions for particular domains or uses.
The extensions page gives an overview about relevant extensions for STAC Catalogs.