This document describes the top-level keys of the data format. See this directory's README for the basic concepts.
{
"annotations": {},
"backend_version": "",
"bucket_date": "2019-10-10",
"data_format_version": "0.2.0",
"extensions": {},
"id": "bc1ff44a-04e7-45e0-81a6-46bc95c7c6b0",
"input": "http://example.com/",
"input_hashes": [],
"measurement_start_time": "2019-10-10 23:59:23",
"options": [],
"probe_asn": "AS13285",
"probe_network_name": "TalkTalk Communications Limited",
"probe_cc": "GB",
"probe_city": "",
"probe_ip": "127.0.0.1",
"report_filename": "2019-10-10/20191010T235813Z-GB-AS13285-web_connectivity-20191010T235815Z_AS13285_SCHbEXPZ59vF8wmd6SHGGCaPxYGiEg8tSPwN85fJIFHrG4ZfVP-0.2.0-probe.json",
"report_id": "20191010T235815Z_AS13285_SCHbEXPZ59vF8wmd6SHGGCaPxYGiEg8tSPwN85fJIFHrG4ZfVP",
"resolver_asn": "AS15169",
"resolver_ip": "8.8.8.8",
"resolver_network_name": "Google LLC",
"software_name": "ooniprobe-ios",
"software_version": "2.1.0",
"test_helpers": {},
"test_keys": {},
"test_name": "web_connectivity",
"test_runtime": 2.2955930233,
"test_start_time": "2019-10-10 23:58:13",
"test_version": "0.0.1"
}
- `annotations` (`map[string]string`; optional): key-value annotations that provide metadata about this measurement. See below.
- `backend_version` (`string`; optional): version of the backend that collected this specific measurement. Clients are not supposed to emit this field.
- `bucket_date` (`string`): a date like `"2006-01-02"` that indicates when this measurement was processed by the data pipeline. Clients are not supposed to emit this field.
- `data_format_version` (`string`): indicates the data format version. See README.md for the current version and for the version history.
- `extensions` (`map[string]int`; optional): SHOULD describe the extensions to the base data format included in the `test_keys` field. The name of an extension is obtained directly from the name of the file describing the extension in this directory, with the `df-xxx` prefix and the `.md` suffix removed. A probe SHOULD describe the extensions included by its measurements.
- `id` (`string`; optional): client-generated UUID4 identifying this measurement in the context of a set of measurements (i.e. a report). Consumers of OONI data SHOULD NOT trust this identifier to uniquely identify the measurement. This identifier is only meaningful for measurements that have not been submitted to an OONI collector yet. In fact, OONI collectors SHOULD clear this field to avoid any potential confusion caused by it.
- `input` (`string`; nullable): if this experiment accepts any input, the input that was used to produce this measurement. For example, the Web Connectivity experiment uses URLs as input. Otherwise, this field SHOULD be present and set to `null`.
- `input_hashes` (`[]string`; optional; deprecated): historical field that used to contain the SHA256 hashes of all inputs provided to the experiment. Modern implementations, e.g. Measurement Kit, typically emit an empty list. Modern clients SHOULD NOT emit this field at all.
- `measurement_start_time` (`string`): time when this measurement was started in UTC, using the `"2006-01-02 15:04:05"` format. Note that ooniprobe <= 1.4.0 generates skewed time information.
- `options` (`[]string`; optional): list of options passed on the command line when running this specific experiment. Modern implementations, e.g. Measurement Kit, typically emit an empty list here. This field is useful when users are allowed to heavily customise the experiment; for this reason we record the options used by our experimental, research-oriented client `miniooni`.
- `probe_asn` (`string`): AS number of the probe (prefixed by AS, e.g., `"AS1234"`), or `"AS0"` if the user does not want to share their ASN.
- `probe_network_name` (`string`; optional; since `2020-04-22`): the organisation name corresponding to the AS of the probe.
- `probe_cc` (`string`): two-letter country code of the probe (e.g., `"IT"`), or `"ZZ"` if the user does not want to share their country code.
- `probe_city` (`string`; optional; deprecated): name of the city where the measurement was run. If the user does not want to share this information, this field was historically set to `null`; modern clients SHOULD NOT emit it.
- `probe_ip` (`string`): IP address of the probe, or `"127.0.0.1"` if the user does not want to share their IP.
- `report_filename` (`string`): name of the file containing the report, i.e., a set of related measurements, in our infrastructure. Clients are not supposed to emit this field.
- `report_id` (`string`): identifier of a set of related measurements, generated by OONI backends when submitting one or more measurements.
- `resolver_asn` (`string`; optional; since `2019-12-29`): like `probe_asn` but for `resolver_ip` rather than for `probe_ip`.
- `resolver_ip` (`string`; optional; since `2019-11-11`): IP of the DNS resolver used by the probe, as determined by the measurement engine.
- `resolver_network_name` (`string`; optional; since `2019-12-29`): like `probe_network_name` but for `resolver_ip` rather than for `probe_ip`.
- `software_name` (`string`): name of the software that generated this specific measurement (e.g., `"ooniprobe"`).
- `software_version` (`string`): version of the software used to generate this specific measurement (e.g., `"3.0.0"`).
- `test_helpers` (`map[string]any`): map containing information about which test helpers were used to run this measurement. See below for more information regarding this field's format.
- `test_keys` (`object`): object containing keys that depend on the specific network experiment that we're running as well as on the specific test helpers that are used.
- `test_name` (`string`): name of the experiment in snake case. For example, Web Connectivity SHOULD be indicated as `"web_connectivity"`.
- `test_runtime` (`float`): runtime of this specific measurement in seconds with arbitrary sub-second precision. All modern implementations, i.e. Measurement Kit and `github.com/ooni/probe-engine`, measure this value as the time elapsed from when we start measuring a specific input (or start an experiment without input) until the measurement is complete (i.e. all the fields inside `test_keys` have been computed, or an error or timeout has caused the measurement to be aborted and the error has been recorded inside it). This specifically means that this field does not include the time spent communicating with OONI backends such as the bouncer and the collector, but it does include the communication with any backend that is required to finish off the measurement (e.g. the Web Connectivity test helper). Note that this field's name is misleading and it should have been called `measurement_runtime` instead.
- `test_start_time` (`string`): like `measurement_start_time` except that it indicates the moment at which a related set of measurements started rather than the moment at which the current measurement started. For example, for the Web Connectivity experiment, this is the moment at which we start processing a list of input URLs.
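The relationship between `test_start_time` and `measurement_start_time` can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the helper names `parse_ooni_time` and `seconds_into_report` are hypothetical, and the `strptime` pattern is an assumption inferred from timestamps such as `"2019-10-10 23:59:23"` in the example at the top of this document.

```python
from datetime import datetime, timezone

# Assumed strptime pattern for timestamps such as "2019-10-10 23:59:23".
OONI_TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_ooni_time(value: str) -> datetime:
    """Parse a measurement timestamp and mark it as UTC."""
    return datetime.strptime(value, OONI_TIME_FORMAT).replace(tzinfo=timezone.utc)


def seconds_into_report(measurement: dict) -> float:
    """Seconds between the start of the report and the start of this measurement."""
    started = parse_ooni_time(measurement["test_start_time"])
    measured = parse_ooni_time(measurement["measurement_start_time"])
    return (measured - started).total_seconds()


# Values taken from the example measurement at the top of this document.
example = {
    "test_start_time": "2019-10-10 23:58:13",
    "measurement_start_time": "2019-10-10 23:59:23",
}
print(seconds_into_report(example))  # 70.0
```

The timestamps carry no timezone suffix, so the sketch attaches UTC explicitly rather than leaving the values as naive datetimes.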
{
"engine_name": "libmeasurement_kit",
"engine_version": "0.10.4",
"engine_version_full": "v0.10.4",
"network_type": "wifi",
"ooni_run_link_id": "123456",
"platform": "ios",
"architecture": "arm64"
}
Annotations is defined as `map[string]string` but the consumer of this field
SHOULD NOT assume that measurements use string values. A client SHOULD always
add to the map of annotations:
- `architecture` (`string`): one of `arm`, `arm64`, `386`, `amd64`
- `engine_name` (`string`): the name of the measurement engine
- `engine_version` (`string`): the version of the measurement engine
- `engine_version_full` (`string`): the version of the measurement engine as generated by `git describe --tags`
- `go_version` (`string`): the version of Go we're using
- `network_type` (`string`): one of:
  - `mobile`: when OONI Probe Mobile is using 2G/3G/4G/5G networks.
  - `wifi`: when OONI Probe Mobile is using Wi-Fi networks.
- `ooni_run_link_id` (`string`): the OONI-Run-v2 link ID that caused this measurement to be performed.
- `platform` (`string`): one of:
  - `android`
  - `freebsd`
  - `ios`
  - `lepidopter`
  - `linux`
  - `macos`
  - `windows`
- `vcs_modified` (`string`): `"true"` or `"false"` depending on whether the tree used for building was dirty
- `vcs_revision` (`string`): the revision we're building
- `vcs_time` (`string`): the time of the revision we're building
- `vcs_tool` (`string`): the version control system (VCS) tool we're using, which typically should be `"git"`
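Because consumers should not assume that every annotation value is a string, a defensive reader may want to normalize the map before using it. The following Python sketch shows one way to do that. The function name `normalize_annotations` is hypothetical, and rendering non-string values with their JSON spelling is an assumption of this sketch, not something the specification mandates.

```python
import json


def normalize_annotations(annotations: dict) -> dict:
    """Return a copy of the annotations in which every value is a string."""
    normalized = {}
    for key, value in annotations.items():
        if isinstance(value, str):
            normalized[key] = value
        else:
            # Assumption: non-string values are rendered with their JSON
            # spelling, so True becomes "true" and 123456 becomes "123456".
            normalized[key] = json.dumps(value)
    return normalized


# Hypothetical input mixing string and non-string values.
raw = {"platform": "ios", "ooni_run_link_id": 123456, "vcs_modified": True}
print(normalize_annotations(raw))
# {'platform': 'ios', 'ooni_run_link_id': '123456', 'vcs_modified': 'true'}
```

Using the JSON spelling keeps booleans consistent with the `"true"` and `"false"` strings described for `vcs_modified`.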
Historically we have saved into `test_helpers` two different data structures:
{"backend": "1.1.1.1:853"}
used, e.g., by HTTP Invalid Request Line, and
{
  "backend": {
    "address": "https://mia-wcth.ooni.io",
    "type": "https"
  }
}
used, e.g., by Web Connectivity.
The former is typically used when there can only be a single type of backend. The latter when more types are possible.
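A consumer that has to cope with both historical shapes could map them onto a single structure. The Python sketch below illustrates the idea. The function name `normalize_test_helper` is hypothetical, and representing the short form with a `type` of `None` is an assumption of this sketch.

```python
def normalize_test_helper(value):
    """Return a dict with "address" and "type" keys for either historical shape."""
    if isinstance(value, str):
        # Short form: only the address is known, so the type stays unset.
        return {"address": value, "type": None}
    if isinstance(value, dict):
        # Long form: copy the two documented keys, tolerating missing ones.
        return {"address": value.get("address"), "type": value.get("type")}
    raise TypeError(f"unexpected test helper value: {value!r}")


# The two shapes shown above.
short_form = {"backend": "1.1.1.1:853"}
long_form = {"backend": {"address": "https://mia-wcth.ooni.io", "type": "https"}}

for helpers in (short_form, long_form):
    print({name: normalize_test_helper(v) for name, v in helpers.items()})
```

Raising `TypeError` on any other shape makes unexpected data visible instead of silently passing it through.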
In the following example we omitted the content of `test_keys` because it was not relevant for this discussion.
{
"annotations": {
"platform": "macos"
},
"data_format_version": "0.2.0",
"extensions": {
"dnst": 0,
"httpt": 0,
"tcpconnect": 0
},
"input": null,
"measurement_start_time": "2020-01-10 17:25:19",
"probe_asn": "AS30722",
"probe_network_name": "Vodafone Italia S.p.A.",
"probe_cc": "IT",
"probe_ip": "127.0.0.1",
"report_id": "20200110T172519Z_AS30722_5UdG13d6rEfOVCTHEdMjuXGah8vF6dpShA0jditnrHCmH10o1K",
"resolver_asn": "AS15169",
"resolver_ip": "172.217.34.2",
"resolver_network_name": "Google LLC",
"software_name": "miniooni",
"software_version": "0.1.0-dev",
"test_keys": {},
"test_name": "telegram",
"test_runtime": 4.426603178,
"test_start_time": "2020-01-10 17:25:19",
"test_version": "0.0.4"
}
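The key descriptions earlier in this document list placeholder values for information the user chose not to share: `"AS0"` for `probe_asn`, `"ZZ"` for `probe_cc` and `"127.0.0.1"` for `probe_ip`. As a rough illustration, the Python sketch below reports which of these fields carry a placeholder in a measurement. The names `PLACEHOLDERS` and `withheld_fields` are hypothetical.

```python
# Placeholder values listed in the key descriptions for withheld information.
PLACEHOLDERS = {"probe_asn": "AS0", "probe_cc": "ZZ", "probe_ip": "127.0.0.1"}


def withheld_fields(measurement: dict) -> list:
    """Return the sorted names of probe fields that carry a placeholder value."""
    return sorted(k for k, v in PLACEHOLDERS.items() if measurement.get(k) == v)


# Probe fields copied from the example above.
example = {"probe_asn": "AS30722", "probe_cc": "IT", "probe_ip": "127.0.0.1"}
print(withheld_fields(example))  # ['probe_ip']
```

For the example above, only the IP address was withheld, while the ASN and country code were shared.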