Merge pull request #2 from narrative-io/AccessRulesRefreshViews
added sections 10.7, 10.7.1,  10.5.2, 10.6, 10.6.1, 10.6.2
urbushey authored Dec 29, 2023
2 parents 76d1e82 + 6a32391 commit e37d0f5
Showing 1 changed file with 142 additions and 7 deletions.
149 changes: 142 additions & 7 deletions nql.md
@@ -186,23 +186,24 @@ Using `EXPLAIN <query>`, a forecast can be generated. This forecast estimates bo

### 10.3 CREATING MATERIALIZED VIEWS

#### 10.4.1 Materialized Views
#### 10.3.1 Materialized Views
In databases that support them, a materialized view is a database object that stores the result of a query physically. It provides a way to cache expensive query results and improve query performance by reading from this pre-computed result set, which can be refreshed periodically or on-demand. Similarly, in NQL, creating a materialized view creates a new, unique dataset within the Narrative Data Collaboration Platform that can be further queried or actioned on downstream.

Creating a materialized view effectively creates a new dataset with a unique name. Such datasets cannot ingest data from other sources. Data purchase costs may or may not be incurred when executing a query that materializes data as a new dataset, depending on the underlying query's access rules.

#### 10.4.2 Materialized View Syntax
#### 10.3.2 Materialized View Syntax

```sql
CREATE MATERIALIZED VIEW "<view_name>"
[ DISPLAY_NAME = '<display_name>' ]
[ DESCRIPTION = '<description_text>' ]
[ EXPIRE = { <ISO 8601 PERIOD> } // Supported syntax: "expire_when > P60", default retains all data.
[ EXPIRE = { <ISO 8601 PERIOD> } ] // Supported syntax: "expire_when > P60", default retains all data.
[ STATUS = { 'active' | 'updating' | 'draft' } ]
[ TAGS = ( '_nio_materialized_view', '<tag1>', ... ) ]
[ WRITE_MODE = { 'append' | 'overwrite' } ]
[ EXTENDED_STATS = { 'all' | 'none' } ]
[ PARTITIONED_BY <field> <transform>, <field2> <transform> ]
[ PARTITIONED_BY <field> <transform>, <field2> <transform> ]
[ REFRESH_SCHEDULE = { '@hourly' | '@daily' | '@weekly' | '@monthly' | '@once' | <cron_expression> } ]
AS SELECT <column_names> FROM <table_name>
```

@@ -257,16 +258,27 @@ The following parameters apply to the dataset that is generated by the `CREATE M
- Type: expression
- Default: Narrative sample partition is always present; users can add additional partitions.

- `REFRESH_SCHEDULE`: Defines the frequency of updates for the materialized view.
  - Allowed Values:
    - '@hourly'
    - '@daily'
    - '@weekly'
    - '@monthly'
    - '@once'
    - cron expressions
  - Type: enum | cron
  - Default: '@once'
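
As a rough illustration of how `REFRESH_SCHEDULE` might be combined with the other options, the sketch below creates a view that is rebuilt once a day. The view name, display name, and `company_data.my_dataset` source are hypothetical, and using `WRITE_MODE = 'overwrite'` for a full daily rebuild is an assumption; only the option names and allowed values come from the syntax in section 10.3.2.

```sql
-- Hypothetical names throughout; option syntax follows section 10.3.2.
CREATE MATERIALIZED VIEW "daily_country_codes"
DISPLAY_NAME = 'Daily country codes'
WRITE_MODE = 'overwrite'
REFRESH_SCHEDULE = '@daily'
AS SELECT my_dataset.country_code
FROM company_data.my_dataset
```

Omitting `REFRESH_SCHEDULE` is equivalent to `'@once'`, the documented default.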


## 10.5 Specialized Functions in NQL
### 10.5 Specialized Functions in NQL

NQL also supports a variety of specialized functions or User-Defined Functions (UDFs) to cater to specific use-cases.

### 10.5.1 `ADDRESS_HASHES()`
#### 10.5.1 `ADDRESS_HASHES()`

The `ADDRESS_HASHES()` function generates libpostal address hashes from an unstructured address string. This is especially useful for conducting fuzzy address comparisons where exact string matching isn't sufficient. By hashing the addresses and then joining two input lists based on these hashes, users can find approximate address matches with high efficiency.

#### Example:
##### Example:

```sql
CREATE MATERIALIZED VIEW "address_hashes_sample_v2" AS
@@ -290,6 +302,117 @@ SELECT
) as address_hashes,
...
```
#### 10.5.2 `country_code_3_to_2()`

The `country_code_3_to_2()` function takes a single column of ISO 3166-1 alpha-3 country codes and converts each value to its ISO 3166-1 alpha-2 equivalent (for example, `USA` becomes `US`). This is useful for matching against the `iso_3166_1_country` Rosetta Stone attribute, whose standard output is expressed as ISO 3166-1 alpha-2 country codes.

##### Example:

```sql
CREATE MATERIALIZED VIEW "country_code_sample" AS
SELECT
country_code_3_to_2(my_dataset.country_code) as two_letter_codes
FROM
company_data.my_dataset
...
```
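
The converted value can presumably be used wherever a column expression is allowed. The following sketch keeps the original three-letter code alongside the converted one and filters on the two-letter result. The dataset and column names are the same hypothetical ones used above, and the use of the function inside `WHERE` is an assumption rather than documented behavior.

```sql
-- Hypothetical dataset and column names.
-- Assumption: the function may also be called inside a WHERE predicate.
SELECT
  my_dataset.country_code AS three_letter_code,
  country_code_3_to_2(my_dataset.country_code) AS two_letter_code
FROM
  company_data.my_dataset
WHERE
  country_code_3_to_2(my_dataset.country_code) = 'US'
```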

### 10.6 Querying an Access Rule Directly

An access rule has two identifiers: an `access_rule_name` and an `access_rule_id`. Access rule names are human-readable and must be created explicitly, while access rule IDs are generated automatically when each access rule is first set up. NQL supports querying access rules directly by `access_rule_name` only, not by `access_rule_id`.

NQL supports directly querying both internal access rules (access rules on datasets in the same company seat) and external access rules (access rules on datasets in a different company seat). Querying an access rule is the third method of querying datasets in NQL, in addition to the Rosetta Stone attribute catalog and dataset IDs.

#### 10.6.1 Querying Internal Access Rules

The access rule name follows the company identifier. When querying data in your own company seat, that identifier is always `company_data`.

##### Example

```sql
SELECT pd.hashed_emails
FROM company_data.access_rule_for_private_deal pd
```

#### 10.6.2 Querying External Access Rules

The access rule name follows the company identifier. When querying data in a different company seat, that identifier is the other company's `company_slug`.

##### Example

```sql
SELECT teams.baseball_teams
FROM company_slug.access_rule_unique_name_1 teams
```
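
Because both forms resolve to a queryable source, an internal and an external access rule could in principle appear in the same query. The sketch below reuses the access rule names from the two examples above. The `user_id` join key is hypothetical, and support for joining across access rules is an assumption that this section does not confirm.

```sql
-- Assumption: access rules can be joined like datasets.
-- user_id is a hypothetical column present in both sources.
SELECT
  pd.hashed_emails,
  teams.baseball_teams
FROM
  company_data.access_rule_for_private_deal pd
JOIN
  company_slug.access_rule_unique_name_1 teams
ON
  pd.user_id = teams.user_id
```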

### 10.7 Embedded Namespaces in NQL

#### 10.7.1 `_rosetta_stone`

NQL supports querying Rosetta Stone attributes directly through the Rosetta Embedded Namespace, `_rosetta_stone`. `_rosetta_stone` acts as an attribute reference within a dataset or access rule, and it must follow a dataset's `unique_name`, a dataset's `id`, or an access rule's `name`. If no mappings exist or the attribute reference is incorrect, the query returns an error.

##### Basic Usage

```sql
SELECT ds_identifier._rosetta_stone."attribute_name" AS alias_name
FROM dataset_source
```

- **`ds_identifier`**: Alias or identifier for the dataset. A dataset can be referenced by its `id` or `unique_name`.
- **`attribute_name`**: The name of the Rosetta Stone attribute that is being selected.
- **`alias_name`**: An optional alias for the selected attribute.

##### Example with Single Dataset

```sql
SELECT ds_123._rosetta_stone."event_timestamp" AS event_time
FROM company_data.ds_123 AS ds_123
```
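
Since `_rosetta_stone` may also follow an access rule's `name`, the same pattern should carry over to access rules. A minimal sketch, reusing the internal access rule name from section 10.6.1 and assuming that rule exposes a mapping for `event_timestamp`:

```sql
-- Assumption: the access rule has a Rosetta Stone mapping for event_timestamp.
SELECT pd._rosetta_stone."event_timestamp" AS event_time
FROM company_data.access_rule_for_private_deal AS pd
```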

##### Example Joining Multiple Datasets

```sql
SELECT
ds_123._rosetta_stone."attribute_1" AS attribute_from_a,
ds_456._rosetta_stone."attribute_2" AS attribute_from_b,
ds_123.email,
ds_456.username
FROM
company_data.ds_123 AS ds_123
JOIN
company_data.ds_456 AS ds_456
ON
ds_123.user_id = ds_456.user_id
```

In this example:

- The first Rosetta Stone attribute (**`attribute_1`**) is being pulled from dataset **`ds_123`**.
- The second Rosetta Stone attribute (**`attribute_2`**) is being pulled from dataset **`ds_456`**.

##### Example of Nested Properties
For nested properties, the same dot notation is used within the **`_rosetta_stone`** namespace.

```sql
SELECT
ds_123._rosetta_stone."nested"."attribute" AS nested_attribute
FROM
company_data.ds_123 AS ds_123
```

##### Example of Filtering with Rosetta Attributes

```sql
SELECT
ds_123._rosetta_stone."unique_id"."value" AS id
FROM
company_data.ds_123 AS ds_123
WHERE
ds_123.id = 123
```

Here, the query selects the Rosetta attribute **`unique_id.value`** from dataset **`ds_123`**, and the **`WHERE`** clause restricts the rows by filtering on the dataset's `id` column.
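
To filter on a Rosetta attribute itself rather than on a regular column, the namespaced reference could presumably be placed in the `WHERE` clause. The sketch below assumes NQL accepts `_rosetta_stone` references in predicates, which this section does not state explicitly; the literal value is a placeholder.

```sql
-- Assumption: _rosetta_stone references are allowed in WHERE predicates.
-- '<placeholder_id>' stands in for a real identifier value.
SELECT
  ds_123._rosetta_stone."unique_id"."value" AS id
FROM
  company_data.ds_123 AS ds_123
WHERE
  ds_123._rosetta_stone."unique_id"."value" = '<placeholder_id>'
```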


## 12. Example Queries

@@ -350,6 +473,18 @@ WHERE

# CHANGE LOG

## Update 2023-12-26

### Section 10 - CREATING MATERIALIZED VIEWS

- Create Materialized View syntax was updated to include `REFRESH_SCHEDULE`, which defines the frequency of updates for the materialized view.
- The UDF section includes a new function: `country_code_3_to_2()`.
- NQL supports targeting access rules directly.
  - Internal access rules are targeted using `access_rule_name` and the `company_data` identifier.
  - External access rules are targeted using `company_slug` and `access_rule_name`.
- Introduction of the Rosetta Embedded Namespace as a way to query attributes from specific access rules or datasets. The Rosetta Embedded Namespace is facilitated by `_rosetta_stone`.


## Update 2023-11-05
### Section 2 - Scope
- Revised to highlight NQL's integration with the Narrative platform.