
Standard Link

dpolatscalefree edited this page Feb 23, 2023 · 13 revisions

This macro creates a link entity, connecting two or more entities, or an entity with itself. It can be loaded by one or more source staging tables, if multiple sources share the same business definitions. Typically, a link is only loaded by multiple sources if those sources also share the business definitions of the hubs and therefore load the connected hubs together as well. If multiple sources are used, they must all contain the same number of foreign keys; otherwise they would not share the same business definition of that link. Additionally, a multi-source link needs an rsrc_static attribute defined for each source.

Features:

  • Loadable by multiple sources
  • Supports multiple updates per batch and therefore initial loading
  • Uses a dynamic high-water mark to optimize loading performance of multiple loads
  • Allows source mappings for deviations between source column names and hub column names
| Parameters | Data Type | Explanation |
|---|---|---|
| link_hashkey | string | Name of the link hashkey column inside the stage. Should be calculated from all business keys inside the link. |
| foreign_hashkeys | list of strings | List of all hashkey columns inside the link that refer to other hub entities. All hashkey columns must be available inside the stage area. |
| source_models | string or dictionary | For a single source entity, a string with the name of the source staging model is required. For multi source entities, a dictionary with information about the source models is required. The keys of the dict are the names of the source models, and the value of each source model is another dictionary. This inner dictionary optionally has the keys 'rsrc_static', 'hk_column' and 'fk_columns'. For further information about the rsrc_static attribute, please visit the following wiki page: rsrc_static Attribute |
| src_ldts | string | Name of the ldts column inside the source models. Optional; defaults to the global variable 'datavault4dbt.ldts_alias'. Must match the column alias defined inside the staging model. |
| src_rsrc | string | Name of the rsrc column inside the source models. Optional; defaults to the global variable 'datavault4dbt.rsrc_alias'. Must match the column alias defined inside the staging model. |
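
The optional parameters src_ldts and src_rsrc can also be passed explicitly. The following is a minimal sketch, assuming the staging model exposes its load date and record source under the hypothetical column names 'ldts' and 'rsrc'. These names are assumptions for illustration, not defaults documented on this page.

```sql
{{ config(materialized='incremental') }}

{%- set yaml_metadata -%}
link_hashkey: 'hk_opportunity_account_l'
foreign_hashkeys:
    - 'hk_opportunity_h'
    - 'hk_account_h'
source_models: stage_opportunity
src_ldts: 'ldts'   # hypothetical alias of the load date column in the stage
src_rsrc: 'rsrc'   # hypothetical alias of the record source column in the stage
{%- endset -%}

{%- set metadata_dict = fromyaml(yaml_metadata) -%}

{{ datavault4dbt.link(link_hashkey=metadata_dict['link_hashkey'],
        foreign_hashkeys=metadata_dict['foreign_hashkeys'],
        source_models=metadata_dict['source_models'],
        src_ldts=metadata_dict['src_ldts'],
        src_rsrc=metadata_dict['src_rsrc']) }}
```

If both parameters are left out, the macro falls back to the global variables named in the parameter descriptions.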

Example 1

{{ config(materialized='incremental') }}

{%- set yaml_metadata -%}
link_hashkey: 'hk_opportunity_account_l'
foreign_hashkeys: 
    - 'hk_opportunity_h'
    - 'hk_account_h'
source_models: stage_opportunity
{%- endset -%}    

{%- set metadata_dict = fromyaml(yaml_metadata) -%}

{%- set link_hashkey = metadata_dict['link_hashkey'] -%}
{%- set foreign_hashkeys = metadata_dict['foreign_hashkeys'] -%}
{%- set source_models = metadata_dict['source_models'] -%}


{{ datavault4dbt.link(link_hashkey=link_hashkey,
        foreign_hashkeys=foreign_hashkeys,
        source_models=source_models) }}

Description

With this example, a regular standard link is created. The created link represents a single source link because there is only one underlying source model defined in the metadata.
  • link_hashkey:
    • hk_opportunity_account_l: This hashkey column belongs to the link between opportunity and account, and was created at the staging layer by the stage macro.
  • foreign_hashkeys:
    • ['hk_opportunity_h', 'hk_account_h'] The link between opportunity and account needs to contain the hashkeys of both opportunity and account to enable joins to the corresponding hub entities.
  • source_models:
    • This would create a link loaded from only one source, which is not uncommon. It uses the model 'stage_opportunity'. The rsrc_static attribute is not set, because it is not required for single source entities. For further information about the rsrc_static attribute, please visit the following wiki page: rsrc_static Attribute.
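
To illustrate the joins to the hub entities mentioned above, here is a minimal sketch of a downstream dbt model. The model names opportunity_account_l, opportunity_h and account_h are hypothetical, and the sketch assumes that each hub exposes its hashkey under the same column name used in foreign_hashkeys.

```sql
-- Hypothetical downstream model: resolve the link to both connected hubs.
select
    l.hk_opportunity_account_l,
    o.hk_opportunity_h,
    a.hk_account_h
from {{ ref('opportunity_account_l') }} as l   -- assumed name of the link model
inner join {{ ref('opportunity_h') }} as o     -- assumed name of the opportunity hub
    on l.hk_opportunity_h = o.hk_opportunity_h
inner join {{ ref('account_h') }} as a         -- assumed name of the account hub
    on l.hk_account_h = a.hk_account_h
```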

Example 2

{{ config(materialized='incremental') }}

{%- set yaml_metadata -%}
link_hashkey: 'hk_opportunity_account_l'
foreign_hashkeys: 
    - 'hk_opportunity_h'
    - 'hk_account_h'
source_models:
    stage_opportunity:
        rsrc_static: '*/SALESFORCE/Opportunity/*'
    stage_account:
        rsrc_static: '*/SAP/Account/*'
        link_hk: 'hashkey_account_opportunity'
        fk_columns: 
            - hashkey_opportunity
            - hashkey_account
{%- endset -%}    

{%- set metadata_dict = fromyaml(yaml_metadata) -%}

{%- set link_hashkey = metadata_dict['link_hashkey'] -%}
{%- set foreign_hashkeys = metadata_dict['foreign_hashkeys'] -%}
{%- set source_models = metadata_dict['source_models'] -%}


{{ datavault4dbt.link(link_hashkey=link_hashkey,
        foreign_hashkeys=foreign_hashkeys,
        source_models=source_models) }}

Description

With this example, a regular standard link is created. The created link represents a multi source link because there are multiple underlying source models defined.
  • link_hashkey:
    • hk_opportunity_account_l: This hashkey column belongs to the link between opportunity and account, and was created at the staging layer by the stage macro.
  • foreign_hashkeys:
    • ['hk_opportunity_h', 'hk_account_h'] The link between opportunity and account needs to contain the hashkeys of both opportunity and account to enable joins to the corresponding hub entities.
  • source_models:
    • This would create a link loaded from two sources, which is also not uncommon. Because "link_hk" and "fk_columns" are defined explicitly for stage_account, a source mapping is enabled that lets users supply differently named input columns for different source models.
    • For further information about the rsrc_static attribute, please visit the following wiki page: rsrc_static Attribute
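
Conceptually, the source mapping aligns the differently named columns of stage_account with the target columns of the link. The following is an illustrative sketch only, not the SQL the macro actually generates. It assumes that the entries in fk_columns are matched to foreign_hashkeys by position, and it omits the load date, record source, deduplication and incremental logic.

```sql
-- stage_opportunity: column names already match the link definition
select
    hk_opportunity_account_l,
    hk_opportunity_h,
    hk_account_h
from {{ ref('stage_opportunity') }}

union all

-- stage_account: mapped columns are renamed to the link's column names
select
    hashkey_account_opportunity as hk_opportunity_account_l,
    hashkey_opportunity         as hk_opportunity_h,  -- assumed positional match
    hashkey_account             as hk_account_h       -- assumed positional match
from {{ ref('stage_account') }}
```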