Notifications for lineage change #1026

Open
abhineet13 opened this issue Feb 16, 2022 · 3 comments

@abhineet13

Background [Optional]

Hi, we would like to be alerted to any changes in the lineage of an ETL job, such as changes in data source paths, attributes, or transformations.

Question

Is there a way to get alerts/notifications for changes to a Lineage?

@wajda
Contributor

wajda commented Feb 16, 2022

Thanks for the suggestion.
In theory, yes. In practice, it needs a more precise definition of what counts as a change in lineage. In Spline, lineage is represented by a combination of execution plans and execution events. Currently, any minor, logically irrelevant change in the execution plan (e.g. extra user metadata, or a timestamp somewhere) leads to another instance of the execution plan being created in the Spline database. The Spline server makes no effort to compare plans node by node, because it cannot decide with certainty which differences are meaningful and which are not. The answer depends heavily on the context (which data framework you are using and in what version, any other extensions, the semantics of every operator and its parameters, etc.) and on the use-case (e.g. a change in the username can be important or irrelevant in different situations). So, depending on the use-case, it could be easy or tricky to implement.
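To illustrate why the answer is context-dependent, here is a minimal Python sketch. It assumes a plan is available as a plain nested dictionary; the field names (`user`, `timestamp`, `operations`) and the helpers `strip_keys` and `plans_differ` are hypothetical and are not Spline's data model or API. The same two plans compare as equal or different depending on which keys the caller chooses to ignore.

```python
from typing import Any, FrozenSet


def strip_keys(node: Any, ignored: FrozenSet[str]) -> Any:
    """Return a copy of a nested dict/list structure without the ignored keys."""
    if isinstance(node, dict):
        return {k: strip_keys(v, ignored) for k, v in node.items() if k not in ignored}
    if isinstance(node, list):
        return [strip_keys(item, ignored) for item in node]
    return node


def plans_differ(old: dict, new: dict, ignored: FrozenSet[str]) -> bool:
    """Compare two plans after dropping keys the caller considers irrelevant."""
    return strip_keys(old, ignored) != strip_keys(new, ignored)


# Hypothetical plan documents; the field names and values are illustrative only.
plan_a = {
    "user": "alice",
    "timestamp": 1644969600,
    "operations": [{"name": "Read", "path": "s3://bucket/in"}],
}
plan_b = {
    "user": "bob",
    "timestamp": 1645056000,
    "operations": [{"name": "Read", "path": "s3://bucket/in"}],
}

# One use-case treats the username as noise, another treats it as significant.
username_irrelevant = plans_differ(plan_a, plan_b, frozenset({"timestamp", "user"}))
username_relevant = plans_differ(plan_a, plan_b, frozenset({"timestamp"}))
```

With the username ignored the two plans are considered unchanged; with it kept, the same pair counts as a change. Which set of ignored keys is right is exactly the decision the server cannot make on the user's behalf.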

We have similar feature requests for other use-cases, like #735 or #123
In future versions we'll add an admin API through which it will be possible to configure alerts on certain conditions. However, this is not a priority for us at the moment, so don't expect it to be done in the next few months at least.

wajda added the feature label Feb 16, 2022
@Mjlgh

Mjlgh commented Feb 17, 2022

abhineet13,

We're interested in this feature as well. We recognize there's a fair amount of complexity here, as it requires choices about what constitutes a change. Today we write Spline's generated lineage to our enterprise data catalog, and we plan to include a hash in the identifier for that lineage. The hash would be computed over the attribute paths created, ignoring literals and anything else that might be nondeterministic.
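A minimal Python sketch of that idea, under stated assumptions: the lineage has already been reduced to a list of records, each with a target attribute and its source attribute paths. The record shape, the `literals` field, and the function name `lineage_fingerprint` are hypothetical, not the format of Spline or of any particular data catalog. Only the attribute paths are hashed, in sorted order, so literals and ordering do not affect the result.

```python
import hashlib
import json
from typing import Iterable, Mapping


def lineage_fingerprint(attribute_lineage: Iterable[Mapping]) -> str:
    """Hash only the attribute paths; literals and any other fields are ignored."""
    canonical = sorted(
        (rec["target"], sorted(rec["sources"])) for rec in attribute_lineage
    )
    payload = json.dumps(canonical, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical lineage records for three runs of the same job.
run_1 = [{"target": "out.total", "sources": ["in.price", "in.qty"], "literals": [1.2]}]
run_2 = [{"target": "out.total", "sources": ["in.qty", "in.price"], "literals": [1.5]}]
run_3 = [{"target": "out.total", "sources": ["in.price", "in.discount"], "literals": [1.2]}]

# Same attribute paths, different literal and ordering: fingerprints match.
same_lineage = lineage_fingerprint(run_1) == lineage_fingerprint(run_2)
# A different source attribute: fingerprints differ, which could trigger an alert.
lineage_changed = lineage_fingerprint(run_1) != lineage_fingerprint(run_3)
```

Comparing the stored fingerprint with the one from the latest run gives a cheap change signal, at the cost of not saying what changed; a diff of the canonical form would be needed for that.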

@abhineet13
Author

Thanks @wajda and @Mjlgh for your responses.

wajda added this to Spline Mar 31, 2022
wajda moved this to New in Spline Mar 31, 2022