-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rfc: Define Python SDK Public API #8
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThe changes introduce a standardized method for defining the public API of the Changes
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (4)
active/008-metadata-ingestion-public-api.md (4)
2-2
: Format the RFC PR link using proper markdown syntax.Convert the bare URL to a proper markdown link for better readability.
-- RFC PR: https://github.com/datahub-project/rfcs/pull/8 +- RFC PR: [#8](https://github.com/datahub-project/rfcs/pull/8)🧰 Tools
🪛 Markdownlint
2-2: null
Bare URL used(MD034, no-bare-urls)
64-94
: Standardize markdown list markers.The alternatives section uses a mix of
*
and+
for list items. For consistency, use-
throughout the document as per markdown best practices.-- Mark internal APIs with a leading underscore - * Pros - + Simple and widely understood convention - * Cons: - + Requires renaming the majority of existing classes and methods +- Mark internal APIs with a leading underscore + - Pros + - Simple and widely understood convention + - Cons: + - Requires renaming the majority of existing classes and methods(Apply similar changes throughout the alternatives section)
🧰 Tools
🪛 Markdownlint
65-65: Expected: dash; Actual: asterisk
Unordered list style(MD004, ul-style)
66-66: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
67-67: Expected: dash; Actual: asterisk
Unordered list style(MD004, ul-style)
68-68: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
69-69: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
70-70: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
72-72: Expected: dash; Actual: asterisk
Unordered list style(MD004, ul-style)
73-73: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
74-74: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
75-75: Expected: dash; Actual: asterisk
Unordered list style(MD004, ul-style)
76-76: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
77-77: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
79-79: Expected: dash; Actual: asterisk
Unordered list style(MD004, ul-style)
80-80: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
81-81: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
82-82: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
83-83: Expected: dash; Actual: asterisk
Unordered list style(MD004, ul-style)
84-84: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
85-85: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
86-86: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
88-88: Expected: dash; Actual: asterisk
Unordered list style(MD004, ul-style)
89-89: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
90-90: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
91-91: Expected: dash; Actual: asterisk
Unordered list style(MD004, ul-style)
92-92: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
93-93: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
94-94: Expected: dash; Actual: plus
Unordered list style(MD004, ul-style)
102-103
: Enhance the future work section with specific validation criteria.Consider adding more details about:
- What aspects of the public API will be validated (documentation, type hints, backward compatibility)
- How the validation will be implemented (e.g., custom tools, existing linters)
- Criteria for adding new classes/functions to the public API
21-23
: Consider elaborating on the versioning strategy.While the requirement for semantic versioning is mentioned, it would be helpful to explicitly define:
- What constitutes a breaking change in the public API
- How version numbers will be incremented for different types of changes
- How deprecation notices will be handled
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
active/008-metadata-ingestion-public-api.md
(1 hunks)
🧰 Additional context used
🪛 Markdownlint
active/008-metadata-ingestion-public-api.md
65-65: Expected: dash; Actual: asterisk
Unordered list style
(MD004, ul-style)
66-66: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
67-67: Expected: dash; Actual: asterisk
Unordered list style
(MD004, ul-style)
68-68: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
69-69: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
70-70: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
72-72: Expected: dash; Actual: asterisk
Unordered list style
(MD004, ul-style)
73-73: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
74-74: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
75-75: Expected: dash; Actual: asterisk
Unordered list style
(MD004, ul-style)
76-76: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
77-77: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
79-79: Expected: dash; Actual: asterisk
Unordered list style
(MD004, ul-style)
80-80: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
81-81: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
82-82: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
83-83: Expected: dash; Actual: asterisk
Unordered list style
(MD004, ul-style)
84-84: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
85-85: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
86-86: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
88-88: Expected: dash; Actual: asterisk
Unordered list style
(MD004, ul-style)
89-89: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
90-90: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
91-91: Expected: dash; Actual: asterisk
Unordered list style
(MD004, ul-style)
92-92: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
93-93: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
94-94: Expected: dash; Actual: plus
Unordered list style
(MD004, ul-style)
2-2: null
Bare URL used
(MD034, no-bare-urls)
🔇 Additional comments (1)
active/008-metadata-ingestion-public-api.md (1)
41-47
: Verify the existence and accessibility of the public API entities.
The example looks good, but let's verify that these classes exist and are properly exposed in their respective modules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our public and private interfaces are confusingly intermixed right now. I think we probably want to do some refactoring here. For example, we probably should have a datahub.sdk
with all the mce builder helpers and the graph.
If we did that, we'd be able to say something like "everything in datahub.api, datahub.sdk, datahub.metadata., datahub.ingestion.api., etc" are public, everything else is internal.
E.g. having end users use long import paths like datahub.ingestion.graph.client.DataHubGraph
is pretty ugly. Ideally they'd be short like datahub.sdk.DataHubGraph
. Some of our other import paths e.g datahub.specific.form
also are pretty confusing
Importing and re-exporting things in init.py is pretty standard. We'll need some lint rules to make sure that
- the dependency graph is logical e.g. datahub.sdk should not depend on datahub.ingestion
- that we use the "true" import path instead of the aliased path everywhere. We already have issues with this, where we import
datahub.utilities.urns.urn
instead ofdatahub.metadata.urns
In init.py files we don't need to explicitly set __all__
if we're careful about what we import. I like how dagster does it https://github.com/dagster-io/dagster/blob/master/python_modules/dagster/dagster/__init__.py
Broadly, we should do some refactoring instead of just setting all everywhere. I'm also not actually opposed to renaming a bunch of things to have underscore prefixes e.g. |
Are you concerned at all about moving files, in case users depend on them in their own forks / scripts? Or would we move them + include a deprecated import reference. Also, what's the advantage of importing from I like the dagster impl as a way to avoid an explicit If we were to do a large renaming of files, then I'd probably advocate for a parent |
Re moving files - we can keep deprecated import references around for a bit Advantage of datahub.sdk and similar over top-level datahub -> yup mainly just organizational. That said, this isn't a strongly held opinion. I think you're probably right that dumping everything in top-level datahub init for now is fine and will be easiest, and we can move things around later if needed |
Summary by CodeRabbit
New Features
acryl-datahub
package to enhance user understanding.__all__
variable to clearly list public classes and functions.Documentation
Future Work