
Feature Proposal: CloudWorkspace.get_custom_source() #574

Open
aaronsteers opened this issue Jan 15, 2025 · 3 comments

Comments

@aaronsteers
Contributor

aaronsteers commented Jan 15, 2025

Challenge

Today, testing custom connectors in PyAirbyte is a manual process: whenever the source definition changes, you have to copy and paste the YAML manifest into the runtime environment where PyAirbyte is running.

Proposal

Add to CloudWorkspace:

  • CloudWorkspace.custom_sources: list[CustomSourceConnector] - a list of custom sources defined in this workspace.
  • CloudWorkspace.get_custom_source(*, definition_id: str | None = None, name: str | None = None) -> CustomSourceConnector - Get a custom connector by ID or by name. The caller should provide exactly one of the two, not both.
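
The "one or the other, but not both" rule could be enforced along these lines. This is only a sketch under assumptions: `CustomSourceStub` and `find_custom_source` are hypothetical names used for illustration, not part of PyAirbyte, and the sketch assumes names may not be unique within a workspace.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CustomSourceStub:
    """Hypothetical stand-in for CustomSourceConnector (lookup fields only)."""

    definition_id: str
    name: str


def find_custom_source(
    sources: list,
    *,
    definition_id: Optional[str] = None,
    name: Optional[str] = None,
) -> CustomSourceStub:
    """Return the single custom source matching the given ID or name."""
    # Exactly one of the two identifiers must be supplied.
    if (definition_id is None) == (name is None):
        raise ValueError("Provide exactly one of `definition_id` or `name`.")

    if definition_id is not None:
        matches = [s for s in sources if s.definition_id == definition_id]
    else:
        matches = [s for s in sources if s.name == name]

    if not matches:
        raise LookupError("No custom source matched the given identifier.")
    if len(matches) > 1:
        # Assumption: names are not guaranteed unique, so fail loudly.
        raise LookupError("Multiple custom sources matched; use `definition_id`.")
    return matches[0]
```

Raising on an ambiguous name, rather than returning the first match, keeps the lookup predictable when two definitions share a name.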

Pseudocode

CustomSourceConnector might have a definition something like this:

from dataclasses import dataclass

@dataclass
class CustomSourceConnector:
    workspace: CloudWorkspace
    name: str
    manifest: dict
    version: int | None
    definition_id: str

    def as_local_source(
        self,
        /,
        config: dict | None = None,
        config_overlay: dict | None = None,
    ) -> Source:
        """Return a Source object that can be executed locally.
        If `config` is provided, it replaces the Cloud config.
        If `config_overlay` is provided, it is overlaid on top of the Cloud config.

        Note: By design, PyAirbyte cannot retrieve secrets from the Cloud API endpoints.
        Any secret config parameters will be returned as `******` and will need to be
        replaced using a config overlay.
        """
        ...
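
One way the `config` / `config_overlay` semantics from the docstring might be resolved is sketched below. The helper names `resolve_config` and `masked_keys` are hypothetical, and the sketch assumes a shallow (top-level) merge and that, if both arguments are given, the overlay is applied on top of the replacement config. The `******` mask is the one quoted in the docstring.

```python
from typing import Any, Optional

SECRET_MASK = "******"  # Mask string quoted in the docstring above.


def resolve_config(
    cloud_config: dict,
    config: Optional[dict] = None,
    config_overlay: Optional[dict] = None,
) -> dict:
    """Build the effective config without mutating any of the inputs."""
    # `config`, when given, replaces the Cloud config entirely.
    base = dict(config) if config is not None else dict(cloud_config)
    # `config_overlay`, when given, overrides individual top-level keys.
    if config_overlay:
        base.update(config_overlay)
    return base


def masked_keys(config: dict) -> list:
    """List top-level keys whose value is still the secret mask."""
    return sorted(key for key, value in config.items() if value == SECRET_MASK)
```

Checking `masked_keys()` on the resolved config before running the source would surface any secrets that still need to be supplied through the overlay.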

Usage Example A

We pass the manifest to get_source().

import airbyte as ab

my_workspace = ab.CloudWorkspace(
    client_id=...,
    client_secret=...,
)

my_source_definition = my_workspace.get_custom_source(name="My Test")
my_source = ab.get_source(
    "source-my-test",
    declarative_manifest=my_source_definition.manifest,
    config={...},
)

# Now we can work on it like a normal local source...
my_source.check()
my_source.read()

Usage Example B

We use the as_local_source() method to get a local Source object.

import airbyte as ab

my_workspace = ab.CloudWorkspace(
    client_id=...,
    client_secret=...,
)

my_source_definition = my_workspace.get_custom_source(name="My Test")
my_source = my_source_definition.as_local_source(config_overlay={"password": ...})

# Now we can work on it like a normal local source...
my_source.check()
my_source.read()
@jscheel

jscheel commented Jan 21, 2025

I think that this mostly makes sense. However, we would also need to introduce the concept of manifest versions. For example:

my_source_definition = my_workspace.get_custom_source(name="My Test")

would become

my_source_definition = my_workspace.get_custom_source(name="My Test", version=2)

The default for version would be whatever the API returns when no version is specified. Currently that is either the last published version or the draft.

It may also be worth having a special identifier for "most recently published".

@aaronsteers
Contributor Author

@jscheel - Good point - and agreed!

I agree we can use the same convention as the API. As you stated: when no version is requested, we return the latest published version, or the latest draft if it is newer than the latest published version. Any explicitly specified version retrieves exactly that version.
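
A rough sketch of that resolution rule, under assumptions: `ManifestVersion`, `resolve_version`, and the `LATEST_PUBLISHED` sentinel are hypothetical names (the sentinel illustrates the "most recently published" identifier suggested above), and "newer" is taken to mean a higher integer version number.

```python
from dataclasses import dataclass
from typing import Optional, Union

LATEST_PUBLISHED = "latest_published"  # Hypothetical special identifier.


@dataclass(frozen=True)
class ManifestVersion:
    """Hypothetical record describing one stored manifest version."""

    version: int
    is_draft: bool


def resolve_version(
    available: list,
    requested: Optional[Union[int, str]] = None,
) -> ManifestVersion:
    """Pick a manifest version following the convention described above."""
    published = [v for v in available if not v.is_draft]
    drafts = [v for v in available if v.is_draft]
    latest_published = max(published, key=lambda v: v.version, default=None)
    latest_draft = max(drafts, key=lambda v: v.version, default=None)

    if requested == LATEST_PUBLISHED:
        if latest_published is None:
            raise LookupError("No published version exists.")
        return latest_published

    if requested is not None:
        # An explicit version retrieves exactly that version.
        for candidate in available:
            if candidate.version == requested:
                return candidate
        raise LookupError(f"Version {requested!r} not found.")

    # Default: latest published, unless a newer draft exists.
    if latest_published is None and latest_draft is None:
        raise LookupError("No versions available.")
    if latest_published is None:
        return latest_draft
    if latest_draft is not None and latest_draft.version > latest_published.version:
        return latest_draft
    return latest_published
```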

@jscheel

jscheel commented Feb 5, 2025

@aaronsteers Another issue I've run into a few times now: the airbyte-cdk version keeps increasing at a steady clip, so pulling the manifest from Cloud can quickly result in a version mismatch that requires a package update.
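
A mismatch like this could be detected up front with a check along the following lines. This is a sketch, not an existing PyAirbyte feature: the helper names are hypothetical, it assumes the manifest carries a top-level `version` field recording the CDK version it was authored against, and the version parsing is deliberately simplified (numeric components only).

```python
from importlib.metadata import PackageNotFoundError, version as installed_version
from typing import Optional


def parse_version(text: str) -> tuple:
    """Parse the leading numeric 'X.Y.Z' components, ignoring any suffix."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def manifest_needs_newer_cdk(manifest: dict, cdk_version: str) -> bool:
    """Return True if the manifest targets a newer CDK than `cdk_version`."""
    # Assumption: the manifest's top-level `version` is the CDK version.
    required = manifest.get("version")
    if required is None:
        return False
    return parse_version(str(required)) > parse_version(cdk_version)


def get_installed_cdk_version() -> Optional[str]:
    """Return the installed airbyte-cdk version, or None if not installed."""
    try:
        return installed_version("airbyte-cdk")
    except PackageNotFoundError:
        return None
```

Running the check right after the manifest is fetched would turn a confusing runtime failure into a clear "please upgrade airbyte-cdk" message.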
