Replies: 2 comments
-
I believe @booxter is proposing something as well.
-
I posted #1238, which is relevant here. I'm focusing on API modeling and provider interfaces for the Jobs API layer. Particular backends are an interesting discussion; I would need to explore the space a bit more before making a judgement one way or another.
-
For Llama Stack to be deployable either as a standalone application or as a distributed, service-oriented architecture, some abstraction over computation-heavy resources is needed. This can be custom, or it can leverage existing, mature solutions.
Use cases include at least training, synthetic data generation, and document processing.
It seems reasonable to say that the basic requirements are that such an abstraction should be able to function in a single-node deployment (ideally without requiring a message queue in that case), as a distributed cluster, or on Kubernetes.
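To make the shape of such an abstraction concrete, here is a minimal sketch of what a backend-agnostic jobs interface might look like, with a single-node backend that needs no message queue. All names here (`JobScheduler`, `LocalScheduler`, the `submit`/`status`/`result` methods) are hypothetical illustrations, not an existing Llama Stack API; a Kubernetes or Airflow backend would implement the same interface.

```python
import uuid
from abc import ABC, abstractmethod
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable


class JobScheduler(ABC):
    """Hypothetical abstraction over job execution backends
    (in-process, Kubernetes, Airflow, ...)."""

    @abstractmethod
    def submit(self, fn: Callable[..., Any], *args: Any) -> str: ...

    @abstractmethod
    def status(self, job_id: str) -> str: ...

    @abstractmethod
    def result(self, job_id: str) -> Any: ...


class LocalScheduler(JobScheduler):
    """Single-node backend: a plain thread pool, no message queue required."""

    def __init__(self, workers: int = 2) -> None:
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._jobs: dict[str, Future] = {}

    def submit(self, fn: Callable[..., Any], *args: Any) -> str:
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(fn, *args)
        return job_id

    def status(self, job_id: str) -> str:
        f = self._jobs[job_id]
        if f.running():
            return "running"
        return "completed" if f.done() else "scheduled"

    def result(self, job_id: str) -> Any:
        # Blocks until the job finishes, then returns its value.
        return self._jobs[job_id].result()


if __name__ == "__main__":
    scheduler = LocalScheduler()
    jid = scheduler.submit(lambda x: x * 2, 21)
    print(scheduler.result(jid))  # prints 42
```

The point of the sketch is only that callers (training, synthetic data generation, document processing) talk to `JobScheduler` and never to a particular backend, so swapping the local thread pool for a distributed executor would not change the calling code.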
Of the options I've looked at, Airflow seems to be the most general solution, subsuming at least most of the other major ones.
Has this been discussed before for Llama Stack?