Commit
update documentation
magurh committed Feb 25, 2025
1 parent 0b19cf4 commit 3c1dc51
Showing 4 changed files with 16 additions and 3 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -165,7 +165,7 @@ cython_debug/
*.pdf
*.svg
# *.jpeg
*.png
# *.png
*.bmp

### VirtualEnv template
4 changes: 2 additions & 2 deletions README.md
@@ -249,8 +249,8 @@ If you encounter issues, follow these steps:
- _Chain of Thought_ prompting techniques are a linear problem solving approach where each step builds upon the previous one. Google's approach in [arXiv:2201.11903](https://arxiv.org/pdf/2201.11903) is to augment each prompt with an additional example and chain of thought for an associated answer. (See the paper for multiple examples.)
- **Dynamic resource allocation and Semantic Filters**:
  - An immediate improvement to the current approach would be to use dynamically-adjusted parameters. Namely, the number of iterations and the number of models used in the algorithm could be adjusted to the input prompt: _e.g._ simple prompts require fewer resources. For this, a centralized model could be used to assess the complexity of the task before the prompt is sent to the other LLMs.
  - On a similar note, the number of improvement iterations could be adjusted according to how _different_ the model responses are. Semantic entailment for LLM outputs is an active field of research, but a rather quick solution is to rely on _embeddings_. These are commonly used in RAG pipelines, and could also be used here with _e.g._ cosine similarity. You can get started with [GCloud's text embeddings](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings) -- see [flare-ai-rag](https://github.com/flare-foundation/flare-ai-rag/tree/main) for more details.
- The use of [LLM-as-a-Judge](https://arxiv.org/pdf/2306.05685) for evaluating other LLM outputs has shown good progress -- see also this [Confident AI blogpost](https://www.confident-ai.com/blog/why-llm-as-a-judge-is-the-best-llm-evaluation-method).
- In line with the previously mentioned LLM-as-a-Judge, a model could potentially be used for filtering _bad_ responses. LLM-Blender, for instance, introduced in [arXiv:2306.02561](https://arxiv.org/abs/2306.02561), uses a PairRanker that achieves a ranking of outputs through pairwise comparisons via a _cross-attention encoder_.
- **AI Agent Swarm**:
- The structure of the reference CL implementation can be changed to adapt _swarm_-type algorithms, where tasks are broken down and distributed among specialized agents for parallel processing. In this case a centralized LLM would act as an orchestrator for managing distribution of tasks -- see _e.g._ [swarms repo](https://github.com/kyegomez/swarms).
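The embedding-based stopping criterion mentioned above can be sketched in a few lines. This is a minimal illustration, not part of the template: the embedding vectors are assumed to come from an external API (such as GCloud's text embeddings), so only the similarity check itself is shown here.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def responses_converged(embeddings: list[np.ndarray], threshold: float = 0.9) -> bool:
    """Stop iterating once every pair of model responses is sufficiently similar.

    `embeddings` holds one vector per model response; the loop checks all pairs.
    """
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if cosine_similarity(embeddings[i], embeddings[j]) < threshold:
                return False
    return True
```

A driver loop could call `responses_converged` after each round and skip further improvement rounds once it returns `True`.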
13 changes: 13 additions & 0 deletions src/README.md
@@ -3,6 +3,19 @@

# Flare AI Consensus

## flare-ai-consensus Pipeline

The flare-ai-consensus template consists of the following components:

* **Router:** The primary interface that receives user requests, distributes them to the various AI models, and collects their intermediate responses.
* **Aggregator:** Synthesizes multiple model responses into a single, coherent output.
* **Consensus Layer:** Defines the logic for the consensus algorithm. The reference implementation is set up in the following steps:
* The initial prompt is sent to a set of models, with additional system instructions.
* Initial responses are aggregated by the Aggregator.
  * Improvement rounds follow, in which the aggregated response is sent back to the models as additional context or as system instructions.

<img width="500" alt="flare-ai-consensus" src="./cl_pipeline.png" />
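The steps above can be sketched as a simple loop. This is an illustrative outline only: the `send_prompt` and `aggregate` callables are hypothetical placeholders standing in for the template's actual Router and Aggregator components.

```python
from typing import Callable


def run_consensus(
    prompt: str,
    models: list[str],
    send_prompt: Callable[[str, str], str],  # (model, prompt) -> response
    aggregate: Callable[[list[str]], str],   # responses -> aggregated answer
    rounds: int = 2,
) -> str:
    """Consensus loop sketch: one initial round plus `rounds` improvement rounds."""
    # Initial round: every model answers the raw prompt.
    responses = [send_prompt(m, prompt) for m in models]
    aggregated = aggregate(responses)
    # Improvement rounds: feed the aggregated answer back as extra context.
    for _ in range(rounds):
        improved_prompt = f"{prompt}\n\nPrevious aggregated answer:\n{aggregated}"
        responses = [send_prompt(m, improved_prompt) for m in models]
        aggregated = aggregate(responses)
    return aggregated
```

In the actual template, the per-round system instructions and the aggregation strategy are configurable rather than hard-coded as above.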

## OpenRouter Clients

We implement two OpenRouter clients for interacting with the OpenRouter API: a standard sync client and an asynchronous client.
Binary file added src/cl_pipeline.png
