Current version: 0.0.35
- Support for Anthropic's updated Claude 3.5 Sonnet released today: `claude-3.5-sonnet` now points to version `claude-3-5-sonnet-latest`.
> **Caution:** This release has breaking changes! Please read the changelog carefully.
- New supported models `gemma-2-9b`, `llama-3.2-1b`, and `llama-3.2-3b` via Groq.
- In order to be more consistent with l2m2's naming scheme, the following model ids have been updated:
  - `llama3-8b` → `llama-3-8b`
  - `llama3-70b` → `llama-3-70b`
  - `llama3.1-8b` → `llama-3.1-8b`
  - `llama3.1-70b` → `llama-3.1-70b`
  - `llama3.1-405b` → `llama-3.1-405b`
- This is a breaking change!!! Calls using the old `model_id`s (`llama3-8b`, etc.) will fail.
- Provider `octoai` has been removed, as they have been acquired and are shutting down their cloud platform. This is a breaking change!!! Calls using the `octoai` provider will fail.
  - All previously OctoAI-supported models (`mixtral-8x22b`, `mixtral-8x7b`, `mistral-7b`, `llama-3-70b`, `llama-3.1-8b`, `llama-3.1-70b`, and `llama-3.1-405b`) are still available via Mistral, Groq, and/or Replicate.
- Updated the gpt-4o version from `gpt-4o-2024-05-13` to `gpt-4o-2024-08-06`.
- Mistral provider support via La Plateforme.
- Mistral Large 2 model availability from Mistral.
- Mistral 7B, Mixtral 8x7B, and Mixtral 8x22B model availability from Mistral, in addition to existing providers.
- 0.0.30 and 0.0.31 are skipped due to a packaging error and a model key typo.
> **Caution:** This release has breaking changes! Please read the changelog carefully.
- `alt_memory` and `bypass_memory` have been added as parameters to `call` and `call_custom` in `LLMClient` and `AsyncLLMClient`. These parameters allow you to specify an alternative memory stream to use for a single call, or to bypass memory entirely.
- Previously, the `LLMClient` and `AsyncLLMClient` constructors took `memory_type`, `memory_window_size`, and `memory_loading_type` as arguments. Now they just take `memory` as an argument, while `window_size` and `loading_type` can be set on the memory object itself. These changes make the memory API far more consistent and easier to use, especially with the additions of `alt_memory` and `bypass_memory`.
- The `MemoryType` enum has been removed. This is a breaking change!!! Instances of `client = LLMClient(memory_type=MemoryType.CHAT)` should be replaced with `client = LLMClient(memory=ChatMemory())`, and so on.
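The memory changes above can be sketched as follows. Note the `ChatMemory` and `LLMClient` classes below are minimal stand-ins so the snippet runs without l2m2 installed; only the parameter and method names come from this changelog, and the semantics shown for `alt_memory`/`bypass_memory` are an illustrative reading of it.

```python
class ChatMemory:
    # Minimal stand-in: per this changelog, window_size is now configured on
    # the memory object rather than via the old memory_window_size constructor arg.
    def __init__(self, window_size=None):
        self.window_size = window_size
        self.messages = []

class LLMClient:
    # Hypothetical stand-in for l2m2's LLMClient.
    def __init__(self, memory=None):
        # Old (removed): LLMClient(memory_type=MemoryType.CHAT)
        # New: pass a configured memory object directly.
        self.memory = memory

    def call(self, *, prompt, alt_memory=None, bypass_memory=False):
        if bypass_memory:
            mem = None                       # this call touches no memory at all
        else:
            mem = alt_memory or self.memory  # per-call override, else the default
        if mem is not None:
            mem.messages.append(prompt)
        return "response"

default_mem, side_mem = ChatMemory(window_size=10), ChatMemory()
client = LLMClient(memory=default_mem)
client.call(prompt="a")                       # recorded in default_mem
client.call(prompt="b", alt_memory=side_mem)  # recorded in side_mem only
client.call(prompt="c", bypass_memory=True)   # recorded nowhere
```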
- Providers can now be activated by default via the following environment variables:
  - `OPENAI_API_KEY` for OpenAI
  - `ANTHROPIC_API_KEY` for Anthropic
  - `CO_API_KEY` for Cohere
  - `GOOGLE_API_KEY` for Google
  - `GROQ_API_KEY` for Groq
  - `REPLICATE_API_TOKEN` for Replicate
  - `OCTOAI_TOKEN` for OctoAI
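The default-activation rule can be sketched with a small helper. The variable-to-provider mapping comes from this changelog; the `active_providers` function itself is illustrative and not part of l2m2's API.

```python
import os

# Environment variable per provider, per this changelog.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "cohere": "CO_API_KEY",
    "google": "GOOGLE_API_KEY",
    "groq": "GROQ_API_KEY",
    "replicate": "REPLICATE_API_TOKEN",
    "octoai": "OCTOAI_TOKEN",
}

def active_providers(env=None):
    """Return the providers whose key variable is set (hypothetical helper)."""
    env = os.environ if env is None else env
    return [provider for provider, var in PROVIDER_ENV_VARS.items() if env.get(var)]
```

For example, `active_providers({"OPENAI_API_KEY": "sk-..."})` returns `["openai"]`.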
- OctoAI provider support.
- Llama 3.1 availability, in sizes 8B (via OctoAI), 70B (via OctoAI), and 405B (via both OctoAI and Replicate).
- Mistral 7B and Mixtral 8x22B via OctoAI.
- `LLMOperationError` exception, raised when a feature or mode is not supported by a particular model.
- Rate limit errors would sometimes give the model id as `None` in the error message. This has been fixed.
- GPT-4o-mini availability.
- Custom exception `LLMRateLimitError`, raised when an LLM call returns a 429 status code.
- The ability to specify a custom timeout for LLM calls by passing a `timeout` argument to `call` or `call_custom` (defaults to 10 seconds).
- A custom exception `LLMTimeoutError`, which is raised when an LLM call times out, along with a more helpful message than httpx's default timeout error.
- Calls to Anthropic with large context windows were sometimes timing out, prompting this change.
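The timeout behavior described above can be sketched like this. `FakeClient` is a stand-in so the pattern runs without l2m2 or a network; only the `timeout` parameter and the `LLMTimeoutError` name come from this changelog.

```python
class LLMTimeoutError(Exception):
    """Stand-in for l2m2's LLMTimeoutError."""

class FakeClient:
    # Hypothetical stand-in for LLMClient; the real client calls out via httpx.
    def call(self, *, model, prompt, timeout=10):
        # Simulate a slow provider: any sub-second timeout always trips.
        if timeout < 1:
            raise LLMTimeoutError(
                f"Call to {model} timed out after {timeout}s; "
                "consider passing a larger timeout argument."
            )
        return "response"

client = FakeClient()
try:
    result = client.call(model="claude-3.5-sonnet", prompt="hi", timeout=0)
except LLMTimeoutError:
    result = "timed out"
```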
- Major bug where l2m2 would cause environments without `typing_extensions` installed to crash, due to it not being listed as an external dependency. This has been fixed by adding `typing_extensions` as an external dependency.
  - This bug wasn't caught because integration tests were not running in a clean environment (i.e., `typing_extensions` was already installed from one of the dev dependencies). To prevent this from happening again, I made `make itest` uninstall all Python dependencies before running.
- In 0.0.21, async calls were blocking due to the use of `requests`. 0.0.22 replaces `requests` with `httpx` to allow for fully asynchronous behavior.
- `AsyncLLMClient` should now be instantiated with a context manager (`async with AsyncLLMClient() as client:`) to ensure proper cleanup of the `httpx` client.
- In `AsyncLLMClient`, `call_async` and `call_custom_async` have been renamed to `call` and `call_custom` respectively, with asynchronous behavior.
- `call_concurrent` and `call_custom_concurrent` have been removed due to unnecessary complexity and lack of use.
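The context-manager usage can be sketched as below. The `AsyncLLMClient` here is a minimal runnable stand-in; per this changelog, the real client opens an `httpx.AsyncClient` on entry and closes it on exit, which is why the `async with` form is required.

```python
import asyncio

class AsyncLLMClient:
    # Hypothetical stand-in for l2m2's AsyncLLMClient.
    async def __aenter__(self):
        self._open = True   # the real client creates an httpx.AsyncClient here
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self._open = False  # ...and closes it here, ensuring proper cleanup

    async def call(self, *, model, prompt):
        # call (formerly call_async) is awaited directly.
        assert self._open, "use 'async with AsyncLLMClient() as client:'"
        return f"[{model}] {prompt}"

async def main():
    async with AsyncLLMClient() as client:
        return await client.call(model="gpt-4o", prompt="hello")

result = asyncio.run(main())
```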
- This changelog (finally – oops)
- Support for Anthropic's Claude 3.5 Sonnet released today
- L2M2 is now fully HTTP based with no external dependencies, taking the total recursive dependency count from ~60 to 0 and massively simplifying the unit test suite.
- Non-native JSON mode strategy now defaults to `prepend` for Anthropic models and `strip` for all others.