Merge branch 'main' of github.com:oramasearch/orama-core
micheleriva committed Jan 31, 2025
2 parents 593f451 + b3fa984 commit f2fa514
Showing 5 changed files with 161 additions and 2 deletions.
120 changes: 120 additions & 0 deletions docs/content/docs/api-key.mdx
@@ -0,0 +1,120 @@
---
title: "API Keys"
description: "API keys are used to authenticate requests to the OramaCore API"
---
import { Tab, Tabs } from 'fumadocs-ui/components/tabs';

[As explained in the introduction](/docs#write-side-read-side), OramaCore is split into two sides: the **reader side** and the **writer side**.

Therefore, depending on the operation you want to perform, you will need to use different API keys.

In total, OramaCore gives you access to three different kinds of API keys:

## Master API Key

<Callout type='warn'>
**Not safe to share**. Never share the master API key publicly. Treat it as a password.
</Callout>

The **master API key** allows you to configure the OramaCore instance and to create or delete collections.

It's configurable via the [`config.yaml`](/docs/configuration) file under the `writer_side` section:

```yaml
# ...
writer_side:
    master_api_key: foobar
# ...
```
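
Because the master API key should be treated like a password, you may prefer a random value over a placeholder such as `foobar`. The following is a minimal sketch, assuming a Unix-like shell where `/dev/urandom`, `head -c`, `od`, and `tr` are available. The function name `generate_api_key` is hypothetical and not part of OramaCore.

```bash
# Hypothetical helper (not part of OramaCore): prints a random 64-character hex string.
generate_api_key() {
  # Read 32 random bytes and render them as lowercase hex without separators.
  head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
  echo
}
```

You could then run `generate_api_key` and paste its output as the value of `master_api_key` in `config.yaml`.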

You will need this API key to:

- Create a new collection
- Delete a collection
- Update the configuration of a collection

In all the cases above, you will need to pass the master API key in the `Authorization` header of the request as a `Bearer` token.

Example:

<Tabs groupId='create' persist items={['cURL']}>
```bash tab="cURL"
curl -X POST \
http://localhost:8080/v0/collections \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <master-api-key>' \
-d '{
"id": "products",
"read_api_key": "my-read-api-key",
"write_api_key": "my-write-api-key"
}'
```
</Tabs>

As you can see, when creating a new collection, you will need to provide the `read_api_key` and `write_api_key` as well. You will then use them to perform read and write operations on the collection.
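
If you script collection creation, the request body from the cURL example above can be assembled from variables. This is a sketch only: it assumes the collection ID and keys contain no characters that need JSON escaping (such as quotes or backslashes), and `collection_payload` is a hypothetical helper name.

```bash
# Hypothetical helper: prints the JSON body for a "create collection" request.
# Assumes the arguments need no JSON escaping.
collection_payload() {
  # $1 = collection ID, $2 = read API key, $3 = write API key
  printf '{"id":"%s","read_api_key":"%s","write_api_key":"%s"}\n' "$1" "$2" "$3"
}
```

The result can be passed to cURL with `-d "$(collection_payload products my-read-api-key my-write-api-key)"`.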

## Write API Key

<Callout type='warn'>
**Not safe to share**. Never share the write API key publicly. Treat it as a password.
</Callout>

The **write API key** is used to insert, update, or delete documents in a collection, as well as to create new [hooks](/docs/javascript-hooks/introduction) and [actions](/docs/party-planner#writing-custom-actions).

Every collection has its own write API key, which you define when you create the collection.

You will need this API key to:

- Insert one or more documents into a collection
- Update one or more documents in a collection
- Delete one or more documents from a collection
- Create a new hook or action

In all the cases above, you will need to pass the write API key in the `Authorization` header of the request as a `Bearer` token.

Example:

<Tabs groupId='insert' persist items={['cURL']}>
```bash tab="cURL"
curl -X PATCH \
http://localhost:8080/v0/collections/{COLLECTION_ID}/documents \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <write-api-key>' \
-d '[
{
"title": "My first document",
"content": "This is the content of my first document."
}
]'
```
</Tabs>
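
Since the write API key should not be shared, one option is to keep it out of scripts and read it from the environment instead. A minimal sketch, assuming the key is stored in an environment variable named `ORAMACORE_WRITE_API_KEY`. Both that variable name and the `auth_header` helper are hypothetical, not something OramaCore defines.

```bash
# Hypothetical helper: builds the Authorization header from an environment variable.
auth_header() {
  # Fail early if the key is missing, so no request is sent without credentials.
  if [ -z "${ORAMACORE_WRITE_API_KEY:-}" ]; then
    echo "ORAMACORE_WRITE_API_KEY is not set" >&2
    return 1
  fi
  printf 'Authorization: Bearer %s\n' "$ORAMACORE_WRITE_API_KEY"
}
```

In the cURL example above, the hard-coded header would then become `-H "$(auth_header)"`.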

## Read API Key

<Callout type='info'>
**Safe to share**. This API key performs read operations only. You can share it publicly.
</Callout>

The **read API key** is used to perform read operations on a collection.

Every collection has its own read API key, which you define when you create the collection.

You will need this API key to:

- Perform full-text, hybrid, or vector search
- Read the documents in a collection
- Perform answer sessions

In all the cases above, you will need to pass the read API key as the `api_key` query parameter in the request.

Example:

<Tabs groupId='search' persist items={['cURL']}>
```bash tab="cURL"
curl -X POST \
'http://localhost:8080/v0/collections/{COLLECTION_ID}/search?api_key=<read-api-key>' \
-H 'Content-Type: application/json' \
-d '{ "term": "How to insert documents in OramaCore?" }'
```
</Tabs>
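
The three key types and how each one is sent can be summarized in a small lookup. This is an illustrative sketch only: the operation labels (such as `create-collection`) are hypothetical names chosen for this example and do not correspond to OramaCore endpoints or CLI commands.

```bash
# Hypothetical lookup: prints which API key an operation needs and how it is sent.
key_for_operation() {
  case "$1" in
    create-collection|delete-collection|update-collection-config)
      echo "master: Authorization Bearer header" ;;
    insert-documents|update-documents|delete-documents|create-hook|create-action)
      echo "write: Authorization Bearer header" ;;
    search|read-documents|answer-session)
      echo "read: api_key query parameter" ;;
    *)
      # Unknown labels are reported on stderr and signalled with a non-zero status.
      echo "unknown operation: $1" >&2
      return 1 ;;
  esac
}
```

For example, `key_for_operation search` reports that the read API key is needed and that it goes in the `api_key` query parameter.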

1 change: 0 additions & 1 deletion docs/content/docs/apis/create-collection.mdx
@@ -22,7 +22,6 @@ Creating a new collection is as simple as:
```bash tab="cURL"
curl -X POST \
http://localhost:8080/v0/collections \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <api-key>' \
-d '{
"id": "products",
3 changes: 3 additions & 0 deletions docs/content/docs/configuration.mdx
@@ -23,6 +23,8 @@ http:

writer_side:
    output: in-memory
    # Replace the following value with your own API key
    master_api_key: foobar
    config:
        data_dir: ./.data/writer
        # The maximum number of embeddings that can be stored in the queue
@@ -77,6 +79,7 @@ The `http` section configures the HTTP server that serves the OramaCore API. Her
The `writer_side` section configures the writer side of OramaCore. Here are the available options:

- `output`: The output where the writer side will store the data. By default, it's set to `in-memory`.
- `master_api_key`: The master API key used to authenticate the requests to the writer side. By default, it's set to an empty string. See more about the available API keys in the [API Keys](/docs/api-key) section.
- `config`: The configuration options for the writer side. Here are the available options:
    - `data_dir`: The directory where the writer side will persist the data on disk. By default, it's set to `./.data/writer`.
    - `embedding_queue_limit`: The maximum number of embeddings that can be stored in the queue before the writer starts to be blocked. By default, it's set to `50000`.
38 changes: 37 additions & 1 deletion docs/content/docs/index.mdx
@@ -53,4 +53,40 @@ When building OramaCore, we made a deliberate choice to create an opinionated sy

There are plenty of great vector databases and full-text search engines out there. But most of them don't work seamlessly together out of the box—they often require extensive fine-tuning to arrive at a functional solution.

Our goal is to provide you with a platform that's ready to go the moment you pull a single Docker image.

## Write Side, Read Side

OramaCore is a modular system. It can run as a monolith, where all the components run in a single process, or as a distributed system, where you can scale each component independently.

To allow this, we split the system into two distinct sides: the **write side** and the **read side**.

If you're running OramaCore in a single node, you won't notice the difference. But if you're running it in a distributed system, you can scale the write side independently from the read side.

### Write Side

The write side is responsible for ingesting data, generating embeddings, and storing them in the vector database. It's also responsible for generating the full-text search index.

It's the part of the system that requires the most GPU power and memory, as it needs to generate a lot of content, embeddings, and indexes.

In detail, the write side is responsible for:

- **Ingesting data**. It creates a buffer of documents and flushes them to the vector database and the full-text search index, rebuilding the immutable data structures used for search.
- **Generating embeddings**. It generates text embeddings for large datasets without interfering with the search performance.
- **Expanding content (coming soon)**. It is capable of reading images, code blocks, and other types of content, and generating descriptions and metadata for them.

Every insertion, deletion, or update of a document will be handled by the write side.

### Read Side

The read side is responsible for handling queries, searching for documents, and returning the results to the user.

It's also the home of the Answer Engine, which is responsible for generating answers to questions and performing chains of actions based on the user's input.

In detail, the read side is responsible for:

- **Handling queries**. It receives the user's query, translates it into a query that the vector database can understand, and returns the results.
- **Searching for documents**. It searches for documents in the full-text search index and the vector database.
- **Answer Engine**. It generates answers to questions, performs chains of actions, and runs custom agents.

Every query, question, or action will be handled by the read side.
1 change: 1 addition & 0 deletions docs/content/docs/meta.json
@@ -5,6 +5,7 @@
"pages": [
"---Getting Started---",
"index",
"api-key",
"configuration",
"running-oramacore",
"apis",