Merge branch 'main' of github.com:oramasearch/orama-core

oramasearch · Jan 31, 2025 · f2fa514
2 parents 593f451 + b3fa984
commit f2fa514
Showing 5 changed files with 161 additions and 2 deletions.
diff --git a/docs/content/docs/api-key.mdx b/docs/content/docs/api-key.mdx
@@ -0,0 +1,120 @@
+---
+title: "API Keys"
+description: "API keys are used to authenticate requests to the OramaCore API"
+---
+import { Tab, Tabs } from 'fumadocs-ui/components/tabs';
+
+[As explained in the introduction](/docs#write-side-read-side), OramaCore is split into two sides: the **read side** and the **write side**.
+
+Therefore, depending on the operation you want to perform, you will need to use different API keys.
+
+In total, OramaCore will give you access to three different kinds of API keys:
+
+## Master API Key
+
+<Callout type='warn'>
+**Not safe to share**. Never share the master API key publicly. Treat it as a password.
+</Callout>
+
+The **master API key** is an essential key that allows you to configure the OramaCore instance and to create or delete collections.
+
+It's configurable via the [`config.yaml`](/docs/configuration) file under the `writer_side` section:
+
+```yaml
+# ...
+writer_side:
+    master_api_key: foobar
+# ...
+```
+
+You will need this API key to:
+
+- Create a new collection
+- Delete a collection
+- Update the configuration of a collection
+
+In all the cases above, you will need to pass the master API key in the `Authorization` header of the request as a `Bearer` token.
+
+Example:
+
+<Tabs groupId='create' persist items={['cURL']}>
+```bash tab="cURL"
+curl -X POST \
+  http://localhost:8080/v0/collections \
+  -H 'Content-Type: application/json' \
+  -H 'Authorization: Bearer <master-api-key>' \
+  -d '{ 
+    "id": "products",
+    "read_api_key": "my-read-api-key",
+    "write_api_key": "my-write-api-key"
+  }'
+```
+</Tabs>
+
+As you can see, when creating a new collection, you also need to provide the `read_api_key` and `write_api_key`. You will then use them to perform read and write operations on the collection.
+
+## Write API Key
+
+<Callout type='warn'>
+**Not safe to share**. Never share the write API key publicly. Treat it as a password.
+</Callout>
+
+The **write API key** is used to insert, update, or delete documents in a collection, as well as to create new [hooks](/docs/javascript-hooks/introduction) and [actions](/docs/party-planner#writing-custom-actions).
+
+Every collection has its own write API key, which you set when you create the collection.
+
+You will need this API key to:
+
+- Insert one or more documents into a collection
+- Update one or more documents in a collection
+- Delete one or more documents from a collection
+- Create a new hook or action
+
+In all the cases above, you will need to pass the write API key in the `Authorization` header of the request as a `Bearer` token.
+
+Example:
+
+<Tabs groupId='insert' persist items={['cURL']}>
+```bash tab="cURL"
+curl -X PATCH \
+  http://localhost:8080/v0/collections/{COLLECTION_ID}/documents \
+  -H 'Content-Type: application/json' \
+  -H 'Authorization: Bearer <write-api-key>' \
+  -d '[
+    {
+      "title": "My first document",
+      "content": "This is the content of my first document."
+    }
+  ]'
+```
+</Tabs>
+
+## Read API Key
+
+<Callout type='info'>
+**Safe to share**. This API key performs read operations only. You can share it publicly.
+</Callout>
+
+The **read API key** is used to perform read operations on a collection.
+
+Every collection has its own read API key, which you set when you create the collection.
+
+You will need this API key to:
+
+- Perform full-text, hybrid, or vector search
+- Read the documents in a collection
+- Perform answer sessions
+
+In all the cases above, you will need to pass the read API key as a query parameter in the request.
+
+Example:
+
+<Tabs groupId='search' persist items={['cURL']}>
+```bash tab="cURL"
+curl -X POST \
+  'http://localhost:8080/v0/collections/{COLLECTION_ID}/search?api_key=<read-api-key>' \
+  -H 'Content-Type: application/json' \
+  -d '{ "term": "How to insert documents in OramaCore?" }'
+```
+</Tabs>
+
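The three key types documented in `api-key.mdx` differ mainly in where the key travels: the master and write keys go in the `Authorization` header as a `Bearer` token, while the read key goes in the `api_key` query parameter. A minimal Python sketch of that rule follows. It only builds the requests and does not send them. The function names are hypothetical, and the base URL simply mirrors the cURL examples.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode

# Assumption: same host and port as the cURL examples in the docs.
BASE_URL = "http://localhost:8080"


def _json_request(url, method, payload, bearer=None):
    """Build (but do not send) a JSON request, optionally with a Bearer token."""
    headers = {"Content-Type": "application/json"}
    if bearer is not None:
        headers["Authorization"] = f"Bearer {bearer}"
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=data, method=method, headers=headers)


def create_collection_request(master_api_key, collection_id, read_api_key, write_api_key):
    """Master API key: Bearer token in the Authorization header."""
    payload = {
        "id": collection_id,
        "read_api_key": read_api_key,
        "write_api_key": write_api_key,
    }
    return _json_request(f"{BASE_URL}/v0/collections", "POST", payload, bearer=master_api_key)


def insert_documents_request(write_api_key, collection_id, documents):
    """Write API key: Bearer token in the Authorization header."""
    url = f"{BASE_URL}/v0/collections/{quote(collection_id, safe='')}/documents"
    return _json_request(url, "PATCH", documents, bearer=write_api_key)


def search_request(read_api_key, collection_id, term):
    """Read API key: passed as the api_key query parameter, no Authorization header."""
    query = urlencode({"api_key": read_api_key})
    url = f"{BASE_URL}/v0/collections/{quote(collection_id, safe='')}/search?{query}"
    return _json_request(url, "POST", {"term": term})
```

To actually send one of these, pass the returned object to `urllib.request.urlopen` against a running instance.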
diff --git a/docs/content/docs/apis/create-collection.mdx b/docs/content/docs/apis/create-collection.mdx
@@ -22,7 +22,6 @@ Creating a new collection is as simple as:
 ```bash tab="cURL"
 curl -X POST \
   http://localhost:8080/v0/collections \
-  -H 'Content-Type: application/json' \
   -H 'Authorization: Bearer <api-key>' \
   -d '{
     "id": "products",

diff --git a/docs/content/docs/configuration.mdx b/docs/content/docs/configuration.mdx
@@ -23,6 +23,8 @@ http:
 
 writer_side:
     output: in-memory
+    # Replace the following value with your own API key
+    master_api_key: foobar
     config:
         data_dir: ./.data/writer
         # The maximum number of embeddings that can be stored in the queue
@@ -77,6 +79,7 @@ The `http` section configures the HTTP server that serves the OramaCore API. Her
 The `writer_side` section configures the writer side of OramaCore. Here are the available options:
 
 - `output`: The output where the writer side will store the data. By default, it's set to `in-memory`.
+- `master_api_key`: The master API key used to authenticate the requests to the writer side. By default, it's set to an empty string. See more about the available API keys in the [API Keys](/docs/api-key) section.
 - `config`: The configuration options for the writer side. Here are the available options:
   - `data_dir`: The directory where the writer side will persist the data on disk. By default, it's set to `./.data/writer`.
   - `embedding_queue_limit`: The maximum number of embeddings that can be stored in the queue before the writer starts to be blocked. By default, it's set to `50000`.

diff --git a/docs/content/docs/index.mdx b/docs/content/docs/index.mdx
@@ -53,4 +53,40 @@ When building OramaCore, we made a deliberate choice to create an opinionated sy
 
 There are plenty of great vector databases and full-text search engines out there. But most of them don't work seamlessly together out of the box—they often require extensive fine-tuning to arrive at a functional solution.
 
-Our goal is to provide you with a platform that's ready to go the moment you pull a single Docker file.
+Our goal is to provide you with a platform that's ready to go the moment you pull a single Docker file.
+
+## Write Side, Read Side
+
+OramaCore is a modular system. We allow it to run as a monolith - where all the components are running in a single process - or as a distributed system, where you can scale each component independently.
+
+To allow this, we split the system into two distinct sides: the **write side** and the **read side**.
+
+If you're running OramaCore in a single node, you won't notice the difference. But if you're running it in a distributed system, you can scale the write side independently from the read side.
+
+### Write Side
+
+The write side is responsible for ingesting data, generating embeddings, and storing them in the vector database. It's also responsible for generating the full-text search index.
+
+It's the part of the system that requires the most GPU power and memory, as it needs to generate a lot of content, embeddings, and indexes.
+
+In detail, the write side is responsible for:
+
+- **Ingesting data**. It creates a buffer of documents and flushes them to the vector database and the full-text search index, rebuilding the immutable data structures used for search.
+- **Generating embeddings**. It generates text embeddings for large datasets without interfering with the search performance.
+- **Expanding content (coming soon)**. It is capable of reading images, code blocks, and other types of content, and generating descriptions and metadata for them.
+
+Every insertion, deletion, or update of a document will be handled by the write side.
+
+### Read Side
+
+The read side is responsible for handling queries, searching for documents, and returning the results to the user.
+
+It's also the home of the Answer Engine, which is responsible for generating answers to questions and performing chains of actions based on the user's input.
+
+In detail, the read side is responsible for:
+
+- **Handling queries**. It receives the user's query, translates it into a query that the vector database can understand, and returns the results.
+- **Searching for documents**. It searches for documents in the full-text search index and the vector database.
+- **Answer Engine**. It generates answers to questions, performs chains of actions, and runs custom agents.
+
+Every query, question, or action will be handled by the read side.
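In a distributed deployment, the split described in `index.mdx` means a client has to decide which side receives each call. The sketch below illustrates that routing rule in Python under stated assumptions: the operation labels and the function name are hypothetical and not part of OramaCore, and the mapping only restates the two "will be handled by" sentences above.

```python
from enum import Enum


class Side(Enum):
    WRITE = "write"
    READ = "read"


# Hypothetical operation labels; the mapping follows the docs text:
# insertions, deletions, and updates go to the write side,
# queries, questions, and actions go to the read side.
SIDE_BY_OPERATION = {
    "insert": Side.WRITE,
    "update": Side.WRITE,
    "delete": Side.WRITE,
    "search": Side.READ,
    "answer": Side.READ,
}


def base_url_for(operation: str, write_url: str, read_url: str) -> str:
    """Return the base URL of the side that handles the given operation."""
    side = SIDE_BY_OPERATION[operation]  # raises KeyError for unknown labels
    return write_url if side is Side.WRITE else read_url
```

On a single node both URLs would be the same, which matches the note that you won't notice the difference in that setup.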
diff --git a/docs/content/docs/meta.json b/docs/content/docs/meta.json
@@ -5,6 +5,7 @@
   "pages": [
     "---Getting Started---",
     "index",
+    "api-key",
     "configuration",
     "running-oramacore",
     "apis",