Merge pull request #285 from biblioverse/ai-context
Refactor AI context handling and update
SergioMendolia authored Dec 30, 2024
2 parents f05fad5 + 2387bd0 commit 5c5a931
Showing 11 changed files with 148 additions and 51 deletions.
3 changes: 3 additions & 0 deletions config/services.yaml
@@ -39,6 +39,9 @@ parameters:
AI_CONTEXT_AMAZON_ENABLED: '%env(bool:AI_CONTEXT_AMAZON_ENABLED)%'
env(AI_CONTEXT_AMAZON_ENABLED): false

AI_CONTEXT_FULL_EPUB: '%env(bool:AI_CONTEXT_FULL_EPUB)%'
env(AI_CONTEXT_FULL_EPUB): false


services:
# default configuration for services in *this* file
@@ -91,16 +91,23 @@ On a book page, you can click on the "generate summary" or "generate tags" butt
## Run it through your whole library
You can run the following command:
```
docker compose exec biblioteca bin/console books:tag
docker compose exec biblioteca bin/console books:ai <tags|summary|both>
```
Depending on the type you pass, it processes every book that currently has no tags, no summary, or is missing either one.

If you want to use a user's configured prompts:

```
docker compose exec biblioteca bin/console books:tag <userid>
docker compose exec biblioteca bin/console books:ai <tags|summary|both> <userid>
```

If you want to use it on a specific book:

```
docker compose exec biblioteca bin/console books:ai <tags|summary|both> -b <book-id>
```


## Add context
There are currently 2 ways to add context for better results. Both can be enabled at the same time.

10 changes: 2 additions & 8 deletions doc/src/content/docs/guides/Administrator/commands.md
@@ -26,15 +26,9 @@ Relocate all books to their calculated folder. This is necessary only if you wan
## `books:scan`
Will scan the `public/books` folder and add all books to the database. If a book already exists, it will be updated.

## `books:tag`
## `books:ai`

> First you will need to setup an openAi chatgpt key in a user's properties and run the command as this user
Then you can run the command to generate tag for all books in the library:

```bash
bin/console books:tag <user_id>
```
Check the documentation for this in the AI chapter

## `cache:clear`
Clears the cache
17 changes: 3 additions & 14 deletions src/Ai/CommunicatorDefiner.php
@@ -7,20 +7,9 @@
class CommunicatorDefiner
{
public const BASE_PROMPT = "
As a highly skilled and experienced librarian AI model, I'm here to provide you with deep insights and practical recommendations
based on my vast experience in the field of literature and knowledge organization. Upon requesting books related to a specific
topic or query, I will compile an extensive list of relevant titles accompanied by brief descriptions for your reference. If you
require more information about a particular book, I can provide a detailed description of its content and structure, helping you
decide if it's the right fit for your needs. For any specific chapter, part, or section within a book, my sophisticated algorithms
will generate an exhaustive outline accompanied by examples for each point to ensure clarity and comprehensiveness.
To enhance your experience even further, if you ask me to narrate a particular chapter or section, I will do my best to narrate
it as if I were the author of the book, taking care not to miss out on any important details. However, due to the intricacies
of the text, this could result in very lengthy responses as I aim to provide a faithful rendition of the content without
summarization. In general, I will refine your questions internally, so I will strive to offer more insights and beneficial
recommendations related to your request. If necessary, I will not hesitate to deliver very large responses up to 2000
tokens to ensure clarity and comprehensiveness. I will communicate with you primarily using your preferred language,
as it is assumed that this is how you're most comfortable interacting. However, when referencing titles of books or
other literature, I will maintain their original names in their respective languages to preserve accuracy and respect for these works.";
As a highly skilled and experienced librarian AI model, I'm here to help you tag and summarize books as close to the
original as possible. I will never make up any information. I will only use the information you provide me.
I will communicate with you primarily using your preferred language.";

/**
* @param iterable<AiCommunicatorInterface> $handlers
11 changes: 7 additions & 4 deletions src/Ai/Context/ContextBuilder.php
@@ -18,15 +18,18 @@ public function __construct(

public function getContext(AbstractBookPrompt $abstractBookPrompt): AbstractBookPrompt
{
$prompt = "Use the following pieces of context to answer the question at the end. If you don't know the answer, don't try to make up an answer.";

$prompt = "Use the following pieces of context to answer the query at the end. If you don't know the answer, don't try to make up an answer. Context:
---------------------
";
foreach ($this->handlers as $handler) {
if ($handler->isEnabled()) {
$prompt .= $handler->getContextForPrompt($abstractBookPrompt->getBook());
}
}

$prompt .= $abstractBookPrompt->getPrompt();
$prompt .= '
---------------------
Given the context information and not prior knowledge, answer the query.
Query: '.$abstractBookPrompt->getPrompt();

$abstractBookPrompt->setPrompt($abstractBookPrompt->replaceBookOccurrence($prompt));
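The layout that `getContext()` now produces can be sketched outside the class. This is a minimal illustration, not the project's code: `assemblePrompt()` and its parameters are hypothetical names, and the context chunks stand in for whatever each enabled handler returns.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-alone version of the prompt layout used in the diff.
 *
 * @param list<string> $contextChunks text returned by each enabled handler
 */
function assemblePrompt(array $contextChunks, string $query): string
{
    $separator = '---------------------';

    // Instruction first, then the fenced context, then the query last.
    return "Use the following pieces of context to answer the query at the end. "
        ."If you don't know the answer, don't try to make up an answer. Context:\n"
        .$separator."\n"
        .implode('', $contextChunks)."\n"
        .$separator."\n"
        ."Given the context information and not prior knowledge, answer the query.\n"
        .'Query: '.$query;
}

echo assemblePrompt(['Blurb: a desert planet epic.'], 'Give five tags for this book.'), PHP_EOL;
```

Putting the query after the context block keeps the actual request as the last thing the model reads, which seems to be the intent of the change.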

56 changes: 56 additions & 0 deletions src/Ai/Context/EpubContextBuilder.php
@@ -0,0 +1,56 @@
<?php

namespace App\Ai\Context;

use App\Entity\Book;
use App\Service\BookFileSystemManagerInterface;
use Kiwilan\Ebook\Ebook;
use Kiwilan\Ebook\Formats\Epub\EpubModule;
use Symfony\Component\DependencyInjection\Attribute\Autowire;

class EpubContextBuilder implements ContextBuildingInteface
{
public function __construct(private readonly BookFileSystemManagerInterface $bookFileSystemManager, #[Autowire(param: 'AI_CONTEXT_FULL_EPUB')] private readonly bool $enable)
{
}

#[\Override]
public function isEnabled(): bool
{
return $this->enable;
}

#[\Override]
public function getContextForPrompt(Book $book): string
{
if ($book->getExtension() !== 'epub') {
return '';
}
$prompt = '';

$bookFile = $this->bookFileSystemManager->getBookFile($book);

$ebook = Ebook::read($bookFile->getPathname());

if (!$ebook instanceof Ebook) {
return '';
}

$epub = $ebook->getParser()?->getEpub();

if (!$epub instanceof EpubModule) {
return '';
}

$htmlArray = $epub->getHtml();

foreach ($htmlArray as $html) {
$text = $html->getBody();
$prompt .= strip_tags($text ?? '');
}

$prompt = preg_replace("/(^[\r\n]*|[\r\n]+)[\s\t]*[\r\n]+/", '', $prompt);

return preg_replace("/([\n]|[\r])+/", ' ', (string) $prompt) ?? '';
}
}
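One detail of the text clean-up above is worth checking. The first `preg_replace()` substitutes an empty string for blank-line runs, and `strip_tags()` output is concatenated without a separator, so words on either side of a paragraph break appear to be joined together. The sketch below shows one alternative under that assumption. `flattenHtmlToText()` is a hypothetical helper, not part of the commit.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the commit): flattens HTML fragments into
 * one line of plain text, keeping a single space between words.
 *
 * @param list<string|null> $htmlBodies
 */
function flattenHtmlToText(array $htmlBodies): string
{
    $parts = [];
    foreach ($htmlBodies as $body) {
        // Put a space before each tag so "<p>a</p><p>b</p>" does not become "ab".
        $spaced = str_replace('<', ' <', $body ?? '');
        $parts[] = strip_tags($spaced);
    }

    // Collapse every run of whitespace (spaces, tabs, CR, LF) into one space.
    $text = preg_replace('/\s+/', ' ', implode(' ', $parts));

    return trim($text ?? '');
}

echo flattenHtmlToText(["<p>foo</p>\n\n<p>bar</p>", null, '<h1>baz</h1>']), PHP_EOL;
```

Collapsing whitespace in a single pass also removes the need for the second newline-to-space replacement.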
9 changes: 6 additions & 3 deletions src/Ai/OllamaCommunicator.php
@@ -65,6 +65,7 @@ private function getOllamaUrl(string $path): string
#[\Override]
public function initialise(string $basePrompt): void
{
$this->basePrompt = $basePrompt;
$this->sendRequest($this->getOllamaUrl('pull'), [
'model' => $this->model,
], 'POST');
@@ -73,14 +74,16 @@ public function initialise(string $basePrompt): void
#[\Override]
public function interrogate(BookPromptInterface $prompt): string|array
{
$response = $this->sendRequest($this->getOllamaUrl('generate'), [
$params = [
'model' => $this->model,
'prompt' => $prompt->getPrompt(),
'system' => $this->basePrompt,
'options' => [
'temperature' => 0.3,
'temperature' => 0,
],
], 'POST');
];

$response = $this->sendRequest($this->getOllamaUrl('generate'), $params, 'POST');

return $prompt->convertResult($response);
}
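For reference, the request body that `interrogate()` now builds can be reproduced without any network call. This is only a sketch: `buildGeneratePayload()` is a hypothetical helper and `llama3` is a placeholder model name, neither taken from the commit.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper that mirrors the request body built in interrogate().
 *
 * @return array<string, mixed>
 */
function buildGeneratePayload(string $model, string $systemPrompt, string $userPrompt): array
{
    return [
        'model' => $model,
        'prompt' => $userPrompt,
        // The base prompt now travels as the system message.
        'system' => $systemPrompt,
        'options' => [
            // 0 asks the model for the most deterministic output it can give.
            'temperature' => 0,
        ],
    ];
}

// "llama3" is a placeholder model name, not taken from the commit.
echo json_encode(
    buildGeneratePayload('llama3', 'You are a librarian.', 'Give five tags.'),
    JSON_PRETTY_PRINT | JSON_THROW_ON_ERROR
), PHP_EOL;
```

Lowering the temperature from 0.3 to 0 fits the new base prompt, which asks the model not to make anything up.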
7 changes: 6 additions & 1 deletion src/Ai/Prompt/AbstractBookPrompt.php
@@ -33,7 +33,12 @@ public function getBook(): Book

public function replaceBookOccurrence(string $prompt): string
{
$bookString = '"'.$this->book->getTitle().'" by '.implode(' and ', $this->book->getAuthors());
$title = $this->book->getTitle();
if (preg_match('/T\d+/', $title) !== false) {
$bookString = '"'.$this->book->getTitle().'" by '.implode(' and ', $this->book->getAuthors());
} else {
$bookString = ' by '.implode(' and ', $this->book->getAuthors());
}

if ($this->book->getSerie() !== null) {
$bookString .= ' number '.$this->book->getSerieIndex().' in the series "'.$this->book->getSerie().'"';
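A note on the `preg_match()` check in this hunk: the function returns `1` on a match, `0` on no match, and `false` only on error, so comparing against `false` is true for both matching and non-matching titles. The sketch below shows the difference. `hasVolumeMarker()` is a hypothetical helper, and reading `T\d+` as a volume marker such as "T01" is an assumption about the intent.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: true when the title contains a marker such as "T01".
 */
function hasVolumeMarker(string $title): bool
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('/T\d+/', $title) === 1;
}

var_dump(preg_match('/T\d+/', 'Dune') !== false); // bool(true): the check used in the diff
var_dump(hasVolumeMarker('Dune'));                // bool(false)
var_dump(hasVolumeMarker('Asterix T01'));         // bool(true)
```

Which branch should include the title depends on what the author intended, so that part is left as it is in the commit.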
70 changes: 53 additions & 17 deletions src/Command/BooksTagCommand.php → src/Command/BooksAiCommand.php
@@ -5,6 +5,7 @@
use App\Ai\AiCommunicatorInterface;
use App\Ai\CommunicatorDefiner;
use App\Ai\Context\ContextBuilder;
use App\Ai\Prompt\SummaryPrompt;
use App\Ai\Prompt\TagPrompt;
use App\Entity\Book;
use App\Entity\User;
@@ -18,10 +19,10 @@
use Symfony\Component\Console\Style\SymfonyStyle;

#[AsCommand(
name: 'books:tag',
name: 'books:ai',
description: 'Add a short description for your command',
)]
class BooksTagCommand extends Command
class BooksAiCommand extends Command
{
public function __construct(
private readonly EntityManagerInterface $em,
@@ -35,24 +36,34 @@ public function __construct(
protected function configure(): void
{
$this
->addArgument('type', InputArgument::REQUIRED, '`summary`, `tags` or `both`')
->addArgument('userid', InputArgument::OPTIONAL, 'user for the prompts. Default prompts used if not provided')
->addOption('book', 'b', InputArgument::OPTIONAL, 'book id to process (otherwise all books are processed)')
;
}

#[\Override]
protected function execute(InputInterface $input, OutputInterface $output): int
{
$io = new SymfonyStyle($input, $output);
$arg1 = $input->getArgument('userid');
$userId = $input->getArgument('userid');

$type = $input->getArgument('type');
if (!in_array($type, ['summary', 'tags', 'both'], true)) {
$io->error('Invalid type');

return Command::FAILURE;
}
$bookId = $input->getOption('book');

$user = null;
if ($arg1 !== null) {
$user = $this->em->getRepository(User::class)->find($arg1);
if ($userId !== null) {
$user = $this->em->getRepository(User::class)->find($userId);
if (!$user instanceof User) {
$io->error('User not found');
}

return Command::FAILURE;
return Command::FAILURE;
}
}

$communicator = $this->aiCommunicator->getCommunicator();
@@ -63,9 +74,22 @@ protected function execute(InputInterface $input, OutputInterface $output): int
return Command::FAILURE;
}

$qb = $this->em->getRepository(Book::class)->createQueryBuilder('book');
$qb->andWhere('book.tags = \'[]\'');
$books = $qb->getQuery()->getResult();
$io->title('AI data with '.$communicator::class);

if ($bookId === null) {
$io->note('Processing all books without tags or summary');
$qb = $this->em->getRepository(Book::class)->createQueryBuilder('book');
if ($type === 'tags' || $type === 'both') {
$qb->orWhere('book.tags = \'[]\'');
}
if ($type === 'summary' || $type === 'both') {
$qb->orWhere('book.summary is null');
}
$books = $qb->getQuery()->getResult();
} else {
$io->note('Processing book '.$bookId);
$books = $this->em->getRepository(Book::class)->findBy(['id' => $bookId]);
}

if (!is_array($books)) {
$io->error('Failed to get books');
@@ -79,16 +103,28 @@ protected function execute(InputInterface $input, OutputInterface $output): int
foreach ($books as $book) {
$progress->setMessage($book->getSerie().' '.$book->getTitle().' ('.implode(' and ', $book->getAuthors()).')');
$progress->advance();
$tagPrompt = new TagPrompt($book, $user);
$tagPrompt = $this->contextBuilder->getContext($tagPrompt);
$array = $communicator->interrogate($tagPrompt);

if (is_array($array)) {
$io->writeln('🏷️ '.implode(' 🏷️ ', $array));
$book->setTags($array);
if ($type === 'summary' || $type === 'both') {
$summaryPrompt = new SummaryPrompt($book, $user);
$summaryPrompt = $this->contextBuilder->getContext($summaryPrompt);
$summary = $communicator->interrogate($summaryPrompt);
$io->block($summary, padding: true);
$book->setSummary($summary);
}

if ($type === 'tags' || $type === 'both') {
$tagPrompt = new TagPrompt($book, $user);
$tagPrompt = $this->contextBuilder->getContext($tagPrompt);

$array = $communicator->interrogate($tagPrompt);

if (is_array($array)) {
$io->block(implode(' 🏷️ ', $array), padding: true);
$book->setTags($array);
}
}

$this->em->flush();
// $this->em->flush();
}

$progress->finish();
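The book selection performed by the `orWhere()` calls in this command can be restated as a plain predicate, which may make the three modes easier to compare. This is a sketch under assumptions: `needsProcessing()` is a hypothetical function, and it treats an empty array as the equivalent of the stored `'[]'` value that the query compares against.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical in-memory equivalent of the query filter used by books:ai.
 *
 * @param list<string> $tags
 */
function needsProcessing(string $type, array $tags, ?string $summary): bool
{
    $missingTags = $tags === [];
    $missingSummary = $summary === null;

    return match ($type) {
        'tags' => $missingTags,
        'summary' => $missingSummary,
        // "both" selects a book as soon as either field is missing.
        'both' => $missingTags || $missingSummary,
        default => throw new InvalidArgumentException('Invalid type: '.$type),
    };
}

var_dump(needsProcessing('both', ['sci-fi'], null)); // bool(true)
var_dump(needsProcessing('tags', ['sci-fi'], null)); // bool(false)
```

With `both`, a book that already has tags but no summary is still selected, and the loop then regenerates its tags as well as its summary.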
1 change: 1 addition & 0 deletions src/Controller/ConfigurationController.php
@@ -33,6 +33,7 @@ public function index(ParameterBagInterface $parameterBagInterface): Response
'OLLAMA_URL' => 'Url of the Ollama API. do not forget the trailing slash. Example: http://ollama:11434/api/',
'OLLAMA_MODEL' => 'Ollama model to use',
'AI_CONTEXT_AMAZON_ENABLED' => 'Do you want to give context from Amazon to AI completions? This will scrape the Amazon website. Use with caution.',
'AI_CONTEXT_FULL_EPUB' => 'Do you want to give the full epub context to AI completions? This will cause the prompt to be very long and could increase costs when using a paid model.',
'WIKIPEDIA_API_TOKEN' => 'Wikipedia API token. Required if you want to give context from Wikipedia to AI. You will need to generate a personal api token in wikipedia.org',
];
foreach ($documentedParams as $key => $value) {
4 changes: 2 additions & 2 deletions src/Twig/Components/AiSuggestion.php
@@ -84,10 +84,10 @@ public function generate(): void
default => throw new \InvalidArgumentException('Invalid field'),
};

$promptObj = $this->contextBuilder->getContext($promptObj);

$promptObj->setPrompt($this->prompt);

$promptObj = $this->contextBuilder->getContext($promptObj);

$result = $communicator->interrogate($promptObj);

$this->result = is_array($result) ? $result : [$result];