
Implement RAG #92

Open
wants to merge 84 commits into base: main
Changes from all commits (84 commits)
120f9c3
Implement Sentence Transformer pipeline
RHeckerIntel Nov 23, 2024
69f7e0b
Start work on dart implementation of knowledge base
RHeckerIntel Nov 24, 2024
7614008
Merge branch 'fluent-ui-migration' into rhecker/fluent-ui-rag
RHeckerIntel Nov 24, 2024
9124504
Run sentence transformer from flutter ui
RHeckerIntel Nov 24, 2024
a410ab4
Implement basic langchain adapters for openvino
RHeckerIntel Nov 25, 2024
dd6a2d1
Add ObjectBox to store embeddings
RHeckerIntel Nov 25, 2024
38ce503
clean up knowledge base to start working on a UI
RHeckerIntel Nov 26, 2024
15038c8
Add crud for Knowledge Base groups
RHeckerIntel Nov 26, 2024
d563f7e
Try to extract tree.
RHeckerIntel Nov 26, 2024
2f8d7b8
Switch to provider to keep tree data
RHeckerIntel Nov 27, 2024
0bd2c7e
Somewhat working selection mode for the groups
RHeckerIntel Nov 27, 2024
10d0269
Clean up old code with embeddings
RHeckerIntel Nov 27, 2024
b1a21f8
Start document list stuff
RHeckerIntel Nov 27, 2024
46fdccf
Working RAG demo.
RHeckerIntel Nov 28, 2024
a89eadf
Implement streamer for OpenVINOLLM langchain and clean up
RHeckerIntel Nov 28, 2024
3e66bcf
Filter the embeddings based on a group when using ObjectBoxStore
RHeckerIntel Nov 29, 2024
65c9334
Add working document import dialog
RHeckerIntel Dec 2, 2024
69d2450
Merge branch 'main' into rhecker/fluent-ui-rag
RHeckerIntel Dec 3, 2024
fce62b3
Badly working knowledge base selector in llm
RHeckerIntel Dec 4, 2024
43f62b6
Merge branch 'main' into rhecker/fluent-ui-rag
RHeckerIntel Jan 14, 2025
8989b56
Fix streaming response with rag never finishing in ui
RHeckerIntel Jan 15, 2025
92ddc44
Allow simple rag using in memory store
RHeckerIntel Jan 15, 2025
cae4055
Allow deletion of documents from in memory vector
RHeckerIntel Jan 15, 2025
dcf089e
Add loader
RHeckerIntel Jan 15, 2025
3c3ad26
Render tooltip and error on userFile error
RHeckerIntel Jan 15, 2025
4bd1053
Align files to left
RHeckerIntel Jan 15, 2025
9a5bd01
Fix shifting issue with loader
RHeckerIntel Jan 15, 2025
43f03c0
Improve document list for knowledge base
RHeckerIntel Jan 16, 2025
a7a9206
Add sources to show the sources used for the LLM in RAG
RHeckerIntel Jan 16, 2025
3a6b064
Implement JinjaTemplate for chat formatting in langchain
RHeckerIntel Jan 20, 2025
c2005c8
Fix database and improve snipping on document loaders
RHeckerIntel Jan 20, 2025
55edfb5
Remove own textSnipper and use langchain CharacterTextSplitter
RHeckerIntel Jan 21, 2025
0b9db66
Limit documents found in stores
RHeckerIntel Jan 21, 2025
2d3d723
Implement simple RAG UX design
RHeckerIntel Jan 21, 2025
cc65064
Enable performance metrics again
RHeckerIntel Jan 21, 2025
406a7cd
Sort knowledge base button above flow
RHeckerIntel Jan 21, 2025
fca432d
Change new knowledge base group button
RHeckerIntel Jan 21, 2025
76f0d5a
Rework download provider so it's global and can be background
RHeckerIntel Jan 23, 2025
f8f4075
Add download to knowledge base for the rag model
RHeckerIntel Jan 23, 2025
c9d8eb4
Add MiniLM-L6-H384-uncased openvino converted tokenizer to repo
RHeckerIntel Jan 23, 2025
7471bfb
Official converted version of all-MiniLM-L6-v2 uses token_type_ids
RHeckerIntel Jan 23, 2025
1892f1b
Use All Mini LM V6 tokenizer from assets
RHeckerIntel Jan 23, 2025
66e1df9
Remove unused experiment widget for rag
RHeckerIntel Jan 23, 2025
56d7099
Load or download embeddings model in text generation provider
RHeckerIntel Jan 23, 2025
535d2f2
Rename ensureEmbeddingsmodel to make more sense
RHeckerIntel Jan 23, 2025
5495900
Remove scratch
RHeckerIntel Jan 23, 2025
3416936
Disable knowledge base again for test build
RHeckerIntel Jan 23, 2025
fb0c0b9
Fix the missing copyright headers
RHeckerIntel Jan 23, 2025
ab5ea2d
Get tokenizerConfig instead and process that for hardcoded params in …
RHeckerIntel Jan 23, 2025
88efe49
Working RAG on windows. Added podofo, but it's not working due to depe…
RHeckerIntel Jan 23, 2025
a95fe5e
Working podofo for reading pdfs on windows
RHeckerIntel Jan 26, 2025
ce1cdd3
Add docx file loader
RHeckerIntel Jan 27, 2025
4d99589
Add supportedExtensions for RAG
RHeckerIntel Jan 27, 2025
6ee98e9
Start working on knowledge base UX
RHeckerIntel Jan 27, 2025
47c605f
Move knowledge group rename to document list and use dialog
RHeckerIntel Jan 27, 2025
910de5e
Implement pretty table for document list
RHeckerIntel Jan 27, 2025
7c7bf85
Use stream in knowledge group tree
RHeckerIntel Jan 27, 2025
23efe14
Add confirm on delete button for knowledge base
RHeckerIntel Jan 27, 2025
f4e8f4d
Enable advanced RAG
RHeckerIntel Jan 27, 2025
7f5353f
Add copyright to new dialog
RHeckerIntel Jan 27, 2025
f2b1010
Merge branch 'main' into rhecker/fluent-ui-rag
RHeckerIntel Jan 27, 2025
18ad71c
Remove wrap of center in DropArea
RHeckerIntel Jan 27, 2025
8be0bb8
Remove json from required files for license header
RHeckerIntel Jan 27, 2025
a88f665
Blind attempt at adding podofo for linux
RHeckerIntel Jan 27, 2025
8e2cdc4
Fix user_message test
RHeckerIntel Jan 27, 2025
c22998f
Implement pure dart solution for reading pdfs
RHeckerIntel Jan 28, 2025
690769d
Remove podofo dependency
RHeckerIntel Jan 28, 2025
3787f6b
Add header to pdf parser
RHeckerIntel Jan 28, 2025
1e0b039
Fix drop area in knowledge base
RHeckerIntel Jan 28, 2025
7fbef2a
Fix wrong import for pdf loader
RHeckerIntel Jan 28, 2025
d5f55fb
Improve ux in knowledge base
RHeckerIntel Jan 28, 2025
d9762b5
Revert code back in to get extension
RHeckerIntel Jan 28, 2025
fb0049a
Merge branch 'main' into rhecker/fluent-ui-rag
RHeckerIntel Jan 28, 2025
f31b172
Use podofo again
RHeckerIntel Jan 29, 2025
476d977
Remove pdf parser code
RHeckerIntel Jan 29, 2025
69e850c
Add test for UserFileWidget
RHeckerIntel Jan 29, 2025
e4edb5e
Fix issues found by flutter analyze
RHeckerIntel Jan 29, 2025
42d0f82
Add custom jinja template for text generation
RHeckerIntel Jan 31, 2025
62e8881
Fix broken test when using flutter test
RHeckerIntel Feb 2, 2025
7f064d3
Merge branch 'main' into rhecker/fluent-ui-rag
RHeckerIntel Feb 4, 2025
2fd9257
Attempt to fix issue with building model api on macos workflow
RHeckerIntel Feb 5, 2025
3f05072
Merge branch 'main' into rhecker/fluent-ui-rag
RHeckerIntel Feb 11, 2025
f1a783f
Merge branch 'rhecker/fluent-ui-rag' of https://github.com/openvinoto…
RHeckerIntel Feb 11, 2025
f11799e
Merge branch 'main' into rhecker/fluent-ui-rag
RHeckerIntel Feb 11, 2025
1 change: 1 addition & 0 deletions .gitattributes
@@ -0,0 +1 @@
assets/MiniLM-L6-H384-uncased/* filter=lfs diff=lfs merge=lfs -text
1 change: 1 addition & 0 deletions .licenserc.yaml
@@ -15,6 +15,7 @@ header:
- test/**
paths-ignore:
- "**/BUILD"
- "**/*.json"
- "**/*.bzl"
- "**/*.proto"
- lib/interop/generated_bindings.dart
3 changes: 3 additions & 0 deletions assets/MiniLM-L6-H384-uncased/openvino_tokenizer.bin
Git LFS file not shown
3 changes: 3 additions & 0 deletions assets/MiniLM-L6-H384-uncased/openvino_tokenizer.xml
Git LFS file not shown
126 changes: 118 additions & 8 deletions lib/interop/generated_bindings.dart
@@ -151,6 +151,21 @@ class OpenVINO {
late final _freeStatusOrDevices = _freeStatusOrDevicesPtr
.asFunction<void Function(ffi.Pointer<StatusOrDevices>)>();

void freeStatusOrEmbeddings(
ffi.Pointer<StatusOrEmbeddings> status,
) {
return _freeStatusOrEmbeddings(
status,
);
}

late final _freeStatusOrEmbeddingsPtr = _lookup<
ffi
.NativeFunction<ffi.Void Function(ffi.Pointer<StatusOrEmbeddings>)>>(
'freeStatusOrEmbeddings');
late final _freeStatusOrEmbeddings = _freeStatusOrEmbeddingsPtr
.asFunction<void Function(ffi.Pointer<StatusOrEmbeddings>)>();

void freeStatusOrCameraDevices(
ffi.Pointer<StatusOrCameraDevices> status,
) {
@@ -446,12 +461,14 @@ class OpenVINO {
ffi.Pointer<StatusOrModelResponse> llmInferencePrompt(
CLLMInference instance,
ffi.Pointer<pkg_ffi.Utf8> message,
bool apply_template,
double temperature,
double top_p,
) {
return _llmInferencePrompt(
instance,
message,
apply_template,
temperature,
top_p,
);
@@ -462,11 +479,12 @@
ffi.Pointer<StatusOrModelResponse> Function(
CLLMInference,
ffi.Pointer<pkg_ffi.Utf8>,
ffi.Bool,
ffi.Float,
ffi.Float)>>('llmInferencePrompt');
late final _llmInferencePrompt = _llmInferencePromptPtr.asFunction<
ffi.Pointer<StatusOrModelResponse> Function(
CLLMInference, ffi.Pointer<pkg_ffi.Utf8>, double, double)>();
CLLMInference, ffi.Pointer<pkg_ffi.Utf8>, bool, double, double)>();

ffi.Pointer<Status> llmInferenceClearHistory(
CLLMInference instance,
@@ -496,20 +514,21 @@ class OpenVINO {
late final _llmInferenceForceStop = _llmInferenceForceStopPtr
.asFunction<ffi.Pointer<Status> Function(CLLMInference)>();

ffi.Pointer<StatusOrBool> llmInferenceHasChatTemplate(
ffi.Pointer<StatusOrString> llmInferenceGetTokenizerConfig(
CLLMInference instance,
) {
return _llmInferenceHasChatTemplate(
return _llmInferenceGetTokenizerConfig(
instance,
);
}

late final _llmInferenceHasChatTemplatePtr = _lookup<
late final _llmInferenceGetTokenizerConfigPtr = _lookup<
ffi
.NativeFunction<ffi.Pointer<StatusOrBool> Function(CLLMInference)>>(
'llmInferenceHasChatTemplate');
late final _llmInferenceHasChatTemplate = _llmInferenceHasChatTemplatePtr
.asFunction<ffi.Pointer<StatusOrBool> Function(CLLMInference)>();
.NativeFunction<ffi.Pointer<StatusOrString> Function(CLLMInference)>>(
'llmInferenceGetTokenizerConfig');
late final _llmInferenceGetTokenizerConfig =
_llmInferenceGetTokenizerConfigPtr
.asFunction<ffi.Pointer<StatusOrString> Function(CLLMInference)>();

ffi.Pointer<Status> llmInferenceClose(
CLLMInference instance,
@@ -885,6 +904,74 @@ class OpenVINO {
late final _graphRunnerStopCamera = _graphRunnerStopCameraPtr
.asFunction<ffi.Pointer<Status> Function(CGraphRunner)>();

ffi.Pointer<StatusOrSentenceTransformer> sentenceTransformerOpen(
ffi.Pointer<pkg_ffi.Utf8> model_path,
ffi.Pointer<pkg_ffi.Utf8> device,
) {
return _sentenceTransformerOpen(
model_path,
device,
);
}

late final _sentenceTransformerOpenPtr = _lookup<
ffi.NativeFunction<
ffi.Pointer<StatusOrSentenceTransformer> Function(
ffi.Pointer<pkg_ffi.Utf8>,
ffi.Pointer<pkg_ffi.Utf8>)>>('sentenceTransformerOpen');
late final _sentenceTransformerOpen = _sentenceTransformerOpenPtr.asFunction<
ffi.Pointer<StatusOrSentenceTransformer> Function(
ffi.Pointer<pkg_ffi.Utf8>, ffi.Pointer<pkg_ffi.Utf8>)>();

ffi.Pointer<StatusOrEmbeddings> sentenceTransformerGenerate(
CSentenceTransformer instance,
ffi.Pointer<pkg_ffi.Utf8> prompt,
) {
return _sentenceTransformerGenerate(
instance,
prompt,
);
}

late final _sentenceTransformerGeneratePtr = _lookup<
ffi.NativeFunction<
ffi.Pointer<StatusOrEmbeddings> Function(CSentenceTransformer,
ffi.Pointer<pkg_ffi.Utf8>)>>('sentenceTransformerGenerate');
late final _sentenceTransformerGenerate =
_sentenceTransformerGeneratePtr.asFunction<
ffi.Pointer<StatusOrEmbeddings> Function(
CSentenceTransformer, ffi.Pointer<pkg_ffi.Utf8>)>();

ffi.Pointer<Status> sentenceTransformerClose(
CSentenceTransformer instance,
) {
return _sentenceTransformerClose(
instance,
);
}

late final _sentenceTransformerClosePtr = _lookup<
ffi
.NativeFunction<ffi.Pointer<Status> Function(CSentenceTransformer)>>(
'sentenceTransformerClose');
late final _sentenceTransformerClose = _sentenceTransformerClosePtr
.asFunction<ffi.Pointer<Status> Function(CSentenceTransformer)>();

ffi.Pointer<StatusOrString> pdfExtractText(
ffi.Pointer<pkg_ffi.Utf8> pdf_path,
) {
return _pdfExtractText(
pdf_path,
);
}

late final _pdfExtractTextPtr = _lookup<
ffi.NativeFunction<
ffi.Pointer<StatusOrString> Function(
ffi.Pointer<pkg_ffi.Utf8>)>>('pdfExtractText');
late final _pdfExtractText = _pdfExtractTextPtr.asFunction<
ffi.Pointer<StatusOrString> Function(ffi.Pointer<pkg_ffi.Utf8>)>();

ffi.Pointer<StatusOrSpeechToText> speechToTextOpen(
ffi.Pointer<pkg_ffi.Utf8> model_path,
ffi.Pointer<pkg_ffi.Utf8> device,
@@ -1199,6 +1286,17 @@ final class StatusOrGraphRunner extends ffi.Struct {

typedef CGraphRunner = ffi.Pointer<ffi.Void>;

final class StatusOrSentenceTransformer extends ffi.Struct {
@ffi.Int()
external int status;

external ffi.Pointer<pkg_ffi.Utf8> message;

external CSentenceTransformer value;
}

typedef CSentenceTransformer = ffi.Pointer<ffi.Void>;

final class StatusOrSpeechToText extends ffi.Struct {
@ffi.Int()
external int status;
@@ -1277,6 +1375,18 @@ final class StatusOrTTIModelResponse extends ffi.Struct {
external ffi.Pointer<pkg_ffi.Utf8> value;
}

final class StatusOrEmbeddings extends ffi.Struct {
@ffi.Int()
external int status;

external ffi.Pointer<pkg_ffi.Utf8> message;

external ffi.Pointer<ffi.Float> value;

@ffi.Int()
external int size;
}

final class StatusOrVLMModelResponse extends ffi.Struct {
@ffi.Int()
external int status;
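For reference, the added `StatusOrEmbeddings` struct pairs a status code and message with a raw float pointer plus an element count. A minimal Python `ctypes` sketch of the same layout and the copy-out loop (illustrative only; the project itself consumes this struct through the generated Dart FFI bindings above):

```python
import ctypes

# Mirrors the StatusOrEmbeddings struct from generated_bindings.dart:
# int status, char* message, float* value, int size.
class StatusOrEmbeddings(ctypes.Structure):
    _fields_ = [
        ("status", ctypes.c_int),
        ("message", ctypes.c_char_p),
        ("value", ctypes.POINTER(ctypes.c_float)),
        ("size", ctypes.c_int),
    ]

# Simulate a successful native call returning a 3-element embedding.
embedding = (ctypes.c_float * 3)(0.1, 0.2, 0.3)
result = StatusOrEmbeddings(status=0, message=b"", value=embedding, size=3)

# Copy the native buffer into a managed list, as the Dart side does,
# so the native allocation can be freed afterwards.
vector = [result.value[i] for i in range(result.size)]
print(len(vector))
```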
20 changes: 10 additions & 10 deletions lib/interop/llm_inference.dart
@@ -13,11 +13,8 @@ class LLMInference {

NativeCallable<LLMInferenceCallbackFunctionFunction>? nativeListener;
final Pointer<StatusOrLLMInference> instance;
late bool chatEnabled;

LLMInference(this.instance) {
chatEnabled = hasChatTemplate();
}
LLMInference(this.instance);

static Future<LLMInference> init(String modelPath, String device) async {
final result = await Isolate.run(() {
@@ -38,11 +35,12 @@
return LLMInference(result);
}

Future<ModelResponse> prompt(String message, double temperature, double topP) async {
Future<ModelResponse> prompt(String message, bool applyTemplate, double temperature, double topP) async {
print("Actual prompt: $message");
int instanceAddress = instance.ref.value.address;
final result = await Isolate.run(() {
final messagePtr = message.toNativeUtf8();
final status = llmOV.llmInferencePrompt(Pointer<Void>.fromAddress(instanceAddress), messagePtr, temperature, topP);
final status = llmOV.llmInferencePrompt(Pointer<Void>.fromAddress(instanceAddress), messagePtr, applyTemplate, temperature, topP);
calloc.free(messagePtr);
return status;
})
@@ -83,14 +81,16 @@
}
}

bool hasChatTemplate() {
final status = llmOV.llmInferenceHasChatTemplate(instance.ref.value);
String getTokenizerConfig() {
final status = llmOV.llmInferenceGetTokenizerConfig(instance.ref.value);

if (StatusEnum.fromValue(status.ref.status) != StatusEnum.OkStatus) {
throw "LLM Chat template error: ${status.ref.status} ${status.ref.message.toDartString()}";
throw "LLM get Chat template error: ${status.ref.status} ${status.ref.message.toDartString()}";
}

return status.ref.value;
final result = status.ref.value.toDartString();
llmOV.freeStatusOrString(status);
return result;
}

void close() {
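This file's change replaces the boolean `hasChatTemplate()` check with `getTokenizerConfig()`, which hands the raw tokenizer configuration string to the caller to parse. A hedged Python sketch of how such a payload might be probed for a chat template; the field names follow the common Hugging Face `tokenizer_config.json` convention, and the exact keys this PR reads are an assumption:

```python
import json

# Hypothetical tokenizer_config.json payload, shaped like the Hugging Face
# convention (chat_template holds a Jinja template string).
raw = json.dumps({
    "chat_template": "{% for message in messages %}{{ message.content }}{% endfor %}",
    "bos_token": "<s>",
    "eos_token": "</s>",
})

config = json.loads(raw)

# Equivalent of the old hasChatTemplate() boolean, now derived client-side.
has_chat_template = bool(config.get("chat_template"))
print(has_chat_template)
```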
31 changes: 31 additions & 0 deletions lib/interop/pdf_extractor.dart
@@ -0,0 +1,31 @@
// Copyright (c) 2024 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0


//EXPORT StatusOrSentences* pdfExtractSentences(const char* pdf_path);

import 'dart:isolate';

import 'dart:ffi';
import 'package:ffi/ffi.dart';
import 'package:inference/interop/openvino_bindings.dart';

final ov = getBindings();


Future<String> getTextFromPdf(String path) {
return Isolate.run(() {
final pathPtr = path.toNativeUtf8();
final status = ov.pdfExtractText(pathPtr);
calloc.free(pathPtr);

if (StatusEnum.fromValue(status.ref.status) != StatusEnum.OkStatus) {
throw "pdfExtractText error: ${status.ref.status} ${status.ref.message.toDartString()}";
}

final output = status.ref.value.toDartString();
ov.freeStatusOrString(status);
return output;
});
}
89 changes: 89 additions & 0 deletions lib/interop/sentence_transformer.dart
@@ -0,0 +1,89 @@
// Copyright (c) 2024 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0

import 'dart:ffi';
import 'dart:isolate';
import 'dart:math';

import 'package:ffi/ffi.dart';
import 'package:inference/interop/openvino_bindings.dart';

final ov = getBindings();

class SentenceTransformer {

final Pointer<StatusOrSentenceTransformer> instance;
SentenceTransformer(this.instance);

static Future<SentenceTransformer> init(String modelPath, String device) async {
final result = await Isolate.run(() {
final modelPathPtr = modelPath.toNativeUtf8();
final devicePtr = device.toNativeUtf8();
final status = ov.sentenceTransformerOpen(modelPathPtr, devicePtr);
calloc.free(modelPathPtr);
calloc.free(devicePtr);
return status;
});


if (StatusEnum.fromValue(result.ref.status) != StatusEnum.OkStatus) {
throw "SentenceTransformer open error: ${result.ref.status} ${result.ref.message.toDartString()}";
}
return SentenceTransformer(result);
}

Future<List<double>> generate(String prompt) async {

int instanceAddress = instance.ref.value.address;
final status = await Isolate.run(() {
final promptPtr = prompt.toNativeUtf8();
final status = ov.sentenceTransformerGenerate(Pointer<Void>.fromAddress(instanceAddress), promptPtr);
calloc.free(promptPtr);
return status;
});

if (StatusEnum.fromValue(status.ref.status) != StatusEnum.OkStatus) {
throw "SentenceTransformer generate error: ${status.ref.status} ${status.ref.message.toDartString()}";
}

List<double> data = [];
for (int i = 0; i < status.ref.size; i++) {
data.add(status.ref.value[i]);
}

ov.freeStatusOrEmbeddings(status);

return data;
}

void close() {
final status = ov.sentenceTransformerClose(instance.ref.value);
if (StatusEnum.fromValue(status.ref.status) != StatusEnum.OkStatus) {
throw "Close error: ${status.ref.status} ${status.ref.message.toDartString()}";
}
ov.freeStatus(status);
}

static double cosineSimilarity(List<double> vec1, List<double> vec2) {
if (vec1.length != vec2.length) {
throw Exception("Vectors must be of the same size");
}

double dotProduct = 0.0;
double normVec1 = 0.0;
double normVec2 = 0.0;

for (int i = 0; i < vec1.length; ++i) {
dotProduct += vec1[i] * vec2[i];
normVec1 += vec1[i] * vec1[i];
normVec2 += vec2[i] * vec2[i];
}

if (normVec1 == 0 || normVec2 == 0) {
throw Exception("Vectors must not be zero-vectors");
}

return dotProduct / (sqrt(normVec1) * sqrt(normVec2));
}
}
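The `cosineSimilarity` helper above is a standard implementation. The same computation in Python, useful for cross-checking embedding scores outside the app (a sketch, not part of this PR):

```python
import math

def cosine_similarity(vec1, vec2):
    # Mirrors SentenceTransformer.cosineSimilarity: dot product over
    # the product of the two vector norms.
    if len(vec1) != len(vec2):
        raise ValueError("Vectors must be of the same size")
    dot = sum(a * b for a, b in zip(vec1, vec2))
    norm1 = math.sqrt(sum(a * a for a in vec1))
    norm2 = math.sqrt(sum(b * b for b in vec2))
    if norm1 == 0 or norm2 == 0:
        raise ValueError("Vectors must not be zero-vectors")
    return dot / (norm1 * norm2)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```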