pinecone-retrieval
sonam-pankaj95 committed May 8, 2024
1 parent a16c9b1 commit 500bb47
Showing 5 changed files with 35 additions and 2 deletions.
2 changes: 1 addition & 1 deletion Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,7 +1,7 @@
[package]
name = "embed_anything"

-version = "0.1.14"
+version = "0.1.15"
edition = "2021"

[dependencies]
1 change: 1 addition & 0 deletions README.md
@@ -12,6 +12,7 @@
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1CowJrqZxDDYJzkclI-rbHaZHgL9C6K3p?usp=sharing)
[![license]( https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![license]( https://img.shields.io/badge/Package-PYPI-blue.svg)](https://pypi.org/project/embed-anything/)
[![license](https://img.shields.io/discord/1223707915827937321?style=flat&logo=discord&link=https%3A%2F%2Fdiscord.gg%2FHGxDZxNt9G)](https://discord.gg/juETVTMdZu)

EmbedAnything is a powerful Python library designed to streamline the creation and management of embedding pipelines. Whether you're working with text, images, audio, or any other type of data, EmbedAnything makes it easy to generate embeddings from multiple sources and store them efficiently in a vector database.

Binary file added Vector_database_files/test_paper.pdf
Binary file not shown.
32 changes: 32 additions & 0 deletions retrieval.py
@@ -0,0 +1,32 @@
import embed_anything
from openai import OpenAI

import os
import time
from pinecone import Pinecone
import numpy as np



data = embed_anything.embed_directory('Vector_database_files/test_paper.pdf', embeder= "OpenAI")
embeddings = np.array([item.embedding for item in data])

print(len(data))
query= embed_anything.embed_query(["what is AI?"], embeder="OpenAI")

pc = Pinecone(api_key="")
index = pc.Index("anything")

# for i in range(len(data)):
#     index.upsert(
#         vectors=[{"id": str(i), "values": data[i].embedding, "metadata": {"text": data[i].text}}]
#     )



def retrieval(query):
    query_embedding = embed_anything.embed_query(query, embeder="OpenAI")
    return index.query(vector=query_embedding[0].embedding, top_k=2)


print(retrieval(["what is AI?"]))
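
The script above depends on an OpenAI key, a Pinecone key, and an index that has already been populated by the commented-out upsert loop. As a rough illustration of what the upsert-then-query flow does, here is a minimal, self-contained sketch that runs offline. `InMemoryIndex` and `cosine_similarity` are hypothetical helpers written for this example and are not part of Pinecone or embed_anything; the three-dimensional vectors and sample texts are made up; and the sketch assumes the index ranks by cosine similarity, which in Pinecone depends on how the index was configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


class InMemoryIndex:
    """Hypothetical stand-in for a vector index; NOT Pinecone's API."""

    def __init__(self):
        self._records = {}

    def upsert(self, vectors):
        # Insert or overwrite records keyed by id, using the same record
        # shape as the commented-out loop in retrieval.py.
        for record in vectors:
            self._records[record["id"]] = record

    def query(self, vector, top_k=2):
        # Score every stored record against the query vector, then keep
        # the top_k highest-scoring matches (brute-force search).
        scored = [
            {
                "id": record_id,
                "score": cosine_similarity(vector, record["values"]),
                "metadata": record.get("metadata", {}),
            }
            for record_id, record in self._records.items()
        ]
        scored.sort(key=lambda match: match["score"], reverse=True)
        return scored[:top_k]


# Toy 3-dimensional "embeddings"; real embedding vectors are far longer.
chunks = [
    ("AI is the study of intelligent agents.", [0.9, 0.1, 0.0]),
    ("Rust is a systems programming language.", [0.0, 0.2, 0.9]),
    ("Machine learning is a subfield of AI.", [0.8, 0.3, 0.1]),
]

index = InMemoryIndex()
index.upsert(
    [
        {"id": str(i), "values": vec, "metadata": {"text": text}}
        for i, (text, vec) in enumerate(chunks)
    ]
)

# Pretend this vector is the embedding of the question "what is AI?".
matches = index.query([1.0, 0.0, 0.0], top_k=2)
for match in matches:
    print(match["id"], round(match["score"], 3), match["metadata"]["text"])
```

With these toy vectors, the two AI-related chunks rank ahead of the Rust chunk. A hosted index plays the same role at scale, typically using approximate rather than brute-force search.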
