generated from oracle-devrel/repo-template
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'main' into mfioramo-patch-3
Showing 21 changed files with 377 additions and 10 deletions.
98 changes: 98 additions & 0 deletions
ai/generative-ai-service/decode-Images-and-Videos-with-OCI-GenAI/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,98 @@ | ||
|
||
# Decode Images and Videos with OCI GenAI | ||
|
||
This AI-powered application uses Oracle Cloud Infrastructure (OCI) Generative AI services to analyze images and videos and generate detailed summaries in multiple languages. Whether you are a content creator, researcher, or media enthusiast, it helps you interpret visual content with ease. | ||
|
||
<img src="./image.png"> | ||
--- | ||
|
||
## Features | ||
|
||
### 🌍 **Multi-Language Support** | ||
- Receive summaries in your preferred language, including: | ||
- English, French, Arabic, Spanish, Italian, German, Portuguese, Japanese, Korean, and Chinese. | ||
|
||
### 🎥 **Customizable Frame Processing for Videos** | ||
- Extract video frames at user-defined intervals. | ||
- Analyze specific frame ranges to tailor your results for precision. | ||
|
||
### ⚡ **Parallel Processing** | ||
- Analyzes frames concurrently using a thread pool (`ThreadPoolExecutor`), so several frames are sent to the vision model at once instead of one after another. | ||
|
||
### 🖼️ **Image Analysis** | ||
- Upload images to generate detailed summaries based on your input prompt. | ||
|
||
### 🧠 **Cohesive Summaries** | ||
- Combines the per-frame summaries into a single cohesive summary of the video's overall theme, events, and key details. | ||
|
||
--- | ||
|
||
## Technologies Used | ||
- **[Streamlit](https://streamlit.io/):** For building an interactive user interface. | ||
- **[Oracle Cloud Infrastructure (OCI) Generative AI](https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm):** For powerful image and video content analysis. | ||
- **[OpenCV](https://opencv.org/):** For video frame extraction and processing. | ||
- **[Pillow (PIL)](https://pillow.readthedocs.io/):** For image handling and processing. | ||
- **[tqdm](https://tqdm.github.io/):** For progress visualization in parallel processing. | ||
|
||
--- | ||
|
||
## Installation | ||
|
||
1. **Clone the repository:** | ||
|
||
|
||
2. **Install dependencies:** | ||
Make sure you have Python 3.8+ installed. Then, install the required libraries: | ||
```bash | ||
pip install -r requirements.txt | ||
``` | ||
|
||
3. **Configure OCI:** | ||
- Set up your OCI configuration by creating or updating the `~/.oci/config` file with your credentials and profile. | ||
- Replace the values of `compartmentId`, `llm_service_endpoint`, and `visionModel` near the top of `app.py` with your own. | ||
|
||
--- | ||
|
||
## Usage | ||
|
||
1. **Run the application:** | ||
```bash | ||
streamlit run app.py | ||
``` | ||
|
||
2. **Upload a file:** | ||
- Use the sidebar to upload an image (`.png`, `.jpg`, `.jpeg`) or a video (`.mp4`, `.avi`, `.mov`). | ||
|
||
3. **Set parameters:** | ||
- For videos, adjust the frame extraction interval and select specific frame ranges for analysis. | ||
|
||
4. **Analyze and summarize:** | ||
- Enter a custom prompt to guide the AI in generating a meaningful summary. | ||
- Choose the output language from the sidebar. | ||
|
||
5. **Get results:** | ||
- View detailed image summaries or cohesive video summaries directly in the app. | ||
|
||
--- | ||
|
||
## Screenshots | ||
### Image Analysis | ||
<img src="./image2.png"> | ||
|
||
### Video Analysis | ||
<img src="./image3.png"> | ||
|
||
--- | ||
|
||
|
||
## Acknowledgments | ||
- Oracle Cloud Infrastructure Generative AI for enabling state-of-the-art visual content analysis. | ||
- Open-source libraries like OpenCV, Pillow, and Streamlit for providing powerful tools to build this application. | ||
|
||
--- | ||
|
||
## Contact | ||
If you have questions or feedback, feel free to reach out via [[email protected]](mailto:[email protected]). |
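The "Configure OCI" step in the README above depends on a usable profile in `~/.oci/config`. As a minimal sketch, assuming the usual INI layout of that file and API-key authentication, the following check reports which commonly required keys are missing from a profile. The helper name `missing_oci_keys` and the sample values are hypothetical and are not part of the committed code.

```python
import configparser

# Keys typically needed in an OCI config profile for API-key authentication
# (an assumption for this sketch; confirm against the OCI SDK documentation).
REQUIRED_KEYS = ("user", "fingerprint", "tenancy", "region", "key_file")


def missing_oci_keys(config_text, profile="DEFAULT"):
    """Return the required keys that are absent or empty in the given profile."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if profile == "DEFAULT":
        # configparser treats [DEFAULT] specially; its values live in defaults()
        values = parser.defaults()
    elif parser.has_section(profile):
        values = parser[profile]
    else:
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Hypothetical sample content; real files hold your own OCIDs and key path.
sample = """
[DEFAULT]
user=ocid1.user.oc1..example
fingerprint=aa:bb:cc
tenancy=ocid1.tenancy.oc1..example
region=us-chicago-1
"""
print(missing_oci_keys(sample))  # key_file is not set in the sample
```

To check a real file, pass in the text of the expanded path, for example `pathlib.Path("~/.oci/config").expanduser().read_text()`, before starting the Streamlit app.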
203 changes: 203 additions & 0 deletions
ai/generative-ai-service/decode-Images-and-Videos-with-OCI-GenAI/app.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,203 @@ | ||
# Author: Ansh | ||
import streamlit as st | ||
import oci | ||
import base64 | ||
import cv2 | ||
from PIL import Image | ||
from concurrent.futures import ThreadPoolExecutor | ||
from tqdm import tqdm | ||
|
||
# OCI Configuration | ||
compartmentId = "ocid1.compartment.oc1..XXXXXXXXXXXXXxxxxxxxxxxxxxxxxxxxxxxxxxxx" | ||
llm_service_endpoint = "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com" | ||
CONFIG_PROFILE = "DEFAULT" | ||
visionModel = "meta.llama-3.2-90b-vision-instruct" | ||
summarizeModel = "cohere.command-r-plus-08-2024" | ||
config = oci.config.from_file('~/.oci/config', CONFIG_PROFILE) | ||
llm_client = oci.generative_ai_inference.GenerativeAiInferenceClient( | ||
config=config, | ||
service_endpoint=llm_service_endpoint, | ||
retry_strategy=oci.retry.NoneRetryStrategy(), | ||
timeout=(10, 240) | ||
) | ||
|
||
# Functions for Image Analysis | ||
def encode_image(image_path): | ||
with open(image_path, "rb") as image_file: | ||
return base64.b64encode(image_file.read()).decode("utf-8") | ||
|
||
# Functions for Video Analysis | ||
def encode_cv2_image(frame): | ||
_, buffer = cv2.imencode('.jpg', frame) | ||
return base64.b64encode(buffer).decode("utf-8") | ||
|
||
# Common Functions | ||
def get_message(encoded_image=None, user_prompt=None): | ||
content1 = oci.generative_ai_inference.models.TextContent() | ||
content1.text = user_prompt | ||
|
||
message = oci.generative_ai_inference.models.UserMessage() | ||
message.content = [content1] | ||
|
||
if encoded_image: | ||
content2 = oci.generative_ai_inference.models.ImageContent() | ||
image_url = oci.generative_ai_inference.models.ImageUrl() | ||
image_url.url = f"data:image/jpeg;base64,{encoded_image}" | ||
content2.image_url = image_url | ||
message.content.append(content2) | ||
return message | ||
|
||
def get_chat_request(encoded_image=None, user_prompt=None): | ||
chat_request = oci.generative_ai_inference.models.GenericChatRequest() | ||
chat_request.messages = [get_message(encoded_image, user_prompt)] | ||
chat_request.api_format = oci.generative_ai_inference.models.BaseChatRequest.API_FORMAT_GENERIC | ||
chat_request.num_generations = 1 | ||
chat_request.is_stream = False | ||
chat_request.max_tokens = 500 | ||
chat_request.temperature = 0.75 | ||
chat_request.top_p = 0.7 | ||
chat_request.top_k = -1 | ||
chat_request.frequency_penalty = 1.0 | ||
return chat_request | ||
|
||
def cohere_chat_request(encoded_image=None, user_prompt=None): | ||
chat_request = oci.generative_ai_inference.models.CohereChatRequest() | ||
chat_request.api_format = oci.generative_ai_inference.models.BaseChatRequest.API_FORMAT_COHERE | ||
message = get_message(encoded_image, user_prompt) | ||
chat_request.message = message.content[0].text | ||
chat_request.is_stream = False | ||
chat_request.preamble_override = "Make sure you answer only in "+ lang_type | ||
chat_request.max_tokens = 500 | ||
chat_request.temperature = 0.75 | ||
chat_request.top_p = 0.7 | ||
chat_request.top_k = 0 | ||
chat_request.frequency_penalty = 1.0 | ||
return chat_request | ||
|
||
|
||
def get_chat_detail(chat_request,model): | ||
chat_detail = oci.generative_ai_inference.models.ChatDetails() | ||
chat_detail.serving_mode = oci.generative_ai_inference.models.OnDemandServingMode(model_id=model) | ||
chat_detail.compartment_id = compartmentId | ||
chat_detail.chat_request = chat_request | ||
return chat_detail | ||
|
||
def extract_frames(video_path, interval=1): | ||
frames = [] | ||
cap = cv2.VideoCapture(video_path) | ||
frame_rate = max(int(cap.get(cv2.CAP_PROP_FPS)), 1)  # guard: an FPS of 0 would make the modulo below divide by zero | ||
success, frame = cap.read() | ||
count = 0 | ||
|
||
while success: | ||
if count % (frame_rate * interval) == 0: | ||
frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)) | ||
success, frame = cap.read() | ||
count += 1 | ||
cap.release() | ||
return frames | ||
|
||
def process_frame(llm_client, frame, prompt): | ||
encoded_image = encode_cv2_image(frame) | ||
try: | ||
llm_request = get_chat_request(encoded_image, prompt) | ||
llm_payload = get_chat_detail(llm_request,visionModel) | ||
llm_response = llm_client.chat(llm_payload) | ||
return llm_response.data.chat_response.choices[0].message.content[0].text | ||
except Exception as e: | ||
return f"Error processing frame: {str(e)}" | ||
|
||
def process_frames_parallel(llm_client, frames, prompt): | ||
with ThreadPoolExecutor() as executor: | ||
results = list(tqdm( | ||
executor.map(lambda frame: process_frame(llm_client, frame, prompt), frames), | ||
total=len(frames), | ||
desc="Processing frames" | ||
)) | ||
return results | ||
|
||
def generate_final_summary(llm_client, frame_summaries): | ||
combined_summaries = "\n".join(frame_summaries) | ||
final_prompt = ( | ||
"You are a video content summarizer. Below are summaries of individual frames extracted from a video. " | ||
"Using these frame summaries, create a cohesive and concise summary that describes the content of the video as a whole. " | ||
"Focus on providing insights about the overall theme, events, or key details present in the video, and avoid referring to individual frames or images explicitly.\n\n" | ||
f"{combined_summaries}" | ||
) | ||
try: | ||
llm_request = cohere_chat_request(user_prompt=final_prompt) | ||
llm_payload = get_chat_detail(llm_request,summarizeModel) | ||
llm_response = llm_client.chat(llm_payload) | ||
return llm_response.data.chat_response.text | ||
except Exception as e: | ||
return f"Error generating final summary: {str(e)}" | ||
|
||
# Streamlit UI | ||
st.title("Decode Images and Videos with OCI GenAI") | ||
uploaded_file = st.sidebar.file_uploader("Upload an image or video", type=["png", "jpg", "jpeg", "mp4", "avi", "mov"]) | ||
user_prompt = st.text_input("Enter your prompt for analysis:", value="Describe the content of this image.") | ||
lang_type = st.sidebar.selectbox("Output Language", ["English", "French", "Arabic", "Spanish", "Italian", "German", "Portuguese", "Japanese", "Korean", "Chinese"]) | ||
|
||
if uploaded_file: | ||
if uploaded_file.name.split('.')[-1].lower() in ["png", "jpg", "jpeg"]: | ||
# Image Analysis | ||
temp_image_path = "temp_uploaded_image.jpg" | ||
with open(temp_image_path, "wb") as f: | ||
f.write(uploaded_file.getbuffer()) | ||
|
||
st.image(temp_image_path, caption="Uploaded Image", width=500) | ||
|
||
if st.button("Generate Image Summary"): | ||
with st.spinner("Analyzing the image..."): | ||
try: | ||
encoded_image = encode_image(temp_image_path) | ||
llm_request = get_chat_request(encoded_image, user_prompt) | ||
llm_payload = get_chat_detail(llm_request,visionModel) | ||
llm_response = llm_client.chat(llm_payload) | ||
llm_text = llm_response.data.chat_response.choices[0].message.content[0].text | ||
st.success("OCI GenAI response:") | ||
st.write(llm_text) | ||
except Exception as e: | ||
st.error(f"An error occurred: {str(e)}") | ||
elif uploaded_file.name.split('.')[-1].lower() in ["mp4", "avi", "mov"]: | ||
|
||
# Video Analysis | ||
temp_video_path = "temp_uploaded_video.mp4" | ||
# Write the upload to disk first; the inline preview below reads this file back | ||
with open(temp_video_path, "wb") as f: | ||
f.write(uploaded_file.getbuffer()) | ||
video_html = f""" | ||
<video width="600" height="300" controls> | ||
<source src="data:video/mp4;base64,{base64.b64encode(open(temp_video_path, 'rb').read()).decode()}" type="video/mp4"> | ||
Your browser does not support the video tag. | ||
</video> | ||
""" | ||
st.markdown(video_html, unsafe_allow_html=True) | ||
|
||
# st.video(temp_video_path) | ||
st.write("Processing the video...") | ||
|
||
frame_interval = st.sidebar.slider("Frame extraction interval (seconds)", 1, 10, 1) | ||
frames = extract_frames(temp_video_path, interval=frame_interval) | ||
num_frames = len(frames) | ||
st.write(f"Total frames extracted: {num_frames}") | ||
|
||
frame_range = st.sidebar.slider("Select frame range for analysis", 0, num_frames - 1, (0, num_frames - 1)) | ||
|
||
if st.button("Generate Video Summary"): | ||
with st.spinner("Analyzing selected frames..."): | ||
try: | ||
selected_frames = frames[frame_range[0]:frame_range[1] + 1] | ||
waiting_message = st.empty() | ||
waiting_message.write(f"Selected {len(selected_frames)} frames for processing.") | ||
# st.write(f"Selected {len(selected_frames)} frames for processing.") | ||
frame_summaries = process_frames_parallel(llm_client, selected_frames, user_prompt) | ||
# st.write("Generating final video summary...") | ||
waiting_message.empty() | ||
waiting_message.write("Generating final video summary...") | ||
final_summary = generate_final_summary(llm_client, frame_summaries) | ||
waiting_message.empty() | ||
st.success("Video Summary:") | ||
st.write(final_summary) | ||
except Exception as e: | ||
st.error(f"An error occurred: {str(e)}") |
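The video path in `app.py` keeps a frame whenever `count % (frame_rate * interval) == 0`, sends the kept frames to the model through `ThreadPoolExecutor.map`, and joins the results in order. As a hedged sketch of that flow without OpenCV or OCI, the snippet below computes the same frame indices and runs a stand-in analyzer in parallel. The names `sampled_frame_indices`, `analyze_in_parallel`, and `fake_analyze` are illustrative assumptions, not functions from the commit.

```python
from concurrent.futures import ThreadPoolExecutor


def sampled_frame_indices(total_frames, fps, interval_seconds):
    """Indices of the frames kept when sampling one frame every `interval_seconds`."""
    # Same step as app.py (int FPS times interval), clamped so it is never 0
    step = max(int(fps) * int(interval_seconds), 1)
    return list(range(0, total_frames, step))


def analyze_in_parallel(frames, analyze, max_workers=4):
    """Apply `analyze` to every frame concurrently; results keep the input order."""
    def safe(frame):
        try:
            return analyze(frame)
        except Exception as exc:  # one failed frame should not abort the batch
            return f"Error processing frame: {exc}"

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the order of the inputs
        return list(executor.map(safe, frames))


def fake_analyze(frame):
    # Hypothetical stand-in for the vision-model call
    if frame == 50:
        raise ValueError("simulated failure")
    return f"summary of frame {frame}"


indices = sampled_frame_indices(total_frames=100, fps=25, interval_seconds=1)
summaries = analyze_in_parallel(indices, fake_analyze)
print(indices)
print(summaries)
```

Because `executor.map` preserves input order, the per-frame summaries can be joined directly into the final summarization prompt without re-sorting. Note that a failed frame contributes its error string to that prompt, which is also how `process_frame` behaves.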
Binary file added
BIN
+343 KB
ai/generative-ai-service/decode-Images-and-Videos-with-OCI-GenAI/image.png
Binary file added
BIN
+455 KB
ai/generative-ai-service/decode-Images-and-Videos-with-OCI-GenAI/image2.png
Binary file added
BIN
+619 KB
ai/generative-ai-service/decode-Images-and-Videos-with-OCI-GenAI/image3.png
5 changes: 5 additions & 0 deletions
ai/generative-ai-service/decode-Images-and-Videos-with-OCI-GenAI/requirements.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
streamlit==1.33.0 | ||
oci==3.50.1 | ||
Pillow | ||
opencv-python-headless==4.10.0.84 | ||
tqdm==4.66.1 |
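Four of the five entries above carry an exact `==` pin, while `Pillow` does not, so installs may resolve different Pillow versions over time. As a small illustrative sketch (the helper `unpinned_requirements` is hypothetical and handles only simple `name==version` lines, not the full requirements syntax), unpinned entries can be listed like this:

```python
def unpinned_requirements(requirements_text):
    """Return package names that do not carry an exact `==` version pin."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and "==" not in line:
            unpinned.append(line)
    return unpinned


# Contents of requirements.txt as added in this commit
requirements = """\
streamlit==1.33.0
oci==3.50.1
Pillow
opencv-python-headless==4.10.0.84
tqdm==4.66.1
"""
print(unpinned_requirements(requirements))
```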
35 changes: 35 additions & 0 deletions
...-integration-and-automation/oracle-integration-cloud/05-oic-adapters-clickthrough/LICENSE
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
Copyright (c) 2024 Oracle and/or its affiliates. | ||
|
||
The Universal Permissive License (UPL), Version 1.0 | ||
|
||
Subject to the condition set forth below, permission is hereby granted to any | ||
person obtaining a copy of this software, associated documentation and/or data | ||
(collectively the "Software"), free of charge and under any and all copyright | ||
rights in the Software, and any and all patent rights owned or freely | ||
licensable by each licensor hereunder covering either (i) the unmodified | ||
Software as contributed to or provided by such licensor, or (ii) the Larger | ||
Works (as defined below), to deal in both | ||
|
||
(a) the Software, and | ||
(b) any piece of software and/or hardware listed in the lrgrwrks.txt file if | ||
one is included with the Software (each a "Larger Work" to which the Software | ||
is contributed by such licensors), | ||
|
||
without restriction, including without limitation the rights to copy, create | ||
derivative works of, display, perform, and distribute the Software and make, | ||
use, sell, offer for sale, import, export, have made, and have sold the | ||
Software and the Larger Work(s), and to sublicense the foregoing rights on | ||
either these or other terms. | ||
|
||
This license is subject to the following condition: | ||
The above copyright notice and either this complete permission notice or at | ||
a minimum a reference to the UPL must be included in all copies or | ||
substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
SOFTWARE. |
26 changes: 26 additions & 0 deletions
...-and-automation/oracle-integration-cloud/05-oic-adapters-clickthrough/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
# OIC Adapters clickthrough presentations | ||
|
||
Assets covering OIC adapter configuration and implementation practices for the following adapters: | ||
- EPM Adapter | ||
- ERP Adapter | ||
- HCM Adapter | ||
- Kafka Adapter | ||
- EBS Adapter | ||
|
||
Review Date: 28.11.2024 | ||
|
||
# When to use these assets? | ||
|
||
Use these assets when designing solutions that integrate with the applications and resources listed above. | ||
|
||
# How to use these assets? | ||
|
||
The information is generic in nature and not specific to any particular customer. | ||
|
||
# License | ||
|
||
Copyright (c) 2024 Oracle and/or its affiliates. | ||
|
||
Licensed under the Universal Permissive License (UPL), Version 1.0. | ||
|
||
See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details. |
Binary file added
BIN
+26.1 KB
...tegration-and-automation/oracle-integration-cloud/05-oic-adapters-clickthrough/README.pdf
Binary file not shown.
Binary file added
BIN
+1.67 MB
...ion-cloud/05-oic-adapters-clickthrough/files/Kafka Adapter click through presentation.pdf
Binary file not shown.
Binary file added
BIN
+2.05 MB
...loud/05-oic-adapters-clickthrough/files/Oracle EBS Adapter click through presentation.pdf
Binary file not shown.
Binary file added
BIN
+1.98 MB
...c-adapters-clickthrough/files/Oracle EPM Cloud Adapter Sep single slider presentation.pdf
Binary file not shown.
Binary file added
BIN
+2.6 MB
...5-oic-adapters-clickthrough/files/Oracle ERP Cloud Adapter click through presentation.pdf
Binary file not shown.
Binary file added
BIN
+1.3 MB
...5-oic-adapters-clickthrough/files/Oracle HCM Cloud Adapter click through presentation.pdf
Binary file not shown.