This demo walks through deploying a GPU workload built with Ollama on Cloud Run. You will need:
- A Google Cloud account with the gcloud CLI installed and authenticated
- Approved quota for GPUs on Cloud Run
- The ollama CLI installed
- Docker (optional; only needed to run the load generator)
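# optional: sanity-check the prerequisites before starting (assumes all three tools are on your PATH)
gcloud auth list
ollama --version
docker --version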
# run local ollama with a GPU (keep this running in a separate terminal)
ollama serve
# list local models and confirm llama3.2 is present
ollama list
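# if llama3.2 isn't listed yet, pull it first
ollama pull llama3.2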
# show how ollama works locally; note that it uses the GPU
ollama run llama3.2 "what is google cloud run"
# what about in PRODUCTION? Let's try a serverless GPU
time gcloud beta run deploy ollama-llama32 \
  --image docker.io/gabemonroy/ollama-llama3.2:latest \
  --concurrency 1 \
  --gpu 1 \
  --allow-unauthenticated
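# note (assumption, not shown in the original demo): depending on your gcloud
# version and region, GPU deploys may also need an explicit GPU type, always-
# allocated CPU, and a GPU-enabled region, e.g.:
#   --gpu-type nvidia-l4 --no-cpu-throttling --region us-central1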
# export the URL printed at the end of the deploy
export URL=<url>
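# alternatively, grab the URL from gcloud directly (a sketch; add --region if prompted)
export URL=$(gcloud run services describe ollama-llama32 --format 'value(status.url)')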
# curl the API to see if it's working
curl $URL
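# (ollama's root endpoint typically answers with "Ollama is running")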
# show ollama streaming a response via a cloud run gpu
OLLAMA_HOST=$URL ollama run llama3.2 "what is google cloud run"
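# the deployed model can also be hit through ollama's REST API directly;
# /api/generate streams JSON chunks by default
curl $URL/api/generate -d '{"model": "llama3.2", "prompt": "what is google cloud run"}'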
# run the load generator to simulate 100 clients
docker run \
  -e PROJECT_ID=graceful-wall-382722 \
  -e SERVICE_NAME=ollama-llama32 \
  -e BACKEND_URL=$URL \
  -e NUM_CLIENTS=100 \
  gcr.io/fcrisciani/tail_logger
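# while the load runs, tail the service logs to watch requests land and
# instances scale up (assumes the gcloud beta log-tailing component is installed)
gcloud beta run services logs tail ollama-llama32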
# demo teardown
gcloud run services delete ollama-llama32
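# confirm the service is gone
gcloud run services list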
# appendix: build and push the linux/amd64 image
docker buildx build --platform linux/amd64 --push -t gabemonroy/ollama-llama3.2:latest .
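The image itself isn't built in this walkthrough; as a rough sketch (not the actual source of gabemonroy/ollama-llama3.2), a Dockerfile along these lines could produce an equivalent image by baking the model in and binding ollama to Cloud Run's default port 8080:

FROM ollama/ollama
# bind the server to the port Cloud Run routes traffic to
ENV OLLAMA_HOST=0.0.0.0:8080
# start the server just long enough to pull llama3.2 into the image layer
RUN ollama serve & sleep 5 && ollama pull llama3.2
EXPOSE 8080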