This repository has been archived by the owner on Feb 15, 2025. It is now read-only.

feat(vllm)!: upgrade vllm backend and refactor deployment #854

Merged on Oct 3, 2024 (350 commits)
Changes from 250 commits
301fd42
fix Dockerfile lint
justinthelaw Sep 16, 2024
2a2c7d6
re-added default tensor size
justinthelaw Sep 16, 2024
98227a6
fix README
justinthelaw Sep 16, 2024
c620efa
cleanup
justinthelaw Sep 16, 2024
6593fbb
3.11.9 python
justinthelaw Sep 16, 2024
79272d1
fix FinishReason, add vLLM E2E
justinthelaw Sep 17, 2024
927ad25
llama completion test, add CompleteStreamChoice
justinthelaw Sep 17, 2024
e9e434f
condense e2e to 1 file, add max_new_tokens
justinthelaw Sep 17, 2024
d8c6767
formatting fix
justinthelaw Sep 17, 2024
29a9785
max_tokens for OpenAI client
justinthelaw Sep 17, 2024
a166c93
fix singular model_name arg
justinthelaw Sep 17, 2024
1c63741
isolate model_name to single test
justinthelaw Sep 17, 2024
2e82a9f
fix e2e-llama-cpp-python.yaml
justinthelaw Sep 17, 2024
807128e
Update e2e-vllm.yaml
justinthelaw Sep 17, 2024
e48331f
model_name fixture
justinthelaw Sep 17, 2024
e88b29f
Merge remote-tracking branch 'origin/main' into 1037-testvllm-impleme…
justinthelaw Sep 17, 2024
b366c5f
Merge remote-tracking branch 'origin/main' into 835-upgrade-vllm-for-…
justinthelaw Sep 17, 2024
ecbd4f7
handle request queue possibly being None
justinthelaw Sep 17, 2024
8552ce0
workaround GPU runner issue
justinthelaw Sep 17, 2024
af4e4ca
workaround GPU runner issue, pt.2
justinthelaw Sep 17, 2024
5b1532a
workaround GPU runner issue, pt.3
justinthelaw Sep 17, 2024
a8551e5
workaround GPU runner issue, pt.4
justinthelaw Sep 17, 2024
5f1b3c1
temp turn on e2e vllm, add nvidia-smi
justinthelaw Sep 17, 2024
1e7e98c
add nvidia setp
justinthelaw Sep 17, 2024
c46731a
fix cluster cmd, play with prompt
justinthelaw Sep 17, 2024
161fb3a
k3d permissions
justinthelaw Sep 17, 2024
84a0388
Update e2e-vllm.yaml
justinthelaw Sep 17, 2024
cb905ff
Update e2e-llama-cpp-python.yaml
justinthelaw Sep 17, 2024
6afb992
e2e-vllm.yaml with lfai-core
justinthelaw Sep 17, 2024
094da70
vllm e2e missing cluster create
justinthelaw Sep 17, 2024
f5d9f82
fix llama e2e steps
justinthelaw Sep 17, 2024
9fb28fa
test GPU cluster health
justinthelaw Sep 17, 2024
c19cec2
test GPU runner deps, pt.1
justinthelaw Sep 17, 2024
8767649
test GPU runner deps, pt.2
justinthelaw Sep 17, 2024
52857c5
test GPU runner deps, pt.3
justinthelaw Sep 17, 2024
287b911
test GPU runner deps, pt.4
justinthelaw Sep 17, 2024
e0b7e18
test GPU runner deps, pt.5
justinthelaw Sep 17, 2024
042248d
add comments
justinthelaw Sep 17, 2024
64079aa
better comments, log test outputs
justinthelaw Sep 17, 2024
0148b92
add wait-for, more comments
justinthelaw Sep 17, 2024
04ab8b2
remove formatting
justinthelaw Sep 17, 2024
635bdaf
fix CUDA pod test
justinthelaw Sep 17, 2024
c85a00c
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 17, 2024
4b73ba9
Merge remote-tracking branch 'origin/main' into 1037-testvllm-impleme…
justinthelaw Sep 17, 2024
f7b2a50
reduced context window
justinthelaw Sep 17, 2024
1bef345
remove pytest cache Make target
justinthelaw Sep 17, 2024
8b2af46
vLLM deployment debugging
justinthelaw Sep 17, 2024
9dae852
revert formatting
justinthelaw Sep 17, 2024
d44a907
fix build, add better debugging steps
justinthelaw Sep 17, 2024
5af2d70
fix Kubectl commands
justinthelaw Sep 17, 2024
8befd3b
nvidia daemonset debug
justinthelaw Sep 17, 2024
32a1c31
set nvidia runtime as default
justinthelaw Sep 17, 2024
1e7aca1
check node issues
justinthelaw Sep 17, 2024
2464cc4
draft, node detailed describe
justinthelaw Sep 17, 2024
c7b4aa3
Update cuda-vector-add.yaml
justinthelaw Sep 17, 2024
2245c7c
Update cuda-vector-add.yaml
justinthelaw Sep 17, 2024
b1933c2
more cluster runner debugging
justinthelaw Sep 17, 2024
325f520
remove erroneous journal to command
justinthelaw Sep 18, 2024
87cc755
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 18, 2024
32ed63c
Merge remote-tracking branch 'origin/main' into 1037-testvllm-impleme…
justinthelaw Sep 18, 2024
5c13861
docker-level debug addition
justinthelaw Sep 18, 2024
e4e4611
downgrade CUDA version
justinthelaw Sep 18, 2024
32dad39
downgrade CUDA version, again
justinthelaw Sep 18, 2024
850100f
try root full
justinthelaw Sep 18, 2024
d1d6e48
try root, pt.2
justinthelaw Sep 18, 2024
df61e46
try root, pt.3
justinthelaw Sep 18, 2024
34926e9
different tests and logs
justinthelaw Sep 18, 2024
547a64b
typo
justinthelaw Sep 18, 2024
59ce6f6
revert to old daemonset version
justinthelaw Sep 18, 2024
284812d
typo
justinthelaw Sep 18, 2024
b222543
add config.toml to k3s image
justinthelaw Sep 18, 2024
76cccbc
get failure reason
justinthelaw Sep 18, 2024
d6aacf0
Merge branch 'main' into 1037-testvllm-implement-e2e-testing-for-vllm
justinthelaw Sep 18, 2024
c9e7840
just see if change in containerd config works
justinthelaw Sep 18, 2024
1514ead
Dockerfile changes, apply both tests
justinthelaw Sep 18, 2024
a437b7b
typo
justinthelaw Sep 18, 2024
66ef462
fix image tag, add NVIDIA capabilities all
justinthelaw Sep 18, 2024
c9d480c
align docker test, add node label
justinthelaw Sep 18, 2024
a32226a
add quotes, increase priv
justinthelaw Sep 18, 2024
db04bd0
Merge remote-tracking branch 'origin/main' into 1037-testvllm-impleme…
justinthelaw Sep 18, 2024
d199203
add nfd
justinthelaw Sep 18, 2024
bd89870
add nfd, pt.1
justinthelaw Sep 18, 2024
2ce805b
remove nfd
justinthelaw Sep 18, 2024
3be3648
remove set-as-default
justinthelaw Sep 18, 2024
9cf7d7f
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 18, 2024
5f777e0
Merge branch 'main' into 1037-testvllm-implement-e2e-testing-for-vllm
justinthelaw Sep 18, 2024
8d28084
refactor, unload drivers
justinthelaw Sep 18, 2024
c12aa82
script typo
justinthelaw Sep 18, 2024
6900dac
fix typos
justinthelaw Sep 18, 2024
3ab9228
slim k3d cluster, permission workaround
justinthelaw Sep 18, 2024
7dd8abf
k3d bootstrap match
justinthelaw Sep 18, 2024
79f8d30
k3d server name
justinthelaw Sep 18, 2024
2811359
nvidia wait-for
justinthelaw Sep 18, 2024
3cf42eb
remove extra stuff
justinthelaw Sep 18, 2024
331584e
pods out first
justinthelaw Sep 18, 2024
e7fdf7c
node out first, whoami
justinthelaw Sep 18, 2024
0662106
which k3d
justinthelaw Sep 18, 2024
6b04c55
sleep!
justinthelaw Sep 18, 2024
6110ec4
root user
justinthelaw Sep 18, 2024
9f9157c
root user, pt.2
justinthelaw Sep 18, 2024
664709b
revert vllm e2e GPU runner changes
justinthelaw Sep 18, 2024
f896e59
revert formatting changes
justinthelaw Sep 18, 2024
ef75a70
e2e tests made easier
justinthelaw Sep 18, 2024
2fcac88
Merge branch 'main' into 1037-testvllm-implement-e2e-testing-for-vllm
justinthelaw Sep 18, 2024
23c008e
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 18, 2024
d1d6540
e2e test Make target typo
justinthelaw Sep 18, 2024
2cfd164
Merge branch '1037-testvllm-implement-e2e-testing-for-vllm' of https:…
justinthelaw Sep 18, 2024
09510b7
zarf-config.yaml changes docs
justinthelaw Sep 18, 2024
1e89fac
add load_format
justinthelaw Sep 18, 2024
0568232
revert format e2e-llama-cpp-python.yaml
justinthelaw Sep 18, 2024
cc7ac6c
fixed Makefile typo
justinthelaw Sep 18, 2024
8a07080
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 18, 2024
f335be7
attempt merge with main
justinthelaw Sep 18, 2024
e0c0ac7
better clean-up
justinthelaw Sep 19, 2024
c90d820
add FinishReason enum back in
justinthelaw Sep 19, 2024
a1a03c1
passing unit tests
justinthelaw Sep 19, 2024
3da388f
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 19, 2024
3387974
Merge branch 'main' into 1037-testvllm-implement-e2e-testing-for-vllm
justinthelaw Sep 19, 2024
620e3b5
fixes GPU_LIMIT
justinthelaw Sep 20, 2024
09dd182
Merge remote-tracking branch 'origin/1037-testvllm-implement-e2e-test…
justinthelaw Sep 20, 2024
331a346
fixes load_format
justinthelaw Sep 20, 2024
6df5ebb
Merge branch 'main' into 1037-testvllm-implement-e2e-testing-for-vllm
justinthelaw Sep 20, 2024
304f659
Merge remote-tracking branch 'origin/1037-testvllm-implement-e2e-test…
justinthelaw Sep 20, 2024
cc46716
adds Docker container-only things
justinthelaw Sep 20, 2024
5ab0b99
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 20, 2024
da1399b
PR review fixes
justinthelaw Sep 20, 2024
59e1830
Merge remote-tracking branch 'origin/1037-testvllm-implement-e2e-test…
justinthelaw Sep 20, 2024
e963293
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 20, 2024
b9545b7
description for PROMPT_FORMAT*
justinthelaw Sep 20, 2024
5a6d59f
makefile clean improvements, add bundle configs
justinthelaw Sep 20, 2024
396370a
variabilize PYTHON_VERSION in vllm Dockerfile
justinthelaw Sep 20, 2024
b023dfa
missing download sub-cmd
justinthelaw Sep 20, 2024
f24180d
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 20, 2024
8ba6dcb
variabilize vllm directory
justinthelaw Sep 20, 2024
0186ad0
Merge branch '835-upgrade-vllm-for-gptq-bfloat16-inferencing' of http…
justinthelaw Sep 20, 2024
9791cb6
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 20, 2024
6e1ca0c
fix release.yaml
justinthelaw Sep 20, 2024
858b64f
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 20, 2024
ced0797
Update e2e-registry1-weekly.yaml
justinthelaw Sep 20, 2024
89d0d69
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 20, 2024
dd9b1bc
Update e2e-registry1-weekly.yaml
justinthelaw Sep 20, 2024
2bf474c
Update e2e-registry1-weekly.yaml
justinthelaw Sep 20, 2024
6effe8c
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 23, 2024
1641379
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 23, 2024
ca0ff03
update to 0.13.0, fix versioning
justinthelaw Sep 23, 2024
d365660
fix registry1 workflow, add prints
justinthelaw Sep 23, 2024
bdda602
merge with registry1 workflow
justinthelaw Sep 23, 2024
2e24a6b
chainguard login, fix registry1 uds setup
justinthelaw Sep 23, 2024
280927a
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Sep 23, 2024
bd3d7ff
Merge branch 'main' into chore-update-registry1-weekly-bundle-0.13.0
justinthelaw Sep 23, 2024
686e755
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 23, 2024
e109740
fix permissions
justinthelaw Sep 23, 2024
b4b767e
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Sep 23, 2024
7948e33
fix permissions, pt.2
justinthelaw Sep 23, 2024
c468b2c
fix permissions
justinthelaw Sep 23, 2024
44d4a0e
centralize integration llm config, no-cache-dir
justinthelaw Sep 23, 2024
9c1811c
merge with testing branch, pt.1
justinthelaw Sep 23, 2024
c0af7c7
centralize integration llm config, pt.2
justinthelaw Sep 23, 2024
6c24d34
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Sep 23, 2024
332d348
better make clean-all
justinthelaw Sep 23, 2024
14ab833
complete overhaul of registry1 weekly
justinthelaw Sep 23, 2024
3caed3a
revert formatting
justinthelaw Sep 23, 2024
c50e16a
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Sep 23, 2024
cbbdc20
update yq command for zarf.yaml
justinthelaw Sep 23, 2024
8518d71
yq sub typo
justinthelaw Sep 23, 2024
f11dd73
go back to using latest bundle
justinthelaw Sep 23, 2024
4079620
package create modifications
justinthelaw Sep 23, 2024
dd52e03
typo UDS zarf package create
justinthelaw Sep 23, 2024
a4fb386
correct bundle pointers and mutation
justinthelaw Sep 23, 2024
7192692
different zarf package ref location
justinthelaw Sep 23, 2024
d465753
log level debug
justinthelaw Sep 23, 2024
58b67c6
confirm missing C lib, more dynamic API create
justinthelaw Sep 24, 2024
25a1223
README improvement
justinthelaw Sep 24, 2024
9185ebf
README improvement, pt.2
justinthelaw Sep 24, 2024
5ff7f1c
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Sep 24, 2024
ee08217
0.13.0, merge with test branch
justinthelaw Sep 24, 2024
982533f
more FinishReason exception throwing
justinthelaw Sep 24, 2024
4c4b0b6
fix class method on FinishReason
justinthelaw Sep 24, 2024
78efedb
change method name
justinthelaw Sep 24, 2024
55546a7
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 25, 2024
c7ca585
Merge branch 'main' into chore-update-registry1-weekly-bundle-0.13.0
justinthelaw Sep 25, 2024
072427a
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 25, 2024
5e545f6
Merge branch 'main' into chore-update-registry1-weekly-bundle-0.13.0
justinthelaw Sep 25, 2024
91da0ce
modify release-please-config
justinthelaw Sep 25, 2024
240e2c1
weekly sunday 12AM pst
justinthelaw Sep 25, 2024
d673244
move install to JIT
justinthelaw Sep 25, 2024
81c598c
remove udsCliVersion
justinthelaw Sep 25, 2024
301e9dd
comment typo
justinthelaw Sep 25, 2024
340414f
add v to registry ref
justinthelaw Sep 25, 2024
8c4e194
Merge branch 'main' into chore-update-registry1-weekly-bundle-0.13.0
justinthelaw Sep 25, 2024
beb643f
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 25, 2024
3defb55
better sub yq cmd
justinthelaw Sep 25, 2024
4fdec61
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Sep 25, 2024
da1e466
add failure logging
justinthelaw Sep 25, 2024
b2b6905
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Sep 25, 2024
3cfecf0
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 26, 2024
94d2385
Merge branch 'main' into chore-update-registry1-weekly-bundle-0.13.0
justinthelaw Sep 26, 2024
ccd99e9
Merge branch 'main' into chore-update-registry1-weekly-bundle-0.13.0
justinthelaw Sep 26, 2024
26932de
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 26, 2024
649406d
Update release-please-config.json
justinthelaw Sep 27, 2024
20a73b7
Update and rename e2e-registry1-weekly.yaml to weekly-registry1-e2e-t…
justinthelaw Sep 27, 2024
a4f4c0f
Update and rename weekly-registry1-e2e-testing.yaml to weekly-registr…
justinthelaw Sep 27, 2024
ab5871d
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 27, 2024
0928698
Merge branch 'main' into chore-update-registry1-weekly-bundle-0.13.0
justinthelaw Sep 27, 2024
757166e
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Sep 27, 2024
5cca687
0.13.1
justinthelaw Sep 27, 2024
db7193e
Merge remote-tracking branch 'origin/main' into 835-upgrade-vllm-for-…
justinthelaw Sep 27, 2024
c878283
Merge branch 'main' into chore-update-registry1-weekly-bundle-0.13.0
justinthelaw Sep 27, 2024
be13c59
filename typo
justinthelaw Sep 27, 2024
1264c4c
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Sep 27, 2024
db7e27a
make target typo
justinthelaw Sep 27, 2024
ef2f559
env variabilized
justinthelaw Sep 27, 2024
bcc1287
make target just does not work
justinthelaw Sep 27, 2024
03837c9
image_versions explicit set
justinthelaw Sep 27, 2024
8e4faf3
image_versions explicit set, pt.2
justinthelaw Sep 27, 2024
7a3c365
Merge branch 'main' into chore-update-registry1-weekly-bundle-0.13.0
justinthelaw Sep 27, 2024
f77bcfe
use version pattern from release.yaml
justinthelaw Sep 27, 2024
37093dd
merge and resolve release conflict
justinthelaw Sep 27, 2024
14351c1
remove the v
justinthelaw Sep 27, 2024
46174ed
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Sep 27, 2024
ce4c30f
Merge branch 'main' into chore-update-registry1-weekly-bundle-0.13.0
justinthelaw Sep 30, 2024
f502e06
fix lint
justinthelaw Sep 30, 2024
af8c971
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Sep 30, 2024
fd2c153
cutover to utils.client.py
justinthelaw Oct 1, 2024
d22439e
Merge branch 'main' into chore-update-registry1-weekly-bundle-0.13.0
justinthelaw Oct 1, 2024
5c493ea
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Oct 1, 2024
ae68868
cutover to utils.client.py, pt.2
justinthelaw Oct 1, 2024
2acb604
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Oct 1, 2024
ca55f72
cutover to utils.client.py, pt.3
justinthelaw Oct 1, 2024
a42c320
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Oct 1, 2024
807fbdc
fix text embeddings backend full
justinthelaw Oct 1, 2024
abff6bd
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Oct 1, 2024
a9c34fb
remove extraneous env
justinthelaw Oct 1, 2024
8caf64f
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Oct 1, 2024
a6f0af0
add get_supabase_url, default model warnings
justinthelaw Oct 1, 2024
0b291f4
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Oct 1, 2024
d590268
supabase base url incorrect
justinthelaw Oct 1, 2024
54af6dc
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Oct 1, 2024
17e20fa
supabase_url in wrong position
justinthelaw Oct 1, 2024
2c3b7f1
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Oct 1, 2024
76efca3
Merge remote-tracking branch 'origin/main' into chore-update-registry…
justinthelaw Oct 1, 2024
1211e69
fastapi status code usage
justinthelaw Oct 1, 2024
7e6bdb2
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Oct 1, 2024
a4c5ace
FinishReason _missing_ class method
justinthelaw Oct 1, 2024
5ee07cf
new missing JWT
justinthelaw Oct 1, 2024
df60811
Merge remote-tracking branch 'origin/chore-update-registry1-weekly-bu…
justinthelaw Oct 1, 2024
0c12449
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Oct 1, 2024
99c27c9
missing ZARF VAR passthrough to values
justinthelaw Oct 2, 2024
c106e10
more clarity in the README
justinthelaw Oct 3, 2024
d92b572
Merge branch 'main' into 835-upgrade-vllm-for-gptq-bfloat16-inferencing
justinthelaw Oct 3, 2024
2 changes: 1 addition & 1 deletion .github/actions/release/action.yaml
@@ -138,7 +138,7 @@ runs:
run: |
docker buildx build --build-arg LOCAL_VERSION=${{ inputs.releaseTag }} -t ghcr.io/defenseunicorns/leapfrogai/vllm:${{ inputs.releaseTag }} --push -f packages/vllm/Dockerfile .

- zarf package create packages/vllm --set=IMAGE_VERSION=${{ inputs.releaseTag }} --flavor upstream --confirm
+ ZARF_CONFIG=packages/vllm/zarf-config.yaml zarf package create packages/vllm --set=IMAGE_VERSION=${{ inputs.releaseTag }} --flavor upstream --confirm

zarf package publish zarf-package-vllm-amd64-${{ inputs.releaseTag }}.tar.zst oci://ghcr.io/defenseunicorns/packages${{ inputs.subRepository }}leapfrogai

5 changes: 5 additions & 0 deletions .github/release-please-config.json
@@ -26,6 +26,11 @@
"path": "**/zarf.yaml",
"glob": true
},
+ {
+ "type": "generic",
+ "path": "**/zarf-config.yaml",
+ "glob": true
+ },
{
"type": "generic",
"path": "**/uds-bundle.yaml",
6 changes: 6 additions & 0 deletions .github/workflows/e2e-llama-cpp-python.yaml
@@ -56,6 +56,11 @@ jobs:
runs-on: ai-ubuntu-big-boy-8-core
if: ${{ !github.event.pull_request.draft }}

+ permissions:
+ contents: read
+ packages: read
+ id-token: write # This is needed for OIDC federation.
+
steps:
- name: Checkout Repo
uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
@@ -69,6 +74,7 @@ jobs:
registry1Username: ${{ secrets.IRON_BANK_ROBOT_USERNAME }}
registry1Password: ${{ secrets.IRON_BANK_ROBOT_PASSWORD }}
ghToken: ${{ secrets.GITHUB_TOKEN }}
+ chainguardIdentity: ${{ secrets.CHAINGUARD_IDENTITY }}

- name: Setup API and Supabase
uses: ./.github/actions/lfai-core
8 changes: 7 additions & 1 deletion .github/workflows/e2e-playwright.yaml
@@ -57,6 +57,11 @@ jobs:
runs-on: ai-ubuntu-big-boy-8-core
if: ${{ !github.event.pull_request.draft }}

+ permissions:
+ contents: read
+ packages: read
+ id-token: write # This is needed for OIDC federation.
+
steps:
- name: Checkout Repo
uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
@@ -82,6 +87,7 @@ jobs:
registry1Username: ${{ secrets.IRON_BANK_ROBOT_USERNAME }}
registry1Password: ${{ secrets.IRON_BANK_ROBOT_PASSWORD }}
ghToken: ${{ secrets.GITHUB_TOKEN }}
+ chainguardIdentity: ${{ secrets.CHAINGUARD_IDENTITY }}

- name: Create Test User
run: |
@@ -120,7 +126,7 @@ jobs:
- name: UI/API/Supabase E2E Playwright Tests
run: |
cp src/leapfrogai_ui/.env.example src/leapfrogai_ui/.env
- rm src/leapfrogai_ui/tests/global.teardown.ts
+ rm src/leapfrogai_ui/tests/global.teardown.ts
mkdir -p src/leapfrogai_ui/playwright/.auth
SERVICE_ROLE_KEY=$(uds zarf tools kubectl get secret -n leapfrogai supabase-bootstrap-jwt -o jsonpath={.data.service-key} | base64 -d)
echo "::add-mask::$SERVICE_ROLE_KEY"
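
Note on the step above: the service role key is read from a Kubernetes Secret, base64-decoded, and then registered with the `::add-mask::` workflow command so that later log lines redact it. A minimal Python sketch of the same two operations follows. The sample key and the helper names `decode_secret_value` and `mask_command` are hypothetical and are not part of this PR.

```python
import base64


def decode_secret_value(encoded: str) -> str:
    """Decode a base64-encoded Kubernetes Secret data value into text."""
    return base64.b64decode(encoded).decode("utf-8")


def mask_command(value: str) -> str:
    """Build the GitHub Actions workflow command that masks a value in logs."""
    return f"::add-mask::{value}"


if __name__ == "__main__":
    # Hypothetical stand-in for the `.data.service-key` field returned by kubectl.
    encoded = base64.b64encode(b"example-service-role-key").decode("ascii")
    key = decode_secret_value(encoded)
    # Printing this line to stdout inside a workflow step registers the mask.
    print(mask_command(key))
```

The mask only applies to output produced after the command is emitted, which is presumably why the workflow echoes it immediately after reading the secret.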
9 changes: 9 additions & 0 deletions .github/workflows/e2e-text-backend-full-cpu.yaml
@@ -57,6 +57,11 @@ jobs:
runs-on: ai-ubuntu-big-boy-8-core
if: ${{ !github.event.pull_request.draft }}

+ permissions:
+ contents: read
+ packages: read
+ id-token: write # This is needed for OIDC federation.
+
steps:
- name: Checkout Repo
uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
@@ -69,6 +74,8 @@ jobs:
with:
registry1Username: ${{ secrets.IRON_BANK_ROBOT_USERNAME }}
registry1Password: ${{ secrets.IRON_BANK_ROBOT_PASSWORD }}
+ ghToken: ${{ secrets.GITHUB_TOKEN }}
+ chainguardIdentity: ${{ secrets.CHAINGUARD_IDENTITY }}

- name: Setup LFAI-API and Supabase
uses: ./.github/actions/lfai-core
@@ -97,5 +104,7 @@ jobs:
# Test
##########
- name: Test Text Backend
+ env:
+ LEAPFROGAI_MODEL: llama-cpp-python
run: |
python -m pytest ./tests/e2e/test_text_backend_full.py -v
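
Note on the step above: the test step now selects its backend through the `LEAPFROGAI_MODEL` environment variable. The test code that consumes it is not part of this view, so the following is only a sketch of how such a lookup might behave, assuming a fallback default with a warning (the commit list mentions "default model warnings"). The function name `resolve_model_name` and the `DEFAULT_MODEL` value are hypothetical.

```python
import os
import warnings

# Hypothetical fallback; the project's real default (if any) is not shown in this diff.
DEFAULT_MODEL = "llama-cpp-python"


def resolve_model_name(env=None) -> str:
    """Return the model name from LEAPFROGAI_MODEL, warning when it is unset."""
    env = os.environ if env is None else env
    model = env.get("LEAPFROGAI_MODEL")
    if not model:
        warnings.warn(
            f"LEAPFROGAI_MODEL is not set; falling back to {DEFAULT_MODEL!r}",
            stacklevel=2,
        )
        return DEFAULT_MODEL
    return model


if __name__ == "__main__":
    # Mirrors the workflow's `env:` block by passing an explicit mapping.
    print(resolve_model_name({"LEAPFROGAI_MODEL": "vllm"}))
```

Setting the variable explicitly in the workflow keeps the test independent of whatever default the helper would otherwise pick.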
6 changes: 6 additions & 0 deletions .github/workflows/e2e-text-embeddings.yaml
@@ -58,6 +58,11 @@ jobs:
runs-on: ai-ubuntu-big-boy-8-core
if: ${{ !github.event.pull_request.draft }}

+ permissions:
+ contents: read
+ packages: read
+ id-token: write # This is needed for OIDC federation.
+
steps:
- name: Checkout Repo
uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
@@ -71,6 +76,7 @@ jobs:
registry1Username: ${{ secrets.IRON_BANK_ROBOT_USERNAME }}
registry1Password: ${{ secrets.IRON_BANK_ROBOT_PASSWORD }}
ghToken: ${{ secrets.GITHUB_TOKEN }}
+ chainguardIdentity: ${{ secrets.CHAINGUARD_IDENTITY }}

- name: Setup LFAI-API and Supabase
uses: ./.github/actions/lfai-core
9 changes: 7 additions & 2 deletions .github/workflows/e2e-vllm.yaml
@@ -58,6 +58,11 @@ jobs:
runs-on: ai-ubuntu-big-boy-8-core
if: ${{ !github.event.pull_request.draft }}

+ permissions:
+ contents: read
+ packages: read
+ id-token: write # This is needed for OIDC federation.
+
steps:
- name: Checkout Repo
uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
@@ -73,7 +78,7 @@ jobs:
registry1Username: ${{ secrets.IRON_BANK_ROBOT_USERNAME }}
registry1Password: ${{ secrets.IRON_BANK_ROBOT_PASSWORD }}
ghToken: ${{ secrets.GITHUB_TOKEN }}
- udsCliVersion: 0.14.0
+ chainguardIdentity: ${{ secrets.CHAINGUARD_IDENTITY }}

##########
# vLLM
@@ -82,4 +87,4 @@ jobs:
##########
- name: Build vLLM
run: |
- make build-vllm LOCAL_VERSION=e2e-test
+ make build-vllm LOCAL_VERSION=e2e-test ZARF_CONFIG=packages/vllm/zarf-config.yaml
6 changes: 6 additions & 0 deletions .github/workflows/e2e-whisper.yaml
@@ -56,6 +56,11 @@ jobs:
runs-on: ai-ubuntu-big-boy-8-core
if: ${{ !github.event.pull_request.draft }}

+ permissions:
+ contents: read
+ packages: read
+ id-token: write # This is needed for OIDC federation.
+
steps:
- name: Checkout Repo
uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
@@ -71,6 +76,7 @@ jobs:
registry1Username: ${{ secrets.IRON_BANK_ROBOT_USERNAME }}
registry1Password: ${{ secrets.IRON_BANK_ROBOT_PASSWORD }}
ghToken: ${{ secrets.GITHUB_TOKEN }}
+ chainguardIdentity: ${{ secrets.CHAINGUARD_IDENTITY }}

- name: Setup LFAI-API and Supabase
uses: ./.github/actions/lfai-core
6 changes: 5 additions & 1 deletion .github/workflows/pytest.yaml
@@ -31,7 +31,10 @@ on:
- "!packages/ui/**"

# Declare default permissions as read only.
- permissions: read-all
+ permissions:
+ contents: read
+ packages: read
+ id-token: write # This is needed for OIDC federation.

concurrency:
group: pytest-integration-${{ github.ref }}
@@ -97,6 +100,7 @@ jobs:
registry1Username: ${{ secrets.IRON_BANK_ROBOT_USERNAME }}
registry1Password: ${{ secrets.IRON_BANK_ROBOT_PASSWORD }}
ghToken: ${{ secrets.GITHUB_TOKEN }}
+ chainguardIdentity: ${{ secrets.CHAINGUARD_IDENTITY }}

- name: Setup API and Supabase
uses: ./.github/actions/lfai-core
@@ -1,8 +1,8 @@
- name: e2e-registry1-weekly
+ name: weekly-registry1-flavor-test

on:
schedule:
- - cron: "0 0 * * 6" # Run every Sunday at 12 AM EST
+ - cron: "0 8 * * 0" # Run every Sunday at 12 AM PST
workflow_dispatch: # trigger manually as needed
pull_request:
types:
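
Note on the schedule change in the hunk above: GitHub Actions evaluates `schedule` cron expressions in UTC, and day-of-week `0` is Sunday while `6` is Saturday. The sketch below converts a simple cron expression to a local weekday and time for a fixed UTC offset, to check the two comments. It only handles plain numeric minute, hour, and day-of-week fields, and the helper name `describe_cron` is hypothetical.

```python
# Cron day-of-week numbering: 0 is Sunday, 6 is Saturday.
CRON_WEEKDAYS = [
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday",
]


def describe_cron(expr: str, utc_offset_hours: int) -> str:
    """Describe when a UTC cron expression fires in a fixed-offset local zone.

    Assumes plain numeric minute, hour, and day-of-week fields (no ranges,
    lists, or steps), which is all the workflow above uses.
    """
    minute, hour, _day_of_month, _month, day_of_week = expr.split()
    total_minutes = int(hour) * 60 + int(minute) + utc_offset_hours * 60
    # divmod floors toward negative infinity, so a negative total rolls back a day.
    day_shift, local_minutes = divmod(total_minutes, 24 * 60)
    weekday = CRON_WEEKDAYS[(int(day_of_week) + day_shift) % 7]
    return f"{weekday} {local_minutes // 60:02d}:{local_minutes % 60:02d}"


if __name__ == "__main__":
    print(describe_cron("0 8 * * 0", -8))  # new schedule, PST (UTC-8)
    print(describe_cron("0 8 * * 0", -7))  # new schedule, PDT (UTC-7)
    print(describe_cron("0 0 * * 6", -5))  # old schedule, EST (UTC-5)
```

Under these assumptions the new expression lands on Sunday 00:00 in PST and Sunday 01:00 during daylight saving time, while the old expression fired on Friday evening in EST rather than on Sunday as its comment claimed.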
@@ -12,79 +12,110 @@ on:
- ready_for_review # don't run on draft PRs
- milestoned # allows us to trigger on bot PRs
paths:
- .github/workflows/e2e-registry1-weekly.yaml
- .github/workflows/weekly-registry1-flavor-test.yaml
- bundles/latest/**

concurrency:
group: e2e-registry1-weekly-${{ github.ref }}
group: weekly-registry1-flavor-test-${{ github.ref }}
cancel-in-progress: true

defaults:
run:
shell: bash

jobs:
test-flavors:
registry1-flavor-test:
runs-on: ai-ubuntu-big-boy-8-core
name: e2e_registry1_weekly
name: weekly_registry1_flavor_test
if: ${{ !github.event.pull_request.draft }}

permissions:
contents: read
packages: write
packages: read
id-token: write # This is needed for OIDC federation.

steps:
- name: Checkout Repo
# Checkout main just to see the latest release in the release-please manifest
- name: Checkout Repo (main)
uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
with:
# x-release-please-start-version
ref: "caf4f9c3093a55a003b49fcbf05c03221be6a232" # 0.12.2 w/ integration tests turned-on
# x-release-please-end
ref: main

- name: Setup Python
uses: ./.github/actions/python
- name: Get Latest Release Version
id: get_version
run: |
LFAI_VERSION=$(jq -r '.["."]' .github/.release-please-manifest.json)
echo "LFAI_VERSION=$LFAI_VERSION" >> $GITHUB_OUTPUT

- name: Install API and SDK Dev Dependencies
run : |
make install
################
# LATEST RELEASE
################

- name: Checkout Repo
uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
with:
fetch-tags: true
ref: v${{ steps.get_version.outputs.LFAI_VERSION }}

- name: Setup UDS Cluster
uses: ./.github/actions/uds-cluster
- name: Setup UDS Environment
uses: defenseunicorns/uds-common/.github/actions/setup@24c8a2a48eeb33773b76b3587c489cb17496c9e0 # v0.12.0
with:
registry1Username: ${{ secrets.IRON_BANK_ROBOT_USERNAME }}
registry1Password: ${{ secrets.IRON_BANK_ROBOT_PASSWORD }}
ghToken: ${{ secrets.GITHUB_TOKEN }}
udsCliVersion: 0.14.0
chainguardIdentity: ${{ secrets.CHAINGUARD_IDENTITY }}

- name: Create UDS Cluster
shell: bash
- name: Setup Python
uses: actions/setup-python@0a5c61591373683505ea898e09a3ea4f39ef2b9c #v5.0.0
with:
python-version-file: "pyproject.toml"

- name: Install Python Dependencies
run: pip install ".[dev]" "src/leapfrogai_api" "src/leapfrogai_sdk" --no-cache-dir

- name: Mutation of the Zarf Packages
run: |
UDS_CONFIG=.github/config/uds-config.yaml make create-uds-cpu-cluster
uds zarf tools yq -i '
.components[].images[0] |= sub(":v[0-9\.]+$", ":v${{ steps.get_version.outputs.LFAI_VERSION }}")
' packages/api/zarf.yaml
uds zarf tools yq -i '.api.image.tag = "v${{ steps.get_version.outputs.LFAI_VERSION }}"' packages/api/values/registry1-values.yaml

- name: Setup Playwright
- name: Print the Modified Zarf Packages
run: |
npm --prefix src/leapfrogai_ui ci
npx --prefix src/leapfrogai_ui playwright install
cat packages/api/zarf.yaml
cat packages/api/values/registry1-values.yaml

- name: Create Registry1 Packages
- name: Create Registry1 Zarf Packages
run: |
LOCAL_VERSION=registry1 FLAVOR=registry1 make build-api
uds zarf package create packages/api --set image_version="${{ steps.get_version.outputs.LFAI_VERSION }}" --flavor registry1 -a amd64 --confirm

# Mutate UDS bundle definition to use Registry1 packages
- name: Mutation to Registry1 Bundle
# TODO: fix bundle path
# Mutate non-Registry1 packages to be the current tagged version
- name: Mutation of the UDS Bundle
run: |
uds zarf tools yq -i '.packages[1] |= del(.repository)' bundles/latest/cpu/uds-bundle.yaml
uds zarf tools yq -i '.packages[1] |= .ref = "registry1"' bundles/latest/cpu/uds-bundle.yaml
uds zarf tools yq -i '.packages[1] |= .path = "../../../packages/api"' bundles/latest/cpu/uds-bundle.yaml
uds zarf tools yq -i '.metadata.version = "registry1"' bundles/latest/cpu/uds-bundle.yaml

- name: Create and Deploy Bundle
uds zarf tools yq -i '.packages[].ref |= sub("^[^ ]+-upstream$", "${{ steps.get_version.outputs.LFAI_VERSION }}-upstream")' bundles/latest/cpu/uds-bundle.yaml

uds zarf tools yq -i '.packages[1] |= del(.repository)' bundles/latest/cpu/uds-bundle.yaml
uds zarf tools yq -i '.packages[1] |= .ref = "${{ steps.get_version.outputs.LFAI_VERSION }}"' bundles/latest/cpu/uds-bundle.yaml
uds zarf tools yq -i '.packages[1] |= .path = "../../../"' bundles/latest/cpu/uds-bundle.yaml

- name: Print the Modified UDS Bundle
run: |
cat bundles/latest/cpu/uds-config.yaml
cat bundles/latest/cpu/uds-bundle.yaml

- name: Create UDS Cluster
shell: bash
run: |
UDS_CONFIG=.github/config/uds-config.yaml make create-uds-cpu-cluster

- name: Create and Deploy Registry1 Bundle
run: |
cd bundles/latest/cpu
uds create . --confirm && \
uds deploy uds-bundle-leapfrogai-amd64-registry1.tar.zst --confirm --no-progress && \
uds deploy uds-bundle-leapfrogai-amd64-registry1.tar.zst --confirm --no-progress --log-level debug && \
rm -rf uds-bundle-leapfrogai-amd64-registry1.tar.zst && \
docker system prune -af
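
Note on the hunk above: the workflow reads the latest release version from the release-please manifest with `jq -r '.["."]'`, then rewrites image tags and bundle refs with `yq` `sub(...)` calls. The Python sketch below mirrors that logic with the same regular expressions so the matching behavior is easier to inspect. The sample manifest, image names, and helper names are hypothetical, and Python's `re` stands in for the regex engine that `yq` uses.

```python
import json
import re


def latest_version(manifest_text: str) -> str:
    """Return the root package version, like `jq -r '.["."]'` on the manifest."""
    return json.loads(manifest_text)["."]


def retag_image(image: str, version: str) -> str:
    """Replace a trailing `:v<digits and dots>` tag with `:v<version>`."""
    return re.sub(r":v[0-9.]+$", lambda _match: f":v{version}", image)


def retag_upstream_ref(ref: str, version: str) -> str:
    """Replace a whole `<anything>-upstream` ref with `<version>-upstream`."""
    return re.sub(r"^[^ ]+-upstream$", lambda _match: f"{version}-upstream", ref)


if __name__ == "__main__":
    # Hypothetical manifest contents and references, for illustration only.
    version = latest_version('{".": "0.13.1"}')
    print(retag_image("registry.example/leapfrogai/api:v0.12.2", version))
    print(retag_upstream_ref("0.12.2-upstream", version))
    # Values that do not match the patterns pass through unchanged.
    print(retag_image("registry.example/leapfrogai/api:latest", version))
```

One consequence of the patterns as written: image tags are only rewritten when they already carry a `v` prefix, and bundle refs are only rewritten when they end in `-upstream`, so any other reference is left untouched.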

@@ -107,32 +138,19 @@ jobs:
echo "ANON_KEY is set: ${{ steps.generate_secrets.outputs.ANON_KEY != '' }}"
echo "SERVICE_KEY is set: ${{ steps.generate_secrets.outputs.SERVICE_KEY != '' }}"

- name: Run Integration Tests
env:
SUPABASE_ANON_KEY: ${{ steps.generate_secrets.outputs.ANON_KEY }}
SUPABASE_PASS: ${{ steps.generate_secrets.outputs.FAKE_PASSWORD }}
SUPABASE_EMAIL: [email protected]
SUPABASE_URL: https://supabase-kong.uds.dev
# Turn off NIAH tests that are not applicable for integration testing using the Repeater model
LFAI_RUN_NIAH_TESTS: "false"
run: |
uds zarf connect --name=llama-cpp-python-model --namespace=leapfrogai --local-port=50051 --remote-port=50051 &
while ! nc -z localhost 50051; do sleep 1; done

make test-user-pipeline
env $(cat .env | xargs) python -m pytest -v -s tests/integration/api

# Backends
- name: Run Backend E2E Tests
env:
ANON_KEY: ${{ steps.generate_secrets.outputs.ANON_KEY }}
SERVICE_KEY: ${{ steps.generate_secrets.outputs.SERVICE_KEY }}
LEAPFROGAI_MODEL: llama-cpp-python
run: |
python -m pytest -vvv -s ./tests/e2e

- name: Setup Playwright
run: |
python -m pytest ./tests/e2e/test_llama.py -vv
python -m pytest ./tests/e2e/test_text_embeddings.py -vv
python -m pytest ./tests/e2e/test_whisper.py -vv
python -m pytest ./tests/e2e/test_supabase.py -vv
python -m pytest ./tests/e2e/test_api.py -vv
npm --prefix src/leapfrogai_ui ci
npx --prefix src/leapfrogai_ui playwright install

- name: Run Playwright E2E Tests
env:
@@ -156,3 +174,12 @@ jobs:
name: playwright-report
path: src/leapfrogai_ui/e2e-report/
retention-days: 30

- name: Get Cluster Debug Information
id: debug
if: ${{ !cancelled() }}
uses: defenseunicorns/uds-common/.github/actions/debug-output@e3008473beab00b12a94f9fcc7340124338d5c08 # v0.13.1

- name: Get Cluster Debug Information
if: ${{ !cancelled() && steps.debug.conclusion == 'success' }}
uses: defenseunicorns/uds-common/.github/actions/save-logs@e3008473beab00b12a94f9fcc7340124338d5c08 # v0.13.1