Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finalize generation script #33

Merged
merged 18 commits into from
Jan 30, 2025

Conversation

thepetk
Copy link
Collaborator

@thepetk thepetk commented Jan 27, 2025

What does this PR do?:

The PR finalizes the automatic conversion of the shared gitops resources between the ai-lab-app and the ai-lab-helm-chart.

More specifically it introduces the following updates:

  • A script called generate.sh is added and is the main point to update all the shared resources. Along with the script a workflow has been added to check in every new PR that we are aligned with the ai-lab-app repo. This way we guarantee that both ai-lab-app & ai-lab-samples are always up-to-date. Finally, a quick readme has been added to capture all this information. I find the best location to be the ai-software-templates directory which contains all the related charts.
  • In regards to the chatbot-ai-sample chart, all the updates brought from Add partially-automatic conversion flow #31 are now tested and captured in the readme tmpl (and ofc in the readme) of the chart.
  • Finally, the values.dbRequired is removed for now from the new resource content as it is only related with the RAG case. I think the best approach here is to create a new chart that will cover this case and not use the chatbot chart for this.

Apart from the above, also some other details are:

  • To make sure that the existingModelServer will remove the model-server deployment if set, the conversion script now adds on the top and bottom of all *model-server* files a condition.
  • As you can notice some updates are also pulled from ai-lab-app after the formatting PR that was merged recently.

Which issue(s) this PR fixes:

Fixes RHDHPAI-379

PR acceptance criteria:

Testing and documentation do not need to be complete in order for this PR to be approved. We just need to ensure tracking issues are opened and linked to this PR, if they are not in the PR scope due to various constraints.

  • Tested and Verified
  • Documentation (READMEs, Product Docs, Blogs, Education Modules, etc.)

How to test changes / Special notes to the reviewer:

In regards to testing, I've tested the following cases:

  • vLLM support. If vllmSelected is set the model server deployed is based on this technology. To test this case you need to provision a cluster with GPU support:

image

  • The existing model server case, along with the bearer authentication support. This skips the model-server deployment and service creation:

existing-model-server-vs-vllm

  • A test run for the new workflow is here

@thepetk thepetk changed the title Feat/add generate script WIP: Finalize generation script Jan 27, 2025
@thepetk
Copy link
Collaborator Author

thepetk commented Jan 27, 2025

WIP: because I'm working on an issue with the MODEL_ENDPOINT_BEARER env var and the update of the deployment done by the tekton pipeline

Signed-off-by: thepetk <[email protected]>
Copy link
Contributor

@gabemontero gabemontero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some review of the README's with some minor wording suggestions

I'm booting up a cluster now and will checkout this branch and given a go later today @thepetk - thanks

charts/ai-software-templates/README.md Outdated Show resolved Hide resolved
charts/ai-software-templates/README.md Outdated Show resolved Hide resolved
charts/ai-software-templates/README.md Outdated Show resolved Hide resolved
charts/ai-software-templates/chatbot/README.md Outdated Show resolved Hide resolved
charts/ai-software-templates/chatbot/README.md Outdated Show resolved Hide resolved
charts/ai-software-templates/chatbot/README.md.gotmpl Outdated Show resolved Hide resolved
charts/ai-software-templates/chatbot/README.md.gotmpl Outdated Show resolved Hide resolved
@gabemontero
Copy link
Contributor

Testing the 3 charts worked for me @thepetk , including an update to the app getting rolled out. I'm guessing until we have an update to ai-lab-app or ai-lab-samples the generate.sh script won't produce any changes (it did not for me when I ran locally).

@thepetk
Copy link
Collaborator Author

thepetk commented Jan 29, 2025

Testing the 3 charts worked for me @thepetk , including an update to the app getting rolled out. I'm guessing until we have an update to ai-lab-app or ai-lab-samples the generate.sh script won't produce any changes (it did not for me when I ran locally).

Yeah it should be this way and pretty much this is the main reason I added the workflow just to be sure every time we are up-to-date with the ai-lab-app dependency. For the ai-lab-samples dependency, I think we are ok, as we directly populating the application repo from the ai-lab-samples we are automatically up-to-date. WDYT?

@gabemontero
Copy link
Contributor

Testing the 3 charts worked for me @thepetk , including an update to the app getting rolled out. I'm guessing until we have an update to ai-lab-app or ai-lab-samples the generate.sh script won't produce any changes (it did not for me when I ran locally).

Yeah it should be this way and pretty much this is the main reason I added the workflow just to be sure every time we are up-to-date with the ai-lab-app dependency. For the ai-lab-samples dependency, I think we are ok, as we directly populating the application repo from the ai-lab-samples we are automatically up-to-date. WDYT?

Sure all that is good. I'm just saying we can't fully validate until a change is made to those repo(s).

Before you move your Jira to closed, were you planning on making some sort of change in those repo(s) that would get caught by your github workflow, or get seen if someone ran generate.sh locally?

@thepetk
Copy link
Collaborator Author

thepetk commented Jan 29, 2025

Testing the 3 charts worked for me @thepetk , including an update to the app getting rolled out. I'm guessing until we have an update to ai-lab-app or ai-lab-samples the generate.sh script won't produce any changes (it did not for me when I ran locally).

Yeah it should be this way and pretty much this is the main reason I added the workflow just to be sure every time we are up-to-date with the ai-lab-app dependency. For the ai-lab-samples dependency, I think we are ok, as we directly populating the application repo from the ai-lab-samples we are automatically up-to-date. WDYT?

Sure all that is good. I'm just saying we can't fully validate until a change is made to those repo(s).

Before you move your Jira to closed, were you planning on making some sort of change in those repo(s) that would get caught by your github workflow, or get seen if someone ran generate.sh locally?

I think we are getting new changes right? Because we are importing the changes from the linter merged in the ai-lab-app recently. So the script ran and imported the updates on the ai-lab-app files. This can be verified by the workflow now as now that we run it again it doesn't introduce anything new.

@thepetk
Copy link
Collaborator Author

thepetk commented Jan 29, 2025

WIP: because I'm working on an issue with the MODEL_ENDPOINT_BEARER env var and the update of the deployment done by the tekton pipeline

I've managed to make the bearer auth work. In order to achieve that I've opened a PR in the rhdh-pipelines side and I made an update to the app-config so it can stay permanently as a configmap so we can fetch each time the necessary variables.

@gabemontero PTAL :)

@thepetk
Copy link
Collaborator Author

thepetk commented Jan 29, 2025

Example screenshot using existing model server, bearer auth and the updated pipeline introduced by redhat-ai-dev/rhdh-pipelines#11

Screenshot from 2025-01-29 15-00-38

Btw the whole experience with an existing model server is way faster, the application is up and running very fast, so the git repo and the pipelines.

@thepetk thepetk changed the title WIP: Finalize generation script Finalize generation script Jan 29, 2025
@thepetk thepetk requested a review from gabemontero January 29, 2025 15:02
@gabemontero
Copy link
Contributor

Testing the 3 charts worked for me @thepetk , including an update to the app getting rolled out. I'm guessing until we have an update to ai-lab-app or ai-lab-samples the generate.sh script won't produce any changes (it did not for me when I ran locally).

Yeah it should be this way and pretty much this is the main reason I added the workflow just to be sure every time we are up-to-date with the ai-lab-app dependency. For the ai-lab-samples dependency, I think we are ok, as we directly populating the application repo from the ai-lab-samples we are automatically up-to-date. WDYT?

Sure all that is good. I'm just saying we can't fully validate until a change is made to those repo(s).
Before you move your Jira to closed, were you planning on making some sort of change in those repo(s) that would get caught by your github workflow, or get seen if someone ran generate.sh locally?

I think we are getting new changes right? Because we are importing the changes from the linter merged in the ai-lab-app recently. So the script ran and imported the updates on the ai-lab-app files. This can be verified by the workflow now as now that we run it again it doesn't introduce anything new.

Either I'm mis-interpreting what you are saying, or I'm not making myself sufficiently clear, or something along those lines @thepetk

Let me try this way: At some point we should be able to submit at PR against this repo, and see the print at https://github.com/redhat-ai-dev/ai-lab-helm-charts/pull/33/files#diff-119dd431be9907b10ad9714895fb89d998ccd55bf25f3cc6bb4f1573ed7ca7e2R21 occur because on of those other repositories have been updated, correct?

Ideally we test that flow with perhaps a simple change, non effecting change in one of the other repos, merge in one of those repos, and then see your new workflow flag that generate.sh needs to be executed.

Correct?

@thepetk
Copy link
Collaborator Author

thepetk commented Jan 30, 2025

Testing the 3 charts worked for me @thepetk , including an update to the app getting rolled out. I'm guessing until we have an update to ai-lab-app or ai-lab-samples the generate.sh script won't produce any changes (it did not for me when I ran locally).

Yeah it should be this way and pretty much this is the main reason I added the workflow just to be sure every time we are up-to-date with the ai-lab-app dependency. For the ai-lab-samples dependency, I think we are ok, as we directly populating the application repo from the ai-lab-samples we are automatically up-to-date. WDYT?

Sure all that is good. I'm just saying we can't fully validate until a change is made to those repo(s).
Before you move your Jira to closed, were you planning on making some sort of change in those repo(s) that would get caught by your github workflow, or get seen if someone ran generate.sh locally?

I think we are getting new changes right? Because we are importing the changes from the linter merged in the ai-lab-app recently. So the script ran and imported the updates on the ai-lab-app files. This can be verified by the workflow now as now that we run it again it doesn't introduce anything new.

Either I'm mis-interpreting what you are saying, or I'm not making myself sufficiently clear, or something along those lines @thepetk

Let me try this way: At some point we should be able to submit at PR against this repo, and see the print at https://github.com/redhat-ai-dev/ai-lab-helm-charts/pull/33/files#diff-119dd431be9907b10ad9714895fb89d998ccd55bf25f3cc6bb4f1573ed7ca7e2R21 occur because on of those other repositories have been updated, correct?

Ideally we test that flow with perhaps a simple change, non effecting change in one of the other repos, merge in one of those repos, and then see your new workflow flag that generate.sh needs to be executed.

Correct?

Yeap! That's correct! Also apologies if my message was not clear.

I meant that in the current PR we can confirm that the generate.sh works as the PR imports changes merged in ai-lab-app from this commit. Those updates are imported in the ai-lab-helm-charts repo as the last time we imported changes was before this commit (in this PR). This can be confirmed from the corresponding workflow.

Now to test also the opposite, e.g that the workflow fails when "a change is introduced in workflow but not included in the PR", I've added a quick change in my fork of the ai-lab-app.

In my ai-lab-helm-charts fork, I've updated the script to point to my ai-lab-app fork and opened a test PR. I can confirm the workflow fails (status check failed here).

@gabemontero
Copy link
Contributor

Testing the 3 charts worked for me @thepetk , including an update to the app getting rolled out. I'm guessing until we have an update to ai-lab-app or ai-lab-samples the generate.sh script won't produce any changes (it did not for me when I ran locally).

Yeah it should be this way and pretty much this is the main reason I added the workflow just to be sure every time we are up-to-date with the ai-lab-app dependency. For the ai-lab-samples dependency, I think we are ok, as we directly populating the application repo from the ai-lab-samples we are automatically up-to-date. WDYT?

Sure all that is good. I'm just saying we can't fully validate until a change is made to those repo(s).
Before you move your Jira to closed, were you planning on making some sort of change in those repo(s) that would get caught by your github workflow, or get seen if someone ran generate.sh locally?

I think we are getting new changes right? Because we are importing the changes from the linter merged in the ai-lab-app recently. So the script ran and imported the updates on the ai-lab-app files. This can be verified by the workflow now as now that we run it again it doesn't introduce anything new.

If that occurred, I am unable to determine that. Is it possibly one of the 18 commits that pulled in that change that you can point me to?

Testing the 3 charts worked for me @thepetk , including an update to the app getting rolled out. I'm guessing until we have an update to ai-lab-app or ai-lab-samples the generate.sh script won't produce any changes (it did not for me when I ran locally).

Yeah it should be this way and pretty much this is the main reason I added the workflow just to be sure every time we are up-to-date with the ai-lab-app dependency. For the ai-lab-samples dependency, I think we are ok, as we directly populating the application repo from the ai-lab-samples we are automatically up-to-date. WDYT?

Sure all that is good. I'm just saying we can't fully validate until a change is made to those repo(s).
Before you move your Jira to closed, were you planning on making some sort of change in those repo(s) that would get caught by your github workflow, or get seen if someone ran generate.sh locally?

I think we are getting new changes right? Because we are importing the changes from the linter merged in the ai-lab-app recently. So the script ran and imported the updates on the ai-lab-app files. This can be verified by the workflow now as now that we run it again it doesn't introduce anything new.

Either I'm mis-interpreting what you are saying, or I'm not making myself sufficiently clear, or something along those lines @thepetk
Let me try this way: At some point we should be able to submit at PR against this repo, and see the print at https://github.com/redhat-ai-dev/ai-lab-helm-charts/pull/33/files#diff-119dd431be9907b10ad9714895fb89d998ccd55bf25f3cc6bb4f1573ed7ca7e2R21 occur because on of those other repositories have been updated, correct?
Ideally we test that flow with perhaps a simple change, non effecting change in one of the other repos, merge in one of those repos, and then see your new workflow flag that generate.sh needs to be executed.
Correct?

Yeap! That's correct! Also apologies if my message was not clear.

I meant that in the current PR we can confirm that the generate.sh works as the PR imports changes merged in ai-lab-app from this commit. Those updates are imported in the ai-lab-helm-charts repo as the last time we imported changes was before this commit (in this PR). This can be confirmed from the corresponding workflow.

Now to test also the opposite, e.g that the workflow fails when "a change is introduced in workflow but not included in the PR", I've added a quick change in my fork of the ai-lab-app.

In my ai-lab-helm-charts fork, I've updated the script to point to my ai-lab-app fork and opened a test PR. I can confirm the workflow fails (status check failed here).

Perfect :-) .... your verification via fork changes is even better then the post mortem "temp change" I was suggesting.

@gabemontero
Copy link
Contributor

WIP: because I'm working on an issue with the MODEL_ENDPOINT_BEARER env var and the update of the deployment done by the tekton pipeline

I've managed to make the bearer auth work. In order to achieve that I've opened a PR in the rhdh-pipelines side and I made an update to the app-config so it can stay permanently as a configmap so we can fetch each time the necessary variables.

@gabemontero PTAL :)

So is there a situation where we need to merge the rhdh-pipelines change BEFORE this one, or are you just citing the relationship, but they can merge independently @thepetk ?

@thepetk
Copy link
Collaborator Author

thepetk commented Jan 30, 2025

WIP: because I'm working on an issue with the MODEL_ENDPOINT_BEARER env var and the update of the deployment done by the tekton pipeline

I've managed to make the bearer auth work. In order to achieve that I've opened a PR in the rhdh-pipelines side and I made an update to the app-config so it can stay permanently as a configmap so we can fetch each time the necessary variables.
@gabemontero PTAL :)

So is there a situation where we need to merge the rhdh-pipelines change BEFORE this one, or are you just citing the relationship, but they can merge independently @thepetk ?

I'm quite on the fence, but honestly I don't see a blocker but as you say there's a relationship.

My plan is:

  • Merge this one, and the tekton will not be affected.
  • Merge the update on tekton so we can cover the case of the bearer auth.
  • Finally, I'll release a new helm chart version on the openshift-charts so we can have all this functionality there too.

@thepetk
Copy link
Collaborator Author

thepetk commented Jan 30, 2025

If that occurred, I am unable to determine that. Is it possibly one of the 18 commits that pulled in that change that you can point me to?

I think that one is a good example: ad618d8

@thepetk thepetk merged commit bcca534 into redhat-ai-dev:main Jan 30, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants