Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adjust Dockerfile & entrypoint and add Dependabot, CI & docker release #51

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

monotek
Copy link

@monotek monotek commented Feb 15, 2023

This PR adjusts the Dockerfile & adds Dependabot, CI and Docker release.

  • Dockerfile

    • remove user and crontab creation from entrypoint and move it to the dockerflile
    • set WORKDIR to /cvdupdate
    • use fixed version (3.12.0b1-slim) of the python image (for more fine graned updates)
    • remove gosu
    • container runs as user cvdupdate (fixed user id 1000) instead of root
    • add sudo to be able to run cron as user cvdupdate
    • files are stored in /cvdupdate/.cvdupdate (readme has been adjusted)
    • cvdupdate is run during docker build once so serve can be used directly without the need to run the update first
  • Dependabot

    • updates python / pip dependencies
    • updated the Dockerfile to the latest base python image
    • updates the github actions
  • CI

    • builds the docker image
    • runs the docker image
    • downloads main.cvd from running docker container via curl to check if everything works
    • updated github actions
  • Docker release using official docker build push action

    • image will be pushed to ghcr.io/Cisco-Talos/cvdupdate
      • github action genreal settings might need to be set to "Read and write permissions"
    • builds and pushes amd64 and arm64 docker images
    • pushes image when new commits in main (creates latest tag)
    • pushes image when new release / tag (added release drafter to simplify releases)
    • pushes image via daily github action schedule (created a tag with date and timestamp)
    • container can be tested by
      • docker run -it --net host --rm ghcr.io/monotek/cvdupdate
      • curl http://localhost:8000

Our usecase of this image is to run a database mirror inside a kubernetes cluster.

@monotek monotek changed the title add Dockerfile, Dependabot & CI add Dockerfile, Dependabot, CI & docker release Feb 15, 2023
@monotek
Copy link
Author

monotek commented Feb 24, 2023

@micahsnyder

would be nice this could get a review and some feedback :)

@val-ms
Copy link
Contributor

val-ms commented Feb 24, 2023

Hi @monotek thanks for the pull-request, and thanks for the reminder. Sorry about the delay. I will try to review it next week.

@monotek
Copy link
Author

monotek commented Apr 14, 2023

@micahsnyder

do you had any chance to look into it?
can i support you in any way?

@monotek
Copy link
Author

monotek commented May 3, 2023

Friendly ping :)

@val-ms
Copy link
Contributor

val-ms commented May 24, 2023

@monotek I'm sorry for dropping this and failing to notice the first major issue with this PR. There is an existing work in progress to add a Dockerfile to the project: #47

I intend to try to help finish cleanup on PR #47 and get it merged. After that, if you can base your work on that completed effort to extend the features you've added such as dependabot, CI, docker hub publishing - I think that would be best.

One thing to note is that your Dockerfile runs cvd update once, and then aims to serve a private mirror with the cvd serve feature. Meanwhile PR #47's Dockerfile aims to run cvd update on a schedule and does not offer the cvd serve feature.

I think running cvd update regularly makes the most sense for most people. Using cvd serve to host the mirror is less common, and others may have better/different webservers than the shoddy one used by cvd serve. And some people may not wish to host with a webserver at all.

My recommendation when updating this PR is to make it so the cvd serve feature is optional and is default-off. Perhaps it could be enabled by setting an environment variable before starting the container.

@monotek
Copy link
Author

monotek commented May 25, 2023

On best practice for docker containers is to be immutable.
This means a container should not change it contents while running.
Most environments nowdays also enforcing read only filesystems in containers so download of new virus definitions might fail anyway.
The corrrect way to use new definitions would be to build a new container.

@val-ms
Copy link
Contributor

val-ms commented May 25, 2023

And how often would you make a new container? The whole point of cvdupdate is to only update if an update is needed. If you're making a new container from scratch any time you want the latest definitions, you're misusing cvdupdate and evading the system we created intended to reduce data costs.

@monotek
Copy link
Author

monotek commented May 26, 2023

You could still do it by using a writable volume in a container ;-)

I would still prefer to rebuild my container (at least on a daily basis). This will also update all container dependencies like container image OS, all packages and libs. As every build get it's own version i also always know wich exact version of the container is running in my cluster.

Such builds / pushes to the registry could be done automatically. This is supported by Github actions via: https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#schedule

@val-ms
Copy link
Contributor

val-ms commented May 26, 2023

You could still do it by using a writable volume in a container ;-)

Yup! So that's how #47 does it. 👍

I would still prefer to rebuild my container (at least on a daily basis). This will also update all container dependencies like container image OS, all packages and libs. As every build get it's own version i also always know wich exact version of the container is running in my cluster.

Such builds / pushes to the registry could be done automatically. This is supported by Github actions via: https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#schedule

Rebuilding the image daily certainly makes it so you don't have to worry much about CVE's in the base image. That would be nice. 😄
Although ideally we would only have to push up a new image if there were actually new versions of the dependencies though. Best not to waste Docker Hub's bandwidth if there is no real difference in the image.

@val-ms
Copy link
Contributor

val-ms commented May 26, 2023

Ok @monotek merged #47.

If you're still interested in working on this, now is a good time to update your PR with the volume mount model in mind.

@monotek
Copy link
Author

monotek commented May 30, 2023

Thanks for the hint.
I'll add some changes later this or next week.

At least the user creation in the entrypoint is wrong and should be moved to the dockerfile itself, as user creation will fail anyway on readonly fs...

@ismvru
Copy link
Contributor

ismvru commented May 31, 2023

At least the user creation in the entrypoint is wrong and should be moved to the dockerfile itself

Hello! I agree that this is not correct, I just tried to make the UserID dynamic so that I could adapt to any infrastructure. I would be happy to see the corrected Dockerfile and gain some more experience.

@monotek monotek changed the title add Dockerfile, Dependabot, CI & docker release adjust Dockerfile & entrypoint and add Dependabot, CI & docker release May 31, 2023
@monotek
Copy link
Author

monotek commented May 31, 2023

I've adjusted the container (see updated pr description above).

Can be tested via:

Cron mode:

docker run -it --rm monotek/cvdupdate:0.0.2

Serve mode:

docker run -it --net host --rm monotek/cvdupdate:0.0.2 serve
curl http://localhost:8000

The release actions docker repo has still to be adjusted.
Just tell me where it should be pushed...

@monotek monotek force-pushed the add-dockerfile branch 7 times, most recently from 39bc749 to 0079e4e Compare June 1, 2023 07:53
@monotek
Copy link
Author

monotek commented Jun 22, 2023

@ismvru & @micahsnyder
did you have time to have a look?

@monotek
Copy link
Author

monotek commented Jun 26, 2023

i updated the pr to use ghcr.io, as no extra github secrets are needed for this.

@monotek
Copy link
Author

monotek commented Aug 22, 2023

friendly ping :)

@uudecode
Copy link

uudecode commented Mar 6, 2024

tbh, a little problem exists with file "/root/.cvdupdate/state.json"
it changes every update, and create new docker layer for this changes. Better to move state.json into mounted directory.

Signed-off-by: André Bauer <[email protected]>
@monotek
Copy link
Author

monotek commented Mar 6, 2024

You mean "/cvdupdate/.cvdupdate/state.json"?
Is this not wanted as it contains the "last modified" & "last checked" fields with the timestamps?
If we would remove the file the information about the state of the clamav database in the container would be lost.

@uudecode
Copy link

uudecode commented Mar 6, 2024

You mean "/cvdupdate/.cvdupdate/state.json"? Is this not wanted as it contains the "last modified" & "last checked" fields with the timestamps? If we would remove the file the information about the state of the clamav database in the container would be lost.

Yes, mea culpa, "/cvdupdate/.cvdupdate/state.json" will be mounted from host system, when run in docker, so no problem, sorry.

@val-ms val-ms self-requested a review October 15, 2024 15:29
Copy link
Contributor

@val-ms val-ms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really sorry I neglected this for so long. I don't have any excuse.

I have some concern with things in this PR.

I love adding .github/workflows/ci.yaml. The dependabot change seems good to me and a positive change. If you made this a separate PR I'd be happy to merge that.

I don't wan to add the release drafter stuff. I'm happy enough doing manual release notes. I'll add them to each release going forward.

The change to create the user in the Dockerfile instead of the entrypoint seems like a mild improvement, though we still need sudo (or gosu). I don't know the benefit of switching from gosu to sudo other than reducing the image size(?). I'm not sure if there is any other technical benefit or drawback of dropping gosu.

I see the logic to not use sudo/gosu if USER_ID == 0 is gone. I assume that's ... fine?

The permission bit changes on .gitignore and .dockerignore files are fine.

I have mixed feelings about publishing a container for cvdupdate and using ghcr.io.

There are a couple merge conflicts now (pypi.yaml, README.md). Sorry.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the suggestion. It's neat tech. It shifts some work from release engineering over to grooming our PR titles, descriptions and labels... which I should probably do more of anyways.

But I'm not sure I can add Release Drafter to the org/repo. I'm not keen on trying to do this for a project that we update so infrequently. If we do go for it, then I'd want to add it to all of our projects in a consistent way. I'm not ready to commit to that.

At least for now, please remove this from this PR.

FROM python:3.12.0b1-slim
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am curious to know why you suggested this change. I believe it is better with python:3-slim. Using a fine-grain image tag will be more work for maintenance, or else will leave people stuck on an unmaintained tag.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above, I am not ready to commit to Release Drafter for all our projects, and am not interested in adopting it just for cvdupdate, which we rarely update.

} | crontab -
fi
cron -f
sudo cron -f
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion on using gosu versus sudo. But it's the sort of change that someone (at least the person who added gosu to begin with) will be upset about.

It seems to me that sudo or gosu are necessary in order for cvdupdate to modify any user owned mounted directory. Sadly, we can't just strip them out and run without root privs.

I'm not comfortable with my current insight into how this works and sudo vs gosu in order to bless/merge this.

Can someone smarter than me please give me a breakdown on why I should/shouldn't accept the change to use sudo?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dpendabot.yml Seems good to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our existing CI to build/publish docker images is done from an internal Jenkins and publishes to the Docker Hub registry.

I have mixed feelings about publishing a cvdupdate container, let alone publishing to a different registry.

Then again, this appears to be really simple, and easy to maintain. I have very mixed feelings.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice. It would be very good to have CI.

@val-ms
Copy link
Contributor

val-ms commented Oct 15, 2024

If you're done with helping on this project, or otherwise unable -- I'll part it out to grab the CI and dependabot stuff which I know I want. Let me know.

@monotek
Copy link
Author

monotek commented Oct 25, 2024

updated the branch.
feel free to use only parts of it if you want.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants