Decouple biobox implementations and input validation #163
Comments
This is directly related to #131. It also means that providing an independent, reliable distribution channel for the validator binaries or deb package becomes less important, since we can use Docker Hub directly.
I believe using apt can solve this, as when the image is rebuilt the latest
I agree. I think the Dockerfiles have boilerplate code that confuses what
Could you expand this point further?
I think you are suggesting a wrapper script that runs the validation
That's one approach, but we would still need to maintain an apt repository (overhead plus a network requirement). Since each biobox would carry its own copy of the validation program, the versions would drift apart depending on when each container was built. We therefore could not push updates to biobox users without altering or rebuilding individual containers. My suggestion would deliver the latest validation code to every biobox user through our main distribution channel and technology: the Docker registry. It should therefore be more reliable and have fewer dependencies.
If one input needs to be validated, say a read library for assembly, it is guaranteed to be valid once the validator confirms it. It can then be passed to any assembler biobox that accepts this kind of input. If the validation program is integrated into the biobox, each assembly biobox re-checks the same input, which is unnecessary.
No, in fact I mean running an independent validation container before running the actual biobox. This would simplify the biobox implementation by separating the validation and execution logic.
Hi @michaelbarton and @pbelmann,
I have started to build a base Debian image to speed up biobox implementations and reduce the code they need. As a result, I now strongly favor separating input validation from the actual parsing and processing. There are multiple reasons why this would be beneficial:
All of those points could easily be addressed by providing a single container image that validates the input. One could provide one image per schema with deep inspection capabilities (like file format checks) or one general image.
The magic then happens in our bioboxes run wrapper, which would call the validation container before running the actual biobox, if that is the desired behavior. With this design, any biobox can assume that it receives valid input and restrict itself to a simple YAML parser.
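To make the proposed flow concrete, here is a minimal sketch of how such a wrapper could chain the two containers. It is an illustration under assumptions, not the actual bioboxes CLI: the image name `bioboxes/input-validator`, the function names, the `/bbx/input` mount point, and the convention that the validator exits non-zero on invalid input are all hypothetical.

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical image name; this thread does not name a validator image.
VALIDATOR_IMAGE = "bioboxes/input-validator"

# A runner takes a command line and returns its exit status.
Runner = Callable[[Sequence[str]], int]


def docker_run_command(image: str, input_dir: str,
                       args: Sequence[str] = ()) -> List[str]:
    """Build a `docker run` command that mounts input_dir read-only.

    The /bbx/input mount point is an assumption made for this sketch.
    input_dir should be an absolute host path.
    """
    return ["docker", "run", "--rm",
            "--volume", f"{input_dir}:/bbx/input:ro",
            image, *args]


def subprocess_runner(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit status."""
    return subprocess.run(list(cmd)).returncode


def run_biobox(image: str, task: str, input_dir: str,
               validate: bool = True,
               runner: Runner = subprocess_runner) -> int:
    """Optionally validate the input in a separate container, then run the biobox."""
    if validate:
        status = runner(docker_run_command(VALIDATOR_IMAGE, input_dir))
        if status != 0:
            # Assumed convention: non-zero exit means invalid input,
            # so the biobox itself is never started.
            return status
    return runner(docker_run_command(image, input_dir, [task]))
```

Because validation is a separate step, a caller who runs several assembler bioboxes on the same read library could validate once and pass `validate=False` on the remaining runs. Pulling the validator image at run time is also what would deliver validator updates without rebuilding any biobox.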