-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
S3 #781
S3 #781
Conversation
…m to local docker for selenium
…tional tests to use docker-compose
@@ -18,11 +18,11 @@ create_db: start | |||
done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The host network change required a change from run
to exec
, not entirely sure why....
if os.getenv("S3_ENDPOINT"): | ||
return boto3.client('s3', config=config, endpoint_url=os.getenv('S3_ENDPOINT')) | ||
else: | ||
return boto3.client('s3', config=config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use local minio if an env var is set, otherwise use S3 as normal
@@ -226,8 +234,22 @@ def compute_hash(data_to_hash): | |||
return hashlib.sha256(data_to_hash).hexdigest() | |||
|
|||
|
|||
def get_presigned_image_url(s3_path: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pulled code from upload_obj_to_s3
out to it's own function
img_id = random.choice(images).id | ||
return models.Face(officer_id=officer.id, | ||
img_id=img_id, | ||
original_image_id=img_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
found + fixed a bug. We only ever used the one face on the dev data, probably unintentional.
department_id=1) for x in range(5)] + \ | ||
[models.Image( | ||
filepath=utils.get_presigned_image_url('/images/test_cop{}.png'.format(x + 1)), | ||
department_id=2) for x in range(5)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated the /static/...
image path to the presigned url
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sweet, we're no longer hackily relying on two different routes between testing and production!
# shutdown server | ||
driver.get("http://localhost:5000/shutdown") | ||
time.sleep(3) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed the special pytest flask invocation from the browser
fixture (it was starting a new flask with all the defaults, including a default database connection that I couldn't override)
total = Officer.query.filter_by(department_id=dept_id).count() | ||
perpage = BaseConfig.OFFICERS_PER_PAGE | ||
total = list(engine.execute('select count(1) from officers where department_id = 1'))[0][0] | ||
perpage = DevelopmentConfig.OFFICERS_PER_PAGE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
honestly couldn't figure out how to get Officer.query
to use my new engine
pointing to the docker container... if anyone has an idea, lemme know.
/usr/bin/mc config host add minio http://localhost:9000 ${AWS_ACCESS_KEY_ID} ${AWS_SECRET_ACCESS_KEY}; | ||
/usr/bin/mc policy set public minio/${S3_BUCKET_NAME}; | ||
" | ||
network_mode: "host" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this last mc
container sets an access policy on the "s3" minio bucket to public, and then shuts down, like a pre-init container in kube
EXPOSE 3000 | ||
|
||
ENV PATH="/usr/src/app/geckodriver:${PATH}" | ||
ENV SECRET_KEY 4Q6ZaQQdiqtmvZaxP1If | ||
ENV SQLALCHEMY_DATABASE_URI postgresql://openoversight:terriblepassword@postgres/openoversight-dev | ||
ENV SQLALCHEMY_DATABASE_URI postgresql://openoversight:terriblepassword@localhost:5432/openoversight-dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
host network swap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a real good start! Thank you for introducing us to this project, it has some great potential for both local dev and also self-contained production use by people hosting small departments.
I left a few inline comments.
I think CI is failing because we no longer are using the static routes and thus are invoking Boto in the tests / test-data-creation parts. You can add a fallback to the current_app.config['S3_BUCKET'] getenv() call, or specify a default in the development and testing configs and that should at least let the tests get further.
I'd have to yield to @redshiftzero on the test_functional.py changes -- she knows more about testing Flask with Selenium than I do by a large margin -- but otherwise this looks real promising.
OpenOversight/app/utils.py
Outdated
@@ -29,6 +30,13 @@ | |||
SAVED_UMASK = os.umask(0o077) | |||
|
|||
|
|||
def get_s3_connection(config=None): | |||
if os.getenv("S3_ENDPOINT"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would you feel about moving this into app/config.py and setting a constant there, so we are pulling all our env vars from one place? Could plop it right above https://github.com/lucyparsons/OpenOversight/blob/develop/OpenOversight/app/config.py#L31 and then do a if current_app.config['S3_ENDPOINT']:
here (and in the endpoint_url setting on the next line)?
OpenOversight/app/utils.py
Outdated
def upload_obj_to_s3(file_obj, dest_filename): | ||
s3_client = boto3.client('s3') | ||
if os.getenv('S3_CONNECTION'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above re: os.getenv :)
return models.Face(officer_id=officer.id, | ||
img_id=num, | ||
original_image_id=random.choice(images).id) | ||
# for num in range(1, len(images)): # <-- bug here, we always return on the first one, the for-loop is unneeded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# for num in range(1, len(images)): # <-- bug here, we always return on the first one, the for-loop is unneeded |
Good catch!
department_id=1) for x in range(5)] + \ | ||
[models.Image( | ||
filepath=utils.get_presigned_image_url('/images/test_cop{}.png'.format(x + 1)), | ||
department_id=2) for x in range(5)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sweet, we're no longer hackily relying on two different routes between testing and production!
- postgres:/var/lib/postgresql/data | ||
ports: | ||
- "5432:5432" | ||
network_mode: "host" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to change all the containers to network_mode: "host"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we refer to the other docker services from both the frontend (unsigned url image lookups in HTML) and the backend (image uploading) - can't hit http://minio:1234
from the browser, and it's not exposed in a format the browser can hit otherwise. The alternative that I was considering was making a proxy route from the backend - but I think that'd introduce more "are we in testing?" conditionals in the codebase that I was trying to avoid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. I see you already mentioned that in the PR summary and I missed it, sorry.
So basically we only save one string for the URL of an image, but the backend would need a different host to connect to minio than the front-end. That makes sense!
The docker-documentation says "The host networking driver only works on Linux hosts, and is not supported on Docker Desktop for Mac, Docker Desktop for Windows, or Docker EE for Windows Server." Would this mean that with those changes, the dev-environment is only functional on Linux? In that case, I think we should definitely explore other options
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! I never knew that! That sucks! I code on linux. I'll look into the proxy endpoint if that's okay? The image can be http://localhost/proxy/file123.png
or something? and the backend can fetch it from S3/minio?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! I never knew that! That sucks! I code on linux.
I am also on Linux, so I cannot confirm that it doesn't work, but that's what I read from the documentation.
I'll look into the proxy endpoint if that's okay? The image can be
http://localhost/proxy/file123.png
or something? and the backend can fetch it from S3/minio?
Sounds good to me. Thank you for investigating this!
Can't see the @abandoned-prototype comment re: replicating the tests that failed in CI/CD, but I'll try to wipe my env clean to see if I can get the tests to fail locally. I was a bit stuck and put this down for a while due to it. I'll also clear out the merge conflicts in the next few days. |
Closing until proxy endpoint and passing tests |
Sorry, I added that to the issue instead: #171 (comment) So the first thing that happened locally was that the minio image wouldn't start because of missing Now, if I want to upload an image it throws an error with the following stacktrace:
I can access localhost:9000 from my browser. And even from within the web-container I can access
|
Status
Ready for review
Description of Changes
Adds an S3-compatible API container using Minio
Fixes #171 .
Changes proposed in this pull request:
test_functional.py
to access site via docker-compose rather than a pytest flask, uses docker-compose database rather than a pytest database.Notes for Deployment
Shouldn't affect prod. Should be QA'd in staging to ensure that the normal S3 upload and viewing is unaffected.
Screenshots (if appropriate)
Tests and linting
I have rebased my changes on current
develop
pytests pass in the development environment on my local machine
flake8
checks pass