Sanitize input on Python benchmarks #118
Hello @mcopik! I wanted to check in regarding the design of the sanitization. I believe I have already handled the main API case (i.e., empty or undecodable JSON data), and I am looking to continue adding sanitization to the benchmark data keys in a granular manner. As an example, I used:

```python
sleep_time = event.get("sleep", False)
if not sleep_time:
    return 'Error: No key "sleep" found on input data.'
elif not isinstance(sleep_time, (int, float)):
    return f'Error: Key "sleep" expected int or float, not {type(sleep_time)}.'
```

I wanted to share my design here before I continue adding sanitization checks to the remaining benchmark variables, just to make sure I am on the right path, so please feel free to let me know if I need to redesign this or introduce any changes! Here is my design draft:
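For the main-API case mentioned in this comment (empty or undecodable JSON data), a minimal sketch of one possible check, assuming the wrapper receives the raw request body as a string; the function name `parse_event` and the `(event, error)` return convention are illustrative and not the commenter's actual draft:

```python
import json


def parse_event(raw_body):
    """Return (event, error): a decoded dict, or None plus an error string.

    Illustrative helper, not code from the repository.
    """
    # Reject an empty payload before attempting to decode it.
    if not raw_body:
        return None, "Error: No input data provided."
    try:
        event = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        return None, f"Error: Input data is not valid JSON ({exc})."
    if not isinstance(event, dict):
        return None, "Error: Input data must be a JSON object."
    return event, None
```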
@nbcl Thank you so much for your interest in the project! Much appreciated. I think the overall approach is correct; the only issue is that functions should return a dictionary. The dictionary is later included by the invoker to form an overall response (look at the final return value of each benchmark). While thinking about your proposal, I realized that we do not have any way of signaling errors. I propose that the function should now return the following dictionary:

```
{
  "status": "success" OR "failure",
  "result": ...,
  "measurements": ...
}
```

When returning with an error, you should pass the string explaining the problem in the "result" field. In addition to your PR, we will have to modify the handling of invocations to detect the failure. The final return should now include
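As a sketch of how a benchmark could adopt this proposed format, here is the sleep example from earlier in the thread rewritten to return such a dictionary; the exact contents of the "result" and "measurements" fields on success are assumptions for illustration:

```python
import time


def handler(event):
    sleep_time = event.get("sleep", False)
    if not sleep_time:
        # Report the problem through the dictionary instead of a bare string.
        return {
            "status": "failure",
            "result": 'Error: No key "sleep" found on input data.',
            "measurements": {},
        }
    if not isinstance(sleep_time, (int, float)):
        return {
            "status": "failure",
            "result": f'Error: Key "sleep" expected int or float, not {type(sleep_time)}.',
            "measurements": {},
        }

    begin = time.time()
    time.sleep(sleep_time)
    end = time.time()
    return {
        "status": "success",
        "result": sleep_time,
        "measurements": {"sleep_time": sleep_time, "duration": end - begin},
    }
```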
@nbcl Do you plan to submit a PR for this, or can someone else work on this feature?
Hi @mcopik! It seems there has been no activity on this issue for a very long time. I would like to work on it. Could you please guide me through it? Thank you!
And, what should "measurements" contain?
@niranjank2022 Absolutely, feel free to work on this! All contributions are appreciated. In the new output, we should have a boolean

```
{
  "status": "success" OR "failure",
  "result": "string with error",
  "measurements": {
    # current JSON output
  }
}
```
Hi @mcopik! In some of the files, like function.py, do I have to relocate the values in 'result' to 'measurements'?
@niranjank2022 Correct!
Hello @mcopik
The key
@niranjank2022 EDIT: wrong, see the comment below.
Then I need to work only on the handler.py files inside wrapper/aws and wrapper/azure, right?
@niranjank2022 I'm sorry for the confusion - I misread your question and my previous answer was wrong.

Handlers defined for each platform and language execute the benchmark, make measurements, and return the data back to the user. You can see it here - right now, we only extract the

Now, the first step would be to modify the benchmarks themselves such that they return the result, the status, and a potential message explaining the error - this needs to incorporate the input verification of benchmarks. Here I was wrong last time - we do need to change the benchmarks for that :) You only verify one instance of function code per benchmark.

Then, you need to modify

Finally, on the client side, you would need to modify the current results parsing, like here - we store the entire output and later access subfields. Instead, we should properly retrieve the new information to easily verify that the function executed correctly :)

I apologise for the confusion - please let me know if you have further questions.
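A rough sketch of the handler-side change described above, modeled loosely on an AWS-style wrapper; the `from function import function` import, the response fields, and the exact layout are assumptions for illustration, and the real wrappers differ per platform:

```python
import datetime
import json

from function import function  # assumed layout: the benchmark code lives in function.py


def handler(event, context):
    begin = datetime.datetime.now()
    ret = function.handler(event)
    end = datetime.datetime.now()

    # The benchmark now reports its own status; propagate failures explicitly
    # instead of assuming every invocation succeeded.
    if ret.get("status") == "failure":
        return {
            "statusCode": 500,
            "body": json.dumps({
                "status": "failure",
                "result": ret.get("result"),
                "begin": begin.timestamp(),
                "end": end.timestamp(),
                "request_id": context.aws_request_id,
            }),
        }

    return {
        "statusCode": 200,
        "body": json.dumps({
            "status": "success",
            "result": ret.get("result"),
            "measurements": ret.get("measurements"),
            "begin": begin.timestamp(),
            "end": end.timestamp(),
            "request_id": context.aws_request_id,
        }),
    }
```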
@mcopik
@niranjank2022 In every benchmark that reads values from the input JSON, such as bucket access parameters, we should check that these exist. You can do it manually or (better) validate with a JSON schema :)
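As a hedged sketch of the JSON-schema route, using the third-party jsonschema package; the schema below is only an example for a hypothetical benchmark that expects bucket and object keys, and the actual required fields differ per benchmark:

```python
from jsonschema import ValidationError, validate

# Illustrative schema for a benchmark that reads an object from storage.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "bucket": {
            "type": "object",
            "properties": {
                "bucket": {"type": "string"},
                "input": {"type": "string"},
            },
            "required": ["bucket", "input"],
        },
        "object": {
            "type": "object",
            "properties": {"key": {"type": "string"}},
            "required": ["key"],
        },
    },
    "required": ["bucket", "object"],
}


def handler(event):
    try:
        validate(instance=event, schema=INPUT_SCHEMA)
    except ValidationError as exc:
        return {"status": "failure", "result": f"Invalid input: {exc.message}", "measurements": {}}
    # ... run the actual benchmark logic here ...
```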
Hello @mcopik! I have submitted my PR for this issue! Kindly review it and share your comments so I can improve it.
@niranjank2022 Thank you for your work! I will try to review the PR soon :)
Is this issue available? @mcopik
@octonawish-akcodes There is a pending PR with unresolved review comments #173 - you can work on that and see if you can improve it :) The main remaining issues were the unresolved review comments and the need to adjust our triggers to properly catch failure/success (I left a comment on that).
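On the trigger/client side mentioned here, a minimal sketch of how the new status field could be used when parsing an invocation's output; the function name and error handling are illustrative, not the project's actual API:

```python
def parse_invocation_output(output: dict) -> dict:
    # With the new format, a failed invocation is detected explicitly instead of
    # surfacing later as a missing field or a None value.
    if output.get("status") != "success":
        raise RuntimeError(f"Benchmark invocation failed: {output.get('result')}")
    return output.get("measurements", {})
```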
All Python benchmarks require JSON as input. However, we never verify that the provided input matches expectations. When users invoke benchmarks incorrectly, they should receive a helpful notification. Right now, they usually receive a Python exception where a variable loaded from JSON ends up being None. All benchmarks should verify that the input values are provided in the JSON and are of the correct type, and return a clear error message upon failure.
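As a general pattern for these per-benchmark checks, a small helper could verify presence and type for each expected key; the helper name and interface below are purely illustrative:

```python
def validate_input(event, expected):
    """Check that every expected key exists and has an allowed type.

    `expected` maps key names to a type or tuple of types,
    e.g. {"sleep": (int, float), "username": str}.
    Returns None on success, or an error string describing the first problem.
    """
    for key, allowed_types in expected.items():
        if key not in event:
            return f'Error: No key "{key}" found on input data.'
        if not isinstance(event[key], allowed_types):
            return f'Error: Key "{key}" expected {allowed_types}, not {type(event[key])}.'
    return None
```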