Sanitize input on Python benchmarks #118

Open
mcopik opened this issue Feb 20, 2023 · 18 comments
Labels: bug (Something isn't working), good first issue (Good for newcomers)

Comments

@mcopik
Collaborator

mcopik commented Feb 20, 2023

All Python benchmarks require JSON as input. However, we never verify that the provided input matches expectations. When users invoke benchmarks incorrectly, they should receive a helpful notification. Right now, they usually receive a Python exception where a variable loaded from JSON ends up being None.

All benchmarks should verify that the input values are provided in the JSON and are of the correct type, and return a clear error message upon failure.

@nbcl

nbcl commented Feb 23, 2023

Hello @mcopik!

I wanted to check in regarding the design of the sanitization. I believe I have already handled the main API cases (i.e., empty or undecodable JSON data), and I am looking to continue adding sanitization for benchmark data keys in a granular manner.

As an example, I used 010.sleep to sketch an approach for sanitizing absent keys and incorrect value types.

sleep_time = event.get("sleep")  # dict.get takes a positional default, not a `default=` keyword
if sleep_time is None:
    return 'Error: No key "sleep" found on input data.'
elif not isinstance(sleep_time, (int, float)):
    return f'Error: Key "sleep" expected int or float, not {type(sleep_time)}.'

I wanted to share my design here before continuing to add sanitization checks to the remaining benchmark variables, just to make sure I am on the right path, so please feel free to let me know if I need to redesign this or introduce any changes!

Here is my design draft: nbcl/serverless-benchmarks/1

@mcopik
Collaborator Author

mcopik commented Feb 24, 2023

@nbcl Thank you so much for your interest in the project! Much appreciated.

I think the overall approach is correct; the only issue is that functions should return a dictionary. The dictionary is later included by the invoker to form an overall response (look at the final return value of each benchmark).

While thinking about your proposal, I realized that we do not have any way of signaling errors. I propose that the function should now return the following dictionary:

{
  "status": "success" OR "failure",
  "result": ...,
  "measurements": ...
}

When returning with an error, you should pass a string explaining the problem as the "result" field. In addition to your PR, we will have to modify the handling of invocations to detect the failure. The final return should now include "status": "success" in the output dictionary :)
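
For illustration, here is a minimal sketch (not taken from the repository; the handler below mimics a 010.sleep-style benchmark) of how a benchmark could validate its input and return the proposed dictionary:

import time

def handler(event):
    # Validate the input up front instead of letting a missing key surface
    # later as an unrelated Python exception on a None value.
    sleep_time = event.get('sleep')
    if sleep_time is None:
        return {
            'status': 'failure',
            'result': 'Error: no key "sleep" found in the input data.',
            'measurements': {}
        }
    if not isinstance(sleep_time, (int, float)):
        return {
            'status': 'failure',
            'result': f'Error: key "sleep" expected int or float, got {type(sleep_time).__name__}.',
            'measurements': {}
        }

    begin = time.time()
    time.sleep(sleep_time)
    end = time.time()

    return {
        'status': 'success',
        'result': f'slept for {sleep_time} seconds',
        'measurements': {'sleep_time': end - begin}
    }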

@mcopik
Collaborator Author

mcopik commented Apr 27, 2023

@nbcl Do you plan to submit a PR for this, or can someone else work on this feature?

@niranjank2022

niranjank2022 commented Jun 23, 2023

Hi @mcopik! It seems there has been no activity on this issue for a very long time. I would like to work on it. Could you please guide me through it? Thank you!

@niranjank2022

niranjank2022 commented Jun 23, 2023

(quoting the proposed output format from mcopik's Feb 24 comment above)

And what should "measurements" contain?

@mcopik
Collaborator Author

mcopik commented Jun 25, 2023

@niranjank2022 Absolutely, feel free to work on this! All contributions are appreciated.

In the new output, we should have a boolean status, a result, which should be a string explaining the error (if one occurs), and measurements, which should be the current output we return, i.e., the output of the benchmark & function:

{
  "status": "success" OR "failure",
  "result": "string with error"
  "measurements": {
    # current JSON output
  }
}

@niranjank2022

Hi @mcopik! In some of the files, like function.py in 210.thumbnailer, the return value is:

return {
    'result': {'bucket': output_bucket, 'key': key_name},
    'measurement': {
        'download_time': download_time,
        'download_size': len(img),
        'upload_time': upload_time,
        'upload_size': resized_size,
        'compute_time': process_time
    }
}

Do I have to relocate the values in 'result' to 'measurement'?

@mcopik
Collaborator Author

mcopik commented Jun 28, 2023

@niranjank2022 Correct!
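
A sketch of what that relocation could look like for the snippet quoted above (illustrative only; the error string and whether the field ends up named 'measurement' or 'measurements' should follow the convention discussed earlier):

return {
    'status': 'success',
    'result': '',  # reserved for an error message on failure
    'measurement': {
        'bucket': output_bucket,
        'key': key_name,
        'download_time': download_time,
        'download_size': len(img),
        'upload_time': upload_time,
        'upload_size': resized_size,
        'compute_time': process_time
    }
}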

@niranjank2022

niranjank2022 commented Jul 7, 2023

Hello @mcopik
In this file,

import datetime
import igraph

def handler(event):

    size = event.get('size')

    graph_generating_begin = datetime.datetime.now()
    graph = igraph.Graph.Barabasi(size, 10)
    graph_generating_end = datetime.datetime.now()

    process_begin = datetime.datetime.now()
    result = graph.pagerank()
    process_end = datetime.datetime.now()

    graph_generating_time = (graph_generating_end - graph_generating_begin) / datetime.timedelta(microseconds=1)
    process_time = (process_end - process_begin) / datetime.timedelta(microseconds=1)

    return {
            'status': 'success',
            'result': result[0],
            'measurement': {
                'graph_generating_time': graph_generating_time,
                'compute_time': process_time
            }
    }

The key result holds the value result[0]. Since result is now meant to describe the outcome of the invocation (e.g., an error message), what should the new key name for result[0] be?

@mcopik
Collaborator Author

mcopik commented Jul 7, 2023

@niranjank2022 Benchmarks should not be touched at all - you should only have to modify handler functions in benchmark/wrappers as they define the final output passed from our executor to users.

EDIT: wrong, see the comment below.

@niranjank2022

Then, I need to work on only the handler.py files inside wrapper/aws and wrapper/azure, right?

@mcopik
Collaborator Author

mcopik commented Jul 15, 2023

@niranjank2022 I'm sorry for the confusion - I misread your question and my previous answer was wrong.

Handlers defined for each platform and language execute the benchmark, make measurements, and return the data back to the user. You can see it here - right now, we only extract the result field.

Now, the first step would be to modify the benchmarks themselves such that they return the result, the status, and a potential message explaining the error - this needs to incorporate the input verification of benchmarks. Here I was wrong last time - we do need to change benchmarks for that :) You only need to verify one instance of function code per benchmark.

Then, you need to modify handler.py for Python and handler.js for Node.js, each version for each platform. This is to ensure that this data is not lost but is instead passed to the user in the final output, like in here.

Finally, on the client side, you would need to modify the current results parsing, like here - we store the entire output and later access subfields. Instead, we should properly retrieve the new information to easily verify that the function executed correctly :)

I apologise for the confusion - please let me know if you have further questions.
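
A very rough sketch of the wrapper-side change for step two (hypothetical code, not the actual benchmarks/wrappers sources; the import path and field names are placeholders):

# Hypothetical wrapper handler: forward the benchmark's status and error
# message instead of extracting only the result field.
from function import function  # placeholder import of the benchmark module

def handler(event):
    ret = function.handler(event)
    return {
        # previously only ret['result'] would be extracted here
        'status': ret.get('status', 'failure'),
        'result': ret.get('result', ''),
        'measurements': ret.get('measurements', {})
    }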

@niranjank2022

@mcopik
Values like 'bucket' and 'object' are accessed in multiple files. Do we have to check whether they exist and whether their type is correct in every file? Or should we leave them and check only the others?

@mcopik
Collaborator Author

mcopik commented Jul 24, 2023

@niranjank2022 In every benchmark that reads bucket information from the JSON input, we should check that these values exist.

You can do it manually or (better) validate with JSON schema :)

https://python-jsonschema.readthedocs.io/en/stable/
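
A sketch of the jsonschema approach (the schema below is illustrative, not the project's actual input format):

from jsonschema import validate, ValidationError

# Illustrative schema for a benchmark that expects bucket information.
SCHEMA = {
    'type': 'object',
    'properties': {
        'bucket': {
            'type': 'object',
            'properties': {
                'bucket': {'type': 'string'},
                'input': {'type': 'string'},
                'output': {'type': 'string'}
            },
            'required': ['bucket', 'input', 'output']
        }
    },
    'required': ['bucket']
}

def handler(event):
    try:
        validate(instance=event, schema=SCHEMA)
    except ValidationError as err:
        return {
            'status': 'failure',
            'result': f'Invalid input: {err.message}',
            'measurements': {}
        }
    # ... continue with the actual benchmark logic ...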

@niranjank2022

Hello @mcopik! I have submitted my PR for this issue! Kindly review it and tell me your comments so I can improve it.

@mcopik
Collaborator Author

mcopik commented Aug 12, 2023

@niranjank2022 Thank you for your work! I will try to review the PR soon :)

@octonawish-akcodes
Collaborator

Is this issue available? @mcopik

@mcopik
Copy link
Collaborator Author

mcopik commented Mar 22, 2024

@octonawish-akcodes There is a pending PR with unresolved review comments #173 - you can work on that and see if you can improve it :) The main remaining issues were the unaddressed review comments and the need to adjust our triggers to properly catch failure/success (I left a comment on that).
