
autogenerate operator status #44

Open
nopeslide opened this issue Jul 14, 2020 · 7 comments
Comments

@nopeslide
Collaborator

It would be nice to autogenerate the operator overview

@nopeslide
Collaborator Author

How about we restructure the overview so it can be autogenerated?
Something like this?
❌: not implemented
✔: implemented
blank: no valid input type

domain operator FLOAT UINT8 INT8 UINT16 INT16 INT32 INT64 STRING BOOL FLOAT16 DOUBLE UINT32 UINT64 COMPLEX64 COMPLEX128 BFLOAT16
ai.onnx Abs
ai.onnx Acos
ai.onnx Acosh
ai.onnx Add
ai.onnx And
ai.onnx ArgMax
ai.onnx ArgMin
ai.onnx Asin
ai.onnx Asinh
ai.onnx Atan
ai.onnx Atanh
ai.onnx AveragePool
ai.onnx BatchNormalization
ai.onnx Celu
ai.onnx DynamicQuantizeLinear
ai.onnx GreaterOrEqual
ai.onnx LessOrEqual
ai.onnx MeanSquaredDistance
ai.onnx MeanVarianceNormalization
ai.onnx NegativeLogLikelihoodLoss
ai.onnx Range
ai.onnx SoftmaxCrossEntropyLoss
ai.onnx.training Adagrad
ai.onnx.training Gradient
ai.onnx.training GraphCall
ai.onnx.training Momentum
ai.onnx.ml ArrayFeatureExtractor
ai.onnx.ml Binarizer
ai.onnx.ml CastMap
ai.onnx.ml CategoryMapper
ai.onnx.ml DictVectorizer
ai.onnx.ml FeatureVectorizer
ai.onnx.ml Imputer
ai.onnx.ml LabelEncoder
ai.onnx.ml LinearClassifier
ai.onnx.ml LinearRegressor
ai.onnx.ml Normalizer
ai.onnx.ml OneHotEncoder
ai.onnx.ml SVMClassifier
ai.onnx.ml SVMRegressor
ai.onnx.ml Scaler
ai.onnx.ml TreeEnsembleClassifier
ai.onnx.ml TreeEnsembleRegressor
ai.onnx.ml ZipMap
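Once per-operator, per-type test results exist, a table like the one above could be emitted by a small renderer. A minimal sketch (function name and result layout are hypothetical, not part of any existing tooling):

```python
# results maps (domain, operator) -> {dtype: True/False/None};
# None (or a missing dtype) means "no valid input type" and renders blank.
def render_overview(results, dtypes):
    mark = {True: "\u2714", False: "\u274c", None: ""}
    lines = ["| domain | operator | " + " | ".join(dtypes) + " |"]
    lines.append("|" + "---|" * (len(dtypes) + 2))
    for (domain, op), per_type in sorted(results.items()):
        cells = " | ".join(mark[per_type.get(dt)] for dt in dtypes)
        lines.append(f"| {domain} | {op} | {cells} |")
    return "\n".join(lines)

table = render_overview(
    {("ai.onnx", "Abs"): {"FLOAT": True, "STRING": None},
     ("ai.onnx", "Acos"): {"FLOAT": False, "STRING": None}},
    ["FLOAT", "STRING"],
)
```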

@nopeslide
Collaborator Author

@alrevuelta with our current approach (no weak symbols) we may need to generate all onnx operators for this to work, so #41 is related.

@alrevuelta
Owner

This makes me think of something we have been avoiding from the beginning. We are currently testing the operators using the onnx "test vectors". However, these "test vectors" don't test all data types, but a single one (typically float, as far as I have seen).

So let's say we implement an operator type that the onnx backend is not testing. To me, an operator that is not tested is not implemented; in other words, we should consider an operator implemented only if a set of test cases for that operator is passing.

So first of all I think we should think of a way to get one test vector for each type. As a first idea, we could reuse the onnx testing backend in test/node and, with some Python magic, convert it to generate as many types as we need. All the test vectors are generated with Python here, so we can reuse this.

Secondly, once we have the test cases for each data type, run them and mark the passing ones with ✔.

@alrevuelta
Owner

Any thoughts on this?

As I previously stated, I don't think the default test vectors that onnx provides are sufficient for us. As I already suggested, I think we can reuse them and convert each one to the types that we need. Quick example:

Let's say we want to test the Abs operator. The provided test case inside the node folder tests the float32 type. However, the Abs operator is also defined for the following types: tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16), and all of them are left untested.

Using the magic of what you have already used, we can programmatically access the input types of each operator:

import onnx
all_schemas = [s for s in onnx.defs.get_all_schemas_with_history()]

So, continuing with the Abs operator, we could autogenerate a set of tests from the one that is already provided: using the float32 one, we autogenerate test cases for uint8, uint16, and so on, keeping the same data but changing the type.
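The "same data, different type" step could be sketched like this (helper name hypothetical; a real generator would also need to clamp values that don't fit the target type, e.g. negatives for unsigned ints):

```python
import numpy as np

# Reuse the float32 Abs test vector for every other dtype by keeping the
# values and changing the type. Non-representable values (e.g. -4.0 as
# uint8) would need clamping in a real generator; this sketch skips that.
def typed_test_vectors(float_data, dtypes):
    return {np.dtype(dt).name: float_data.astype(dt) for dt in dtypes}

abs_input = np.array([-4.0, 0.0, 2.5], dtype=np.float32)
vectors = typed_test_vectors(abs_input, [np.int8, np.int32, np.float64])
```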

With something like this, we could say that a given operator is implemented if the corresponding testcase(s) are passing. Some thoughts:

  • a. The example I used is only valid if the operator has only 1 input.
  • b. The example is also valid if the operator has more than 1 input, but all the inputs share the same constraint.
  • c. TBH I don't know how we can handle operators like Constant that have no inputs.
  • d. Also, I don't know how we can handle operators with several inputs and more than one constraint.

I ran some "statistics" on the operators: among all 321 operators/versions, a total of 260 could be easily autogenerated (because they match a. and b. above).
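The check behind that count can be expressed as a small predicate. This is a pure-Python stand-in (the real check would inspect `OpSchema.inputs` from the schemas obtained above; the function name is hypothetical):

```python
# An operator is "easily autogeneratable" when it has at least one input
# and all inputs share a single type constraint (cases a. and b. above).
def easily_autogeneratable(input_constraints):
    """input_constraints: one type-constraint name per input, e.g. ['T', 'T']."""
    return len(input_constraints) >= 1 and len(set(input_constraints)) == 1

# Add: two inputs under one constraint T -> qualifies (case b.)
# Where: a BOOL condition input plus two T inputs -> does not (case d.)
# Constant: no inputs -> does not (case c.)
```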

I'm bringing this up because, as I said, I think the way we can track whether an operator is implemented is by looking at the test cases, and so far our testing strategy lacks some things.

The main decision I think we need to take is to:

  • Try to use the tests that onnx provides (which don't test all types) and build something on top, like I have suggested above. This includes generating other tests using the onnx ones as a reference.
  • Or, on top of the onnx tests, create our own specific ones. Here we can create tests for different types, with different values, and in general have a richer set of test cases. This involves a lot of manual work (which can be backed with some Python to autogenerate the stuff we need). We could follow something like this. We could also extend the <class 'onnx.onnx_cpp2py_export.defs.OpSchema'> class with the test cases that we want.

I would go with option 2, but would like to discuss it with you.

@nopeslide
Collaborator Author

@alrevuelta
I'm also pro testing, but dislike the way onnx does it.
My approach would be:

  • autogenerate model for each operator for each input permutation for each type permutation
  • fill input with "sane" but random floats
    • onnx does the same thing when generating test data
    • convert floats to other datatypes if needed
  • compare output against other onnx implementations like Microsoft's onnxruntime (native onnx, all operators implemented)
    • onnx compares against numpy implementations

This will produce a lot of tests and generate a lot of data without producing a massive number of files.
To achieve operator-specific sane values, I would write a class that generates models for a specific operator schema and subclass this generator for each operator, so we can always enforce specific behaviour if needed.

@alrevuelta
Owner

autogenerate model for each operator for each input permutation for each type permutation

Agree

onnx does the same thing when generating test data

Can you show where this random float generation is done? What I have seen so far is not randomly generated. example

It's nice to autogenerate as much as possible, but I think it is important to keep some "manual" work when writing the test cases, so we can take into account the particularities of each operator or type: not just generate some float values and convert them to other types, but also try to find edge cases.

compare output with other onnx implementations like microsofts onnxruntime (native onnx, all operators implemented)
onnx compares against numpy implementations

We are lucky that onnx is already implemented and working, so there is no need to use numpy. We can just use the onnx runtime to calculate the expected values.

@nopeslide
Collaborator Author

nopeslide commented Jul 17, 2020

Can you show where is this random float generation done? What I have seen so far are not randomly generated. example

transpose does this, for example.
The test case specifies "sane" attributes (in this case all permutations of a hardcoded shape), but it uses random data.
