
autogenerate operator status #44

Open
nopeslide opened this issue Jul 14, 2020 · 7 comments
Comments

@nopeslide
Collaborator

It would be nice to autogenerate the operator overview

@nopeslide
Collaborator Author

How about we restructure the overview so it can be autogenerated?
Something like this?
❌: not implemented
✔: implemented
blank: no valid input type

domain operator FLOAT UINT8 INT8 UINT16 INT16 INT32 INT64 STRING BOOL FLOAT16 DOUBLE UINT32 UINT64 COMPLEX64 COMPLEX128 BFLOAT16
ai.onnx Abs
ai.onnx Acos
ai.onnx Acosh
ai.onnx Add
ai.onnx And
ai.onnx ArgMax
ai.onnx ArgMin
ai.onnx Asin
ai.onnx Asinh
ai.onnx Atan
ai.onnx Atanh
ai.onnx AveragePool
ai.onnx BatchNormalization
ai.onnx Celu
ai.onnx DynamicQuantizeLinear
ai.onnx GreaterOrEqual
ai.onnx LessOrEqual
ai.onnx MeanSquaredDistance
ai.onnx MeanVarianceNormalization
ai.onnx NegativeLogLikelihoodLoss
ai.onnx Range
ai.onnx SoftmaxCrossEntropyLoss
ai.onnx.training Adagrad
ai.onnx.training Gradient
ai.onnx.training GraphCall
ai.onnx.training Momentum
ai.onnx.ml ArrayFeatureExtractor
ai.onnx.ml Binarizer
ai.onnx.ml CastMap
ai.onnx.ml CategoryMapper
ai.onnx.ml DictVectorizer
ai.onnx.ml FeatureVectorizer
ai.onnx.ml Imputer
ai.onnx.ml LabelEncoder
ai.onnx.ml LinearClassifier
ai.onnx.ml LinearRegressor
ai.onnx.ml Normalizer
ai.onnx.ml OneHotEncoder
ai.onnx.ml SVMClassifier
ai.onnx.ml SVMRegressor
ai.onnx.ml Scaler
ai.onnx.ml TreeEnsembleClassifier
ai.onnx.ml TreeEnsembleRegressor
ai.onnx.ml ZipMap
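Once per-operator, per-type test results exist, a table like the one above could be emitted by a small renderer. A minimal sketch (function name and result layout are hypothetical, not part of any existing tooling):

```python
# results maps (domain, operator) -> {dtype: True/False/None};
# None (or a missing dtype) means "no valid input type" and renders blank.
def render_overview(results, dtypes):
    mark = {True: "\u2714", False: "\u274c", None: ""}
    lines = ["| domain | operator | " + " | ".join(dtypes) + " |"]
    lines.append("|" + "---|" * (len(dtypes) + 2))
    for (domain, op), per_type in sorted(results.items()):
        cells = " | ".join(mark[per_type.get(dt)] for dt in dtypes)
        lines.append(f"| {domain} | {op} | {cells} |")
    return "\n".join(lines)

table = render_overview(
    {("ai.onnx", "Abs"): {"FLOAT": True, "STRING": None},
     ("ai.onnx", "Acos"): {"FLOAT": False, "STRING": None}},
    ["FLOAT", "STRING"],
)
```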

@nopeslide
Collaborator Author

@alrevuelta with our current approach (no weak symbols) we may need to generate all onnx operators for this to work, so #41 is related.

@alrevuelta
Owner

This makes me think of something we have been avoiding from the beginning. We are currently testing the operators using the onnx "test vectors". However, these "test vectors" don't test all data types, but a single one (typically float, as far as I have seen).

So let's say we implement an operator type that the onnx backend is not testing. To me, an operator that is not tested is not implemented; in other words, we should consider an operator implemented only if a set of test cases for that operator is passing.

So first of all I think we should think of a way to get one test vector for each type. As a first idea, we could reuse the onnx testing backend in test/node and, with some Python magic, convert it to generate as many types as we need. All the test vectors are generated with Python here, so we can reuse this.

Secondly, once we have the test cases for each data type, run them and mark the passing ones with ✔.

@alrevuelta
Owner

Any thoughts on this?

As I previously stated, I don't think the default test vectors that onnx provides are sufficient for us. As I already suggested, I think we can reuse them and convert each one to the types that we need. Quick example:

Let's say we want to test the Abs operator. The provided test case inside the node folder tests the float32 type. However, the Abs operator is also defined for the following types: tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16), and all of them are left untested.

Using the magic of what you have already used, we can programmatically access the input types of each operator:

import onnx
all_schemas = [s for s in onnx.defs.get_all_schemas_with_history()]

So, continuing with the Abs operator, we could autogenerate a set of tests from the one that is already provided: using the float32 one, we autogenerate test cases for uint8, uint16, and so on, keeping the same data but changing the type.
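The "same data, different type" step could be sketched like this (helper name hypothetical; a real generator would also need to clamp values that don't fit the target type, e.g. negatives for unsigned ints):

```python
import numpy as np

# Reuse the float32 Abs test vector for every other dtype by keeping the
# values and changing the type. Non-representable values (e.g. -4.0 as
# uint8) would need clamping in a real generator; this sketch skips that.
def typed_test_vectors(float_data, dtypes):
    return {np.dtype(dt).name: float_data.astype(dt) for dt in dtypes}

abs_input = np.array([-4.0, 0.0, 2.5], dtype=np.float32)
vectors = typed_test_vectors(abs_input, [np.int8, np.int32, np.float64])
```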

With something like this, we could say that a given operator is implemented if the corresponding testcase(s) are passing. Some thoughts:

  • a. The example I used is only valid if the operator has only 1 input.
  • b. The example is also valid if the operator has more than 1 input, but all the inputs share the same constraint.
  • c. TBH I don't know how we can handle operators like Constant that have no inputs.
  • d. Also, I don't know how we can handle operators with several inputs and more than one constraint.

I ran some "statistics" on the operators: among all 321 operators/versions, a total of 260 could be easily autogenerated (because they match a. and b. above).
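The check behind that count can be expressed as a small predicate. This is a pure-Python stand-in (the real check would inspect `OpSchema.inputs` from the schemas obtained above; the function name is hypothetical):

```python
# An operator is "easily autogeneratable" when it has at least one input
# and all inputs share a single type constraint (cases a. and b. above).
def easily_autogeneratable(input_constraints):
    """input_constraints: one type-constraint name per input, e.g. ['T', 'T']."""
    return len(input_constraints) >= 1 and len(set(input_constraints)) == 1

# Add: two inputs under one constraint T -> qualifies (case b.)
# Where: a BOOL condition input plus two T inputs -> does not (case d.)
# Constant: no inputs -> does not (case c.)
```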

I'm bringing this up because, as I said, I think the way we can track whether an operator is implemented is by looking at the test cases, and so far our testing strategy lacks some things.

The main decision I think we need to take is to:

  • Try to use the tests that onnx provides (which don't test all types) and build something on top, like I have suggested above. This includes generating other tests using the onnx ones as a reference.
  • Or, on top of the onnx tests, create our own specific ones. Here we can create tests for different types, with different values, and in general have a richer set of test cases. This involves a lot of manual work (which can be backed with some Python to autogenerate the stuff we need). We could follow something like this. We could also extend the <class 'onnx.onnx_cpp2py_export.defs.OpSchema'> class with the test cases that we want.

I would go with option 2, but would like to discuss it with you.

@nopeslide
Collaborator Author

@alrevuelta
I'm also pro testing, but dislike the way onnx does it.
My approach would be:

  • autogenerate model for each operator for each input permutation for each type permutation
  • fill input with "sane" but random floats
    • onnx does the same thing when generating test data
    • convert floats to other datatypes if needed
  • compare output against other onnx implementations like Microsoft's onnxruntime (native onnx, all operators implemented)
    • onnx compares against numpy implementations

This will produce a lot of tests and generate a lot of data without producing a massive number of files.
To achieve operator-specific sane values, I would write a class that generates models for a specific operator schema and subclass this generator for each operator, so we can always enforce specific behaviour if needed.

@alrevuelta
Owner

autogenerate model for each operator for each input permutation for each type permutation

Agree

onnx does the same thing when generating test data

Can you show where this random float generation is done? What I have seen so far is not randomly generated. example

It's nice to autogenerate as much as possible, but I think it is important to keep some "manual" work when writing the test cases, so we can take into account the particularities of each operator or type: not just generate some float values and convert them to other types, but also try to find edge cases.

compare output with other onnx implementations like microsofts onnxruntime (native onnx, all operators implemented)
onnx compares against numpy implementations

We are lucky that onnx is already implemented and working, so there is no need to use numpy. We can just use the onnx runtime to calculate the expected values.

@nopeslide
Collaborator Author

nopeslide commented Jul 17, 2020

Can you show where is this random float generation done? What I have seen so far are not randomly generated. example

transpose does this, for example.
The test case specifies "sane" attributes (in this case all permutations of a hardcoded shape), but it uses random data.
