Problem with generated resolvers (2/2) #25
@alrevuelta
@nopeslide Right, this is a specific case because of that constraint. On the other hand, maybe we could resolve ONLY according to the input type and handle the output type inside each function. I'm suggesting this because operators like EyeLike have many combinations (with the current implementation this would be 12 switch cases with 12 cases each = 144 combinations). I don't think we really need 144 functions. We can have just 12 (one per input constraint) and handle the output inside each function with a simple switch case. This would also make the resolving process easier, because knowing the output type of a tensor in the resolving phase is not trivial. I mean, when we resolve the operators/functions, we don't really know what the output type is going to be (i.e. we don't check the value of the attributes).
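For illustration, a minimal sketch of this idea; every name below is a stand-in rather than the project's actual API. One function exists per input constraint, and the output type (which may only be known at runtime, e.g. from an attribute) is handled by a switch inside the function:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in declarations, not the actual codebase's API. */
typedef enum { DT_FLOAT, DT_DOUBLE, DT_INT32 /* ... */ } data_type_t;
typedef struct {
    data_type_t data_type;
    size_t      n;
    float      *float_data;
    double     *double_data;
    int32_t    *int32_data;
} tensor;
typedef struct { tensor **inputs; tensor **outputs; } node_context;
typedef enum { OP_OK, OP_UNSUPPORTED_TYPE } operator_status;

/* One function per input constraint (here: float input); the output
 * constraint is resolved inside with a switch, so the resolver never
 * needs to know the output type. */
operator_status operator_x__T1_float(node_context *ctx) {
    tensor *out = ctx->outputs[0];
    switch (out->data_type) {
    case DT_FLOAT:  /* fill out->float_data  */ break;
    case DT_DOUBLE: /* fill out->double_data */ break;
    case DT_INT32:  /* fill out->int32_data  */ break;
    default:        return OP_UNSUPPORTED_TYPE;
    }
    return OP_OK;
}
```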
Generate all MNIST operators with the Python script and start integrating them in the existing code. So far only float type is supported.

- Python-generated code is now being used.
- MNIST and tinyYOLO models are working.
- The opset and function resolver is not hardcoded anymore.

TODOs for next patches (see #24 #25):

- Remove unused code.
- Remove all switch cases in the operators (no longer needed).
- There is a minor issue when resolving the operator functions: when a function is not defined (i.e. for double) there is an error because that symbol can't be found. This should be linked with the default operator stub (using weakrefs? see the sketch after this list).
- There is a workaround with the operator set: opset=12 is hardcoded for the initial tests.
- Manually modify the maxpool resolver. The Python generator for resolvers has to be rethought; it doesn't make sense for some operators. This modification is a temporary fix until we figure out how to handle this.
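A minimal sketch of the weak-symbol idea behind that "weakrefs?" TODO, assuming GCC/Clang's `__attribute__((weak))`; the type and function names are stand-ins, not the project's actual API:

```c
/* Weak default stub: the linker uses this definition only when no
 * strong (generated) operator_add__T1_double exists, so a missing
 * type permutation no longer causes an "undefined symbol" error. */
typedef struct node_context node_context;
typedef enum { OP_OK, OP_NOT_IMPLEMENTED } operator_status;

__attribute__((weak))
operator_status operator_add__T1_double(node_context *ctx) {
    (void)ctx;                 /* the stub ignores its context */
    return OP_NOT_IMPLEMENTED;
}
```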
I'm implementing a new model that contains the operator Constant, which looks like an interesting case. This operator doesn't have any input, just one attribute and an output. What indicates the output type is the attribute that is set, so the resolving should be done according to the attributes and not to the input type. Just FYI.
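For illustration, a hedged sketch of attribute-based resolving for a Constant-like operator; all names below are stand-ins, not the actual codebase's API:

```c
#include <stddef.h>

/* Stand-in declarations, not the actual codebase's API. */
typedef enum { DT_FLOAT, DT_INT64 /* ... */ } data_type_t;
typedef struct { data_type_t data_type; /* ... */ } tensor_value;
typedef struct { tensor_value t; } attribute;
typedef struct node_context node_context;
typedef int (*operator_executer)(node_context *);

extern attribute *find_attribute(node_context *ctx, const char *name);
extern int execute_constant_float(node_context *ctx);
extern int execute_constant_int64(node_context *ctx);

/* With no inputs to inspect, dispatch on the data_type carried by the
 * `value` attribute: the attribute that is set decides the output. */
operator_executer resolve_operator__onnx__constant(node_context *ctx) {
    attribute *value = find_attribute(ctx, "value");
    switch (value->t.data_type) {
    case DT_FLOAT: return execute_constant_float;
    case DT_INT64: return execute_constant_int64;
    /* ... remaining types ... */
    default:       return NULL;   /* leave unresolved */
    }
}
```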
@alrevuelta
@nopeslide so you are suggesting that instead of resolving directly to the final function (operator + version + float/double...) we resolve only operator + version? Something related to this that just came up is the following. I'm especially concerned about this line.
The only alternative I see would be a function per operator that returns the types, but imho that would be more complicated than doing it on demand.
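For illustration, a minimal sketch of that alternative (all names hypothetical): each operator would expose its supported type tuples, and the resolver would query them on demand.

```c
#include <stddef.h>

/* Hypothetical "function per operator that returns the types":
 * one string per supported (T1, T2, T3) combination. */
typedef struct {
    size_t       n_tuples;
    const char **tuples;
} operator_type_info;

operator_type_info operator_add_types(void) {
    static const char *tuples[] = {
        "tensor(int32),tensor(int32),tensor(int32)",
        "tensor(int32),tensor(float),tensor(float)",
        /* ... */
    };
    return (operator_type_info){ 2, tuples };
}
```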
Just found another interesting case.
This creates a huge number of permutations. I would suggest generating fewer permutations: instead of permutating T1, T2, T3, we could permutate only the two constraints that have more values (i.e. T2 and T3), or even just T3, and then handle all the remaining cases inside each function with a switch. Having that many combinations also makes it difficult to generate the operator status (#44). There are some other operators where we have the same issue.
Maybe we need another approach? `arith.h` would contain the typed helpers:

```c
inline int add_int_int(int a, int b) { return a + b; }
inline float add_int_float(int a, float b) { return ((float)a) + b; }
```

and each operator would be written once as a template against type macros:

```c
#define ADD(TA,A,TB,B) _ADD(TA,A,TB,B)
#define _ADD(TA,A,TB,B) add_##TA##_##TB(A,B)
#define DATA(T,TENSOR) _DATA(T,TENSOR)
#define _DATA(T,TENSOR) TENSOR.T##_data

#include "arith.h"

operator_status operator_add( node_context *ctx) {
    ...
    for(int i = 0; i < length; i++) {
        DATA(T3,output)[i] = ADD( T1, DATA(T1,input1)[i], T2, DATA(T2,input2)[i] );
    }
    ...
}
```

and let the code generator generate the file

```c
#define T1 int
#define T2 float
#define T3 float
#include "operator_add.template"
```

which would produce

```c
inline int add_int_int(int a, int b) { return a + b; }
inline float add_int_float(int a, float b) { return ((float)a) + b; }

operator_status operator_add( node_context *ctx) {
    ...
    for(int i = 0; i < length; i++) {
        output.float_data[i] = add_int_float(input1.int_data[i],input2.float_data[i]);
    }
    ...
}
```

this does not solve the problem of having all these permutations, but it allows us to generate only the permutations that are needed.
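As a side note on the two-level macros above: the extra `_ADD`/`_DATA` indirection is what forces `T1`/`T2`/`T3` to be expanded to `int`/`float` before `##` pastes the tokens together; pasting directly would glue the literal names `T1`/`T2` instead. A standalone illustration of that standard preprocessor idiom:

```c
#include <stdio.h>

#define T1 int
#define T2 float

#define PASTE_DIRECT(A,B)  fn_##A##_##B       /* ## blocks expansion   */
#define PASTE(A,B)         PASTE_DIRECT(A,B)  /* expands A and B first */

void fn_T1_T2(void)     { puts("fn_T1_T2");     }
void fn_int_float(void) { puts("fn_int_float"); }

int main(void) {
    PASTE_DIRECT(T1,T2)();  /* calls fn_T1_T2                        */
    PASTE(T1,T2)();         /* calls fn_int_float via the extra level */
    return 0;
}
```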
the code generator still does not support type subsets (ignoring specific types), which may also mitigate, but not solve, this problem.
Nice idea, but I'm not sure I want to rely on more autogenerated code (even if it's with macros). If we can cover all the cases it should be fine, but I like having the flexibility of modifying the code directly. I haven't really worked with other types than float. Some types (like float16) are stored bit-wise in the int32_data field:

```c
/*
 * For int32, uint8, int8, uint16, int16, bool, and float16 values
 * float16 values must be bit-wise converted to an uint16_t prior
 * to writing to the buffer.
 * When this field is present, the data_type field MUST be
 * INT32, INT16, INT8, UINT16, UINT8, BOOL, or FLOAT16
 */
size_t n_int32_data;
int32_t *int32_data;
```
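To make the implication concrete, a minimal sketch assuming the protobuf-c style fields above (the helper name is made up): each int32_data entry carries one float16 value as its raw uint16_t bit pattern widened to int32_t, so an operator must narrow it back before interpreting it.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical sketch: recover the 16-bit pattern of the i-th
 * FLOAT16 value stored bit-wise inside int32_data. */
static uint16_t float16_bits(const int32_t *int32_data, size_t i) {
    return (uint16_t)int32_data[i];
}
```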
There is one thing I noticed in some operators (like the above mentioned one). This is the thing I don't like about having that much autogenerated code. It's nice to take into account the particularities of each operator and be able to design solutions more ad hoc. I ran some statistics on the operators, and these are the operators that have more than 15 combinations:
Here is a script to generate it:

```python
from onnx import onnx_cpp2py_export
import itertools

operators_with_perm = {}
all_schemas = [operator for operator in onnx_cpp2py_export.defs.get_all_schemas_with_history()]
for operator in all_schemas:
    constraints_type = [c.allowed_type_strs for c in operator.type_constraints]
    operators_with_perm[operator.domain + " " + operator.name + str(operator.since_version)] = [i for i in itertools.product(*constraints_type)]

problematic_ops = {key: value for (key, value) in operators_with_perm.items() if len(value) > 15}
for key, value in problematic_ops.items():
    print(key, len(value))
```

Less than 15 combinations should be fine, so here are the operators that we have to pay attention to.
this would be no problem since we match the types through the
I think #40 is related here.
Yep, I had something like this in mind. I think we can have some kind of table with "custom rules" to limit the permutations for a specific operator. So, for example, if we know that some inputs only ever need one type:

```python
# custom_rules.py
custom_rules = [
    {"operator": "OneHot",
     "version": [],
     "constrained_inputs": [0, 1],
     "constrained_types": [["tensor(int64)"], ["tensor(int64)"]]},
    # More rules...
    {"operator": "xx", "version": [], "constrained_inputs": [], "constrained_types": []},
]
```

This table would be used by the Python generator to limit the permutations that are generated. If someone in the future desperately needs support for one specific combination, they can just modify the rule and implement the newly generated combinations.
Some autogenerated resolvers look too complex, e.g. resolve_operator__onnx__maxpool__12. We don't need that many cases, just 5: tensor(float16), tensor(float), tensor(double), tensor(int8), tensor(uint8).

Action to take: rethink the OperatorTypeResolver.py script.

@nopeslide
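A minimal sketch of what a slimmer resolver could look like, with one case per constrained type; apart from the resolver's name, which appears above, every name below is hypothetical:

```c
#include <stddef.h>

/* Stand-in declarations, not the project's actual API. */
typedef enum { DT_FLOAT16, DT_FLOAT, DT_DOUBLE, DT_INT8, DT_UINT8 } data_type_t;
typedef struct { data_type_t data_type; /* ... */ } tensor;
typedef struct { tensor **inputs; /* ... */ } node_context;
typedef int (*operator_executer)(node_context *);

extern int maxpool_12_float16(node_context *);
extern int maxpool_12_float(node_context *);
extern int maxpool_12_double(node_context *);
extern int maxpool_12_int8(node_context *);
extern int maxpool_12_uint8(node_context *);

/* One case per type in the constraint, instead of a generated case
 * for every input/output permutation. */
operator_executer resolve_operator__onnx__maxpool__12(node_context *ctx) {
    switch (ctx->inputs[0]->data_type) {
    case DT_FLOAT16: return maxpool_12_float16;
    case DT_FLOAT:   return maxpool_12_float;
    case DT_DOUBLE:  return maxpool_12_double;
    case DT_INT8:    return maxpool_12_int8;
    case DT_UINT8:   return maxpool_12_uint8;
    default:         return NULL;
    }
}
```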