Blawx code is currently represented as XML. That method has been deprecated by Blockly, and there is an issue for upgrading (#137).
Once that has happened, it will be possible to give something like the -0613 models of GPT-4 a JSON schema and a set of examples to work from, and ask it to generate Blockly code directly. There is good reason to believe that it will be able to generate valid code, either with a codebase-specific JSON schema, or with a generic schema plus detailed information on the available ontology and block types.
The problem is going to be context: an 8k window will not hold everything the model needs to know plus the multi-shot examples, and I'm expecting the results will be poor without the examples.

So I'm thinking the path forward is:
1. Implement a bi-directional compression of Blawx JSON into a smaller representation, which can be expanded back using the block definitions (a rough sketch follows this list).
2. Encode the minified JSON representation as a JSON Schema for use in the OpenAI call.
3. Create an interface to select a set of examples that fits within the context limit.
4. Create an endpoint that will make the request and update the code in the relevant section.
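As a rough sketch of what step 1 might look like, assuming a hypothetical `BLOCK_DEFS` table derived from the Blockly block definitions (none of these helper names exist in Blawx today):

```python
# Hypothetical table: for each block type, which names are fields vs.
# statement inputs. Everything else under "inputs" is a value input.
BLOCK_DEFS = {
    # "block_type": {"fields": [...], "statements": [...]},
}

def unchain(block):
    """Flatten a Blockly 'next' chain into a flat sequence of siblings."""
    while block:
        yield block
        block = block.get("next", {}).get("block")

def minify(block):
    """Blockly JSON block -> compact {type: body} form."""
    defn = BLOCK_DEFS[block["type"]]
    body = dict(block.get("fields", {}))  # fields become bare values
    for name, inp in block.get("inputs", {}).items():
        child = inp["block"]
        if name in defn["statements"]:
            # statement input -> list of blocks
            body[name] = [minify(b) for b in unchain(child)]
        else:
            body[name] = minify(child)  # value input -> single object
    return {block["type"]: body}

def expand(compact):
    """Compact form -> Blockly JSON, recovering structure from BLOCK_DEFS."""
    (block_type, body), = compact.items()
    defn = BLOCK_DEFS[block_type]
    block = {"type": block_type, "fields": {}, "inputs": {}}
    for name, value in body.items():
        if name in defn["fields"]:
            block["fields"][name] = value
        elif isinstance(value, list):
            # statement input: rebuild the 'next' chain from the list
            chain = None
            for child in reversed(value):
                node = expand(child)
                if chain:
                    node["next"] = {"block": chain}
                chain = node
            block["inputs"][name] = {"block": chain}
        else:
            block["inputs"][name] = {"block": expand(value)}
    return block
```

The point is that the compact form drops everything the block definitions can reconstruct: IDs, coordinates, and the field/input/statement distinction.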
The cheapest way to test this is to do steps 2 and 3 first, manually generate minified JSON representations for some existing encodings, and do some leave-one-out testing to see whether what we get back is syntactically correct and better than a blank screen. The larger project might involve breaking it into ontology and rule steps, because of the step-wise nature of the interface (you can't add a category and use it in the same code change).
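The harness for that test could be as simple as something like the following, using the June 2023 function-calling API; the example structure, function name, and prompt wording are all just assumptions for illustration:

```python
import json
import openai

def leave_one_out(examples, schema):
    """Hold each encoding out in turn and ask the model to regenerate it."""
    for i, held_out in enumerate(examples):
        shots = [e for j, e in enumerate(examples) if j != i]
        messages = [{"role": "system",
                     "content": "You generate Blawx code as minified Blockly JSON."}]
        # Multi-shot: replay each remaining example as a prior exchange.
        for ex in shots:
            messages.append({"role": "user", "content": ex["legal_text"]})
            messages.append({"role": "assistant", "content": None,
                             "function_call": {"name": "generate_blawx_code",
                                               "arguments": json.dumps(ex["code"])}})
        messages.append({"role": "user", "content": held_out["legal_text"]})
        response = openai.ChatCompletion.create(
            model="gpt-4-0613",
            messages=messages,
            functions=[{"name": "generate_blawx_code",
                        "description": "Emit a Blawx section as minified Blockly JSON.",
                        "parameters": schema}],
            function_call={"name": "generate_blawx_code"},
        )
        candidate = json.loads(
            response.choices[0].message.function_call.arguments)
        yield held_out, candidate  # compare to held_out["code"] for syntax/quality
```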
An example of what the minified JSON might look like is given here for section 4 of the Rock Paper Scissors Act example:
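(A hypothetical fragment in the proposed style; the block types and attribute names below are illustrative, not the actual section 4 encoding.)

```json
[
  {"rule": {
    "conclusion": {"game_winner": {
      "game": {"variable": "Game"},
      "player": {"variable": "Player"}
    }},
    "conditions": [
      {"game_participant": {
        "game": {"variable": "Game"},
        "player": {"variable": "Player"}
      }},
      {"beats": {
        "first": {"variable": "Move"},
        "second": {"variable": "OtherMove"}
      }}
    ]
  }}
]
```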
A list is a list of blocks, and indicates an input. An object is a single block, and indicates a value. (Those two aren't distinguished in the Blockly representation, either.) A bare value indicates a field. Extra state isn't distinguished from fields, except that its names don't appear in the block definition.
The only reason for doing this is that it should be roughly 1/3 the size in tokens of the Blockly JSON representation, which I'm hoping makes multi-shot prompting feasible and improves the quality of the results while staying inside an 8k-token context window.
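That ratio is easy to sanity-check once we have both serializations of the same workspace; in the sketch below, `blockly_json` and `minified_json` are placeholders for those two structures:

```python
import json
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")

# Dump both forms without extra whitespace so the comparison is fair.
blockly_tokens = len(enc.encode(json.dumps(blockly_json, separators=(",", ":"))))
minified_tokens = len(enc.encode(json.dumps(minified_json, separators=(",", ":"))))
print(minified_tokens / blockly_tokens)  # the hope is something near 1/3
```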