Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support script object pointer downcast #2018

Open
ivan-mogilko opened this issue Jun 1, 2023 · 14 comments
Open

Support script object pointer downcast #2018

ivan-mogilko opened this issue Jun 1, 2023 · 14 comments
Labels
ags 4 related to the ags4 development context: script compiler context: script vm type: enhancement a suggestion or necessity to have something improved

Comments

@ivan-mogilko
Copy link
Contributor

ivan-mogilko commented Jun 1, 2023

CC @fernewelten

After RTTI was added in ags4, it is now theoretically possible to find out the managed object's parent(s), knowing its type. This opens a possibility to support downcasting, that is - a conversion from the base type pointer to a child type pointer. As opposed to the upcasting, which may be calculated at compile time, downcasting has to be performed at runtime, as compiler cannot know which type will be stored in the base pointer.

What would be required for this?

  1. From compiler's side: a pointer cast syntax. The two obvious options here are:
  • C style: Child* child_ptr = (Child*)parent_ptr;
  • C# style: Child* child_ptr = parent_ptr as Child;
  • there's also C++ style, with cast<T>(parent_ptr) or cast<T*>(parent_ptr), but it looks more complicated.
  • EDIT: an approach not requiring a bytecode, but another instance of a "psuedo-function" (like array_Length): a static method that casts from parent to the given type, something like
child_ptr = ChildType.CastFrom(parent_ptr);
  1. A new bytecode instruction, which requests a dynamic cast. This instruction likely should take a ptr from a reg, analyze if it's possible to cast, cast, and store back in a reg (or reg1->reg2 if it's more convenient for any reason). If cast fails, a null pointer is written.
  2. From the engine's side: an ability to know object's type. The user managed structs are already storing typeid in their headers. The builtin objects do not; they may get one using the "manager"; but I have not thought this through yet. This also may be related to resolving AGS 4: Parent class for game objects #1228 first.
  3. There's an interesting question about types exported by plugin. Script compiler may see their relationship, and they should normally be a part of RTTI, but I am not sure how to match an object received from the plugin with RTTI yet. Something is missing in this picture.
@ivan-mogilko ivan-mogilko added type: enhancement a suggestion or necessity to have something improved ags 4 related to the ags4 development context: script compiler context: script vm labels Jun 1, 2023
@ivan-mogilko ivan-mogilko added this to the 4.0.0 (preliminary) milestone Jun 1, 2023
@ivan-mogilko ivan-mogilko changed the title New compiler: support pointer downcast AGS 4: support script object pointer downcast Jun 1, 2023
@ivan-mogilko ivan-mogilko changed the title AGS 4: support script object pointer downcast Support script object pointer downcast Jun 1, 2023
@ericoporto
Copy link
Member

There is also the python approach of using super. It would be something like this.super() would give the reference for that object as its parent class whatever that is - but this would be tricky with the type system we have in place.

Overall, I like the C# style best, as it doesn't automatically invite other types of casting as a complexity.

@ivan-mogilko
Copy link
Contributor Author

ivan-mogilko commented Jul 26, 2023

There is also the python approach of using super. It would be something like this.super() would give the reference for that object as its parent class whatever that is - but this would be tricky with the type system we have in place.

Hmm, but that upcasts as opposed of downcast? AGS already supports getting parent's pointer, you can do:

GUIControl *control = myButton;

EDIT: I think I forgot to mention another variant, idk how good it is, but worth mentioning: a static method that casts to this type from parent, like:

child_ptr = ChildType.CastFrom(parent_ptr);

I got this idea while thinking that GUIControl.AsButton, AsLabel etc is bad, because it hardcodes the list of descendants, so I wondered if we could instead have

Button.FromControl(control_ptr)

or similar.

@ivan-mogilko
Copy link
Contributor Author

ivan-mogilko commented Jan 7, 2025

@fernewelten , would you be willing to write a "typecast" operator for the new compiler?

I wrote a naive "dynamic cast" implementation in the engine here, in this experimental branch:
https://github.com/ivan-mogilko/ags-refactoring/commits/experiment--scriptdynamiccast/
but haven't tried adding anything to compiler's parser yet; it will probably take me some time to understand how to do this.

I'm thinking about a most trivial classic syntax of (T) value, which would mean "cast value to type T".
We may support this syntax only for pointers first and refuse to cast others (but this potentially may be expanded, see #2593).
Alternatively this may be value as T syntax in C# style, maybe that will be even easier to parse.

For pointers, I suppose, the compiler would need to do following:

  1. Check if this is going to be upcast or downcast.
  2. If upcast (child to parent), then this may be resolved at compile time same way as it is resolved now, so (T) becomes a no-op.
  3. If downcast (parent to child), then compiler should first test if type T is even a child of the value's type, and if not, then throw error, because that will never work.
  4. If all checks are passed, compiler would generate a new "dynamic cast" opcode. I called it SCMD_DYNAMICCAST in my test branch. This instruction assumes that MAR register has a handle, and arg1 is a literal value containing the target type id (this type id is local to script, similar to how SCMD_NEWUSEROBJECT2 and SCMD_NEWARRAY2 are implemented).
    Of course this is not a final suggestion, and instruction may be planned differently.
    SCMD_DYNAMICCAST is supposed to take a handle from MAR, try casting it to a new typeid, on success the handle remains in MAR, on failure writes 0 to MAR (so it becomes a null pointer), from where it is used in the rest of expression.
    In other words, SCMD_DYNAMICCAST is something like a test, its result is either same handle, or null.
    Null pointer from input remains null pointer regardless of types.
    No error is thrown on failure by SCMD_DYNAMICCAST itself, because failure is a valid outcome here too, and user is supposed to test the resulting pointer themselves.

If this is implemented in compiler, then the remaining task is implementing this in the engine. I think that so long as the instruction is designed, the engine's implementation may be done and improved at its own pace. This should work trivially for user-made managed types. I have got an idea of how to make builtin engine classes work too (there will have to be some changes, but that's engine's problem). It may or not work for plugins depending on how they registered their managed types, but maybe that's rather a matter of documenting the behavior and suggesting that plugin authors program them right.

@ericoporto
Copy link
Member

ericoporto commented Jan 7, 2025

Actually what I really want to be able to do is this

managed struct T {
  import void Do();
}

void T::Do() {
  Display("I am only T :( ...");
}

void DoExT(T* inst)
{
  inst.Do(); // make it print from ExT somehow, without specifying ExT!!
}

// in some other script
managed struct ExT extends T {
  import void Do();
}

void ExT::Do {
  Display("I am ExT :) !!!");
}

I think it's pointer upcast? I believe this is possible using python super. This would enable making a module that can take a function pointer of sorts, wrapping it as an extended struct.

@ivan-mogilko
Copy link
Contributor Author

ivan-mogilko commented Jan 8, 2025

@ericoporto your script example is not pointer upcast, it's an example of virtual functions.

Pointer upcast is casting from child to parent, and it's already supported in ags script:

Button *btn;
GUIControl *ctrl = btn;

@ericoporto
Copy link
Member

ah, ok, so it is for this https://www.adventuregamestudio.co.uk/forums/advanced-technical-forum/solved-trying-to-cast-struct-size-of-identifier-does-not-match-prototype/ .

I do think the as keyword from C# is the best way to go for this cast thing proposed here.

The other approach is no syntax at all and only support implicit cast of managed pointers. Like

GUIControl* ctrl;
Button* btn = ctrl;

@ivan-mogilko
Copy link
Contributor Author

The other approach is no syntax at all and only support implicit cast of managed pointers.

This will not make it possible to chain expressions, and also may be prone to mistakes imo.

@fernewelten
Copy link
Contributor

fernewelten commented Jan 8, 2025

@fernewelten , would you be willing to write a "typecast" operator for the new compiler?

Yes, as soon as the Engine supports this, we can get this done.

As for the AGS language, my personal favourite would be the as syntax, but I'm willing to implement anything that the consensus is:

struct Weapon { ];
struct MorningStar extends Weapon { import function foo(); };
…
Weapon w;
…
MorningStar func()
{
    (w as Morningstar).foo();
    w as Morningstar.foo(); // Do we _want_ that to compile?
    MorningStar ms = w as MorningStar;
    ms.foo();
    return ms as Weapon as MorningStar;
}

@ivan-mogilko
Copy link
Contributor Author

Alright, let it be p as T then.

Yes, as soon as the Engine supports this, we can get this done.

I can apply my SCMD_DYNAMICCAST implementation, but I do not know how to test it directly without compiler generating it...

@ivan-mogilko
Copy link
Contributor Author

ivan-mogilko commented Jan 21, 2025

Hmm, so, I've been thinking a possible implementation in the engine, and the big issue is matching a type declared by engine or plugin in script and the same type produced by engine or plugin at runtime. The way I wrote RTTI generation, it composes a fully qualified type name as a pair of "location::type", in order to distinguish types declared internally in scripts. The problem here is that both engine and plugins only know their bare "type" names, but they don't know about this "location" part. Currently it's based on script header filename, which in turn is generated according to some arbitrary rules by the editor. But I don't like having to rely on that.

At first I considered relying on "builtin" keyword, but then realized that plugins may also use this keyword (in fact, they should, although existing plugins probably do not).

Strictly speaking, engine may assume that since its types (and plugin's types) are declared for all scripts, then if there's any "type" in any location matching internal API name, then it should be that. So it could run around the RTTI table, finding these types, their global typeids, and, say, register a lookup "alias" without "location::" part, pointing to the same typeid, - that is for quicker access when doing any type matching with the engine's types.

The problem here though, even if we do that, we can't do same for plugin types, because there's no method in plugin API for registering types, there are only methods for registering type methods... which is an oversight, imo, but whether this will be changed or not is not obvious at the moment. In any case, we cannot tell which types belong to plugins, and cannot create same lookup aliases for them.

Eventually I came to a conclusion that we should add some sort of group identifier to the types declared as a engine or plugin API. AGS does not have any ready concept for that currently, there's no such thing as a "namespace" or "package" etc (maybe something like that would be useful, but it's a whole separate story, and I don't want to bring any new big feature right away).

What can be done though, is specifying a sort of a optional tag, used instead of "location" when building RTTI table. Such "tag" could be made in a more concise way than the generated header's name, and be ruled by an "official" standard.
As a quick idea, for the time being, this tag may be added as an (optional) appendix to a NEW_SCRIPT_MARKER. This will let avoid inventing new preprocessor or script commands. E.g. syntax could be "__NEWSCRIPTSTART_<scriptfilename>:<package_name>".
For example, say, for the header that declares engine's API, we may have a tag "builtin" (or maybe "ags"), and for header that declares plugin's API we have a tag named after plugin's name.
The compiler should check if such tag is available, and use it as a type's location when writing RTTI table, instead of just a script name:
__NEWSCRIPTSTART__BuiltinHeaderSomethingsomething.ash:builtin
__NEWSCRIPTSTART__PluginX.ash:pluginname
Then the engine will be able to know what types to look for when creating lookup aliases for the engine and plugin types.

@ivan-mogilko
Copy link
Contributor Author

ivan-mogilko commented Jan 21, 2025

In regards to the above comment. I figured out that it won't be possible or at least not convenient to do what I proposed without a new preprocessor command. The reason is that it's preprocessor who creates NEW_SCRIPT_MARKER in a script file, and preprocessor does not know what the script header means. We have a standalone preprocessor too, so we'd have to pass this information as an additional command-line parameter anyway, which is ugly.

I'm now thinking about this solution:

  • introduce new preprocessor command that declares the "group" for all the following type declarations in this script header.
    My current variants:
    #namespace X
    #typespace X
    #module X
    #package X
  • this command must the the first non-whitespace line in script.
  • when preprocessor finds this command, it remembers the X, and uses it to create NEW_SCRIPT_MARKER as
    __NEWSCRIPTSTART_<scriptfilename>,<package_name> (where package_name is X from the above command).
  • if such command was not found, then the NEW_SCRIPT_MARKER is created without "package_name" ammendment.
  • any repeated #package X command within the same file is ignored.
  • script compiler parses NEW_SCRIPT_MARKER, finds package name, and uses it when writing RTTI.

@ericoporto
Copy link
Member

The standalone version doesn't have the preprocessor ran separately, it runs integrated with the compiler - this is because of the macros that are configured in the compiler and necessary in the preprocessor, so there's back and forward between the two just with slight differences in how but not in behavior between editor and standalone.

@ivan-mogilko
Copy link
Contributor Author

ivan-mogilko commented Jan 21, 2025

The standalone version doesn't have the preprocessor ran separately, it runs integrated with the compiler - this is because of the macros that are configured in the compiler and necessary in the preprocessor, so there's back and forward between the two just with slight differences in how but not in behavior between editor and standalone.

Okay, but the point is, it still receives all data as command line arguments, while in the editor you can potentially pass arguments directly using a program interface. What I meant above, the standalone compiler program will not know what the header means, is that a engine api, or else.
In addition, I would not want to create direct program dependency between compiler and preprocessor, as they may be run as separate tools in theory too.

@ivan-mogilko
Copy link
Contributor Author

Opened a draft PR for the engine side: #2665

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ags 4 related to the ags4 development context: script compiler context: script vm type: enhancement a suggestion or necessity to have something improved
Projects
None yet
Development

No branches or pull requests

3 participants