Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The toString() of 'class' and 'function' return SourceText #1726

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

scgm0
Copy link
Contributor

@scgm0 scgm0 commented Jan 5, 2024

I noticed that the previous reason for returning native code was that Esprima didn't support it (in the ToString comment for Function), but I found that Esprima now supports returning SourceText, so wanted to align with the behaviour of the mainstream js engines.
Unfortunately, passing the Test262 test still requires waiting for Esprima to refine the functionality, but that's outside the scope of this pr and beyond my capabilities (
(I accidentally closed the old pr earlier and reset my repository, so I'm re-proposing it now...)

@lofcz
Copy link
Contributor

lofcz commented Feb 7, 2024

@lahma I just tested the branch and while there is still room for improvement as mentioned in by scgm0, this is already a great improvement over the current behaviour. Can we rebase the branch and merge it, or is there something blocking that?

@lahma
Copy link
Collaborator

lahma commented Feb 7, 2024

See the build status.

@lofcz
Copy link
Contributor

lofcz commented Feb 7, 2024

sorry, took another look into the failed tests, all but the ones (90) testing for the native function toString() could be fixed by modifying the Regex used but I don't know how to fix these without Esprima returning the correct function name in the first place.

@scgm0
Copy link
Contributor Author

scgm0 commented Feb 8, 2024

There are now only 90 toString tests that have failed. In fact, they cannot pass the toString test now. They pass the "NativeFunction" test in "assertToStringOrNativeFunction" in the main repository.

main jint: function f() { [native code] }
图片
test262: ( /* a */ a /* b */ , /* c */ b /* d */ ) /* e */ => /* f */ { /* g */ ; /* h */ }
图片
this pr: (a, b) => { ; }
图片

@scgm0
Copy link
Contributor Author

scgm0 commented Feb 8, 2024

@lahma Should we skip related tests? Or should we wait for Esprima to update? Does Esprima have any relevant update plans?

@lofcz
Copy link
Contributor

lofcz commented Feb 8, 2024

I think skipping the 90 tests would be reasonable as these cover only function toString() which is wastly improved in this PR.
Nit: is there a reason not to precompile the regex used instead of constructing it every time?

@scgm0
Copy link
Contributor Author

scgm0 commented Feb 8, 2024

I think skipping the 90 tests would be reasonable as these cover only function toString() which is wastly improved in this PR. Nit: is there a reason not to precompile the regex used instead of constructing it every time?

This is a test code. I can change it anytime.

@scgm0
Copy link
Contributor Author

scgm0 commented Feb 8, 2024

I think skipping the 90 tests would be reasonable as these cover only function toString() which is wastly improved in this PR. Nit: is there a reason not to precompile the regex used instead of constructing it every time?

How about now?

@lofcz
Copy link
Contributor

lofcz commented Feb 8, 2024

@scgm0 thanks
@lahma could you have another look at this please? The 90 tests would need to be skipped.

@scgm0
Copy link
Contributor Author

scgm0 commented Feb 9, 2024

Changing just the jint no longer made the results more correct, so I updated Test262Harness.settings.json to skip those 90 toString tests

@lofcz
Copy link
Contributor

lofcz commented Feb 18, 2024

I wonder whether we wait for Acornima with this one, as it should provide the information we need from the quick look I took - https://github.com/adams85/acornima

@scgm0
Copy link
Contributor Author

scgm0 commented Feb 18, 2024

I wonder whether we wait for Acornima with this one, as it should provide the information we need from the quick look I took - https://github.com/adams85/acornima

It doesn't seem to work.
图片

@lofcz
Copy link
Contributor

lofcz commented Feb 21, 2024

I wonder whether we wait for Acornima with this one, as it should provide the information we need from the quick look I took - https://github.com/adams85/acornima

It doesn't seem to work. 图片

AST to JS was implemented in Acornima today though there are still some rough edges from my testing.

@scgm0
Copy link
Contributor Author

scgm0 commented Feb 21, 2024

I wonder whether we wait for Acornima with this one, as it should provide the information we need from the quick look I took - https://github.com/adams85/acornima

It doesn't seem to work. 图片

AST to JS was implemented in Acornima today though there are still some rough edges from my testing.

That fast?
@lahma What do you think?

@@ -359,6 +364,10 @@ public ClassStaticBlockFunction(StaticBlock staticBlock) : base(Nodes.StaticBloc
{
var methodDef = method.DefineMethod(obj);
methodDef.Closure.SetFunctionName(methodDef.Key);
// TODO currently in the SourceText retrieved from Esprima, the method name is incorrect, so for now, use a string replacement.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make more sense to fix the wrong name on Esprima side?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I don't know how to fix it in Esprima (

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of example test cases. You would just need to check the produced AST and that the identifier is correct. Minimal test case helps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it, and after debugging for a while, I gave up. Maybe you can take a look? I'll try it again later.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally have a lot more free time for OSS if someone takes care of my child and cleans the house.

@adams85
Copy link
Contributor

adams85 commented Apr 24, 2024

Chiming in with some additional thoughts, hope you don't mind.

In my opinion this (i.e. utilizing the AST to JS source gen feature of the parser) is not the right way to solve this problem. Neither Esprima, nor Acornima is able to reproduce the exact source text for reasons thoroughly discussed here.

Instead of the source gen way, I'd suggest a simpler one that neither needs more support from the parser side, nor ugly hacks on the runtime side: Jint just needs to keep around the original script/module code that was parsed. Then, when you need the source text of a function or class, you can get that as a slice of the script/module code, based on the Range property of the related AST node.

The tricky part is where to store the original script/module code so it can be accessed wherever needed. At first glance, something like ExecutionContext.ScriptOrModule looks like a good candidate. However, that doesn't seem to work with dynamic code (eval, Function ctor, shadow realms).

What's even worse, let's consider a script like this: eval('function fn() {}'); fn.toString(). Here the direct eval defines a function in the caller's scope. But the source text of fn must be extracted from the evaled code... 🤯

So, I can't really see a better way than using the OnNode(Created) callback to store a reference to the parsed code string into the relevant node's AssociatedData/UserData property. This looks a bit cumbersome though, as that property is already used by Jint for other purposes. But probably, with some gymnastics, you could make it work.

But @lahma might have an even better idea on storing this piece of information.

@scgm0
Copy link
Contributor Author

scgm0 commented Sep 15, 2024

Chiming in with some additional thoughts, hope you don't mind.

In my opinion this (i.e. utilizing the AST to JS source gen feature of the parser) is not the right way to solve this problem. Neither Esprima, nor Acornima is able to reproduce the exact source text for reasons thoroughly discussed here.

Instead of the source gen way, I'd suggest a simpler one that neither needs more support from the parser side, nor ugly hacks on the runtime side: Jint just needs to keep around the original script/module code that was parsed. Then, when you need the source text of a function or class, you can get that as a slice of the script/module code, based on the Range property of the related AST node.

The tricky part is where to store the original script/module code so it can be accessed wherever needed. At first glance, something like ExecutionContext.ScriptOrModule looks like a good candidate. However, that doesn't seem to work with dynamic code (eval, Function ctor, shadow realms).

What's even worse, let's consider a script like this: eval('function fn() {}'); fn.toString(). Here the direct eval defines a function in the caller's scope. But the source text of fn must be extracted from the evaled code... 🤯

So, I can't really see a better way than using the OnNode(Created) callback to store a reference to the parsed code string into the relevant node's AssociatedData/UserData property. This looks a bit cumbersome though, as that property is already used by Jint for other purposes. But probably, with some gymnastics, you could make it work.

But @lahma might have an even better idea on storing this piece of information.

In fact, the method of intercepting the original text cannot pass the test262 test, because some tests require that the code returned by toString be generated and formatted by AST, and the original text cannot meet this requirement, so even it now passes AST The generated code is not perfect, but it is at least a formatted result.

@adams85
Copy link
Contributor

adams85 commented Sep 15, 2024

I'm not sure I follow you... So, are you saying that there are some tests which require the original source text of the function to be regenerated/reformatted? Can you show me such a test?

@scgm0
Copy link
Contributor Author

scgm0 commented Sep 15, 2024

I'm not sure I follow you... So, are you saying that there are some tests which require the original source text of the function to be regenerated/reformatted? Can you show me such a test?

https://github.com/tc39/test262/blob/main/test/built-ins/Function/prototype/toString/AsyncFunction.js

@adams85
Copy link
Contributor

adams85 commented Sep 16, 2024

Oh ok, I see now what you are getting at.

Apparently, functions not parsed directly but created dynamically via the Function constructor will need special treatment. In that case the runtime (Jint) should generate the source text of the function. According to my tests, this is done by simply concatenating the parameters passed to the Function constructor:

(async () => {
  async function f() {}
  var AsyncFunction = f.constructor;
  var p1 = "a", p2 = "/*a */ b, c";
  var g = /* before */AsyncFunction(p1, p2, "return Promise.resolve(a+b+c) /* d */ //")/* after */; 
  console.log(g.toString());
  console.log(await g(1, 2, 3));
})()

If you want to verify that there is indeed a simple string concatenation under the hood (at least in V8 and Firefox), run this:

(async () => {
  async function f() {}
  var AsyncFunction = f.constructor;
  var p1 = "a //", p2 = "/*a */ b, c";
  var g = /* before */AsyncFunction(p1, p2, "return Promise.resolve(a+b+c) /* d */ //")/* after */; 
  console.log(g.toString());
  console.log(await g(1, 2, 3));
})()

@adams85
Copy link
Contributor

adams85 commented Sep 16, 2024

As far as I can tell, a template like this is used to generate the function source text:

`${isPrototypeAsync ? "async " : ""}function anonymous(${[...arguments].slice(0, arguments.length -1).join(",")}
) {
${arguments[arguments.length -1]}
}`

Probably generator functions are also supported, so it may be a bit more complicated than this.

EDIT: looks like this is already implemented here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants