Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JVM] Necromance a cheaper (de)serialization #777

Merged
merged 22 commits into from
Oct 29, 2022
Merged

Conversation

Kaiepi
Copy link
Contributor

@Kaiepi Kaiepi commented Aug 4, 2022

This started as means of fixing rakudo/rakudo#4952 for the next stretch of #776, but is kind of its own rabbit hole. The hs_err log produced by the failure in CORE.d.setting points to this line as the source of the error from within nqp:

CompilationUnit cu = (CompilationUnit)c.newInstance();

While looking around here, I noticed we are very lackadaisical with how we handle serialized bytecode within memory. Namely, when LibraryLoad.load is invoked, we can wind up copying entire files in memory multiple times over, and their decompressed classfile and bytecode files persist in memory as long as their classes do. I haven't quite managed to even provoke a change in the CORE.d.setting error produced, but I have managed to convince CORE.c.setting to build in 90% of the time it does on master by reducing the copying to just one instance so far. This can be taken further to eliminate copying of files in bulk entirely, but will take time.

Passes make test with OpenJDK 19-ea, I guess.

Kaiepi added 6 commits August 4, 2022 06:33
We use Java 9 now, so we can get rid of some of the locking involved
here.
We've been depending largely on byte *arrays* of entire files so far.
Now we prefer to mmap in small chunks when access to a file is given,
doing a heap allocation given raw byte input instead, waiting until the
last minute to dump the entire thing in memory in either case.

We introduce a ByteBufferedInputStream to help avoid repetitively
reading files into (potentially huge!) chunks of memory on its own. Its
`copy` method should *not* exist, but this class is a whole tangent on
its own, so just do the bare minimum of an implementation of an input
stream given a byte buffer.
Despite a JAR doing ZIP compression, it doesn't do so hot with Raku's
bytecode. XZ can help reduce the size of a JAR produced by us to some
degree (50957 bytes for NQPCORE.setting with XZ vs. 53286 bytes with
ZIP), and *hopefully* doesn't interfere with precomp times.
Seems to offer an improvement in build times and allows for better
handling of memory with large workloads anyways.
This isn't a very good idea because it will lead to larger bytecode when
we already have bytecode size problems, but neither is spinning up
instances with caches we never wind up using.  We'll deal with this
properly once ByteBufferedInputStream and IgnoreNameClassLoader get more
attention.
We have a shiny new serialized XZ type of file to read, so we have no
worries as to versioning of how its read; let's exploit that. Java
enjoys having names for its defined classes, so put the name of the
class in the XZ's JAR entry comment field and use it when redefining it
on read. We wind up with null if none can be found anyways, giving us
prior behaviour for uncompressed serializations too.

We have a loadClass-ish method that doesn't follow Java's spec per se.
We have the choice of linking with resolveClass after building one with
a findClass-ish method; do so.

Slap a synchronized modifier on our new-ish loadByteClass method and now
there's nothing preventing us from having parallel class loading. Mark
our new SerialClassLoader as being parallel-capable and set
-XX:+AllowParallelDefineClass now that we have a somewhat more modern
JDK and now that there shouldn't be anything else preventing us from
doing so.
Maybe this'll return as a config thing if this carries a significant
enough improvement.
@Kaiepi
Copy link
Contributor Author

Kaiepi commented Aug 4, 2022

Turns out -XX:+UseTransparentHugePages is a detriment! CORE.c.setting can now build 23% faster than on master.

ClassLoader.getClassLoadingLock would throw before, but I couldn't
figure out why: it can't handle null names because it keeps a map of
defined strings. Override ClassLoader.getClassLoadingLock; our unique
object to lock on can just be ourself for now.
Kaiepi added 3 commits August 7, 2022 08:19
We *should be* able to InputStream our way out of keeping files in
memory, with the exception of when other parts of nqp do so beforehand,
thus the existence of the MemoryClassLoader. The SerialClassLoader wraps
an input stream to read from, which JarFileClassLoader extends with the
preliminary file loading. A similar FileClassLoader exists, which only
cares about decompressed classfiles.

Regardless, Rakudo builds are broken because xz's XZInputStream has its
bugs. Womp womp.
xz now breaks Rakudo builds by its minimal, quirky InputStream
implementation in the form of XZInputStream, which doesn't play nice
with the default readAllBytes implementation, which we kinda need in
some form.

lz4, in comparison to xz and zlib, is more inefficient in terms of
compressed output size (NQPCORE.setting.jar is now 57KB compared to 52KB
w/ ZIP DEFLATE), but is nevertheless more efficient to compress and
decompress.  Because precomp files are apt to exist locally, hopefully
this tradeoff is worth it...
@Kaiepi Kaiepi force-pushed the jvm-necromancy branch 2 times, most recently from 0d208d7 to 49d6df0 Compare August 7, 2022 21:00
Kaiepi added 8 commits August 8, 2022 07:59
All of its subtypes are oriented towards files in some way,
MemoryClassLoader handling cases where the file's already in memory
anyway.
ByteClassLoader tracks which classes are defined, while GlobalContext
tracks which are loaded. The LibraryLoader depends on the former, but
what if the latter already defined a class to be loaded (circling back
to itself via require, perhaps)? ByteClassLoader can keep track of all
classes loaded or defined by ourselves if it acts as a parent for
LibraryLoader's various classloaders, eliminating such a duplicate
definition. If we give StreamClassLoader and MemoryClassLoader a new
SerialClassLoader parent that takes care of class caching, we get to
dedup some logic there as well.
We only need them once, while we still have access the original
reference to any given classfile. Also, name it such rather than the
vague "bytes" or "serial class".
This helps reduce the size of precompiled JARs, and is information not
typically included in a release to begin with.
This is a matter of picking a compression level and keeping a static
compressor and decompressor around.
@Kaiepi Kaiepi marked this pull request as ready for review August 9, 2022 15:54
@Kaiepi
Copy link
Contributor Author

Kaiepi commented Aug 9, 2022

There's more that can be done in this vein in the SerializationReader whereabouts, but I'd rather not mess with both the library loader and serialization in general at the same time. CI seems to enjoy these changes as-is.

- Actually use any cached class, read or defined...
- We need to use getName when name is null in either case, not
  getCanonicalName, so we avoid name clashing with static classes.
@usev6
Copy link
Contributor

usev6 commented Sep 19, 2022

This looks/sounds very promising.

I don't feel qualified to give it a real review, but I'll at least try to test the branch. (But that will take some time.)

@patrickbkr
Copy link
Member

Given this unblocks Kaiepi, what hinders merging this as is? It doesn't seem to touch MoarVM things, so can't break thinks too badly.

Copy link
Contributor

@usev6 usev6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I noted earlier, I don't feel qualified for a real review. But I tried my best and raised a few questions. @Kaiepi maybe you could have a look at those?

Apart from those comments this still looks very promising to me. I didn't come around to give it a real test, but will do that now.

tools/build/install-jvm-runner.pl.in Outdated Show resolved Hide resolved
tools/build/install-jvm-runner.pl.in Outdated Show resolved Hide resolved
src/vm/jvm/runtime/org/raku/nqp/runtime/GlobalContext.java Outdated Show resolved Hide resolved
src/vm/jvm/runtime/org/raku/nqp/runtime/GlobalContext.java Outdated Show resolved Hide resolved
@usev6
Copy link
Contributor

usev6 commented Oct 23, 2022

A spectest looked good as well.

@usev6
Copy link
Contributor

usev6 commented Oct 24, 2022

Given this unblocks Kaiepi, what hinders merging this as is? It doesn't seem to touch MoarVM things, so can't break thinks too badly.

From me it's a +1 for merging.

If no-one else chimes in I'll hit the merge button in a couple of days.

@usev6 usev6 merged commit ccf2d60 into Raku:master Oct 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants