Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add macOS Universal Binary with Apple Silicon support #55

Conversation

skalarproduktraum
Copy link
Contributor

@skalarproduktraum skalarproduktraum commented Apr 15, 2022

This PR adds the following, based on what I've written in #37:

  • a script to create a macOS universal binary with x86_64 and ARM64/Apple Silicon support
  • an updated version of the included macOS native blob as universal binary
  • a sketch of a potential version for Metal surface creation that doesn't require a native blob
  • ClearScreenDemo, a somewhat eviscerated version of @httpdigest's TriangleDemo from the lwjgl3-demos repository, to have a Vulkan example in this repo that actually renders something for easier testing
  • removal of the AWT support check in SimpleDemo, as it fails on ARM64 for some reason

The following has been tested:

  • Vulkan rendering on macOS/ARM64 using the ClearScreenDemo and an actual application
    image
    image

  • Vulkan rendering on macOS/x86_64 using the ClearScreenDemo and an actual application

  • OpenGL rendering on macOS/ARM64

  • OpenGL rendering on macOS/x86_64

A current caveat might be that linking-wise, I am not using -framework JavaVM at the moment, so the linker might have trouble actually finding the correct JVM binaries. I have tested this only on my M1 machine so far, so any input there would be highly appreciated 👍

I hope this is useful, and any feedback or testing is highly welcome!

@SWinxy
Copy link
Collaborator

SWinxy commented Apr 15, 2022

I've been able to test it on my M1 Mac and my Intel Mac. Both Vulkan demos work, as well as its use on an actual application! OpenGL's tests don't work at all, but that's an issue that isn't caused by your additions (the crash is at NSOpenGLContext setView:, caused by the call to caFlush()).

Images: Intel first, then Apple Silicon. Ignore that the surfaces are low resolution; I have to fix that...
Intel
Apple Silicon

In all, looks good. Frankly, I'm jealous that I wasn't able to find a fix for the M1 support. Were you able to get the Java-based creation working? I tried to get that to work, but wasn't able to. We can keep with the binary, though one of the goals is to get the Java code working.

@skalarproduktraum
Copy link
Contributor Author

Thanks for checking this out, @SWinxy!

And no need to be jealous, I think this is a team effort and it only works because of the things you've posted in #37. There's actually not a lot I did on top of it (removal of AWT check, addition of universal binary, add test), so you do deserve a lot of credit 👍

How shall we proceed with this? The macOS build currently seems to fail with a 404 I can't reproduce. In terms of moving away from the native, there actually a bunch of similar code to yours in this PR, see

private long createMTKViewNoNative(long platformInfo, int x, int y, int width, int height) {
, however it doesn't work yet. I can remove it or leave it for the moment. I'd suggest however that removal of the native blob would be content for another PR, so we can first get a working macOS/ARM64 version, then improve on top of it. What do you think?

@skalarproduktraum
Copy link
Contributor Author

Ah ... one more thing (🤪): Would it make sense to temporarily get rid of the caFlush() part so we can get this PR fully working? Or would that be stupid or too much effort?

skalarproduktraum added a commit to scenerygraphics/scenery that referenced this pull request Apr 17, 2022
@SWinxy
Copy link
Collaborator

SWinxy commented Apr 18, 2022

caFlush90's issue is tracked in #51. Ignore it for now, but I'll have to eventually figure out what is going on. The PR can be merged even though the tests complain about it. As for isPlatformSupported(), that was there because of the Mac ARM issue (i.e. #37), so that developers could inform users that M1 Macs couldn't be supported. I suppose we could toss it since it was added in a snapshot. A thought I had was to split the universal binary into two different packages, but that seems like much effort for little gain (even with both, it's still only 130kb).

.apiVersion(VK_MAKE_VERSION(1, 1, 0));

Why the change in Vulkan version in the example, btw? Just curious.

After this PR is merged, I should try and reduce the amount of native code, even if native code will still exist. Whittle it down to what really need to be done in precompiled code, rather than fail trying to do all at once.
Another thing to do is to have a GitHub action to rebuild the native library when pushed.

@skalarproduktraum
Copy link
Contributor Author

Ah, cool, thanks for the explanation! After looking at your code and getting inspired, I did actually toy around with the ObjectiveC Runtime wrapper a bit more yesterday, and arrived at something like

private long createMTKViewNoNative(long platformInfo, int x, int y, int width, int height) {
      long objc_msgSend = ObjCRuntime.getLibrary().getFunctionAddress("objc_msgSend");
      SharedLibrary lib = MacOSXLibrary.getWithIdentifier("com.apple.Metal");

      // id<MTLDevice> device = MTLCreateSystemDefaultDevice();
      long address = lib.getFunctionAddress("MTLCreateSystemDefaultDevice");
      long device = invokeP(address);
      System.out.println("Device address: " + device);

        // get offset in window from JAWTSurfaceLayers
        long H = JNI.invokePPPPP(platformInfo,
          ObjCRuntime.sel_getUid("windowLayer"),
          ObjCRuntime.sel_getUid("frame"),
          ObjCRuntime.sel_getUid("size"),
          objc_msgSend);
        // height is the 4th member of the 4*64bit struct
        double h = MemoryUtil.memGetDouble(H+3*8);
        System.out.println("Height is " + h);

        // CGRect creation, layout is x/y/width/height, all as doubles on 64bit architectures
        ByteBuffer frame = ByteBuffer.allocateDirect(4*8).order(ByteOrder.nativeOrder());
        frame.putDouble(x);
        frame.putDouble(h-height-y);
        frame.putDouble(width);
        frame.putDouble(height);
        frame.flip();

        // MTKView *view = [[MTKView alloc] initWithFrame:frame device:device];
        // get MTKView class and allocate instance
        long MTKView = ObjCRuntime.objc_getClass("MTKView");
        System.out.println("MTKView is " + Long.toHexString(MTKView));
        long mtkView = JNI.invokePPP(MTKView,
          ObjCRuntime.sel_getUid("alloc"),
          objc_msgSend);
        System.out.println("MTKView instance is " + Long.toHexString(mtkView) + ", using buffer " + Long.toHexString(MemoryUtil.memAddress(frame)));

        // init MTKView with frame and device
        long view = JNI.invokePPPPP(mtkView,
          ObjCRuntime.sel_getUid("initWithFrame:"),
          MemoryUtil.memAddress(frame),
          device,
          objc_msgSend);
        System.out.println("view is "+ Long.toHexString(view) + " with " + x + "/" + (h-height-y) + "/" + width + "/" + height);

        // get layer from MTKView instance
        long mtkViewLayer = JNI.invokePPP(mtkView,
          ObjCRuntime.sel_getUid("layer"),
          objc_msgSend);
        System.out.println("Layer is " + Long.toHexString(mtkViewLayer));


        // set layer on JAWTSurfaceLayers object
        JNI.invokePPPV(platformInfo,
          ObjCRuntime.sel_getUid("setLayer:"),
          mtkViewLayer,
          objc_msgSend);

//         surfaceLayers.layer = view.layer;
        // return view.layer
        return mtkViewLayer;
    }

This almost works, the ObjC messages do look correct, there's only one error remaining:

2022-04-17 23:58:19.295 java[4133:3065236] CAMetalLayer ignoring invalid setDrawableSize width=0.000000 height=0.000000
Exception in thread "main" java.lang.AssertionError: Failed to acquire next swapchain image: A swapchain no longer matches the surface properties exactly, but can still be used to present to the surface successfully.

Okay, technically two, but if the CAMetalLayer one goes away, then the swapchain would also be correctly usable :-D If you have any ideas how to fix this, we could actually get rid of the native blob with this PR 🥳

@SWinxy
Copy link
Collaborator

SWinxy commented Apr 18, 2022

It works? Huh. You might've solved it.

The errors you're getting appear to be from your demo code. My project doesn't throw any errors.

CAMetalLayer ignoring invalid setDrawableSize is being caused by vkCreateSwapchainKHR() and I'm getting this weird [CAMetalLayer nextDrawable] returning nil because allocation failed being spammed in the console, which is being caused by vkQueueSubmit().

I haven't tested on my M1 Mac yet; this is on the x86 one.

Fantastic.

@SWinxy
Copy link
Collaborator

SWinxy commented Apr 19, 2022

OK. I was able to confirm the no-native-code works on my M1. The next step would be seeing if we can get OpenGL working w/o native code. I'm no OpenGL dev, so I dunno. It may be that some change in the past year broke it (which hopefully can be undone or something). If not, that's fine. I can merge your PR and release 1.9.

For 2.0, OpenGL and Vulkan must be working on both x86 and Apple Silicon devices using pure Java and no binaries. I'd hope to remove the now-deprecated classes and methods, too.

Thank you again for bringing your skills to the table.

@mynewestgitaccount
Copy link

I've been trying to follow along with the no-natives version of createMTKView, and I'm slightly confused.

I can get to the same points you've gotten to, successfully running the demos with the natives and getting the same CAMetalLayer ignoring invalid setDrawableSize width=0.000000 height=0.000000 thing with the createMTKViewNoNative version shared above.

However, if I comment out the static {Library.loadSystem("org.lwjgl.awt", "lwjgl3awt");} line, then most of the outputs from the no-native version are 0, and both demos fail to work.

Here is what gets printed when running the ClearScreenDemo with static {Library.loadSystem("org.lwjgl.awt", "lwjgl3awt");} left in:

Device address: 5427476992
Height is 0.0
MTKView is 1ef5e30d8
MTKView instance is 15399dc00, using buffer 600002b4ece0
view is 15399dc00 with 0/-572.0/600/572
Layer is 6000025822e0
2022-04-18 22:46:27.944 java[58967:3708014] CAMetalLayer ignoring invalid setDrawableSize width=0.000000 height=0.000000
Exception in thread "main" java.lang.AssertionError: Failed to acquire next swapchain image: A swapchain no longer matches the surface properties exactly, but can still be used to present to the surface successfully.
	at org.lwjgl.vulkan.awt.ClearScreenDemo.main(ClearScreenDemo.java:882)

And here is what gets printed with static {Library.loadSystem("org.lwjgl.awt", "lwjgl3awt");} commented out:

Device address: 5855648768
Height is 0.0
MTKView is 0
MTKView instance is 0, using buffer 6000008504e0
view is 0 with 0/-572.0/600/572
Layer is 0
Exception in thread "main" java.lang.AssertionError: No presentation queue found
	at org.lwjgl.vulkan.awt.ClearScreenDemo.getColorFormatAndSpace(ClearScreenDemo.java:228)
	at org.lwjgl.vulkan.awt.ClearScreenDemo.main(ClearScreenDemo.java:778)

It seems like the no-natives version still requires the natives in some way. Is there something I'm missing?


Also, when I tried to use the liblwjgl3awt.dylib attached to this pull request, I was told “liblwjgl3awt.dylib” cannot be opened because the developer cannot be verified., so I had to compile my own. Is that expected? If so, that doesn't seem ideal.

@SWinxy
Copy link
Collaborator

SWinxy commented Apr 19, 2022

Hmmm... You're right. There must be something in the compiled library that is causing a load of some native API, but isn't reflected in the Java-only code. So close.

@skalarproduktraum
Copy link
Contributor Author

Interesting! I actually totally forgot to remove the native blob loading, great catch! If I find some time today, I'll try to track this down further, looks like we're on the right path here. One thing that might be happening via the System.load() is that the Metal framework is loaded, which might not happen with the MacOSXLibrary.getWithIdentifier() - maybe @Spasi can comment on this?

@skalarproduktraum
Copy link
Contributor Author

Alrighty,

SharedLibrary mtk = MacOSXLibrary.create("/System/Library/Frameworks/MetalKit.framework");
SharedLibrary lib = MacOSXLibrary.create("/System/Library/Frameworks/Metal.framework");
mtk.getFunctionAddress("MTKView");

seems to have been the missing piece if the native blob is not loaded. Maybe the routines don't get registered with the ObjC runtime if no function is looked up or whatever. Now I'm back at the same place where I was with the native, with the zero width/height error. I guess something with the CGRect struct is still fishy.

@skalarproduktraum
Copy link
Contributor Author

Alrightyˆ2, got it to draw using only the Java code:
Screenshot 2022-04-19 at 14 53 13

Unfortunately, by hardcoding the surface height/width, otherwise the surface itself returns 0 for width and height, so I still think that something's fishy with the CGRect struct. It continues working after resizing too, though.

@mynewestgitaccount
Copy link

mynewestgitaccount commented Apr 19, 2022

I can confirm that by adding the mtk.getFunctionAddress("MTKView");, I was able to get the no-natives code working without loading the natives. Great investigative work!

Also, using your no-natives metal view code as a guide, I think I was able to come up with something that should work for the no-natives OpenGL view. However, due to the caFlush issue (#51), it's impossible to see if it actually does work. Very frustrating!

@skalarproduktraum
Copy link
Contributor Author

Hey there! Sorry for not being in touch, didn't have time to poke more. Considering that the non-native code still needs work (mainly in the form of figuring out how the NSRect can be passed by-value), I'd probably remove that again from the PR, and leave this just with the working binary blob.

I am wondering though, the “liblwjgl3awt.dylib” cannot be opened because the developer cannot be verified error @mynewestgitaccount reported, how can we get rid of that one? It's obviously related to code-signing. So either I sign the binary, which is somewhat suboptimal, or we do that on CI somehow. Does anyone know how lwjgl3 handles that on CI? Maybe @Spasi or @httpdigest can comment?

@skalarproduktraum
Copy link
Contributor Author

Okay, I have just removed the code for the non-native surface creation. I'll put that back in another PR, as I think it's a bit beyond scope here.

So the Vulkan stuff is working now, OpenGL I can't reasonably test because of another issue. It'd still be interesting if we can resolve the developer cannot be verified issue. If anyone has an idea, let me know 👍

@skalarproduktraum
Copy link
Contributor Author

I just checked the libraries lwjgl ships, they are not signed or notarised. So we could probably proceed like this, and even build the binary on the CI.

@SWinxy
Copy link
Collaborator

SWinxy commented Jul 28, 2022

Superseded by #57.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants