Deadlock in OpenAPI RouterBuilder #2383

Closed
jawaff opened this issue Mar 8, 2023 · 12 comments

@jawaff

jawaff commented Mar 8, 2023

Questions

N/A

Version

4.3.4 to 4.3.8

Context

We're seeing what are probably deadlocks, but only in our CI pipeline, which runs our unit tests. The pipeline uses a GitLab runner on an AWS T3 medium EC2 instance. Vertx starts up completely fine on our dev machines and works in our production environments as well. I've done a lot of experimenting and diagnosing to pinpoint where the issue occurs, and I'm very confident that something weird is going on in Vertx with some of the more unusual json schemas that we have.

The deadlock/hanging happens when we're standing up our Vertx server and specifically when we're using the OpenAPI RouterBuilder for reading in an OpenAPI spec. It has something to do with the Json Schemas that we're referencing in that spec and specifically with combinations of "oneOf" and "$ref" in those schemas. I've seen the deadlock/hanging occur on different OpenAPI specs after reordering them in our actual project. It's like Vertx gets overloaded with $refs and is unable to continue processing the json schemas.

Do you have a reproducer?

I was able to reproduce the issue with this project. Similarly, the tests run just fine on a dev machine, but the deadlock/hanging occurs in our CI pipeline when the tests are run.

https://github.com/jawaff/vert-test

Here's a suspicious exception from Vertx's OpenAPIHolderImpl that was repeatedly printed in our CI pipeline (the process has been getting killed by the gitlab-runner after one hour due to a timeout, but takes ~9 seconds to run locally):

19:10:27.545 [vertx-blocked-thread-checker] WARN io.vertx.core.impl.BlockedThreadChecker - Thread Thread[vert.x-eventloop-thread-0,5,main] has been blocked for 921228 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked
	at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1718)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.resolveExternalRef(OpenAPIHolderImpl.java:267)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl$$Lambda$394/850417910.apply(Unknown Source)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1580)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.walkAndSolve(OpenAPIHolderImpl.java:208)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.lambda$null$5(OpenAPIHolderImpl.java:272)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl$$Lambda$396/2657808.apply(Unknown Source)
	at io.vertx.core.impl.future.Composition.onSuccess(Composition.java:38)
	at io.vertx.core.impl.future.FutureBase.emitSuccess(FutureBase.java:60)
	at io.vertx.core.impl.future.FutureImpl.addListener(FutureImpl.java:196)
	at io.vertx.core.impl.future.FutureBase.compose(FutureBase.java:84)
	at io.vertx.core.Future.compose(Future.java:208)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.lambda$resolveExternalRef$6(OpenAPIHolderImpl.java:270)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl$$Lambda$395/2140534660.apply(Unknown Source)
	at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.resolveExternalRef(OpenAPIHolderImpl.java:267)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl$$Lambda$394/850417910.apply(Unknown Source)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1580)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.walkAndSolve(OpenAPIHolderImpl.java:208)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.lambda$null$5(OpenAPIHolderImpl.java:272)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl$$Lambda$396/2657808.apply(Unknown Source)
	at io.vertx.core.impl.future.Composition.onSuccess(Composition.java:38)
	at io.vertx.core.impl.future.FutureBase.emitSuccess(FutureBase.java:60)
	at io.vertx.core.impl.future.FutureImpl.tryComplete(FutureImpl.java:211)
	at io.vertx.core.impl.future.Composition$1.onSuccess(Composition.java:62)
	at io.vertx.core.impl.future.FutureBase.emitSuccess(FutureBase.java:60)
	at io.vertx.core.impl.future.SucceededFuture.addListener(SucceededFuture.java:88)
	at io.vertx.core.impl.future.Composition.onSuccess(Composition.java:43)
	at io.vertx.core.impl.future.FutureBase.lambda$emitSuccess$0(FutureBase.java:54)
	at io.vertx.core.impl.future.FutureBase$$Lambda$393/167017155.run(Unknown Source)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:750)
19:10:28.545 [vertx-blocked-thread-checker] WARN io.vertx.core.impl.BlockedThreadChecker - Thread Thread[vert.x-eventloop-thread-0,5,main] has been blocked for 922227 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked
	at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1718)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.resolveExternalRef(OpenAPIHolderImpl.java:267)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl$$Lambda$394/850417910.apply(Unknown Source)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1580)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.walkAndSolve(OpenAPIHolderImpl.java:208)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.lambda$null$5(OpenAPIHolderImpl.java:272)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl$$Lambda$396/2657808.apply(Unknown Source)
	at io.vertx.core.impl.future.Composition.onSuccess(Composition.java:38)
	at io.vertx.core.impl.future.FutureBase.emitSuccess(FutureBase.java:60)
	at io.vertx.core.impl.future.FutureImpl.addListener(FutureImpl.java:196)
	at io.vertx.core.impl.future.FutureBase.compose(FutureBase.java:84)
	at io.vertx.core.Future.compose(Future.java:208)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.lambda$resolveExternalRef$6(OpenAPIHolderImpl.java:270)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl$$Lambda$395/2140534660.apply(Unknown Source)
	at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.resolveExternalRef(OpenAPIHolderImpl.java:267)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl$$Lambda$394/850417910.apply(Unknown Source)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1580)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.walkAndSolve(OpenAPIHolderImpl.java:208)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.lambda$null$5(OpenAPIHolderImpl.java:272)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl$$Lambda$396/2657808.apply(Unknown Source)
	at io.vertx.core.impl.future.Composition.onSuccess(Composition.java:38)
	at io.vertx.core.impl.future.FutureBase.emitSuccess(FutureBase.java:60)
	at io.vertx.core.impl.future.FutureImpl.tryComplete(FutureImpl.java:211)
	at io.vertx.core.impl.future.Composition$1.onSuccess(Composition.java:62)
	at io.vertx.core.impl.future.FutureBase.emitSuccess(FutureBase.java:60)
	at io.vertx.core.impl.future.SucceededFuture.addListener(SucceededFuture.java:88)
	at io.vertx.core.impl.future.Composition.onSuccess(Composition.java:43)
	at io.vertx.core.impl.future.FutureBase.lambda$emitSuccess$0(FutureBase.java:54)
	at io.vertx.core.impl.future.FutureBase$$Lambda$393/167017155.run(Unknown Source)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:750)
19:10:29.545 [vertx-blocked-thread-checker] WARN io.vertx.core.impl.BlockedThreadChecker - Thread Thread[vert.x-eventloop-thread-0,5,main] has been blocked for 923228 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked
	at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1718)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.resolveExternalRef(OpenAPIHolderImpl.java:267)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl$$Lambda$394/850417910.apply(Unknown Source)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1580)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.walkAndSolve(OpenAPIHolderImpl.java:208)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.lambda$null$5(OpenAPIHolderImpl.java:272)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl$$Lambda$396/2657808.apply(Unknown Source)
	at io.vertx.core.impl.future.Composition.onSuccess(Composition.java:38)
	at io.vertx.core.impl.future.FutureBase.emitSuccess(FutureBase.java:60)
	at io.vertx.core.impl.future.FutureImpl.addListener(FutureImpl.java:196)
	at io.vertx.core.impl.future.FutureBase.compose(FutureBase.java:84)
	at io.vertx.core.Future.compose(Future.java:208)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl.lambda$resolveExternalRef$6(OpenAPIHolderImpl.java:270)
	at io.vertx.ext.web.openapi.impl.OpenAPIHolderImpl$$Lambda$395/2140534660.apply(Unknown Source)
	at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
Job's log exceeded limit of 4194304 bytes.
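
The trace above shows ConcurrentHashMap.computeIfAbsent being re-entered from inside its own mapping function (resolveExternalRef calls walkAndSolve, which calls resolveExternalRef again). The ConcurrentHashMap Javadoc says the mapping function must not attempt to update any other mappings of the map, and on Java 8 such a re-entrant update may hang rather than fail fast. Whether that is the actual cause here is not confirmed in this issue. The following is only a minimal sketch of the pattern and one way to avoid it; RefResolverSketch, resolve and the "resolved:" values are hypothetical names and are not Vert.x code.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a resolver that walks documents and follows their references.
final class RefResolverSketch {
    // Assumed input: each document URI mapped to the URIs it references.
    private final Map<String, List<String>> refs;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    RefResolverSketch(Map<String, List<String>> refs) {
        this.refs = refs;
    }

    // Risky shape (left as a comment on purpose): the mapping function recurses
    // into the same map, which the ConcurrentHashMap Javadoc forbids.
    //
    //   cache.computeIfAbsent(uri, u -> { refs.get(u).forEach(this::resolve); return "resolved:" + u; });

    // Safer shape: reserve the key with putIfAbsent, then recurse outside of any
    // map callback. Reserving first also makes cyclic references terminate.
    String resolve(String uri) {
        String cached = cache.get(uri);
        if (cached != null) {
            return cached;
        }
        String value = "resolved:" + uri;
        String previous = cache.putIfAbsent(uri, value);
        if (previous != null) {
            return previous; // another caller reserved this key first
        }
        for (String dependency : refs.getOrDefault(uri, Collections.<String>emptyList())) {
            resolve(dependency);
        }
        return value;
    }

    int resolvedCount() {
        return cache.size();
    }
}
```

With a cyclic graph such as a -> [b, c], b -> [a], c -> [], calling resolve("a") visits each document once and returns without nesting any map callback.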

Steps to reproduce

  1. Create a Vertx HTTP Server Verticle that utilizes the OpenAPI RouterBuilder.
  2. Write an OpenAPI spec with a single endpoint that references json schemas.
  3. Make those json schemas reference other json schemas in combination with "oneOf" or "allOf". (The issue is related to too many polymorphic schema definitions that utilize "$ref" and seemingly only occurs in an environment with a lack of resources.)
  4. Mount that OpenAPI spec multiple times onto different routes for the HTTP Server so that Vertx will get overwhelmed.
  5. Run a test locally that stands up Vertx with that verticle and watch it succeed.
  6. The trick in reproducing it is probably running the test on a machine with fewer resources available. The T3 medium EC2 instance in AWS might be what sets everything up to fail this way. We might also be using a different EC2 instance than the one I'm expecting, so my best guess is just a machine with a lack of resources.

Extra

We're using Java 8 and OSX for development. I don't have a lot of information about our CI environment, but I can look further into that if it's needed.

@jawaff jawaff added the bug label Mar 8, 2023
@pk-work
Contributor

pk-work commented Feb 20, 2024

Hi, there is a completely new rebuild of the OpenAPI Router [1], which hopefully solves the problem. Please check whether this works for you.

[1] https://vertx.io/docs/vertx-web-openapi-router/java/

@pk-work pk-work self-assigned this Feb 20, 2024
@jawaff
Author

jawaff commented Mar 29, 2024

We've been watching the work on the new openapi router. I'd like to move to the newer json schema versions for sure. I tried a first integration pass, but we're stuck on loading the OpenAPIContract because it doesn't support json schemas in the additional contract files. I made a comment on a relevant issue in the vertx-openapi project. We have a bunch of json schemas that are separate from our openapi files.

Once all of that works for us then I think this issue loses relevance. We were able to get around the deadlock by refactoring the schemas a bit.

@tnmtechnologies

@pk-work Do you know if the smallrye-mutiny wrapper is available?

@pk-work
Contributor

pk-work commented Apr 3, 2024

@tnmtechnologies it doesn't look like that [1].

@jponge what do I need to do, to have mutiny bindings?

[1] https://github.com/smallrye/smallrye-mutiny-vertx-bindings/tree/main/vertx-mutiny-clients

@jponge
Member

jponge commented Apr 3, 2024

@pk-work
Contributor

pk-work commented May 3, 2024

@jponge can you check if this commit fits? smallrye/smallrye-mutiny-vertx-bindings#926

@pk-work
Contributor

pk-work commented May 6, 2024

The MR is merged; I guess the mutiny bindings will be created with the next release.

I'm closing this issue now.

@pk-work pk-work closed this as completed May 6, 2024
@AdrianVasiliu

Hello, could someone please clarify whether the fix for this issue is included in https://github.com/vert-x3/vertx-web/releases/tag/4.5.8 ?

@pk-work
Contributor

pk-work commented May 31, 2024

There was an issue with visibility in vertx-openapi. Because of this the change was reverted [1].

You may want to keep track of this issue [2].

[1] smallrye/smallrye-mutiny-vertx-bindings#930
[2] smallrye/smallrye-mutiny-vertx-bindings#920

@AdrianVasiliu

@pk-work Oh, thanks. I followed up in smallrye/smallrye-mutiny-vertx-bindings#920.

@tnmtechnologies

Hi @pk-work,

We have tried the new rebuild of the OpenAPI Router.
The behavior is very different from the previous implementation.

In our project, we work with more than 100 OpenAPI files. Most of them are very large, with several dozen operations. All our OpenAPI documents are OpenAPI 3.0 compliant.
All these OpenAPI files are embedded in the jar file. Everything worked well with the previous implementation except for this issue, which occurs under some specific runtime conditions. We worked around it.
Now, with the rebuild of the OpenAPI Router, we have run into some problems:
eclipse-vertx/vertx-openapi#74
eclipse-vertx/vertx-openapi#75

To sum up, the behavior of the new library is very restrictive compared to that of the previous library for our use case.
The OpenAPIContract instance initialization mode is explicit and restrictive (the developer must know the external $ref), whereas with the previous library, the initialization mode was implicit.
With the previous OpenAPI router implementation, our software was generic, whatever the OpenAPI document loaded. With the new rebuild of the OpenAPI Router, we'll need to explicitly indicate the dependencies in the source code (cf additionalContractFiles parameter) for each of the supported OpenAPI documents and transform relative $ref references into absolute ones. This can be done more or less automatically. It's highly likely that multiple circular references exist, but the documents are compliant and can be loaded by the Swagger UI tool.
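
As a rough illustration of the "relative $ref to absolute" transformation mentioned above, here is a minimal sketch that uses only java.net.URI from the JDK. RefAbsolutizer, absolutize and the example.com URLs are hypothetical and are not part of vertx-openapi; the sketch assumes each document is identified by an absolute URI and leaves document-local references (those starting with "#") untouched.

```java
import java.net.URI;

// Hypothetical helper, not part of any Vert.x library.
final class RefAbsolutizer {
    // Resolves a relative "$ref" value against the URI of the document that contains it.
    static String absolutize(String baseDocumentUri, String ref) {
        if (ref.startsWith("#")) {
            return ref; // local reference inside the same document, keep as-is
        }
        // URI.resolve applies standard relative-reference resolution,
        // including normalization of "../" segments.
        return URI.create(baseDocumentUri).resolve(ref).toString();
    }
}
```

For example, a reference "../schemas/pet.json#/definitions/Pet" found in https://example.com/specs/api/openapi.yaml would become https://example.com/specs/schemas/pet.json#/definitions/Pet. Walking every document to find and rewrite its "$ref" values is left out of this sketch.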

We can understand restrictive behavior where it is needed, for safety reasons for example, but there are also use cases like ours where it is not necessary.
Our OpenAPI files are 100% OpenAPI 3.0 compliant, and we'd like to use them as before without any modification imposed by the new rebuild of the OpenAPI Router.

I hope our use case and our needs are well described.

Please let us know your point of view.

Kind Regards.

@pk-work
Contributor

pk-work commented Sep 25, 2024

@tnmtechnologies I think this is related to eclipse-vertx/vertx-openapi#74, isn't it? If this is true, let's continue there, to avoid spreading the discussion across multiple threads.

I just posted my opinion on relative references there.
