[#1207] cache IResource in EnablementTester #1208

Merged
merged 9 commits into from
Feb 15, 2025

Conversation

ghentschke
Contributor

Cache resource for URI because each call of
LSPEclipseUtils.findResourceFor(URI) takes ~300 microseconds. And it gets called a lot of times.

fixes #1207

@@ -69,7 +72,7 @@ public boolean evaluate(@Nullable URI uri) {
IResource resource = null;
try {
IDocument document = null;
resource = LSPEclipseUtils.findResourceFor(uri);
Contributor

I think there are several places where we map resources to URIs. Wouldn't we benefit from having the cache reused by all consumers of the method? If so, what about building the cache as a static map in LSPEclipseUtils?

if (resource != null && resource.isAccessible()) {
return resource;
}
resource = LSPEclipseUtils.findResourceFor(uri);
Contributor

could it happen that a resource is moved after being cached? Would that not invalidate the cache? If that can happen, should we not either detect the move so that we can invalidate the entry, or have some eviction policy?

Contributor Author

could it happen that a resource is moved after being cached?

Yes, but shouldn't that be handled by the resource.isAccessible call? From its Javadoc:

Returns whether this resource is accessible. For files and folders, this is equivalent to existing;
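
To make the trade-off concrete, here is a minimal sketch of the validate-on-hit idea in plain Java. `ValidatingCache`, `lookup`, and `stillValid` are hypothetical stand-ins (for the cache, for `LSPEclipseUtils.findResourceFor`, and for an `isAccessible`-style check respectively); this is not LSP4E code, and the class is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical sketch: a cache that re-validates an entry before returning it. */
final class ValidatingCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private final Function<K, V> lookup;    // assumed expensive; may return null
    private final Predicate<V> stillValid;  // stand-in for an isAccessible-style check

    ValidatingCache(Function<K, V> lookup, Predicate<V> stillValid) {
        this.lookup = lookup;
        this.stillValid = stillValid;
    }

    V get(K key) {
        V cached = entries.get(key);
        if (cached != null && stillValid.test(cached)) {
            return cached; // hit, and the value still passes the check
        }
        V fresh = lookup.apply(key); // miss or stale entry: look it up again
        if (fresh != null) {
            entries.put(key, fresh);
        } else {
            entries.remove(key); // drop the stale entry, if any
        }
        return fresh;
    }

    int size() {
        return entries.size();
    }
}
```

The check only catches entries whose value no longer exists. It cannot detect a value that still exists but no longer belongs to the key it was cached under.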

Contributor

and what if the resource has been moved to a place where it is still accessible? Can that happen?

Contributor Author

@ghentschke ghentschke Feb 5, 2025

If the resource has been moved, its URI changes as well. This leads to a new entry and an obsolete entry in the cache.

Contributor

Yes, you are correct that its URI will change, but the cache will still have an entry for the old URI and will return the resource, will it not?

Contributor Author

@ghentschke ghentschke Feb 5, 2025

IMO that's not a problem, if the old URI isn't cached anywhere. I'll check that

Contributor Author

@rubenporras Please review my current implementation. I now use an IResourceChangeListener to track modifications and to remove deleted/moved resources from the cache.

@martinlippert
Contributor

One question: is there a mechanism to clean up the cache? It looks like it is an ever growing map that keeps references to all those IResource objects forever. Could that be a problem memory-wise over time?

@mickaelistria
Contributor

LSPEclipseUtils.findResourceFor(URI) takes ~300 microseconds. And it gets called a lot of times.

Shouldn't such a strategy be implemented directly in LSPEclipseUtils.findResourceFor(uri) instead? And isn't there some other possible improvement to find there, or even in Eclipse Platform?

@ghentschke
Contributor Author

One question: is there a mechanism to clean up the cache? It looks like it is an ever growing map that keeps references to all those IResource objects forever. Could that be a problem memory-wise over time?

Yes, I'll add a clear mechanism which could be triggered when the expression.evaluate returns false for the given URI.
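
For reference, one simple way to bound such a map in plain Java is an access-ordered `LinkedHashMap` that evicts its eldest entry. This is only a sketch of the general idea, not what this PR does; `BoundedLruMap` is a hypothetical name, and the class is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical size-bounded LRU map; not part of LSP4E. */
final class BoundedLruMap<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedLruMap(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order (least recently used first)
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true evicts the least recently used entry.
        return size() > maxEntries;
    }
}
```

With a limit of two entries, inserting a third key evicts whichever of the first two was accessed least recently.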

@ghentschke
Contributor Author

Shouldn't such a strategy be implemented directly in LSPEclipseUtils.findResourceFor(uri) instead? And isn't there some other possible improvement to find there, or even in Eclipse Platform?

Yes. I'll consider it.

@@ -40,6 +42,7 @@ public final class EnablementTester {
private final Expression expression;
private final String description;
private final Supplier<@Nullable IEvaluationContext> parent;
private final Map<URI, IResource> cache = new HashMap<>();
Member

How about using a Guava Cache instead? It supports expiring entries after read access and is thread-safe. LSP4E already depends on Guava, so no new dependency needs to be added.

Contributor Author

Thanks for the hint. I'll take a closer look at it.

Contributor

@sebthom The Javadoc for Guava's cache says to prefer Caffeine instead.

AFAICT nothing contributes Caffeine to SimRel yet, but some projects do use it (e.g. Mylyn, m2e's tests), and it is in the Orbit aggregator for 2025-03 too.

Any thoughts on the best way to proceed?

Member

@sebthom sebthom Feb 10, 2025

I personally would be pragmatic and just use Guava instead of introducing another dependency. I think it is good enough, especially since it does not look like Guava will go away from LSP4E.

Contributor

I think the same.

Contributor

Thanks @sebthom and @rubenporras for your input.

Contributor Author

@ghentschke ghentschke Feb 12, 2025

Thanks for your input. I have now implemented a singleton URI-to-IResource cache backed by a Guava cache. It can be used anywhere in place of a direct call to LSPEclipseUtils.findResourceFor(URI).

Shouldn't such a strategy be implemented directly in LSPEclipseUtils.findResourceFor(uri) instead?

@mickaelistria It could. But the current implementation leaves it up to the developer whether to use the cache or not. This should minimize possible side effects of a cache implementation directly in LSPEclipseUtils.findResourceFor(uri).

Please review!

Cache resource for URI because each call of
LSPEclipseUtils.findResourceFor(URI) takes ~300 microseconds. And it
gets called a lot of times.

fixes eclipse-lsp4e#1207
which can be used as replacement for
LSPEclipseUtils.findResourceFor(URI)
@ghentschke
Contributor Author

I don't know why the windows build fails due to this:

org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-clean-plugin:3.4.0:clean (default-clean) on project org.eclipse.lsp4e: Failed to clean project: Failed to delete D:\a\lsp4e\lsp4e\org.eclipse.lsp4e\target\org.eclipse.lsp4e-0.18.18-SNAPSHOT.jar

}

@Nullable
public synchronized IResource get(@Nullable URI uri) {
Member

I would use a LoadingCache instead of synchronizing the method

Contributor Author

@ghentschke ghentschke Feb 12, 2025

That was my first attempt. It seems that LoadingCache is not really appropriate for this use case, because the load method has to return a non null value:

Returns:
the value associated with key; must not be null

But it cannot be guaranteed that we find an IResource object for a given URI (e.g. when the file is outside the workspace)
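
A minimal sketch of the resulting get-then-put pattern, in plain Java with a `ConcurrentHashMap` standing in for the Guava cache (it also rejects null values). `NullTolerantCache` and `lookup` are hypothetical names, with `lookup` standing in for `LSPEclipseUtils.findResourceFor(URI)`; this is not the code from the PR.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical stand-in for the cache under review; names are illustrative only. */
final class NullTolerantCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> lookup; // may return null, e.g. for a file outside the workspace

    NullTolerantCache(Function<K, V> lookup) {
        this.lookup = lookup;
    }

    /** Returns the cached or freshly looked-up value, or null if the lookup finds nothing. */
    V get(K key) {
        if (key == null) {
            return null;
        }
        V value = entries.get(key);
        if (value != null) {
            return value; // cache hit
        }
        value = lookup.apply(key); // cache miss: run the expensive lookup
        if (value != null) {
            entries.put(key, value); // only successful lookups are stored
        }
        return value;
    }

    int size() {
        return entries.size();
    }
}
```

Failed lookups are not cached here, so a key without a value is looked up again on every call. Without extra locking, two threads that miss at the same time may both run the lookup.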

Member

Ah true, that only works with Caffeine. In that case I personally would just remove the synchronized keyword and accept the risk of a race condition rather than synchronize the method. The worst case is that a URI is looked up twice at the same time. Alternatively, use synchronized(cache) { ... } inside the if (uri != null) block, but I believe that isn't really necessary.

Contributor Author

I removed the synchronization from the get method because the current usage in EnablementTester.evaluate(URI) runs, AFAIK, in the main thread only.
I removed the IResourceChangeListener as well because it did not work as expected (see the git commit comment), and the Guava cache already has a mechanism to remove unused entries.

Contributor Author

I have to withdraw my statement above: there are concurrent calls to EnablementTester.evaluate(URI).

// use getInstance()
}

public static synchronized ResourceForUriCache getInstance() {
Member

how about instead of adding a synchronized getInstance method simply defining a public static final ResourceForUriCache INSTANCE = new ResourceForUriCache();
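
A sketch of that suggestion, assuming the constructor stays private. `EagerSingletonCache` is a hypothetical name; the point is only that a static final field is assigned once during class initialization, which the JVM performs in a thread-safe way, so no synchronized accessor is needed.

```java
/** Hypothetical holder illustrating the suggestion; not the merged LSP4E class. */
final class EagerSingletonCache {
    /** Created once during class initialization, which the JVM performs thread-safely. */
    static final EagerSingletonCache INSTANCE = new EagerSingletonCache();

    private EagerSingletonCache() {
        // not instantiable from outside; callers use INSTANCE
    }
}
```

Callers then write `EagerSingletonCache.INSTANCE` instead of calling a `getInstance()` method.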

because it won't work in case an IResource has been deleted or moved:
the listener is not called in the main thread, but the get method is.
This leads to the situation where the listener has removed the
IResource from the cache but the get method adds it again when it is
called with the old URI. (tested with renaming)
@ghentschke ghentschke requested a review from sebthom February 13, 2025 10:20
sebthom
sebthom approved these changes Feb 13, 2025
@sebthom sebthom self-requested a review February 13, 2025 16:33
if (resource != null) {
return resource;
}
resource = LSPEclipseUtils.findResourceFor(uri);
Contributor

Should we move the logic of findResourceFor in this cache too, and then make LSPEclipseUtils.findResourceFor(...) reference ResourceForUriCache.getInstance().get(...) ?

Doing so would let all consumers of findResourceFor(...) save the ~300 µs, not just EnablementTester.

Contributor

I also think that is a good idea

* Therefore entries can be removed before the limit exceeds.
*/
public final class ResourceForUriCache {
private static final Cache<URI, IResource> cache = CacheBuilder.newBuilder().maximumSize(100).build();
Contributor

could we pass a CacheLoader to the build method with a call to LSPEclipseUtils.findResourceFor? I think that would save the double access to the cache (get & set) in case of cache miss.

Contributor Author

The load method in CacheLoader does not allow null to be returned. See here:

Returns:
the value associated with key; must not be null

But it cannot be guaranteed that we find an IResource object for a given URI (e.g. when the file is outside the workspace)

Contributor Author

I should add a comment to the method why we do it this way... (See also comment from @sebthom here)

@rubenporras
Contributor

looks good to me

* See git history
*******************************************************************************/

package org.eclipse.lsp4e;
Member

This should be moved into the internal package. I don't think it should be part of the LSP4E API.

@ghentschke ghentschke requested a review from sebthom February 14, 2025 19:30
@ghentschke ghentschke merged commit 3e78918 into eclipse-lsp4e:main Feb 15, 2025
6 checks passed
@ghentschke
Contributor Author

Thanks for the reviews!

@ghentschke ghentschke deleted the fix-long-runner branch February 15, 2025 08:59
Development

Successfully merging this pull request may close these issues.

[performance] EnablementTester.evaluate takes way to long in UI thread
6 participants