Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply Optimize Java 8 Streams refactoring #2837

Closed
wants to merge 5 commits into from

Conversation

khatchad
Copy link

Apply Optimize Java 8 Streams Refactoring

Description

This is a semantics-preserving automated refactoring that attempts to optimize code using Java 8 streams when it is safe and possibly advantageous to do so. It may make certain streams parallel, others sequential, and unorder streams where necessarily. The tool does not add new functionality; it only rearranges existing code in an effort to increase its performance. Your program's results should be the same before and after the refactoring.

Details

  • We are evaluating a research prototype automated refactoring Eclipse plug-in called Optimize Java 8 Streams Refactoring. We have applied the tool to your project in the hopes of receiving feedback.
  • The approach is very conservative. That may mean that not all changes that can be made were made.
  • The source code should be semantically equivalent to the original.
  • There was one manual change reversal due to parallelization of a stream with only one element.
  • The tool does not assess overhead vs. the amount of data processed; it only performs the refactoring when it is safe and possibly advantageous to do so. However, we ran several performance tests on the refactored code (see below).

Performance Evaluation

We did not find dedicated performance tests in your project (please let us know if we missed it). But, we did evaluate our tool on other projects (see iluwatar/java-design-patterns#786 (comment) and numenta/htm.java#539 (comment)). We have found an average speedup of ~3.25 as a result of our refactoring across all of our tests thus far.

Feedback

Thank you for your help in this evaluation! Any feedback you can provide would be very helpful. In particular, we are interested if each of the proposed changes are helpful or not.

@joakime
Copy link
Contributor

joakime commented Aug 20, 2018

First (and most important) be sure you follow the Eclipse Legal / IP-Validation requirements.
You currently do not have an Eclipse ECA on file at Eclipse, nor have you follow the Developer Certificate of Origin requirement to "sign off" the git commit.
https://github.com/eclipse/jetty.project/blob/jetty-9.4.x/CONTRIBUTING.md#eclipse-contributor-agreement
We'll need both of those addressed before we can accept this PR.

The java 8 .parallelStream() method uses the ForkJoinPool.commonPool() to spawn threads to handle this parallel processing.
This is undesired on a server platform, as those extra threads are not being handled by the server managed thread pools.
However, since this PR is limited to just jetty-start the impact should be negligible.

@khatchad
Copy link
Author

@joakime Thanks for the prompt feedback. AFAIK, I have signed the agreement; it seems that the validation server is down. But, I did forget to sign my commits. I'll do that now.

@sbordet
Copy link
Contributor

sbordet commented Aug 20, 2018

Using parallelStream() is only for exceptional cases.

Here we are on a server (not a good place to use parallelStream()) and we are iterating on streams that contain 10s or at most 100s elements (again not a good parallelStream() case).

Rule of thumb shared by most experts about parallelStream() is that you need at least 10k elements to win the parallel setup overhead.

@khatchad IIUC your benchmarks have been done with 1 million elements to show those gains.
Do you have gains for 10 and 100 elements only?

@khatchad
Copy link
Author

@joakime I signed my commits retroactively but doing so after a push doesn't seem to change the old commits. I guess I'll need to open a new PR. Sorry about that.

@sbordet I don't. Indeed, 1 million is the best gains I've seen in the literature. I'll run my script with 10 and 100 to see how it looks.

@khatchad
Copy link
Author

Superseded by #2838.

@khatchad khatchad closed this Aug 20, 2018
@khatchad khatchad deleted the streams branch August 20, 2018 23:38
@gregw
Copy link
Contributor

gregw commented Aug 20, 2018

I am not keen for Jetty as a server to use things like parallelStreams. The use of parallelism is really good if you have a single big task to do and wish to use all the CPUs available to achieve that in minimum time.
However, that is not the job of a server, which has many many tasks to do and the most important thing to optimize is the number of tasks completed rather than the elapsed time of any particular task.

Specifically, the last thing we want to do for any task that originates from a single connection/request, is to spread it's handling over all the CPUs of a server, filling all the caches with data from/for that single request. It is far FAR better that each CPU and its caches is dedicated to tasks from individual requests for as long as possible, allowing other CPUs to work on other requests with minimal (if any) contention. We actually go to some lengths in our scheduler to achieve this.

So thanks for the contribution, but it's not a direction we wish to follow.

@khatchad
Copy link
Author

@gregw Thank you for the feedback!

@khatchad
Copy link
Author

@sbordet:

@khatchad IIUC your benchmarks have been done with 1 million elements to show those gains.
Do you have gains for 10 and 100 elements only?

It would seem that 10 and 100 elements is not enough to overcome the overhead. Please see below. The left is the original and the right is the refactored version:

156,160c160,164
< Benchmark                                    (N)  Mode  Cnt   Score    Error  Units
< AbstractDocumentTest.shouldRetrieveChildren   10  avgt    5  ≈ 10⁻⁴           ms/op
< AbstractDocumentTest.shouldRetrieveChildren  100  avgt    5   0.001 ±  0.001  ms/op
< DomainTest.shouldConstructCar                 10  avgt    5  ≈ 10⁻⁴           ms/op
< DomainTest.shouldConstructCar                100  avgt    5   0.001 ±  0.001  ms/op
---
> Benchmark                                    (N)  Mode  Cnt  Score   Error  Units
> AbstractDocumentTest.shouldRetrieveChildren   10  avgt    5  0.043 ± 0.022  ms/op
> AbstractDocumentTest.shouldRetrieveChildren  100  avgt    5  0.293 ± 0.027  ms/op
> DomainTest.shouldConstructCar                 10  avgt    5  0.045 ± 0.009  ms/op
> DomainTest.shouldConstructCar                100  avgt    5  0.280 ± 0.026  ms/op

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants