Performance Degradation Introduced in New Relic PHP Agent v10.13.0.2 #806
Comments
@theophileds agree with your observation 💯
@theophileds I think the main difference between current Intel instances and c7a instances is SMT: on c7a, a vCPU is pinned to an actual CPU core rather than to a CPU thread.
Description
CPU usage and latency increased significantly, and the number of php-fpm processes began to fluctuate, after upgrading the New Relic PHP agent from version 10.0.0.312 to version 10.13.0.2. We attempted to downgrade the agent but ran into compatibility issues with PHP 8.2, so we disabled the agent instead; performance improved afterwards.
Hypothesis: Hypervisor Clock Settings
When we contacted New Relic support, they suggested a potential connection to hypervisor clock settings. We switched the clock configuration to TSC (Timestamp Counter), but the benchmark showed only a marginal improvement in average duration.
This benchmark was executed with 100,000,000 iterations, repeated a hundred times on two different containers running on machines set with TSC and kvm-clock configurations.
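The benchmark code itself is not included in this report. As a rough illustration only, a clock-read micro-benchmark of this kind could be sketched as follows. The function names and the choice of `CLOCK_MONOTONIC` are assumptions, not the original test, and in Python the interpreter overhead is a large part of the measured time.

```python
import statistics
import time


def time_clock_reads(iterations: int) -> float:
    """Return the seconds spent performing `iterations` clock reads."""
    start = time.perf_counter()
    for _ in range(iterations):
        # Each call ends up in clock_gettime(); its cost can depend on the
        # clocksource the kernel is using (for example tsc or kvm-clock).
        time.clock_gettime(time.CLOCK_MONOTONIC)
    return time.perf_counter() - start


def average_duration(iterations: int, repeats: int) -> float:
    """Average the per-run duration over `repeats` runs."""
    return statistics.mean(time_clock_reads(iterations) for _ in range(repeats))


if __name__ == "__main__":
    # Scaled down from the 100,000,000 iterations x 100 repeats described
    # above so that the sketch finishes quickly.
    print(f"Average duration: {average_duration(1_000_000, 5):.6f} seconds")
```

Running the same script in containers on a TSC-configured node and on a kvm-clock-configured node would give two averages that can be compared in the same way as the results below.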
Benchmark Results:
TSC-based Configuration: Average Duration 2.321919 seconds
kvm-clock-based Configuration: Average Duration 2.817715 seconds
With TSC, the average duration was roughly 17.6% lower than with kvm-clock.
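For reference, the relative decrease can be recomputed from the two averages listed above. This is a small arithmetic sketch; the helper name `percent_decrease` is ours. A small difference from a quoted figure can come from rounding or from averaging per-run values instead.

```python
tsc_avg = 2.321919        # seconds, TSC-based configuration
kvm_clock_avg = 2.817715  # seconds, kvm-clock-based configuration


def percent_decrease(baseline: float, improved: float) -> float:
    """Relative decrease of `improved` with respect to `baseline`, in percent."""
    return (baseline - improved) / baseline * 100.0


decrease = percent_decrease(kvm_clock_avg, tsc_avg)
print(f"{decrease:.2f}%")  # prints "17.60%"
```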
However, we acknowledge that our benchmark may not accurately mirror the load pattern generated by the New Relic agent. Moreover, in our tests with TSC we did not observe any noteworthy improvement in application performance.
Feature Disabling and Version Testing
To pinpoint the source of the issue, extensive testing was conducted, including the disabling of features such as distributed tracing, code-level metrics, and application logging. The performance impact persisted across multiple tests and versions.
Regrettably, these efforts did not result in any substantial improvement. After repeating the experiment multiple times, it became evident that enabling New Relic consistently led to a significant negative impact on performance. This observation persisted across various versions of the New Relic agent, including:
PHP-fpm Processes and CPU Usage
As illustrated in the Grafana metrics screen captures, the tests were conducted in the following sequence with the specified configurations:
Conclusion
The upgrade to version 10.13.0.2 introduced a significant performance degradation that cannot be explained by new features or by the clock source alone: the issue persists both after changing the clock configuration and after disabling features.
Your Environment
PHP backend applications built on Symfony, Docker image php:8.2.13-fpm
Deployed on EKS 1.24, EC2 instance type: m5.xlarge (Hypervisor Nitro)
Clock configuration tested with TSC and kvm-clock
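To confirm which clock source a node is actually using, the kernel exposes it through sysfs. A minimal sketch, assuming a Linux host where `/sys/devices/system/clocksource/clocksource0` is readable (it usually is from inside a container as well); the helper name `read_clocksource` is hypothetical:

```python
from pathlib import Path

# Standard sysfs location of the kernel clocksource settings on Linux.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base: Path = CLOCKSOURCE_DIR) -> dict:
    """Return the current and available clocksources, or None where unreadable."""
    result = {}
    for name in ("current_clocksource", "available_clocksource"):
        path = base / name
        # Missing files (non-Linux hosts, restricted sysfs) are reported as None.
        result[name] = path.read_text().strip() if path.exists() else None
    return result


if __name__ == "__main__":
    print(read_clocksource())
```

On the nodes described above, `current_clocksource` would be expected to read `tsc` or `kvm-clock`, depending on the configuration under test.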