Add command line options to disable specific collectors #60
Out of curiosity, do you happen to have the names of the resources you removed that were taking a long time? There is some work recently added to make collection fetching parallel in the gofish library. Right now it is limited to one type that a user saw a need to speed up, but if there are a set of resources that generally take a longer time to collect, maybe we can expand that pattern in the library to speed up some of these other cases. |
We have now disabled the whole system collector, and the scrape duration went down from 40 to 4 seconds. A colleague did some more testing and found that the PCI functions and devices took the most time (18 and 13 seconds, respectively). This is probably because our servers have around 60 PCI devices and 80 PCI functions. We don't know whether the exporter makes one long API call to collect all the info or many fast calls, but the large number of PCI devices and functions seems to be the main problem. |
Hi, I'm the colleague of @ostertagconrad. Yes, the PCI resources in particular took a long time. As @ostertagconrad said, our servers have a lot of them, and it seems that each PCI device/function has to be fetched with a separate API call. Just for reference, this is the place in the gofish library where a separate call is made for each PCIDevice. I guess fetching these in parallel will definitely reduce execution time a lot. |
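To illustrate the idea of fetching the per-device resources in parallel, here is a minimal Go sketch of bounded-concurrency fetching. It is not the gofish implementation: `fetchFunc`, `fetchAll`, the worker count, the simulated latency, and the example resource paths are all assumptions made up for this illustration, and the "fetch" is a stand-in for one Redfish API call.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fetchFunc stands in for a single Redfish GET of one resource.
// Hypothetical type for this sketch, not part of gofish.
type fetchFunc func(link string) (string, error)

// fetchAll fetches every link with at most `workers` calls in flight.
// Results keep the order of links; the first error found is returned.
func fetchAll(links []string, workers int, fetch fetchFunc) ([]string, error) {
	if workers < 1 {
		workers = 1
	}
	results := make([]string, len(links))
	errs := make([]error, len(links))
	sem := make(chan struct{}, workers) // limits concurrent requests
	var wg sync.WaitGroup

	for i, link := range links {
		wg.Add(1)
		sem <- struct{}{} // blocks while `workers` fetches are running
		go func(i int, link string) {
			defer wg.Done()
			defer func() { <-sem }()
			// Each goroutine writes only to its own slice index.
			results[i], errs[i] = fetch(link)
		}(i, link)
	}
	wg.Wait()

	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return results, nil
}

func main() {
	// 60 made-up device links, mirroring the device count mentioned above.
	links := make([]string, 60)
	for i := range links {
		links[i] = fmt.Sprintf("/example/PCIeDevices/%d", i)
	}

	// Simulated slow endpoint; the 20ms latency is an arbitrary assumption.
	slow := func(link string) (string, error) {
		time.Sleep(20 * time.Millisecond)
		return "device at " + link, nil
	}

	start := time.Now()
	devices, err := fetchAll(links, 10, slow)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("fetched %d devices in %s\n", len(devices), time.Since(start).Round(time.Millisecond))
}
```

The semaphore channel caps the number of simultaneous requests, which matters because a BMC may not cope well with dozens of concurrent connections. With sequential fetching the total time grows with the number of devices times the per-call latency, while here it grows with roughly that product divided by the worker count.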
No worries if this isn't something you have time for, but would love it if you could try an updated gofish with this change to see if it makes anything better. |
Thanks @stmcginnis , this looks good. I will try it out with our servers some time this week. |
Hi @stmcginnis I was now trying to test out your changes, but ran into some problems. In that same place I then tried to replace the code to simply use the However the problem is, that I assume the |
@tazend sorry for taking so long to get back to this. I've updated stmcginnis/gofish#210 to make all collection retrieval happen with some parallelism. Would you be able to try out these changes? |
Hi @stmcginnis, no worries - looks good! I will try to check it very soon. |
I would love this feature as well. We are getting all kinds of errors from our Dell and HPE servers from components whose metrics we don't need anyway. Ideally the vendor fixes their firmware, but pragmatically, disabling those components solves the problem for me. I've also included the branch/PR that makes fetching work in parallel, and that seems to work fine, but for me it just improves the speed instead of solving my problem. |
Hey @tazend, just wanted to check if you ever had a chance to try out the changes. I may go ahead and merge it and watch for any reported issues, but wanted to quickly check back here first. Thanks! |
Or @rfpronk - you mention using a fork with these changes included. Can you confirm things are working as expected against your hardware? |
Hi @stmcginnis sorry, I haven't tried out the changes yet - but I plan to do so soon. I'll let you know. |
You mean stmcginnis/gofish#210? |
FWIW, I'm also interested in this, since a scrape on our new HP ProLiant DL325 Gen11 takes minutes. Here are my timings:
I haven't yet tried anything, I'm literally just getting started. |
When we scrape some of our Redfish servers, one scrape takes up to 40 seconds.
We tried disabling some collectors whose information/metrics we don't need and got down to around 10 to 20 seconds. To disable them, we just removed them from the source code and rebuilt the exporter.
It would be very helpful if we could disable specific collectors via a command line option or a config file. To start, it would probably be enough to be able to disable one or two of the three collectors (manager, chassis, system). As a next step, a more fine-grained configuration, for example to disable memory and storage in the system collector, would be nice but is definitely more work to implement.
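As a rough sketch of what the requested option could look like, the Go snippet below parses a comma-separated list of collectors to skip. The flag name `collector.disable` and the helpers `parseDisabled` and `enabledCollectors` are hypothetical names chosen for this example; they are not existing exporter options, and the exporter may use a different flag library than the standard `flag` package used here.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// The three collectors named in this issue.
var knownCollectors = []string{"manager", "chassis", "system"}

// parseDisabled turns "system, chassis" into a set of collector names.
// Unknown names are rejected so typos do not go unnoticed.
func parseDisabled(list string) (map[string]bool, error) {
	known := make(map[string]bool, len(knownCollectors))
	for _, name := range knownCollectors {
		known[name] = true
	}
	disabled := make(map[string]bool)
	for _, name := range strings.Split(list, ",") {
		name = strings.ToLower(strings.TrimSpace(name))
		if name == "" {
			continue // tolerate empty input and trailing commas
		}
		if !known[name] {
			return nil, fmt.Errorf("unknown collector %q", name)
		}
		disabled[name] = true
	}
	return disabled, nil
}

// enabledCollectors returns the known collectors that were not disabled.
func enabledCollectors(disabled map[string]bool) []string {
	var enabled []string
	for _, name := range knownCollectors {
		if !disabled[name] {
			enabled = append(enabled, name)
		}
	}
	return enabled
}

func main() {
	// Hypothetical flag for illustration only.
	disable := flag.String("collector.disable", "",
		"comma-separated list of collectors to skip (manager, chassis, system)")
	flag.Parse()

	disabled, err := parseDisabled(*disable)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("enabled collectors:", enabledCollectors(disabled))
}
```

Running this sketch with `-collector.disable=system,chassis` prints `enabled collectors: [manager]`. A finer-grained variant could accept names such as `system.memory` in the same list, but that would need matching support inside the individual collectors.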