Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not recreate a new counter if we already have one. #13

Merged
merged 4 commits into from
Sep 30, 2019

Conversation

Yasumoto
Copy link
Contributor

Checklist

  • The provided tests still run.
  • I've created new tests where needed.
  • I've updated the documentation if necessary.

Motivation and Context

We want to make sure we're reusing metrics, not losing ones we already create.

Description

I discovered when running a service that I'm getting multiple invocations of the same metric appearing when I collect():

# TYPE invocation_register counter
invocation_register 0
invocation_register{tool="simple-curl"} 1
# TYPE invocation_register counter
invocation_register 0
invocation_register{tool="simple-curl"} 1
# TYPE invocation_register counter
invocation_register 0
invocation_register{tool="simple-curl"} 1

I'm calling createCounter on each request based on the good advice I got here from @ktoso.

If this looks roughly right to y'all, I can update the other types to follow this pattern, though definitely open to feedback 🎉

@MrLotU
Copy link
Collaborator

MrLotU commented Aug 28, 2019

This is by design. It's different between swift-metrics and this library, but the idea from the SwiftPrometheus standpoint is that you create a counter once, than increment that during your application lifecycle, passing in different labels as required.

Might have to reconsider this though, seeing how swift-metrics takes a different approach.

CC @tomerd @ktoso

@Yasumoto
Copy link
Contributor Author

Yah, totally see now that advice only applied to SwiftMetrics, and implementations can decide how to handle it.

I took a look at the golang Prometheus client library, and looks like it'll throw an AlreadyRegisteredError.

package main

import (
	"flag"
	"log"
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var addr = flag.String("listen-address", ":9001", "The address to listen on for HTTP requests.")

var (
	myCounter = promauto.NewCounter(prometheus.CounterOpts{
		Name: "my_counter",
		Help: "The total number of processed events",
	})
)

func main() {
	flag.Parse()
	go func() {
		for {
			myCounter.Inc()
			time.Sleep(1 * time.Second)
		}
	}()
	myCounterTwo := promauto.NewCounter(prometheus.CounterOpts{
		Name: "my_counter",
		Help: "The total number of processed events",
	})
	go func() {
		for {
			myCounterTwo.Inc()
			time.Sleep(1 * time.Second)
		}
	}()
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(*addr, nil))
}
jmsmith@SFO-M-JMSMITH02 ~/w/s/s/loadtest (jmsmith-prom-test) [1]> go run ./prom_example.go
panic: duplicate metrics collector registration attempted

goroutine 1 [running]:
github.com/prometheus/client_golang/prometheus.(*Registry).MustRegister(0xc0000ca320, 0xc0000201c0, 0x1, 0x1)
	/Users/jmsmith/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/registry.go:399 +0xad
github.com/prometheus/client_golang/prometheus.MustRegister(...)
	/Users/jmsmith/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/registry.go:176
github.com/prometheus/client_golang/prometheus/promauto.NewCounter(0x0, 0x0, 0x0, 0x0, 0x1483caf, 0xa, 0x148e8a6, 0x24, 0x0, 0xc0000a0058, ...)
	/Users/jmsmith/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/promauto/auto.go:136 +0x10a
main.main()
	/Users/jmsmith/workspace/slack-github.com/slack/loadtest/prom_example.go:31 +0xd2
exit status 2

But the docs are pretty interesting, since they do suggest you can catch the error and use the existing collector. Testing out the example, it does indeed give you back a reference to the first counter. Note that promauto does not give you a chance to handle that error.

package main

import (
	"flag"
	"log"
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var addr = flag.String("listen-address", ":9001", "The address to listen on for HTTP requests.")

var (
	myCounter = promauto.NewCounter(prometheus.CounterOpts{
		Name: "my_counter",
		Help: "The total number of processed events",
	})
)

func main() {
	flag.Parse()
	go func() {
		for {
			myCounter.Inc()
			time.Sleep(1 * time.Second)
		}
	}()
	myCounterTwo := prometheus.NewCounter(prometheus.CounterOpts{
		Name: "my_counter",
		Help: "The total number of processed events",
	})
	if err := prometheus.Register(myCounterTwo); err != nil {
		if are, ok := err.(prometheus.AlreadyRegisteredError); ok {
			// A counter for that metric has been registered before.
			// Use the old counter from now on.
			myCounterTwo = are.ExistingCollector.(prometheus.Counter)
		} else {
			// Something else went wrong!
			panic(err)
		}
	}
	go func() {
		for {
			myCounterTwo.Inc()
			time.Sleep(1 * time.Second)
		}
	}()
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(*addr, nil))
}

I think the context at the top of promauto is super enlightening.

So based on this spelunking on what I'd consider the "reference" implementation, I think we should throw an Exception either way. However, that leaves me with two questions:

  1. Do we want to fully panic & bail out of execution?
  2. Should we create some kind of "auto" registration functions that do the panic, while the main ones just throw an error which pulls out the existing metric?

@ktoso
Copy link
Collaborator

ktoso commented Aug 30, 2019

To clarify: Why are you not using the SwiftMetrics API to call through to SwiftPrometheus in your app? It would do the right thing, as it's implemented as:

    public func makeCounter(label: String, dimensions: [(String, String)]) -> CounterHandler {
        let createHandler = { (counter: PromCounter) -> CounterHandler in
            return MetricsCounterHandler(counter: counter, dimensions: dimensions)
        }
        // vvvvvvv the "right" thing / "semantics you expected"
        if let counter: PromCounter<Int64, DimensionLabels> = self.getMetricInstance(with: label, andType: .counter) {
            return createHandler(counter)
        }
        // ^^^^^^^ 

        return createHandler(self.createCounter(forType: Int64.self, named: label, withLabelType: DimensionLabels.self))
    }

Why not use the metrics API as a fascade there as it was intended?

This library IS an implementation of the metrics SPI, and thus this works:

let c1 = Counter(label: "one")
c1.increment()
let c2 = Counter(label: "one")
c2.increment()

why not use it, rather than bind yourself into the prometheus API directly?
(Of course with binding the prometheus as backend to the metrics API).


Do we want to fully panic & bail out of execution?

No, I don't think so. One cannot compare Go's panic to fatal erroring in Swift. Panics are recoverable, Swift faults are not.

Throwing an error sounds good if the same metrics is attempted to be created.
Though I'm actually surprised a bit -- I thought the Go impls semantics was only to bail out of the registered type did not match, not a "re-creation of correct type" 🤔

Should we create some kind of "auto" registration functions that do the panic, while the main ones just throw an error which pulls out the existing metric?

By "auto" you mean "return me the same instance if it existed already"? That's what the SwiftMetrics API would do; What am I missing?

// nitpick, let's not use the word panic, mixes up things a bit IMHO... Swift does not do panics, Go does, and they're recoverable.

@Yasumoto
Copy link
Contributor Author

I think there are valid usecases in an environment where Prometheus is such a standard that using this package directly is fine and has some benefits. For instance, using createCounter means you can specify help text for the metric that'll show up in Prometheus. Totally agreed that in general even in that environment using SwiftMetrics will make an eventual migration much easier.

That said, the goal for this issue is to track down the expected behavior when using SwiftPrometheus itself. I like the `SwiftMetrics implementation, but I can see why in the golang implementation they wanted to explicitly let folks know in case they're clobbering a metric with another library.

Any opposition to throwing an error that also includes a reference to the existing metric? That way callers can surface the collision just in case, but have access to use it if that's acceptable.


No, I don't think so. One cannot compare Go's panic to fatal erroring in Swift. Panics are recoverable, Swift faults are not.

Yikes, you're totally right. I definitely assumed panic meant "game over, execution will stop", but definitely not the case. From the go blog:

Recover is a built-in function that regains control of a panicking goroutine.

I thought the Go impls semantics was only to bail out of the registered type did not match, not a "re-creation of correct type" 🤔

Me too, I was surprised a bit as well.

By "auto" you mean "return me the same instance if it existed already"? That's what the SwiftMetrics API would do; What am I missing?

SwiftMetrics is a-ok 👌 , I'd like SwiftPrometheus to do something similar with its behavior when used directly (see above section for my current preference).

@Yasumoto Yasumoto force-pushed the yasumoto-prevent-dupes branch from 5d87049 to 2fe6b12 Compare September 11, 2019 04:11
@Yasumoto
Copy link
Contributor Author

I played around with this a bit, and it "feels" better to follow the SwiftMetrics style and return the metric if it already exists. Open to other preferences though!

@Yasumoto
Copy link
Contributor Author

@MrLotU @ktoso friendly nudge 😄

@tomerd
Copy link
Collaborator

tomerd commented Sep 25, 2019

I played around with this a bit, and it "feels" better to follow the SwiftMetrics style and return the metric if it already exists.

in one of the early iteration of swift-metrics, we did that for "free" as part of the API package (caching basically) but we decided to remove it and leave it to the implementation to decide. imo, the expected behavior when type and other arguments align is to return the existing one, but we wanted to leave the door open to other logic

@Yasumoto
Copy link
Contributor Author

@tomerd ah, that's good context and totally makes sense. Given the behavior of the golang prometheus client library (throw an exception if you try to create a metric that already exists) I see the value in leaving it open.

For now I think keeping this simple is fine, but I'm definitely open to other feedback too.

@Yasumoto Yasumoto force-pushed the yasumoto-prevent-dupes branch from 1ae094c to f4037ad Compare September 30, 2019 01:48
@Yasumoto
Copy link
Contributor Author

Looks like this needed a rebase off master 👍

Copy link
Collaborator

@MrLotU MrLotU left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @Yasumoto

@MrLotU MrLotU merged commit b6e38ea into swift-server:master Sep 30, 2019
@Yasumoto Yasumoto deleted the yasumoto-prevent-dupes branch September 30, 2019 16:31
@Yasumoto
Copy link
Contributor Author

Thanks, folks! 🎊

@Yasumoto
Copy link
Contributor Author

@MrLotU do you mind adding this to the 1.0 milestone? I would but not sure I have the permissions. 🙇‍♂️

@MrLotU MrLotU added this to the 1.0.0 milestone Sep 30, 2019
@MrLotU
Copy link
Collaborator

MrLotU commented Sep 30, 2019

Done. Good catch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants