Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logging Part 1: initial trace2 events (start, cmd_name, error, exit, atexit) #28

Merged
merged 10 commits into from
Feb 27, 2023

Conversation

vdye
Copy link
Collaborator

@vdye vdye commented Feb 24, 2023

Part of #22.

Summary

This series creates a (relatively) simple framework for logging events in the git-bundle-server and git-bundle-web-server applications. As noted in #22, I went with trace2 JSON as the log format.

In terms of this pull request, it's roughly broken into "set up to be able to do logging" (patches 1-4) and "create TraceLogger and its Trace2 implementation" (patches 5-10).

Testing

I tried out a bunch of scenarios - abrupt exits due to bad arguments, commands run successfully to completion, errors handled and sent back up the call stack - as well as console logging (with GIT_TRACE2_EVENT=1) and file logging (with GIT_TRACE2_EVENT=<existing dir> and GIT_TRACE2_EVENT=<new file, possibly missing parent dir>).

Future work

There's still a lot to do to get a full set of trace2 logging working, some of which I've started (or at least planned out):

  • Part 2: regions (+ refactoring repo.go and bundles.go while I'm there)
  • Part 3: child processes (+ refactoring both cron.gos & git.go)
  • Part 4: better environment configuration (GIT_TRACE2 and maybe GIT_TRACE2_PERF, as well as writing to a UDS) + documentation.
  • Part 5 (maybe): threading (sort-of - for the sake of simplicity, I'll assume goroutine == "thread")
  • Part 6 (might skip for now): timers & counters

CC: @jeffhostetler

When the internal 'FlagSet.Parse()' of an 'argparser' encounters an invalid
flag, the current 'ExitOnError' error handling causes it to invoke the
'os.Exit(2)' syscall and exit the program abruptly. In later patches, we're
going to want to "catch" that exit status in our logs, so change the error
handling to 'ContinueOnError' and explicitly call 'os.Exit(2)' in
'argparse.Parse()' if 'FlagSet.Parse()' returns an error.

Signed-off-by: Victoria Dye <[email protected]>
@vdye vdye added this to the v1.0 milestone Feb 24, 2023
@vdye vdye self-assigned this Feb 24, 2023
vdye added 3 commits February 24, 2023 09:16
The 'context.Context' structure is used, among other things, to hold
request, command, function, etc. scoped information for throughout the call
stack. In particular, this information scoping will be useful for logging
(e.g., holding a session ID or a region nesting level) added in future
patches.

Following Go convention [1], pass the 'ctx' from the 'main()' function of
the 'git-bundle-server' into child structures that will implement the
initial round of logging in the application (i.e., the arg parser,
'git-bundle-server' commands, and daemon providers).

[1] https://pkg.go.dev/context

Signed-off-by: Victoria Dye <[email protected]>
…ruct

Move the server setup & execution logic out of 'main.go' and into a new
'bundleWebServer' struct in 'bundle.go'. This brings the web server more in
line with 'git-bundle-server', which similarly uses its command structs to
contain operational logic. This new design makes passing around the
'context' structure a bit more straightforward, and (eventually) helps avoid
needing to pass a logger instance to each function called by
'git-bundle-web-server'.

Signed-off-by: Victoria Dye <[email protected]>
In later patches, a logger struct will need to be a member of each command
struct to facilitate structured logging & tracing. Like the 'web-server'
command in ca23190 (bundle-server: add 'web-server' subcommand, 2023-01-05),
add explicit constructors (i.e., 'New<Command>()') to the remaining
commands.

Signed-off-by: Victoria Dye <[email protected]>
@vdye vdye force-pushed the vdye/initial-trace2 branch from 7220de3 to 17b0aa1 Compare February 24, 2023 17:27
Copy link
Collaborator

@ldennington ldennington left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job!

internal/log/fields.go Outdated Show resolved Hide resolved
TraceLogger
}

func WithTraceLogger(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice way to handle logging start and exit events. Great job!

internal/log/trace2.go Outdated Show resolved Hide resolved
@vdye vdye force-pushed the vdye/initial-trace2 branch from 17b0aa1 to f31fd8d Compare February 24, 2023 20:00
vdye added 6 commits February 24, 2023 14:55
Create a general 'TraceLogger' interface and (for now) a sole implementation
in the 'Trace2' struct. There are two main components of this commit: the
Trace2 logger itself, and how it's used in the CLIs built in this
repository.

The Trace2 logger is built around the 'zap' structured logger. As of right
now, it does not log anything other than 'start', 'exit', and 'atexit' (and
only through the 'WithLogger()' wrapper described later). The logs include:

* 'event': the event type; it takes over the 'message' field of the 'zap'
  logging functions, mostly for the sake of simplicity (those functions
  require a message, and every trace2 log has an 'event').
* 'sid': a UUID generated by the 'uuid.NewUUID()' function, propagated
  through the call stack via the 'context.Context'.
* 'time': the UTC timestamp of the log generation; this is automatically
  provided by a 'zap' logger, but requires a custom formatter to provide the
  timestamp in UTC.
* 't_abs': the time (in seconds) since the start of the program
  (technically, when the 'log' package is loaded and the 'globalStartTime'
  is initialized).
* 'file' and 'line': the filename and line number where the relevant trace2
  log function was called.
* 'thread': hardcoded to 'main', but may be updated in the future to give
  unique names to goroutines.

From a user's perspective, the logger is configured by the
'GIT_TRACE2_EVENT' environment variable. Its behavior matches what is
described in Git [1], but it is currently missing support for open file
handles 2-9 and Unix Domain Sockets.

To use the trace2 logger, create the 'main()' function wrapper
'WithLogger()'. This function creates an instance of a 'TraceLogger', logs
the program start, defers a "cleanup" logging function to write out the
'exit'/'atexit' logs, then runs the wrapped function.

[1] https://git-scm.com/docs/git-config#Documentation/git-config.txt-trace2eventTarget

Signed-off-by: Victoria Dye <[email protected]>
To increase the amount of the code covered by logging, pass the 'main()'
functions' 'logger' instances to structures it creates to execute the
user-specified operation (i.e., the arg parser, 'git-bundle-server'
commands, 'bundleWebServer', and daemon providers). For now, do nothing with
the 'logger' instance in those structs.

Signed-off-by: Victoria Dye <[email protected]>
Create a 'MockTraceLogger' struct that, like other mocks in 'mock.go',
embeds a 'mock.Mock' struct. However, unique to the 'TraceLogger' instance
in tests (namely the daemon tests) is that we usually don't want to assert
on the logger's behavior; in most cases, we just want it to "do nothing" or
pass through.

Implement the 'TraceLogger' interface with checks that only invoke
'Mock.Called()' if the method has been mocked at least once (even if it's
for args not matching the current ones). If the method is not mocked, fall
back on "passthrough" defaults (usually returning the input
'context.Context' and/or the input args).

Finally, add an input assertion to 'MockTraceLogger.Error()', to serve as an
additional validation in unit tests and mirror the behavior of an actual
'TraceLogger' instance.

Signed-off-by: Victoria Dye <[email protected]>
Replace executions of 'os.Exit()', 'log.Fatal()', and 'log.Fatalf()' with
'TraceLogger.Exit()', 'TraceLogger.Fatal()', and 'TraceLogger.Fatalf()',
respectively. When these are called, log the appropriate 'exit'/'atexit'
events.

Note that 'TraceLogger.Fatal()' and 'TraceLogger.Fatalf()' currently do not
log the error (or error format + args) they receive. This will be updated in
a later patch, but the unused arguments are added now to avoid needing to
remove then later add back the arguments passed to 'log.Fatal()' and
'log.Fatalf()' by their callers.

Signed-off-by: Victoria Dye <[email protected]>
Add logging for errors returned 'fmt.Errorf()' errors as well as the errors
passed to or created by 'TraceLogger.Fatal()'/'TraceLogger.Fatalf()'.
Replace all use of 'fmt.Errorf()' or direct error returns with
'TraceLogger.Errorf()' or 'TraceLogger.Error()', which return 'error'
instances to stay consistent with what they're replacing.

These errors are logged under the "error" trace2 event. We want to capture
the error as low in the call stack as possible (to more precisely indicate
the cause), but also not log a new "error" event for every 'fmt.Errorf()'
replaced. To do that, have the error wrappers return a custom type
'loggedError', and only log the error event if the 'err' or none of the
'Errorf()' arguments are of the 'loggedError' type.

Signed-off-by: Victoria Dye <[email protected]>
Add a 'LogCommand()' function to the 'TraceLogger', called in
'argparser.InvokeSubcommand()' right before the command is invoked. To
remain consistent with Git, do this only when a toplevel command (indicated
by 'isTopLevel') is invoked; nothing is done for subcommands of those
commands.

In the trace2 logger, this corresponds to the "cmd_name" event.

Signed-off-by: Victoria Dye <[email protected]>
@vdye vdye force-pushed the vdye/initial-trace2 branch from f31fd8d to 3504fd2 Compare February 24, 2023 22:55
@vdye
Copy link
Collaborator Author

vdye commented Feb 24, 2023

Quick heads-up: while doing some testing, I noticed that the panic() catch was suppressing the error & stack trace. It's a pretty small change, can be found here.

@derrickstolee
Copy link
Collaborator

I tried out a bunch of scenarios - abrupt exits due to bad arguments, commands run successfully to completion, errors handled and sent back up the call stack - as well as console logging (with GIT_TRACE2_EVENT=1) and file logging (with GIT_TRACE2_EVENT=<existing dir> and GIT_TRACE2_EVENT=<new file, possibly missing parent dir>).

I think Git requires GIT_TRACE2_EVENT to specify a fully-qualified path in order to write to a file. At least, I keep needing to write GIT_TRACE2_EVENT="$(pwd)/trace" in the test suite instead of just =trace. Having that extra ability here seems fine, though.

Copy link
Collaborator

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great start. I will try to find time this week to reboot my instance of the bundle server with these traces enabled and pointing at @jeffhostetler 's open telemetry converter. It would be interesting to see these events showing up in that telemetry stream.

I did think it would be interesting to (at some point in the future) extract this trace2 emitter into a Go library for use within other Go-based tools that interact with Git. Something to think about when we have a "git ecosystem" open source org.

Comment on lines +132 to +143
// Get the caller of the function in trace2.go
// Skip up two levels:
// 0: this function
// 1: the caller of this function (StartTrace, LogEvent, etc.)
// 2: the function calling this trace2 library
_, fileName, lineNum, ok := runtime.Caller(2)
if ok {
fields = append(fields,
zap.String("file", filepath.Base(fileName)),
zap.Int("line", lineNum),
)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat! Thanks for commenting carefully here.

Copy link

@jeffhostetler jeffhostetler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very Nice!

@jeffhostetler
Copy link

BTW, I have client-side code for talking to a unix socket in my git-telemetry-service-tool
https://github.com/github/git-trace2-otel/blob/ccc408440bc03ee0465ecccd7eb7ff875436393d/src/git-telemetry-service-tool/send_other.go#L14

It's just a net.Dial() to set up. And then use the net.Conn.Write() method to send data. My code just takes the socket pathname (without the af_unix: prefix stuff that
I had to add to core Git to disambiguate).

Also, you can use the git-telemetry-service-tool to send a txt file containing JSON
from a command to the service if you want to see what the data looks like on the other
side.

@vdye
Copy link
Collaborator Author

vdye commented Feb 27, 2023

I think Git requires GIT_TRACE2_EVENT to specify a fully-qualified path in order to write to a file. At least, I keep needing to write GIT_TRACE2_EVENT="$(pwd)/trace" in the test suite instead of just =trace. Having that extra ability here seems fine, though.

You're right, the implementation as it is now allows relative paths, while Git doesn't. I'll make sure to note it in the documentation in Part 3!

@vdye vdye merged commit bec0c2f into main Feb 27, 2023
@vdye vdye deleted the vdye/initial-trace2 branch February 27, 2023 23:44
@jeffhostetler
Copy link

WRT absolute paths, yes, Trace2 requires absolute paths within Git because relative paths are ambiguous.
That is, there's the context of where you type the git config to set it, where you hack your ~/.gitconfig to
manually set it, vs the PWD of where you issue Git commands that generate telemetry. And I couldn't
hack into the config code to change a value from a relative to absolute path, so I had to force an absolute
path here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants