
It is difficult to write a reliable unit test for logging behavior of an application #156

Open
exarkun opened this issue Jan 8, 2025 · 3 comments


exarkun commented Jan 8, 2025

As an application developer on a project that uses Katip, I want to be able to write unit tests that demonstrate my application code emits log events meeting certain criteria - for example, so that if I add HTTP request logging to my HTTP server, I can be confident it logs the correct details from the HTTP request.

It is currently very difficult to write a reliable unit test for this kind of thing using the public Katip APIs, because there is an asynchronous step between application code emitting a log event (say, using logLocM) and test code observing that log event (e.g., by registering a scribe that accumulates all events).

The asynchronous step arises from the use of a TBQueue and an Async task (created in spawnScribeWorker) that reads from the queue and delivers messages to a scribe.

In the simple case, delivery of the message itself serves as notification that the asynchronous step is complete and the test can proceed to make assertions about the log event. In the general case, however, there's no way for a test to know how long to wait for that message to be delivered. If the application code is broken, it might wait forever for an event that will never arrive.

Additionally, a naively written test may initially pass thanks to lucky timing/scheduling that lets the event reach the scribe before the test code runs, but it will eventually fail when it gets unlucky and the test thread runs before the spawnScribeWorker thread does.
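
To make the race concrete, here is a minimal sketch of the kind of test I have in mind, assuming roughly the Katip 0.8 API; the in-memory scribe (memoryScribe) and its reader are names invented for illustration, not part of Katip:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.IORef (modifyIORef', newIORef, readIORef)
import Katip

-- Hypothetical in-memory scribe: accumulates every item's message in an IORef.
memoryScribe :: IO (Scribe, IO [LogStr])
memoryScribe = do
  ref <- newIORef []
  let scribe = Scribe
        { liPush = \item -> modifyIORef' ref (_itemMessage item :)
        , scribeFinalizer = pure ()
        , scribePermitItem = permitItem DebugS
        }
  pure (scribe, reverse <$> readIORef ref)

main :: IO ()
main = do
  (scribe, readMessages) <- memoryScribe
  le <- registerScribe "memory" scribe defaultScribeSettings
          =<< initLogEnv "my-app" "test"
  runKatipContextT le () "main" $
    logLocM InfoS "handled request"
  -- The item travels through a TBQueue to the scribe's worker thread,
  -- so it may or may not have arrived yet; this check is racy.
  msgs <- readMessages
  print (length msgs)  -- sometimes 1, sometimes 0
```

Whether the final read sees the message depends entirely on whether the scribe's worker thread has run yet.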

It would be great if there were a reliable way to know that all log events emitted as of some point in time have been delivered to all registered scribes (or even to one specific scribe), or if there were a public Katip interface for skipping the asynchronous processing and doing all log event processing in a single thread.

@MichaelXavier (Collaborator)

Would it be reasonable to use closeScribes for your test? When you run it, it should close the queues to new writes and wait for them to empty out. It calls scribeFinalizer on all the scribes, and scribes are supposed to provide a finalizer that blocks until all writes are flushed.
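
Something along these lines (just a sketch, reusing the hypothetical memoryScribe and the same imports from the sketch earlier in this thread; whether this actually waits for the queue to drain is what the rest of the thread discusses):

```haskell
-- Sketch: log, then close all scribes before reading the accumulated
-- messages, so that workers and finalizers have a chance to flush.
collectLogs :: IO [LogStr]
collectLogs = do
  (scribe, readMessages) <- memoryScribe    -- hypothetical, from the sketch above
  le <- registerScribe "memory" scribe defaultScribeSettings
          =<< initLogEnv "my-app" "test"
  runKatipContextT le () "main" $
    logLocM InfoS "handled request"
  _ <- closeScribes le   -- intended to block until queues drain and finalizers run
  readMessages
```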

exarkun (Author) commented Jan 23, 2025

Initially I tried using closeScribes for just this reason. From what I can tell, it doesn't wait for shChan to empty out, so there's still a chance of missing a message (and in some of my very simple tests it worked just often enough to convince me I'd solved the problem, but fortunately not often enough to make it through CI before I landed my changes).

The workaround I have in place right now is an IO action for retrieving the current logs. In it, I dig out shChan and wait for it to be empty (so I don't miss messages that haven't been picked up by the scribe's worker thread yet), then threadDelay 1 (so I don't miss messages that have been removed from the queue but haven't been delivered to the scribe yet), and only then fetch the log events from my scribe and return them.
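
Roughly, the workaround looks like the sketch below; it reaches into Katip.Core internals (LogEnv, ScribeHandle, shChan) and takes the hypothetical reader from my earlier sketch as an argument:

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.STM (atomically, check)
import Control.Concurrent.STM.TBQueue (isEmptyTBQueue)
import qualified Data.Map.Strict as Map
import Katip
import Katip.Core (LogEnv (..), ScribeHandle (..))

-- Block until every scribe's queue is empty, give the worker a moment to
-- hand off any item it has already dequeued, then read the accumulated logs.
currentLogs :: LogEnv -> IO [LogStr] -> IO [LogStr]
currentLogs le readMessages = do
  mapM_ waitForEmpty (Map.elems (_logEnvScribes le))
  threadDelay 1   -- an item may be out of the queue but not yet given to the scribe
  readMessages
  where
    -- STM retry: blocks until the queue reports empty.
    waitForEmpty sh =
      atomically (isEmptyTBQueue (shChan sh) >>= check)
```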

Ideally there'd be a way to make reliable log observations without closing the scribe (to support more complex tests where multiple actions emit multiple log events), but something that relied on closeScribes would probably also be usable.

@MichaelXavier (Collaborator)

That seems like a bug to me. Are you using a custom scribe or one of the ones that ship with the project? It's been a long time since I wrote this code, but it looks like when you close a scribe it writes a special poison-pill message to the queue. The worker loops over the messages in the queue sequentially and then quits. Then the main thread calls the finalizer callback on the handle before unblocking. Scribe implementers are supposed to ensure that all writes are flushed in the finalizer. I could see writes getting lost if you're somehow writing to the handle after you've attempted to close the scribe: I think the write would go onto the queue, but the poison pill would have already stopped the worker from reading from the queue, and the item would eventually get GCed.
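
For illustration, the shape I mean is roughly this (a simplified sketch, not Katip's actual code):

```haskell
import Control.Concurrent.STM (atomically)
import Control.Concurrent.STM.TBQueue (TBQueue, readTBQueue)

-- Simplified sketch of the worker/poison-pill shape described above.
data WorkerMessage a = NewItem a | PoisonPill

worker :: TBQueue (WorkerMessage a) -> (a -> IO ()) -> IO ()
worker q push = loop
  where
    loop = do
      msg <- atomically (readTBQueue q)
      case msg of
        NewItem item -> push item >> loop
        -- After PoisonPill the worker stops reading; anything enqueued
        -- after the pill is never delivered and is eventually GCed.
        PoisonPill   -> pure ()
```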
