autocomplete: Add "recent senders criterion" to user-mention autocomplete #828

sm-sayedi · 2024-07-19T20:27:23Z

Besides recent activity in DMs, activity in the current topic/stream is also considered (as the first criterion) when suggesting users in @-mention autocomplete.

Note: This is the third PR in the series that #608 was divided into, coming after #692. Next PR in the series: #849.

Fixes part of: #228

gnprice

Thanks @sm-sayedi for all your work on this!

For the model code, small comments below.

Then in the test code, please do a new self-review of your changes, in light of the comments on #692 and the commits I added to that branch. In particular look to:

- make each test case compact so it's easy to scan through them. In this case there are already helpers that go a long way, but the formatting can be adjusted for more compactness
- think of all the different scenarios you can that should be tested:
  - scenarios where the code's behavior should vary
  - scenarios where, given the way you did write the code, the path taken through the control flow will vary
  - possible bugs that an implementation of this could have if someone made a plausible mistake, or if part of your code were accidentally deleted or disabled, and scenarios that would reproduce those bugs

  and make sure that for each scenario, there's a test that covers it and where the reader can easily confirm that it does so.

gnprice · 2024-07-20T04:19:00Z

lib/model/autocomplete.dart

+  /// If [a] is non-null and [b] is null, returns a positive number.
+  /// If both are null, returns zero.
+  @visibleForTesting
+  static int compareNullable(int? a, int? b) => switch ((a, b)) {


From the name of this, I can't tell whether null comes before all the integers, or after all of them, or something else (equal to zero?). That makes it hard to tell whether or why this function is appropriate to use in any given context.

The name should be something that indicates that, and the summary line in the doc should indicate it too.

It could be something direct about the behavior: like compareNullBeforeInt, or compareNullAsLeast, or something like that.

Or maybe better is to give a name that reflects why it is that two methods on this class both want it: the point is that these aren't just any ints, but specifically are the greatest message IDs in two sets. When the set of message IDs is empty, its value is mathematically $-\infty$, negative infinity; we're representing that as null but it should therefore be treated as more negative than any number.

So the name could be compareRecentMessageIds. The doc can say:

/// Compares [a] to [b], with null less than all integers.
///
/// The values should represent the most recent message ID in each of two
/// sets of messages, with null meaning the set is empty.
///
/// Return values are as with [Comparable.compareTo].

And that's it — the summary line, plus the reference to [Comparable.compareTo] to resolve any ambiguity about what "compare" means, completely specifies the behavior so there's no need to transcribe the implementation into English.
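A minimal sketch of how the suggested name and doc comment could fit together. It is written as a top-level function for brevity, whereas in the PR it would be a static method, and the body is only one plausible way to get the described ordering, not necessarily the code that was merged.

```dart
/// Compares [a] to [b], with null less than all integers.
///
/// The values should represent the most recent message ID in each of two
/// sets of messages, with null meaning the set is empty.
///
/// Return values are as with [Comparable.compareTo].
int compareRecentMessageIds(int? a, int? b) {
  return switch ((a, b)) {
    // Both sets are non-empty: compare the message IDs directly.
    (int x, int y) => x.compareTo(y),
    // Only a's set is non-empty, so a is the greater value.
    (int(), _) => 1,
    // Only b's set is non-empty, so a is the lesser value.
    (_, int()) => -1,
    // Both sets are empty: the values are equal.
    _ => 0,
  };
}
```

A caller that wants the most recent sender first can negate the result, as the `compareByRecency` code in this PR does.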

gnprice · 2024-07-20T04:19:58Z

lib/model/autocomplete.dart

+  static int compareNullable(int? a, int? b) => switch ((a, b)) {
+    (int a, int b) => a.compareTo(b),


nit: block body, like at #692 (comment)

gnprice · 2024-07-20T04:24:07Z

lib/model/autocomplete.dart

+      StreamNarrow() => (narrow.streamId, null),
+      TopicNarrow() => (narrow.streamId, narrow.topic),
+      // The CombinedFeedNarrow case should be impossible anyway.
+      DmNarrow() || CombinedFeedNarrow() => (null, null),


Let's move the assert above about CombinedFeedNarrow to inside here — that way we're splitting up the cases for narrow's type in just one place.

To make that comfortable, this will probably want to become a switch statement instead of expression.
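A rough sketch of what the switch-statement form might look like, with the assert living in the `CombinedFeedNarrow` case. The `Narrow` classes here are simplified stand-ins defined only to keep the example self-contained, and the function name `streamAndTopicOf` is hypothetical.

```dart
// Simplified stand-ins for the app's narrow types (assumptions, for illustration).
sealed class Narrow {}

class CombinedFeedNarrow extends Narrow {}

class DmNarrow extends Narrow {}

class StreamNarrow extends Narrow {
  StreamNarrow(this.streamId);
  final int streamId;
}

class TopicNarrow extends Narrow {
  TopicNarrow(this.streamId, this.topic);
  final int streamId;
  final String topic;
}

/// Returns the stream ID and topic that recency should be judged by,
/// with null for whichever of them the narrow doesn't have.
(int?, String?) streamAndTopicOf(Narrow narrow) {
  int? streamId;
  String? topic;
  // All the cases for the narrow's type are split up in this one place.
  switch (narrow) {
    case StreamNarrow():
      streamId = narrow.streamId;
    case TopicNarrow():
      streamId = narrow.streamId;
      topic = narrow.topic;
    case DmNarrow():
      break; // no stream or topic; an empty case would fall through
    case CombinedFeedNarrow():
      // There is no compose box here, so autocomplete should never run.
      assert(false, 'No autocomplete is available in CombinedFeedNarrow.');
  }
  return (streamId, topic);
}
```

Because `Narrow` is sealed, the compiler checks that the switch covers every subtype, so adding a new narrow type forces this code to be revisited.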

gnprice · 2024-07-20T04:27:46Z

lib/model/autocomplete.dart

+      if (result != 0) {
+        return result;
+      }


nit: on one line, same way as below

gnprice · 2024-07-20T04:29:05Z

lib/model/autocomplete.dart

+      final aMessageId = recentSenders.latestMessageIdOfSenderInTopic(
+        streamId: streamId, topic: topic, senderId: userA.userId);
+      final bMessageId = recentSenders.latestMessageIdOfSenderInTopic(
+        streamId: streamId, topic: topic, senderId: userB.userId);
+
+      final result = -compareNullable(aMessageId, bMessageId);


nit: I think this is a little easier to read if these first two variables are inlined:

Suggested change

-      final aMessageId = recentSenders.latestMessageIdOfSenderInTopic(
-        streamId: streamId, topic: topic, senderId: userA.userId);
-      final bMessageId = recentSenders.latestMessageIdOfSenderInTopic(
-        streamId: streamId, topic: topic, senderId: userB.userId);
-
-      final result = -compareNullable(aMessageId, bMessageId);
+      final result = -compareNullable(
+        recentSenders.latestMessageIdOfSenderInTopic(
+          streamId: streamId, topic: topic, senderId: userA.userId),
+        recentSenders.latestMessageIdOfSenderInTopic(
+          streamId: streamId, topic: topic, senderId: userB.userId));

Not really any denser that way, and there are slightly fewer names flying around to have to track.

sm-sayedi · 2024-07-24T12:23:54Z

Thank you for the review @gnprice. Changes pushed. Please have a look.

In @-mention autocomplete, users are suggested based on: 1. Recent activity in the current topic/stream. 2. Recent DM conversations. Note: By "recent activity" we mean recent messages sent to a topic/stream. Fixes part of: zulip#228

I think this is a bit more idiomatic than setting a record which we promptly destructure; and it doesn't come out any longer.

This way they fit on a line.

Without this, the method's other unit tests (the other tests in this group) pass even if the method were to ignore its `topic` parameter and only use the per-stream data.

The full names of these tests were like this:

  MentionAutocompleteView sorting users results MentionAutocompleteView.compareRecentMessageIds both a and b are non-null
  MentionAutocompleteView sorting users results MentionAutocompleteView.compareRecentMessageIds one of a and b is null
  MentionAutocompleteView sorting users results MentionAutocompleteView.compareRecentMessageIds both of a and b are null
  MentionAutocompleteView sorting users results MentionAutocompleteView.compareByRecency favor user most recent in topic

To see the names, one can run a command like:

  $ flutter test test/model/autocomplete_test.dart -r expanded

We can cut the second mention of MentionAutocompleteView.

…rrow If for example one substitutes some other narrow like `topicNarrow` in this `checkResultsIn` call, the test still passes. That's because now the autocomplete does return results... but they don't match the arbitrary `expected` list that was passed in, so the inner `check` call fails, and the outer one succeeds because of that exception. Instead, check more specifically that the MentionAutocompleteView.init call throws.
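The pitfall described in this commit message can be sketched in plain Dart. Everything here is a hypothetical stand-in: `queryResults` plays the role of creating the view and running a query, `expectSame` plays the role of the inner `check` call, and string labels stand in for narrows.

```dart
// Hypothetical stand-in for creating the autocomplete view and running a
// query: it throws for the combined feed and returns user IDs otherwise.
List<int> queryResults(String narrow) {
  if (narrow == 'combined-feed') {
    throw StateError('no compose box, so no autocomplete, in this narrow');
  }
  return [1, 2, 3];
}

// Stand-in for the inner check: throws when the two lists differ.
void expectSame(List<int> actual, List<int> expected) {
  if (actual.join(',') != expected.join(',')) {
    throw StateError('expected $expected, got $actual');
  }
}

// Returns true if [body] throws anything at all.
bool throwsSomething(void Function() body) {
  try {
    body();
  } catch (_) {
    return true;
  }
  return false;
}

// Weak check: the query and the comparison share one closure, so a failed
// comparison against an arbitrary expected list also counts as "throws".
bool weakCheck(String narrow) =>
    throwsSomething(() => expectSame(queryResults(narrow), [9, 9, 9]));

// Stronger check: only the call that is supposed to throw is in the closure.
bool strongCheck(String narrow) =>
    throwsSomething(() => queryResults(narrow));
```

With these definitions, `weakCheck` reports a throw for the topic narrow as well as for the combined feed, while `strongCheck` reports one only for the combined feed.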

The [List.sort] method is documented as being not necessarily stable, meaning that elements that compare equal could end up in an arbitrary order in the result. Here, that means that users which aren't distinguished by recency in the topic or stream or in DM conversations can appear in any order in the autocomplete results. Several of these test cases were therefore vulnerable to flaking because users 2 and 4 have no DMs and (in these tests) no messages in the stream or topic. I think what they cover is now also covered by the "ranking across signals" tests above, together with the tests for RecentSenders and RecentDmConversations that check those data structures' handling of events. So just cut the tests.

It's good to have an end-to-end test here that the actual autocomplete results reflect the sorting that's tested by all the unit tests in the rest of this file: a test that makes a MentionAutocompleteView, then gives it a query, and inspects the results. The several such test cases that were here, though, don't exercise any distinct scenarios when it comes to that end-to-end logic; they differ from each other only in ways that are exercised by the unit tests above, plus the tests for RecentSenders and RecentDmConversations that check those data structures' handling of events. So pick just one of these test cases and cut the rest.

Now that many of these helpers have just one call site, they can be simplified.

Zulip user IDs (like channel/stream IDs, message IDs, and so on) are positive integers. So avoiding zero helps keep the test data realistic.

…nd test This is nearly NFC, but also adds a `dispose` call to tidy things up after getting the results.

…test This way it's possible to look at the expected results at the end of this test and compare them with the test data that leads to those results, without scanning past a lot of other code that isn't related to why these particular results are the right answers.

This test was relying on its `topic` variable (whose value is "topic") being different from the topic chosen by default by eg.streamMessage. It does happen to be different, but it could just as well not have been. Because the difference matters -- the test would fail if they happened to be the same -- the test case itself should be explicit about that.

gnprice

Thanks for the revision!

This didn't quite get the tests to the kind of tests that I was hoping for. I decided the best way this time to communicate that would be by demonstration — so I wound up spending some time today writing tests of the kind I want to see, and revising these tests in that direction. I'll push those changes to the tip of the PR branch.

For each idea that might otherwise have been a separate comment on the PR, I made a separate commit. As a result, the series of commits added to the tip of the branch is longer than it would have been if I were writing the same tests from scratch:

adeb012 autocomplete [nfc]: Format second half of compareByRecency same way as first
a0d0248 autocomplete [nfc]: Set streamId and topic as variables directly
3c7096f autocomplete test [nfc]: Tighten formatting in compareByRecency tests
a5b3cc8 autocomplete test: Check both directions in compareByRecency tests
6e72a5b autocomplete test [nfc]: Tighten names of compareByRecency tests
d91ecca autocomplete test: Check compareByRecency uses per-topic recency
72caacf autocomplete test [nfc]: Deduplicate in test names
4762e4e autocomplete [nfc]: Expose debugCompareUsers, for testing
bfce142 autocomplete test: Unit tests of between-signals logic, for each narrow type
37e643d autocomplete test: Fix test that almost can't fail, of CombinedFeedNarrow
3823ec4 autocomplete test: Cut tests that could flake due to indeterminate sort
a14ea8b autocomplete test: Highlight one end-to-end test, cut the others
1e781dd autocomplete test [nfc]: Do constant-folding in end-to-end test
739fa84 autocomplete test: Fix user IDs of zero
2b8d848 autocomplete test: Separate getting results from checking in end-to-end test
0288ee4 autocomplete test: Test ranking applies in presence of filters, too
b436fb5 autocomplete test [nfc]: Bring details next to details in end-to-end test
363dafa autocomplete test [nfc]: Cut unneeded MessageListView in end-to-end test
3444a68 autocomplete test: Make distinct topic explicitly distinct

Please take a look through each of those commits, including the commit messages, to see the points I had in mind. I also left one GitHub comment below, for an aspect that the commits don't go into full detail on.

With that, this code all looks good; and I also did some quick manual testing and confirmed it works end-to-end in the app. So, merging! Thanks again @sm-sayedi for all your work building this feature #228, and @rajveermalviya @hackerkid @chrisbobbe for the previous reviews on this series of PRs.

gnprice · 2024-07-25T04:22:24Z

test/model/autocomplete_test.dart

+        test('DmNarrow, with no topic/stream message history', () async {
+          await prepareStore();
+          await checkResultsIn(dmNarrow, expected: [3, 0, 1, 2, 4]);


It's hard to interpret most of the test cases in this "suggests relevant users" group — for example, why is 3, 0, 1, 2, 4 the correct result to expect here, and why is 0, 3, 1, 2, 4 the correct result to expect in some of the tests above? The details that cause the right answers to be those lists and not something else are separated from here by about 100 lines of code, and they require some careful working-through to verify.

In general it's best for a test to be self-contained — all the key details that tell the story of the test, and determine why one value is expected instead of another, live in the test case's own source code, or failing that very close above it. That way

a reader can look at the test and confirm it's checking for the right answer instead of a wrong answer;

a reader can look at the test and understand what it's testing, and look at the ensemble of tests to think about things that aren't yet tested;

in the future when changing something about the code under test and the test breaks, a reader can understand what the test was trying to say so they can decide whether the failure signals a bug in their change or is an intended effect of their change, and so they can update the test if necessary so it continues to check what it was intended to check.

We use a lot of test helper functions to enable test cases to be concise and focused, because that also aids in all three of the points above. But when doing so, the key is to move out the boring details (the things that can be summarized as the helpers just doing things that ought to be done), and keep in the details that are part of the test's story, or that are necessary in order to make sense of other details in the test. So for example as long as this ranking [3, 0, 1, 2, 4] is here, the details of messages and dmConversations that give rise to it should be here or near here too.
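To illustrate the shape of a self-contained test case, here is a small sketch in plain Dart. The `rankByRecency` helper and the data are hypothetical and much simpler than the real autocomplete code; the point is only that the data explaining the expected ranking sits right beside the expectation.

```dart
/// Hypothetical helper: sorts [userIds] so that users with a more recent
/// message (a larger message ID) come first, and users with none come last.
List<int> rankByRecency(List<int> userIds, Map<int, int> latestMessageId) {
  // Message IDs are positive, so 0 can stand in for "no messages".
  int key(int userId) => latestMessageId[userId] ?? 0;
  return [...userIds]..sort((a, b) => key(b).compareTo(key(a)));
}

/// A self-contained test case: returns whether the ranking is as expected.
bool rankingTestPasses() {
  // The details that determine the right answer live inside the test itself.
  final latestMessageId = {1: 100, 2: 300, 3: 200}; // user 4 has no messages
  final ranked = rankByRecency([1, 2, 3, 4], latestMessageId);
  // User 2 sent most recently, then 3, then 1; user 4 never sent anything.
  return ranked.join(',') == '2,3,1,4';
}
```

The test data deliberately has no ties, so the expected order does not depend on how the sort treats elements that compare equal.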

sm-sayedi · 2024-07-25T07:20:34Z

Thank you @gnprice for your time and patience in improving the PR. I read through all the commits and learned a lot. I believe this will improve my limited ability to write good tests.

Zulip user IDs are positive integers. So avoiding zero here helps keep the test data realistic. This follows up on 739fa84 (in zulip#828), which made a similar fix for one test case.

sm-sayedi added the "integration review" label Jul 19, 2024

sm-sayedi requested a review from gnprice July 19, 2024 20:28

gnprice reviewed Jul 20, 2024

sm-sayedi mentioned this pull request Jul 22, 2024

model: Introduce data structures for "recent senders criterion" of user-mention autocomplete #692

Merged

sm-sayedi force-pushed the issue-228-@-mention-recent-senders-criterion branch 3 times, most recently from 5cc7896 to b816843 Compare July 24, 2024 11:59

sm-sayedi mentioned this pull request Jul 24, 2024

autocomplete: Sort user-mention autocomplete results #608

Closed

sm-sayedi requested a review from gnprice July 24, 2024 12:24

sm-sayedi and others added 20 commits July 24, 2024 16:45

autocomplete [nfc]: Factor out compareRecentMessageIds

68c90d7

autocomplete [nfc]: Format second half of compareByRecency same way a…

adeb012

…s first

autocomplete [nfc]: Set streamId and topic as variables directly

a0d0248

I think this is a bit more idiomatic than setting a record which we promptly destructure; and it doesn't come out any longer.

autocomplete test [nfc]: Tighten formatting in compareByRecency tests

3c7096f

autocomplete test: Check both directions in compareByRecency tests

a5b3cc8

autocomplete test [nfc]: Tighten names of compareByRecency tests

6e72a5b

This way they fit on a line.

autocomplete test: Check compareByRecency uses per-topic recency

d91ecca

Without this, the method's other unit tests (the other tests in this group) pass even if the method were to ignore its `topic` parameter and only use the per-stream data.

autocomplete [nfc]: Expose debugCompareUsers, for testing

4762e4e

autocomplete test: Unit tests of between-signals logic, for each narr…

bfce142

…ow type

autocomplete test [nfc]: Do constant-folding in end-to-end test

1e781dd

Now that many of these helpers have just one call site, they can be simplified.

autocomplete test: Fix user IDs of zero

739fa84

Zulip user IDs (like channel/stream IDs, message IDs, and so on) are positive integers. So avoiding zero helps keep the test data realistic.

autocomplete test: Separate getting results from checking in end-to-e…

2b8d848

…nd test This is nearly NFC, but also adds a `dispose` call to tidy things up after getting the results.

autocomplete test: Test ranking applies in presence of filters, too

0288ee4

autocomplete test [nfc]: Cut unneeded MessageListView in end-to-end test

363dafa

gnprice force-pushed the issue-228-@-mention-recent-senders-criterion branch from b816843 to 3444a68 Compare July 25, 2024 04:58

gnprice reviewed Jul 25, 2024

gnprice merged commit 3444a68 into zulip:main Jul 25, 2024
1 check passed

chrisbobbe mentioned this pull request Jul 25, 2024

autocomplete: Sort user-mention autocomplete results #228

Closed

sm-sayedi mentioned this pull request Jul 29, 2024

autocomplete: Add "human vs. bot user" and "Alphabetical order" criteria #849

Merged

gnprice mentioned this pull request Aug 10, 2024

test [nfc]: Ensure user IDs, message IDs, etc. are positive #877

Merged
