
fix(web): enhanced timer for prediction algorithm #7037

Merged
mcdurdin merged 3 commits into master from fix/web/pred-text-stability on Sep 9, 2022

Conversation

jahorton
Contributor

@jahorton jahorton commented Aug 4, 2022

Hopefully addresses some aspects of #6985.

This PR is primarily designed to address some CI unit testing stability issues. Since our browser-based unit testing is handled via an external service, it's quite possible that our unit tests may be run alongside others in a manner that causes context-switching. Noting that we've had issues arise from context switching before - see #5467 / #5479 - it's quite possible that some of our prediction instability is arising from the same underlying problem - a user often does have background threads and processes running on their mobile device, after all.

Noting the similarity between this and the intermittent failures where predictive text produces no results even though identical text otherwise produces suggestions: if my suspicions are correct, this may actually help mitigate some aspects of #6985.

The inner class ExecutionTimer here is an attempt to determine the predictive-text search's active execution time (that is, "actual time on the processor") by detecting possible context-switching time intervals and ignoring them. If the longest interval observed at any point is at least twice as long as the next two longest intervals combined, we'll call that an outlier. Zero-length intervals will also be ignored, since otherwise even a 1 ms interval would automatically be an outlier.

  • [1, 1, 4]: the '4' is considered an outlier
  • [1, 2, 4]: the '4' isn't an outlier - (1 + 2) * 2 > 4, after all.

Note that excluded outliers aren't considered when detecting future outliers. The expectation/assumption is that each measured loop iteration should take a consistent amount of time... unless affected by context switching. To facilitate this, I've set things up for the timer (as used) to avoid tracking time in the search's yield state.
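The outlier rule above can be sketched as a standalone helper (hypothetical function and names; this is not the actual ExecutionTimer code from the PR):

```typescript
/**
 * Hypothetical sketch of the outlier rule described above. Given the
 * measured interval durations (in ms), reports whether the longest one
 * should be treated as a context-switch outlier: it is flagged when it is
 * at least twice the combined length of the next two longest intervals.
 * Zero-length intervals are ignored so that a lone 1 ms interval is not
 * flagged automatically.
 */
function longestIsOutlier(intervals: number[]): boolean {
  // Sort descending, ignoring zero-length intervals.
  const sorted = intervals.filter(t => t > 0).sort((a, b) => b - a);
  if (sorted.length < 3) {
    return false; // not enough data to judge an outlier
  }
  const [longest, second, third] = sorted;
  return longest >= 2 * (second + third);
}

console.log(longestIsOutlier([1, 1, 4])); // true:  4 >= 2 * (1 + 1)
console.log(longestIsOutlier([1, 2, 4])); // false: 2 * (2 + 1) > 4
```

Sorting descending keeps the check order-independent, matching the "next two longest intervals observed" wording.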

@keymanapp-test-bot skip

I think we should be fine without user testing... though I could see an argument for a mild bit of user testing for predictive text in the mobile apps if someone wanted to push for it.

@jahorton jahorton added this to the A16S7 milestone Aug 4, 2022
@keymanapp-test-bot keymanapp-test-bot bot added the user-test-missing User tests have not yet been defined for the PR label Aug 4, 2022
@jahorton
Contributor Author

jahorton commented Aug 5, 2022

I've re-run the lm-worker test multiple times now - the unit test that's addressed by these changes hasn't failed yet out of at least 5 attempts.

...not that this prevented BrowserStack from failing to run against one of the configured test browser configs for some of those attempts. 😞

@mcdurdin mcdurdin modified the milestones: A16S7, A16S8 Aug 7, 2022
@jahorton jahorton marked this pull request as ready for review August 8, 2022 04:58
@jahorton
Contributor Author

jahorton commented Aug 8, 2022

This comment (#7036 (comment)) on the base branch also occurs here, including for the LMLayer's in-browser tests.

@mcdurdin
Member

mcdurdin commented Aug 8, 2022

I think we should be fine without user testing... though I could see an argument for a mild bit of user testing for predictive text in the mobile apps if someone wanted to push for it.

It's a functional change internally. I'd like to see some user tests.


@mcdurdin mcdurdin left a comment


I think I understand what this is doing. I don't have enough context to understand why this would be fixing the inconsistency in predictions, though. So if you can add extra context comments in the code, that'd be great.

common/web/lm-worker/src/correction/distance-modeler.ts (3 outdated, resolved review threads)
@@ -582,17 +644,22 @@ namespace correction {

let batcher = new BatchingAssistant();

const timer = new ExecutionTimer(maxTime*3, maxTime);
Member


Why maxTime*3 vs maxTime? Is that likely to be a constant multiplier? If so, we should build that into the class I should think.

Contributor Author

@jahorton jahorton Aug 9, 2022


... because I needed to set the max true-system-time threshold to something larger than the max execution time, and basing it on the max execution time seemed to make the most sense. I just picked '3' for the coefficient because there'd be nearly no chance of the wall-clock cap being too short for the execution threshold to be reached.

Could probably be maxTime*2 pretty safely, and I could entertain arguments for other values. Maybe even 1.5; we don't expect heavy "thrashing" behavior from constant OS context switches.

The key interpretation: we're willing to wait up to 0.1 sec for predictions if execution gets delayed due to context switches, but for the common case we should only need to wait 0.033 sec (when no context switches occur). That 0.033 sec value has long been set via a constant.

Using the * [coefficient] approach will dynamically adjust that maximum wait up if we're given more execution time, but for the common case, there should be no observable effect.
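As a sketch of that interpretation (hypothetical helper names; the real logic lives in ExecutionTimer in distance-modeler.ts):

```typescript
// Hypothetical sketch of the two budgets described above: the search stops
// once active (on-processor) time exceeds the execution budget, or once
// total wall-clock time exceeds that budget times the coefficient (3 in the
// PR as written), so a heavily context-switched search still terminates.
const TRUE_TIME_COEFFICIENT = 3; // off-the-cuff; 2 or even 1.5 might work as well

function shouldStop(activeMs: number, wallClockMs: number, maxExecutionMs: number): boolean {
  const maxTrueTimeMs = maxExecutionMs * TRUE_TIME_COEFFICIENT;
  return activeMs >= maxExecutionMs || wallClockMs >= maxTrueTimeMs;
}

console.log(shouldStop(10, 50, 33));  // false: within both budgets
console.log(shouldStop(10, 120, 33)); // true: wall-clock cap (99 ms) exceeded
```

With no context switching, active time and wall-clock time track each other, so only the execution budget matters; the wall-clock cap only bites when the thread is repeatedly preempted.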

Member


OK. The numbers are fairly off-the-cuff, but we need to start somewhere. Measuring and testing on real low-end Android devices may be helpful.

Comment on lines +617 to +619
* Base assumption: OS context switches generally last at least 16ms. (Based on
* a window.setTimeout() usually not evaluating for at least
* that long, even if set to 1ms.)
Contributor Author

@jahorton jahorton Aug 9, 2022


Basis: try running this code in your browser's developer mode a few times:

var now = Date.now();
var timer = window.setTimeout(() => { console.log(Date.now() - now); }, 1);

Of course, keep in mind that this code will be triggered via user-input interrupt, so it can easily be triggered while the relevant thread would otherwise be asleep, possibly in the middle of whatever timer mechanism the OS uses to periodically context-switch, "off the beat" in a sense. As a result, you will see notably lower values than 16ms... but you'll likely also see some higher values.

Example runs on Win 10 / Firefox:

[screenshot: console output from several runs, showing observed delays both below and above 16ms]
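For repeated measurements, the one-liner above can be wrapped in a loop; this sketch also runs under Node, where setTimeout exists without the window prefix:

```typescript
// Measures how long a nominal 1 ms setTimeout actually takes to fire.
// Observed values vary with OS timer resolution and scheduling, which is
// what motivates the ~16 ms context-switch assumption discussed above.
function measureTimeoutDelay(): Promise<number> {
  const start = Date.now();
  return new Promise<number>(resolve => {
    setTimeout(() => resolve(Date.now() - start), 1);
  });
}

async function main() {
  for (let i = 0; i < 5; i++) {
    console.log(`observed delay: ${await measureTimeoutDelay()} ms`);
  }
}
main();
```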

If a 23ms context-switch pause occurred during a prediction search, with no other interruptions... that'd cut execution time from 33ms to just 10ms without the special handling added here. If it were paused even longer, well... good luck operating with literally 0ms allotted execution time. As noted by #5467 / #5479, it can happen with some regularity.

While we don't know the exact conditions of the BrowserStack servers, devices, etc used for unit testing... I wouldn't be surprised if they have at least a few VMs set up per server; if so, there's a pretty solid chance of significant context-switching happening on those systems.

Member


https://developer.mozilla.org/en-US/docs/Web/API/setTimeout#reasons_for_delays_longer_than_specified may also be helpful background.

On Windows, this may also be relevant:

The resolution of the GetTickCount function is limited to the resolution of the system timer, which is typically in the range of 10 milliseconds to 16 milliseconds. The resolution of the GetTickCount function is not affected by adjustments made by the GetSystemTimeAdjustment function.

I've got no idea what sort of behaviour we are going to see on mobile devices for these scenarios. Sounds like some testing on low-end devices is seriously needed.

@jahorton
Contributor Author

I've got no idea what sort of behaviour we are going to see on mobile devices for these scenarios. Sounds like some testing on low-end devices is seriously needed.

What did you have in mind for this? Do we have access to suitable devices for such tests? I assume that we'd need physical devices to test this; after all, this PR's changes are aimed at handling OS behavior that'd be very difficult to emulate directly.

Or... is that why it's been given approval - for a 'test in the wild' on alpha? Not the sort of thing we'd usually do, so unless explicitly stated otherwise, I'll avoid merging for now.

Base automatically changed from fix/web/ui-module-test-stability to master August 16, 2022 02:56
@mcdurdin
Member

What did you have in mind for this? Do we have access to suitable devices for such tests? I assume that we'd need physical devices to test this; after all, this PR's changes are aimed at handling OS behavior that'd be very difficult to emulate directly.

Yeah, I am struggling with this. We do have a couple of old devices in the Keyman office there, don't we?

Or... is that why it's been given approval - for a 'test in the wild' on alpha? Not the sort of thing we'd usually do, so unless explicitly stated otherwise, I'll avoid merging for now.

I'm not sure what else we can do here. So if possible, test on those old devices in the office there. And then go ahead and merge.

@mcdurdin
Member

Of course, the alternative is to avoid timing-based approaches altogether and go with a 'number of iterations' approach, where we determine a suitable number based on empirical testing of what would be a typically acceptable response time on some old devices. (And then, as devices get faster over time, we can gradually raise the threshold.)
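A sketch of that alternative (hypothetical names and budget value; the real number would come from the empirical testing described):

```typescript
// Iteration-count budgeting: immune to context-switch timing skew, since it
// counts units of search work rather than elapsed time.
const ITERATION_BUDGET = 10000; // hypothetical; to be calibrated on low-end devices

// `step` performs one unit of search work and returns false once exhausted.
function runBudgetedSearch(step: () => boolean): number {
  let iterations = 0;
  while (iterations < ITERATION_BUDGET && step()) {
    iterations++;
  }
  return iterations; // number of iterations actually performed
}
```

A slower device simply spends more wall-clock time on the same budget, trading response time for a consistent search depth across devices.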

@mcdurdin mcdurdin added this to the A16S9 milestone Aug 19, 2022
@mcdurdin mcdurdin modified the milestones: A16S9, A16S10 Sep 5, 2022
@mcdurdin mcdurdin merged commit 13a3df3 into master Sep 9, 2022
@mcdurdin mcdurdin deleted the fix/web/pred-text-stability branch September 9, 2022 19:45
@keyman-server
Collaborator

Changes in this pull request will be available for download in Keyman version 16.0.60-alpha
