
refactor: break fraudAssessment into evaluations #442

Merged
merged 21 commits into main from refactor-fraud-assessment-step2
Jan 27, 2025

Conversation

@bajtos (Member) commented Jan 9, 2025:

Break `Measurement.fraudAssessment` into two new fields:

- `taskingEvaluation` for the tasking-related checks.
  Example: `DUP_INET_GROUP`

- `consensusEvaluation` for the result of the majority-seeking process based on committees.
  Example: `MINORITY_RESULT`

After this change, places filtering "accepted" measurements have to explicitly spell out how they define "accepted".

- Some places are interested in tasking evaluation results only and consider minority results as "accepted" too. Example: RSR calculated from individual measurements.

- Other places are stricter and want only measurements in the majority. Example: which measurements to reward.

However, this pull request is intended to be a pure refactoring with no change in functionality. It should simply expand the check `fraudAssessment === 'OK'` into `taskingEvaluation === 'OK' && consensusEvaluation === 'MAJORITY_RESULT'` and surface the places where we may want to include minority measurements too. Such changes can be implemented in follow-up pull requests.

This is a follow-up for #396 (comment)

See also #439
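
For illustration, here is a minimal sketch of the mechanical expansion this refactoring performs. The field names and values come from the description above; the helper functions and the `measurements` array are hypothetical, not code from this repository.

```js
// Hypothetical helpers; `measurements` is an array of Measurement objects.

// Before this PR: a single field decided whether a measurement was "accepted".
const acceptedBefore = (measurements) =>
  measurements.filter((m) => m.fraudAssessment === 'OK')

// After this PR: each call site spells out which notion of "accepted" it means.
// The strict form (e.g. for picking measurements to reward) requires both checks:
const acceptedStrict = (measurements) =>
  measurements.filter(
    (m) => m.taskingEvaluation === 'OK' && m.consensusEvaluation === 'MAJORITY_RESULT'
  )

// The lenient form (e.g. for RSR calculated from individual measurements)
// keeps minority results:
const acceptedLenient = (measurements) =>
  measurements.filter((m) => m.taskingEvaluation === 'OK')
```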

@bajtos (Member, Author) left a comment:

🧵 TODO items blocking the conversion from "draft" to "ready" 👇🏻

Resolved review threads: lib/evaluate.js (×2, outdated), bin/evaluate-measurements.js (outdated), lib/round.js
@bajtos (Member, Author) left a comment:

More TODOs 👇🏻

Resolved review thread (outdated): lib/platform-stats.js
```diff
@@ -81,8 +81,9 @@ export const buildRetrievalStats = (measurements, telemetryPoint) => {
     const endAt = m.end_at
     const ttfb = startAt && firstByteAt && (firstByteAt - startAt)
     const duration = startAt && endAt && (endAt - startAt)
+    const isAccepted = m.taskingEvaluation === 'OK' && m.majorityEvaluation === 'OK'
```
@bajtos (Member, Author) commented:

TODO: check if we need to add a test

@bajtos (Member, Author) commented:

Done in c08c10a

A reviewer (Member) commented:

To double check: We will now be calling measurements that will be rewarded "accepted measurements", right?

@bajtos (Member, Author) replied:

Hmm, this is a good call! 🤔

I want to call measurements that will be rewarded "measurementsToReward"; that's what I use inside the main evaluation loop.

Everywhere else, we need to decide whether we are interested in measurements that passed tasking evaluation but may be in the minority, or whether we are interested in measurements that will be rewarded (passed the tasking evaluation and are in the majority).

In that light, maybe it would be better to avoid using the term "accepted measurements" altogether. Or qualify it as "measurements accepted by tasking"?

I don't have a good answer 😞
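
To make the distinction concrete, here is a hypothetical sketch of the two predicates being weighed in this thread, using the field names from the diff above; the helper names are illustrative only, not code from this PR.

```js
// Lenient: passed the tasking-related checks; minority results still count.
// Fits use cases like RSR calculated from individual measurements.
const passedTaskingEvaluation = (m) => m.taskingEvaluation === 'OK'

// Strict: also agrees with the committee majority; these are the
// measurements to reward.
const isMeasurementToReward = (m) =>
  m.taskingEvaluation === 'OK' && m.majorityEvaluation === 'OK'
```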

Resolved review thread (outdated): lib/retrieval-stats.js
@bajtos bajtos marked this pull request as ready for review January 10, 2025 09:43
Resolved review threads (outdated): bin/dry-run.js, bin/evaluate-measurements.js
bajtos and others added 3 commits January 10, 2025 15:23
@juliangruber (Member) left a comment:

Great work!

Resolved review threads (outdated): lib/evaluate.js (×2), lib/platform-stats.js, lib/preprocess.js
@juliangruber (Member) commented:

Feel free to land this work next week during my vacation, without further review by me

bajtos and others added 2 commits January 13, 2025 17:23
@NikolasHaimerl (Contributor) left a comment:

LGTM, I'll approve once Julian's changes have been addressed.

@bajtos bajtos requested a review from juliangruber January 15, 2025 12:41
@bajtos (Member, Author) commented Jan 21, 2025:

@juliangruber @NikolasHaimerl the pull request is ready for the final review 👀

@bajtos bajtos requested a review from NikolasHaimerl January 21, 2025 14:08
Resolved review threads (outdated): lib/committee.js, lib/typings.d.ts
@bajtos bajtos requested a review from juliangruber January 21, 2025 14:59
Resolved review thread (outdated): lib/committee.js
Commit added: …und, retrievalMajorityFound
@bajtos bajtos requested a review from pyropy as a code owner January 22, 2025 09:02
@bajtos bajtos requested a review from juliangruber January 22, 2025 09:04
@bajtos (Member, Author) commented Jan 23, 2025:

@NikolasHaimerl would you like to review the final version or can I go ahead and land the PR?

```diff
@@ -241,6 +241,43 @@ describe('evaluate', async function () {
     assertPointFieldValue(point, 'total_nodes', '3i')
   })
```

A Contributor commented:

Small nitpick -- should we rename this file from `evaluate.js` to `evaluate.test.js` to make things consistent?

@bajtos (Member, Author) replied:

Yeah, we have a bit of inconsistency in file naming. I am fine with renaming it. Let's keep such changes out of the scope of this pull request, please.

```diff
@@ -22,8 +22,10 @@ export class Measurement {
     // Note: providerId is recorded by spark-publish but we don't use it for evaluations yet
     this.providerId = pointerize(m.provider_id)
     this.spark_version = pointerize(m.spark_version)
-    /** @type {import('./typings.js').FraudAssesment} */
-    this.fraudAssessment = null
+    /** @type {import('./typings.js').TaskingEvaluation} */
```
A Contributor commented:

Since the initial value is set to null by default, we might want to make the property type optional, maybe something like:

```js
/** @type {import('./typings.js').TaskingEvaluation?} */
```

@bajtos (Member, Author) replied:

Agreed 👍

This pull request is not making this any worse, so let's leave such an improvement out of scope, please.
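
For context, here is a minimal self-contained sketch of the nullable-annotation pattern suggested above; the class body and field type shown are illustrative, not the actual code from this PR.

```js
export class Measurement {
  constructor () {
    // In JSDoc, a trailing `?` (equivalent to `| null`) marks the type as
    // nullable, matching the `null` initial value, so TypeScript-aware
    // tooling knows the field may hold null until evaluation assigns it.
    /** @type {string?} */
    this.taskingEvaluation = null
  }
}
```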

@bajtos bajtos merged commit 0f6d5bf into main Jan 27, 2025
6 checks passed
@bajtos bajtos deleted the refactor-fraud-assessment-step2 branch January 27, 2025 09:09