Deterministic find_distributed_partition (non-set) #529

Open
wants to merge 26 commits into
base: main

Conversation

matthiasdiener
Collaborator

@matthiasdiener matthiasdiener commented Jul 25, 2024

abandon (almost) all sets ye who enter here

Closes #465.
Closes #498.

Things to test:

  • Is the orderedset change to DirectPredecessorsGetter necessary? Edit: in my tests it made no difference, but the code structure suggests that DirectPredecessorsGetter needs to be deterministic too; changed from orderedsets to unique tuples in 076a76e
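
The move from sets to "unique tuples" mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not pytato's implementation: `Node` and `unique_tuple` are hypothetical names, and the deduplication relies on `dict` preserving insertion order (guaranteed since Python 3.7).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical stand-in for a DAG node; not a pytato class.
    name: str


def unique_tuple(items):
    """Deduplicate *items* while keeping first-seen order."""
    # dict keys are unique and keep insertion order, unlike set elements.
    return tuple(dict.fromkeys(items))


a, b, c = Node("a"), Node("b"), Node("c")
preds = unique_tuple([b, a, b, c, a])
print([n.name for n in preds])  # ['b', 'a', 'c']
```

Because the result depends only on the order in which predecessors are visited, and not on hash values, repeated runs produce the same sequence.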

Please squash

@matthiasdiener matthiasdiener self-assigned this Jul 25, 2024
@matthiasdiener
Collaborator Author

This is ready for a first look @inducer.

@matthiasdiener matthiasdiener changed the title Deterministic find_distributed_partition v2 Deterministic find_distributed_partition (non-set) Jul 26, 2024
Owner

@inducer inducer left a comment

Took a look, just a few minor nits. Looks good to go from my perspective.

@@ -826,7 +772,7 @@ def find_distributed_partition(
         raise comm_batches_or_exc

     comm_batches = cast(
-        Sequence[AbstractSet[CommunicationOpIdentifier]],
+        Sequence[list[CommunicationOpIdentifier]],
Owner

Suggested change
-        Sequence[list[CommunicationOpIdentifier]],
+        Sequence[Collection[CommunicationOpIdentifier]],

maybe?

Collaborator Author

I removed the cast completely in 168ef53

     We only consider the predecessors of a nodes in a data-flow sense.
     """
-    def _get_preds_from_shape(self, shape: ShapeType) -> frozenset[Array]:
-        return frozenset({dim for dim in shape if isinstance(dim, Array)})
+    def _get_preds_from_shape(self, shape: ShapeType) -> abc_Set[Array]:
Owner

Suggested change
-    def _get_preds_from_shape(self, shape: ShapeType) -> abc_Set[Array]:
+    def _get_preds_from_shape(self, shape: ShapeType) -> AbstractSet[Array]:

(from typing)? I'm not sure mypy understands the collections.abc types well.
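
For context on the two spellings, here is a small sketch showing that `typing.AbstractSet` is a generic alias of `collections.abc.Set` at runtime. Whether a particular mypy version treats the two differently is the reviewer's question and is not verified here. `preds_from_shape` is a hypothetical simplification in which string dimensions stand in for `Array`-valued ones.

```python
import collections.abc
from typing import AbstractSet, get_origin


def preds_from_shape(shape: tuple) -> AbstractSet[str]:
    # Hypothetical simplification: string dims stand in for Array-valued dims.
    return frozenset(dim for dim in shape if isinstance(dim, str))


# typing.AbstractSet is a generic alias whose origin is collections.abc.Set.
same_origin = get_origin(AbstractSet[str]) is collections.abc.Set

result = preds_from_shape((3, "n", 4, "n"))
```

Either annotation accepts a `frozenset` return value, since `frozenset` is registered as a `collections.abc.Set`.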

Collaborator Author

I think this may have been fixed with 076a76e

Comment on lines 858 to 859
output_arrays = tuple(unique(outputs._data.values()))
mso_arrays = tuple(unique(materialized_arrays + sent_arrays + output_arrays))
Owner

I think these might be fine left as lists (as returned from unique), typed as Sequence so that mypy flags attempted mutation.

Collaborator Author

I'm not sure I understand - could you please clarify? (unique doesn't return a list)
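
The two points in this exchange can be illustrated with a short sketch: an order-preserving `unique` written as a generator yields items lazily rather than returning a list, and annotating the materialized result as `Sequence` hides mutating methods from a type checker. The `unique` below is a hypothetical stand-in written for this example, not the helper the PR uses.

```python
from collections.abc import Iterator, Sequence


def unique(items) -> Iterator:
    """Hypothetical order-preserving generator; a stand-in, not the PR's helper."""
    seen = set()  # used for membership tests only, never iterated
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


gen = unique([3, 1, 3, 2])
is_lazy = isinstance(gen, Iterator) and not isinstance(gen, Sequence)

# Materialize once; the Sequence annotation exposes no mutating methods,
# so a type checker should reject a call such as arrays.append(4).
arrays: Sequence[int] = list(unique([3, 1, 3, 2]))
```

A `tuple(...)` as in the diff gives immutability at runtime as well, while the `Sequence`-typed list only relies on static checking.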

@matthiasdiener matthiasdiener marked this pull request as ready for review August 14, 2024 19:43
@matthiasdiener
Collaborator Author

matthiasdiener commented Aug 14, 2024

This is ready for another look @inducer. It seems to work fine with the main version of pytato, but when merging it with our production branch, the execution is not deterministic.

@matthiasdiener
Collaborator Author

but when merging it with our production branch, the execution is not deterministic.

Never mind, this seems to have been a merge error.

@matthiasdiener
Collaborator Author

matthiasdiener commented Sep 28, 2024

This is ready for review @inducer. The current version shows the same performance as the baseline, and I have not seen any determinism-related issues when using this PR.
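
One possible way to probe for ordering nondeterminism, sketched here as an assumption and not as the method used for this PR, is to run the same snippet under several `PYTHONHASHSEED` values and compare the output. The child program and the names `CHILD` and `run_with_seed` are hypothetical.

```python
import os
import subprocess
import sys

# Child program: prints an order-preserving dedup of some strings.
CHILD = "print(list(dict.fromkeys(['send', 'recv', 'send', 'out'])))"


def run_with_seed(seed: int) -> str:
    """Run CHILD in a fresh interpreter with the given hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env, capture_output=True, text=True, check=True,
    )
    return proc.stdout.strip()


# Collect the distinct outputs seen across seeds; one entry means stable order.
outputs = {run_with_seed(seed) for seed in (0, 1, 2)}
deterministic = len(outputs) == 1
```

Replacing the `dict.fromkeys` call in the child with a plain `set` of strings would typically make the printed order depend on the seed, which is the kind of variation this PR aims to remove.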

@matthiasdiener
Collaborator Author

As far as I can see, the mypy errors are unrelated to this PR.

2 participants