
Provide a way for sets to zipper with more things #26595

Open · wants to merge 25 commits into base: main
Conversation

@lydia-duncan (Member) commented Jan 23, 2025

Prior to this work, sets could only zipper with identical other sets (essentially, only with themselves). Zippering with other sets of the same length, or with arrays, would result in errors about unequal zippering lengths.
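
For concreteness, a small example of the kind of code this change targets (the sets and values here are hypothetical):

    use Set;

    var s1 = new set(int, [1, 2, 3]);
    var s2 = new set(int, [4, 5, 6]);

    // zippering two distinct same-length sets; prior to this work this
    // could halt with an unequal-lengths error. Pairings and output
    // order are arbitrary, since sets yield in no particular order.
    forall (a, b) in zip(s1, s2) do
      writeln((a, b));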

Resolves #15824

Adds a leader/follower pair of iterators to ChapelHashtable to support this effort. These iterators and their support functions allow the hashtable to evenly divide the elements stored in it, instead of evenly dividing the entire hashtable's storage capacity.
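
For readers unfamiliar with the pattern, a leader/follower pair in Chapel has roughly this shape (a minimal, self-contained sketch in the style of the parIters primer, not the actual ChapelHashtable code; `numElems`, `elemAt`, and `chunkOfTask` are hypothetical stand-ins):

    // toy storage: pretend element i is just i, over 16 elements
    proc numElems() { return 16; }
    proc elemAt(i: int) { return i; }

    // front-loaded chunk of task tid out of k tasks over n elements
    proc chunkOfTask(tid: int, k: int, n: int) {
      const per = n / k, rem = n % k;
      const lo = tid * per + min(tid, rem);
      return lo..#(per + (if tid < rem then 1 else 0));
    }

    iter elements() {
      // serial overload: yield every element
      for i in 0..<numElems() do yield elemAt(i);
    }

    iter elements(param tag: iterKind) where tag == iterKind.leader {
      // divide the element positions (not the raw slots) among tasks
      coforall tid in 0..<here.maxTaskPar do
        yield (chunkOfTask(tid, here.maxTaskPar, numElems()),);
    }

    iter elements(param tag: iterKind, followThis)
        where tag == iterKind.follower {
      // yield the elements whose positions fall in the leader's chunk
      for i in followThis(0) do yield elemAt(i);
    }

    writeln(+ reduce elements()); // 120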

While here, I fixed a comment that was missing a closing parenthesis.

New tests:

  • Added a version of setZipDifferentSize.chpl that reversed the order, to show that it still worked
  • Added a future test of zippering with an array leader that is the same length as the set
  • Added a future test of zippering with an array follower that is the same length as the set
  • Added a test of zippering with an array leader that is shorter than the set
  • Added a future test of zippering with an array follower that is shorter than the set
  • Added a test of zippering two sets with different contents (basically, David Longnecker's original test from #15824, "Zippering of two sets can fail even when they are the same length")
  • Added a test of zippering an array with the set's toArray result (user code, but not expected to have changed as a result of this effort)

Places for improvement:

  • It currently iterates over the whole backing hashtable for each follower. Ideally we'd iterate over it once and store the location of all filled elements somewhere helpful, but that's a problem for another time

Passed a full paratest with futures

Adds a pair of leader/follower iterators to ChapelHashtable and a helper method,
and calls them from the Set `these` iterators.

The implementation does frequent recomputation of how to divide up the iteration
space to find full slots.  This can probably be improved on, but I wanted to
take a snapshot of what worked before attempting it.

It also does not yet help when zippering with arrays. That is also a next step.

----
Signed-off-by: Lydia Duncan <[email protected]>
Test comes from issue chapel-lang#15824 and checks zippering two sets of the same length
(but different contents).  Sorts the output so that it is consistent instead of
potentially racy.

----
Signed-off-by: Lydia Duncan <[email protected]>
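
A sketch of that sorting pattern for deterministic test output (the set contents here are hypothetical; toArray is the same method mentioned elsewhere in this PR):

    use Set, Sort;

    var s = new set(int, [3, 1, 2]);

    // a set yields its elements in an arbitrary order, so sort before
    // printing to keep comparisons against the expected output stable
    var arr = s.toArray();
    sort(arr);
    writeln(arr); // 1 2 3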
This is a first step towards being able to zip sets with other types instead of
just other sets

----
Signed-off-by: Lydia Duncan <[email protected]>
This meant always computing the number of chunks and which chunk we ourselves were, since the array iterator expects the other components of the tuple to be ranges as well (so we can't use followThis to send that information), and the follower's argument list is supposed to match the leader's aside from followThis. This is unfortunate, but for the sizes I checked it seemed to be okay.

----
Signed-off-by: Lydia Duncan <[email protected]>
Covers:
- converting the set to an array and then zippering
- zippering with the array leading
- zippering with an array when the set leads

With the prior commit, all three work (though toArray already worked prior to
this effort).

----
Signed-off-by: Lydia Duncan <[email protected]>
The original is only failing on one of the two machines I checked, so I'm curious whether this also fails there or behaves as expected (it passes on the machine where the original still works).

----
Signed-off-by: Lydia Duncan <[email protected]>
I'm not sure if this will actually allow us to check for unequal lengths, but let's try it and see.

----
Signed-off-by: Lydia Duncan <[email protected]>
This argument indicates the size of the leader in the iteration, so that we can validate whether the iteration space is uneven. It has a default value so that the non-zippered case is not as annoying to write.

Update various tests to use this iterator instead

----
Signed-off-by: Lydia Duncan <[email protected]>
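
As a sketch of the shape this describes (a toy stand-in, not the PR's actual code; `MiniSet` and the -1 sentinel are illustrative):

    record MiniSet {
      var elems: domain(int); // associative domain as toy storage

      // leaderSize defaults to a sentinel so the non-zippered case can
      // omit it; when supplied, it lets us flag uneven iteration spaces
      iter contents(leaderSize: int = -1) {
        if leaderSize >= 0 && leaderSize != elems.size then
          halt("cannot zip a set with something of a different size");
        for x in elems do yield x;
      }
    }

    var s = new MiniSet();
    s.elems += 1;
    s.elems += 2;
    for x in s.contents() do writeln(x); // no size check when unzippered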
- document new iterator
- remove commented out writelns
- restore old behavior to Set's these iterators, since we're using a different
  iterator
- comment an if statement in _determineEvenChunks
- close a parenthetical in an older comment in ChapelHashtable

----
Signed-off-by: Lydia Duncan <[email protected]>
It'll present a cleaner interface to users.

----
Signed-off-by: Lydia Duncan <[email protected]>
Covers:
- Zippering with a shorter array as leader
- Zippering with a shorter array as follower
- Serial version of the new iterator
- Standalone version of the new iterator

----
Signed-off-by: Lydia Duncan <[email protected]>
…f it

Would have added a warning to all the overloads, except for the bug reported in
all the right contexts, whether the bug or the unstable warning gets resolved
first.

----
Signed-off-by: Lydia Duncan <[email protected]>
Use a halt wrapper instead of `__primitive("chpl_error", ...)`, especially because this is a new error rather than one trying to match errors that the compiler can insert.

While doing that, I made the error messages clearer about what was wrong and what should be done instead (hopefully heading off confusion in the follower case), and added a test locking in the error message for the leader case.

----
Signed-off-by: Lydia Duncan <[email protected]>
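
A hedged sketch of what such a wrapper might look like (the name and message are illustrative, not the PR's actual code):

    // halt with a purpose-built message rather than mimicking a
    // compiler-inserted error via __primitive("chpl_error", ...)
    proc unequalZipHalt(setSize: int, otherSize: int) {
      halt("zippered iteration requires iterands of equal size, but ",
           "the set has ", setSize, " elements and its counterpart has ",
           otherSize);
    }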
@bradcray (Member) commented:

Capturing my thoughts after seeing a demo of this today:

  • I think it'd be preferable to have the set.these() iterator support this kind of zippering rather than to rely on a new iterator that takes the size explicitly
  • The case where a leader set is smaller than the follower seems like an instance of #11428 ("zippered forall loops with size mismatches can silently drop iterations on the floor") and a general problem with our leader-follower framework, rather than something we should worry about solving specifically for sets
    • this would cause a change in behavior for test/library/standard/Set/these/setZipDifferentSize.chpl, but I think that's acceptable since (a) it's a case that doesn't "work" today anyway (in the sense that it causes a halt) and (b) cases that ought to work today, like the one in #15824 ("Zippering of two sets can fail even when they are the same length"), don't. That is, I think the bug fix trumps the halt.
    • if a user wanted to zip two sets safely/correctly, I'd advise them to assert that the two set sizes are equal before doing the zippering, rather than using the .contents() iterator
  • I'd propose having the set's leader simply invoke the range's leader on 0..<size rather than recreating that logic (see the sketch below)
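
For example (a standalone sketch of that delegation idea, not Set's actual code; `delegated` and `n` are hypothetical stand-ins for the set's iterator and size):

    config const n = 10;

    // serial overload over the element positions
    iter delegated() {
      for i in 0..<n do yield i;
    }

    // leader: hand chunking off to the range's leader over 0..<n
    // instead of recreating its division logic
    iter delegated(param tag: iterKind) where tag == iterKind.leader {
      for followThis in (0..<n).these(iterKind.leader) do
        yield followThis;
    }

    // follower: likewise delegate to the range's follower
    iter delegated(param tag: iterKind, followThis)
        where tag == iterKind.follower {
      for i in (0..<n).these(iterKind.follower, followThis) do
        yield i;
    }

    writeln(+ reduce delegated()); // 45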

Some have argued that since a set yields its elements in an arbitrary order, sets shouldn't support zippering at all. I see that point (and started with that mindset myself), but I think the existence of #15824 suggests that people may want it anyway, and it also supports things like copying a set into an array when you don't care about ordering (or haven't cared until wanting to put it into the array).

@lydia-duncan (Member, Author) commented:

  • I'd propose having the set's leader simply invoke the range's leader on 0..<size rather than recreating that logic

Interestingly, it looks as though ranges distribute uneven chunks differently than I was expecting, and this can result in the current implementation missing values from the set.

For a range of 10 elements with 6 tasks, the chunks will be [0..1, 2..3, 4..4, 5..6, 7..8, 9..9] instead of [0..1, 2..3, 4..5, 6..7, 8..8, 9..9] as I was expecting, meaning one element of the set gets yielded twice and another gets dropped entirely. I need to double-check that this doesn't mean arrays/domains with the same size/tasks combination will encounter the same issue, and if so, adjust the follower accordingly.
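
To make the expected scheme concrete, the front-loaded division can be computed like this (a standalone snippet; the function name is illustrative):

    // front-loaded division of n elements over k tasks: the first
    // (n % k) tasks each take one extra element
    proc frontLoadedChunks(n: int, k: int) {
      const per = n / k, rem = n % k;
      var lo = 0;
      for tid in 0..<k {
        const size = per + (if tid < rem then 1 else 0);
        write(lo..#size, " ");
        lo += size;
      }
      writeln();
    }

    frontLoadedChunks(10, 6); // 0..1 2..3 4..5 6..7 8..8 9..9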

@bradcray (Member) commented:

Could you summarize the follower's approach using pseudocode?

@lydia-duncan (Member, Author) commented:

I think I need to limit the scope back down to only "sets can zip with sets". I did encounter the same issue with arrays of that same size/number of tasks and the attempts to solve it so far have been running into walls.

While I can use the range leader to get the chunk divisions when in the set/hashtable leader, trying to use it during the follower to determine how to divide up the slots gives a different answer (I'm pretty sure because when we're in the follower, the iterator understandably thinks there are a different number of tasks available to do parallel iteration). I'm not sure how to solve that without sending more information between the leader and the follower.

Additionally, while I was trying to do that, I encountered several issues when dealing with arrays of ranges. I'm going to write them up separately, but with the scope scaled back down, they won't be as relevant.

@lydia-duncan (Member, Author) commented Jan 29, 2025

Could you summarize the follower's approach using pseudocode?

Here's the word-problem description of it; I'll write up better pseudocode once I de-ice my driveway.

// determine the number of elements to return from the followThis argument (size of the range)

// extrapolate from that number the number of expected chunks (basically, solve for x in totalElements = elements in this chunk * x)
// based on where the followThis range starts, guess which chunk we are referring to in order to determine where to start looking for elements to return (divide the starting location by the chunk size to see where we start)

// then go through and determine the range of slots needed to find each chunk of elements to return. The setup when I made this PR basically assumed a known threshold point where the number of elements in a chunk would change (the idea for this was based on the parIters primer). So for 10 elements and 6 tasks, we know that 6 divides into 10 once, with a remainder of 4, so four tasks would take 2 elements and the remaining 2 would take 1 element.

// So iterate over the slots in the hashtable, storing the point at which each threshold would be matched (2, 4, 6, 8, 9), except for the last one which is known to be "end of the table". Front-loading the tasks that take more to the beginning allowed us to mathematically compute when to stop looking for a second element - once we have filled "remainder" number of slots, we will look for the lower number of elements.

// Finishing that iteration gives us the upper bound for each task's range of elements in an array, except for the last task (which again, will be the end of the table). E.g., for a hashtable with 25 slots, this could look like [2, 6, 8, 13, 17], which would give us ranges of 0..2, 3..6, 7..8, 9..13, 14..17, 18..24 to find the full slots.

// To determine which of those ranges this task should actually use, go back to our guess at the chunk based on the lower bound for the element number to return. If we are task 3 (when the tasks are 0 based), for instance, the followThis would be 6..7, the chunk size is 2, and 6 divided by 2 is 3, so we have determined which task we are and will return all elements in 9..13

(Edit: note that once we get to actual tasks 4 and 5, the chunk size would be 1. But since our starting location was expected to be mathematically computable and have a specific threshold where it switches, this won't pose a problem. A chunk size of 1 will lead to every element having their own subtask, and our starting element number would be 8 and 9, so we'd act as though we were task 8 and task 9, which are the last two chunks anyways).
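
The chunk-guessing arithmetic from that walkthrough, as a standalone snippet:

    // task 3's followThis in the 10-elements/6-tasks example
    const followThis = 6..7;
    const chunkSize = 2; // elements per (larger, front-loaded) chunk
    // dividing the chunk's low bound by the chunk size recovers which
    // chunk we are
    writeln(followThis.low / chunkSize); // 3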

@lydia-duncan (Member, Author) commented:

// So iterate over the slots in the hashtable, storing the point at which each threshold would be matched (2, 4, 6, 8, 9), except for the last one which is known to be "end of the table". Front-loading the tasks that take more to the beginning allowed us to mathematically compute when to stop looking for a second element - once we have filled "remainder" number of slots, we will look for the lower number of elements.

This is the part that needs better pseudocode.

// save the minimum number of elements for each task, and the number
// of tasks that will have one additional element
   const minPerTask = numElems / numTasks,
         numWithExtra = numElems % numTasks;

// make an array to store the upper bounds of all tasks except the
// last one (which will use the end of the hashtable as its upper bound)
   var upperBounds: [0..<numTasks-1] int;

   var boundPos = 0; // our position in the upper bounds array
   var numSeen = 0;  // the number of elements we've actually seen

   for i in `the whole hashtable` {
      // increment our number of seen elements, if we found one
      // (slotIsFull is a stand-in for the real fullness check)
      if slotIsFull(i) then numSeen += 1;

      // early exit the loop if we've filled the upper bounds array or
      // found the last element
      if boundPos >= upperBounds.size || numSeen == numElems then break;

      // tasks with an additional element are front-loaded, so the
      // threshold depends on whether this chunk is one of them
      const threshold =
        if boundPos < numWithExtra
          then (minPerTask + 1) * (boundPos + 1)
          else minPerTask * (boundPos + 1) + numWithExtra;

      // store the location where we met that threshold, and move our
      // position in the upper bounds array forward
      if numSeen == threshold {
        upperBounds[boundPos] = i;
        boundPos += 1;
      }
   }

We now have all the start and end points for tasks to find each element evenly: 0, each element in the upper bounds array, and the end of the hashtable's storage. E.g., for a hashtable with 25 slots, the upper bounds array could look like [2, 6, 8, 13, 17], which would give us ranges of 0..2, 3..6, 7..8, 9..13, 14..17, and 18..24 to find the full slots.

This reverts commit 09a4bca.

Partial change, but seemed like the natural way to do that part

----
Signed-off-by: Lydia Duncan <[email protected]>
- Replace uses of the now-defunct contents iterator with the these iterator
- Drop the tests that were explicitly only for contents (serial, standalone, and
  the error message test on the argument)
- Futurize the tests of zippering with arrays
- Add a future for zippering with a longer array
- Add two futures of zippering with ranges

Ranges and arrays don't work with the restored strategy (but they didn't always
work with the previous one, as demonstrated by the range and longer array tests)

----
Signed-off-by: Lydia Duncan <[email protected]>
withRange1's behavior is inconsistent

----
Signed-off-by: Lydia Duncan <[email protected]>
----
Signed-off-by: Lydia Duncan <[email protected]>
@lydia-duncan lydia-duncan marked this pull request as ready for review January 30, 2025 17:42
@lydia-duncan lydia-duncan changed the title from "Provide a way for sets to zipper with more things" to "Provide a way for sets to zipper with other sets" Jan 30, 2025
@bradcray (Member) commented:

To avoid making any sort of assumptions about what the leader is passing us / what we're zippering with, I'd suggest going for simplicity and correctness over performance in a first implementation for the set follower iterator. For example, if I'm asked to follow lo..hi, I'd start from the beginning of the set, iterate through "full" slots, counting them, until I hit slot # lo, then yield and continue until I hit slot # hi, then exit (if the leader's range is strided, this would also need to skip every 's' elements). This is obviously very serial, and worse, O(size) for the last follower, but since sets aren't distributed (so will only be following #cores tasks at most, typically), it may not be the end of the world for many cases. I may also be positing that many sets may not be super-huge.
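
As a toy, self-contained version of that counting approach (`slots` stands in for the hashtable's backing storage, and a non-strided followThis range is assumed):

    const slots = [true, false, true, true, false, true, true];

    // walk the occupied slots serially, counting them, and yield only
    // the indices whose ordinal position falls within r (= lo..hi)
    iter followFullSlots(r: range) {
      var count = 0;
      for i in slots.domain {
        if !slots[i] then continue;
        if count > r.high then break;   // past our chunk: early exit
        if count >= r.low then yield i; // within lo..hi: yield
        count += 1;
      }
    }

    for i in followFullSlots(1..2) do writeln(i); // 2, then 3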

For more scalable approaches, my mind goes to things like having some sort of directory structure within the set / hash table that (lazily?) maintains a rough sense of where items live, for example through a computed-on-demand map, a scan operation, or the like. This problem also seems quite similar to the kind of problem Jeremiah was wrestling with in the ParallelIO module: I want to read a file in parallel, but don't know what offset task # i of n should start at, since the "rows" of a file (say) aren't all of equal size. I can't recall what approach he took there, and I felt similarly that it was very challenging to parallelize, but I remember being convinced that what he ended up with was sound (I think? Though it worries me that I can't remember it).
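
On the scan idea, one hedged sketch: a prefix sum over slot fullness tells each slot how many full slots precede or include it, which a follower could search rather than counting from slot zero:

    const slots = [true, false, true, true, false, true];
    const fullness = [s in slots] (s: int);
    // counts[i] = number of full slots among slots[0..i]
    const counts = + scan fullness;
    writeln(counts); // 1 1 2 3 3 4
    // a follower starting at element position lo could search for the
    // first i with counts[i] == lo + 1 instead of walking from slot 0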

But again, I'd prefer to have it working correctly and flexibly first, and then worry about performance in a follow-on effort.

@benharsh (Member) left a comment:

The code as-written looks good to me, but I'm inclined to agree with Brad about focusing on simplicity and correctness.

@lydia-duncan (Member, Author) commented:

To avoid making any sort of assumptions about what the leader is passing us / what we're zippering with, I'd suggest going for simplicity and correctness over performance in a first implementation for the set follower iterator. [...]

Hmm, to be fair, that's both simpler and probably better performance than what I had implemented. I'll give it a shot and see where it gets me, thanks!

Brad pointed out that we could just count through the filled elements of the
hashtable until we reach the count specified by the `low` value, and then start
yielding.  This is more performant than what I had been doing, and it allows more
flexibility, enabling it to actually work with ranges and arrays.

The one drawback is that it does mean we don't detect when the leader is shorter
than the follower, but that's already true for arrays so we'll at least be
consistent with them.

----
Signed-off-by: Lydia Duncan <[email protected]>
- Updates the expected behavior for setZipDifferentSize.chpl and
arrayShorter.chpl
- Removes the .bad and .future files for the tests that now work
- Adds a .prediff I missed for the second range test because the test wasn't
working before
- Updates the expected output for withArrayLonger1.chpl, again because the test
wasn't working before

----
Signed-off-by: Lydia Duncan <[email protected]>
@lydia-duncan (Member, Author) commented:

Okay, it looks like that does result in us failing to detect when the leader is shorter than the follower, but we already said that was okay because it was consistent with what arrays do. Running a paratest now, but otherwise I think this is ready for another look (sorry Ben XD)

@lydia-duncan lydia-duncan changed the title from "Provide a way for sets to zipper with other sets" to "Provide a way for sets to zipper with more things" Jan 30, 2025
@lydia-duncan lydia-duncan requested a review from benharsh January 31, 2025 16:01