New Scan Intrinsics #290

Open
wants to merge 2 commits into
base: master

Conversation

everythingfunctional
Member

This is an attempt at #273.

Open questions:

  • How to spell the name? (Straw vote for plenary)
  • Should the initial implementation be restricted to 1D arrays?
    • This would eliminate the DIM argument and make it easier to describe.
    • Behavior for other ranks is not difficult to implement outside of scan (as illustrated by the descriptions in the first draft)
    • Support for additional ranks could be added later without breaking backwards compatibility
  • Should the initial implementation have an optional argument to perform a segmented scan?
    • There are different ways of specifying the segments that need to be considered in designing the interface
    • Support for it could be added later without breaking backwards compatibility
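The point about other ranks can be illustrated with a small sketch, written in Python for brevity rather than Fortran. The names `inclusive_scan` and `scan_along_dim` are hypothetical and not part of the proposal; the sketch assumes a rank-2 array stored as a list of rows and simply applies a 1D scan to each row or column.

```python
from itertools import accumulate
from operator import add


def inclusive_scan(values, operation):
    """1D inclusive scan: element i combines values[0..i]."""
    return list(accumulate(values, operation))


def scan_along_dim(matrix, operation, dim):
    """Apply the 1D scan along dimension `dim` of a rank-2 array.

    Follows Fortran's 1-based DIM convention (an assumption of this sketch):
    dim=1 scans down each column, dim=2 scans across each row.
    """
    if dim == 2:
        return [inclusive_scan(row, operation) for row in matrix]
    if dim == 1:
        # Scan each column, then transpose back to a list of rows.
        columns = [inclusive_scan(list(col), operation) for col in zip(*matrix)]
        return [list(row) for row in zip(*columns)]
    raise ValueError("dim must be 1 or 2 for a rank-2 array")


matrix = [[1, 2, 3],
          [4, 5, 6]]
print(scan_along_dim(matrix, add, dim=2))
print(scan_along_dim(matrix, add, dim=1))
```

Under these assumptions, a user who only has a 1D scan can recover the DIM behavior with a loop over the other dimension, which is the argument for deferring it.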

@certik
Member

certik commented Jan 10, 2023

Awesome, thanks @everythingfunctional!

@klausler

I probably missed it, but I didn't see a way for exclusive prefix scans to specify the identity values of their operations, which are needed to define the first value in their result sequence. Your examples return 1 for the exclusive scan with MY_MULT but I don't see how the language could know that 1 is its identity.

@FortranFan
Member

Re: the spelling of the name, what is your preference, and that of those who have requested it?

@w6ws

w6ws commented Jan 11, 2023 via email

@everythingfunctional
Member Author

> I probably missed it, but I didn't see a way for exclusive prefix scans to specify the identity values of their operations, which are needed to define the first value in their result sequence. Your examples return 1 for the exclusive scan with MY_MULT but I don't see how the language could know that 1 is its identity.

The scan_exclusive procedure takes an additional argument (IDENTITY). The example provides that as an argument.
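A minimal sketch of why the exclusive form needs IDENTITY, written in Python rather than Fortran. `scan_exclusive`, `IDENTITY` and `MY_MULT` come from the discussion; `scan_inclusive` and the exact argument order are assumptions made for illustration only.

```python
from itertools import accumulate


def my_mult(x, y):
    """User-supplied operation, standing in for MY_MULT."""
    return x * y


def scan_inclusive(array, operation):
    """result[i] combines array[0..i]; no identity is needed."""
    return list(accumulate(array, operation))


def scan_exclusive(array, operation, identity):
    """result[i] combines array[0..i-1]; result[0] has no contributing
    elements, so it can only be the caller-supplied identity."""
    # accumulate(..., initial=identity) yields identity first, then the
    # running results; dropping the last entry gives the exclusive scan.
    return list(accumulate(array, operation, initial=identity))[:-1]


print(scan_inclusive([1, 2, 3, 4], my_mult))
print(scan_exclusive([1, 2, 3, 4], my_mult, 1))
```

The first element of the exclusive result is exactly the value passed as the identity, which is why the implementation never has to guess it.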

> Re: the spelling of the name, what is your preference, and that of those who have requested it?

The preference is scan, but of course that's taken, and it is ambiguous as to whether it is inclusive or exclusive (unless of course we have an optional argument to select between the two, which is, I suppose, a possibility).

@w6ws, thanks for the feedback. I have had one or two others suggest looking at HPF for inspiration. I think the operator-specific variants do have potential for better performance, but I'm not sure they really are any more compile-time/type safe. The compiler is already obligated to check the type of the array argument against the operation provided, so I don't think there's much chance of a user getting it wrong. I'm not opposed to adding the operator-specific variants, but perhaps as a next iteration or separate paper. I'm leaning towards restricting to 1D to start simpler and not including segmented scans, again to keep things simpler.

@w6ws

w6ws commented Jan 13, 2023 via email

@everythingfunctional
Member Author

First, thank you all for your comments and suggestions. They have revealed some aspects that I had not initially considered. I would like to try and address some of the various comments.

The most common suggestion has amounted to something like, "Why not just do exactly what HPF did?"

Looking at what HPF did has been very valuable as a reference. I can certainly understand vendors wanting to be able to reuse existing work, but I want to still explore whether HPF necessarily did it perfectly, or whether we could potentially do better now. If the way we choose to do it now is slightly different than HPF, it doesn't mean that prior work is not reusable, just that it may require some modification.

HPF provided the combinatorial set of operations with {PRE,SUF}FIX-specific procedures, with an optional argument to determine inclusive or exclusive. With the recent sentiments expressed that all new features should have compelling use cases, there is a much greater burden to justify each one individually. By providing a generic procedure that is applicable to all operations, only a few use cases are needed to justify it, including use cases not covered by the operation-specific versions.

It has been suggested that the operation-specific versions mean that the code can be type checked, but the compiler absolutely could type check the generic version. It is required that ARRAY and OPERATION have the same types, which the compiler can see at the call site and enforce. This is just like the generic REDUCE function. I will note that this does mean COUNT_{PRE,SUF}FIX is not possible with the generic version. A slight change to the description and arguments could re-enable it though, i.e.

PURE FUNCTION OPERATION(x, y) RESULT(res)
TYPE(<type_of_identity_argument>), INTENT(in) :: x
TYPE(<type_of_array_argument>), INTENT(in) :: y
TYPE(<type_of_identity_argument>) :: res
END FUNCTION

and make IDENTITY a required argument.
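To make the mixed-type idea concrete, here is a hedged Python sketch (not Fortran, and not the proposed interface). It assumes the accumulator has the type of IDENTITY while the elements have the type of ARRAY; `scan_inclusive` and `count_true` are hypothetical names chosen for this illustration.

```python
def scan_inclusive(array, operation, identity):
    """Inclusive scan where operation(acc, element) returns a value of the
    accumulator's type, which may differ from the element type."""
    result = []
    acc = identity
    for element in array:
        acc = operation(acc, element)
        result.append(acc)
    return result


def count_true(count, flag):
    """x has the identity's type (integer), y the array's type (logical)."""
    return count + 1 if flag else count


# A COUNT_PREFIX-like result from a logical array and an integer identity.
print(scan_inclusive([True, False, True, True], count_true, 0))
```

With the identity required, the result type is known even though it differs from the array type. Whether such an operation can still be evaluated in parallel is a separate question that this sequential sketch does not address.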

HPF provided {PRE/SUF}FIX functions, with an optional argument to do inclusive vs exclusive. It would be reasonable to do the opposite arrangement, have {IN/EX}CLUSIVE functions with an optional argument to do forward vs backward iteration. Or even just single functions with optional arguments for both, or individual functions for each. Each option has certain advantages and disadvantages that should be considered.
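One of the arrangements mentioned, a single function with optional arguments for both choices, might look like the following Python sketch. The name `scan` and the keyword names `exclusive` and `backward` are assumptions for illustration, not proposed spellings.

```python
from operator import add


def scan(array, operation, identity=None, exclusive=False, backward=False):
    """Single entry point covering prefix/suffix and inclusive/exclusive."""
    if exclusive and identity is None:
        raise ValueError("an exclusive scan needs an identity value")
    items = list(reversed(array)) if backward else list(array)

    def combine(acc, item):
        # Keep operands in array order so non-commutative operations work
        # for suffix scans as well.
        return operation(item, acc) if backward else operation(acc, item)

    result = []
    acc = identity
    for item in items:
        if exclusive:
            result.append(acc)          # value before this element contributes
            acc = combine(acc, item)
        else:
            acc = item if not result else combine(acc, item)
            result.append(acc)
    return result[::-1] if backward else result


print(scan([1, 2, 3, 4], add))
print(scan([1, 2, 3, 4], add, backward=True))
print(scan([1, 2, 3, 4], add, identity=0, exclusive=True))
print(scan([1, 2, 3, 4], add, identity=0, exclusive=True, backward=True))
```

A trade-off visible even in this sketch is that the identity becomes conditionally required, which separate inclusive and exclusive functions would avoid.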

HPF defined that the result of a {PRE/SUF}FIX function has the same shape as the array argument, regardless of the presence of a MASK argument, and that any element of the result to which no elements of the input contribute is given a "default" value. This works for the HPF functions because every specified operation either has a meaningful "default" value or, in the case of COPY, does not allow a mask or exclusive argument (i.e. there is no chance of zero elements contributing). This is not necessarily possible in the generic case. The generic REDUCE function overcomes this with an IDENTITY argument. My initial thought, though, was that the behavior would be more like SUM_PREFIX(ARRAY, MASK) == SUM_PREFIX(PACK(ARRAY, MASK)). I'm not sure I know enough about the various use cases to decide which is more appropriate, and I'm open to suggestions here. I'll just note that I believe most other intrinsics with a MASK argument do follow this pattern (or a similar pattern with loops over dimensions other than DIM). For example:

res => MINLOC(ARRAY, MASK)
res(s1, ..., sdim-1, :, sdim+1, ..., sn) == MINLOC(PACK(ARRAY(s1, ..., sdim-1, :, sdim+1, ..., sn), MASK(s1, ..., sdim-1, :, sdim+1, ..., sn)))
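The two candidate MASK behaviors can be contrasted with a small Python sketch. It reflects my reading of the descriptions above rather than any normative text, and the function names are hypothetical.

```python
from itertools import accumulate
from operator import add


def masked_scan_same_shape(array, mask, operation, identity):
    """HPF-like reading: the result has the shape of `array`; masked-out
    elements do not contribute, and positions with no contributors yet
    receive the identity ("default") value."""
    result = []
    acc = identity
    for element, keep in zip(array, mask):
        if keep:
            acc = operation(acc, element)
        result.append(acc)
    return result


def masked_scan_packed(array, mask, operation):
    """PACK-like reading: scan only the selected elements, so the result
    is as long as the number of true mask values."""
    packed = [element for element, keep in zip(array, mask) if keep]
    return list(accumulate(packed, operation))


array = [1, 2, 3, 4, 5]
mask = [False, True, False, True, True]
print(masked_scan_same_shape(array, mask, add, 0))
print(masked_scan_packed(array, mask, add))
```

The first variant needs an identity as soon as a leading element can be masked out, while the second never does but changes the shape of the result.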

HPF treated the SEGMENT argument such that adjacent elements whose corresponding elements in SEGMENT are .EQV. are members of the same segment. I.e. SUM_PREFIX([1, 2, 3, 4, 5], SEGMENT=[.true., .true., .false., .true., .true.]) == [1, 3, 3, 4, 9]. However, many implementations in other languages treat .true. values in the SEGMENT argument as signifying the first element of a segment. I.e. SUM_PREFIX([1, 2, 3, 4, 5], SEGMENT=[.true., .false., .true., .true., .false.]) == [1, 3, 3, 4, 9]. Is it better to be consistent with HPF or with other languages?
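The two SEGMENT conventions can be related to each other with a short Python sketch. The helper names are hypothetical; the sketch converts the HPF-style flags into start-of-segment flags and then runs one segmented scan for both.

```python
from operator import add


def starts_from_eqv(segment):
    """HPF-style flags to start flags: a new segment begins at the first
    element and wherever a flag differs from its predecessor."""
    return [i == 0 or segment[i] != segment[i - 1] for i in range(len(segment))]


def segmented_scan(array, starts, operation):
    """Inclusive scan that restarts whenever starts[i] is true."""
    result = []
    for element, is_start in zip(array, starts):
        if is_start or not result:
            acc = element               # begin a new segment
        else:
            acc = operation(acc, element)
        result.append(acc)
    return result


hpf_flags = [True, True, False, True, True]
start_flags = [True, False, True, True, False]
print(starts_from_eqv(hpf_flags))
print(segmented_scan([1, 2, 3, 4, 5], starts_from_eqv(hpf_flags), add))
print(segmented_scan([1, 2, 3, 4, 5], start_flags, add))
```

Since one convention converts to the other in a single pass, the choice is mostly about which form users find more natural to write.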

HPF did not provide collective subroutine versions for any of these operations. Depending on the chosen design, which operations should we provide collective subroutines for?

For each of these design dimensions I of course have opinions, but I'm open to any considerations I may not have thought of. Overall, my "values" for the design would be, in order:

  • Is easy for users to understand and use correctly
  • Allows for efficient implementations by vendors
  • Doesn't add too much to the standard

I'm also open to the idea of starting with a restricted subset of the above discussed functionality such that we can avoid having to settle on certain aspects of the design initially.

Looking forward to hearing more ideas.

@everythingfunctional
Member Author

> As for the user-supplied function, if you really think you need it I'd suggest it be ELEMENTAL rather than PURE. ELEMENTAL is, of course, a much more restricted version of pure. That would considerably simplify the documentation of the argument.

That's an interesting idea (and tempting) as the constraints for elemental functions are nearly the exact constraints needed. However, I'm not sure it quite works because:

> A dummy procedure or procedure pointer shall not be specified to be ELEMENTAL.

> An elemental subprogram is a pure subprogram unless it has the prefix-spec IMPURE.

It's also not really called in an elemental way to perform the scan, so it is somewhat disingenuous to say that it needs to be possible.

@w6ws

w6ws commented Jan 16, 2023 via email

with multiple optional arguments for determining
contributing element selection
@sblionel
Member

I can only comment on the name at this point. I don't like overloading the existing SCAN name, as the purpose is entirely different and it complicates how one categorizes the procedure. Yes, it can be distinguished by a compiler, but it will be very confusing for programmers as well as for documentation. I see there are alternatives offered and would prefer any of them over just SCAN.

@everythingfunctional
Member Author

@sblionel, I totally understand. I was somewhat aware of those points and agree with them. I didn't have many ideas for a different name that seemed to fit well, and wanted to see how much opposition there was and how strong it was. If that's the biggest problem I'll be happy.

6 participants