[Proposal]: First-Class Span Types #7905

333fred · 2024-02-03T00:52:05Z

First-Class Span Types

Proposed
Prototype: Not Started
Implementation: test plan
Specification: link

Summary

We introduce first-class support for Span<T> and ReadOnlySpan<T> in the language, including new implicit conversion types and consider them in more places,
allowing more natural programming with these integral types.

Motivation

Since their introduction in C# 7.2, Span<T> and ReadOnlySpan<T> have worked their way into the language and base class library (BCL) in many key ways. This is great for
developers, as their introduction improves performance without costing developer safety. However, the language has held these types at arm's length in a few key ways,
which makes it hard to express the intent of APIs and leads to a significant amount of surface area duplication for new APIs. For example, the BCL has added a number of new
tensor primitive APIs in .NET 9, but these APIs are all offered on ReadOnlySpan<T>. Because C# doesn't recognize the
relationship between ReadOnlySpan<T>, Span<T>, and T[], it means that any developers looking to use those APIs with anything other than a ReadOnlySpan<T> have to explicitly
convert to a ReadOnlySpan<T>. Further, it also means that they don't have IDE tooling guiding them to use these APIs, since nothing will indicate to the IDE that it is valid
to pass them after conversion. There are also issues with generic inference in these scenarios. In order to provide maximum usability for this style of API, the BCL will have to
define an entire set of Span<T> and T[] overloads, which is a lot of duplicate surface area to maintain for no real gain. This proposal seeks to address the problem by
having the language more directly recognize these types and conversions.

Detailed Design

https://github.com/dotnet/csharplang/blob/main/proposals/first-class-span-types.md

Design meetings

The text was updated successfully, but these errors were encountered:

TahirAhmadov · 2024-02-05T00:23:26Z

I'm probably missing something obvious, but why doesn't just adding implicit operators suffice?

333fred · 2024-02-05T00:34:59Z

Implicit operators already exist. They do not suffice because they do not participate in the pieces that I outlined in the proposal.

TahirAhmadov · 2024-02-05T01:00:17Z

Oh I see what you mean, it's because of generics.
Isn't there an opportunity to do something more global here? Say, look at the target type; it's generic; look at implicit operators and try to solve "the generic equation" - is it possible that the expression type satisfies the implicit operator's parameter type with any type argument; if solved, determine the generic argument; use the implicit operator.

333fred · 2024-02-20T21:17:57Z

These changes require a deep understanding of the relationship between the array type and a type parameter; I don't really see a generalizable mechanism that wouldn't be massively overcomplicated for the class of problem that we're trying to fix here.

KalleOlaviNiemitalo · 2024-03-08T14:08:18Z

Would you allow an extension method invocation to combine an implicit span conversion with an identity conversion?

using System;

class C {
    void M() {
        object[] objs = {};
        S.X(objs); // OK
        objs.X(); // ?

        (int a, int b)[] tuples = {};
        S.Y(tuples); // OK
        tuples.Y(); // ?
    }
}

static class S {
    internal static void X(this Span<dynamic> span) {}
    internal static void Y(this Span<(int c, int d)> span) {}
}

Or alternatively, allow this identity conversion as part of the implicit span conversion:

From any single-dimensional array_type with element type Ei to System.Span<EiUi>, provided that there is an identity conversion (§10.2.2) from Ei to Ui

KalleOlaviNiemitalo · 2024-03-08T14:21:53Z

The conversion to Span<T> throws in the covariant case

using System;

class C {
    static void Main() {
        object[] objs = new string[1];
        S.X(objs); // implicit conversion to Span<object> throws ArrayTypeMismatchException
        objs.X();  // implicit span conversion presumably likewise
    }
}

static class S {
    internal static void X(this Span<object> span) {}
}

The proposal defines the implicit span conversion to be a standard implicit conversion. It's then also a pre-defined conversion according to §10.4.1:

The standard conversions are those pre-defined conversions that can occur as part of a user-defined conversion.

This paragraph in §10.2.1 then has to be changed:

The pre-defined implicit conversions always succeed and never cause exceptions to be thrown.

333fred · 2024-03-08T16:07:34Z

Would you allow an extension method invocation to combine an implicit span conversion with an identity conversion?

No, that is not intended by this proposal.

KalleOlaviNiemitalo · 2024-03-08T17:15:38Z

Does that mean:

using System;

class C {
    void M() {
        (int a, int b)[]? array1 = null;
        (int c, int d)[]? array2 = null;
        array1.N(); // calls X.N(array1)
        array2.N(); // calls Y.N(array2)
        array1.AsSpan().N(); // error CS0121, ambiguous
    }
}

static class X {
    public static void N(this Span<(int a, int b)> span) {}
}

static class Y {
    public static void N(this Span<(int c, int d)> span) {}
}

I think this would be the first time that tuple element names affect which method is called.

Neme12 · 2024-05-01T00:01:47Z

I'm firmly against the idea that the language would pick a span overload over an array or string overload and I think it would be really unfortunate. There are issues with that, as I mentioned here:

I think that would be unfortunate. There are many reasons where a method takes a span as well as an array/string, and the span overload potentially has to allocate, whether for legacy reasons (e.g. Stream, where the default implementation has to potentially allocate and make a copy and forward to the original array-taking one - and I would guess the majority of real-world stream do not override the span one from what I've seen in other people's code), and I've seen it in cases other than legacy reasons as well where a new API was added with overloads for both either string/array and a span and the span-taking one had to allocate. I think it was for String implementing ISpanParsable - the regular parsable doesn't allocate, the span-taking one does. But it was still added for it to be usable in generic scenarios where you only have a span and not a string. But if you do have a string, you should still prefer the string-taking one. After all, string/arrays are more derived and more usable, and can be reused. Spans cannot be reused, and so sometimes you have to allocate an array/string from them.

The only case where preferring span overloads adds value is for params, where the array version allocates invisibly by default (but I might be missing other cases as well in the language where there are implicit allocations similar to this that would be mitigated by span). Of course, If C# didn't have invisible allocations from the start, even this wouldn't be a problem.

An array is a strictly derived type from a span. It has more featueres, it is more usable, and it can be put on the heap. If all you get is a span, you cannot put it on the heap or reuse it as a field, and you might have to allocate and copy the data to get something more usable. String, on the other hand, can be reused. If what you already have is an array or a string, you should call those corresponding overloads. Preferring span is only useful in cases like params span when you don't already have an array you could pass in, and it will implicitly allocate one. But that's an issue with the language implicitly allocating.

I'd be fine with the language understanding span types, but only if it still considered them base types of arrays or strings - not the other way around. It would be nice to avoid the duplication of overloads on MemoryExtensions for both Span<T> and ReadOnlySpan<T> that's there for basically every method.

CyrusNajmabadi · 2024-05-01T00:05:50Z

I'm firmly against the idea that the language would pick a span overload over an array or string overload

We've already decided, based on empirical need, that we will pick the span overload.

There are issues with that, as I mentioned

These issues do not apply in aggregate. And it would be bad for the ecosystem as a whole if we punished the majority of cases for a tiny subset of cases where there might be a problem. Those tiny subset should be designed to not have an issue here, versus requiring the rest of the ecosystem to try to get the perf that is sensible by default.

CyrusNajmabadi · 2024-05-01T00:06:57Z

An array is a strictly derived type from a span. It has more featueres, it is more usable, and it can be put on the heap

That's a negative. it cannot be put on the stack. The span approach means you can operate on both uniformly. And, in the normal common case, you get the stack benefits across the ecosystem. It is strictly the type that should be preferred. And that is borne out by the teams looking at this and specifically wnating both the most flexibility and the best perf.

Neme12 · 2024-05-01T00:07:07Z

But why would picking Span<T> when a method has an array overload as well be better in any case at all?

TahirAhmadov · 2024-05-01T00:07:27Z

@Neme12 the method knows how it's going to use the parameter, so it (in the overwhelming majority of cases) knows if it needs to allocate from a span (thus negating the performance gains of spans in the first place or even making it worse in some cases). Why don't we use the overload resolution attribute to solve this? Basically, if the span overload knows it's better, it has an attribute with a higher value, and if not, the array/string one will have a higher value.

CyrusNajmabadi · 2024-05-01T00:07:57Z

But why would picking Span when a method has an array overload as well be better in any case at all?

So that the caller doesn't need to heap allocate. Which will often be the case the majority of the time.

TahirAhmadov · 2024-05-01T00:08:41Z

So that the caller doesn't need to heap allocate. Which will often be the case the majority of the time.

To clarify, I think she is asking why we would call the span overload if you already have an array.

Neme12 · 2024-05-01T00:08:52Z

So that the caller doesn't need to heap allocate. Which will often be the case the majority of the time.

But as I said, it only allocates implicitly for params. Otherwise, when you do already have an array and a method has overloads for both array and span, it's completely fine to call the array one.

CyrusNajmabadi · 2024-05-01T00:09:01Z

you cannot put it on the heap

The vast vast vast majority of cases do not do this. And they would not do this either if they got an array (as the data could change out from underneath them). So a copy needs to happen in the array case as well here. So spans are no worse.

CyrusNajmabadi · 2024-05-01T00:09:31Z

Otherwise, when you do already have an array and a method has overloads for both array and span, it's completely fine to call the array one.

Which is exactly what the lang/compiler would do as that would be an identity conversion.

TahirAhmadov · 2024-05-01T00:10:15Z

Basically, the span overload would only be preferred if it's a collection expression and we target type it to span?

Neme12 · 2024-05-01T00:10:33Z

So why would picking a Span<T> overload be better than an existing array overload in any case other than params where the language otherwise implicitly allocates an array?

CyrusNajmabadi · 2024-05-01T00:10:38Z

Basically, the span overload would only be preferred if it's a collection expression

No. It would also be preferred for things like params.

TahirAhmadov · 2024-05-01T00:11:03Z

No. It would also be preferred for things like params.

Right, I didn't mention it because @Neme12 already said it.

CyrusNajmabadi · 2024-05-01T00:11:54Z

Ok. So, as i've been saying. We prefer span because it is better when a non-identity conversion is involved. If you have an array, and it's an identity conversion, then identity conversion rules continue to apply. :)

Neme12 · 2024-05-01T00:13:08Z

Oh sorry, I misunderstood the proposal then. I should have read all of the detailed design before commenting.

CyrusNajmabadi · 2024-05-01T00:13:36Z

i thnk you guys may be confusing what an identity conversion is versus the 'implicit span conversion' that fred is talking about adding. That's a implicit widening conversion. not an identity conversion. An identity conversion si still first priority in the conversion list.

CyrusNajmabadi · 2024-05-01T00:13:42Z

no worries :)

Neme12 · 2024-05-01T00:18:16Z

I agree it definitely makes sense to prefer Span in cases like collection expressions, where if it called the array one instead, it would have to allocate an array/list/whatever the natural type ends up being.

Neme12 · 2024-05-01T00:24:43Z

However, the language has held these types at arm's length in a few key ways,
which makes it hard to express the intent of APIs and leads to a significant amount of surface area duplication for new APIs. For example, the BCL has added a number of new
tensor primitive APIs in .NET 9, but these APIs are all offered on ReadOnlySpan<T>. Because C# doesn't recognize the
relationship between ReadOnlySpan<T>, Span<T>, and T[], it means that any developers looking to use those APIs with anything other than a ReadOnlySpan<T> have to explicitly
convert to a ReadOnlySpan<T>.

I'm still not understanding this part of the motivation though. When I look at the linked API proposal, I don't see any duplication. And why would a developer have to convert to ReadOnlySpan<T> explicitly when they have something like an array, and an implicit conversion exists? Or were these supposed to be extension methods originally?

333fred · 2024-05-01T00:43:26Z

When I look at the linked API proposal, I don't see any duplication.

Because they're counting on this proposal to come in and not force them to duplicate 🙂.

And why would a developer have to convert to ReadOnlySpan explicitly when they have something like an array, and an implicit conversion exists?

Because the implicit conversion is a user-defined conversion, which doesn't carry through in a number of places, including type inference for these highly-generic cases.

Neme12 · 2024-05-01T00:46:36Z

Ohh, I see what the issue is now (the T cannot be inferred), thanks.

My concerns were about making spans to be considered more specific than e.g. arrays with respect to conversions, as opposed to the opposite, like today. Although now than I'm thinking about it, if a type has implicit conversions to both arrays and spans, the span ones could indeed be better as they should be non-allocating (at least that's the assumption for implicit conversions to spans), and could return a span directly over a struct field.

But I'd still definitely want to avoid a scenario where given an existing array and method overloads for both arrays and spans, the span one is picked anyway even though I have an exact type.

333fred added Proposal Proposal champion labels Feb 3, 2024

333fred assigned 333fred and stephentoub Feb 3, 2024

jjonescz mentioned this issue May 13, 2024

Test plan for "First-class Span Types" dotnet/roslyn#73445

Closed

64 tasks

333fred added this to the Working Set milestone Jun 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Proposal]: First-Class Span Types #7905

[Proposal]: First-Class Span Types #7905

333fred commented Feb 3, 2024 •

edited

Loading

TahirAhmadov commented Feb 5, 2024

333fred commented Feb 5, 2024

TahirAhmadov commented Feb 5, 2024

333fred commented Feb 20, 2024

KalleOlaviNiemitalo commented Mar 8, 2024

KalleOlaviNiemitalo commented Mar 8, 2024

333fred commented Mar 8, 2024

KalleOlaviNiemitalo commented Mar 8, 2024

Neme12 commented May 1, 2024 •

edited

Loading

CyrusNajmabadi commented May 1, 2024

CyrusNajmabadi commented May 1, 2024

Neme12 commented May 1, 2024

TahirAhmadov commented May 1, 2024

CyrusNajmabadi commented May 1, 2024

TahirAhmadov commented May 1, 2024

Neme12 commented May 1, 2024

CyrusNajmabadi commented May 1, 2024

CyrusNajmabadi commented May 1, 2024

TahirAhmadov commented May 1, 2024

Neme12 commented May 1, 2024

CyrusNajmabadi commented May 1, 2024

TahirAhmadov commented May 1, 2024

CyrusNajmabadi commented May 1, 2024 •

edited

Loading

Neme12 commented May 1, 2024

CyrusNajmabadi commented May 1, 2024

CyrusNajmabadi commented May 1, 2024

Neme12 commented May 1, 2024

Neme12 commented May 1, 2024 •

edited

Loading

333fred commented May 1, 2024

Neme12 commented May 1, 2024 •

edited

Loading

[Proposal]: First-Class Span Types #7905

[Proposal]: First-Class Span Types #7905

Comments

333fred commented Feb 3, 2024 • edited Loading

First-Class Span Types

Summary

Motivation

Detailed Design

Design meetings

TahirAhmadov commented Feb 5, 2024

333fred commented Feb 5, 2024

TahirAhmadov commented Feb 5, 2024

333fred commented Feb 20, 2024

KalleOlaviNiemitalo commented Mar 8, 2024

KalleOlaviNiemitalo commented Mar 8, 2024

333fred commented Mar 8, 2024

KalleOlaviNiemitalo commented Mar 8, 2024

Neme12 commented May 1, 2024 • edited Loading

CyrusNajmabadi commented May 1, 2024

CyrusNajmabadi commented May 1, 2024

Neme12 commented May 1, 2024

TahirAhmadov commented May 1, 2024

CyrusNajmabadi commented May 1, 2024

TahirAhmadov commented May 1, 2024

Neme12 commented May 1, 2024

CyrusNajmabadi commented May 1, 2024

CyrusNajmabadi commented May 1, 2024

TahirAhmadov commented May 1, 2024

Neme12 commented May 1, 2024

CyrusNajmabadi commented May 1, 2024

TahirAhmadov commented May 1, 2024

CyrusNajmabadi commented May 1, 2024 • edited Loading

Neme12 commented May 1, 2024

CyrusNajmabadi commented May 1, 2024

CyrusNajmabadi commented May 1, 2024

Neme12 commented May 1, 2024

Neme12 commented May 1, 2024 • edited Loading

333fred commented May 1, 2024

Neme12 commented May 1, 2024 • edited Loading

333fred commented Feb 3, 2024 •

edited

Loading

Neme12 commented May 1, 2024 •

edited

Loading

CyrusNajmabadi commented May 1, 2024 •

edited

Loading

Neme12 commented May 1, 2024 •

edited

Loading

Neme12 commented May 1, 2024 •

edited

Loading