-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[red-knot] Ensure differently ordered unions and intersections are considered equivalent #15516
base: main
Are you sure you want to change the base?
Conversation
|
1% regression on codspeed. I'll wait for review comments before trying to optimise this |
@@ -105,7 +105,7 @@ def f1( | |||
from typing import Literal | |||
|
|||
def f(v: Literal["a", r"b", b"c", "d" "e", "\N{LATIN SMALL LETTER F}", "\x67", """h"""]): | |||
reveal_type(v) # revealed: Literal["a", "b", b"c", "de", "f", "g", "h"] | |||
reveal_type(v) # revealed: Literal["a", "b", "de", "f", "g", "h", b"c"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think there's value in preserving the source order. E.g I would find this confusing as a user. The same in the example above. It's not the end of the world, but it is irritating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I wouldn't want something like int | str | None
to be reordered to int | None | str
in the output of the type checker (just an example with case-ignoring alphabetical order, not sure if that's the actual order we would choose here).
The idea in this comment was that we could potentially keep a second list inside Union
and Intersection
structs that would keep track of this order — if sensible to do so. As in: we could always opt-out if we need to perform complex operations on unions/intersections, but we would try to preserve it on a best-effort basis.
I'm imagining something like this: When a user specifies the type int | str | None
in an annotation, we would create an internal union type that conceptually looks like this:
Union {
elements: [int, None, str],
user_order: Some([0, 2, 1])
}
The idea would be that we could usually ignore this user_order
field, but then try to reconstruct the original order in the rare case that we need to display a type to the user.
It might be fine as a first solution to switch to user_order: None
as soon as we perform any sort of operation on the union type. But I think we could even try to do better by creating new user_order
s in a limited set of operations. For example, when joining two unions (A | B) | (C | D)
, which would both have user_order: Some([0, 1])
, we could probably join these by adding max(lhs.user_order) + 1
to all elements on the right hand side:
Union {
elements: [A, B, C, D],
user_order: Some([0, 1, 2, 3])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea in this comment was that we could potentially keep a second list inside
Union
andIntersection
structs that would keep track of this order — if sensible to do so. As in: we could always opt-out if we need to perform complex operations on unions/intersections, but we would try to preserve it on a best-effort basis.
One issue I see with this approach -- that also exists in our current approach on main
-- is that it means that our incremental type-checking is very sensitive to the exact source order of union elements. E.g. if somebody made this change in their source code
- x: None | int
+ x: int | None
then that might trigger a lot of otherwise-unnecessary recomputation. The type before the change would not compare equal with the type after the change, even though it represents the exact same set of union elements, because the type inferred for x
would contain a record of the source-order of elements. It will also result in higher memory usage for us.
One idea I've had in the past is that types could have an Option<(File, TextRange)>
field -- Type::display
could just go back to the original source code and get the exact original representation of the type. This would be elegant in some ways because it also solves tricky problems such as this:
type T = int | str | memoryview | Literal["foo", "bar", "baz", "spam"]
def f(x: T):
reveal_type(x) # ideally we'd say "Revealed type is `T`"
# rather than saying "Revealed type is `int | str | memoryview | Literal["foo", "bar", "baz", "spam"]`"
But carrying a TextRange
attribute around would make our incremental type-checking even more sensitive to trivia such as the exact amount of whitespace prior to the type definition, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then that might trigger a lot of otherwise-unnecessary recomputation. The type before the change would not compare equal with the type after the change, even though it represents the exact same set of union elements, because the type inferred for x would contain a record of the source-order of elements.
I'm not too concerned about this. We need to do well at dealing with this because it's no different from when the user makes an actual meaning full change.
But carrying a TextRange attribute around would make our incremental type-checking even more sensitive to trivia such as the exact amount of whitespace prior to the type definition, etc.
I consider this a problem because it means any change will invalidate any public type in that module, unless we hide it behind a salsa interned but that has other costs. It also has the downside that types with a range are displayed differently from types without a range (e.g. it doesn't join Literal
values etc).
Have we looked into how other type checkers (TypeScript, flow, Pyright, mypy) handle and represent union types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems TypeScript
carries the origin type along
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also seems that preserving the original notation has advantages other than just to display a union
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then that might trigger a lot of otherwise-unnecessary recomputation. The type before the change would not compare equal with the type after the change, even though it represents the exact same set of union elements, because the type inferred for x would contain a record of the source-order of elements.
I'm not too concerned about this. We need to do well at dealing with this because it's no different from when the user makes an actual meaning full change.
But carrying a TextRange attribute around would make our incremental type-checking even more sensitive to trivia such as the exact amount of whitespace prior to the type definition, etc.
I consider this a problem because it means any change will invalidate any public type in that module, unless we hide it behind a salsa interned but that has other costs. It also has the downside that types with a range are displayed differently from types without a range (e.g. it doesn't join
Literal
values etc).
Fair enough. It sounds like maintaining a separate boxed array recording the source order would be the way to go here, then, like @sharkdp suggests.
It also has the downside that types with a range are displayed differently from types without a range (e.g. it doesn't join
Literal
values etc).
Hmm, but I thought that this was what you were arguing we wanted -- I thought you were arguing that we should display the type as the user presented it, without any simplifications, if the type comes directly from a user annotation.
Have we looked into how other type checkers (TypeScript, flow, Pyright, mypy) handle and represent union types?
Mypy does not eagerly simplify unions at all; it only lazily simplifies them. This means that it has to do a lot of recomputation that for us is unnecessary. Pyright does some simplification (it deduplicates elements) but not nearly as much as we do (it doesn't remove a subtype from a union if the supertype is present). Even when it removes duplicate elements, it maintains order among the other elements (which we don't do because we use swap_remove()
): https://pyright-play.net/?code=GYJw9gtgBALgngBwJYDsDmUkQWEMogCmAboQIYA2A%2BvAoQFCMAmhwUwAFAB4BcmK%2BAD5QAzjBBRhqIVABGYMBUlQIhCLjjEkhAO7LgFMGRnTlFJGIDaYkAF1lTJAGMY18QBp%2BMe8JgBXBApCSw4ASltQnnooGIIScmpaQm5QoA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also seems that preserving the original notation has advantages other than just to display a union
microsoft/TypeScript@
4a18b5c
/src/compiler/checker.ts#L22445-L22458
How did you obtain that link?? When I click on it, GitHub just shows me this 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I open it in VS code / Rust Rover and then do Ctrl+P Copy link
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think we probably should continue to make an effort to not reorder user-authored types that we haven't simplified (and that it's ok to reorder in simplifying.)
As far as how we do this, I spent some time thinking about whether there could be solution overlap with the more general case of showing users a nice short type alias name instead of the complex type it aliases, which I think we will also want to do, as in Alex's example above. The proposal of carrying around a TextRange
doesn't work, as discussed, because it is too sensitive to unrelated source-code changes. I'm not really coming up with any good unified solution here, though; I don't think we want every union/intersection type carrying around a string, for example. So it may be best to go with separate solutions here; a user_order
field for unions and intersections, and something else (maybe something like a lookup table mapping types to their known aliases in a file?) for the type alias case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great write up with a lot of context. Thank you.
this would make it difficult to write mdtests,
I don't think this is true. The order is predictable for as long as no multi-threading is involved.
astly, this doesn't solve all problems with Type::is_equivalent_to() that we'll ever encounter.
I'd like to understand this more. What makes Protocols
different from what we have today? How would sorting the types help?
I do remain concerned about reordering types, especially if that ordering is non-obvious and, therefore, appears to be arbitrary for users. I often write my TypeScript types with a specific order in mind, e.g. group semantically relevant types first and I'd find it rather annoying if the extension removes that ordering when hovering over the type.
I think this is different from making other simplifications because it sort of teaches me how I could have written my type instead, although I have slight reservations there as well.
Oh, interesting; I didn't think that through. And we don't use multi-threading in red-knot as invoked by I still think it's useful generally for the display of types to be stable between red-knot runs. Among plausible benefits to this, I can think of:
Well, it didn't help that I forgot to actually write up the example in my initial PR description 🙃 I've edited it in now. The difference with from typing import Protocol
class Nominal:
x: int
class Structural(Protocol):
x: int
class Unrelated:
x: int Going back to the example I've edited into my PR description: the types def _(obj: Q):
reveal_type(obj) # revealed: P So when we get to protocols (and
I know what you mean, but our existing simplifications already frequently perturb the order of unions and intersections because of our use of |
I think Micha was trying to say that reorderings are fine if any simplifications are performed. But the default case will be that users write union-type annotations that can not be simplified, and maybe we should try a bit harder to make the UX in this case better. It's great that we can simplify (don't get me wrong, I am very much excited about this PR 👍) |
I don't know about that. I've seen a lot of |
Thank you! And don't get me wrong -- I'm not wedded to the approach here! There are some really tricky tradeoffs to think through. Even if we don't go with this approach, I don't consider this PR to be time wasted, because of the number of bugs I discoverered along the way:
|
Thanks for this! I do think it's likely that even with more complex equivalence cases, this can make equivalence a lot cheaper to implement, as long as our equivalent-types reliably end up sorted near each other. (As a side note, I'm not sure it will ever be necessary, or even an improvement in practice, that we consider two different Protocol types such as |
I guess in terms of next steps, if we are in agreement on the above discussion of user order preservation, the question would be whether we land this as-is and introduce user-order preservation as a follow-up, or add user-order preservation to this PR. I think probably the latter is better, as otherwise it'll be a lot more churn and re-churn to the mdtest output? |
Right, though that would mean we'd have to abandon this property test if we decided not to recognise the equivalence there -- we'd have two types which would be mutual subtypes but would not be equivalent: ruff/crates/red_knot_python_semantic/src/types/property_tests.rs Lines 360 to 365 in e2da33a
(This PR promotes that property test to stable) |
Yeah, after more discussion in Discord I think we would want to simplify |
@AlexWaygood and I discussed today in 1:1 that deterministically sorting union/intersection elements and also tracking a |
Just one additional thought: The other scenario where the always-ordered canonical representation (proposed in this PR) would help would be during construction of unions and intersections, i.e. in the |
Often quite a large problem! See #13549 for a summary of some of the many problems mypy and pyright have had with large unions of the years |
No, that's a good point. I suspect that we should probably aim for "simplest implementation that gets the semantics right and isn't obviously inefficient" initially, and tackle deeper optimization as its own project, once we are faced with real-world projects with large-union problems (or at least some projects we can use as benchmarks that make heavier use of unions than tomllib.) |
We could also do the sorting lazily and cache the result in a separate field that's not part of eq using internal mutability |
c37f8d5
to
0034a03
Compare
(Type::Instance(instance), Type::SubclassOf(subclass)) | ||
| (Type::SubclassOf(subclass), Type::Instance(instance)) => { | ||
let Some(base_class) = subclass.subclass_of().into_class() else { | ||
return false; | ||
}; | ||
|
||
instance.class.is_known(db, KnownClass::Type) | ||
&& base_class.is_known(db, KnownClass::Object) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that this branch was redundant, since we now eagerly normalize type[object]
to type
. But I could separate this out into another PR, if we prefer that
(Type::StringLiteral(left), Type::StringLiteral(right)) => { | ||
left.value(db).cmp(right.value(db)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now that the ordering is no longer user-facing, I'm not sure that Salsa db lookups like this in the ordering algorithm make sense anymore. I think it's still useful to make sure that all the StringLiteral
elements are next to each other in the elements list, but I don't think it matters anymore whether Literal["bar"]
comes before Literal["foo"]
. So maybe I should just sort by the Salsa IDs here, which would be cheaper?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented this in 64351be
b022cf7
to
64351be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This now is basically equivalent to what we'd get if we just slapped #[derive(PartialOrd, Ord)]
on Type
, KnownInstanceType
, SubclassOfType
, ClassLiteralType
, etc. So maybe we should just do that at this point -- it would certainly be less code. But it still doesn't make sense semantically (to me) for them to implement Ord
. There are many different ways you could plausibly order types; this is only one possible (arbitrary) ordering
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also define your own trait UnionElementOrd
and implement your own derive macro. But that seems painful too :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably with what you have now and you can create a (help wanted?) issue that adds a ArbitraryOrd
(or similar) trait with a derive macro. Seems like a fun exercise for someone interested in Rust.
@@ -4204,6 +4211,18 @@ impl<'db> UnionType<'db> { | |||
) -> Type<'db> { | |||
Self::from_elements(db, self.elements(db).iter().map(transform_fn)) | |||
} | |||
|
|||
#[salsa::tracked] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you measure the performance impact of caching the sorted unions?
I'm asking because it's unclear to me if caching it always is worth it. E.g. sorting a union with only a few elements is super cheap. So maybe it's only worth calling into the cached variant after a certain threshold?
It might also be worth to test first if the union elements aren't sorted already
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a commit shortly before your review to see what happens to the benchmarks if we don't cache them. Though I'm still worried about what will happen if we encounter very large unions, which do occur regularly in popular Python libraries such as pydantic (see #13549), even if they don't happen much in tomllib
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The benchmarks do seem to be much improved without the caching (codspeed is now neutral). So I guess it makes sense to keep these uncached for now until we benchmark how red-knot performs on projects with catastrophically huge unions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might also be worth to test first if the union elements aren't sorted already
it would be easier if we upgraded our MSRV to 1.82 :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't be too hard to implement ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in 1e47693 :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The version where we first check to see whether it is already sorted before attempting to sort it does (I think) seem to be doing slightly worse on the benchmarks? But it's all within the noise range; I might be looking for signal here in something that's actually random variation.
(_, Type::AlwaysFalsy) => Ordering::Greater, | ||
|
||
(Type::KnownInstance(left_instance), Type::KnownInstance(right_instance)) => { | ||
match (left_instance, right_instance) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think implementing PartialOrd
for KnownInstance
should be fine. I agree that Type
itself should not implement Ord
. It would drastically reduce the code needed here (and I'd probably to the same for other types, e.g. `ClassBase)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think implementing
PartialOrd
forKnownInstance
should be fine.
Hmm, why do you think it's okay for KnownInstanceType
but not for Type
? The two enums seem pretty similar to me in terms of whether it makes sense for them to implement Ord
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I consider Type
as the main public interface. I agree that probably the same argument applies as for Type
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kinda feel inclined to either add it to Type
and all its wrapped enums/structs or none of them. Writing it all out like this is a lot of boilerplate code, but it's not too hard to review it for correctness (it's all quite simple). But maybe it would be better to just write a macro here...
crates/red_knot_python_semantic/resources/mdtest/type_properties/is_equivalent_to.md
Outdated
Show resolved
Hide resolved
static_assert(is_equivalent_to(Intersection[Q, R, Not[P]], Intersection[Not[P], R, Q])) | ||
static_assert(is_equivalent_to(Intersection[Q | R, Not[P | S]], Intersection[Not[S | P], R | Q])) | ||
static_assert(is_equivalent_to(str | None, None | str)) | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I recall correctly, one of the original concerns with the ordering approach was that we would not recognize if two equivalent-but-not-Eq
types like type
and type[object]
would be part of the union.
Can we add a test for this? Like
static_assert(is_equivalent_to(type[object] | P | type, type | P | type[object]))
Or do we already simplify unions with type
and type[object]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No wait, that's not the right test. More like:
static_assert(is_equivalent_to(type[object] | P, P | type))
If we don't recognize that yet, that's also okay. Might still be worth adding the test.
I imagined that we would design the ordering in such a way that those equivalence classes (w.r.t. is_equivalent_to
) would be grouped together. Which would then make it easier to implement the Union(…)
branch in is_equivalent_to
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. So far we've managed to get by here by ensuring that equivalent types are recognised as such by ensuring that only one of the two can ever appear in our internal model. So type[object]
is eagerly simplified to type
(meaning no special logic is necessary to implement the equivalence between the two), bool & AlwaysTruthy
is eagerly simplified to Literal[True]
, etc.. There are some known places where we don't currently do this, but where I'm hopeful that we can, e.g. #15528 and #15513
But there will be types we encounter in the future where we cannot do the eager simplification from one type to the other: Protocol
s, TypedDict
s, type aliases, there may be others.
So this is a good reason to keep the manual ordering rather than deriving #[PartialOrd, Ord]
(links to the conversation I was having with @MichaReiser at #15516 (comment)), because we can see and control exactly how the types are ordered. But it makes it hard to test right now, because we haven't got that far yet in our implementation of the type system!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think one necessarily precludes the other. We can automatically derive ordering for some types but manually implement ord where necessary (or our custom ordering trait)
I think this should be ready for another round of review now! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me! I'm really not too concerned about custom trait vs macro vs explicit match; these are surface code organization issues that are easy to change later if we decide we don't like what we have.
); | ||
|
||
// Equal element sets of intersections implies equivalence | ||
// flaky at least in part because of https://github.com/astral-sh/ruff/issues/15508 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The linked issue is closed. Was it closed prematurely, or is it not in fact the cause of this property test being flaky, or is this property test not in fact flaky?
Or did you mean #15528 here?
Summary
Fixes #15380
Currently, we don't understand that
str | int
andint | str
are equivalent types; nor do we understand thatP & T ≡ T & P
. This PR fixes that, by ensuring that we never construct astr | int
union --str | int
is simplified toint | str
during constructions via a sorting algorithm introduced by this PR. (For instance types from the same module, specifically, the algorithm puts the instance types in alphabetical order.) This ensures that the trivial case ofint | str ≡ str | int
can be handled, well, trivially, with our existing logic inType::is_equivalent_to()
. An added benefit of this approach should be reduced memory usage, since we'll only ever storeint | str
in the Salsadb
, neverstr | int
.Fixing #15380 means that we can now promote the
subtype_of_is_antisymmetric
property test from flaky to stable 🎉Why not just implement
Ord
onType
?It would be fairly easy to slap
#[derive(PartialOrd, Ord)]
onType
. However, this would order types according to their Salsa ID. While this would mean that types would always be consistently ordered in any single run of red-knot, the order in which they would appear might vary between different runs of red-knot. Unless we implemented an entirely different order for display purposes, this would make it difficult to write mdtests, and would also be quite confusing for users.Moreover, it doesn't really "make sense" for
Type
to implementOrd
in terms of the semantics. There are many different ways in which you could plausibly sort a list of types; the algorithm implemented in this PR is only one (somewhat arbitrary, at times) possible ordering.Costs of this approach
This approach does have several costs! Firstly, it makes the construction of unions and intersections more expensive. I think this tradeoff is worth it considering the reduced memory usage, and considering that
Type::is_equivalent_to()
does not need to beO(n^2)
for unions and intersections with this change (at least for now). It's also possible that we could mitigate the performance penalty by maintaining the union/intersection elements in sorted order at all times while the union/intersection is being constructed. I haven't looked too hard at this yet, partly because I wanted to keep the diff clean and easy to understand. I also wasn't sure how high a performance cost this might be -- I'm interested to see what codspeed says here!Another cost to this approach is that it might be confusing for users to see unions displayed in a different order to the order the user gave them in annotations. I don't see this as a significant cost, however, as we already perform many simplifications and rearrangements of unions and intersections.
Lastly, this doesn't solve all problems with
Type::is_equivalent_to()
that we'll ever encounter. When we implement protocols, for example, we'll need to understandP
andQ
in the following example as being equivalent types, and we'll need to understandP | None
as being equivalent toQ | None
, etc.But I think even when we get to these more advanced and trickier problems, having union and intersection elements guaranteed to always be in a predictable order should make answering the question of equivalence easier to do with a complexity less than
O(n^2)
.Implementation details
The algorithm for sorting union/intersection elements is placed in a new submodule,
crates/red_knot_python_semantic/src/types/type_ordering.rs
. Other than this algorithm, the only significant changes made are to add a few.sort_by()
calls toUnionBuilder::build()
andInnerIntersectionBuilder::build()
.Test Plan
AlwaysTruthy | bool
is equivalent toAlwaysTruthy | Literal[False]
#15513type & ~Literal[True] & bool
should simplify toNever
#15508QUICKCHECK_TESTS=1000000 cargo test --release -p red_knot_python_semantic -- --ignored types::property_tests::stable
to verify that the stable property tests all pass. I also ranQUICKCHECK_TESTS=1000000 cargo test --release -p red_knot_python_semantic -- --ignored types::property_tests::flaky
and examined the failures for the two added flaky property tests to check whether they looked related to the sorting algorithm; I couldn't find any that were.