Borrows in generic code can violate uniqueness constraints #713

yorickpeterse · 2024-05-17T01:24:18Z

Please describe the bug

When using generic code, it's possible to borrow a generic type parameter that ends up being assigned a uni T value, which then allows the borrow to outlive the unique value and violate the uniqueness constraints.

Please list the exact steps necessary to reproduce the bug

Consider this code:

class Example[T] {
  let @value: Option[T]
  let @borrow: Option[ref T]

  fn mut set(value: T) {
    @borrow = Option.Some(ref value)
    @value = Option.Some(value)
  }
}

class async Main {
  fn async main {
    let ex = Example(value: Option.None, borrow: Option.None)

    ex.set(recover [10, 20])

    let x = ex.value := Option.None
  }
}

Here ex ends up being of type Example[uni Array[Int]], with the borrow field storing a ref Array[Int], violating the constraints. We can then exploit that using ex.value := Option.None, giving us an Option[uni Array[Int]] while the Example value still stores a borrow to the uni Array[Int].

I'm not sure yet what the solution here is. If we introduce something similar to the mut constraint for type parameters to allow unique values, then we basically end up not being able to use them anywhere or we won't be able to use borrows. Basically it ends up being a massive pain.

Operating system

Fedora 40

Inko version

main

Rust version

1.77.2

The text was updated successfully, but these errors were encountered:

yorickpeterse · 2024-05-17T01:27:28Z

In theory something like this would work: when type-checking a method body, we set some sort of "borrowed" flag for type parameters whenever they're borrowed, either directly or by passing it to an argument of another method that borrows it. When we then assign a uni T to such a type parameter, we disallow this at the call site.

The problem with this approach is that it requires at least two passes over all method bodies: one to set the flag for direct borrows (e.g. through ref x expressions), and one to set the flag for indirect borrows. I'm also not sure this analysis would always be complete.

yorickpeterse · 2024-06-04T16:06:51Z

Here's another way this can be triggered: if you have some uni T and do that_value.some_field.iter, the returned Iter produces values of type ref Something, allowing you to violate the uniqueness constraints by storing those ref values somewhere.

yorickpeterse · 2024-06-07T15:33:23Z

Another similar case: we allow pattern matching against e.g. Option[uni ref Something], which then lets you introduce a borrow by binding the uni ref Something to a match variable:

match something {
  case Some(v) -> {
   move_unique_value_to_another_process(unique_value)
   do_something_with_the_uni_ref_we_still_have(v)
  }
  ...
}

To prevent this from happening we have to disallow binding uni ref T and uni mut T values to match variables, similar to how you can't assign them to locals.

yorickpeterse · 2024-08-16T15:08:30Z

To see how many places there are where we explicitly use ref x / mut x with x being a type parameter, I modified the compiler as follows:

diff --git a/compiler/src/type_check/expressions.rs b/compiler/src/type_check/expressions.rs
index 6573bb59..31567c34 100644
--- a/compiler/src/type_check/expressions.rs
+++ b/compiler/src/type_check/expressions.rs
@@ -3069,6 +3069,16 @@ impl<'a> CheckMethodBody<'a> {
     ) -> TypeRef {
         let expr = self.expression(&mut node.value, scope);
 
+        // TODO: remove
+        if expr.is_type_parameter(self.db()) {
+            self.state.diagnostics.warn(
+                DiagnosticId::InvalidType,
+                "ref with type parameter that might violate uni T",
+                self.file(),
+                node.location.clone(),
+            );
+        }
+
         if !expr.allow_as_ref(self.db()) {
             self.state.diagnostics.error(
                 DiagnosticId::InvalidType,
@@ -3112,6 +3122,16 @@ impl<'a> CheckMethodBody<'a> {
 
         let expr = self.expression(&mut node.value, scope);
 
+        // TODO: remove
+        if expr.is_type_parameter(self.db()) {
+            self.state.diagnostics.warn(
+                DiagnosticId::InvalidType,
+                "mut with type parameter that might violate uni T",
+                self.file(),
+                node.location.clone(),
+            );
+        }
+
         if !expr.allow_as_mut(self.db()) {
             self.state.diagnostics.error(
                 DiagnosticId::InvalidType,

Running the test suite then produces the following warnings:

src/std/array.inko:135:16 warning(invalid-type): ref with type parameter that might violate uni T
src/std/array.inko:635:15 warning(invalid-type): ref with type parameter that might violate uni T
src/std/array.inko:726:15 warning(invalid-type): mut with type parameter that might violate uni T
src/std/iter.inko:275:20 warning(invalid-type): ref with type parameter that might violate uni T

I was expected a lot more warnings, so perhaps requiring the equivalent of T: ref to allow borrowing might not be that bad after all.

A challenge here is when we deal with closures and captures. Take this code for example:

import std.stdio (STDOUT)
import std.string (ToString)

class Person {
  let @name: String
}

impl ToString for Person {
  fn pub to_string -> String {
    @name
  }
}

fn example[T: ToString](value: T) {
  fn {
    let out = STDOUT.new

    out.print(value.to_string)
    nil
  }
    .call
}

class async Main {
  fn async main {
    let alice = recover Person(name: 'Alice')

    example(alice)
  }
}

Here the closure captures value by borrowing, violating the uniqueness constraints of alice in the process. Requiring T: ref just to be able to capture it is likely going to make working with closures and generics an absolute pain in the ass, as pretty much every such case will need T: ref and through that forbid the use of uni T values.

yorickpeterse · 2024-08-16T15:10:22Z

Pony's approach to this is to introduce a variety of reference capabilities and ways to convert between them. However, the resulting complexity of this makes the language difficult to use, which is likely why it never really caught on.

yorickpeterse · 2024-08-16T15:42:27Z

We could potentially remove the need for explicit T: mut, T: move, and T: ref through inference. The rules would (roughly) be as follows:

Given a generic type T:

If it's passed to a ref, used as the value for a ref expression
If it's passed to a mut, used as the value for a mut expression, is the receiver of an fn mut method, or is captured by an fn closure, it's inferred as T: mut
If it's the receiver of an fn move, it's inferred as T: move

Closures here complicate things: since they may mutate the captured data we can't infer T: ref and instead must infer T: mut, which may be annoying. The actual inference in turn would require multiple passes over all methods, as inferring the rules for method B may affect its caller A.

yorickpeterse · 2024-08-16T17:36:21Z

If we go down the path of inference, we can avoid using multiple passes using an approach like this:

We maintain a map of Type Parameter => [Type Parameters] where the key is a type parameter used in an argument, and the values the type parameters passed to that argument. When inferring rules for a type parameter that's a key, we propagate those to the type parameters in the value array. So given code like this:

fn foo[A](a: A) {
  bar(a)
}

fn bar[B](b: B) {
  baz(b)
}

fn baz[C](c: C) {
  mut c
}

The mapping would be:

C => [B]
B => [A]

When processing mut c we note we're operating on C and flag it as C: mut. We then look it up in the map and find [B]. We then iterate over those values, flag them accordingly, and repeat this process until we run out of work.

The downside is that this won't work reliable when dealing with type parameters inside generic types, such as this:

fn foo[A](a: A) {
  bar(a)
}

fn bar[B](b: B) {
  baz([b])
}

fn baz[C](c: Array[C]) {
  mut c.pop.get
}

Here the issue is that we don't pass B directly to c. This means we'd have to traverse the type graph and build the mapping as we go (e.g. as part of type checking).

The other approach is to process methods multiple times and record method dependencies. That is, when we encounter a call to a method we've yet to process, we store the caller as a dependency of the callee (e.g. bar => [foo] in the above example). When we finish processing, we check for any methods that depend on it and reschedule them, removing them from the dependency list in the process. This requires some form of caching such that we can skip work in a rescheduled method (e.g. foo) that we've already performed.

yorickpeterse · 2024-08-16T20:52:52Z

I read through Functional Ownership through Fractional Uniqueness hoping it would perhaps provide some extra insight, but alas it felt like a paper with a lot of words but not a lot of actually useful content, though perhaps I just don't grok it.

yorickpeterse · 2024-08-16T21:28:04Z

Also read through Flexible recovery of uniqueness and immutability, but it basically just describes the Pony model using different terms and some extra fluff.

yorickpeterse · 2024-12-02T22:49:54Z

We'll need to fix this for #780 if we want to have any chance at allocating data on the stack, as doing so will introduce another scenario in which we may violate uni T constraints.

One issue with adding T: ref is that it results in T itself being rather useless, because you can only move it around. Not even fn methods could be used, because they too might borrow their receiver for a duration longer than the call.

We can handle most cases by disallowing uni ref T / uni mut T from appearing in argument and return types as a result of inference/type arguments. This however leaves us with a case where a method mutates its receiver in such a way that it stores another alias to an uni T value. This is something we can't detect on the caller's side.

yorickpeterse · 2024-12-02T23:48:10Z

For inference, I think we can take a simpler approach:

We add support for T: ref similar to T: mut and T: inline.
When a type parameter T defines one or more trait requirements that in turn define an fn or fn mut method, we automatically infer T as T: ref
We can do the same for T: mut if a trait requirement defines one or more fn mut methods
For the few cases where we want to borrow some T without additional requirements, one can use impl X if T: ref { ... }. Such cases will be rare

The nice thing about this approach is that we should be able to cut down the amount of explicit T: mut instances, as e.g. T: mut + Read can now be collapsed into T: Read, because Read defines fn mut read and thus implies T: mut.

yorickpeterse · 2024-12-03T01:41:00Z

Some additional observations:

If a type ends up containing uni ref T or uni mut T values as a result of inference, we can detect and deny such types from being constructed in the first place
If a method returns an uni ref T or uni mut T (or something wrapping such a type), we can detect this easily and disallow the call
If a uni ref T/uni mut T escapes through an argument (i.e. it's pushed into an Array[uni ref Whatever]), the first observation should protect against this

These checks would need to be performed after type checking a method body, such that all types in the body are complete.

What this doesn't guard against is aliasing inside the method body though, i.e. something like this:

impl Array {
  fn example(arg: Something) {
    get(0).whatever(arg)
  } 
}

This however might in fact be fine: based on observation 1, if Something were to store an uni ref T we'd notice it by checking constructed types at the end of a method. We don't need to worry about the method spawning a process either, because ref T and mut T values aren't sendable, meaning we don't have to worry about sending an uni ref T to another process by accident.

The downside is that for each call we need to recursively check all argument and return types for the presence of a uni ref T or uni mut T. This is likely to be more expensive compared to checking if a type meets the T: ref requirement. With that said, we already sort of pay that cost for checking if types are fully inferred when lowering to MIR, so we might be able to just reuse that traversal.

yorickpeterse added bug Defects, unintended behaviour, etc compiler Changes related to the compiler labels May 17, 2024

yorickpeterse added this to the 0.17.0 milestone Aug 16, 2024

yorickpeterse self-assigned this Aug 16, 2024

yorickpeterse modified the milestones: 0.17.0, 0.18.0 Sep 9, 2024

yorickpeterse removed their assignment Oct 29, 2024

yorickpeterse modified the milestones: 0.18.0, 0.19.0 Oct 29, 2024

yorickpeterse mentioned this issue Oct 30, 2024

Automatically recover unique receivers if unsendable return types aren't used #612

Open

yorickpeterse modified the milestones: 0.19.0, 0.18.0 Dec 2, 2024

yorickpeterse mentioned this issue Dec 2, 2024

Make enums and tuples stack allocated types #780

Closed

yorickpeterse closed this as completed in 26070da Dec 3, 2024

yorickpeterse self-assigned this Dec 3, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Borrows in generic code can violate uniqueness constraints #713

Borrows in generic code can violate uniqueness constraints #713

yorickpeterse commented May 17, 2024 •

edited

Loading

yorickpeterse commented May 17, 2024

yorickpeterse commented Jun 4, 2024

yorickpeterse commented Jun 7, 2024

yorickpeterse commented Aug 16, 2024

yorickpeterse commented Aug 16, 2024

yorickpeterse commented Aug 16, 2024

yorickpeterse commented Aug 16, 2024

yorickpeterse commented Aug 16, 2024

yorickpeterse commented Aug 16, 2024

yorickpeterse commented Dec 2, 2024

yorickpeterse commented Dec 2, 2024 •

edited

Loading

yorickpeterse commented Dec 3, 2024 •

edited

Loading

Borrows in generic code can violate uniqueness constraints #713

Borrows in generic code can violate uniqueness constraints #713

Comments

yorickpeterse commented May 17, 2024 • edited Loading

Please describe the bug

Please list the exact steps necessary to reproduce the bug

Operating system

Inko version

Rust version

yorickpeterse commented May 17, 2024

yorickpeterse commented Jun 4, 2024

yorickpeterse commented Jun 7, 2024

yorickpeterse commented Aug 16, 2024

yorickpeterse commented Aug 16, 2024

yorickpeterse commented Aug 16, 2024

yorickpeterse commented Aug 16, 2024

yorickpeterse commented Aug 16, 2024

yorickpeterse commented Aug 16, 2024

yorickpeterse commented Dec 2, 2024

yorickpeterse commented Dec 2, 2024 • edited Loading

yorickpeterse commented Dec 3, 2024 • edited Loading

yorickpeterse commented May 17, 2024 •

edited

Loading

yorickpeterse commented Dec 2, 2024 •

edited

Loading

yorickpeterse commented Dec 3, 2024 •

edited

Loading