Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

salsa 3.0 #490

Merged
Merged
Show file tree
Hide file tree
Changes from 53 commits
Commits
Show all changes
61 commits
Select commit Hold shift + click to select a range
225a81a
update docs to mention durability
nikomatsakis Apr 8, 2024
4533cd9
adopt the Salsa 3.0 `Update`` trait
nikomatsakis Apr 10, 2024
e24ace2
return `&TrackedStructValue<C>` from `new_struct`
nikomatsakis Apr 12, 2024
a320781
separate marking the outputs as verified
nikomatsakis Apr 13, 2024
20cb307
give trait more info about lifetime relationships
nikomatsakis Apr 13, 2024
ea1d452
create a `struct_map` that encapsulates access
nikomatsakis Apr 13, 2024
79d24e0
allow (but don't test) lifetime parameters
nikomatsakis Apr 16, 2024
5ce5e3c
track and assert struct ingredient indices
nikomatsakis Apr 17, 2024
b6311d8
WIP permit 'db on tracked struct definitions (opt)
nikomatsakis Apr 18, 2024
cb1a2bb
Revert "WIP permit 'db on tracked struct definitions (opt)"
nikomatsakis Apr 23, 2024
6e2647f
just take salsa::Id instead of id structs
nikomatsakis Apr 25, 2024
b050bd8
remove Key from Fn configuration
nikomatsakis Apr 26, 2024
44a8a2f
make fn input/value a GAT
nikomatsakis Apr 27, 2024
e95c8b2
give fields a lifetime
nikomatsakis Apr 27, 2024
a84777d
permit `<'db>` on tracked struct
nikomatsakis Apr 27, 2024
fe4ff98
support db lifetimes in fields
nikomatsakis Apr 30, 2024
04e041b
rework debugging to be more permanent
nikomatsakis May 5, 2024
4f74037
pipe debug output through rustfmt
nikomatsakis May 5, 2024
8ba6e60
generate configuration struct in salsa_struct
nikomatsakis May 6, 2024
54c9586
move interned-specific fns out of salsa struct
nikomatsakis May 6, 2024
97fc6a0
rework interning to have a Configuration
nikomatsakis May 6, 2024
3441666
update tests for new error messages
nikomatsakis May 13, 2024
d190beb
introduce helper functions
nikomatsakis May 13, 2024
4822013
permit interned data to take 'db lifetime
nikomatsakis May 13, 2024
d6d5226
have tracked struct intern its own keys
nikomatsakis May 14, 2024
af94b25
debug dump for interned struct tokens
nikomatsakis May 14, 2024
d92f2aa
factor out useful helper fn
nikomatsakis May 15, 2024
5095d79
return a pointer from interning, not just id
nikomatsakis May 15, 2024
0b8c27b
rename from TrackedStruct to just Struct
nikomatsakis May 16, 2024
9d8a60b
parameterize salsa_struct module
nikomatsakis May 16, 2024
9607638
permit interned structs with lifetimes
nikomatsakis May 16, 2024
d361e8a
add a `'db` argument to `SalsaStruct`
nikomatsakis May 17, 2024
8d0f8fc
remove unnecessary uses of AsId
nikomatsakis May 17, 2024
cf2fa67
introduce IdLookup trait
nikomatsakis May 18, 2024
ab70786
introduce LookupId trait
nikomatsakis May 18, 2024
7519c3e
extend IdentityInterner to be based on LookupId
nikomatsakis May 18, 2024
b4b49fb
split the Id conversion traits
nikomatsakis May 18, 2024
56030df
convert a test to use 'db in tracked functions
nikomatsakis May 18, 2024
06b7097
impl Update/Send/Sync
nikomatsakis May 19, 2024
2800076
update to syn 2.0
nikomatsakis May 20, 2024
d98485d
add a derive for salsa::Update
nikomatsakis May 20, 2024
4f4d019
generate a custom `std::fmt::Debug` impl
nikomatsakis May 21, 2024
b005820
add a derive for `DebugWithDb`
nikomatsakis May 21, 2024
1560634
support methods with 'db lifetimes
nikomatsakis May 24, 2024
68502ab
'db all the things
nikomatsakis May 24, 2024
ce88a8f
apply cargo fmt
nikomatsakis May 25, 2024
a7b2805
WIP: temporarily add expanded version of test
nikomatsakis May 26, 2024
81942f3
use Alloc not Box
nikomatsakis May 27, 2024
8c51f37
Revert "WIP: temporarily add expanded version of test"
nikomatsakis May 27, 2024
07d0ead
return a NonNull instead of a `&'db`
nikomatsakis May 30, 2024
88b964d
use `const _: ()` to disable clippy lints
nikomatsakis May 30, 2024
0ad0be8
pacify the merciless clippy
nikomatsakis May 30, 2024
b9ab8fc
rustfmt has opinions
nikomatsakis May 30, 2024
ce750da
allow elided lifetimes in tracked fn return values
nikomatsakis May 30, 2024
5326683
remove "setter" function altogether
nikomatsakis May 30, 2024
f91eeb9
remove dead code
nikomatsakis Jun 11, 2024
c02f30a
remove dead code from interned structs
nikomatsakis Jun 11, 2024
af2c973
rework tutorial a bit to be more up to date
nikomatsakis Jun 11, 2024
bcad24c
add a safety comment on `Update`
nikomatsakis Jun 13, 2024
ab9aa3a
WIP: start writing a safety chapter
nikomatsakis Jun 13, 2024
1544ee9
Apply suggestions from code review
nikomatsakis Jun 15, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions book/src/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
- [Defining the checker](./tutorial/checker.md)
- [Defining the interpreter](./tutorial/interpreter.md)
- [Reference](./reference.md)
- [Durability](./reference/durability.md)
- [Algorithm](./reference/algorithm.md)
- [Common patterns](./common_patterns.md)
- [Selection](./common_patterns/selection.md)
Expand Down
12 changes: 8 additions & 4 deletions book/src/overview.md
Original file line number Diff line number Diff line change
Expand Up @@ -114,9 +114,12 @@ Finally, you can also modify the value of an input field by using the setter met
Since this is modifying the input, the setter takes an `&mut`-reference to the database:

```rust
file.set_contents(&mut db, String::from("fn foo() { /* add a comment */ }"));
file.set_contents(&mut db).to(String::from("fn foo() { /* add a comment */ }"));
```

Note that the setter method `set_contents` returns a "builder".
This gives the ability to set the [durability](./reference/durability.md) and other advanced concepts.
nikomatsakis marked this conversation as resolved.
Show resolved Hide resolved

## Tracked functions

Once you've defined your inputs, the next thing to define are **tracked functions**:
Expand Down Expand Up @@ -147,12 +150,13 @@ Tracked functions can return any clone-able type. A clone is required since, whe

**Tracked structs** are intermediate structs created during your computation.
Like inputs, their fields are stored inside the database, and the struct itself just wraps an id.
Unlike inputs, they can only be created inside a tracked function, and their fields can never change once they are created.
Getter methods are provided to read the fields, but there are no setter methods[^specify]. Example:
Unlike inputs, they can only be created inside a tracked function, and their fields can never change once they are created (until the next revision, at least).
Getter methods are provided to read the fields, but there are no setter methods.
Example:

```rust
#[salsa::tracked]
struct Ast {
struct Ast<'db> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the addition of the db lifetime also allow queries to return data that reference the DB?

One use case that we have is that we need a mapping from AstNode -> Id where Id for example uniquely identifie's a scope, or a symbol in the program.

The challenge we're facing is that our Ast doesn't use Arcs internally, thus cloning a Node always clones the entire sub-tree. Our "work-around" for this is to keep hold to the AST's root (wrapped in an Arc) and store a raw pointer referencing the actual node. This works pretty well, but requires heavy use of Arcs (a lot of cloning). I "think" your changes would allow us to directly store a &'db Expr instead.

If not, then the "work-around" would just be to make the AstNode -> Id map a salsa tracked so that we get access to the db lifetime

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did intend to support storing &T references like that, but it's a subtle case, and I've gone back and forth on whether it works with the stacked borrows rules etc.

Suppose you do let f = ast.field(db) in R1 and it yields a &'db Foo (reference to some field of ast) and then you store that in the database as the result of a query (or part of the result). Now say we start a new revision R2 and, in R2, the value of field changes. This means that f (considered as a pointer) still points to the same memory, but the value behind f has changed. There are two challenges: (a) under the stacked borrow rules, it is UB to use f again; (b) should we consider functions that were dependent on f as needing to be re-executed?

I've tried to write an exploration of this question in this comment like 3 times but it keeps getting unwieldy. I think I will defer it for documentation or in-person discussion, it's a good one. I'm not entirely sure if and under what conditions this can be made to be safe at the moment. =)

That said, I also want to mention a feature I've been considering that I think is may help with your use case. The idea would be to make it easy to have a value that carries a memory arena and references into that arena. This is meant to model things like MIR, where we have some data structure that represents a function, and to allow it to go through phases where it is changed and updated, but without requiring everything to be in vectors nor requiring everything to be cloned constantly. I'm not sure the ergonomics exactly but the idea is roughly that you can declare a struct with two lifetimes...

#[in_arena(AstRoot)]
struct Ast<'ast, 'db: 'ast> {
   data: AstData<'ast, 'db>,
   children: Vec<&'ast Ast<'ast, 'db>>,
}

...and the procedural macro will create a type AstRoot that "hides" the first one:

struct AstRoot<'db> {
    arena: Arc<MemoryArena>,
    root: &'static Ast<'static 'static>, // <-- the lifetimes here are obviously lies
}

Later you can do root.open(|r| { .. }) to work with the data. One of the goals is that you can create new, derived values based on the same arena that have different pointers -- so e.g. it should be possible to extra subvalues from the tree. Each of them would carry a reference count to the same base arena.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, I also want to mention a feature I've been considering that I think is may help with your use case. The idea would be to make it easy to have a value that carries a memory arena and references into that arena.

Yeah, that sounds very similar to our "work-around", except that it is more flexible and the unsafety is handled by salsa instead of sprinkled through our code.

#[return_ref]
top_level_items: Vec<Item>,
}
Expand Down
8 changes: 4 additions & 4 deletions book/src/plumbing/tracked_structs.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,12 +10,12 @@ For a single tracked struct we create multiple ingredients.
The **tracked struct ingredient** is the ingredient created first.
It offers methods to create new instances of the struct and therefore
has unique access to the interner and hashtables used to create the struct id.
It also shares access to a hashtable that stores the `TrackedStructValue` that
It also shares access to a hashtable that stores the `ValueStruct` that
contains the field data.

For each field, we create a **tracked field ingredient** that moderates access
to a particular field. All of these ingredients use that same shared hashtable
to access the `TrackedStructValue` instance for a given id. The `TrackedStructValue`
to access the `ValueStruct` instance for a given id. The `ValueStruct`
contains both the field values but also the revisions when they last changed value.

## Each tracked struct has a globally unique id
Expand All @@ -26,13 +26,13 @@ This will begin by creating a *globally unique, 32-bit id* for the tracked struc
* a u64 hash of the `#[id]` fields;
* a *disambiguator* that makes this hash unique within the current query. i.e., when a query starts executing, it creates an empty map, and the first time a tracked struct with a given hash is created, it gets disambiguator 0. The next one will be given 1, etc.

## Each tracked struct has a `TrackedStructValue` storing its data
## Each tracked struct has a `ValueStruct` storing its data

The struct and field ingredients share access to a hashmap that maps
each field id to a value struct:

```rust,ignore
{{#include ../../../components/salsa-2022/src/tracked_struct.rs:TrackedStructValue}}
{{#include ../../../components/salsa-2022/src/tracked_struct.rs:ValueStruct}}
```

The value struct stores the values of the fields but also the revisions when
Expand Down
13 changes: 13 additions & 0 deletions book/src/reference/durability.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
# Durability

"Durability" is an optimization that can greatly improve the performance of your salsa programs.
Durability specifies the probably that an input's value will change.
nikomatsakis marked this conversation as resolved.
Show resolved Hide resolved
The default is "low durability".
But when you set the value of an input, you can manually specify a higher durability,
typically `Durability::HIGH`.
Salsa tracks when tracked functions only consume values of high durability
and, if no high durability input has changed, it can skip traversing their
dependencies.

Typically "high durability" values are things like data read from the standard library
or other inputs that aren't actively being edited by the end user.
3 changes: 2 additions & 1 deletion components/salsa-2022-macros/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -10,5 +10,6 @@ proc-macro = true
heck = "0.4"
proc-macro2 = "1.0"
quote = "1.0"
syn = { version = "1.0", features = ["full", "extra-traits", "visit-mut"] }
eyre = "0.6.5"
syn = { version = "2.0.64", features = ["visit-mut"] }
synstructure = "0.13.1"
34 changes: 18 additions & 16 deletions components/salsa-2022-macros/src/configuration.rs
Original file line number Diff line number Diff line change
@@ -1,20 +1,22 @@
pub(crate) struct Configuration {
pub(crate) db_lt: syn::Lifetime,
pub(crate) jar_ty: syn::Type,
pub(crate) salsa_struct_ty: syn::Type,
pub(crate) key_ty: syn::Type,
pub(crate) input_ty: syn::Type,
pub(crate) value_ty: syn::Type,
pub(crate) cycle_strategy: CycleRecoveryStrategy,
pub(crate) backdate_fn: syn::ImplItemMethod,
pub(crate) execute_fn: syn::ImplItemMethod,
pub(crate) recover_fn: syn::ImplItemMethod,
pub(crate) backdate_fn: syn::ImplItemFn,
pub(crate) execute_fn: syn::ImplItemFn,
pub(crate) recover_fn: syn::ImplItemFn,
}

impl Configuration {
pub(crate) fn to_impl(&self, self_ty: &syn::Type) -> syn::ItemImpl {
let Configuration {
db_lt,
jar_ty,
salsa_struct_ty,
key_ty,
input_ty,
value_ty,
cycle_strategy,
backdate_fn,
Expand All @@ -24,9 +26,9 @@ impl Configuration {
parse_quote! {
impl salsa::function::Configuration for #self_ty {
type Jar = #jar_ty;
type SalsaStruct = #salsa_struct_ty;
type Key = #key_ty;
type Value = #value_ty;
type SalsaStruct<#db_lt> = #salsa_struct_ty;
type Input<#db_lt> = #input_ty;
type Value<#db_lt> = #value_ty;
const CYCLE_STRATEGY: salsa::cycle::CycleRecoveryStrategy = #cycle_strategy;
#backdate_fn
#execute_fn
Expand Down Expand Up @@ -56,16 +58,16 @@ impl quote::ToTokens for CycleRecoveryStrategy {

/// Returns an appropriate definition for `should_backdate_value` depending on
/// whether this value is memoized or not.
pub(crate) fn should_backdate_value_fn(should_backdate: bool) -> syn::ImplItemMethod {
pub(crate) fn should_backdate_value_fn(should_backdate: bool) -> syn::ImplItemFn {
if should_backdate {
parse_quote! {
fn should_backdate_value(v1: &Self::Value, v2: &Self::Value) -> bool {
fn should_backdate_value(v1: &Self::Value<'_>, v2: &Self::Value<'_>) -> bool {
salsa::function::should_backdate_value(v1, v2)
}
}
} else {
parse_quote! {
fn should_backdate_value(_v1: &Self::Value, _v2: &Self::Value) -> bool {
fn should_backdate_value(_v1: &Self::Value<'_>, _v2: &Self::Value<'_>) -> bool {
false
}
}
Expand All @@ -74,13 +76,13 @@ pub(crate) fn should_backdate_value_fn(should_backdate: bool) -> syn::ImplItemMe

/// Returns an appropriate definition for `recover_from_cycle` for cases where
/// the cycle recovery is panic.
pub(crate) fn panic_cycle_recovery_fn() -> syn::ImplItemMethod {
pub(crate) fn panic_cycle_recovery_fn() -> syn::ImplItemFn {
parse_quote! {
fn recover_from_cycle(
_db: &salsa::function::DynDb<Self>,
fn recover_from_cycle<'db>(
_db: &'db salsa::function::DynDb<'db, Self>,
_cycle: &salsa::Cycle,
_key: Self::Key,
) -> Self::Value {
_key: salsa::Id,
) -> Self::Value<'db> {
panic!()
}
}
Expand Down
62 changes: 62 additions & 0 deletions components/salsa-2022-macros/src/db_lifetime.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,62 @@
//! Helper functions for working with fns, structs, and other generic things
//! that are allowed to have a `'db` lifetime.

use proc_macro2::Span;
use syn::spanned::Spanned;

/// Normally we try to use whatever lifetime parameter the use gave us
nikomatsakis marked this conversation as resolved.
Show resolved Hide resolved
/// to represent `'db`; but if they didn't give us one, we need to use a default
/// name. We choose `'db`.
pub(crate) fn default_db_lifetime(span: Span) -> syn::Lifetime {
syn::Lifetime {
apostrophe: span,
ident: syn::Ident::new("db", span),
}
}

/// Require that either there are no generics or exactly one lifetime parameter.
pub(crate) fn require_optional_db_lifetime(generics: &syn::Generics) -> syn::Result<()> {
if generics.params.is_empty() {
return Ok(());
}

require_db_lifetime(generics)?;

Ok(())
}

/// Require that either there is exactly one lifetime parameter.
pub(crate) fn require_db_lifetime(generics: &syn::Generics) -> syn::Result<()> {
if generics.params.is_empty() {
return Err(syn::Error::new_spanned(
generics,
"this definition must have a `'db` lifetime",
));
}

for (param, index) in generics.params.iter().zip(0..) {
let error = match param {
syn::GenericParam::Lifetime(_) => index > 0,
syn::GenericParam::Type(_) | syn::GenericParam::Const(_) => true,
};

if error {
return Err(syn::Error::new_spanned(
param,
"only a single lifetime parameter is accepted",
));
}
}

Ok(())
}

/// Return the `'db` lifetime given be the user, or a default.
nikomatsakis marked this conversation as resolved.
Show resolved Hide resolved
/// The generics ought to have been checked with `require_db_lifetime` already.
pub(crate) fn db_lifetime(generics: &syn::Generics) -> syn::Lifetime {
if let Some(lt) = generics.lifetimes().next() {
lt.lifetime.clone()
} else {
default_db_lifetime(generics.span())
}
}
40 changes: 40 additions & 0 deletions components/salsa-2022-macros/src/debug.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
use std::io::Write;
use std::process::{Command, Stdio};
use std::sync::OnceLock;

use proc_macro2::TokenStream;

static SALSA_DEBUG_MACRO: OnceLock<Option<String>> = OnceLock::new();

pub(crate) fn debug_enabled(input_name: impl ToString) -> bool {
let Some(env_name) = SALSA_DEBUG_MACRO.get_or_init(|| std::env::var("SALSA_DEBUG_MACRO").ok())
else {
return false;
};

let input_name = input_name.to_string();
env_name == "*" || env_name == &input_name[..]
}

pub(crate) fn dump_tokens(input_name: impl ToString, tokens: TokenStream) -> TokenStream {
if debug_enabled(input_name) {
let token_string = tokens.to_string();

let _: Result<(), ()> = Command::new("rustfmt")
.arg("--emit=stdout")
.stdin(Stdio::piped())
.spawn()
.and_then(|mut rustfmt| {
rustfmt
.stdin
.take()
.unwrap()
.write_all(token_string.as_bytes())?;
rustfmt.wait_with_output()
})
.map(|output| eprintln!("{}", String::from_utf8_lossy(&output.stdout)))
.or_else(|_| Ok(eprintln!("{token_string}")));
}

tokens
}
95 changes: 95 additions & 0 deletions components/salsa-2022-macros/src/debug_with_db.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
use proc_macro2::{Literal, Span, TokenStream};

pub(crate) fn debug_with_db(input: syn::DeriveInput) -> syn::Result<proc_macro2::TokenStream> {
// Figure out the lifetime to use for the `dyn Db` that we will expect.
// We allow structs to have at most one lifetime -- if a lifetime parameter is present,
// it should be `'db`. We may want to generalize this later.

let num_lifetimes = input.generics.lifetimes().count();
if num_lifetimes > 1 {
return syn::Result::Err(syn::Error::new(
input.generics.lifetimes().nth(1).unwrap().lifetime.span(),
"only one lifetime is supported",
));
}

let db_lt = match input.generics.lifetimes().next() {
Some(lt) => lt.lifetime.clone(),
None => syn::Lifetime::new("'_", Span::call_site()),
};

// Generate the type of database we expect. This hardcodes the convention of using `jar::Jar`.
// That's not great and should be fixed but we'd have to add a custom attribute and I am too lazy.

#[allow(non_snake_case)]
let DB: syn::Type = parse_quote! {
<crate::Jar as salsa::jar::Jar< #db_lt >>::DynDb
};

let structure: synstructure::Structure = synstructure::Structure::new(&input);

let fmt = syn::Ident::new("fmt", Span::call_site());
let db = syn::Ident::new("db", Span::call_site());

// Generic the match arm for each variant.
let fields: TokenStream = structure
.variants()
.iter()
.map(|variant| {
let variant_name = &variant.ast().ident;
let variant_name = Literal::string(&variant_name.to_string());

// Closure: given a binding, generate a call to the `salsa_debug` helper to either
// print its "debug with db" value or just use `std::fmt::Debug`. This is a nice hack that
// lets us use `debug_with_db` when available; it won't work great for generic types unless we add
// `DebugWithDb` bounds though.
let binding_tokens = |binding: &synstructure::BindingInfo| {
let field_ty = &binding.ast().ty;
quote!(
&::salsa::debug::helper::SalsaDebug::<#field_ty, #DB>::salsa_debug(
#binding,
#db,
)
)
};

// Create something like `fmt.debug_struct(...).field().field().finish()`
// for each struct field; the values to be debugged are created by
// the `binding_tokens` closure above.
let fields = match variant.ast().fields {
syn::Fields::Named(_) => variant.fold(
quote!(#fmt.debug_struct(#variant_name)),
|tokens, binding| {
let binding_name =
Literal::string(&binding.ast().ident.as_ref().unwrap().to_string());
let binding_data = binding_tokens(binding);
quote!(#tokens . field(#binding_name, #binding_data))
},
),

syn::Fields::Unnamed(_) | syn::Fields::Unit => variant.fold(
quote!(#fmt.debug_tuple(#variant_name)),
|tokens, binding| {
let binding_data = binding_tokens(binding);
quote!(#tokens . field(#binding_data))
},
),
};

quote!(#fields . finish(),)
})
.collect();

let tokens = structure.gen_impl(quote! {
gen impl ::salsa::debug::DebugWithDb<#DB> for @Self {
fn fmt(&self, #fmt: &mut std::fmt::Formatter<'_>, #db: & #DB) -> std::fmt::Result {
use ::salsa::debug::helper::Fallback as _;
match self {
#fields
}
}
}
});

Ok(crate::debug::dump_tokens(&input.ident, tokens))
}
Loading
Loading