Jars and ingredients

⚠️ IN-PROGRESS VERSION OF SALSA. ⚠️

This page describes the unreleased "Salsa 2022" version, which is a major departure from older versions of Salsa. The code here works but is only available on GitHub and from the salsa-2022 crate.

This page covers how data is organized in Salsa and how links between Salsa items (e.g., dependency tracking) work.

Salsa items and ingredients

A Salsa item is some item annotated with a Salsa annotation that can be included in a jar. For example, a tracked function is a Salsa item:


#[salsa::tracked]
fn foo(db: &dyn Db, input: MyInput) { }

...and so is a Salsa input...


#[salsa::input]
struct MyInput { }

...or a tracked struct:


#[salsa::tracked]
struct MyStruct { }

Each Salsa item needs certain bits of data at runtime to operate. These bits of data are called ingredients. Most Salsa items generate a single ingredient, but some generate more than one. For example, a tracked function generates a FunctionIngredient. A tracked struct, however, generates several ingredients: one for the struct itself (a TrackedStructIngredient) and one FunctionIngredient for each value field.

Ingredients define the core logic of Salsa

Most of the interesting Salsa code lives in these ingredients. For example, when you create a new tracked struct, the method TrackedStruct::new_struct is invoked; it is responsible for determining the tracked struct's id. Similarly, when you call a tracked function, that is translated into a call to TrackedFunction::fetch, which decides whether there is a valid memoized value to return, or whether the function must be executed.

The Ingredient trait

Each ingredient implements the Ingredient<DB> trait, which defines generic operations supported by any kind of ingredient. For example, the method maybe_changed_after can be used to check whether some particular piece of data stored in the ingredient may have changed since a given revision.

We'll see below that each database DB is able to take an IngredientIndex and use that to get an &dyn Ingredient<DB> for the corresponding ingredient. This allows the database to perform generic operations on an indexed ingredient without knowing exactly what the type of that ingredient is.
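To make the idea of generic operations on an unknown ingredient type concrete, here is a minimal sketch. It is a simplified model rather than Salsa's actual definitions: the toy trait drops the DB parameter, and Revision, InputCell, ConstantCell, and any_changed are hypothetical names invented for illustration.

```rust
/// Assumption: a revision is modeled as a plain integer.
type Revision = u32;

/// Toy version of the `Ingredient` trait; the real one is generic over `DB`.
trait Ingredient {
    /// May the value stored under `key` have changed after `revision`?
    fn maybe_changed_after(&self, key: usize, revision: Revision) -> bool;
}

/// Hypothetical ingredient that records the revision in which each key last changed.
struct InputCell {
    changed_at: Vec<Revision>,
}

impl Ingredient for InputCell {
    fn maybe_changed_after(&self, key: usize, revision: Revision) -> bool {
        self.changed_at[key] > revision
    }
}

/// Hypothetical ingredient whose values never change.
struct ConstantCell;

impl Ingredient for ConstantCell {
    fn maybe_changed_after(&self, _key: usize, _revision: Revision) -> bool {
        false
    }
}

/// Generic code only sees `&dyn Ingredient`, never the concrete type.
fn any_changed(ingredients: &[&dyn Ingredient], key: usize, revision: Revision) -> bool {
    ingredients
        .iter()
        .any(|ingredient| ingredient.maybe_changed_after(key, revision))
}
```

The caller of any_changed never names InputCell or ConstantCell, which is the property the database relies on when it works with an indexed ingredient.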

Jars are a collection of ingredients

When you declare a Salsa jar, you list out each of the Salsa items that are included in that jar:

#[salsa::jar]
struct Jar(
    foo,
    MyInput,
    MyStruct
);

This expands to a struct like so:


struct Jar(
    <foo as IngredientsFor>::Ingredient,
    <MyInput as IngredientsFor>::Ingredient,
    <MyStruct as IngredientsFor>::Ingredient,
);

The IngredientsFor trait is used to define the ingredients needed by some Salsa item, such as the tracked function foo or the input MyInput. Each Salsa item defines a type I so that <I as IngredientsFor>::Ingredient gives the ingredients needed by I.
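The mapping from a Salsa item to its ingredients can be sketched with an ordinary associated type. This is an illustrative model only: the create_ingredients signature is simplified (it takes no routes argument), and the ingredient structs and their name field are stand-ins, not the types Salsa generates.

```rust
/// Simplified model of the trait: each item type names its ingredient type.
trait IngredientsFor {
    type Ingredient;

    /// Assumption: takes no arguments here, to keep the sketch small.
    fn create_ingredients() -> Self::Ingredient;
}

/// Stand-in ingredient types (hypothetical fields).
struct FunctionIngredient {
    name: &'static str,
}

struct InputIngredient {
    name: &'static str,
}

/// Representative type for the tracked function `foo`.
#[allow(non_camel_case_types)]
struct foo;

/// Representative type for the input `MyInput`.
struct MyInput;

impl IngredientsFor for foo {
    type Ingredient = FunctionIngredient;

    fn create_ingredients() -> Self::Ingredient {
        FunctionIngredient { name: "foo" }
    }
}

impl IngredientsFor for MyInput {
    type Ingredient = InputIngredient;

    fn create_ingredients() -> Self::Ingredient {
        InputIngredient { name: "MyInput" }
    }
}

/// The jar stores whatever ingredient type each item asks for.
struct Jar(
    <foo as IngredientsFor>::Ingredient,
    <MyInput as IngredientsFor>::Ingredient,
);

fn create_jar() -> Jar {
    Jar(
        <foo as IngredientsFor>::create_ingredients(),
        <MyInput as IngredientsFor>::create_ingredients(),
    )
}
```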

A database is a tuple of jars

Salsa's database storage ultimately boils down to a tuple of jar structs where each jar struct (as we just saw) itself contains the ingredients for the Salsa items within that jar. The database can thus be thought of as a list of ingredients, although that list is organized into a 2-level hierarchy.

The reason for this 2-level hierarchy is that it permits separate compilation and privacy. The crate that lists the jars doesn't have to know the contents of a jar to embed the jar struct in the database. And some of the types that appear in the jar may be private to another crate.
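A small sketch of why the hierarchy helps with privacy, using modules in place of crates. Everything here (the parser and checker modules, the ingredient types, ingredient_names) is hypothetical and only models the shape of the storage.

```rust
mod parser {
    /// Private ingredient type: code outside this module cannot name it.
    struct ParseIngredient {
        name: &'static str,
    }

    /// The jar itself is public, but its field is private.
    pub struct Jar(ParseIngredient);

    impl Jar {
        pub fn create_jar() -> Self {
            Jar(ParseIngredient { name: "parse" })
        }

        pub fn ingredient_names(&self) -> Vec<&'static str> {
            vec![self.0.name]
        }
    }
}

mod checker {
    struct CheckIngredient {
        name: &'static str,
    }

    struct ReportIngredient {
        name: &'static str,
    }

    pub struct Jar(CheckIngredient, ReportIngredient);

    impl Jar {
        pub fn create_jar() -> Self {
            Jar(
                CheckIngredient { name: "check" },
                ReportIngredient { name: "report" },
            )
        }

        pub fn ingredient_names(&self) -> Vec<&'static str> {
            vec![self.0.name, self.1.name]
        }
    }
}

/// The database only lists the jars; it never mentions their contents.
type Jars = (parser::Jar, checker::Jar);

fn create_jars() -> Jars {
    (parser::Jar::create_jar(), checker::Jar::create_jar())
}

/// Flattening the two levels gives the "list of ingredients" view.
fn all_ingredient_names(jars: &Jars) -> Vec<&'static str> {
    let mut names = jars.0.ingredient_names();
    names.extend(jars.1.ingredient_names());
    names
}
```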

The HasJars trait and the Jars type

Each Salsa database implements the HasJars trait, generated by the salsa::db procedural macro. The HasJars trait, among other things, defines a Jars associated type that maps to a tuple of the jars in the database.

For example, given a database like this...

#[salsa::db(Jar1, ..., JarN)]
struct MyDatabase {
    storage: salsa::Storage<Self>
}

...the salsa::db macro would generate a HasJars impl that (among other things) contains type Jars = (Jar1, ..., JarN):

        impl salsa::storage::HasJars for #db {
            type Jars = (#(#jar_paths,)*);

In turn, the salsa::Storage<DB> type ultimately contains a struct Shared that embeds DB::Jars, thus embedding all the data for each jar.

Ingredient indices

During initialization, each ingredient in the database is assigned a unique index called the IngredientIndex. This is a 32-bit number that identifies a particular ingredient from a particular jar.

Routes

In addition to an index, each ingredient in the database also has a corresponding route. A route is a closure that, given a reference to the DB::Jars tuple, returns a &dyn Ingredient<DB> reference. The route table allows us to go from the IngredientIndex for a particular ingredient to its &dyn Ingredient<DB> trait object. The route table is created while the database is being initialized, as described shortly.
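A route table along these lines can be sketched as a vector of boxed closures indexed by IngredientIndex. This is a simplified model under stated assumptions: the Ingredient trait has no DB parameter, the jar and ingredient structs are invented, and lookup and build_routes are hypothetical helper names (only push is named in this page).

```rust
/// Toy trait standing in for `Ingredient<DB>`.
trait Ingredient {
    fn debug_name(&self) -> &'static str;
}

struct FunctionIngredient {
    debug_name: &'static str,
}

struct InputIngredient {
    debug_name: &'static str,
}

impl Ingredient for FunctionIngredient {
    fn debug_name(&self) -> &'static str {
        self.debug_name
    }
}

impl Ingredient for InputIngredient {
    fn debug_name(&self) -> &'static str {
        self.debug_name
    }
}

/// Two hypothetical jars and the tuple that plays the role of `DB::Jars`.
struct Jar1 {
    foo: FunctionIngredient,
    my_input: InputIngredient,
}

struct Jar2 {
    bar: FunctionIngredient,
}

type Jars = (Jar1, Jar2);

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
struct IngredientIndex(u32);

/// Route table: position `i` holds the route for `IngredientIndex(i)`.
struct Routes {
    routes: Vec<Box<dyn Fn(&Jars) -> &dyn Ingredient>>,
}

impl Routes {
    fn new() -> Self {
        Routes { routes: Vec::new() }
    }

    /// Stores the route and hands back the index assigned to it.
    fn push(&mut self, route: impl (Fn(&Jars) -> &dyn Ingredient) + 'static) -> IngredientIndex {
        let index = IngredientIndex(self.routes.len() as u32);
        self.routes.push(Box::new(route));
        index
    }

    /// Hypothetical helper: follow the route for `index` into `jars`.
    fn lookup<'j>(&self, jars: &'j Jars, index: IngredientIndex) -> &'j dyn Ingredient {
        (self.routes[index.0 as usize])(jars)
    }
}

/// Registers one route per ingredient, in a fixed order.
fn build_routes() -> (Routes, [IngredientIndex; 3]) {
    let mut routes = Routes::new();
    let foo = routes.push(|jars| &jars.0.foo);
    let my_input = routes.push(|jars| &jars.0.my_input);
    let bar = routes.push(|jars| &jars.1.bar);
    (routes, [foo, my_input, bar])
}
```

Note that the routes do not borrow the jars; they only describe how to reach an ingredient once a reference to the jars tuple is supplied.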

Database keys and dependency keys

A DatabaseKeyIndex identifies a specific value stored in some specific ingredient. It combines an IngredientIndex with a key_index, which is a salsa::Id:

/// An "active" database key index represents a database key index
/// that is actively executing. In that case, the `key_index` cannot be
/// None.
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, Debug)]
pub struct DatabaseKeyIndex {
    pub(crate) ingredient_index: IngredientIndex,
    pub(crate) key_index: Id,
}

A DependencyIndex is similar, but the key_index is optional. This is used when we wish to refer to the ingredient as a whole rather than to any specific value within the ingredient.
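The relationship between the two index types can be sketched as follows. The field layout follows the description above, but the definitions are simplified stand-ins: IngredientIndex and Id are modeled as plain integer newtypes, and the for_table constructor and the From conversion are illustrative assumptions.

```rust
/// Assumption: both indices are modeled as integer newtypes.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
struct IngredientIndex(u32);

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
struct Id(u32);

/// Identifies one specific value in one specific ingredient.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
struct DatabaseKeyIndex {
    ingredient_index: IngredientIndex,
    key_index: Id,
}

/// Like `DatabaseKeyIndex`, but the key is optional.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
struct DependencyIndex {
    ingredient_index: IngredientIndex,
    key_index: Option<Id>,
}

impl DependencyIndex {
    /// Illustrative constructor: depend on the ingredient as a whole.
    fn for_table(ingredient_index: IngredientIndex) -> Self {
        DependencyIndex {
            ingredient_index,
            key_index: None,
        }
    }
}

/// Every database key can be viewed as a dependency on that one value.
impl From<DatabaseKeyIndex> for DependencyIndex {
    fn from(key: DatabaseKeyIndex) -> Self {
        DependencyIndex {
            ingredient_index: key.ingredient_index,
            key_index: Some(key.key_index),
        }
    }
}
```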

These kinds of indices are used to store connections between ingredients. For example, each memoized value has to track its inputs. Those inputs are stored as dependency indices. We can then do things like ask, "did this input change since revision R?" by

  • using the ingredient index to find the route and get a &dyn Ingredient<DB>
  • and then invoking the maybe_changed_after method on that trait object.

HasJarsDyn

There is one catch in the above setup. The user's code always interacts with a dyn crate::Db value, where crate::Db is the trait defined by the jar; the crate::Db trait extends salsa::HasJar, which in turn extends salsa::Database. Ideally, we would have salsa::Database extend salsa::HasJars, which is the main trait that gives access to the jars data. But we don't want to do that because HasJars defines an associated type Jars, and that would mean that every reference to dyn crate::Db would have to specify the jars type using something like dyn crate::Db<Jars = J>. This would be unergonomic, but what's worse, it would actually be impossible: the final Jars type combines the jars from multiple crates, and so it is not known to any individual jar crate.

To work around this, salsa::Database in fact extends another trait, HasJarsDyn, that doesn't reveal the Jars or ingredient types directly, but just has various methods that can be performed on an ingredient, given its IngredientIndex.

Traits like Ingredient<DB> pose a similar problem: they require knowing the full DB type. If one function ingredient directly invoked a method on Ingredient<DB>, it would have to be fully generic and could only be instantiated in the final crate, where the full database type is available.

We solve this via the HasJarsDyn trait. Each method on HasJarsDyn combines the "find ingredient, invoke method" steps into a single call:

/// Dyn friendly subset of HasJars
pub trait HasJarsDyn {
    fn runtime(&self) -> &Runtime;

    fn runtime_mut(&mut self) -> &mut Runtime;

    fn maybe_changed_after(&self, input: DependencyIndex, revision: Revision) -> bool;

    fn cycle_recovery_strategy(&self, input: IngredientIndex) -> CycleRecoveryStrategy;

    fn origin(&self, input: DatabaseKeyIndex) -> Option<QueryOrigin>;

    fn mark_validated_output(&self, executor: DatabaseKeyIndex, output: DependencyIndex);

    /// Invoked when `executor` used to output `stale_output` but no longer does.
    /// This method routes that into a call to the [`remove_stale_output`](`crate::ingredient::Ingredient::remove_stale_output`)
    /// method on the ingredient for `stale_output`.
    fn remove_stale_output(&self, executor: DatabaseKeyIndex, stale_output: DependencyIndex);

    /// Informs `ingredient` that the salsa struct with id `id` has been deleted.
    /// This means that `id` will not be used in this revision and hence
    /// any memoized values keyed by that struct can be discarded.
    ///
    /// In order to receive this callback, `ingredient` must have registered itself
    /// as a dependent function using
    /// [`SalsaStructInDb::register_dependent_fn`](`crate::salsa_struct::SalsaStructInDb::register_dependent_fn`).
    fn salsa_struct_deleted(&self, ingredient: IngredientIndex, id: Id);

    fn fmt_index(&self, index: DependencyIndex, fmt: &mut fmt::Formatter<'_>) -> fmt::Result;
}

So, technically, to check if an input has changed, an ingredient:

  • Invokes HasJarsDyn::maybe_changed_after on the dyn Database
  • The impl for this method (generated by #[salsa::db]):
    • gets the route for the ingredient from the ingredient index
    • uses the route to get a &dyn Ingredient
    • invokes maybe_changed_after on that ingredient
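The steps above can be sketched end to end in a self-contained model. It is not the code that #[salsa::db] generates: the trait here has a single method, Ingredient has no DB parameter, the key is a bare u32, and InputIngredient, MyDatabase, is_stale, and new_database are hypothetical names.

```rust
type Revision = u32;

#[derive(Copy, Clone)]
struct IngredientIndex(u32);

#[derive(Copy, Clone)]
struct DependencyIndex {
    ingredient_index: IngredientIndex,
    key_index: Option<u32>,
}

trait Ingredient {
    fn maybe_changed_after(&self, key: Option<u32>, revision: Revision) -> bool;
}

/// Toy ingredient: remembers the revision in which each key last changed.
struct InputIngredient {
    changed_at: Vec<Revision>,
}

impl Ingredient for InputIngredient {
    fn maybe_changed_after(&self, key: Option<u32>, revision: Revision) -> bool {
        match key {
            Some(k) => self.changed_at[k as usize] > revision,
            // No key: the question is about the ingredient as a whole.
            None => self.changed_at.iter().any(|&r| r > revision),
        }
    }
}

struct Jar {
    my_input: InputIngredient,
}

type Jars = (Jar,);
type Route = Box<dyn Fn(&Jars) -> &dyn Ingredient>;

/// Dyn-friendly: no associated types and no generic methods.
trait HasJarsDyn {
    fn maybe_changed_after(&self, input: DependencyIndex, revision: Revision) -> bool;
}

struct MyDatabase {
    jars: Jars,
    routes: Vec<Route>,
}

impl HasJarsDyn for MyDatabase {
    fn maybe_changed_after(&self, input: DependencyIndex, revision: Revision) -> bool {
        // 1. Get the route for the ingredient from the ingredient index.
        let route = &self.routes[input.ingredient_index.0 as usize];
        // 2. Use the route to get a `&dyn Ingredient`.
        let ingredient: &dyn Ingredient = route(&self.jars);
        // 3. Invoke the method on that ingredient.
        ingredient.maybe_changed_after(input.key_index, revision)
    }
}

/// Caller side: only the trait object is visible, not `MyDatabase` or `Jars`.
fn is_stale(db: &dyn HasJarsDyn, input: DependencyIndex, verified_at: Revision) -> bool {
    db.maybe_changed_after(input, verified_at)
}

fn new_database(changed_at: Vec<Revision>) -> MyDatabase {
    fn route_to_input(jars: &Jars) -> &dyn Ingredient {
        &jars.0.my_input
    }
    let route: Route = Box::new(route_to_input);
    MyDatabase {
        jars: (Jar { my_input: InputIngredient { changed_at } },),
        routes: vec![route],
    }
}
```

The function is_stale compiles without knowing the Jars type, which is the point of routing the call through a dyn-friendly trait.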

Initializing the database

The last thing to discuss is how the database is initialized. The Default implementation for Storage<DB> does the work:

impl<DB> Default for Storage<DB>
where
    DB: HasJars,
{
    fn default() -> Self {
        let mut routes = Routes::new();
        let jars = DB::create_jars(&mut routes);
        Self {
            shared: Arc::new(Shared {
                jars,
                cvar: Default::default(),
            }),
            routes: Arc::new(routes),
            runtime: Runtime::default(),
        }
    }
}

First, it creates an empty Routes instance. Then it invokes the DB::create_jars method. The implementation of this method is defined by the #[salsa::db] macro; it simply invokes the Jar::create_jar method on each of the jars:

            fn create_jars(routes: &mut salsa::routes::Routes<Self>) -> Self::Jars {
                (
                    #(
                        <#jar_paths as salsa::jar::Jar>::create_jar(routes),
                    )*
                )
            }

This implementation of create_jar is generated by the #[salsa::jar] macro, and simply walks over the representative type for each Salsa item and asks it to create its ingredients:

    quote! {
        impl<'salsa_db> salsa::jar::Jar<'salsa_db> for #jar_struct {
            type DynDb = dyn #jar_trait + 'salsa_db;

            fn create_jar<DB>(routes: &mut salsa::routes::Routes<DB>) -> Self
            where
                DB: salsa::storage::JarFromJars<Self> + salsa::storage::DbWithJar<Self>,
            {
                #(
                    let #field_var_names = <#field_tys as salsa::storage::IngredientsFor>::create_ingredients(routes);
                )*
                Self(#(#field_var_names),*)
            }
        }
    }

The code to create the ingredients for any particular item is generated by their associated macros (e.g., #[salsa::tracked], #[salsa::input]), but it always follows a particular structure. To create an ingredient, we first invoke Routes::push, which creates the routes to that ingredient and assigns it an IngredientIndex. We can then invoke a function such as FunctionIngredient::new to create the structure. The routes to an ingredient are defined as closures that, given the DB::Jars, can find the data for a particular ingredient.
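As a rough illustration of that structure, the sketch below walks through initialization by hand: each jar's create_jar pushes a route, receives the next index, and then builds its ingredient with that index. It is a hand-written model rather than macro output. Routes here stores plain function pointers, indices are bare u32 values, and the jars, their fields, and create_storage are hypothetical.

```rust
trait Ingredient {
    fn ingredient_index(&self) -> u32;
}

struct FunctionIngredient {
    index: u32,
}

impl FunctionIngredient {
    fn new(index: u32) -> Self {
        FunctionIngredient { index }
    }
}

impl Ingredient for FunctionIngredient {
    fn ingredient_index(&self) -> u32 {
        self.index
    }
}

struct Jar1 {
    foo: FunctionIngredient,
}

struct Jar2 {
    bar: FunctionIngredient,
    baz: FunctionIngredient,
}

type Jars = (Jar1, Jar2);

/// Assumption: a route is a plain function pointer in this sketch.
type Route = fn(&Jars) -> &dyn Ingredient;

struct Routes {
    routes: Vec<Route>,
}

impl Routes {
    /// Records the route and returns the index assigned to the ingredient.
    fn push(&mut self, route: Route) -> u32 {
        let index = self.routes.len() as u32;
        self.routes.push(route);
        index
    }
}

impl Jar1 {
    fn create_jar(routes: &mut Routes) -> Self {
        // First push the route, then build the ingredient with its index.
        let foo = FunctionIngredient::new(routes.push(|jars| &jars.0.foo));
        Jar1 { foo }
    }
}

impl Jar2 {
    fn create_jar(routes: &mut Routes) -> Self {
        let bar = FunctionIngredient::new(routes.push(|jars| &jars.1.bar));
        let baz = FunctionIngredient::new(routes.push(|jars| &jars.1.baz));
        Jar2 { bar, baz }
    }
}

/// Mirrors the shape of `Storage::default`: empty routes, then one jar at a time.
fn create_storage() -> (Jars, Routes) {
    let mut routes = Routes { routes: Vec::new() };
    let jars = (Jar1::create_jar(&mut routes), Jar2::create_jar(&mut routes));
    (jars, routes)
}
```

Because jars are created in order and every ingredient pushes exactly one route, following route number i always leads back to the ingredient that was assigned index i.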