Jars and ingredients
This page covers how data is organized in Salsa and how links between Salsa items (e.g., dependency tracking) work.
Salsa items and ingredients
A Salsa item is some item annotated with a Salsa annotation that can be included in a jar. For example, a tracked function is a Salsa item:
#[salsa::tracked]
fn foo(db: &dyn Db, input: MyInput) { }
...and so is a Salsa input...
#[salsa::input]
struct MyInput { }
...or a tracked struct:
#[salsa::tracked]
struct MyStruct { }
Each Salsa item needs certain bits of data at runtime to operate.
These bits of data are called ingredients.
Most Salsa items generate a single ingredient, but sometimes they make more than one.
For example, a tracked function generates a FunctionIngredient.
A tracked struct, however, generates several ingredients: one for the struct itself (a TrackedStructIngredient) and one FunctionIngredient for each value field.
Ingredients define the core logic of Salsa
Most of the interesting Salsa code lives in these ingredients.
For example, when you create a new tracked struct, the method TrackedStruct::new_struct is invoked;
it is responsible for determining the tracked struct's id.
Similarly, when you call a tracked function, that is translated into a call to TrackedFunction::fetch,
which decides whether there is a valid memoized value to return,
or whether the function must be executed.
The Ingredient trait
Each ingredient implements the Ingredient<DB> trait, which defines generic operations supported by any kind of ingredient.
For example, the method maybe_changed_after can be used to check whether some particular piece of data stored in the ingredient may have changed since a given revision.
We'll see below that each database DB is able to take an IngredientIndex and use that to get an &dyn Ingredient<DB> for the corresponding ingredient.
This allows the database to perform generic operations on an indexed ingredient without knowing exactly what the type of that ingredient is.
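To make the idea of a generic ingredient operation concrete, here is a minimal sketch. It is not Salsa's actual trait: the Revision newtype, the u32 key, the InputIngredient type, and the input_changed helper are all assumptions made up for illustration. It only shows how a caller can ask "may this have changed?" through a trait object without knowing the concrete ingredient type.

```rust
/// Simplified stand-in for Salsa's revision counter (an assumption).
#[derive(Copy, Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Revision(pub u32);

/// Hypothetical, cut-down version of an `Ingredient<DB>`-style trait.
pub trait Ingredient<DB: ?Sized> {
    /// May the value stored under `key` have changed after `revision`?
    fn maybe_changed_after(&self, db: &DB, key: u32, revision: Revision) -> bool;
}

/// Toy ingredient that records the revision in which each key last changed.
pub struct InputIngredient {
    pub changed_at: Vec<Revision>,
}

impl<DB: ?Sized> Ingredient<DB> for InputIngredient {
    fn maybe_changed_after(&self, _db: &DB, key: u32, revision: Revision) -> bool {
        self.changed_at[key as usize] > revision
    }
}

/// Generic caller: it only sees a trait object, never the concrete type.
pub fn input_changed<DB: ?Sized>(
    ingredient: &dyn Ingredient<DB>,
    db: &DB,
    key: u32,
    since: Revision,
) -> bool {
    ingredient.maybe_changed_after(db, key, since)
}
```

In this sketch, input_changed works for any type that implements the trait, which is the property the database relies on when it dispatches through an index.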
Jars are a collection of ingredients
When you declare a Salsa jar, you list out each of the Salsa items that are included in that jar:
#[salsa::jar]
struct Jar(
    foo,
    MyInput,
    MyStruct
);
This expands to a struct like so:
struct Jar(
    <foo as IngredientsFor>::Ingredient,
    <MyInput as IngredientsFor>::Ingredient,
    <MyStruct as IngredientsFor>::Ingredient,
)
The IngredientsFor trait is used to define the ingredients needed by some Salsa item, such as the tracked function foo or the input MyInput.
Each Salsa item defines a type I so that <I as IngredientsFor>::Ingredient gives the ingredients needed by I.
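The mapping from a Salsa item to its ingredient type can be sketched with an ordinary trait that has an associated type. The version below is a simplified assumption, not the real trait: in particular, create_ingredients takes no arguments here, and the FunctionIngredient and InputIngredient structs are placeholders that only carry a name.

```rust
/// Hypothetical, simplified version of an `IngredientsFor`-style trait.
pub trait IngredientsFor {
    /// The ingredient (or tuple of ingredients) this item needs at runtime.
    type Ingredient;

    fn create_ingredients() -> Self::Ingredient;
}

/// Placeholder ingredient types (assumptions for this sketch).
pub struct FunctionIngredient {
    pub name: &'static str,
}
pub struct InputIngredient {
    pub name: &'static str,
}

/// Representative types standing in for two Salsa items.
#[allow(non_camel_case_types)]
pub struct foo;
pub struct MyInput;

impl IngredientsFor for foo {
    type Ingredient = FunctionIngredient;
    fn create_ingredients() -> Self::Ingredient {
        FunctionIngredient { name: "foo" }
    }
}

impl IngredientsFor for MyInput {
    type Ingredient = InputIngredient;
    fn create_ingredients() -> Self::Ingredient {
        InputIngredient { name: "MyInput" }
    }
}

/// The jar stores one ingredient per listed item, named via the trait.
pub struct Jar(
    pub <foo as IngredientsFor>::Ingredient,
    pub <MyInput as IngredientsFor>::Ingredient,
);

pub fn create_jar() -> Jar {
    Jar(foo::create_ingredients(), MyInput::create_ingredients())
}
```

The jar definition never names FunctionIngredient or InputIngredient directly; it only refers to them through the item types, which is what lets a macro generate the struct from a plain list of items.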
A database is a tuple of jars
Salsa's database storage ultimately boils down to a tuple of jar structs where each jar struct (as we just saw) itself contains the ingredients for the Salsa items within that jar. The database can thus be thought of as a list of ingredients, although that list is organized into a 2-level hierarchy.
The reason for this 2-level hierarchy is that it permits separate compilation and privacy. The crate that lists the jars doesn't have to know the contents of the jar to embed the jar struct in the database. And some of the types that appear in the jar may be private to another crate.
The HasJars trait and the Jars type
Each Salsa database implements the HasJars trait, generated by the salsa::db procedural macro.
The HasJars trait, among other things, defines a Jars associated type that maps to a tuple of the jars in the database.
For example, given a database like this...
#[salsa::db(Jar1, ..., JarN)]
struct MyDatabase {
    storage: salsa::Storage<Self>
}
...the salsa::db macro would generate a HasJars impl that (among other things) contains type Jars = (Jar1, ..., JarN):
impl salsa::storage::HasJars for #db {
    type Jars = (#(#jar_paths,)*);
In turn, the salsa::Storage<DB> type ultimately contains a struct Shared that embeds DB::Jars, thus embedding all the data for each jar.
Ingredient indices
During initialization, each ingredient in the database is assigned a unique index called the IngredientIndex.
This is a 32-bit number that identifies a particular ingredient from a particular jar.
Routes
In addition to an index, each ingredient in the database also has a corresponding route.
A route is a closure that, given a reference to the DB::Jars tuple, returns a &dyn Ingredient<DB> reference.
The route table allows us to go from the IngredientIndex for a particular ingredient to its &dyn Ingredient<DB> trait object.
The route table is created while the database is being initialized, as described shortly.
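As a rough illustration of a route table, the sketch below stores one boxed closure per ingredient and uses the index as a position in a vector. Everything here is a simplified assumption: the Ingredient trait has a single debug_name method, the jars hold one ingredient each, and this Routes type is not generic over the database as the real one is.

```rust
/// Toy ingredient trait (hypothetical; the real trait is generic over DB).
pub trait Ingredient {
    fn debug_name(&self) -> &'static str;
}

pub struct FunctionIngredient;
pub struct InputIngredient;

impl Ingredient for FunctionIngredient {
    fn debug_name(&self) -> &'static str {
        "function"
    }
}
impl Ingredient for InputIngredient {
    fn debug_name(&self) -> &'static str {
        "input"
    }
}

/// Two jars, each holding one ingredient, combined into a tuple.
pub struct Jar1(pub FunctionIngredient);
pub struct Jar2(pub InputIngredient);
pub type Jars = (Jar1, Jar2);

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct IngredientIndex(pub u32);

/// Route table: position `i` holds the closure for `IngredientIndex(i)`.
pub struct Routes {
    routes: Vec<Box<dyn Fn(&Jars) -> &dyn Ingredient>>,
}

impl Routes {
    pub fn new() -> Self {
        Routes { routes: Vec::new() }
    }

    /// Registers a route and hands back the index assigned to it.
    pub fn push<F>(&mut self, route: F) -> IngredientIndex
    where
        F: 'static + Fn(&Jars) -> &dyn Ingredient,
    {
        let index = IngredientIndex(self.routes.len() as u32);
        self.routes.push(Box::new(route));
        index
    }

    /// Follows the route for `index` to reach the ingredient as a trait object.
    pub fn route<'j>(&self, index: IngredientIndex, jars: &'j Jars) -> &'j dyn Ingredient {
        (self.routes[index.0 as usize])(jars)
    }
}

pub fn build_routes() -> (Routes, IngredientIndex, IngredientIndex) {
    let mut routes = Routes::new();
    let function_index = routes.push(|jars| &jars.0.0);
    let input_index = routes.push(|jars| &jars.1.0);
    (routes, function_index, input_index)
}
```

Each closure knows the path to one field inside the jars tuple, so the caller of route only needs the index and a reference to the jars.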
Database keys and dependency keys
A DatabaseKeyIndex identifies a specific value stored in some specific ingredient.
It combines an IngredientIndex with a key_index, which is a salsa::Id:
/// An "active" database key index represents a database key index
/// that is actively executing. In that case, the `key_index` cannot be
/// None.
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, Debug)]
pub struct DatabaseKeyIndex {
    pub(crate) ingredient_index: IngredientIndex,
    pub(crate) key_index: Id,
}
A DependencyIndex is similar, but the key_index is optional.
This is used when we wish to refer to the ingredient as a whole, and not any specific value within the ingredient.
These kinds of indices are used to store connections between ingredients. For example, each memoized value has to track its inputs. Those inputs are stored as dependency indices. We can then do things like ask, "did this input change since revision R?" by
- using the ingredient index to find the route and get a &dyn Ingredient<DB>
- and then invoking the maybe_changed_after method on that trait object.
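The relationship between the two index types can be sketched as follows. The field layout mirrors the description above, but the newtypes, the public fields, the for_table constructor, and the From conversion are assumptions made for this example rather than a statement of Salsa's actual API.

```rust
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct IngredientIndex(pub u32);

/// Stand-in for `salsa::Id` (an assumption for this sketch).
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct Id(pub u32);

/// Identifies one specific value inside one specific ingredient.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct DatabaseKeyIndex {
    pub ingredient_index: IngredientIndex,
    pub key_index: Id,
}

/// Like `DatabaseKeyIndex`, but the key is optional.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct DependencyIndex {
    pub ingredient_index: IngredientIndex,
    pub key_index: Option<Id>,
}

impl DependencyIndex {
    /// Hypothetical constructor: refers to an ingredient as a whole.
    pub fn for_table(ingredient_index: IngredientIndex) -> Self {
        DependencyIndex { ingredient_index, key_index: None }
    }
}

/// Every database key can be widened into a dependency on that key.
impl From<DatabaseKeyIndex> for DependencyIndex {
    fn from(key: DatabaseKeyIndex) -> Self {
        DependencyIndex {
            ingredient_index: key.ingredient_index,
            key_index: Some(key.key_index),
        }
    }
}
```

Because both types are small Copy values, a memoized value can keep a plain list of dependency indices without borrowing from the ingredients it depends on.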
HasJarsDyn
There is one catch in the above setup.
The user's code always interacts with a dyn crate::Db value, where crate::Db is the trait defined by the jar; the crate::Db trait extends salsa::HasJar which in turn extends salsa::Database.
Ideally, we would have salsa::Database extend salsa::HasJars, which is the main trait that gives access to the jars data.
But we don't want to do that because HasJars defines an associated type Jars, and that would mean that every reference to dyn crate::Db would have to specify the jars type using something like dyn crate::Db<Jars = J>.
This would be unergonomic, but what's worse, it would actually be impossible: the final Jars type combines the jars from multiple crates, and so it is not known to any individual jar crate.
To work around this, salsa::Database in fact extends another trait, HasJarsDyn, that doesn't reveal the Jars or ingredient types directly, but just has various methods that can be performed on an ingredient, given its IngredientIndex.
Traits like Ingredient<DB> require knowing the full DB type.
If we had one function ingredient directly invoke a method on Ingredient<DB>, that would imply that it has to be fully generic and only instantiated at the final crate, when the full database type is available.
We solve this via the HasJarsDyn trait. The HasJarsDyn trait exports a method that combines the "find ingredient, invoke method" steps into one method:
/// Dyn friendly subset of HasJars
pub trait HasJarsDyn {
    fn runtime(&self) -> &Runtime;
    fn runtime_mut(&mut self) -> &mut Runtime;
    fn maybe_changed_after(&self, input: DependencyIndex, revision: Revision) -> bool;
    fn cycle_recovery_strategy(&self, input: IngredientIndex) -> CycleRecoveryStrategy;
    fn origin(&self, input: DatabaseKeyIndex) -> Option<QueryOrigin>;
    fn mark_validated_output(&self, executor: DatabaseKeyIndex, output: DependencyIndex);
    /// Invoked when `executor` used to output `stale_output` but no longer does.
    /// This method routes that into a call to the [`remove_stale_output`](`crate::ingredient::Ingredient::remove_stale_output`)
    /// method on the ingredient for `stale_output`.
    fn remove_stale_output(&self, executor: DatabaseKeyIndex, stale_output: DependencyIndex);
    /// Informs `ingredient` that the salsa struct with id `id` has been deleted.
    /// This means that `id` will not be used in this revision and hence
    /// any memoized values keyed by that struct can be discarded.
    ///
    /// In order to receive this callback, `ingredient` must have registered itself
    /// as a dependent function using
    /// [`SalsaStructInDb::register_dependent_fn`](`crate::salsa_struct::SalsaStructInDb::register_dependent_fn`).
    fn salsa_struct_deleted(&self, ingredient: IngredientIndex, id: Id);
    fn fmt_index(&self, index: DependencyIndex, fmt: &mut fmt::Formatter<'_>) -> fmt::Result;
}
So, technically, to check if an input has changed, an ingredient:
- Invokes HasJarsDyn::maybe_changed_after on the dyn Database
- The impl for this method (generated by #[salsa::db]):
  - gets the route for the ingredient from the ingredient index
  - uses the route to get a &dyn Ingredient
  - invokes maybe_changed_after on that ingredient
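The steps above can be sketched with two traits: one that exposes the jars through an associated type, and a dyn-friendly one that hides it. This is a toy model under stated assumptions: the method signatures are reduced to the bare minimum, the route is a match on the index rather than a closure table, and MyDatabase, InputIngredient, and dependency_changed are invented for the example.

```rust
#[derive(Copy, Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Revision(pub u32);

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct IngredientIndex(pub u32);

/// Toy ingredient trait (hypothetical, not generic over the database).
pub trait Ingredient {
    fn maybe_changed_after(&self, revision: Revision) -> bool;
}

pub struct InputIngredient {
    pub changed_at: Revision,
}

impl Ingredient for InputIngredient {
    fn maybe_changed_after(&self, revision: Revision) -> bool {
        self.changed_at > revision
    }
}

/// Exposes the concrete jars type, so `dyn HasJars` would need `Jars = ...`.
pub trait HasJars {
    type Jars;
    fn jars(&self) -> &Self::Jars;
    fn route(&self, index: IngredientIndex) -> &dyn Ingredient;
}

/// Dyn-friendly subset: no associated types, only index-based methods.
pub trait HasJarsDyn {
    fn maybe_changed_after(&self, input: IngredientIndex, revision: Revision) -> bool;
}

pub struct MyDatabase {
    pub jars: (InputIngredient, InputIngredient),
}

impl HasJars for MyDatabase {
    type Jars = (InputIngredient, InputIngredient);

    fn jars(&self) -> &Self::Jars {
        &self.jars
    }

    fn route(&self, index: IngredientIndex) -> &dyn Ingredient {
        match index.0 {
            0 => &self.jars().0,
            1 => &self.jars().1,
            other => panic!("no ingredient with index {}", other),
        }
    }
}

/// Combines "find the ingredient" and "invoke the method" in one call.
impl HasJarsDyn for MyDatabase {
    fn maybe_changed_after(&self, input: IngredientIndex, revision: Revision) -> bool {
        self.route(input).maybe_changed_after(revision)
    }
}

/// Ingredient-side code: it only ever sees `&dyn HasJarsDyn`.
pub fn dependency_changed(db: &dyn HasJarsDyn, input: IngredientIndex, since: Revision) -> bool {
    db.maybe_changed_after(input, since)
}
```

Note that dependency_changed compiles without knowing MyDatabase or its Jars type, which is the reason for splitting the traits this way.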
Initializing the database
The last thing to discuss is how the database is initialized.
The Default implementation for Storage<DB> does the work:
impl<DB> Default for Storage<DB>
where
    DB: HasJars,
{
    fn default() -> Self {
        let mut routes = Routes::new();
        let jars = DB::create_jars(&mut routes);
        Self {
            shared: Shared {
                jars: Some(Arc::from(jars)),
                cvar: Arc::new(Default::default()),
                noti_lock: Arc::new(parking_lot::Mutex::new(())),
            },
            routes: Arc::new(routes),
            runtime: Runtime::default(),
        }
    }
}
First, it creates an empty Routes instance.
Then it invokes the DB::create_jars method.
The implementation of this method is defined by the #[salsa::db] macro; it invokes salsa::plumbing::create_jars_inplace to allocate memory for the jars, and then invokes the Jar::init_jar method on each of the jars to initialize them:
fn create_jars(routes: &mut salsa::routes::Routes<Self>) -> Box<Self::Jars> {
    unsafe {
        salsa::plumbing::create_jars_inplace::<#db>(|jars| {
            #(
                unsafe {
                    let place = std::ptr::addr_of_mut!((*jars).#jar_field_names);
                    <#jar_paths as salsa::jar::Jar>::init_jar(place, routes);
                }
            )*
        })
    }
}
This implementation for init_jar is generated by the #[salsa::jar] macro, and simply walks over the representative type for each salsa item and asks it to create its ingredients:
quote! {
    unsafe impl<'salsa_db> salsa::jar::Jar<'salsa_db> for #jar_struct {
        type DynDb = dyn #jar_trait + 'salsa_db;
        unsafe fn init_jar<DB>(place: *mut Self, routes: &mut salsa::routes::Routes<DB>)
        where
            DB: salsa::storage::JarFromJars<Self> + salsa::storage::DbWithJar<Self>,
        {
            #(
                unsafe {
                    std::ptr::addr_of_mut!((*place).#field_var_names)
                        .write(<#field_tys as salsa::storage::IngredientsFor>::create_ingredients(routes));
                }
            )*
        }
    }
}
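The in-place initialization pattern used by create_jars and init_jar can be sketched with the standard library alone. This is an illustration under assumptions, not the code Salsa generates: JarA, JarB, their fields, and create_jars_in_place are made up, and the route table is left out entirely. It shows only the mechanics of allocating uninitialized memory and writing each field through a raw pointer.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

/// Two toy jars (hypothetical contents).
pub struct JarA {
    pub counter: u32,
}
pub struct JarB {
    pub name: String,
}

/// Stand-in for the `DB::Jars` tuple.
pub struct Jars {
    pub a: JarA,
    pub b: JarB,
}

impl JarA {
    /// Safety: `place` must be properly aligned and valid for writes.
    pub unsafe fn init_jar(place: *mut Self) {
        unsafe {
            addr_of_mut!((*place).counter).write(0);
        }
    }
}

impl JarB {
    /// Safety: `place` must be properly aligned and valid for writes.
    pub unsafe fn init_jar(place: *mut Self) {
        unsafe {
            addr_of_mut!((*place).name).write(String::from("b"));
        }
    }
}

pub fn create_jars_in_place() -> Box<Jars> {
    // Allocate heap memory for the jars without initializing it.
    let mut uninit: Box<MaybeUninit<Jars>> = Box::new(MaybeUninit::uninit());
    let place: *mut Jars = uninit.as_mut_ptr();
    unsafe {
        // Initialize each jar through a raw pointer to its field,
        // never creating a reference to uninitialized data.
        JarA::init_jar(addr_of_mut!((*place).a));
        JarB::init_jar(addr_of_mut!((*place).b));
        // Every field is now written, so the allocation holds a valid `Jars`.
        Box::from_raw(Box::into_raw(uninit).cast::<Jars>())
    }
}
```

Using addr_of_mut! rather than &mut matters here because taking a reference to a field that has not been initialized yet is not allowed, while computing a raw pointer to it is.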
The code to create the ingredients for any particular item is generated by their associated macros (e.g., #[salsa::tracked], #[salsa::input]), but it always follows a particular structure.
To create an ingredient, we first invoke Routes::push, which creates the routes to that ingredient and assigns it an IngredientIndex.
We can then invoke a function such as FunctionIngredient::new to create the structure.
The routes to an ingredient are defined as closures that, given the DB::Jars, can find the data for a particular ingredient.