On-Demand (Lazy) Inputs

⚠️ IN-PROGRESS VERSION OF SALSA. ⚠️

This page describes the unreleased "Salsa 2022" version, which is a major departure from older versions of salsa. The code here works but is only available on GitHub and from the salsa-2022 crate.

If you are looking for the older version of salsa, simply visit this link

Salsa inputs work best if you can easily provide all of the inputs upfront. However, sometimes the set of inputs is not known beforehand.

A typical example is reading files from disk. While it is possible to eagerly scan a particular directory and create an in-memory file tree as salsa input structs, a more straightforward approach is to read the files lazily. That is, when a query requests the text of a file for the first time:

  1. Read the file from disk and cache it.
  2. Set up a file-system watcher for this path.
  3. Update the cached file when the watcher sends a change notification.

This can be achieved in salsa by caching the inputs in your database struct and adding a method to the database trait that retrieves them from this cache.

A complete, runnable file-watching example can be found in the lazy-input example.

The setup looks roughly like this:

#[salsa::input]
struct File {
    path: PathBuf,
    #[return_ref]
    contents: String,
}

trait Db: salsa::DbWithJar<Jar> {
    fn input(&self, path: PathBuf) -> Result<File>;
}

#[salsa::db(Jar)]
struct Database {
    storage: salsa::Storage<Self>,
    logs: Mutex<Vec<String>>,
    files: DashMap<PathBuf, File>,
    file_watcher: Mutex<Debouncer<RecommendedWatcher>>,
}

impl Database {
    fn new(tx: Sender<DebounceEventResult>) -> Self {
        let storage = Default::default();
        Self {
            storage,
            logs: Default::default(),
            files: DashMap::new(),
            file_watcher: Mutex::new(new_debouncer(Duration::from_secs(1), None, tx).unwrap()),
        }
    }
}

impl Db for Database {
    fn input(&self, path: PathBuf) -> Result<File> {
        let path = path
            .canonicalize()
            .wrap_err_with(|| format!("Failed to read {}", path.display()))?;
        Ok(match self.files.entry(path.clone()) {
            // If the file already exists in our cache then just return it.
            Entry::Occupied(entry) => *entry.get(),
            // If we haven't read this file yet set up the watch, read the
            // contents, store it in the cache, and return it.
            Entry::Vacant(entry) => {
                // Set up the watch before reading the contents to try to avoid
                // race conditions.
                let watcher = &mut *self.file_watcher.lock().unwrap();
                watcher
                    .watcher()
                    .watch(&path, RecursiveMode::NonRecursive)
                    .unwrap();
                let contents = std::fs::read_to_string(&path)
                    .wrap_err_with(|| format!("Failed to read {}", path.display()))?;
                *entry.insert(File::new(self, path, contents))
            }
        })
    }
}
  • We declare a method on the Db trait that gives us a File input on demand (it only requires a &dyn Db, not a &mut dyn Db).
  • There should only be one input struct per file, so we implement that method using a cache (DashMap is like a RwLock<HashMap>).
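To illustrate why a shared reference is enough, here is a minimal std-only sketch of the same get-or-insert idea, taking the comparison to RwLock<HashMap> literally. The names FileId, FileCache, and get_or_insert are hypothetical and not part of salsa or DashMap; FileId merely stands in for the Copy-able File handle.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::RwLock;

/// Hypothetical stand-in for the `File` input struct: a small `Copy` handle.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct FileId(u32);

/// A std-only cache that hands out one `FileId` per path through `&self`.
#[derive(Default)]
struct FileCache {
    files: RwLock<HashMap<PathBuf, FileId>>,
}

impl FileCache {
    /// Returns the cached id for `path`, calling `create` only if the path
    /// has not been requested before.
    fn get_or_insert(&self, path: &Path, create: impl FnOnce() -> FileId) -> FileId {
        // Fast path: a shared read lock is enough when the entry exists.
        // The read guard is released at the end of this statement.
        let cached = self.files.read().unwrap().get(path).copied();
        if let Some(id) = cached {
            return id;
        }
        // Slow path: take the write lock and go through the entry API, so an
        // insert that happened between the two locks is kept, not replaced.
        let mut files = self.files.write().unwrap();
        *files.entry(path.to_path_buf()).or_insert_with(create)
    }
}

fn main() {
    let cache = FileCache::default();
    let a = cache.get_or_insert(Path::new("a.txt"), || FileId(0));
    // Same path again: the cached id is returned and the closure never runs.
    let a_again = cache.get_or_insert(Path::new("a.txt"), || FileId(99));
    let b = cache.get_or_insert(Path::new("b.txt"), || FileId(1));
    assert_eq!(a, a_again);
    assert_ne!(a, b);
    println!("{:?} {:?} {:?}", a, a_again, b);
}
```

The lock provides the interior mutability, which is why the method can take &self. The Db::input method above gets the same effect from DashMap's entry API.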

The driving code that's doing the top-level queries is then in charge of updating the file contents when a file-change notification arrives. It does this by updating the Salsa input in the same way that you would update any other input.

Here we implement a simple driving loop that recompiles the code whenever a file changes. You can use the logs to check that only the queries that could have changed are re-evaluated.

fn main() -> Result<()> {
    // Create the channel to receive file change events.
    let (tx, rx) = unbounded();
    let mut db = Database::new(tx);

    let initial_file_path = std::env::args_os()
        .nth(1)
        .ok_or_else(|| eyre!("Usage: ./lazy-input <input-file>"))?;

    // Create the initial input using the input method so that changes to it
    // will be watched like the other files.
    let initial = db.input(initial_file_path.into())?;
    loop {
        // Compile the code starting at the provided input. This will read
        // other needed files using the on-demand mechanism.
        let sum = compile(&db, initial);
        let diagnostics = compile::accumulated::<Diagnostic>(&db, initial);
        if diagnostics.is_empty() {
            println!("Sum is: {}", sum);
        } else {
            for diagnostic in diagnostics {
                println!("{}", diagnostic);
            }
        }

        for log in db.logs.lock().unwrap().drain(..) {
            eprintln!("{}", log);
        }

        // Wait for file change events; the output can't change unless the
        // inputs change.
        for event in rx.recv()?.unwrap() {
            let path = event.path.canonicalize().wrap_err_with(|| {
                format!("Failed to canonicalize path {}", event.path.display())
            })?;
            let file = match db.files.get(&path) {
                Some(file) => *file,
                None => continue,
            };
            // `path` has changed, so read it and update the contents to match.
            // This creates a new revision and causes the incremental algorithm
            // to kick in, just like any other update to a salsa input.
            let contents = std::fs::read_to_string(path)
                .wrap_err_with(|| format!("Failed to read file {}", event.path.display()))?;
            file.set_contents(&mut db).to(contents);
        }
    }
}
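
The event-handling part of this loop can also be looked at in isolation. The following is a rough std-only sketch, not the example's actual code: a plain HashMap of contents stands in for the salsa inputs, std::sync::mpsc stands in for the watcher's channel, and apply_changes and its reload parameter are hypothetical names introduced for illustration.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::mpsc;

/// Applies one batch of change notifications to a cache of file contents.
/// `reload` is a hypothetical stand-in for re-reading a file from disk.
/// Returns how many tracked files were updated.
fn apply_changes(
    cache: &mut HashMap<PathBuf, String>,
    changed: Vec<PathBuf>,
    reload: impl Fn(&Path) -> String,
) -> usize {
    let mut updated = 0;
    for path in changed {
        // Paths that no query ever requested are not in the cache; skip them.
        let Some(contents) = cache.get_mut(&path) else {
            continue;
        };
        // Overwrite the cached contents, analogous to `set_contents(..).to(..)`.
        *contents = reload(path.as_path());
        updated += 1;
    }
    updated
}

fn main() {
    let (tx, rx) = mpsc::channel::<Vec<PathBuf>>();
    let mut cache = HashMap::new();
    cache.insert(PathBuf::from("/a.txt"), String::from("1"));

    // Simulate the watcher reporting one tracked and one untracked path.
    tx.send(vec![PathBuf::from("/a.txt"), PathBuf::from("/other.txt")])
        .unwrap();

    let batch = rx.recv().unwrap();
    let updated = apply_changes(&mut cache, batch, |_| String::from("2"));
    println!("updated {updated} file(s)");
}
```

Skipping untracked paths mirrors the `None => continue` arm above: a notification for a file that no query has read cannot affect any cached result.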