Selection

The "selection" (or "firewall") pattern is when you have a query Qsel that reads from some other Qbase and extracts some small bit of information from Qbase that it returns. In particular, Qsel does not combine values from other queries. In some sense, then, Qsel is redundant -- you could have just extracted the information the information from Qbase yourself, and done without the salsa machinery. But Qsel serves a role in that it limits the amount of re-execution that is required when Qbase changes.

Example: the base query

For example, imagine that you have a query parse that parses the input text of a request and returns a ParsedResult, which contains a header and a body:

#[derive(Clone, Debug, PartialEq, Eq)]
struct ParsedResult {
    header: Vec<ParsedHeader>,
    body: String,
}

#[derive(Clone, Debug, PartialEq, Eq)]
struct ParsedHeader {
    key: String,
    value: String,
}

#[salsa::query_group(Request)]
trait RequestParser {
    /// The base text of the request.
    #[salsa::input]
    fn request_text(&self) -> String;

    /// The parsed form of the request.
    fn parse(&self) -> ParsedResult;
} 

Example: a selecting query

And now you have a number of derived queries that only look at the header. For example, one might extract the "content-type' header:

#[salsa::query_group(Request)]
trait RequestUtil: RequestParser {
    fn content_type(&self) -> Option<String>;
}

fn content_type(db: &dyn RequestUtil) -> Option<String> {
    db.parse()
        .header
        .iter()
        .find(|header| header.key == "content-type")
        .map(|header| header.value.clone())
} 

Why prefer a selecting query?

This content_type query is an instance of the selection pattern. It only "selects" a small bit of information from the ParsedResult. You might not have made it a query at all, but instead made it a method on ParsedResult.

But using a query for content_type has an advantage: now if there are downstream queries that only depend on the content_type (or perhaps on other headers extracted via a similar pattern), those queries will not have to be re-executed when the request changes unless the content-type header changes. Consider the dependency graph:

request_text  -->  parse  -->  content_type  -->  (other queries)

When the request_text changes, we are always going to have to re-execute parse. If that produces a new parsed result, we are also going to re-execute content_type. But if the result of content_type has not changed, then we will not re-execute the other queries.

More levels of selection

In fact, in our example we might consider introducing another level of selection. Instead of having content_type directly access the results of parse, it might be better to insert a selecting query that just extracts the header:

#[salsa::query_group(Request)]
trait RequestUtil: RequestParser {
    fn header(&self) -> Vec<ParsedHeader>;
    fn content_type(&self) -> Option<String>;
}

fn header(db: &dyn RequestUtil) -> Vec<ParsedHeader> {
    db.parse().header.clone()
}

fn content_type(db: &dyn RequestUtil) -> Option<String> {
    db.header()
        .iter()
        .find(|header| header.key == "content-type")
        .map(|header| header.value.clone())
} 

This will result in a dependency graph like so:

request_text  -->  parse  -->  header -->  content_type  -->  (other queries)

The advantage of this is that changes that only effect the "body" or only consume small parts of the request will not require us to re-execute content_type at all. This would be particularly valuable if there are a lot of dependent headers.

A note on cloning and efficiency

In this example, we used common Rust types like Vec and String, and we cloned them quite frequently. This will work just fine in Salsa, but it may not be the most efficient choice. This is because each clone is going to produce a deep copy of the result. As a simple fix, you might convert your data structures to use Arc (e.g., Arc<Vec<ParsedHeader>>), which makes cloning cheap.