PLAYBOOK · 04

Playbook · Full-text search

Classic inverted-index search with BM25, analyzers, highlighters, and query-time boosting. This is the workload where XERJ replaces Elasticsearch one-for-one, and where the ES-compat API on port 9200 pays the most dividends.

Schema

$ curl -sX PUT http://localhost:9200/articles \
    -H 'Content-Type: application/json' \
    -d '{
      "mappings": {
        "properties": {
          "title":    { "type": "text", "analyzer": "english" },
          "body":     { "type": "text", "analyzer": "english" },
          "tags":     { "type": "keyword" },
          "author":   { "type": "keyword" },
          "published":{ "type": "date" }
        }
      }
    }'

Bulk ingest

$ curl -sX POST http://localhost:9200/articles/_bulk \
    -H 'Content-Type: application/x-ndjson' \
    --data-binary @articles.ndjson
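
The `articles.ndjson` file is not shown above. A minimal sketch of producing one, assuming XERJ accepts the Elasticsearch bulk format (one action line followed by one document line, newline-terminated) and auto-generates document IDs when none is given. The helper name `to_bulk_ndjson` and the sample documents are hypothetical.

```python
import json


def to_bulk_ndjson(docs):
    """Render documents as an Elasticsearch-style _bulk body.

    Each document becomes two lines: an action line and the document itself.
    The index name is omitted from the action because it is already in the
    request path (/articles/_bulk).
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {}}))  # assumption: server assigns the _id
        lines.append(json.dumps(doc))
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)


# Hypothetical sample documents matching the mapping above.
sample_docs = [
    {
        "title": "Debugging a kernel panic",
        "body": "The node hit a kernel panic and went into a reboot loop.",
        "tags": ["ops", "linux"],
        "author": "alice",
        "published": "2024-01-10",
    },
    {
        "title": "Kubernetes upgrade notes",
        "body": "Rolling upgrades on kubernetes without downtime.",
        "tags": ["ops"],
        "author": "bob",
        "published": "2024-02-15",
    },
]

payload = to_bulk_ndjson(sample_docs)
print(payload, end="")
```

Writing `payload` to `articles.ndjson` gives a file the curl command above can send with `--data-binary`, which preserves the newlines that `-d` would strip.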

Multi-field query with field boosting

{
  "query": {
    "multi_match": {
      "query":  "kernel panic reboot",
      "fields": ["title^3", "body"],
      "type":   "best_fields"
    }
  },
  "size": 10,
  "sort": [ "_score", { "published": "desc" } ]
}
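
In the Elasticsearch API this body is sent to the index's `_search` endpoint, which is assumed to be the same here. How `title^3` and `best_fields` combine can be sketched as follows, assuming XERJ follows Elasticsearch semantics: each field's score is multiplied by its boost, and the document takes the best single field score (plus `tie_breaker` times the rest, which defaults to 0). The per-field scores below are made-up stand-ins for BM25 output, and the function names are hypothetical.

```python
def parse_boosts(fields):
    """Turn ["title^3", "body"] into {"title": 3.0, "body": 1.0}."""
    boosts = {}
    for spec in fields:
        name, _, boost = spec.partition("^")
        boosts[name] = float(boost) if boost else 1.0
    return boosts


def best_fields_score(field_scores, boosts, tie_breaker=0.0):
    """Combine per-field scores the way a best_fields multi_match does."""
    boosted = [score * boosts.get(field, 1.0) for field, score in field_scores.items()]
    if not boosted:
        return 0.0
    best = max(boosted)
    # With the default tie_breaker of 0, only the best field counts.
    return best + tie_breaker * (sum(boosted) - best)


boosts = parse_boosts(["title^3", "body"])

# Hypothetical raw per-field scores for three matching documents.
hits = [
    {"id": "a", "published": "2024-01-10", "field_scores": {"title": 1.0, "body": 2.5}},
    {"id": "b", "published": "2024-03-02", "field_scores": {"body": 2.8}},
    {"id": "c", "published": "2024-02-15", "field_scores": {"title": 1.0}},
]
for hit in hits:
    hit["score"] = best_fields_score(hit["field_scores"], boosts)

# Mirror the "sort" clause: score first, newest first on ties.
# ISO dates compare correctly as strings.
ranked = sorted(hits, key=lambda h: (h["score"], h["published"]), reverse=True)
print([(h["id"], h["score"]) for h in ranked])
```

A weak title match (1.0 boosted to 3.0) outranks a stronger body-only match (2.8), which is the point of the boost. The `published` sort key only matters when two scores are exactly equal.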

Phrase search with slop

{
  "query": {
    "match_phrase": {
      "body": { "query": "kernel panic", "slop": 2 }
    }
  }
}
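
What `slop` permits can be sketched as below, assuming XERJ follows Lucene-style sloppy phrase semantics: each term's position is shifted back by its offset in the phrase, and the slop needed is the spread of those shifted positions. The sketch uses plain whitespace tokens rather than the `english` analyzer, does not handle phrases that repeat a term, and the function names are hypothetical.

```python
from itertools import product


def min_slop(doc_tokens, phrase_tokens):
    """Smallest slop at which the phrase matches, or None if a term is missing."""
    shifted = []
    for offset, term in enumerate(phrase_tokens):
        # Subtract the term's offset in the phrase, so an exact phrase
        # occurrence maps every term to the same shifted position.
        hits = [pos - offset for pos, tok in enumerate(doc_tokens) if tok == term]
        if not hits:
            return None
        shifted.append(hits)
    # Try every combination of occurrences and keep the tightest one.
    return min(max(combo) - min(combo) for combo in product(*shifted))


def phrase_matches(doc_tokens, phrase_tokens, slop):
    needed = min_slop(doc_tokens, phrase_tokens)
    return needed is not None and needed <= slop


phrase = ["kernel", "panic"]
for text in [
    "kernel panic",             # exact phrase
    "the kernel hit a panic",   # two words in between
    "panic kernel",             # transposed
    "kernel then hit a panic",  # three words in between
]:
    print(text, "->", phrase_matches(text.split(), phrase, slop=2))
```

Under these assumptions, `slop: 2` tolerates up to two intervening tokens, or a swap of the two terms, but not three intervening tokens.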

Filtered search with facets

{
  "query": {
    "bool": {
      "must":   [ { "match": { "body": "kubernetes" } } ],
      "filter": [ { "term": { "tags": "ops" } } ]
    }
  },
  "aggs": {
    "by_author": { "terms": { "field": "author", "size": 10 } },
    "over_time": { "date_histogram": { "field": "published", "calendar_interval": "month" } }
  }
}
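
In Elasticsearch, `filter` clauses narrow the result set without contributing to `_score`, and each aggregation comes back under `aggregations.<name>.buckets`. A minimal sketch of turning that response into facet counts, assuming XERJ returns the same response shape. The `sample_response` values and the helper name `facet_counts` are hypothetical.

```python
def facet_counts(response, agg_name):
    """Return (label, doc_count) pairs for one bucket aggregation."""
    buckets = response.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    # date_histogram buckets carry a readable key_as_string next to the
    # epoch-millis key; terms buckets only have key.
    return [(b.get("key_as_string", b["key"]), b["doc_count"]) for b in buckets]


# Hypothetical response fragment in the Elasticsearch aggregation format.
sample_response = {
    "hits": {"total": {"value": 4, "relation": "eq"}, "hits": []},
    "aggregations": {
        "by_author": {
            "buckets": [
                {"key": "alice", "doc_count": 3},
                {"key": "bob", "doc_count": 1},
            ]
        },
        "over_time": {
            "buckets": [
                {"key_as_string": "2024-01-01T00:00:00.000Z", "key": 1704067200000, "doc_count": 3},
                {"key_as_string": "2024-02-01T00:00:00.000Z", "key": 1706745600000, "doc_count": 1},
            ]
        },
    },
}

for name in ("by_author", "over_time"):
    print(name, facet_counts(sample_response, name))
```

Adding `"size": 0` to the request is the usual way to fetch facets alone when the hits themselves are not needed.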

Source · engine/crates/fts/src/bm25.rs