Embeddings and RAG

forge-embed contains the retrieval primitives needed for retrieval-augmented generation (RAG) pipelines. It keeps embedding, reranking, vector storage, and document loading behind provider-neutral traits.

Runtime pieces

Piece             Purpose
Embedding         Generate vector representations for text.
Similarity        Compute cosine similarity, dot product, or Euclidean distance.
Reranking         Reorder candidate documents by relevance.
Vector store      Store vectors and run nearest-neighbor search.
Chunking          Split documents into embeddable chunks.
Document loading  Load text and JSON documents into the pipeline.
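
To make the Similarity row concrete, here is a minimal sketch of the three metrics over plain f32 slices. The function names are illustrative assumptions, not forge-embed's actual API; only the math is standard.

```rust
/// Dot product of two equal-length vectors; larger means more similar.
/// (Hypothetical helper name, not taken from forge-embed.)
fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Euclidean (L2) distance; smaller means closer.
fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f32>()
        .sqrt()
}

/// Cosine similarity in [-1, 1]. Returns 0.0 when either vector has zero
/// norm, which avoids a division by zero for empty or all-zero embeddings.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let norm_a = dot(a, a).sqrt();
    let norm_b = dot(b, b).sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot(a, b) / (norm_a * norm_b)
}
```

When embeddings are already normalised to unit length, the dot product and cosine similarity give the same ranking, so the cheaper dot product is usually enough.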

Reference types

The Rust reference exposes:

  • EmbeddingProvider
  • Reranker
  • VectorStore
  • InMemoryVectorStore
  • RecursiveCharacterSplitter
  • TokenSplitter
  • TextLoader
  • JsonLoader
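
The splitter types listed above turn a document into embeddable chunks. As a rough illustration of the idea only, the sketch below uses a fixed-size character window with overlap. It is a simpler technique than recursive or token-based splitting, and `split_with_overlap` is a hypothetical name, not a forge-embed function.

```rust
/// Split `text` into chunks of at most `chunk_size` characters, where each
/// chunk repeats the last `overlap` characters of the previous one.
/// Simplified stand-in: it does not try to break on paragraph or sentence
/// boundaries the way a recursive splitter would.
fn split_with_overlap(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    assert!(overlap < chunk_size, "overlap must be smaller than chunk_size");

    // Index by chars, not bytes, so multi-byte UTF-8 is never cut in half.
    let chars: Vec<char> = text.chars().collect();
    let step = chunk_size - overlap;
    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;

    while start < chars.len() {
        let end = (start + chunk_size).min(chars.len());
        chunks.push(chars[start..end].iter().collect::<String>());
        if end == chars.len() {
            break; // the final chunk reached the end of the text
        }
        start += step;
    }
    chunks
}
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.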

Usage pattern

A typical Forge RAG flow is:

  1. Load source documents.
  2. Split them into chunks.
  3. Embed chunks through a provider.
  4. Store vectors with provenance metadata.
  5. Embed the user query.
  6. Search the vector store.
  7. Rerank the candidates.
  8. Pass selected context into generation.
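
The steps above can be sketched end to end with toy stand-ins. Everything here is an assumption made for illustration: `toy_embed`, `Entry`, `ToyStore`, and `build_context` are hypothetical names, the embedding is a letter-frequency vector with no semantic meaning, and documents are passed in as strings rather than loaded from disk. Reranking (step 7) is left out; a real pipeline would reorder the candidates with a reranker before building the context.

```rust
/// Toy stand-in for an embedding provider: a unit-length letter-frequency
/// vector. A real provider would call an embedding model instead.
fn toy_embed(text: &str) -> Vec<f32> {
    let mut v = vec![0.0_f32; 26];
    for c in text.chars().filter(|c| c.is_ascii_alphabetic()) {
        v[(c.to_ascii_lowercase() as u8 - b'a') as usize] += 1.0;
    }
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        v.iter_mut().for_each(|x| *x /= norm);
    }
    v
}

/// One stored chunk plus the provenance needed to trace it (step 4).
struct Entry {
    vector: Vec<f32>,
    text: String,
    source: String,
}

/// Hypothetical brute-force store, not the crate's InMemoryVectorStore.
#[derive(Default)]
struct ToyStore {
    entries: Vec<Entry>,
}

impl ToyStore {
    /// Steps 2-4: split a document into non-empty lines, embed each line,
    /// and store it together with its source name.
    fn ingest(&mut self, source: &str, document: &str) {
        for chunk in document.lines().map(str::trim).filter(|l| !l.is_empty()) {
            self.entries.push(Entry {
                vector: toy_embed(chunk),
                text: chunk.to_string(),
                source: source.to_string(),
            });
        }
    }

    /// Step 6: score every entry and keep the `k` best. The dot product
    /// equals cosine similarity here because all vectors are unit length.
    fn search(&self, query: &[f32], k: usize) -> Vec<(f32, &Entry)> {
        let mut scored: Vec<(f32, &Entry)> = self
            .entries
            .iter()
            .map(|e| (e.vector.iter().zip(query).map(|(a, b)| a * b).sum::<f32>(), e))
            .collect();
        scored.sort_by(|a, b| b.0.total_cmp(&a.0)); // highest score first
        scored.truncate(k);
        scored
    }
}

/// Steps 5, 6, and 8: embed the question, search, and format the selected
/// chunks (with their sources) as context for the generation prompt.
fn build_context(store: &ToyStore, question: &str, k: usize) -> String {
    let query = toy_embed(question);
    store
        .search(&query, k)
        .iter()
        .map(|(_, e)| format!("[{}] {}", e.source, e.text))
        .collect::<Vec<_>>()
        .join("\n")
}
```

Keeping the source name next to each vector is what lets the final context cite where every chunk came from.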

When RAG runs inside an accountable agent, tie the retrieved context to the agent run through telemetry and audit metadata, so that each retrieved chunk can be traced back to the run that used it.
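
One way to picture that link is a small audit record keyed by run. This is a sketch under assumptions: `RetrievalAuditRecord`, `RetrievedChunk`, the field names, and the log format are all hypothetical and do not describe Forge's actual telemetry schema.

```rust
/// A chunk that was handed to the model, identified by where it came from.
#[derive(Debug, Clone, PartialEq)]
struct RetrievedChunk {
    source: String,
    chunk_index: usize,
    score: f32,
}

/// Hypothetical audit record tying one retrieval to one agent run.
#[derive(Debug, Clone, PartialEq)]
struct RetrievalAuditRecord {
    run_id: String,
    query: String,
    chunks: Vec<RetrievedChunk>,
}

impl RetrievalAuditRecord {
    fn new(run_id: &str, query: &str) -> Self {
        Self {
            run_id: run_id.to_string(),
            query: query.to_string(),
            chunks: Vec::new(),
        }
    }

    /// Note that a chunk was selected as context for this run.
    fn record(&mut self, source: &str, chunk_index: usize, score: f32) {
        self.chunks.push(RetrievedChunk {
            source: source.to_string(),
            chunk_index,
            score,
        });
    }

    /// One line per chunk, suitable for an append-only log.
    fn to_log_lines(&self) -> Vec<String> {
        self.chunks
            .iter()
            .map(|c| {
                format!(
                    "run={} query={:?} source={} chunk={} score={:.2}",
                    self.run_id, self.query, c.source, c.chunk_index, c.score
                )
            })
            .collect()
    }
}
```

Every log line carries the run identifier, so a reviewer can filter the log by run and see exactly which chunks fed a given answer.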