Embeddings and RAG
forge-embed contains the retrieval primitives needed for retrieval-augmented generation (RAG) pipelines.
It keeps embedding, reranking, vector storage, and document loading behind
provider-neutral traits.
Runtime pieces
| Piece | Purpose |
|---|---|
| Embedding | Generate vector representations for text. |
| Similarity | Compute cosine similarity, dot product, or Euclidean distance. |
| Reranking | Reorder candidate documents by relevance. |
| Vector store | Store vectors and run nearest-neighbor search. |
| Chunking | Split documents into embeddable chunks. |
| Document loading | Load text and JSON documents into the pipeline. |
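The three similarity measures named in the table can be sketched in plain Rust as follows. This is a minimal illustration of the math only: the function names `dot_product`, `euclidean_distance`, and `cosine_similarity` are assumptions made for this example and are not taken from the forge-embed API.

```rust
/// Dot product of two equal-length vectors.
/// Returns `None` when the lengths differ.
/// (Hypothetical helper, not a forge-embed function.)
pub fn dot_product(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    Some(a.iter().zip(b).map(|(x, y)| x * y).sum())
}

/// Euclidean (L2) distance between two equal-length vectors.
/// Smaller values mean the vectors are closer.
pub fn euclidean_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let squared: f32 = a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum();
    Some(squared.sqrt())
}

/// Cosine similarity: the dot product divided by the product of the norms.
/// Returns `None` for mismatched lengths or when either vector has zero
/// length, because the angle is undefined in that case.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    let dot = dot_product(a, b)?;
    let norm_a = dot_product(a, a)?.sqrt();
    let norm_b = dot_product(b, b)?.sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

Cosine similarity ignores vector magnitude, whereas dot product and Euclidean distance do not. When embeddings are normalized to unit length, all three produce the same nearest-neighbor ranking.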
Reference types
The Rust reference exposes:
- EmbeddingProvider
- Reranker
- VectorStore
- InMemoryVectorStore
- RecursiveCharacterSplitter
- TokenSplitter
- TextLoader
- JsonLoader
Usage pattern
A typical Forge RAG flow is:
- Load source documents.
- Split them into chunks.
- Embed chunks through a provider.
- Store vectors with provenance metadata.
- Embed the user query.
- Search the vector store.
- Rerank the candidates.
- Pass selected context into generation.
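The steps above can be sketched end to end with toy stand-ins. Everything in this example is an assumption made for illustration: `Entry`, `toy_embed`, `index`, `search`, `rerank`, and `build_context` are hypothetical names, the embedding is a simple word-hashing scheme, and the reranker counts word overlap. None of it reflects the actual forge-embed traits or signatures.

```rust
use std::collections::HashSet;

const DIM: usize = 32; // toy embedding width (assumption)
const CHUNK_WORDS: usize = 8; // toy chunk size in words (assumption)

/// One stored chunk plus the provenance needed to trace it to its source.
#[derive(Debug, Clone)]
struct Entry {
    source: String,
    chunk_index: usize,
    text: String,
    vector: Vec<f32>,
}

/// Lowercased words with surrounding punctuation trimmed.
fn words(text: &str) -> Vec<String> {
    text.split_whitespace()
        .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_lowercase())
        .filter(|w| !w.is_empty())
        .collect()
}

/// Toy embedding: hash each word into one of DIM buckets and count hits.
fn toy_embed(text: &str) -> Vec<f32> {
    let mut v = vec![0.0f32; DIM];
    for w in words(text) {
        let h = w
            .bytes()
            .fold(0usize, |h, b| h.wrapping_mul(31).wrapping_add(b as usize));
        v[h % DIM] += 1.0;
    }
    v
}

/// Cosine similarity, returning 0.0 when either vector is all zeros.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Steps 1-4: take (source, text) pairs, split, embed, store with provenance.
fn index(docs: &[(&str, &str)], max_words: usize) -> Vec<Entry> {
    let mut store = Vec::new();
    for (source, text) in docs {
        let all: Vec<&str> = text.split_whitespace().collect();
        for (i, chunk) in all.chunks(max_words.max(1)).enumerate() {
            let chunk_text = chunk.join(" ");
            store.push(Entry {
                source: source.to_string(),
                chunk_index: i,
                vector: toy_embed(&chunk_text),
                text: chunk_text,
            });
        }
    }
    store
}

/// Steps 5-6: embed the query and return the k most similar entries.
fn search<'a>(store: &'a [Entry], query: &str, k: usize) -> Vec<&'a Entry> {
    let q = toy_embed(query);
    let mut scored: Vec<(f32, &Entry)> =
        store.iter().map(|e| (cosine(&q, &e.vector), e)).collect();
    scored.sort_by(|a, b| b.0.total_cmp(&a.0)); // highest similarity first
    scored.into_iter().take(k).map(|(_, e)| e).collect()
}

/// Number of chunk words that also occur in the query.
fn overlap(text: &str, query_words: &HashSet<String>) -> usize {
    words(text).iter().filter(|w| query_words.contains(*w)).count()
}

/// Step 7: toy reranker that orders candidates by query-word overlap.
fn rerank<'a>(mut candidates: Vec<&'a Entry>, query: &str) -> Vec<&'a Entry> {
    let q: HashSet<String> = words(query).into_iter().collect();
    candidates.sort_by_key(|e| std::cmp::Reverse(overlap(&e.text, &q)));
    candidates
}

/// Step 8: format the top n chunks with their provenance for generation.
fn build_context(docs: &[(&str, &str)], query: &str, k: usize, n: usize) -> Vec<String> {
    let store = index(docs, CHUNK_WORDS);
    let candidates = search(&store, query, k);
    let context: Vec<String> = rerank(candidates, query)
        .into_iter()
        .take(n)
        .map(|e| format!("[{}#{}] {}", e.source, e.chunk_index, e.text))
        .collect();
    context
}
```

Calling `build_context(&docs, "ownership rules in Rust", 2, 1)` on a small set of `(source, text)` pairs returns one formatted line that carries the source name and chunk index alongside the chunk text. Retrieving a wider candidate set (`k`) and then narrowing it (`n`) mirrors the search-then-rerank split in the list above.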
When RAG runs inside an accountable agent, retrieved context should be tied to the agent run through telemetry and audit metadata.