# Generation

`forge-generate` provides higher-level helpers built on top of
`LanguageModel::stream_chunks`. Use these when you need text or structured
output without the full agent tool loop.
## Plain text — buffered

```rust
use forge::generate::stream_text;

let result = stream_text(&model, "Write a haiku about the forge.", &options).await?;
println!("{}", result.text);
println!("[{} tokens]", result.usage.completion_tokens);
```
`stream_text` consumes the full chunk stream and returns a `TextStreamResult`
with the assembled text and usage. Internally it drives `stream_chunks` to
completion and accumulates on the caller side; the model itself streams with
no buffering.
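The accumulation that `stream_text` performs could be sketched roughly as follows. This is an illustration only, not the crate's actual implementation: it assumes `StreamChunk` variants named `TextDelta` and `Done` (as in the tool-aware example later in this page), that `Done` carries the final `usage`, and that `TextStreamResult` and `Usage` can be constructed directly.

```rust
use futures_util::StreamExt;

// Sketch only: accumulate text deltas from the raw chunk stream into a
// single String, capturing usage from the final chunk.
async fn collect_text(
    model: &impl LanguageModel,
    messages: &[Message],
    options: &GenerateOptions,
) -> Result<TextStreamResult, ForgeError> {
    let mut stream = model.stream_chunks(messages, &[], options).await?;
    let mut text = String::new();
    let mut usage = Usage::default();
    while let Some(item) = stream.next().await {
        match item? {
            StreamChunk::TextDelta { text: delta } => text.push_str(&delta),
            StreamChunk::Done { usage: u, .. } => usage = u,
            _ => {}
        }
    }
    Ok(TextStreamResult { text, usage })
}
```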
## Plain text — streamed

```rust
use forge::generate::stream_text_chunks;
use futures_util::StreamExt;

let mut stream = stream_text_chunks(&model, "Write a haiku.", &options).await?;
while let Some(delta) = stream.next().await {
    print!("{}", delta?);
}
```
Yields `String` deltas as they arrive — the lightest path to per-token text
output without writing your own match on `StreamChunk`.
## Structured output

For typed outputs, use `generate_object`:

```rust
use forge::generate::generate_object;
use schemars::JsonSchema;
use serde::Deserialize;

// JsonSchema is required for schemars::schema_for! below.
#[derive(Deserialize, JsonSchema, Debug)]
struct WeatherSummary {
    city: String,
    temperature_f: f64,
    conditions: String,
    confidence: f64,
}

let summary: WeatherSummary = generate_object(
    &model,
    "Summarise the weather in Mountain View.",
    schemars::schema_for!(WeatherSummary),
    &options,
).await?;
```
`generate_object` adds the JSON schema to the request, asks the model for
JSON output, parses the response, and returns the typed value. Failures are
returned as `ForgeError::SchemaValidation`.
For streamed structured output, use `stream_object`, which yields partial
values as the JSON is assembled.
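Consuming `stream_object` might look like the sketch below. The exact signature and item type are assumptions here (a stream of partial `WeatherSummary` snapshots, yielded as each JSON fragment lands); check the crate docs for the real shape.

```rust
use forge::generate::stream_object;
use futures_util::StreamExt;

// Illustrative only: the generic parameter and yielded item type are
// assumed, not confirmed API.
let mut stream = stream_object::<WeatherSummary>(
    &model,
    "Summarise the weather in Mountain View.",
    schemars::schema_for!(WeatherSummary),
    &options,
).await?;

while let Some(partial) = stream.next().await {
    // Each item is a snapshot of the object parsed so far — useful for
    // progressive UI updates.
    println!("{:?}", partial?);
}
```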
## GenerateOptions

```rust
let options = GenerateOptions::default()
    .with_temperature(0.2)
    .with_max_tokens(1024)
    .with_stop_sequences(vec!["END".into()])
    .with_system_prompt("Respond strictly in valid JSON.")
    .with_metadata("trace_id", trace_id);
```
Available knobs:

| Method | Purpose |
|---|---|
| `with_temperature(f32)` | Sampling temperature |
| `with_max_tokens(u32)` | Output cap |
| `with_top_p(f32)` | Nucleus sampling |
| `with_stop_sequences(Vec<String>)` | Halt on these strings |
| `with_system_prompt(String)` | One-off system prompt |
| `with_metadata(key, value)` | Forwarded as observability metadata |
## Tool-aware generation

If you want the model to call tools but don't need the full agent runtime
(no observers, no tool-loop bound, no record collection), use
`stream_chunks` directly with a tools argument and execute them yourself:

```rust
use futures_util::StreamExt;

let mut stream = model
    .stream_chunks(&messages, &tool_defs, &options)
    .await?;

while let Some(item) = stream.next().await {
    match item? {
        StreamChunk::TextDelta { text } => print!("{text}"),
        StreamChunk::ToolCallEnd { id } => {
            // Look up the buffered tool call, execute, append result, continue
        }
        StreamChunk::Done { .. } => break,
        _ => {}
    }
}
```
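The "buffer, execute, continue" step could be filled in roughly as below. This is a sketch under loudly stated assumptions: the `ToolCallStart`/`ToolCallDelta` variants, the `args_delta` field, `Message::tool_result`, and `execute_tool` are all hypothetical names introduced for illustration, not confirmed crate API.

```rust
use std::collections::HashMap;

// Hypothetical accumulator for streamed tool-call argument JSON, keyed by
// call id.
let mut pending: HashMap<String, String> = HashMap::new();

while let Some(item) = stream.next().await {
    match item? {
        StreamChunk::TextDelta { text } => print!("{text}"),
        // Assumed variants: buffer argument fragments as they stream in.
        StreamChunk::ToolCallStart { id, .. } => {
            pending.insert(id, String::new());
        }
        StreamChunk::ToolCallDelta { id, args_delta } => {
            pending.entry(id).or_default().push_str(&args_delta);
        }
        StreamChunk::ToolCallEnd { id } => {
            let args = pending.remove(&id).unwrap_or_default();
            // `execute_tool` and `Message::tool_result` are placeholders
            // for your own dispatch and message construction.
            let output = execute_tool(&id, &args).await?;
            messages.push(Message::tool_result(id, output));
            // Then call stream_chunks again with the appended result to
            // let the model continue.
        }
        StreamChunk::Done { .. } => break,
        _ => {}
    }
}
```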
For real production use, build a `StreamingToolLoopAgent` instead — see
Agents.