
# Auto Summarize

> Automatically compress long conversation histories so agents stay within the model's context window

Auto Summarize keeps long agent conversations working by compressing prior history when it approaches the model's context limit.

## What it does

Every agent has this tool enabled by default. Before each turn, Fetch Hive checks whether the accumulated conversation history is approaching the model's context window. If it is, the prior turns are automatically summarized into a single compact message, and the agent continues with that summary as its starting context instead of the raw history.

The agent still knows what was discussed, but it works from a condensed version of the earlier turns rather than every token verbatim, so fine detail from early turns may not survive.
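
The pre-turn check described above can be pictured with a minimal sketch. This is not Fetch Hive's implementation: the `threshold_ratio` value, the characters-per-token estimate, the `Message` type, and the role given to the summary message are all assumptions made for illustration. The real threshold and summary model are set at the platform level.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    role: str
    content: str


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly 4 characters per token (assumption).
    return max(1, len(text) // 4)


def maybe_compact(
    history: List[Message],
    context_limit: int,
    summarize: Callable[[List[Message]], str],
    threshold_ratio: float = 0.8,  # hypothetical; the real threshold is platform-defined
) -> List[Message]:
    """Return the history unchanged, or a single summary message if it is near the limit."""
    total = sum(estimate_tokens(m.content) for m in history)
    if total < context_limit * threshold_ratio:
        return history
    # Replace the prior turns with one compact message (the role is an assumption).
    return [Message(role="system", content=summarize(history))]
```

The model only ever receives the return value, which is why it cannot tell whether compaction happened.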

## How it fires

Auto Summarize is not a tool the agent calls. It runs as a server-side check before the model ever sees the conversation. The agent and the LLM are unaware of it — from the model's perspective, it simply receives a well-formed conversation history that fits its context window.

When summarization fires during a streaming run, a `summary` event arrives at the start of the stream before any response tokens:

```json
{
  "type": "summary",
  "summary_text": "The conversation covered AI infrastructure trends. The user asked about evals...",
  "original_token_count": 15234,
  "context_limit": 200000,
  "model": "gpt-4.1",
  "provider": "openai"
}
```

In the **Chat** panel inside the agent editor, a **Chat summarized** accordion appears in the conversation at the point where summarization occurred. Click it to expand and read the full summary text and token counts.

See [Run with API](../run-with-api) for how to handle this event in your own integration.
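
As a rough illustration of client-side handling, the sketch below separates the `summary` event from the rest of a stream. It assumes each event has already been decoded to one JSON object per line; the actual transport and the shape of non-summary events are not specified here, so the `token` event and the `split_summary` helper are hypothetical.

```python
import json
from typing import Iterable, List, Optional, Tuple


def split_summary(lines: Iterable[str]) -> Tuple[Optional[dict], List[dict]]:
    """Separate the optional summary event from the remaining stream events."""
    summary: Optional[dict] = None
    rest: List[dict] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        if event.get("type") == "summary":
            summary = event
        else:
            rest.append(event)
    return summary, rest


# Example input. The "token" event shape is hypothetical.
stream = [
    '{"type": "summary", "summary_text": "Earlier turns covered evals.", '
    '"original_token_count": 15234, "context_limit": 200000}',
    '{"type": "token", "text": "Hello"}',
]
summary, rest = split_summary(stream)
if summary is not None:
    print(f"Summarized {summary['original_token_count']} tokens")
```

Because the event is optional, an integration should treat a missing `summary` as the normal case rather than an error.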

## Enabling and disabling

The tool node appears on every agent canvas with a **System** badge. To disable it for a specific agent:

1. Select the **Auto Summarize** node in the editor.
2. In the settings sheet, switch the toggle to **Disabled**.

Disabling it means the agent will send the full raw history on every turn. If the conversation grows beyond the model's context limit, the oldest messages will be truncated by the model provider.

## Configuration

There are no per-agent configuration options for this tool. The summarization threshold and the model used to write summaries are set at the platform level by your workspace operator.

## Use cases

* Long support or research conversations that span many turns and must not lose earlier context.
* Agents running in `thread_id` mode where conversations persist across multiple sessions.
* Any use case where you want the agent to stay coherent over a long interaction without manual history management.

## Notes

* Auto Summarize only fires on persistent threads (calls that include a `thread_id`). Single-shot calls and stateless history passed via the `messages` field are not affected.
* The summarization call is made by Fetch Hive — it does not count against your token usage for that turn.
* If the summarization service is unavailable for any reason, the agent run proceeds normally with the full history. The feature is fail-open.
* To test it, use **Chat** in the agent editor — the **Chat summarized** accordion appears when the threshold is crossed.
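
The eligibility and fail-open rules in these notes can be summarized in a short sketch. It is an illustration only, not the platform's code: `prepare_history`, the `over_threshold` flag, and the `summarize` callable are hypothetical names, and the request is modeled as a plain dictionary.

```python
from typing import Callable, List


def prepare_history(
    request: dict,
    history: List[str],
    over_threshold: bool,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Decide what history the model receives for this turn."""
    # Only persistent threads are eligible; stateless calls pass through untouched.
    if "thread_id" not in request or not over_threshold:
        return history
    try:
        return [summarize(history)]
    except Exception:
        # Fail-open: if summarization is unavailable, continue with the full history.
        return history
```

Catching the failure and returning the original history means an outage in the summarizer degrades to the same behavior as having the node disabled, rather than failing the run.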
