# Run with API

Use the public agent invoke endpoint when you want to send a message to an agent from your own app or service. In Fetch Hive, you can copy the request shape from **More** -> **Get Code** in the agents sidebar or from **Code Snippet** in the agent editor.

## Authentication

Send your API key as a bearer token in the `Authorization` header:

```http
Authorization: Bearer YOUR_API_KEY
```

See [API Keys](/your-workspace/api-keys.md) for how to create and manage keys.

## Endpoint

`POST https://api.fetchhive.com/v1/agent/invoke`

To have Fetch Hive generate the cURL request for you, open **Agents** and choose **More** -> **Get Code**. If you are already editing a specific agent, click **Code Snippet** instead.

## Request

The request body is a JSON object with the following fields:

| Field       | Type    | Required | Description                                                                                                                                                                            |
| ----------- | ------- | -------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `agent`     | string  | Yes      | The agent ID                                                                                                                                                                           |
| `message`   | string  | Yes      | The message you want to send to the agent                                                                                                                                              |
| `streaming` | boolean | No       | Whether the response should stream back as events                                                                                                                                      |
| `thread_id` | string  | No       | An arbitrary string identifying the conversation thread. Fetch Hive creates a new thread on first use and resumes it on subsequent calls with the same value.                          |
| `messages`  | array   | No       | Previous conversation turns supplied by the caller. Used as context without persisting to the database. Each item: `{ "content": string, "role": "user" \| "assistant" \| "system" }`. |

The in-app snippet shows the same body shape:

```json
{
  "agent": "AGENT_UUID",
  "message": "Your message here",
  "streaming": true
}
```
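The field table above can be turned into a small builder that enforces the two required fields and the allowed `role` values. This is a minimal sketch: `build_invoke_body` is a hypothetical helper (not part of any Fetch Hive SDK), and it assumes that leaving an optional field out of the body is acceptable to the endpoint.

```python
import json
from typing import Optional

# Roles listed in the `messages` row of the request table.
ALLOWED_ROLES = {"user", "assistant", "system"}


def build_invoke_body(
    agent: str,
    message: str,
    streaming: Optional[bool] = None,
    thread_id: Optional[str] = None,
    messages: Optional[list] = None,
) -> dict:
    """Assemble the JSON body for the invoke endpoint (hypothetical helper)."""
    if not agent or not message:
        raise ValueError("`agent` and `message` are required")
    body = {"agent": agent, "message": message}
    # Optional fields are only included when the caller sets them (assumption).
    if streaming is not None:
        body["streaming"] = streaming
    if thread_id is not None:
        body["thread_id"] = thread_id
    if messages is not None:
        for turn in messages:
            if turn.get("role") not in ALLOWED_ROLES:
                raise ValueError(f"unsupported role: {turn.get('role')!r}")
            if not isinstance(turn.get("content"), str):
                raise ValueError("each turn needs a string `content`")
        body["messages"] = messages
    return body


body = build_invoke_body("AGENT_UUID", "Your message here", streaming=True)
print(json.dumps(body, indent=2))
```

Validating locally keeps malformed turns from reaching the API, where the error would be harder to trace back to the caller.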

## Basic example

```bash
curl 'https://api.fetchhive.com/v1/agent/invoke' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
    "agent": "AGENT_UUID",
    "message": "Summarize the latest AI infrastructure trends",
    "streaming": true
  }' \
  --compressed
```

This matches the cURL snippet shown in the product. The invoke dialog currently shows **cURL**, while **Python** and **TypeScript** still show **Coming Soon**.
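Until the in-product Python snippet is available, the same request can be sketched with Python's standard library. This is not the official snippet: `build_request` and `invoke` are hypothetical helper names, and the sketch sets `streaming` to `false` so that the reply can be read as a single JSON object.

```python
import json
import urllib.request

INVOKE_URL = "https://api.fetchhive.com/v1/agent/invoke"


def build_request(api_key: str, agent: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming invoke request."""
    body = {"agent": agent, "message": message, "streaming": False}
    return urllib.request.Request(
        INVOKE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def invoke(api_key: str, agent: str, message: str) -> dict:
    """Send the request and decode the single JSON response."""
    request = build_request(api_key, agent, message)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `invoke("YOUR_API_KEY", "AGENT_UUID", "Summarize the latest AI infrastructure trends")` sends the request. Keeping `build_request` separate makes the headers and body easy to inspect without any network traffic.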

## Response

If `streaming` is `true`, the route returns a stream of events rather than one final JSON object.

### Streaming response

The stream can include a summary event, reasoning chunks, response chunks, tool events, and a final usage event. Events arrive in this order when all are present: `summary` → `reasoning` → `response` → `tool` → `usage`.

Summary event (only emitted on threads with auto-summarization enabled):

```json
{
  "type": "summary",
  "summary_text": "The conversation covered AI infrastructure trends. The user asked about evals and tool routing...",
  "original_token_count": 15234,
  "context_limit": 200000,
  "model": "gpt-4.1",
  "provider": "openai"
}
```

This event fires at the start of the stream when the accumulated thread history has crossed the auto-summarization threshold. It means the prior conversation was compressed into a summary before being sent to the model: the agent retains the context, but the raw history has been condensed. `original_token_count` is the token count before compression, and `context_limit` is the model's total context window. You can surface this event to users as a "conversation summarized" indicator or ignore it; your app's behavior is unaffected either way.
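One way to build a "conversation summarized" indicator is to report how much of the context window the history occupied before compression. The sketch below uses only the two fields described above; `summary_notice` is a hypothetical helper, and the wording of the notice is illustrative.

```python
def summary_notice(event: dict) -> str:
    """Format a user-facing notice from a `summary` event (hypothetical helper)."""
    if event.get("type") != "summary":
        raise ValueError("expected a summary event")
    # Share of the model's context window used by the uncompressed history.
    share = event["original_token_count"] / event["context_limit"]
    return (
        f"Conversation summarized: history was "
        f"{event['original_token_count']:,} tokens "
        f"({share:.1%} of the {event['context_limit']:,}-token context window)."
    )


sample = {"type": "summary", "original_token_count": 15234, "context_limit": 200000}
print(summary_notice(sample))
```

For the sample event shown earlier, 15234 of 200000 tokens works out to about 7.6% of the context window.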

Reasoning event:

```json
{
  "request_id": "req_019d52846ea37682b03522fd0695cc43",
  "type": "reasoning",
  "response": "Looking at the latest model releases..."
}
```

Response event:

```json
{
  "request_id": "req_019d52846ea37682b03522fd0695cc43",
  "type": "response",
  "response": "Teams are standardizing around evals, routing, and observability.",
  "done": false
}
```

Tool event:

```json
{
  "request_id": "req_019d52846ea37682b03522fd0695cc43",
  "type": "tool",
  "tool_id": "tool_123",
  "tool": "google_search",
  "tool_input": {
    "query": "latest AI infrastructure trends 2026"
  },
  "observation": {
    "results": []
  }
}
```

Final usage event:

```json
{
  "request_id": "req_019d52846ea37682b03522fd0695cc43",
  "type": "usage",
  "usage": {
    "duration": 4.79230260848999,
    "input_tokens": {
      "total_tokens": 24,
      "cached_tokens": 0
    },
    "output_tokens": {
      "total_tokens": 170,
      "reasoning_tokens": 64
    },
    "total_tokens": 194
  },
  "stop_reason": "completed"
}
```
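The event shapes above can be folded into one result object on the client. The sketch below starts from events that have already been decoded into dictionaries, because this page does not specify the wire framing of the stream. `StreamResult` and `collect_events` are hypothetical names, and joining the `response` field of successive chunks by plain concatenation is an assumption.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class StreamResult:
    summary: Optional[dict] = None
    reasoning: str = ""
    response: str = ""
    tools: list = field(default_factory=list)
    usage: Optional[dict] = None
    stop_reason: Optional[str] = None


def collect_events(events: Iterable[dict]) -> StreamResult:
    """Fold a sequence of decoded stream events into a single result."""
    result = StreamResult()
    for event in events:
        kind = event.get("type")
        if kind == "summary":
            result.summary = event
        elif kind == "reasoning":
            # Assumption: reasoning chunks concatenate in arrival order.
            result.reasoning += event.get("response", "")
        elif kind == "response":
            # Assumption: response chunks concatenate in arrival order.
            result.response += event.get("response", "")
        elif kind == "tool":
            result.tools.append(event)
        elif kind == "usage":
            result.usage = event.get("usage")
            result.stop_reason = event.get("stop_reason")
        # Unrecognized event types are skipped rather than treated as errors.
    return result
```

Dispatching on `type` rather than on position means the same loop works whether or not the optional `summary`, `reasoning`, and `tool` events appear.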

### Non-streaming response

If `streaming` is `false`, the route returns one JSON response with the generated output, usage data, and the request ID you can use to inspect the run in **Logs**.

The exact output field can vary by provider, but the response includes run metadata such as the request ID, model, duration, and token usage. For example:

```json
{
  "request_id": "req_019d528660dd7e22b15e5b13a1931c50",
  "model": "gpt-5-nano-2025-08-07",
  "duration": 4.641960144042969,
  "response": "Teams are moving from simple wrappers to systems with evals, tool routing, and tighter cost controls.",
  "reasoning": "The request asks for a short summary of current infrastructure trends.",
  "usage": {
    "input_tokens": {
      "total_tokens": 24,
      "cached_tokens": 0
    },
    "output_tokens": {
      "total_tokens": 187,
      "reasoning_tokens": 64
    },
    "total_tokens": 211
  },
  "stop_reason": "completed"
}
```

If the agent uses tools, the non-streaming response can also include tool execution details.
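Because the output field can vary by provider, client code is safer reading the response defensively. The sketch below pulls the fields shown in the example into a flat dictionary; `run_summary` is a hypothetical helper, and falling back to input plus output tokens when `total_tokens` is missing is an assumption (it matches the sample, where 24 + 187 = 211).

```python
import json

# Trimmed copy of the example response above.
EXAMPLE = json.loads("""
{
  "request_id": "req_019d528660dd7e22b15e5b13a1931c50",
  "response": "Teams are moving from simple wrappers to systems with evals, tool routing, and tighter cost controls.",
  "usage": {
    "input_tokens": {"total_tokens": 24, "cached_tokens": 0},
    "output_tokens": {"total_tokens": 187, "reasoning_tokens": 64},
    "total_tokens": 211
  },
  "stop_reason": "completed"
}
""")


def run_summary(payload: dict) -> dict:
    """Extract the fields most callers need from a non-streaming response."""
    usage = payload.get("usage", {})
    input_total = usage.get("input_tokens", {}).get("total_tokens", 0)
    output_total = usage.get("output_tokens", {}).get("total_tokens", 0)
    return {
        "request_id": payload.get("request_id"),  # use this to find the run in Logs
        "text": payload.get("response"),
        "completed": payload.get("stop_reason") == "completed",
        "input_tokens": input_total,
        "output_tokens": output_total,
        # Assumption: fall back to the sum if the total is not reported.
        "total_tokens": usage.get("total_tokens", input_total + output_total),
    }


summary = run_summary(EXAMPLE)
print(summary["request_id"], summary["total_tokens"])
```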

## Multi-turn conversations

The invoke endpoint supports two approaches for multi-turn conversations.

### Persistent threads (Fetch Hive manages history)

Pass a `thread_id` (any string you choose). Fetch Hive creates the thread on the first call and resumes it on every subsequent call with the same value. Message history is stored in Fetch Hive and included in the context automatically.

```bash
# First turn
curl 'https://api.fetchhive.com/v1/agent/invoke' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  --data-raw '{
    "agent": "AGENT_UUID",
    "message": "What are the main AI infrastructure trends right now?",
    "streaming": true,
    "thread_id": "user-456-support-session"
  }'

# Second turn — same thread_id resumes the conversation
curl 'https://api.fetchhive.com/v1/agent/invoke' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  --data-raw '{
    "agent": "AGENT_UUID",
    "message": "Which of those trends have the most enterprise adoption?",
    "streaming": true,
    "thread_id": "user-456-support-session"
  }'
```

You can use any string as a `thread_id`, such as a user ID, session ID, ticket number, or any other identifier that makes sense for your use case.
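Since the thread is keyed purely by the string you send, it helps to derive that string deterministically so every turn in a session lands on the same thread. In this minimal sketch, `thread_id_for` and `turn_body` are hypothetical helpers, and the `user-<id>-<session>` naming scheme is only an illustration.

```python
def thread_id_for(user_id: str, session: str) -> str:
    """Derive a stable thread ID from identifiers your app already has."""
    return f"user-{user_id}-{session}"


def turn_body(agent: str, message: str, thread_id: str) -> dict:
    """Build the body for one turn of a persistent thread."""
    return {
        "agent": agent,
        "message": message,
        "streaming": True,
        "thread_id": thread_id,
    }


thread_id = thread_id_for("456", "support-session")
first = turn_body(
    "AGENT_UUID", "What are the main AI infrastructure trends right now?", thread_id
)
second = turn_body(
    "AGENT_UUID", "Which of those trends have the most enterprise adoption?", thread_id
)
print(first["thread_id"] == second["thread_id"])
```

Only the `message` changes between turns; the history itself never has to leave Fetch Hive.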

### Stateless history (caller manages history)

If you prefer to manage conversation state yourself, pass the previous turns in the `messages` array. Fetch Hive uses the provided history for context but does not persist it.

```bash
curl 'https://api.fetchhive.com/v1/agent/invoke' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  --data-raw '{
    "agent": "AGENT_UUID",
    "message": "Which of those trends have the most enterprise adoption?",
    "streaming": true,
    "messages": [
      { "content": "What are the main AI infrastructure trends right now?", "role": "user" },
      { "content": "Teams are focusing on evals, tool routing, and observability.", "role": "assistant" }
    ]
  }'
```

Use `messages` when you already maintain your own chat state and do not need Fetch Hive to store the conversation history.
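When the caller owns the history, each request needs the prior turns in `messages` and the new input in `message`, as in the cURL example above. The sketch below keeps that state in a small class. `Conversation` is a hypothetical helper, and it sets `streaming` to `false` on the assumption that the caller reads the full reply before recording it.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    agent: str
    turns: list = field(default_factory=list)

    def next_body(self, message: str) -> dict:
        """Build the request body for the next user message."""
        body = {"agent": self.agent, "message": message, "streaming": False}
        if self.turns:
            # Copy so later calls to record() do not alter this body.
            body["messages"] = list(self.turns)
        return body

    def record(self, message: str, reply: str) -> None:
        """Store a completed exchange so it is sent as context next time."""
        self.turns.append({"content": message, "role": "user"})
        self.turns.append({"content": reply, "role": "assistant"})


chat = Conversation(agent="AGENT_UUID")
first = chat.next_body("What are the main AI infrastructure trends right now?")
chat.record(
    "What are the main AI infrastructure trends right now?",
    "Teams are focusing on evals, tool routing, and observability.",
)
second = chat.next_body("Which of those trends have the most enterprise adoption?")
print(len(second["messages"]))
```

The second body reproduces the stateless cURL request: two prior turns in `messages` and the follow-up question in `message`.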

## Next steps

* [Logs](/agents/logs.md)
* [Testing with Chat](/agents/testing-with-chat.md)
* [Run with Python SDK](/agents/run-with-python-sdk.md)
* [Run with Node.js SDK](/agents/run-with-nodejs-sdk.md)
* [Run with Ruby SDK](/agents/run-with-ruby-sdk.md)
* [Run with PHP SDK](/agents/run-with-php-sdk.md)

