Run with API - Fetch Hive

Use the public invoke endpoint when you want to call a prompt from your own app or service. First create a prompt deployment and variant in the dashboard, then invoke the deployment with your workspace API key.

Authentication

Authorization: Bearer YOUR_API_KEY

See API Keys for how to create and manage keys.

Endpoint

POST https://api.fetchhive.com/v1/prompt/invoke

Before you call this endpoint, create or update a prompt deployment from the prompt editor. See Publishing and Versioning for the UI flow.

Request

Use this request shape:

Field	Type	Required	Description
`deployment`	string	Yes	The prompt deployment name you created for the prompt
`variant`	string	Yes	The prompt deployment variant you want to run
`inputs`	object	No	Key-value pairs for any prompt variables used by the prompt
`streaming`	boolean	No	Whether the response should be streamed
`metadata`	object	No	Flat caller-defined metadata for audit and log filtering. This is not used as prompt input.

This endpoint does not accept top-level image_urls or document attachments. If a deployed prompt uses an image URL message part, configure that image URL in the prompt editor or bind it through an inputs variable in the prompt content. For runtime image/document attachments, invoke an agent with POST /v1/agent/invoke.

metadata must be flat and scalar-only: strings, numbers, booleans, or null. Nested objects and arrays return a validation error before the run starts.
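Because nested metadata is rejected before the run starts, it can be convenient to catch the problem client-side. The following is a minimal sketch of such a check, based only on the rule stated above; the function names are hypothetical and are not part of any Fetch Hive SDK.

```python
from typing import Any, List, Mapping

# Value types the documentation lists as valid for metadata:
# strings, numbers, booleans, or null.
_SCALAR_TYPES = (str, int, float, bool, type(None))


def find_invalid_metadata_keys(metadata: Mapping[str, Any]) -> List[str]:
    """Return the keys whose values are not scalar (for example lists or dicts)."""
    return [
        key
        for key, value in metadata.items()
        if not isinstance(value, _SCALAR_TYPES)
    ]


def check_metadata(metadata: Mapping[str, Any]) -> None:
    """Raise ValueError locally instead of waiting for the API's validation error."""
    bad_keys = find_invalid_metadata_keys(metadata)
    if bad_keys:
        raise ValueError(f"metadata values must be scalar; offending keys: {bad_keys}")
```

Calling check_metadata({"customer_id": "cus_123", "plan": "enterprise"}) passes silently, while a value such as {"tags": ["a", "b"]} raises ValueError.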

Basic example

curl 'https://api.fetchhive.com/v1/prompt/invoke' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
    "deployment": "YOUR_DEPLOYMENT_NAME",
    "variant": "YOUR_VARIANT_NAME",
    "inputs": {
      "text": "Fetch Hive helps teams ship AI products faster."
    },
    "metadata": {
      "customer_id": "cus_123",
      "plan": "enterprise"
    },
    "streaming": true
  }' \
  --compressed

Replace YOUR_API_KEY, YOUR_DEPLOYMENT_NAME, YOUR_VARIANT_NAME, and the inputs object with your real values. Use metadata for audit fields you want to see or filter in logs, such as customer IDs, plan names, regions, or experiment names. Do not put prompt variables there; prompt variables belong in inputs. See Invoke metadata for examples and log filtering details.
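The same request can be assembled in Python with only the standard library. This is a sketch that mirrors the curl example above rather than an official client; build_invoke_request is a hypothetical helper name, and the official SDKs listed under Next steps may expose a different interface.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

INVOKE_URL = "https://api.fetchhive.com/v1/prompt/invoke"


def build_invoke_request(
    api_key: str,
    deployment: str,
    variant: str,
    inputs: Optional[Dict[str, Any]] = None,
    metadata: Optional[Dict[str, Any]] = None,
    streaming: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the invoke endpoint."""
    payload: Dict[str, Any] = {
        "deployment": deployment,
        "variant": variant,
        "streaming": streaming,
    }
    # inputs and metadata are optional fields, so include them only when given.
    if inputs is not None:
        payload["inputs"] = inputs
    if metadata is not None:
        payload["metadata"] = metadata
    return urllib.request.Request(
        INVOKE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen and read the body. Prompt variables go in inputs and audit fields go in metadata, as described above.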

Response

If streaming is true, the API returns a stream of data: events rather than one final JSON object. If the provider fails after the stream has opened, the API sends a final error event before closing the stream.

Streaming response

You can receive different event types during the stream. For example:

Reasoning or thinking event:

{
  "request_id": "req_019d52846ea37682b03522fd0695cc43",
  "type": "reasoning",
  "model": "gpt-5-nano",
  "response": " seems"
}

Response event:

{
  "request_id": "req_019d52846ea37682b03522fd0695cc43",
  "type": "response",
  "model": "gpt-5-nano",
  "response": " too"
}

Final usage event:

{
  "request_id": "req_019d52846ea37682b03522fd0695cc43",
  "type": "usage",
  "usage": {
    "duration": 4.79230260848999,
    "prompt_tokens": {
      "total_tokens": 24,
      "cached_tokens": 0
    },
    "completion_tokens": {
      "total_tokens": 170,
      "reasoning_tokens": 64
    },
    "total_tokens": 194
  },
  "stop_reason": "completed"
}

Error event:

{
  "request_id": "req_019d52846ea37682b03522fd0695cc43",
  "type": "error",
  "error": "cohere provider stream error: HTTP 400: invalid request",
  "message": "cohere provider stream error: HTTP 400: invalid request",
  "provider": "cohere",
  "error_type": "provider_stream_error",
  "status_code": 502
}
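A client usually needs to join the response fragments, keep the usage totals, and stop on an error event. The sketch below shows one way to do that. It assumes each event arrives on its own line as data: followed by a JSON object (server-sent-events style framing); this page does not spell out the wire format, so verify that against a real stream. The names iter_events, collect_stream, and StreamError are hypothetical.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Optional


class StreamError(RuntimeError):
    """Raised when the stream delivers an event with type "error"."""


def iter_events(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per `data:` line (assumed framing)."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separators or other fields
        body = line[len("data:"):].strip()
        try:
            event = json.loads(body)
        except json.JSONDecodeError:
            continue  # skip non-JSON payloads, if the API ever sends any
        if isinstance(event, dict):
            yield event


def collect_stream(lines: Iterable[str]) -> Dict[str, Any]:
    """Join reasoning and response fragments and capture the usage event."""
    response_parts: List[str] = []
    reasoning_parts: List[str] = []
    usage: Optional[Dict[str, Any]] = None
    for event in iter_events(lines):
        kind = event.get("type")
        if kind == "response":
            response_parts.append(event.get("response", ""))
        elif kind == "reasoning":
            reasoning_parts.append(event.get("response", ""))
        elif kind == "usage":
            usage = event.get("usage")
        elif kind == "error":
            raise StreamError(event.get("message") or event.get("error") or "stream error")
    return {
        "response": "".join(response_parts),
        "reasoning": "".join(reasoning_parts),
        "usage": usage,
    }
```

Fragments are concatenated without extra separators because the sample events above already carry their own leading spaces (" seems", " too").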

Non-streaming response

If streaming is false, the API returns a single JSON response. Provider execution failures return 502 Bad Gateway with an error message. A successful response looks like this:

{
  "request_id": "req_019d528660dd7e22b15e5b13a1931c50",
  "model": "gpt-5-nano-2025-08-07",
  "duration": 4.641960144042969,
  "reasoning": "**Clarifying summary request**\n\nI need to follow up on the user's request for a 50-word summary since their message was incomplete. I should ask them to clarify the text or topic they'd like summarized. I can suggest options like pasting a passage or specifying a source. I’ll also mention I can summarize up to 50 words or suggest a different length if they prefer. Let’s get that response ready!",
  "response": "I’m missing the text or topic to summarize. Please paste the passage (or name the work) you want summarized, and I’ll provide exactly 50 words. If you prefer a different length, tell me the target word count. I can summarize articles, chapters, speeches, or any provided content.",
  "usage": {
    "prompt_tokens": {
      "total_tokens": 24,
      "cached_tokens": 0
    },
    "completion_tokens": {
      "total_tokens": 187,
      "reasoning_tokens": 64
    },
    "total_tokens": 211
  },
  "stop_reason": "completed"
}
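The fields in this response can be read with a small helper. The following sketch extracts the text and token counts from a body shaped like the example above and treats a 502 status as a provider failure. The names parse_invoke_response and ProviderError are hypothetical, and because the error body's shape is not documented here, the raw body is passed through unparsed.

```python
import json
from typing import Any, Dict


class ProviderError(RuntimeError):
    """Raised when the API reports a provider execution failure (HTTP 502)."""


def parse_invoke_response(status_code: int, body: str) -> Dict[str, Any]:
    """Return the main fields of a successful non-streaming response."""
    if status_code == 502:
        # The error payload format is an unknown here, so keep the raw text.
        raise ProviderError(body)
    data = json.loads(body)
    usage = data.get("usage", {})
    return {
        "request_id": data.get("request_id"),
        "text": data.get("response", ""),
        "reasoning": data.get("reasoning"),
        "prompt_tokens": usage.get("prompt_tokens", {}).get("total_tokens"),
        "completion_tokens": usage.get("completion_tokens", {}).get("total_tokens"),
        "total_tokens": usage.get("total_tokens"),
        "stop_reason": data.get("stop_reason"),
    }
```

Other non-success status codes are not described on this page, so the helper does not try to interpret them.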

Next steps

Publishing and Versioning
Run with Python SDK - Invoke a prompt from Python
Run with Node.js SDK - Invoke a prompt from Node.js
Run with Ruby SDK - Invoke a prompt from Ruby
Run with PHP SDK - Invoke a prompt from PHP
