
API Overview

rbee provides a REST API compatible with the OpenAI API specification, making it easy to integrate with existing tools and libraries.

Base URL

http://localhost:8080/v1

Authentication

API requests require a bearer token:

curl -H "Authorization: Bearer YOUR_API_KEY" \
  http://localhost:8080/v1/models

Core Endpoints

List Models

GET /v1/models

Returns all available models in your rbee instance.
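A minimal sketch of calling this endpoint from Python. The request and auth header follow the section above; the response shape (a "data" array of objects with an "id" field) is the OpenAI models-list format, which rbee's compatibility implies but this page does not show, so treat it as an assumption:

```python
import json
import urllib.request

def list_models(base_url="http://localhost:8080/v1", api_key="YOUR_API_KEY"):
    """GET /v1/models and return the available model IDs."""
    req = urllib.request.Request(
        f"{base_url}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    # OpenAI-style list responses wrap the items in a "data" array.
    return [m["id"] for m in payload.get("data", [])]

# The parsing step, shown on a sample response body (assumed shape):
sample = {"object": "list", "data": [{"id": "llama-3.1-8b", "object": "model"}]}
model_ids = [m["id"] for m in sample.get("data", [])]
print(model_ids)  # ['llama-3.1-8b']
```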

Chat Completions

POST /v1/chat/completions

Generate chat completions using a specified model.

Request body:

{
  "model": "llama-3.1-8b",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "temperature": 0.7,
  "max_tokens": 150
}
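The same request, sketched in Python. `build_chat_request` and `chat` are hypothetical helper names for this example; reading the reply from `choices[0].message.content` assumes the OpenAI response format, which this page does not show:

```python
import json
import urllib.request

def build_chat_request(messages, model="llama-3.1-8b",
                       temperature=0.7, max_tokens=150):
    """Assemble the request body shown above."""
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

def chat(messages, base_url="http://localhost:8080/v1",
         api_key="YOUR_API_KEY", **kwargs):
    """POST /v1/chat/completions and return the assistant's reply text."""
    body = json.dumps(build_chat_request(messages, **kwargs)).encode()
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        # OpenAI-style replies live under choices[0].message.content (assumed).
        return json.load(resp)["choices"][0]["message"]["content"]

payload = build_chat_request(
    [{"role": "user", "content": "What is the capital of France?"}]
)
print(payload["model"])  # llama-3.1-8b
```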

Streaming Responses

Enable streaming by setting stream: true:

{
  "model": "llama-3.1-8b",
  "messages": [...],
  "stream": true
}
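This page shows only the request; consuming the stream is sketched below under the assumption that rbee emits OpenAI-style server-sent events (`data: {...}` chunks with `delta.content` fields, terminated by `data: [DONE]`):

```python
import json

def iter_stream_content(lines):
    """Yield content deltas from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]

# The parsing step, shown on a sample of the assumed stream format:
sample = [
    'data: {"choices": [{"delta": {"content": "Par"}}]}',
    'data: {"choices": [{"delta": {"content": "is"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_content(sample)))  # Paris
```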

Rate Limits

Default rate limits:

  • 100 requests per minute per API key
  • 1000 tokens per second per model
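When a client exceeds these limits, a common pattern is to retry with exponential backoff. The sketch below assumes rate-limited requests surface as an exception (the `RateLimitError` class and `with_backoff` helper are hypothetical names for this example; this page does not specify how rbee signals the limit, though HTTP 429 is typical for OpenAI-compatible APIs):

```python
import time

class RateLimitError(Exception):
    """Hypothetical marker for a rate-limited request (e.g. HTTP 429)."""

def with_backoff(call, max_attempts=4, base_delay=0.01):
    """Retry `call` with exponential backoff on rate-limit errors."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            # Double the delay after each failed attempt.
            time.sleep(base_delay * (2 ** attempt))

# Demo with a stub that is rate-limited twice before succeeding:
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError()
    return "ok"

print(with_backoff(flaky))  # ok
```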

Error Handling

All errors follow a standard JSON format:

{
  "error": {
    "message": "Model not found",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}
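A client can branch on the machine-readable `code` field rather than the human-readable message. A minimal sketch, parsing the error body shown above:

```python
import json

# The sample error body from the section above:
error_body = (
    '{"error": {"message": "Model not found", '
    '"type": "invalid_request_error", "code": "model_not_found"}}'
)
err = json.loads(error_body)["error"]
# Match on the stable "code", not the free-text "message".
if err["code"] == "model_not_found":
    print(f"Unknown model: {err['message']}")  # Unknown model: Model not found
```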
2025 © rbee. Your private AI cloud, in one command.