rbee CLI Reference
Complete command reference for the rbee CLI tool.
The rbee command is provided by rbee-keeper, a thin HTTP client that communicates with queen-rbee.
Installation
```shell
curl -sSL https://install.rbee.dev | sh
```

Or install manually:
```shell
# Install to user paths (~/.local/bin)
rbee install

# Install to system paths (/usr/local/bin) - requires sudo
sudo rbee install --system
```

Core Commands
Inference
Run inference on a model with automatic worker provisioning.
```shell
rbee infer \
  --node gpu-0 \
  --model meta-llama/Llama-3.2-1B \
  --prompt "Hello, world!" \
  --max-tokens 100
```

Options:
- `--node` - Target hive ID (e.g., `gpu-0`, `localhost`)
- `--model` - Model name or HuggingFace ID
- `--prompt` - Input prompt text
- `--max-tokens` - Maximum tokens to generate (default: 100)
- `--temperature` - Sampling temperature (default: 0.7)
- `--stream` - Enable streaming output
Output:
If no worker exists for the model, Queen automatically spawns one on the target hive.
Node Management
Add Node
Register a remote hive node in the Queen registry.
```shell
rbee setup add-node gpu-node-1 \
  --ssh-host 192.168.1.100 \
  --ssh-user admin
```

Options:
- `node-name` - Unique identifier for the node
- `--ssh-host` - IP address or hostname
- `--ssh-user` - SSH username
- `--ssh-port` - SSH port (default: 22)
- `--ssh-key` - Path to SSH private key (optional)
What this does:
- Registers node in Queen’s hive registry
- Queen can now SSH to this node to start hive daemon
- Workers on this node can send heartbeats to Queen
List Nodes
Show all registered nodes in the Queen registry.
```shell
rbee setup list-nodes
```

Output:
Remove Node
Unregister a node from the Queen registry.
```shell
rbee setup remove-node gpu-node-1
```

This removes the node from the registry. Workers on this node will no longer send heartbeats to Queen.
Worker Management
List Workers
Show all workers across all hives.
```shell
# List all workers
rbee workers list

# List workers on specific node
rbee workers list --node gpu-0
```

Output:
Worker Health
Check health of workers on a specific node.
```shell
rbee workers health --node gpu-0
```

Output:
Shutdown Worker
Gracefully shutdown a specific worker.
```shell
rbee workers shutdown --id worker-abc123
```

Workers receive SIGTERM and have 30 seconds to complete ongoing requests before SIGKILL.
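The TERM-then-KILL escalation can be sketched in plain shell. This is an illustration, not the actual hive implementation: a background `sleep` stands in for a worker process, and the grace period is shortened from 30 seconds to 3 so the sketch finishes quickly.

```shell
# Sketch of graceful shutdown: SIGTERM, bounded wait, then SIGKILL.
sleep 300 &                  # stand-in for a worker process
worker_pid=$!

grace_seconds=3              # the real grace period is 30 seconds

kill -TERM "$worker_pid" 2>/dev/null
for _ in $(seq "$grace_seconds"); do
  kill -0 "$worker_pid" 2>/dev/null || break   # exited cleanly?
  sleep 1
done

# Escalate only if the process survived the grace period
if kill -0 "$worker_pid" 2>/dev/null; then
  kill -KILL "$worker_pid"
fi
wait "$worker_pid" 2>/dev/null || true
echo "worker stopped"
```

`kill -0` sends no signal; it only tests whether the process still exists, which is why it drives both the wait loop and the escalation check.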
Log Viewing
View Logs
View logs from a specific node or worker.
```shell
# View node logs
rbee logs --node gpu-0

# Follow logs in real-time
rbee logs --node gpu-0 --follow

# View specific worker logs
rbee logs --worker worker-abc123
```

Options:
- `--node` - Node ID to view logs from
- `--worker` - Worker ID to view logs from
- `--follow` - Stream logs in real-time (like `tail -f`)
- `--lines` - Number of lines to show (default: 100)
Global Flags
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| --help | flag | Optional | — | Show help for any command |
| --version | flag | Optional | — | Show rbee version |
| --config | string | Optional | ~/.config/rbee/config.toml | Path to config file |
| --queen-url | string | Optional | http://localhost:7833 | Override Queen URL |
Examples:
```shell
# Show version
rbee --version

# Use custom config
rbee --config ~/my-rbee.toml infer --prompt "Hello"

# Connect to remote Queen
rbee --queen-url http://192.168.1.50:7833 workers list
```

How It Works
Architecture Overview
```
User → rbee-keeper (CLI) → queen-rbee (Orchestrator) → rbee-hive → llm-worker
```

Key principles:
- rbee-keeper is a thin client - All logic in Queen
- Automatic Queen lifecycle - Starts if needed, stops when done
- No SSH from CLI - Only Queen uses SSH for remote nodes
- Streaming by default - Real-time output via SSE
Queen Lifecycle Management
Ephemeral Mode (Default):
```shell
# When you run a command
rbee infer --prompt "Hello"
```

Behind the scenes:
1. Check if Queen is running
2. If not: start Queen daemon (ephemeral mode)
3. Submit request to Queen
4. Stream response back to you
5. Shut down Queen (only if we started it)

Daemon Mode:
```shell
# Start Queen manually (persistent)
queen-rbee --port 7833 &

# rbee-keeper detects existing Queen
rbee infer --prompt "Hello"
# (leaves Queen running after command)
```

rbee-keeper only shuts down Queens that it started. Pre-existing Queens are left running.
Configuration
Config File Location
Default: ~/.config/rbee/config.toml
Example Config
```toml
# Queen connection
queen_port = 7833
queen_host = "localhost"

# For remote Queen
# queen_host = "192.168.1.50"

# SSH settings (for remote nodes)
[ssh]
default_user = "admin"
default_port = 22
key_path = "~/.ssh/id_rsa"

# Inference defaults
[defaults]
max_tokens = 100
temperature = 0.7
stream = true
```

Common Workflows
First-Time Setup
Troubleshooting
Next Steps
- Getting Started - Installation and first inference
- Worker Types - Choose the right worker
- Job Operations API - HTTP API reference
Advanced Usage
Scripting
Use rbee in scripts with proper error handling:
```shell
#!/bin/bash
set -euo pipefail

# Run inference and capture output. With `set -e` active, the exit
# status must be checked in the `if` itself: a separate `$?` test
# after the assignment would never run on failure, because the
# script would already have exited.
if OUTPUT=$(rbee infer \
    --node gpu-0 \
    --model llama-3-8b \
    --prompt "Summarize: $INPUT_TEXT" \
    2>&1); then
  echo "Success: $OUTPUT"
else
  echo "Error: $OUTPUT"
  exit 1
fi
```

CI/CD Integration
```yaml
# .github/workflows/inference-test.yml
name: Test Inference
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Install rbee
        run: curl -sSL https://install.rbee.dev | sh
      - name: Run inference test
        run: |
          rbee infer \
            --node localhost \
            --model meta-llama/Llama-3.2-1B \
            --prompt "Test prompt" \
            --max-tokens 10
```

Environment Variables
Override config with environment variables:
```shell
export RBEE_QUEEN_URL=http://localhost:7833
export RBEE_CONFIG_PATH=~/.config/rbee/config.toml

rbee infer --prompt "Hello"
```
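The resolution order can be sketched as a small shell function. This assumes the usual CLI convention (flag beats environment variable, which beats the built-in default); `resolve_queen_url` is a hypothetical helper, not part of rbee itself.

```shell
# Sketch of settings precedence for the Queen URL:
# --queen-url flag > RBEE_QUEEN_URL env var > built-in default.
resolve_queen_url() {
  # $1: value of the --queen-url flag, empty if not given
  local flag_value="$1"
  if [ -n "$flag_value" ]; then
    echo "$flag_value"
  elif [ -n "${RBEE_QUEEN_URL:-}" ]; then
    echo "$RBEE_QUEEN_URL"
  else
    echo "http://localhost:7833"    # built-in default
  fi
}
```

For example, with `RBEE_QUEEN_URL` unset and no flag, `resolve_queen_url ""` falls through to `http://localhost:7833`.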