rbee CLI Reference

Complete command reference for the rbee CLI tool.

Installation

curl -sSL https://install.rbee.dev | sh

Or install manually:

# Install to user paths (~/.local/bin)
rbee install

# Install to system paths (/usr/local/bin) - requires sudo
sudo rbee install --system

Core Commands

Inference

Run inference on a model with automatic worker provisioning.

rbee infer \
  --node gpu-0 \
  --model meta-llama/Llama-3.2-1B \
  --prompt "Hello, world!" \
  --max-tokens 100

Options:

  • --node - Target hive ID (e.g., gpu-0, localhost)
  • --model - Model name or HuggingFace ID
  • --prompt - Input prompt text
  • --max-tokens - Maximum tokens to generate (default: 100)
  • --temperature - Sampling temperature (default: 0.7)
  • --stream - Enable streaming output

Output:

Streaming Output
Hello! How can I help you today?
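
The same command composes with the sampling and streaming flags listed above; for example (the prompt text here is illustrative):

# Stream tokens as they are generated, with a higher temperature
rbee infer \
  --node gpu-0 \
  --model meta-llama/Llama-3.2-1B \
  --prompt "Write a haiku about bees" \
  --temperature 0.9 \
  --stream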

Node Management

Add Node

Register a remote hive node in the Queen registry.

rbee setup add-node gpu-node-1 \
  --ssh-host 192.168.1.100 \
  --ssh-user admin

Options:

  • node-name - Unique identifier for the node
  • --ssh-host - IP address or hostname
  • --ssh-user - SSH username
  • --ssh-port - SSH port (default: 22)
  • --ssh-key - Path to SSH private key (optional)

What this does:

  1. Registers node in Queen’s hive registry
  2. Queen can now SSH to this node to start hive daemon
  3. Workers on this node can send heartbeats to Queen
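
The SSH flags beyond --ssh-host and --ssh-user are optional; a sketch registering a second node on a non-standard port with an explicit key (host, port, and key path are illustrative values, not defaults):

# Node reachable on SSH port 2222, authenticating with a dedicated key
rbee setup add-node gpu-node-2 \
  --ssh-host 192.168.1.101 \
  --ssh-user admin \
  --ssh-port 2222 \
  --ssh-key ~/.ssh/id_rsa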

List Nodes

Show all registered nodes in the Queen registry.

rbee setup list-nodes

Output:

Registered Nodes
localhost    127.0.0.1      Online   2 workers
gpu-node-1   192.168.1.100  Online   4 workers
gpu-node-2   192.168.1.101  Offline  0 workers

Remove Node

Unregister a node from the Queen registry.

rbee setup remove-node gpu-node-1

Worker Management

List Workers

Show all workers across all hives.

# List all workers
rbee workers list

# List workers on specific node
rbee workers list --node gpu-0

Output:

Workers
WORKER ID      NODE       MODEL         STATUS  PORT
worker-abc123  gpu-0      llama-3.2-1b  Ready   9301
worker-def456  gpu-0      llama-3.2-3b  Busy    9302
worker-ghi789  localhost  llama-3.2-1b  Ready   9303

Worker Health

Check health of workers on a specific node.

rbee workers health --node gpu-0

Output:

Health Check
worker-abc123: Healthy (200ms)
worker-def456: Healthy (150ms)

All workers operational

Shutdown Worker

Gracefully shut down a specific worker.

rbee workers shutdown --id worker-abc123
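
To drain a whole node, the list and shutdown commands can be chained; a minimal sketch, assuming the plain-text `rbee workers list` output shown above (worker IDs in the first column, one worker per row):

#!/bin/bash
set -euo pipefail

NODE=gpu-0

# Extract worker IDs (lines starting with "worker-") and shut each
# one down in turn; adjust the parsing if your output format differs
rbee workers list --node "$NODE" \
  | awk '/^worker-/ { print $1 }' \
  | while read -r worker_id; do
      rbee workers shutdown --id "$worker_id"
    done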

Log Viewing

View Logs

View logs from a specific node or worker.

# View node logs
rbee logs --node gpu-0

# Follow logs in real-time
rbee logs --node gpu-0 --follow

# View specific worker logs
rbee logs --worker worker-abc123

Options:

  • --node - Node ID to view logs from
  • --worker - Worker ID to view logs from
  • --follow - Stream logs in real-time (like tail -f)
  • --lines - Number of lines to show (default: 100)
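
These flags compose; for example, to pick up a worker's recent history and then keep streaming (assuming --lines and --follow can be combined, which the flag list implies but does not state):

# Show the last 20 lines, then stream new entries as they arrive
rbee logs --worker worker-abc123 --lines 20 --follow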

Global Flags

Parameter    Type    Required  Default                     Description
--help       flag    Optional  -                           Show help for any command
--version    flag    Optional  -                           Show rbee version
--config     string  Optional  ~/.config/rbee/config.toml  Path to config file
--queen-url  string  Optional  http://localhost:7833       Override Queen URL

Examples:

# Show version
rbee --version

# Use custom config
rbee --config ~/my-rbee.toml infer --prompt "Hello"

# Connect to remote Queen
rbee --queen-url http://192.168.1.50:7833 workers list

How It Works

Architecture Overview

User → rbee-keeper (CLI) → queen-rbee (Orchestrator) → rbee-hive → llm-worker

Key principles:

  1. rbee-keeper is a thin client - All logic in Queen
  2. Automatic Queen lifecycle - Starts if needed, stops when done
  3. No SSH from CLI - Only Queen uses SSH for remote nodes
  4. Streaming by default - Real-time output via SSE

Queen Lifecycle Management

Ephemeral Mode (Default):

# When you run a command
rbee infer --prompt "Hello"

# Behind the scenes:
# 1. Check if Queen is running
# 2. If not: Start Queen daemon (ephemeral mode)
# 3. Submit request to Queen
# 4. Stream response back to you
# 5. Shutdown Queen (only if we started it)

Daemon Mode:

# Start Queen manually (persistent)
queen-rbee --port 7833 &

# rbee-keeper detects existing Queen
rbee infer --prompt "Hello"
# (leaves Queen running after command)
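
Because rbee-keeper only shuts down Queens it started itself, a manually started daemon must be stopped by hand. A minimal sketch using ordinary process management (the reference above does not name a dedicated stop command):

# Stop the manually started Queen when you are done
pkill -f queen-rbee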

Configuration

Config File Location

Default: ~/.config/rbee/config.toml

Example Config

# Queen connection
queen_port = 7833
queen_host = "localhost"

# For remote Queen
# queen_host = "192.168.1.50"

# SSH settings (for remote nodes)
[ssh]
default_user = "admin"
default_port = 22
key_path = "~/.ssh/id_rsa"

# Inference defaults
[defaults]
max_tokens = 100
temperature = 0.7
stream = true

Advanced Usage

Scripting

Use rbee in scripts with proper error handling:

#!/bin/bash
set -euo pipefail

# Run inference and capture output. Guard the command with `if`
# directly: under `set -e`, checking $? after a failed assignment
# would never run, because the script would already have exited.
if OUTPUT=$(rbee infer \
  --node gpu-0 \
  --model llama-3-8b \
  --prompt "Summarize: $INPUT_TEXT" \
  2>&1); then
  echo "Success: $OUTPUT"
else
  echo "Error: $OUTPUT"
  exit 1
fi

CI/CD Integration

# .github/workflows/inference-test.yml
name: Test Inference
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Install rbee
        run: curl -sSL https://install.rbee.dev | sh
      - name: Run inference test
        run: |
          rbee infer \
            --node localhost \
            --model meta-llama/Llama-3.2-1B \
            --prompt "Test prompt" \
            --max-tokens 10

Environment Variables

Override config with environment variables:

export RBEE_QUEEN_URL=http://localhost:7833
export RBEE_CONFIG_PATH=~/.config/rbee/config.toml

rbee infer --prompt "Hello"