AI Control Plane

The gateway for LLMs, MCP & Agents

Governed by the AD groups you already have.

```bash
# Drop-in OpenAI replacement — just change the base URL
$ curl http://localhost:8080/v1/chat/completions \
    -H "Authorization: Bearer $PLLM_KEY" \
    -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hi"}]}'
{
  "provider": "openai",
  "model": "gpt-4o",
  "latency_ms": 142,
  "route": "least-latency"
}
```
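The `route` field in the response above reflects the routing decision. A least-latency strategy can be sketched as below — a hypothetical illustration in Python (pLLM itself is written in Go, and these class and method names are not part of its API): keep a smoothed latency estimate per provider and send each request to the current minimum.

```python
# Hypothetical sketch of least-latency routing: track an exponential moving
# average (EMA) of observed latency per provider, route to the lowest.
class LeastLatencyRouter:
    def __init__(self, providers, alpha=0.3):
        # Start every provider at 0.0 so untried providers get traffic first.
        self.ema = {p: 0.0 for p in providers}
        self.alpha = alpha  # smoothing factor: higher = reacts faster

    def pick(self):
        # Choose the provider with the lowest smoothed latency.
        return min(self.ema, key=self.ema.get)

    def record(self, provider, latency_ms):
        # Blend the new observation into the running average.
        prev = self.ema[provider]
        self.ema[provider] = self.alpha * latency_ms + (1 - self.alpha) * prev

router = LeastLatencyRouter(["openai", "anthropic", "bedrock"])
router.record("openai", 142)
router.record("anthropic", 95)
router.record("bedrock", 210)
print(router.pick())  # → anthropic
```

The EMA keeps one slow response from permanently demoting a provider, while still shifting traffic away from a provider whose latency is trending up.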

Under the Hood

A boring, fast, single binary.

pLLM is one Go service — no sidecars, no Python runtime, no surprise dependencies. Boring infrastructure so your AI platform can stop being the interesting thing.

01 · Consumers
IDE agents: Cursor · Zed · Continue
Chat clients: Internal ChatGPT · Teams
Backend apps: Python · Node.js · Go
Autonomous agents: LangGraph · CrewAI

02 · pLLM Control Plane (single Go binary)
01 Auth: SSO · group sync
02 Policy: AD-group RBAC
03 Registry: agents · skills · prompts
04 Router: latency-aware
05 Guardrails: PII · injection · moderation
06 Audit: every call logged
<1ms overhead · 12k+ rps/node · 65MB memory

03 · Managed Resources
LLM providers (6): OpenAI 99.9% · Anthropic 99.9% · Azure OpenAI 92.1% · AWS Bedrock 99.8% · Google Vertex 99.7% · Meta Llama 99.6%
MCP servers (4+): GitHub (healthy) · Jira (healthy) · Snowflake (degraded) · PostgreSQL (healthy)
Identity: Entra ID · Okta · Active Directory

Single control plane · zero sidecars · zero Python runtime
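The Guardrails stage inspects traffic before it leaves the gateway. As a rough illustration of what PII masking can look like — hypothetical patterns and function names, not pLLM's actual rules — a sketch in Python:

```python
import re

# Illustrative PII-masking guardrail (assumed behavior, not pLLM's internals):
# redact email addresses and US-style SSNs before a prompt goes upstream.
PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "<SSN>"),
]

def mask_pii(text: str) -> str:
    # Apply each redaction pattern in turn; order doesn't matter here
    # because the patterns don't overlap.
    for pattern, placeholder in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

print(mask_pii("Reach me at jane@corp.com, SSN 123-45-6789."))
# → Reach me at <EMAIL>, SSN <SSN>.
```

Running this in the gateway rather than in each app means the policy is enforced once, uniformly, for every consumer listed above.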

Request flow

Auth → policy → router → guardrails → provider. Toggle a simulation mode to see failover in action.

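The auth → policy → router → guardrails → provider chain, with failover to the next provider when the primary fails, can be sketched as follows — a hypothetical structure for illustration, not pLLM's actual Go internals:

```python
# Hypothetical sketch of the request flow with provider failover.
class RouteError(Exception):
    pass

def handle(request, providers):
    # 1. Auth: reject requests without a bearer token.
    if not request.get("token"):
        raise RouteError("401: missing token")
    # 2. Policy: check the caller's AD groups against the allow-list.
    if "ai-users" not in request.get("groups", []):
        raise RouteError("403: group not permitted")
    # 3. Guardrails would run here (PII masking, injection checks).
    # 4. Router + failover: try providers in preference order; a failure
    #    trips the circuit to the next one.
    for provider in providers:
        try:
            return provider(request)
        except Exception:
            continue
    raise RouteError("502: all providers failed")

def flaky(request):
    raise TimeoutError("primary timed out")

def healthy(request):
    return {"provider": "fallback", "content": "Hi!"}

result = handle({"token": "abc", "groups": ["ai-users"]}, [flaky, healthy])
print(result["provider"])  # → fallback
```

Because failover happens inside the gateway, callers see one successful response rather than a retry loop in every client.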
Quick Start

Running in two steps.

Deploy → point your SDK → ship.

Step 1 — Deploy pLLM
~2 min · Ideal for local development and trials

```bash
# Clone and configure
git clone https://github.com/andreimerfu/pllm.git
cd pllm && cp .env.example .env

# Drop in your keys
echo "OPENAI_API_KEY=sk-..." >> .env

# Bring it up
docker compose up -d

# Smoke test
curl http://localhost:8080/v1/models
```
Step 2 — Point your app at pLLM
100% OpenAI-compatible · change one line

```python
from openai import OpenAI

# Same SDK. Just flip the base_url.
client = OpenAI(
    api_key="sk-...",
    base_url="https://pllm.company.com/v1"
)

response = client.chat.completions.create(
    model="smart",                         # pLLM route — picks the best model
    messages=[{"role": "user", "content": "Hello"}],
)
```

Response: 200 OK · "route": "smart" · "provider": "openai" · "model": "gpt-5" · "latency_ms": 142
Open source · Self-hosted

Start shipping AI your security team can approve.

One gateway. One audit trail. Policies that live in the identity system you already trust.