Bring Your Own Cloud · Bring Your Own LLM

Run Speedscale in your cloud. Keep every byte of traffic in your VPC.

Speedscale BYOC installs the full capture-and-repair stack inside your AWS, GCP, or Azure account. Traffic capture, inference, and audit logging all run on infrastructure you own. Speedscale ships software updates. That's it.


Data never leaves your VPC · Your LLM, your keys · PCI, HIPAA, and SOC 2 friendly

[Image: Speedscale traffic replay dashboard showing service dependencies, inbound and outbound throughput, and live request traces, all running inside your VPC]

Why BYOC, not multi-tenant SaaS

Reproducing a real production bug means you need the actual payload: headers, body, every quirk of that specific request. That data is too sensitive for most regulated teams to send outside the firewall. BYOC removes the trade-off.

Data sovereignty

PCI card data, PHI, trading positions: none of it leaves your VPC. No shared tenancy means no cross-customer data risk and no third-party SOC 2 scope to inherit.

Bring your own LLM

Point the agents at whatever LLM you're already running: Anthropic, OpenAI, self-hosted Llama, or your own model. Prompts, completions, and inference costs stay under your existing AI governance.

Your cloud, your economics

Reserved instances, spot, PrivateLink, committed-use discounts: all of it applies. No egress fees on capture. No vendor markup on storage. No surprise overages when traffic spikes.

How the BYOC stack runs

One Helm install, three components. Speedscale ships the software; you run it in your Kubernetes cluster with the access controls you already have.

Capture

An eBPF agent and Kubernetes operator capture full request and response payloads in production, without SDK changes or code instrumentation. Payloads land in object storage inside your account.

Orchestrate

The agent factory runs the closed loop: discover failing requests, draft a repro spec, triage root cause, validate the fix against the same captured traffic, and open the PR. All compute stays in your cluster.

Reason

The agents call your LLM endpoint: Anthropic in your VPC, OpenAI with your DPA, self-hosted Llama on your GPUs, or an internal model. Prompts and completions log to your existing AI audit trail.

Why your observability vendor can't do this

eBPF observability vendors and APM tools were built to display spans, not to reproduce bugs. The difference shows up in the one place that matters: the request body.

eBPF observability (Pixie, Groundcover, Metoro, Coroot)

Truncate payload bodies because monitoring doesn't need full fidelity. That's fine for dashboards. But a truncated body can't replay a failed request.

APM and error tracking (Datadog, Dynatrace, Sentry)

Tell you a request failed. Don't tell you how to reproduce it. There's no path from alert to merged fix without building it yourself.

Capture-replay point tools (GoReplay, WireMock, Hoverfly)

Cover one step of the loop (usually replay) and leave the rest to you. Discovery, triage, validation, and the PR are all manual.

AI coding agents (Cursor, Claude Code, Copilot)

Write code. Can't reproduce a real customer's failed request without production data they don't have. BYOC feeds them that context.

None of these categories fit. Speedscale closes the loop from failure to fix: capture the real request, reproduce it, validate the patch against the same traffic, open the PR. BYOC is what makes that loop viable in regulated environments.

Who runs Speedscale in BYOC mode

Financial services

Banks, card networks, and trading platforms where captured payloads contain account numbers, trade data, and other regulated content. BYOC keeps all of it inside the existing audit perimeter, including the AI reasoning that touches it.

Healthcare

HIPAA-covered systems where PHI can't cross a vendor boundary under any BAA. BYOC removes that conversation: data, reproduction, and AI inference all run inside your HIPAA-eligible cloud account.

Retail and payments (PCI)

PCI-scoped systems where card data appears in real request bodies. Running capture inside your own VPC means no PCI scope expansion and no shared-tenancy ambiguity.

Travel and hospitality

High-volume booking platforms that need realistic traffic to catch problems before they become 3 a.m. incidents. FLYR runs Speedscale in BYOC to validate releases against live production patterns before each deploy.

You don't need to be AI-native on day one

The data lake is live in minutes. Most regulated teams layer in the AI loop once they've seen replay working, and that takes weeks, not years. BYOC gives you the full stack; you decide the pace.

Day 1: Data lake live

Helm deploys in under 10 minutes. Production traffic starts flowing to your storage backend immediately. No code changes, no SDK, no restarts.

Week 1: First replay

Engineers replay real production requests against builds and catch the first set of regressions. Bugs that were invisible in staging are now reproducible in seconds.

Month 1–2: Closed loop

On lower-risk surfaces, the AI loop runs end to end: discover, reproduce, validate, PR. Your LLM, your policies, your audit trail. Humans keep the gate on anything sensitive.

Install in minutes via Helm

Two steps. Pick your backend. Everything stays in your cluster.


Step 1 — Add the Helm repo

helm repo add speedscale-byoc https://speedscale.github.io/speedscale-byoc/
helm repo update

Step 2 — Pick your backend

Backend	Install command
Grafana + Loki	helm install byoc-grafana speedscale-byoc/grafana -n byoc-grafana --create-namespace
Elasticsearch + Kibana	helm install byoc-es speedscale-byoc/elasticsearch -n byoc-elasticsearch --create-namespace
Fluent Bit → GCS	helm install byoc-gcs speedscale-byoc/fluentbit-gcs -n byoc-fluentbit-gcs --create-namespace
Fluent Bit → S3	helm install byoc-s3 speedscale-byoc/fluentbit-s3 -n byoc-fluentbit-s3 --create-namespace
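
As a rough sketch, the table above can be folded into a small shell helper that prints the install command for a chosen backend, followed by two standard post-install checks (helm status and kubectl get pods). The short backend keys (grafana, es, gcs, s3) and the function names are hypothetical labels for this sketch, not part of the Speedscale charts; the release names, chart names, and namespaces are copied from the table. The helper only prints commands, so nothing touches the cluster until you run them yourself.

```shell
# Map a backend key to the release, chart, and namespace from the table above.
# The keys (grafana, es, gcs, s3) are hypothetical shorthand for this sketch.
byoc_install_cmd() {
  case "$1" in
    grafana) release=byoc-grafana; chart=grafana;       ns=byoc-grafana ;;
    es)      release=byoc-es;      chart=elasticsearch; ns=byoc-elasticsearch ;;
    gcs)     release=byoc-gcs;     chart=fluentbit-gcs; ns=byoc-fluentbit-gcs ;;
    s3)      release=byoc-s3;      chart=fluentbit-s3;  ns=byoc-fluentbit-s3 ;;
    *)
      echo "unknown backend: $1 (expected grafana, es, gcs, or s3)" >&2
      return 1
      ;;
  esac
  echo "helm install $release speedscale-byoc/$chart -n $ns --create-namespace"
}

# Print the install command plus two generic checks:
# the Helm release status and the pods in the release namespace.
byoc_plan() {
  byoc_install_cmd "$1" || return 1
  echo "helm status $release -n $ns"
  echo "kubectl get pods -n $ns"
}
```

For example, byoc_plan s3 prints the S3 install command from the table, then helm status byoc-s3 -n byoc-fluentbit-s3 and kubectl get pods -n byoc-fluentbit-s3. Review the output, then run the lines against the cluster you intend to install into.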

Prerequisites, operator-values.yaml examples, verify steps, and troubleshooting are covered in the full install guide.

Ready to deploy Speedscale inside your cloud?

A typical BYOC rollout takes four to six weeks from kickoff to first reproduced bug. We do the install with your platform team. Security reviews the components on their normal schedule; AI governance signs off on the LLM endpoint before it's wired in.
