Air-Gapped Inference

AI for analysts who
cannot afford a breach

The only local AI that protects your data even if your machine is compromised. Six layers of active defense. Zero data leaves your machine. Not to a cloud. Not to us. Not to anyone.

Qwen3.5-9B
Base Model
4-bit NF4
Quantization
6.5 GB
VRAM Required
6 Layers
Security Stack
~60%
Act. Sparsity
All systems operational — zero outbound connections
Use Cases

Built for adversarial environments

Kwyre exists because these professionals cannot upload their work to cloud AI. Not legally. Not ethically. Not safely.

Forensic Investigators
Cannot upload $3B fraud evidence to ChatGPT during an active federal case.
Full local inference, zero telemetry, no chain-of-custody risk.
Criminal Defense Attorneys
Attorney-client privilege prohibits cloud AI on case materials.
Air-gapped by architecture, not policy.
M&A Law Firms
Associates uploading NDA-protected deal docs to ChatGPT is malpractice liability.
Verified zero outbound connections, auditable.
Insurance Underwriters
Actuarial models and cedent PII cannot touch cloud APIs under compliance rules.
Compliance documentation package for your legal team.
Cleared Defense Contractors
Sensitive unclassified data. Can't use classified AI, can't use ChatGPT.
Local, offline, no cleared facility required.
Forensic Accountants
SEC whistleblower cases, active DOJ investigations. Evidence integrity is paramount.
RAM-only storage, cryptographic wipe on session end.
Defense Architecture

Six layers of active security

Every local AI tool treats "local" as the security boundary. Kwyre treats the machine itself as potentially compromised.

L1

Network Isolation

Server binds to 127.0.0.1 only — physically unreachable from any network at the OS level. No firewall rules required. The OS itself blocks all external connections.

OS-Level Enforcement
L2

Process-Level Network Lockdown

iptables rules scoped to the inference process. All outbound traffic blocked except localhost. Even a fully compromised server process cannot make outbound connections.

Kernel-Level Enforcement
L3

Dependency Integrity

SHA256 hash manifest of every installed Python package generated on clean install. Verified at server startup. Tampered torch, transformers, or any dependency causes immediate abort.

Supply Chain Defense
L4

Model Weight Integrity

SHA256 hashes of all model config files verified at every startup. Tampered or replaced model weights cause immediate process abort with clear error.

Cryptographic Verification
L5

Secure RAM Session Storage

Conversations stored only in RAM. Each session gets a unique 32-byte random key. On session end: secure_wipe() overwrites all content with random bytes before clearing references. RAM scraping returns garbage.

Cryptographic Wipe
L6

Intrusion Detection + Auto-Wipe

Background watchdog monitors for unexpected outbound connections and known analysis tools (Wireshark, x64dbg, Fiddler, Ghidra, IDA). On confirmed intrusion: all sessions wiped, server terminated.

Active Threat Response
Capabilities

Production inference engine

Frontier-class model with dual-layer compression. Spike QAT training plus 4-bit quantization. Nobody else does this.

Spike QAT Training

Custom pipeline using Straight-Through Estimator spike encoding with k-curriculum annealing (k=50→5). The model learns to tolerate spike-encoded activations.

SpikeServe Activation Encoding

Dynamic spike encoding at inference time. Significant activation sparsity without quality loss. Dual compression: weights AND activations.

Zero Content Logging

Metadata only. No telemetry, no analytics, no error reporting, no update pings, no license callbacks. Verify yourself with Wireshark.

OpenAI-Compatible API

POST /v1/chat/completions drop-in replacement. Works with any OpenAI SDK. Session management and compliance endpoints built in.

LoRA Fine-Tuning

Adapters targeting MLP layers (gate_proj, up_proj, down_proj). Fine-tuning with 0.2% trainable parameters. QLoRA rank 64, alpha 128.

Compliance & Audit

GET /audit for metadata-only compliance logs. GET /health for full security stack status. Architecture designed for HIPAA, FINRA, SOC2-adjacent requirements.

Competitive Analysis

Every security feature is unique to Kwyre

No other local inference tool targets the compliance and adversarial environment use case.

Capability Kwyre ChatGPT Ollama LM Studio Jan.ai LocalAI
Fully localYESYESYESYESYES
Open sourceYESYESYESYES
Localhost-only bindingYES
Process outbound blockYES
Dependency integrityYES
Model weight verificationYES
RAM-only sessionsYES
Cryptographic session wipeYES
Intrusion detection + auto-wipeYES
Compliance documentationYES
Audit endpointYES
Anonymous payment (XMR)YESfreefreefreefree
Spike activation encodingYES
Custom QAT trainingYES
Pricing

One-time. No subscription.

Credit card or Monero. No email required for Monero purchases. No recurring billing. No statement entries.

Personal
$299
One-time · 1 machine
  • Full inference server
  • 6-layer security stack
  • Compliance documentation
  • OpenAI-compatible API
Air-Gapped Kit
$1,499
One-time · 5 machines
  • Everything in Professional
  • Offline installer
  • Full audit package
  • White-glove setup
Console

Inference terminal

AWAITING AUTHENTICATION