Section 1
Deployment Architectures
LINUS-AI supports four canonical deployment topologies, from a single workstation
to a multi-node air-gapped cluster. Choose the pattern that matches your scale and
security requirements.
🖥 Single-Node Server
One machine runs the REST API. LAN clients connect over HTTP/HTTPS. Simplest to operate; ideal for teams of up to ~20 concurrent users.
⚡ Multi-GPU Single Node
Tensor parallelism splits model weights across N GPUs on one machine. No inter-node networking required. Supports up to 8 GPUs with NVLink.
🕸 Multi-Node Cluster
Mesh networking connects a coordinator with worker nodes. Pipeline parallelism distributes transformer layers across machines. mTLS-encrypted transport.
🔒 Air-Gap / Offline
After licence activation, LINUS-AI operates with zero internet connectivity. Models are pulled once, verified by hash, and stored locally.
Single-Node Server Topology
single-node topology
┌────────────────────────────────────────────────────────┐
│ LAN / Corporate Network │
│ │
│ Client A ──┐ │
│ Client B ──┤──► Nginx :443 ──► linus-ai :8080 │
│ Client C ──┘ (TLS termination) │ │
│ ▼ │
│ Model on disk │
│ (GPU / CPU infer) │
└────────────────────────────────────────────────────────┘
Multi-GPU Single-Node Topology
tensor parallel — 4× GPU
┌────────────────────────────────────────────────────────┐
│ linus-ai server (tensor_parallel = 4) │
│ │
│ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ │
│ │ GPU0 │ │ GPU1 │ │ GPU2 │ │ GPU3 │ NVLink/PCIe │
│ │shard0│◄►│shard1│◄►│shard2│◄►│shard3│ │
│ └──────┘ └──────┘ └──────┘ └──────┘ │
│ AllReduce sync on every forward pass │
└────────────────────────────────────────────────────────┘
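The AllReduce step in the diagram can be sketched in plain Python: shard the weight matrix's reduction dimension across four "devices", compute partial outputs, then sum them elementwise. This is an illustrative toy, not LINUS-AI's engine — real tensor parallelism runs NCCL collectives on GPU tensors.

```python
# Toy sketch of tensor parallelism: shard a weight matrix row-wise
# across 4 "devices", compute partial matmuls, then AllReduce
# (here: elementwise sum) to recover the full result.

def matmul(x, w):
    """x: vector of length k, w: k x n matrix -> vector of length n."""
    return [sum(x[i] * w[i][j] for i in range(len(x)))
            for j in range(len(w[0]))]

def shard_rows(w, n_gpus):
    """Split w's rows (the reduction dimension) evenly across devices."""
    step = len(w) // n_gpus
    return [w[g * step:(g + 1) * step] for g in range(n_gpus)]

def all_reduce_sum(partials):
    """Sum per-device partial outputs elementwise."""
    return [sum(p[j] for p in partials) for j in range(len(partials[0]))]

x = [1.0, 2.0, 3.0, 4.0]
w = [[1, 0], [0, 1], [1, 1], [2, 2]]           # 4 x 2 weight matrix

n_gpus = 4
w_shards = shard_rows(w, n_gpus)
x_shards = [x[g:g + 1] for g in range(n_gpus)]  # matching input slices
partials = [matmul(xs, ws) for xs, ws in zip(x_shards, w_shards)]
y = all_reduce_sum(partials)                    # AllReduce on forward pass

assert y == matmul(x, w)                        # matches single-device result
```

The sync cost is why NVLink matters: every forward pass pays one AllReduce per sharded layer.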
Multi-Node Cluster Topology
coordinator + workers — mesh networking
┌────────────────────────────────────────────────────────┐
│ Cluster (LAN / VPN) │
│ │
│ Clients ──► ┌─────────────────────┐ │
│ │ Coordinator Node │ :8080 REST API │
│ │ 192.168.1.10 │ :9090 mesh │
│ └────────┬────────────┘ │
│ mTLS │ mesh │
│ ┌────────────┴─────────────┐ │
│ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Worker Node │ │ Worker Node │ │
│ │ 192.168.1.11 │ │ 192.168.1.12 │ │
│ │ layers 0–15 │ │ layers 16–31 │ │
│ └─────────────────┘ └─────────────────┘ │
└────────────────────────────────────────────────────────┘
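The layer split in the diagram (worker one holds layers 0–15, worker two holds 16–31) can be sketched as a contiguous partition, with activations handed from stage to stage. Layers are stand-in functions here; in the real cluster the hand-off is the mTLS mesh hop on port 9090.

```python
# Toy sketch of pipeline parallelism: 32 "layers" split across two
# workers, as in the diagram. Activations flow worker to worker.

N_LAYERS, N_WORKERS = 32, 2

def make_layer(i):
    return lambda h: h + 1            # stand-in for a transformer layer

layers = [make_layer(i) for i in range(N_LAYERS)]

# Contiguous partition: worker 0 gets layers 0-15, worker 1 gets 16-31.
per_worker = N_LAYERS // N_WORKERS
stages = [layers[w * per_worker:(w + 1) * per_worker]
          for w in range(N_WORKERS)]

def run_stage(stage, h):
    for layer in stage:
        h = layer(h)
    return h

h = 0
for stage in stages:                  # this hop is the mTLS mesh transfer
    h = run_stage(stage, h)

assert len(stages[0]) == len(stages[1]) == 16
assert h == 32                        # all 32 layers applied exactly once
```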
Air-Gap / Offline Deployment
air-gap deployment flow
# Step 1 — on internet-connected machine, generate offline token
$ linus-ai --generate-offline-token --key LNAI-XXXX-XXXX-XXXX-XXXX
✓ Offline activation token saved: linus-ai-offline-token.bin
# Step 2 — transfer token + binary + model files to air-gapped machine
# (USB drive, secure file transfer, etc.)
# Step 3 — activate offline
$ linus-ai --activate-offline linus-ai-offline-token.bin
✓ Licence activated offline (Enterprise · air-gap mode)
✓ No outbound connections required from this point forward
Section 3
Production Deployment
Run LINUS-AI as a managed system service with automatic restarts, structured logging,
and an Nginx reverse proxy handling TLS termination.
Systemd Service Unit
Create the unit file at /etc/systemd/system/linus-ai.service:
/etc/systemd/system/linus-ai.service
[Unit]
Description=LINUS-AI Private Inference Engine
Documentation=https://linus-ai.com/docs/admin
After=network.target
Wants=network-online.target
[Service]
Type=simple
User=linus-ai
Group=linus-ai
WorkingDirectory=/var/lib/linus-ai
ExecStart=/usr/local/bin/linus-ai --serve --host 127.0.0.1 --port 8080
Restart=always
RestartSec=5s
TimeoutStopSec=30s
# Environment
Environment=HOME=/var/lib/linus-ai
Environment=LINUS_AI_CONFIG=/etc/linus-ai/config.toml
Environment=LOG_LEVEL=info
# Logging — write to journald
StandardOutput=journal
StandardError=journal
SyslogIdentifier=linus-ai
[Install]
WantedBy=multi-user.target
# Enable and start the service
$ sudo systemctl daemon-reload
$ sudo systemctl enable --now linus-ai
● linus-ai.service — LINUS-AI Private Inference Engine
Active: active (running)
Nginx Reverse Proxy (HTTPS Termination)
Place this config at /etc/nginx/sites-available/linus-ai and symlink to sites-enabled:
/etc/nginx/sites-available/linus-ai
server {
listen 80;
server_name ai.example.com;
return 301 https://$host$request_uri;
}
server {
listen 443 ssl http2;
server_name ai.example.com;
# TLS — managed by certbot
ssl_certificate /etc/letsencrypt/live/ai.example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/ai.example.com/privkey.pem;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers HIGH:!aNULL:!MD5;
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 10m;
# Proxy to linus-ai
location / {
proxy_pass http://127.0.0.1:8080;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_read_timeout 300s; # allow time for long responses
proxy_buffering off; # required for SSE streaming
}
# Block public access to metrics endpoint
location /metrics {
allow 10.0.0.0/8;
deny all;
}
}
Let's Encrypt Certificate (certbot)
certbot — initial issuance & auto-renewal
$ sudo apt install certbot python3-certbot-nginx
$ sudo certbot --nginx -d ai.example.com --email admin@example.com --agree-tos
✓ Certificate issued. Valid for 90 days.
✓ Auto-renewal cron job installed at /etc/cron.d/certbot
# Verify auto-renewal
$ sudo certbot renew --dry-run
✓ Dry run: renewal successful
Docker Compose Alternative
docker-compose.yml
version: "3.9"
services:
linus-ai:
image: ghcr.io/linus-ai/linus-ai:latest
restart: always
ports:
- "127.0.0.1:8080:8080"
volumes:
- ./config:/etc/linus-ai:ro
- linus-ai-data:/var/lib/linus-ai
- /path/to/models:/models:ro
environment:
- LINUS_AI_CONFIG=/etc/linus-ai/config.toml
- LINUS_AI_API_KEY=${LINUS_AI_API_KEY}
- LOG_LEVEL=info
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: all
capabilities: [gpu]
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
interval: 30s
timeout: 5s
retries: 3
start_period: 60s
volumes:
linus-ai-data:
Health Check Endpoint
GET /health
$ curl -s http://localhost:8080/health | jq .
{
"status": "ok",
"version": "2.4.1",
"uptime_seconds": 84321,
"model_loaded": "llama3.2",
"gpu_available": true,
"gpu_backend": "cuda",
"active_connections": 3
}
Log Rotation
Create /etc/logrotate.d/linus-ai to manage log file growth:
/etc/logrotate.d/linus-ai
/var/lib/linus-ai/.linus_ai/logs/*.log {
daily
rotate 14
compress
delaycompress
missingok
notifempty
postrotate
systemctl kill -s HUP linus-ai.service
endscript
}
Section 4
Access Control
LINUS-AI provides layered access control: API key authentication, IP allowlisting,
per-key rate limiting, and LDAP/SSO integration for Enterprise deployments.
API Key Authentication
Set the LINUS_AI_API_KEY environment variable (or server.api_key in config.toml) to
require bearer token authentication on all API requests.
API key setup
# Generate a strong random key
$ openssl rand -hex 32
a3f8c2d1e9b74056c1f2a3b4d5e6f708a9b0c1d2e3f4a5b6c7d8e9f0a1b2c3d4
# Set in environment (recommended: use a secrets manager)
$ export LINUS_AI_API_KEY=a3f8c2d1e9b74056c1f2a3b4d5e6f708...
# Clients include the key as a Bearer token
$ curl http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer a3f8c2d1e9b74056..." \
-H "Content-Type: application/json" \
-d '{"model":"llama3.2","messages":[...]}'
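Server-side, bearer validation amounts to parsing the `Authorization` header and comparing tokens in constant time. The sketch below is ours, not LINUS-AI's actual implementation; the function name and structure are hypothetical, but `hmac.compare_digest` is the standard way to avoid timing side channels when comparing secrets.

```python
import hmac

# Hypothetical sketch of server-side bearer-token validation.
# The helper name and error handling are illustrative only.

EXPECTED_KEY = "a3f8c2d1e9b74056c1f2a3b4d5e6f708"

def check_auth(headers: dict) -> bool:
    auth = headers.get("Authorization", "")
    scheme, _, token = auth.partition(" ")
    if scheme != "Bearer" or not token:
        return False
    # Constant-time comparison: avoids leaking key content via timing.
    return hmac.compare_digest(token, EXPECTED_KEY)

assert check_auth({"Authorization": f"Bearer {EXPECTED_KEY}"})
assert not check_auth({"Authorization": "Bearer wrong-key"})
assert not check_auth({})                  # missing header -> reject (401)
```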
Multiple API Keys (Team / Enterprise)
Define per-team or per-user keys in config.toml with optional rate limits:
config.toml — multiple API keys
[server]
host = "127.0.0.1"
port = 8080
[[server.api_keys]]
name = "engineering"
key = "eng-a3f8c2d1e9b74056..."
rate_limit = 100 # requests per minute
token_limit = 50000 # tokens per hour
[[server.api_keys]]
name = "data-science"
key = "ds-7b2c9f1a4d8e3b5c..."
rate_limit = 200
token_limit = 200000
[[server.api_keys]]
name = "readonly-bot"
key = "bot-2e5a8f3c1d7b9e4a..."
rate_limit = 10
allowed_endpoints = ["/v1/chat/completions", "/health"]
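A per-key `rate_limit` of N requests per minute can be enforced with a fixed-window counter keyed by API key name. The enforcement inside LINUS-AI is internal; this sketch only illustrates the bookkeeping that produces an HTTP 429 once the window's budget is spent.

```python
from collections import defaultdict

# Sketch of per-key fixed-window rate limiting (rate_limit = requests
# per minute, as in the config above). Illustrative only.

class RateLimiter:
    def __init__(self, limits):               # {key_name: requests/minute}
        self.limits = limits
        self.windows = defaultdict(lambda: (-1, 0))  # key -> (window, count)

    def allow(self, key, now_s):
        window = int(now_s // 60)             # one-minute fixed windows
        last_window, count = self.windows[key]
        if window != last_window:
            last_window, count = window, 0    # new minute, reset counter
        if count >= self.limits[key]:
            return False                      # budget spent -> HTTP 429
        self.windows[key] = (last_window, count + 1)
        return True

rl = RateLimiter({"readonly-bot": 10})
results = [rl.allow("readonly-bot", 0.0) for _ in range(11)]
assert results.count(True) == 10 and results[-1] is False
assert rl.allow("readonly-bot", 61.0)         # next minute: allowed again
```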
IP Allowlisting
config.toml — IP allowlist
[server]
allowed_ips = [
"10.0.0.0/8", # private class A
"172.16.0.0/12", # private class B
"192.168.0.0/16", # private class C
"127.0.0.1", # loopback
]
# Requests from IPs outside this list receive HTTP 403
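The allowlist check is CIDR membership: a client passes if its address falls inside any configured network. Python's standard `ipaddress` module expresses it directly (a sketch mirroring the config above, not LINUS-AI's internals):

```python
import ipaddress

# Sketch of the allowlist check: a client IP is accepted if it falls
# inside any configured network; anything else receives HTTP 403.

ALLOWED = [ipaddress.ip_network(n) for n in
           ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.1"]]

def ip_allowed(client_ip: str) -> bool:
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED)

assert ip_allowed("10.0.0.42")
assert ip_allowed("127.0.0.1")
assert not ip_allowed("8.8.8.8")          # public IP -> 403
```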
LDAP / Active Directory Integration (Enterprise)
config.toml — LDAP/AD integration
[auth.ldap]
enabled = true
server = "ldap://dc01.corp.example.com:389"
bind_dn = "cn=svc-linusai,ou=service,dc=corp,dc=example,dc=com"
bind_password = "${LDAP_BIND_PASSWORD}" # from env
search_base = "ou=users,dc=corp,dc=example,dc=com"
search_filter = "(&(objectClass=user)(sAMAccountName={username}))"
group_filter = "(&(objectClass=group)(member={dn}))"
# Only members of this AD group can access the API
allowed_groups = ["CN=AI-Users,OU=groups,DC=corp,DC=example,DC=com"]
# Admin group — can access /metrics and management endpoints
admin_groups = ["CN=AI-Admins,OU=groups,DC=corp,DC=example,DC=com"]
# TLS for LDAPS
tls = true
tls_ca_cert = "/etc/ssl/certs/corp-ca.pem"
SSO via OAuth2 / OIDC (Enterprise)
config.toml — OIDC / SSO
[auth.oidc]
enabled = true
provider_url = "https://sso.corp.example.com/realms/main"
client_id = "linus-ai-prod"
client_secret = "${OIDC_CLIENT_SECRET}"
redirect_uri = "https://ai.example.com/auth/callback"
scopes = ["openid", "profile", "email", "groups"]
required_groups = ["ai-users"] # claim name from IdP
admin_groups = ["ai-admins"]
Section 5
Multi-Seat Licence Management
Team and Enterprise licences support multiple concurrent seat activations against a single licence key.
Manage seats from the CLI or the Enterprise portal.
🪑 Team Licence
Up to 10 named machine seats per key. Each user/machine activates against the shared key. Seat count enforced server-side.
🏢 Enterprise Licence
Unlimited or negotiated seat count. Managed via the Enterprise portal. Supports air-gap activation and licence server (on-prem).
🔄 Seat Transfer
Professional: deactivate old machine, then reactivate on new. Team/Enterprise: managed via portal or support ticket.
Check Seat Usage
linus-ai --license-status
$ linus-ai --license-status
Licence: Team Edition
Key: LNAI-TEAM-XXXX-XXXX-XXXX
Seats: 7 / 10 active
Expires: 2026-12-31
Active machines:
1. build-server-01 activated 2025-09-01
2. dev-workstation-a activated 2025-09-03
3. dev-workstation-b activated 2025-09-03
4. ml-node-01 activated 2025-10-15
5. ml-node-02 activated 2025-10-15
6. staging-server activated 2025-11-01
7. prod-server activated 2025-12-01
Activate a New Seat
activating a new machine
$ linus-ai --activate LNAI-TEAM-XXXX-XXXX-XXXX
✓ Licence activated (Team · 8/10 seats used)
Machine ID: new-workstation-c
Deactivate a Machine
deactivating a seat
# Deactivate from the machine you want to remove
$ linus-ai --deactivate
✓ Machine deactivated. Seat released (7/10 seats now used).
# If the machine is inaccessible, contact support:
# support@linus-ai.com — include your licence key
Air-Gap Seat Activation (Enterprise)
offline seat activation flow
# 1. On the air-gapped machine, generate an activation request
$ linus-ai --generate-activation-request --key LNAI-ENT-XXXX-XXXX-XXXX
✓ Activation request saved: activation-request.json
# 2. Transfer activation-request.json to internet-connected machine
# 3. Submit via Enterprise portal → get activation-token.bin
# 4. Transfer token back to air-gapped machine
$ linus-ai --activate-offline activation-token.bin
✓ Seat activated offline (Enterprise · air-gap mode)
Section 6
Monitoring & Observability
LINUS-AI exposes Prometheus-compatible metrics and structured JSON logs for integration
with your existing observability stack.
Prometheus Metrics
Metrics are available at GET /metrics in Prometheus text format.
| Metric | Type | Description |
| --- | --- | --- |
| linus_ai_request_count_total | Counter | Total API requests, labelled by method, path, status |
| linus_ai_tokens_per_second | Gauge | Current inference throughput in tokens/s |
| linus_ai_model_load_time_seconds | Histogram | Time taken to load model into memory at startup |
| linus_ai_active_connections | Gauge | Number of currently active client connections |
| linus_ai_gpu_utilization | Gauge | GPU utilisation percentage (per device, labelled by gpu_id) |
| linus_ai_memory_used_bytes | Gauge | Process RSS memory usage in bytes |
| linus_ai_request_duration_seconds | Histogram | End-to-end request latency |
| linus_ai_tokens_in_total | Counter | Total input tokens processed |
| linus_ai_tokens_out_total | Counter | Total output tokens generated |
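The `/metrics` endpoint serves standard Prometheus text format, so it is easy to spot-check outside Prometheus too. A minimal parser sketch (the sample values below are hypothetical; the metric names match the table above):

```python
# Sketch: parsing Prometheus text-format lines like those at /metrics.
# Sample values are hypothetical; names match the metrics table.

SAMPLE = """\
linus_ai_request_count_total{method="POST",path="/v1/chat/completions",status="200"} 1843
linus_ai_tokens_per_second 87.4
linus_ai_active_connections 3
"""

def parse_metrics(text):
    out = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue                          # skip HELP/TYPE comment lines
        name_labels, value = line.rsplit(" ", 1)
        name = name_labels.split("{", 1)[0]   # drop the label set
        out.setdefault(name, []).append(float(value))
    return out

m = parse_metrics(SAMPLE)
assert m["linus_ai_tokens_per_second"] == [87.4]
assert m["linus_ai_request_count_total"] == [1843.0]
```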
Prometheus Scrape Config
prometheus.yml — scrape config
scrape_configs:
- job_name: "linus-ai"
scrape_interval: 15s
scrape_timeout: 10s
static_configs:
- targets:
- "10.0.1.10:8080" # prod-server
- "10.0.1.11:8080" # ml-node-01
- "10.0.1.12:8080" # ml-node-02
relabel_configs:
- source_labels: [__address__]
target_label: instance
metrics_path: /metrics
# Add bearer token if metrics auth is enabled
authorization:
credentials: "${LINUS_AI_METRICS_TOKEN}"
Structured JSON Logs
Logs are written to ~/.linus_ai/logs/linus-ai.log (or /var/lib/linus-ai/.linus_ai/logs/
when running as a service). Each line is a JSON object.
linus-ai.log — sample entries
{"timestamp":"2025-12-01T09:14:22.341Z","level":"info","method":"POST","path":"/v1/chat/completions","model":"llama3.2","tokens_in":142,"tokens_out":318,"duration_ms":1204,"status":200,"client_ip":"10.0.0.42"}
{"timestamp":"2025-12-01T09:14:25.102Z","level":"info","method":"GET","path":"/health","duration_ms":1,"status":200,"client_ip":"127.0.0.1"}
{"timestamp":"2025-12-01T09:14:31.889Z","level":"warn","event":"rate_limit_exceeded","key_name":"engineering","client_ip":"10.0.0.55"}
{"timestamp":"2025-12-01T09:15:00.001Z","level":"error","event":"model_load_failed","model":"llama3-70b","reason":"insufficient_vram","detail":"need 40GB, have 24GB"}
Log Levels
log level configuration
# Set via environment variable
$ export LOG_LEVEL=debug # debug | info | warn | error
# Or in config.toml
[logging]
level = "info"
format = "json" # json | text
path = "~/.linus_ai/logs/linus-ai.log"
max_size_mb = 100
max_backups = 7
compress_old = true
Health Check
health check — full response
$ curl -s http://localhost:8080/health | jq .
{
"status": "ok",
"version": "2.4.1",
"build": "2025-12-01T00:00:00Z",
"uptime_seconds": 84321,
"model_loaded": "llama3.2",
"gpu_available": true,
"gpu_backend": "cuda",
"gpu_devices": ["NVIDIA RTX 4090", "NVIDIA RTX 4090"],
"active_connections": 3,
"tokens_per_second": 87.4
}
Section 7
Backup & Recovery
Back up configuration, the encrypted vault, and your licence file. Model files are
large but re-pullable; back them up only if your recovery time objective requires it.
What to Back Up
| Path | Size | Criticality | Notes |
| --- | --- | --- | --- |
| ~/.linus_ai/config.toml | < 5 KB | Critical | All runtime configuration |
| ~/.linus_ai/vault/ | Variable | Critical | AES-256-GCM encrypted conversation store |
| ~/.linus_ai/licence.bin | < 1 KB | Critical | Locally cached licence; restore from key if lost |
| ~/.linus_ai/vault.key | < 1 KB | Critical | Vault encryption key — required when migrating machines |
| ~/.linus_ai/models/ | 1–100+ GB | Optional | Re-pullable from registry; backup if air-gapped |
Vault key warning: The vault encryption key is auto-derived from hardware identifiers on each machine. If you restore the vault on a different machine without also restoring the key file, the vault contents will be unreadable. Always export the key with linus-ai --export-vault-key before migrating.
Backup Script (Bash / cron)
/usr/local/bin/backup-linus-ai.sh
#!/usr/bin/env bash
set -euo pipefail
BACKUP_DIR="/mnt/backups/linus-ai"
DATE=$(date +%Y%m%d-%H%M%S)
DEST="${BACKUP_DIR}/${DATE}"
LINUS_HOME="/var/lib/linus-ai/.linus_ai"
mkdir -p "${DEST}"
# Back up critical files
cp "${LINUS_HOME}/config.toml" "${DEST}/"
cp "${LINUS_HOME}/licence.bin" "${DEST}/"
cp "${LINUS_HOME}/vault.key" "${DEST}/"
cp -r "${LINUS_HOME}/vault/" "${DEST}/vault/"
# Compress
tar -czf "${BACKUP_DIR}/linus-ai-${DATE}.tar.gz" -C "${BACKUP_DIR}" "${DATE}"
rm -rf "${DEST}"
# Prune backups older than 30 days
find "${BACKUP_DIR}" -name "*.tar.gz" -mtime +30 -delete
echo "Backup complete: linus-ai-${DATE}.tar.gz"
# Add to crontab: daily at 02:00
# 0 2 * * * /usr/local/bin/backup-linus-ai.sh >> /var/log/linus-ai-backup.log 2>&1
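A backup you have never opened is a hope, not a backup. A small verification sketch (ours, not part of LINUS-AI) that checks an archive actually contains the critical files; here it builds a throwaway archive in a temp dir to demonstrate the check:

```python
import os
import tarfile
import tempfile

# Sketch: verify a backup archive contains the critical files before
# trusting it. Illustrative; pair it with the cron job above.

REQUIRED = {"config.toml", "licence.bin", "vault.key"}

def verify_backup(tar_path):
    """Return the set of required files missing from the archive."""
    with tarfile.open(tar_path, "r:gz") as tar:
        names = {os.path.basename(m.name) for m in tar.getmembers()}
    return REQUIRED - names          # empty set == backup is complete

with tempfile.TemporaryDirectory() as tmp:
    for fname in REQUIRED:           # build a stub backup to check
        with open(os.path.join(tmp, fname), "w") as f:
            f.write("stub")
    archive = os.path.join(tmp, "linus-ai-test.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        for fname in REQUIRED:
            tar.add(os.path.join(tmp, fname), arcname=fname)
    missing = verify_backup(archive)

assert missing == set()              # nothing missing
```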
Vault Key Export / Migration
vault key export & restore
# On the source machine — export vault key
$ linus-ai --export-vault-key --output vault-export.key
✓ Vault key exported (keep this file secure!)
# On the destination machine — import vault key before restoring vault
$ linus-ai --import-vault-key vault-export.key
✓ Vault key imported. Vault data now readable on this machine.
Disaster Recovery Procedure
- Install the LINUS-AI binary on the new machine (same version or newer).
- If migrating vault data: import the vault key before restoring vault files.
- Restore config.toml, vault/, and licence.bin to ~/.linus_ai/.
- Re-activate the licence: linus-ai --activate LNAI-XXXX-XXXX-XXXX-XXXX (or use offline token for air-gap).
- If using system models directory, re-pull or restore model files.
- Start the service: sudo systemctl start linus-ai and verify with /health.
Section 8
Security Hardening
For production deployments, apply the following hardening checklist.
Each layer reduces the blast radius of a potential compromise.
👤 Non-root Service User
Run as a dedicated linus-ai system user with no shell and no sudo access. Never run as root.
🔥 Firewall Rules
Expose port 8080 only to trusted subnets. Block the metrics endpoint on public interfaces via Nginx or iptables.
🔑 API Key Rotation
Rotate API keys at least quarterly, or immediately upon suspected compromise. Zero-downtime rotation via config reload.
📋 Audit Log Review
Periodically review structured logs for anomalous patterns: unusual IPs, high error rates, unexpected models, or rate-limit events.
🏠 Principle of Least Privilege
Use systemd security directives to confine the service. The unit should not be able to access network namespaces, kernel modules, or arbitrary paths.
🔒 TLS Everywhere
Always use HTTPS for client-facing traffic. For multi-node mesh, mTLS is auto-configured. Never expose plain HTTP on a public or shared network.
Create the Service User
create dedicated linus-ai system user
$ sudo useradd --system --no-create-home --shell /sbin/nologin \
--home-dir /var/lib/linus-ai linus-ai
$ sudo mkdir -p /var/lib/linus-ai /etc/linus-ai
$ sudo chown linus-ai:linus-ai /var/lib/linus-ai
$ sudo chmod 750 /var/lib/linus-ai
Firewall Rules (ufw)
ufw — restrict access
# Allow HTTPS from the corporate network only
$ sudo ufw allow from 10.0.0.0/8 to any port 443 proto tcp
$ sudo ufw allow from 192.168.0.0/16 to any port 443 proto tcp
# Deny direct access to linus-ai port from all external hosts
# (Nginx reverse proxy handles 443 → 8080)
$ sudo ufw deny 8080
# Allow mesh networking port from cluster subnet only
$ sudo ufw allow from 10.0.1.0/24 to any port 9090 proto tcp
Hardened Systemd Unit
Apply additional systemd security directives to sandbox the service:
/etc/systemd/system/linus-ai.service — hardened
[Unit]
Description=LINUS-AI Private Inference Engine
After=network.target
[Service]
Type=simple
User=linus-ai
Group=linus-ai
WorkingDirectory=/var/lib/linus-ai
ExecStart=/usr/local/bin/linus-ai --serve --host 127.0.0.1 --port 8080
Restart=always
RestartSec=5s
# Privilege restrictions
NoNewPrivileges=yes
CapabilityBoundingSet=
AmbientCapabilities=
SecureBits=keep-caps-locked
# Filesystem isolation
PrivateTmp=yes
PrivateDevices=no # set yes if not using GPU
ProtectSystem=strict
ProtectHome=yes
ReadWritePaths=/var/lib/linus-ai /etc/linus-ai
ReadOnlyPaths=/usr/local/bin/linus-ai
# Restrict sockets to IPv4/IPv6 and Unix address families
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
# System call filtering
SystemCallFilter=@system-service
SystemCallFilter=~@privileged @resources
SystemCallArchitectures=native
# Kernel hardening
ProtectKernelTunables=yes
ProtectKernelModules=yes
ProtectKernelLogs=yes
ProtectControlGroups=yes
ProtectClock=yes
LockPersonality=yes
MemoryDenyWriteExecute=no # JIT inference requires W+X
[Install]
WantedBy=multi-user.target
API Key Rotation (Zero Downtime)
rotating an API key without downtime
# 1. Add the new key alongside the old key in config.toml
# 2. Reload config (no restart required)
$ sudo systemctl kill -s HUP linus-ai.service
✓ Config reloaded. Both old and new keys are now active.
# 3. Update clients to use the new key
# 4. Remove the old key from config.toml, reload again
$ sudo systemctl kill -s HUP linus-ai.service
✓ Old key deactivated.
Section 9
Updating LINUS-AI
Updates are distributed as single-file binaries via GitHub Releases.
The update process is designed to be fast and low-risk, with rollback support.
Standard Update Procedure
update to latest release
# 1. Download the new binary
$ VERSION="2.5.0"
$ cd /tmp
$ curl -Lo linus-ai-linux-x86_64 \
  "https://github.com/LINUS-AI-PRO/linus-ai/releases/download/v${VERSION}/linus-ai-linux-x86_64"
# 2. Verify SHA-256 checksum (run from /tmp so the filename matches checksums.txt)
$ curl -sL "https://github.com/LINUS-AI-PRO/linus-ai/releases/download/v${VERSION}/checksums.txt" \
  | grep "linus-ai-linux-x86_64" | sha256sum --check
linus-ai-linux-x86_64: OK
# 3. Keep previous binary as rollback
$ sudo cp /usr/local/bin/linus-ai /usr/local/bin/linus-ai.prev
# 4. Replace the binary
$ chmod +x linus-ai-linux-x86_64
$ sudo mv linus-ai-linux-x86_64 /usr/local/bin/linus-ai
# 5. Restart the service
$ sudo systemctl restart linus-ai
# 6. Confirm the new version
$ linus-ai --version
LINUS-AI v2.5.0 (linux/amd64, cuda 12.3)
Rollback Procedure
rollback to previous version
$ sudo systemctl stop linus-ai
$ sudo mv /usr/local/bin/linus-ai.prev /usr/local/bin/linus-ai
$ sudo systemctl start linus-ai
✓ Rolled back to previous version.
Check release notes before upgrading. Some releases introduce breaking changes to
config.toml structure. The release notes at
github.com/LINUS-AI-PRO/linus-ai/releases
include a "Breaking Changes" section. Always review before upgrading in production.
Symlink-Based Version Management
versioned binaries with symlink
# Store versioned binaries, point symlink at active version
$ sudo mv /tmp/linus-ai-new /usr/local/lib/linus-ai/linus-ai-2.5.0
$ sudo ln -sfn /usr/local/lib/linus-ai/linus-ai-2.5.0 /usr/local/bin/linus-ai
# Rollback = update symlink
$ sudo ln -sfn /usr/local/lib/linus-ai/linus-ai-2.4.1 /usr/local/bin/linus-ai
$ sudo systemctl restart linus-ai
Section 10
Troubleshooting
Diagnose and resolve the most common production issues.
Start with journalctl -u linus-ai -n 50 for recent service logs.
Service Won't Start
diagnosing startup failures
# Check recent service logs
$ journalctl -u linus-ai -n 50 --no-pager
# Check service status
$ systemctl status linus-ai
# Common causes:
# - Config parse error: validate with: linus-ai --validate-config
# - Port already in use: ss -tlnp | grep 8080
# - Permission error: check /var/lib/linus-ai ownership
# - Binary not found: which linus-ai
$ linus-ai --validate-config --config /etc/linus-ai/config.toml
✓ Config valid
GPU Not Detected
GPU diagnostics
# NVIDIA
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.54 Driver Version: 535.54 CUDA Version: 12.2 |
# AMD ROCm
$ rocm-smi
# Check CUDA environment
$ nvcc --version
$ ldconfig -p | grep libcuda
# Force CPU fallback while debugging GPU
$ linus-ai --serve --backend cpu
# Check linus-ai GPU detection
$ linus-ai --system-info
GPU: NVIDIA RTX 4090 (24 GB VRAM) — CUDA 12.2 ✓
High Memory Usage
reducing memory pressure
# config.toml — reduce context window and enable KV cache quantisation
[inference]
context_len = 4096 # reduce from 8192
kv_cache_quantize = true # Q8 KV cache — ~50% VRAM reduction
gpu_memory_fraction = 0.85 # leave headroom for OS
batch_size = 1 # reduce concurrent requests
# Or use a more aggressive quantisation level
quantize = "Q4_K_M" # instead of Q8_0 or F16
Slow First Request (Model Load Time)
pre-warming the model at startup
# config.toml — load model at startup, not on first request
[inference]
warmup_on_start = true
model = "llama3.2" # model to pre-load
# Systemd: extend start timeout to allow model loading
[Service]
TimeoutStartSec = 300 # 5 minutes for large models
Mesh Nodes Not Connecting
mesh connectivity diagnostics
# 1. Check firewall — port 9090 must be open between all mesh nodes
$ nc -zv 192.168.1.10 9090
Connection to 192.168.1.10 9090 port [tcp] succeeded!
# 2. Check mesh status on coordinator
$ linus-ai --mesh-status
Cluster: 1 coordinator, 1/2 workers connected
✓ 192.168.1.11 worker online (ping: 0.4ms)
✗ 192.168.1.12 worker offline (last seen: 5m ago)
# 3. mTLS cert rotation — rotate if certificates have expired
$ linus-ai --rotate-mesh-certs
✓ mTLS certificates rotated. Workers will reconnect automatically.
# 4. Check logs on the disconnected worker
$ journalctl -u linus-ai -n 30 --no-pager | grep mesh
Common Error Codes
| Code / Event | HTTP Status | Cause | Resolution |
| --- | --- | --- | --- |
| auth_required | 401 | No or invalid API key | Include Authorization: Bearer KEY header |
| forbidden | 403 | IP not in allowlist, or key lacks endpoint access | Check allowed_ips and allowed_endpoints in config |
| rate_limit_exceeded | 429 | Per-key rate limit reached | Reduce request rate, or increase rate_limit for the key |
| model_not_found | 404 | Requested model not downloaded | Run linus-ai --pull-model MODEL_NAME |
| context_too_long | 400 | Prompt exceeds context_len | Shorten prompt or increase context_len in config |
| insufficient_vram | 503 | GPU VRAM insufficient for model | Use a smaller model or lower quantisation; enable CPU offload |
| licence_invalid | 503 | Licence expired, revoked, or seats exceeded | Renew licence or deactivate unused seats |
| mesh_unavailable | 503 | Worker nodes not reachable | Check firewall, mTLS certs, and worker service status |
| inference_timeout | 504 | Response took longer than inference_timeout_s | Reduce max_tokens, increase timeout, or scale hardware |
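Of these, 429, 503, and 504 are transient and worth retrying with backoff; 400/401/403/404 indicate a client or config problem that retrying will not fix. A client-side sketch (the delay policy is ours, not a documented LINUS-AI client feature):

```python
import random

# Sketch: client-side retry policy for the transient statuses above
# (429, 503, 504), with capped exponential backoff plus jitter.
# Policy values are illustrative.

RETRYABLE = {429, 503, 504}

def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Delay before retry `attempt` (0-based): min(cap, base*2^n) + jitter."""
    return min(cap, base * (2 ** attempt)) + rng() * 0.1

def should_retry(status, attempt, max_attempts=5):
    return status in RETRYABLE and attempt < max_attempts

assert should_retry(429, 0) and should_retry(503, 4)
assert not should_retry(401, 0)          # auth errors: fix the key instead
assert not should_retry(429, 5)          # give up after max_attempts
delays = [backoff_delay(n, rng=lambda: 0.0) for n in range(4)]
assert delays == [0.5, 1.0, 2.0, 4.0]
```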
Still stuck? Collect a support bundle with linus-ai --support-bundle — this generates a redacted archive of logs, config (secrets masked), and system info. Attach it when contacting support@linus-ai.com.
Section 12 · New in v1.4.0
Compliance & Audit Configuration
LINUS-AI v1.4.0 introduces a fully integrated compliance layer with tamper-evident audit logging,
regulated-profile consent management, and fine-grained RAG document access control.
Audit Log Storage
By default, all audit logs are written to ~/.linus_ai/audit/. Two environment variables
let administrators redirect or replicate logs to any path — ideal for network mounts, WORM drives,
or encrypted partitions.
📁 Primary Directory Override
Set LINUS_AI_AUDIT_DIR to redirect all audit logs to an admin-specified path. Useful for network mounts, WORM drives, or encrypted partitions. Default: ~/.linus_ai/audit/
📤 Export Propagation
Set LINUS_AI_AUDIT_EXPORT_DIRS to a colon-separated list of secondary directories. Every audit record is written to all listed directories in real time — for syslog forwarding, SIEM integration, or offsite backup.
🔒 Immutability Sealing
Completed monthly log files are automatically sealed: chmod 0o400 (read-only), macOS UF_IMMUTABLE chflag, and Linux chattr +i (root required). No modification is possible without first clearing the flag.
🔗 HMAC Chain Verification
Every audit record is HMAC-linked to the previous record. Tamper detection is available via GET /compliance/status — the response includes chain_ok: true/false plus monthly statistics.
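The tamper-evidence property comes from each record's MAC covering both the record body and the previous record's MAC, so editing any historic entry invalidates every MAC after it. A self-contained sketch of the idea (key handling and field names here are illustrative, not LINUS-AI's actual on-disk format):

```python
import hashlib
import hmac
import json

# Sketch of an HMAC-chained audit log: each MAC covers the record body
# plus the previous MAC. Illustrative format, not LINUS-AI's.

KEY = b"demo-audit-key"

def chain(records):
    prev, out = b"", []
    for rec in records:
        body = json.dumps(rec, sort_keys=True).encode()
        mac = hmac.new(KEY, prev + body, hashlib.sha256).hexdigest()
        out.append({"record": rec, "mac": mac})
        prev = mac.encode()
    return out

def verify(entries):
    prev = b""
    for e in entries:
        body = json.dumps(e["record"], sort_keys=True).encode()
        mac = hmac.new(KEY, prev + body, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(mac, e["mac"]):
            return False                     # chain_ok: false
        prev = mac.encode()
    return True                              # chain_ok: true

log = chain([{"event": "session_start"}, {"event": "consent_granted"}])
assert verify(log)
log[0]["record"]["event"] = "tampered"       # edit a historic record...
assert not verify(log)                       # ...and verification fails
```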
Audit log environment variables
# Redirect audit logs to a network-mounted WORM drive
$ export LINUS_AI_AUDIT_DIR=/mnt/worm/linus-ai-audit
# Also mirror every record to a SIEM drop-folder and an offsite backup
$ export LINUS_AI_AUDIT_EXPORT_DIRS=/var/log/siem/linus-ai:/backup/audit
# Verify audit chain integrity
$ curl http://localhost:8080/compliance/status | jq '.chain_ok'
true
Compliance API Endpoints
GET /compliance/profiles · All 14 profile compliance configs (level, regulations, PII scan flag, disclaimer)
GET /compliance/status · Audit chain integrity status + monthly statistics
GET /compliance/consents · Consents granted on this machine (?machine_id=…)
GET /compliance/audit · Query HMAC-chained audit log (?profile_id, ?blocked_only, ?pii_only, ?since, ?limit)
POST /compliance/consent · Grant or revoke consent — body: {profile_id, action:"grant"|"revoke", machine_id}
RAG Document Access Control
Register documents with a classification level and ACL, then use the decision engine to
enforce access based on principal clearance and explicit allow/deny rules.
GET /rag/documents · List documents (?user_id filters to accessible subset; ?all=true for full registry)
POST /rag/documents/register · Register a document with classification and ACL
PUT /rag/documents/{id}/acl · Replace ACL for a document (allow/deny at user/dept/division/company/role scope)
PUT /rag/documents/{id}/classification · Update classification level (0=PUBLIC … 4=TOP_SECRET)
DELETE /rag/documents/{id} · Remove document from registry
POST /rag/access-check · Test access decision for a principal + document (returns PERMIT/DENY + rule)
GET /rag/audit · Query RAG access audit log (?doc_id, ?principal_id, ?decision, ?denied_only, ?stats=true)
GET /rag/principals · List all registered principals
POST /rag/principals · Register or update a principal (user_id, company, division, department, roles, clearance)
DELETE /rag/principals/{id} · Remove a principal
🏷 Classification Levels
0 PUBLIC · 1 INTERNAL · 2 CONFIDENTIAL · 3 RESTRICTED · 4 TOP_SECRET
Each principal has a clearance integer; access requires clearance ≥ document classification.
⚖ ACL Scopes
Allow or deny at five scopes: user, company, division, department, and role. Deny rules always take precedence over allow rules at the same scope.
🧮 Decision Algorithm
Owner override → explicit DENY → PUBLIC bypass → clearance gate → TOP_SECRET explicit list → ACL permit (company → division → dept → role → user) → default DENY.
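The core of the decision order can be sketched in a few branches: explicit deny first, then the PUBLIC bypass, then the clearance gate, then ACL allow, defaulting to deny. This simplified sketch omits the owner override and TOP_SECRET explicit list, and its data structures are illustrative, not the actual LINUS-AI schema.

```python
# Simplified sketch of the RAG access decision order above. Owner
# override and the TOP_SECRET explicit list are omitted for brevity;
# dict shapes are illustrative.

PUBLIC = 0

def decide(principal, doc):
    if principal["user_id"] in doc["deny_users"]:
        return "DENY"                        # explicit deny always wins
    if doc["classification"] == PUBLIC:
        return "PERMIT"                      # PUBLIC bypass
    if principal["clearance"] < doc["classification"]:
        return "DENY"                        # clearance gate
    if (principal["department"] in doc["allow_departments"]
            or principal["user_id"] in doc["allow_users"]):
        return "PERMIT"                      # ACL permit
    return "DENY"                            # default deny

doc = {"classification": 2,                  # 2 = CONFIDENTIAL
       "deny_users": {"mallory"},
       "allow_departments": {"research"},
       "allow_users": set()}
alice = {"user_id": "alice", "clearance": 3, "department": "research"}
bob   = {"user_id": "bob", "clearance": 1, "department": "research"}

assert decide(alice, doc) == "PERMIT"
assert decide(bob, doc) == "DENY"            # clearance 1 < CONFIDENTIAL
assert decide({**alice, "user_id": "mallory"}, doc) == "DENY"
```

In production, POST /rag/access-check is the authoritative way to test a decision; the sketch only shows why deny rules and clearance interact the way the bullet list describes.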
14 Compliance Profiles
Every agent profile ships with a pre-configured compliance tier. Regulated profiles require
explicit user consent before a session begins; consent is stored locally and audited.
| Profile | Level | Regulations | Consent Required |
| --- | --- | --- | --- |
| general, creative, reasoning | OPEN | — | No |
| code, engineering | OPEN | — | No (injection check enabled) |
| education | AUDIT | FERPA, COPPA | No |
| support | AUDIT | — | No |
| sales | AUDIT | GDPR, CAN-SPAM | No |
| data_science | AUDIT | GDPR, CCPA | No |
| medical | REGULATED | HIPAA, GDPR | Yes |
| legal | REGULATED | Attorney-Client Privilege, GDPR, CCPA | Yes |
| finance | REGULATED | SOX, PCI-DSS, FINRA, GDPR | Yes |
| hr | REGULATED | EEOC, GDPR, CCPA, FCRA | Yes |
| security | RESTRICTED | SOC2, ISO 27001, NIST SP 800-53, CFAA | Yes (Professional+ licence required) |
Tip: Use GET /compliance/profiles to retrieve the full machine-readable config for all 14 profiles, including PII scan flags and per-profile disclaimer text injected at session start.