Section 1
Deployment Architectures
LINUS-AI supports four canonical deployment topologies, from a single workstation
to a multi-node air-gapped cluster. Choose the pattern that matches your scale and
security requirements.
🖥 Single-Node Server
One machine runs the REST API. LAN clients connect over HTTP/HTTPS. Simplest to operate; ideal for teams of up to ~20 concurrent users.
⚡ Multi-GPU Single Node
Tensor parallelism splits model weights across N GPUs on one machine. No inter-node networking required. Supports up to 8 GPUs with NVLink.
🕸 Multi-Node Cluster
Mesh networking connects a coordinator with worker nodes. Pipeline parallelism distributes transformer layers across machines. mTLS-encrypted transport.
🔒 Air-Gap / Offline
After licence activation, LINUS-AI operates with zero internet connectivity. Models are pulled once, verified by hash, and stored locally.
Single-Node Server Topology
single-node topology
┌────────────────────────────────────────────────────────┐
│ LAN / Corporate Network │
│ │
│ Client A ──┐ │
│ Client B ──┤──► Nginx :443 ──► linus-ai :8080 │
│ Client C ──┘ (TLS termination) │ │
│ ▼ │
│ Model on disk │
│ (GPU / CPU infer) │
└────────────────────────────────────────────────────────┘
Multi-GPU Single-Node Topology
tensor parallel — 4× GPU
┌────────────────────────────────────────────────────────┐
│ linus-ai server (tensor_parallel = 4) │
│ │
│ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ │
│ │ GPU0 │ │ GPU1 │ │ GPU2 │ │ GPU3 │ NVLink/PCIe │
│ │shard0│◄►│shard1│◄►│shard2│◄►│shard3│ │
│ └──────┘ └──────┘ └──────┘ └──────┘ │
│ AllReduce sync on every forward pass │
└────────────────────────────────────────────────────────┘
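The AllReduce step in the diagram can be sketched in plain Python: shard the weight matrix's reduction dimension across four "devices", compute partial outputs, then sum them elementwise. This is an illustrative toy, not LINUS-AI's engine — real tensor parallelism runs NCCL collectives on GPU tensors.

```python
# Toy sketch of tensor parallelism: shard a weight matrix row-wise
# across 4 "devices", compute partial matmuls, then AllReduce
# (here: elementwise sum) to recover the full result.

def matmul(x, w):
    """x: vector of length k, w: k x n matrix -> vector of length n."""
    return [sum(x[i] * w[i][j] for i in range(len(x)))
            for j in range(len(w[0]))]

def shard_rows(w, n_gpus):
    """Split w's rows (the reduction dimension) evenly across devices."""
    step = len(w) // n_gpus
    return [w[g * step:(g + 1) * step] for g in range(n_gpus)]

def all_reduce_sum(partials):
    """Sum per-device partial outputs elementwise."""
    return [sum(p[j] for p in partials) for j in range(len(partials[0]))]

x = [1.0, 2.0, 3.0, 4.0]
w = [[1, 0], [0, 1], [1, 1], [2, 2]]           # 4 x 2 weight matrix

n_gpus = 4
w_shards = shard_rows(w, n_gpus)
x_shards = [x[g:g + 1] for g in range(n_gpus)]  # matching input slices
partials = [matmul(xs, ws) for xs, ws in zip(x_shards, w_shards)]
y = all_reduce_sum(partials)                    # AllReduce on forward pass

assert y == matmul(x, w)                        # matches single-device result
```

The sync cost is why NVLink matters: every forward pass pays one AllReduce per sharded layer.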
Multi-Node Cluster Topology
coordinator + workers — mesh networking
┌────────────────────────────────────────────────────────┐
│ Cluster (LAN / VPN) │
│ │
│ Clients ──► ┌─────────────────────┐ │
│ │ Coordinator Node │ :8080 REST API │
│ │ 192.168.1.10 │ :9090 mesh │
│ └────────┬────────────┘ │
│ mTLS │ mesh │
│ ┌────────────┴─────────────┐ │
│ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Worker Node │ │ Worker Node │ │
│ │ 192.168.1.11 │ │ 192.168.1.12 │ │
│ │ layers 0–15 │ │ layers 16–31 │ │
│ └─────────────────┘ └─────────────────┘ │
└────────────────────────────────────────────────────────┘
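The layer split in the diagram (worker one holds layers 0–15, worker two holds 16–31) can be sketched as a contiguous partition, with activations handed from stage to stage. Layers are stand-in functions here; in the real cluster the hand-off is the mTLS mesh hop on port 9090.

```python
# Toy sketch of pipeline parallelism: 32 "layers" split across two
# workers, as in the diagram. Activations flow worker to worker.

N_LAYERS, N_WORKERS = 32, 2

def make_layer(i):
    return lambda h: h + 1            # stand-in for a transformer layer

layers = [make_layer(i) for i in range(N_LAYERS)]

# Contiguous partition: worker 0 gets layers 0-15, worker 1 gets 16-31.
per_worker = N_LAYERS // N_WORKERS
stages = [layers[w * per_worker:(w + 1) * per_worker]
          for w in range(N_WORKERS)]

def run_stage(stage, h):
    for layer in stage:
        h = layer(h)
    return h

h = 0
for stage in stages:                  # this hop is the mTLS mesh transfer
    h = run_stage(stage, h)

assert len(stages[0]) == len(stages[1]) == 16
assert h == 32                        # all 32 layers applied exactly once
```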
Air-Gap / Offline Deployment
air-gap deployment flow
# Step 1 — on internet-connected machine, generate offline token
$ linus-ai --generate-offline-token --key LNAI-XXXX-XXXX-XXXX-XXXX
✓ Offline activation token saved: linus-ai-offline-token.bin
# Step 2 — transfer token + binary + model files to air-gapped machine
# (USB drive, secure file transfer, etc.)
# Step 3 — activate offline
$ linus-ai --activate-offline linus-ai-offline-token.bin
✓ Licence activated offline (Enterprise · air-gap mode)
✓ No outbound connections required from this point forward
Section 3
Production Deployment
Run LINUS-AI as a managed system service with automatic restarts, structured logging,
and an Nginx reverse proxy handling TLS termination.
Systemd Service Unit
Create the unit file at /etc/systemd/system/linus-ai.service:
/etc/systemd/system/linus-ai.service
[Unit]
Description=LINUS-AI Private Inference Engine
Documentation=https://linus-ai.com/docs/admin
After=network.target
Wants=network-online.target
[Service]
Type=simple
User=linus-ai
Group=linus-ai
WorkingDirectory=/var/lib/linus-ai
ExecStart=/usr/local/bin/linus-ai --serve --host 127.0.0.1 --port 8080
Restart=always
RestartSec=5s
TimeoutStopSec=30s
# Environment
Environment=HOME=/var/lib/linus-ai
Environment=LINUS_AI_CONFIG=/etc/linus-ai/config.toml
Environment=LOG_LEVEL=info
# Logging — write to journald
StandardOutput=journal
StandardError=journal
SyslogIdentifier=linus-ai
[Install]
WantedBy=multi-user.target
# Enable and start the service
$ sudo systemctl daemon-reload
$ sudo systemctl enable --now linus-ai
● linus-ai.service — LINUS-AI Private Inference Engine
Active: active (running)
Nginx Reverse Proxy (HTTPS Termination)
Place this config at /etc/nginx/sites-available/linus-ai and symlink to sites-enabled:
/etc/nginx/sites-available/linus-ai
server {
listen 80;
server_name ai.example.com;
return 301 https://$host$request_uri;
}
server {
listen 443 ssl http2;
server_name ai.example.com;
# TLS — managed by certbot
ssl_certificate /etc/letsencrypt/live/ai.example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/ai.example.com/privkey.pem;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers HIGH:!aNULL:!MD5;
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 10m;
# Proxy to linus-ai
location / {
proxy_pass http://127.0.0.1:8080;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_read_timeout 300s; # allow time for long responses
proxy_buffering off; # required for SSE streaming
}
# Block public access to metrics endpoint
location /metrics {
allow 10.0.0.0/8;
deny all;
}
}
Let's Encrypt Certificate (certbot)
certbot — initial issuance & auto-renewal
$ sudo apt install certbot python3-certbot-nginx
$ sudo certbot --nginx -d ai.example.com --email admin@example.com --agree-tos
✓ Certificate issued. Valid for 90 days.
✓ Auto-renewal cron job installed at /etc/cron.d/certbot
# Verify auto-renewal
$ sudo certbot renew --dry-run
✓ Dry run: renewal successful
Docker Compose Alternative
docker-compose.yml
version: "3.9"
services:
linus-ai:
image: ghcr.io/linus-ai/linus-ai:latest
restart: always
ports:
- "127.0.0.1:8080:8080"
volumes:
- ./config:/etc/linus-ai:ro
- linus-ai-data:/var/lib/linus-ai
- /path/to/models:/models:ro
environment:
- LINUS_AI_CONFIG=/etc/linus-ai/config.toml
- LINUS_AI_API_KEY=${LINUS_AI_API_KEY}
- LOG_LEVEL=info
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: all
capabilities: [gpu]
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
interval: 30s
timeout: 5s
retries: 3
start_period: 60s
volumes:
linus-ai-data:
Health Check Endpoint
GET /health
$ curl -s http://localhost:8080/health | jq .
{
"status": "ok",
"version": "2.4.1",
"uptime_seconds": 84321,
"model_loaded": "llama3.2",
"gpu_available": true,
"gpu_backend": "cuda",
"active_connections": 3
}
Log Rotation
Create /etc/logrotate.d/linus-ai to manage log file growth:
/etc/logrotate.d/linus-ai
/var/lib/linus-ai/.linus_ai/logs/*.log {
daily
rotate 14
compress
delaycompress
missingok
notifempty
postrotate
systemctl kill -s HUP linus-ai.service
endscript
}
Section 4
Access Control
LINUS-AI provides layered access control: API key authentication, IP allowlisting,
per-key rate limiting, and LDAP/SSO integration for Enterprise deployments.
API Key Authentication
Set the LINUS_AI_API_KEY environment variable (or server.api_key in config.toml) to
require bearer token authentication on all API requests.
API key setup
# Generate a strong random key
$ openssl rand -hex 32
a3f8c2d1e9b74056c1f2a3b4d5e6f708a9b0c1d2e3f4a5b6c7d8e9f0a1b2c3d4
# Set in environment (recommended: use a secrets manager)
$ export LINUS_AI_API_KEY=a3f8c2d1e9b74056c1f2a3b4d5e6f708...
# Clients include the key as a Bearer token
$ curl http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer a3f8c2d1e9b74056..." \
-H "Content-Type: application/json" \
-d '{"model":"llama3.2","messages":[...]}'
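Server-side, bearer validation amounts to parsing the `Authorization` header and comparing tokens in constant time. The sketch below is ours, not LINUS-AI's actual implementation; the function name and structure are hypothetical, but `hmac.compare_digest` is the standard way to avoid timing side channels when comparing secrets.

```python
import hmac

# Hypothetical sketch of server-side bearer-token validation.
# The helper name and error handling are illustrative only.

EXPECTED_KEY = "a3f8c2d1e9b74056c1f2a3b4d5e6f708"

def check_auth(headers: dict) -> bool:
    auth = headers.get("Authorization", "")
    scheme, _, token = auth.partition(" ")
    if scheme != "Bearer" or not token:
        return False
    # Constant-time comparison: avoids leaking key content via timing.
    return hmac.compare_digest(token, EXPECTED_KEY)

assert check_auth({"Authorization": f"Bearer {EXPECTED_KEY}"})
assert not check_auth({"Authorization": "Bearer wrong-key"})
assert not check_auth({})                  # missing header -> reject (401)
```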
Multiple API Keys (Team / Enterprise)
Define per-team or per-user keys in config.toml with optional rate limits:
config.toml — multiple API keys
[server]
host = "127.0.0.1"
port = 8080
[[server.api_keys]]
name = "engineering"
key = "eng-a3f8c2d1e9b74056..."
rate_limit = 100 # requests per minute
token_limit = 50000 # tokens per hour
[[server.api_keys]]
name = "data-science"
key = "ds-7b2c9f1a4d8e3b5c..."
rate_limit = 200
token_limit = 200000
[[server.api_keys]]
name = "readonly-bot"
key = "bot-2e5a8f3c1d7b9e4a..."
rate_limit = 10
allowed_endpoints = ["/v1/chat/completions", "/health"]
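A per-key `rate_limit` of N requests per minute can be enforced with a fixed-window counter keyed by API key name. The enforcement inside LINUS-AI is internal; this sketch only illustrates the bookkeeping that produces an HTTP 429 once the window's budget is spent.

```python
from collections import defaultdict

# Sketch of per-key fixed-window rate limiting (rate_limit = requests
# per minute, as in the config above). Illustrative only.

class RateLimiter:
    def __init__(self, limits):               # {key_name: requests/minute}
        self.limits = limits
        self.windows = defaultdict(lambda: (-1, 0))  # key -> (window, count)

    def allow(self, key, now_s):
        window = int(now_s // 60)             # one-minute fixed windows
        last_window, count = self.windows[key]
        if window != last_window:
            last_window, count = window, 0    # new minute, reset counter
        if count >= self.limits[key]:
            return False                      # budget spent -> HTTP 429
        self.windows[key] = (last_window, count + 1)
        return True

rl = RateLimiter({"readonly-bot": 10})
results = [rl.allow("readonly-bot", 0.0) for _ in range(11)]
assert results.count(True) == 10 and results[-1] is False
assert rl.allow("readonly-bot", 61.0)         # next minute: allowed again
```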
IP Allowlisting
config.toml — IP allowlist
[server]
allowed_ips = [
"10.0.0.0/8", # private class A
"172.16.0.0/12", # private class B
"192.168.0.0/16", # private class C
"127.0.0.1", # loopback
]
# Requests from IPs outside this list receive HTTP 403
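The allowlist check is CIDR membership: a client passes if its address falls inside any configured network. Python's standard `ipaddress` module expresses it directly (a sketch mirroring the config above, not LINUS-AI's internals):

```python
import ipaddress

# Sketch of the allowlist check: a client IP is accepted if it falls
# inside any configured network; anything else receives HTTP 403.

ALLOWED = [ipaddress.ip_network(n) for n in
           ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.1"]]

def ip_allowed(client_ip: str) -> bool:
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED)

assert ip_allowed("10.0.0.42")
assert ip_allowed("127.0.0.1")
assert not ip_allowed("8.8.8.8")          # public IP -> 403
```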
LDAP / Active Directory Integration (Enterprise)
config.toml — LDAP/AD integration
[auth.ldap]
enabled = true
server = "ldap://dc01.corp.example.com:389"
bind_dn = "cn=svc-linusai,ou=service,dc=corp,dc=example,dc=com"
bind_password = "${LDAP_BIND_PASSWORD}" # from env
search_base = "ou=users,dc=corp,dc=example,dc=com"
search_filter = "(&(objectClass=user)(sAMAccountName={username}))"
group_filter = "(&(objectClass=group)(member={dn}))"
# Only members of this AD group can access the API
allowed_groups = ["CN=AI-Users,OU=groups,DC=corp,DC=example,DC=com"]
# Admin group — can access /metrics and management endpoints
admin_groups = ["CN=AI-Admins,OU=groups,DC=corp,DC=example,DC=com"]
# TLS for LDAPS
tls = true
tls_ca_cert = "/etc/ssl/certs/corp-ca.pem"
SSO via OAuth2 / OIDC (Enterprise)
config.toml — OIDC / SSO
[auth.oidc]
enabled = true
provider_url = "https://sso.corp.example.com/realms/main"
client_id = "linus-ai-prod"
client_secret = "${OIDC_CLIENT_SECRET}"
redirect_uri = "https://ai.example.com/auth/callback"
scopes = ["openid", "profile", "email", "groups"]
required_groups = ["ai-users"] # claim name from IdP
admin_groups = ["ai-admins"]
Section 5
Multi-Seat Licence Management
Team and Enterprise licences support multiple concurrent seat activations against a single licence key.
Manage seats from the CLI or the Enterprise portal.
🪑 Team Licence
Up to 10 named machine seats per key. Each user/machine activates against the shared key. Seat count enforced server-side.
🏢 Enterprise Licence
Unlimited or negotiated seat count. Managed via the Enterprise portal. Supports air-gap activation and licence server (on-prem).
🔄 Seat Transfer
Professional: deactivate old machine, then reactivate on new. Team/Enterprise: managed via portal or support ticket.
Check Seat Usage
linus-ai --license-status
$ linus-ai --license-status
Licence: Team Edition
Key: LNAI-TEAM-XXXX-XXXX-XXXX
Seats: 7 / 10 active
Expires: 2026-12-31
Active machines:
1. build-server-01 activated 2025-09-01
2. dev-workstation-a activated 2025-09-03
3. dev-workstation-b activated 2025-09-03
4. ml-node-01 activated 2025-10-15
5. ml-node-02 activated 2025-10-15
6. staging-server activated 2025-11-01
7. prod-server activated 2025-12-01
Activate a New Seat
activating a new machine
$ linus-ai --activate LNAI-TEAM-XXXX-XXXX-XXXX
✓ Licence activated (Team · 8/10 seats used)
Machine ID: new-workstation-c
Deactivate a Machine
deactivating a seat
# Deactivate from the machine you want to remove
$ linus-ai --deactivate
✓ Machine deactivated. Seat released (7/10 seats now used).
# If the machine is inaccessible, contact support:
# support@linus-ai.com — include your licence key
Air-Gap Seat Activation (Enterprise)
offline seat activation flow
# 1. On the air-gapped machine, generate an activation request
$ linus-ai --generate-activation-request --key LNAI-ENT-XXXX-XXXX-XXXX
✓ Activation request saved: activation-request.json
# 2. Transfer activation-request.json to internet-connected machine
# 3. Submit via Enterprise portal → get activation-token.bin
# 4. Transfer token back to air-gapped machine
$ linus-ai --activate-offline activation-token.bin
✓ Seat activated offline (Enterprise · air-gap mode)
Section 6
Monitoring & Observability
LINUS-AI exposes Prometheus-compatible metrics and structured JSON logs for integration
with your existing observability stack.
Prometheus Metrics
Metrics are available at GET /metrics in Prometheus text format.
| Metric | Type | Description |
| --- | --- | --- |
| linus_ai_request_count_total | Counter | Total API requests, labelled by method, path, status |
| linus_ai_tokens_per_second | Gauge | Current inference throughput in tokens/s |
| linus_ai_model_load_time_seconds | Histogram | Time taken to load model into memory at startup |
| linus_ai_active_connections | Gauge | Number of currently active client connections |
| linus_ai_gpu_utilization | Gauge | GPU utilisation percentage (per device, labelled by gpu_id) |
| linus_ai_memory_used_bytes | Gauge | Process RSS memory usage in bytes |
| linus_ai_request_duration_seconds | Histogram | End-to-end request latency |
| linus_ai_tokens_in_total | Counter | Total input tokens processed |
| linus_ai_tokens_out_total | Counter | Total output tokens generated |
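The `/metrics` endpoint serves standard Prometheus text format, so it is easy to spot-check outside Prometheus too. A minimal parser sketch (the sample values below are hypothetical; the metric names match the table above):

```python
# Sketch: parsing Prometheus text-format lines like those at /metrics.
# Sample values are hypothetical; names match the metrics table.

SAMPLE = """\
linus_ai_request_count_total{method="POST",path="/v1/chat/completions",status="200"} 1843
linus_ai_tokens_per_second 87.4
linus_ai_active_connections 3
"""

def parse_metrics(text):
    out = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue                          # skip HELP/TYPE comment lines
        name_labels, value = line.rsplit(" ", 1)
        name = name_labels.split("{", 1)[0]   # drop the label set
        out.setdefault(name, []).append(float(value))
    return out

m = parse_metrics(SAMPLE)
assert m["linus_ai_tokens_per_second"] == [87.4]
assert m["linus_ai_request_count_total"] == [1843.0]
```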
Prometheus Scrape Config
prometheus.yml — scrape config
scrape_configs:
- job_name: "linus-ai"
scrape_interval: 15s
scrape_timeout: 10s
static_configs:
- targets:
- "10.0.1.10:8080" # prod-server
- "10.0.1.11:8080" # ml-node-01
- "10.0.1.12:8080" # ml-node-02
relabel_configs:
- source_labels: [__address__]
target_label: instance
metrics_path: /metrics
# Add bearer token if metrics auth is enabled
authorization:
credentials: "${LINUS_AI_METRICS_TOKEN}"
Structured JSON Logs
Logs are written to ~/.linus_ai/logs/linus-ai.log (or /var/lib/linus-ai/.linus_ai/logs/
when running as a service). Each line is a JSON object.
linus-ai.log — sample entries
{"timestamp":"2025-12-01T09:14:22.341Z","level":"info","method":"POST","path":"/v1/chat/completions","model":"llama3.2","tokens_in":142,"tokens_out":318,"duration_ms":1204,"status":200,"client_ip":"10.0.0.42"}
{"timestamp":"2025-12-01T09:14:25.102Z","level":"info","method":"GET","path":"/health","duration_ms":1,"status":200,"client_ip":"127.0.0.1"}
{"timestamp":"2025-12-01T09:14:31.889Z","level":"warn","event":"rate_limit_exceeded","key_name":"engineering","client_ip":"10.0.0.55"}
{"timestamp":"2025-12-01T09:15:00.001Z","level":"error","event":"model_load_failed","model":"llama3-70b","reason":"insufficient_vram","detail":"need 40GB, have 24GB"}
Log Levels
log level configuration
# Set via environment variable
$ export LOG_LEVEL=debug # debug | info | warn | error
# Or in config.toml
[logging]
level = "info"
format = "json" # json | text
path = "~/.linus_ai/logs/linus-ai.log"
max_size_mb = 100
max_backups = 7
compress_old = true
Health Check
health check — full response
$ curl -s http://localhost:8080/health | jq .
{
"status": "ok",
"version": "2.4.1",
"build": "2025-12-01T00:00:00Z",
"uptime_seconds": 84321,
"model_loaded": "llama3.2",
"gpu_available": true,
"gpu_backend": "cuda",
"gpu_devices": ["NVIDIA RTX 4090", "NVIDIA RTX 4090"],
"active_connections": 3,
"tokens_per_second": 87.4
}
Section 7
Backup & Recovery
Back up configuration, the encrypted vault, and your licence file. Model files are
large but re-pullable; back them up only if your recovery time objective requires it.
What to Back Up
| Path | Size | Criticality | Notes |
| --- | --- | --- | --- |
| ~/.linus_ai/config.toml | < 5 KB | Critical | All runtime configuration |
| ~/.linus_ai/vault/ | Variable | Critical | AES-256-GCM encrypted conversation store |
| ~/.linus_ai/licence.bin | < 1 KB | Critical | Locally cached licence; restore from key if lost |
| ~/.linus_ai/vault.key | < 1 KB | Critical | Vault encryption key — required when migrating machines |
| ~/.linus_ai/models/ | 1–100+ GB | Optional | Re-pullable from registry; backup if air-gapped |
Vault key warning: The vault encryption key is auto-derived from hardware identifiers on each machine. If you restore the vault on a different machine without also restoring the key file, the vault contents will be unreadable. Always export the key with linus-ai --export-vault-key before migrating.
Backup Script (Bash / cron)
/usr/local/bin/backup-linus-ai.sh
#!/usr/bin/env bash
set -euo pipefail
BACKUP_DIR="/mnt/backups/linus-ai"
DATE=$(date +%Y%m%d-%H%M%S)
DEST="${BACKUP_DIR}/${DATE}"
LINUS_HOME="/var/lib/linus-ai/.linus_ai"
mkdir -p "${DEST}"
# Back up critical files
cp "${LINUS_HOME}/config.toml" "${DEST}/"
cp "${LINUS_HOME}/licence.bin" "${DEST}/"
cp "${LINUS_HOME}/vault.key" "${DEST}/"
cp -r "${LINUS_HOME}/vault/" "${DEST}/vault/"
# Compress
tar -czf "${BACKUP_DIR}/linus-ai-${DATE}.tar.gz" -C "${BACKUP_DIR}" "${DATE}"
rm -rf "${DEST}"
# Prune backups older than 30 days
find "${BACKUP_DIR}" -name "*.tar.gz" -mtime +30 -delete
echo "Backup complete: linus-ai-${DATE}.tar.gz"
# Add to crontab: daily at 02:00
# 0 2 * * * /usr/local/bin/backup-linus-ai.sh >> /var/log/linus-ai-backup.log 2>&1
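A backup you have never opened is a hope, not a backup. A small verification sketch (ours, not part of LINUS-AI) that checks an archive actually contains the critical files; here it builds a throwaway archive in a temp dir to demonstrate the check:

```python
import os
import tarfile
import tempfile

# Sketch: verify a backup archive contains the critical files before
# trusting it. Illustrative; pair it with the cron job above.

REQUIRED = {"config.toml", "licence.bin", "vault.key"}

def verify_backup(tar_path):
    """Return the set of required files missing from the archive."""
    with tarfile.open(tar_path, "r:gz") as tar:
        names = {os.path.basename(m.name) for m in tar.getmembers()}
    return REQUIRED - names          # empty set == backup is complete

with tempfile.TemporaryDirectory() as tmp:
    for fname in REQUIRED:           # build a stub backup to check
        with open(os.path.join(tmp, fname), "w") as f:
            f.write("stub")
    archive = os.path.join(tmp, "linus-ai-test.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        for fname in REQUIRED:
            tar.add(os.path.join(tmp, fname), arcname=fname)
    missing = verify_backup(archive)

assert missing == set()              # nothing missing
```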
Vault Key Export / Migration
vault key export & restore
# On the source machine — export vault key
$ linus-ai --export-vault-key --output vault-export.key
✓ Vault key exported (keep this file secure!)
# On the destination machine — import vault key before restoring vault
$ linus-ai --import-vault-key vault-export.key
✓ Vault key imported. Vault data now readable on this machine.
Disaster Recovery Procedure
- Install the LINUS-AI binary on the new machine (same version or newer).
- If migrating vault data: import the vault key before restoring vault files.
- Restore config.toml, vault/, and licence.bin to ~/.linus_ai/.
- Re-activate the licence: linus-ai --activate LNAI-XXXX-XXXX-XXXX-XXXX (or use offline token for air-gap).
- If using system models directory, re-pull or restore model files.
- Start the service: sudo systemctl start linus-ai and verify with /health.
Section 8
Security Hardening
For production deployments, apply the following hardening checklist.
Each layer reduces the blast radius of a potential compromise.
👤 Non-root Service User
Run as a dedicated linus-ai system user with no shell and no sudo access. Never run as root.
🔥 Firewall Rules
Expose port 8080 only to trusted subnets. Block the metrics endpoint on public interfaces via Nginx or iptables.
🔑 API Key Rotation
Rotate API keys at least quarterly, or immediately upon suspected compromise. Zero-downtime rotation via config reload.
📋 Audit Log Review
Periodically review structured logs for anomalous patterns: unusual IPs, high error rates, unexpected models, or rate-limit events.
🏠 Principle of Least Privilege
Use systemd security directives to confine the service. The unit should not be able to access network namespaces, kernel modules, or arbitrary paths.
🔒 TLS Everywhere
Always use HTTPS for client-facing traffic. For multi-node mesh, mTLS is auto-configured. Never expose plain HTTP on a public or shared network.
Create the Service User
create dedicated linus-ai system user
$ sudo useradd --system --no-create-home --shell /sbin/nologin \
--home-dir /var/lib/linus-ai linus-ai
$ sudo mkdir -p /var/lib/linus-ai /etc/linus-ai
$ sudo chown linus-ai:linus-ai /var/lib/linus-ai
$ sudo chmod 750 /var/lib/linus-ai
Firewall Rules (ufw)
ufw — restrict access
# Allow HTTPS from the corporate network only
$ sudo ufw allow from 10.0.0.0/8 to any port 443 proto tcp
$ sudo ufw allow from 192.168.0.0/16 to any port 443 proto tcp
# Deny direct access to linus-ai port from all external hosts
# (Nginx reverse proxy handles 443 → 8080)
$ sudo ufw deny 8080
# Allow mesh networking port from cluster subnet only
$ sudo ufw allow from 10.0.1.0/24 to any port 9090 proto tcp
Hardened Systemd Unit
Apply additional systemd security directives to sandbox the service:
/etc/systemd/system/linus-ai.service — hardened
[Unit]
Description=LINUS-AI Private Inference Engine
After=network.target
[Service]
Type=simple
User=linus-ai
Group=linus-ai
WorkingDirectory=/var/lib/linus-ai
ExecStart=/usr/local/bin/linus-ai --serve --host 127.0.0.1 --port 8080
Restart=always
RestartSec=5s
# Privilege restrictions
NoNewPrivileges=yes
CapabilityBoundingSet=
AmbientCapabilities=
SecureBits=keep-caps-locked
# Filesystem isolation
PrivateTmp=yes
PrivateDevices=no # set yes if not using GPU
ProtectSystem=strict
ProtectHome=yes
ReadWritePaths=/var/lib/linus-ai /etc/linus-ai
ReadOnlyPaths=/usr/local/bin/linus-ai
# Restrict sockets to IPv4/IPv6 and Unix address families
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
# System call filtering
SystemCallFilter=@system-service
SystemCallFilter=~@privileged @resources
SystemCallArchitectures=native
# Kernel hardening
ProtectKernelTunables=yes
ProtectKernelModules=yes
ProtectKernelLogs=yes
ProtectControlGroups=yes
ProtectClock=yes
LockPersonality=yes
MemoryDenyWriteExecute=no # JIT inference requires W+X
[Install]
WantedBy=multi-user.target
API Key Rotation (Zero Downtime)
rotating an API key without downtime
# 1. Add the new key alongside the old key in config.toml
# 2. Reload config (no restart required)
$ sudo systemctl kill -s HUP linus-ai.service
✓ Config reloaded. Both old and new keys are now active.
# 3. Update clients to use the new key
# 4. Remove the old key from config.toml, reload again
$ sudo systemctl kill -s HUP linus-ai.service
✓ Old key deactivated.
Section 9
Updating LINUS-AI
Updates are distributed as single-file binaries via GitHub Releases.
The update process is designed to be fast and low-risk, with rollback support.
Standard Update Procedure
update to latest release
# 1. Download the new binary
$ VERSION="2.5.0"
$ cd /tmp
$ curl -Lo linus-ai-linux-x86_64 \
  "https://github.com/LINUS-AI-PRO/linus-ai/releases/download/v${VERSION}/linus-ai-linux-x86_64"
# 2. Verify SHA-256 checksum (run from /tmp so the filename matches checksums.txt)
$ curl -sL "https://github.com/LINUS-AI-PRO/linus-ai/releases/download/v${VERSION}/checksums.txt" \
  | grep "linus-ai-linux-x86_64" | sha256sum --check
linus-ai-linux-x86_64: OK
# 3. Keep previous binary as rollback
$ sudo cp /usr/local/bin/linus-ai /usr/local/bin/linus-ai.prev
# 4. Replace the binary
$ chmod +x linus-ai-linux-x86_64
$ sudo mv linus-ai-linux-x86_64 /usr/local/bin/linus-ai
# 5. Restart the service
$ sudo systemctl restart linus-ai
# 6. Confirm the new version
$ linus-ai --version
LINUS-AI v2.5.0 (linux/amd64, cuda 12.3)
Rollback Procedure
rollback to previous version
$ sudo systemctl stop linus-ai
$ sudo mv /usr/local/bin/linus-ai.prev /usr/local/bin/linus-ai
$ sudo systemctl start linus-ai
✓ Rolled back to previous version.
Check release notes before upgrading. Some releases introduce breaking changes to
config.toml structure. The release notes at
github.com/LINUS-AI-PRO/linus-ai/releases
include a "Breaking Changes" section. Always review before upgrading in production.
Symlink-Based Version Management
versioned binaries with symlink
# Store versioned binaries, point symlink at active version
$ sudo mv /tmp/linus-ai-new /usr/local/lib/linus-ai/linus-ai-2.5.0
$ sudo ln -sfn /usr/local/lib/linus-ai/linus-ai-2.5.0 /usr/local/bin/linus-ai
# Rollback = update symlink
$ sudo ln -sfn /usr/local/lib/linus-ai/linus-ai-2.4.1 /usr/local/bin/linus-ai
$ sudo systemctl restart linus-ai
Section 10
Troubleshooting
Diagnose and resolve the most common production issues.
Start with journalctl -u linus-ai -n 50 for recent service logs.
Service Won't Start
diagnosing startup failures
# Check recent service logs
$ journalctl -u linus-ai -n 50 --no-pager
# Check service status
$ systemctl status linus-ai
# Common causes:
# - Config parse error: validate with: linus-ai --validate-config
# - Port already in use: ss -tlnp | grep 8080
# - Permission error: check /var/lib/linus-ai ownership
# - Binary not found: which linus-ai
$ linus-ai --validate-config --config /etc/linus-ai/config.toml
✓ Config valid
GPU Not Detected
GPU diagnostics
# NVIDIA
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.54 Driver Version: 535.54 CUDA Version: 12.2 |
# AMD ROCm
$ rocm-smi
# Check CUDA environment
$ nvcc --version
$ ldconfig -p | grep libcuda
# Force CPU fallback while debugging GPU
$ linus-ai --serve --backend cpu
# Check linus-ai GPU detection
$ linus-ai --system-info
GPU: NVIDIA RTX 4090 (24 GB VRAM) — CUDA 12.2 ✓
High Memory Usage
reducing memory pressure
# config.toml — reduce context window and enable KV cache quantisation
[inference]
context_len = 4096 # reduce from 8192
kv_cache_quantize = true # Q8 KV cache — ~50% VRAM reduction
gpu_memory_fraction = 0.85 # leave headroom for OS
batch_size = 1 # reduce concurrent requests
# Or use a more aggressive quantisation level
quantize = "Q4_K_M" # instead of Q8_0 or F16
Slow First Request (Model Load Time)
pre-warming the model at startup
# config.toml — load model at startup, not on first request
[inference]
warmup_on_start = true
model = "llama3.2" # model to pre-load
# Systemd: extend start timeout to allow model loading
[Service]
TimeoutStartSec = 300 # 5 minutes for large models
Mesh Nodes Not Connecting
mesh connectivity diagnostics
# 1. Check firewall — port 9090 must be open between all mesh nodes
$ nc -zv 192.168.1.10 9090
Connection to 192.168.1.10 9090 port [tcp] succeeded!
# 2. Check mesh status on coordinator
$ linus-ai --mesh-status
Cluster: 1 coordinator, 1/2 workers connected
✓ 192.168.1.11 worker online (ping: 0.4ms)
✗ 192.168.1.12 worker offline (last seen: 5m ago)
# 3. mTLS cert rotation — rotate if certificates have expired
$ linus-ai --rotate-mesh-certs
✓ mTLS certificates rotated. Workers will reconnect automatically.
# 4. Check logs on the disconnected worker
$ journalctl -u linus-ai -n 30 --no-pager | grep mesh
Common Error Codes
| Code / Event | HTTP Status | Cause | Resolution |
| --- | --- | --- | --- |
| auth_required | 401 | No or invalid API key | Include Authorization: Bearer KEY header |
| forbidden | 403 | IP not in allowlist, or key lacks endpoint access | Check allowed_ips and allowed_endpoints in config |
| rate_limit_exceeded | 429 | Per-key rate limit reached | Reduce request rate, or increase rate_limit for the key |
| model_not_found | 404 | Requested model not downloaded | Run linus-ai --pull-model MODEL_NAME |
| context_too_long | 400 | Prompt exceeds context_len | Shorten prompt or increase context_len in config |
| insufficient_vram | 503 | GPU VRAM insufficient for model | Use a smaller model or lower quantisation; enable CPU offload |
| licence_invalid | 503 | Licence expired, revoked, or seats exceeded | Renew licence or deactivate unused seats |
| mesh_unavailable | 503 | Worker nodes not reachable | Check firewall, mTLS certs, and worker service status |
| inference_timeout | 504 | Response took longer than inference_timeout_s | Reduce max_tokens, increase timeout, or scale hardware |
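Of these, 429, 503, and 504 are transient and worth retrying with backoff; 400/401/403/404 indicate a client or config problem that retrying will not fix. A client-side sketch (the delay policy is ours, not a documented LINUS-AI client feature):

```python
import random

# Sketch: client-side retry policy for the transient statuses above
# (429, 503, 504), with capped exponential backoff plus jitter.
# Policy values are illustrative.

RETRYABLE = {429, 503, 504}

def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Delay before retry `attempt` (0-based): min(cap, base*2^n) + jitter."""
    return min(cap, base * (2 ** attempt)) + rng() * 0.1

def should_retry(status, attempt, max_attempts=5):
    return status in RETRYABLE and attempt < max_attempts

assert should_retry(429, 0) and should_retry(503, 4)
assert not should_retry(401, 0)          # auth errors: fix the key instead
assert not should_retry(429, 5)          # give up after max_attempts
delays = [backoff_delay(n, rng=lambda: 0.0) for n in range(4)]
assert delays == [0.5, 1.0, 2.0, 4.0]
```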
Still stuck? Collect a support bundle with linus-ai --support-bundle — this generates a redacted archive of logs, config (secrets masked), and system info. Attach it when contacting support@linus-ai.com.
Section 12 · New in v1.4.0
Compliance & Audit Configuration
LINUS-AI v1.4.0 introduces a fully integrated compliance layer with tamper-evident audit logging,
regulated-profile consent management, and fine-grained RAG document access control.
Audit Log Storage
By default, all audit logs are written to ~/.linus_ai/audit/. Two environment variables
let administrators redirect or replicate logs to any path — ideal for network mounts, WORM drives,
or encrypted partitions.
📁 Primary Directory Override
Set LINUS_AI_AUDIT_DIR to redirect all audit logs to an admin-specified path. Useful for network mounts, WORM drives, or encrypted partitions. Default: ~/.linus_ai/audit/
📤 Export Propagation
Set LINUS_AI_AUDIT_EXPORT_DIRS to a colon-separated list of secondary directories. Every audit record is written to all listed directories in real time — for syslog forwarding, SIEM integration, or offsite backup.
🔒 Immutability Sealing
Completed monthly log files are automatically sealed: chmod 0o400 (read-only), macOS UF_IMMUTABLE chflag, and Linux chattr +i (root required). No modification is possible without first clearing the flag.
🔗 HMAC Chain Verification
Every audit record is HMAC-linked to the previous record. Tamper detection is available via GET /compliance/status — the response includes chain_ok: true/false plus monthly statistics.
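The tamper-evidence property comes from each record's MAC covering both the record body and the previous record's MAC, so editing any historic entry invalidates every MAC after it. A self-contained sketch of the idea (key handling and field names here are illustrative, not LINUS-AI's actual on-disk format):

```python
import hashlib
import hmac
import json

# Sketch of an HMAC-chained audit log: each MAC covers the record body
# plus the previous MAC. Illustrative format, not LINUS-AI's.

KEY = b"demo-audit-key"

def chain(records):
    prev, out = b"", []
    for rec in records:
        body = json.dumps(rec, sort_keys=True).encode()
        mac = hmac.new(KEY, prev + body, hashlib.sha256).hexdigest()
        out.append({"record": rec, "mac": mac})
        prev = mac.encode()
    return out

def verify(entries):
    prev = b""
    for e in entries:
        body = json.dumps(e["record"], sort_keys=True).encode()
        mac = hmac.new(KEY, prev + body, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(mac, e["mac"]):
            return False                     # chain_ok: false
        prev = mac.encode()
    return True                              # chain_ok: true

log = chain([{"event": "session_start"}, {"event": "consent_granted"}])
assert verify(log)
log[0]["record"]["event"] = "tampered"       # edit a historic record...
assert not verify(log)                       # ...and verification fails
```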
Audit log environment variables
# Redirect audit logs to a network-mounted WORM drive
$ export LINUS_AI_AUDIT_DIR=/mnt/worm/linus-ai-audit
# Also mirror every record to a SIEM drop-folder and an offsite backup
$ export LINUS_AI_AUDIT_EXPORT_DIRS=/var/log/siem/linus-ai:/backup/audit
# Verify audit chain integrity
$ curl http://localhost:8080/compliance/status | jq '.chain_ok'
true
Compliance API Endpoints
GET /compliance/profiles · All 14 profile compliance configs (level, regulations, PII scan flag, disclaimer)
GET /compliance/status · Audit chain integrity status + monthly statistics
GET /compliance/consents · Consents granted on this machine (?machine_id=…)
GET /compliance/audit · Query HMAC-chained audit log (?profile_id, ?blocked_only, ?pii_only, ?since, ?limit)
POST /compliance/consent · Grant or revoke consent — body: {profile_id, action:"grant"|"revoke", machine_id}
RAG Document Access Control
Register documents with a classification level and ACL, then use the decision engine to
enforce access based on principal clearance and explicit allow/deny rules.
GET /rag/documents · List documents (?user_id filters to accessible subset; ?all=true for full registry)
POST /rag/documents/register · Register a document with classification and ACL
PUT /rag/documents/{id}/acl · Replace ACL for a document (allow/deny at user/dept/division/company/role scope)
PUT /rag/documents/{id}/classification · Update classification level (0=PUBLIC … 4=TOP_SECRET)
DELETE /rag/documents/{id} · Remove document from registry
POST /rag/access-check · Test access decision for a principal + document (returns PERMIT/DENY + rule)
GET /rag/audit · Query RAG access audit log (?doc_id, ?principal_id, ?decision, ?denied_only, ?stats=true)
GET /rag/principals · List all registered principals
POST /rag/principals · Register or update a principal (user_id, company, division, department, roles, clearance)
DELETE /rag/principals/{id} · Remove a principal
🏷 Classification Levels
0 PUBLIC · 1 INTERNAL · 2 CONFIDENTIAL · 3 RESTRICTED · 4 TOP_SECRET
Each principal has a clearance integer; access requires clearance ≥ document classification.
⚖ ACL Scopes
Allow or deny at five scopes: user, company, division, department, and role. Deny rules always take precedence over allow rules at the same scope.
🧮 Decision Algorithm
Owner override → explicit DENY → PUBLIC bypass → clearance gate → TOP_SECRET explicit list → ACL permit (company → division → dept → role → user) → default DENY.
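The core of the decision order can be sketched in a few branches: explicit deny first, then the PUBLIC bypass, then the clearance gate, then ACL allow, defaulting to deny. This simplified sketch omits the owner override and TOP_SECRET explicit list, and its data structures are illustrative, not the actual LINUS-AI schema.

```python
# Simplified sketch of the RAG access decision order above. Owner
# override and the TOP_SECRET explicit list are omitted for brevity;
# dict shapes are illustrative.

PUBLIC = 0

def decide(principal, doc):
    if principal["user_id"] in doc["deny_users"]:
        return "DENY"                        # explicit deny always wins
    if doc["classification"] == PUBLIC:
        return "PERMIT"                      # PUBLIC bypass
    if principal["clearance"] < doc["classification"]:
        return "DENY"                        # clearance gate
    if (principal["department"] in doc["allow_departments"]
            or principal["user_id"] in doc["allow_users"]):
        return "PERMIT"                      # ACL permit
    return "DENY"                            # default deny

doc = {"classification": 2,                  # 2 = CONFIDENTIAL
       "deny_users": {"mallory"},
       "allow_departments": {"research"},
       "allow_users": set()}
alice = {"user_id": "alice", "clearance": 3, "department": "research"}
bob   = {"user_id": "bob", "clearance": 1, "department": "research"}

assert decide(alice, doc) == "PERMIT"
assert decide(bob, doc) == "DENY"            # clearance 1 < CONFIDENTIAL
assert decide({**alice, "user_id": "mallory"}, doc) == "DENY"
```

In production, POST /rag/access-check is the authoritative way to test a decision; the sketch only shows why deny rules and clearance interact the way the bullet list describes.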
14 Compliance Profiles
Every agent profile ships with a pre-configured compliance tier. Regulated profiles require
explicit user consent before a session begins; consent is stored locally and audited.
| Profile | Level | Regulations | Consent Required |
| --- | --- | --- | --- |
| general, creative, reasoning | OPEN | — | No |
| code, engineering | OPEN | — | No (injection check enabled) |
| education | AUDIT | FERPA, COPPA | No |
| support | AUDIT | — | No |
| sales | AUDIT | GDPR, CAN-SPAM | No |
| data_science | AUDIT | GDPR, CCPA | No |
| medical | REGULATED | HIPAA, GDPR | Yes |
| legal | REGULATED | Attorney-Client Privilege, GDPR, CCPA | Yes |
| finance | REGULATED | SOX, PCI-DSS, FINRA, GDPR | Yes |
| hr | REGULATED | EEOC, GDPR, CCPA, FCRA | Yes |
| security | RESTRICTED | SOC2, ISO 27001, NIST SP 800-53, CFAA | Yes (Professional+ licence required) |
Tip: Use GET /compliance/profiles to retrieve the full machine-readable config for all 14 profiles, including PII scan flags and per-profile disclaimer text injected at session start.