README — LINUS-AI v4.0.0

Quickstart, platform binaries, modes, API reference, CLI reference, compliance overview, and licensing — the complete README as a searchable web page.

Quickstart — 60 Seconds

Download the pre-built binary for your platform, activate your license, pull a model, and run. No Python, no pip, no Cargo required.

60-Second Quickstart — Linux x86_64
# 1. Download the binary (or use the GUI installer on macOS/Windows)
$ curl -Lo linus-ai https://github.com/miryala3/linus-ai-public/releases/latest/download/linus-ai-v4.0.0-headless-linux-x86_64
$ chmod +x linus-ai && sudo mv linus-ai /usr/local/bin/
 
# 2. Activate your license key (sent to your email after purchase)
$ linus-ai --activate LNAI-XXXX-XXXX-XXXX-XXXX
✓ License activated · Professional · 1/1 seats
 
# 3. Pull a model and start the server
$ linus-ai --pull-model llama3.2 && linus-ai --serve
Pulling llama3.2 (4.1 GB) ████████████████ 100%
✓ API server running → http://localhost:9480
 
# 4. Open the control panel
$ open http://localhost:9480/app

The binary auto-detects your OS, RAM, GPU, and inference backend. No Python. No pip. No configuration file required. macOS and Windows installers are available on the Download page.

Platform Binaries

Pre-built binaries are published to the public releases page for every tagged version.

Platform | Asset name | Format
macOS arm64 (Apple Silicon) | linus-ai-v4.0.0-headless-macos-arm64 | Binary
macOS x86_64 (Intel) | linus-ai-v4.0.0-headless-macos-x86_64 | Binary
macOS Universal | LINUS-AI-v4.0.0.dmg | Signed DMG (GUI)
Linux x86_64 (static) | linus-ai-v4.0.0-headless-linux-x86_64 | Binary / .deb
Linux arm64 (static) | linus-ai-v4.0.0-headless-linux-arm64 | Binary
Windows 10+ x86_64 | LINUS-AI-Setup-v4.0.0.exe | NSIS installer / MSIX

All releases include SHA256SUMS.txt. Verify: sha256sum -c SHA256SUMS.txt
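If you prefer to script the check instead of calling sha256sum, a minimal Python sketch using only the stdlib (the SHA256SUMS.txt format assumed here is the standard `<digest>  <filename>` per line):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(sums_file: str) -> bool:
    """Check every '<digest>  <filename>' line of a SHA256SUMS.txt."""
    with open(sums_file) as f:
        for line in f:
            expected, name = line.split(maxsplit=1)
            if sha256_of(name.strip()) != expected:
                return False
    return True
```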

Modes

Control what runs via CLI flags or the LINUS_AI_MODE and LINUS_AI_SCOPE environment variables.

Mode & Scope flags
$ ./linus_ai # full mode (default) — everything on
$ ./linus_ai --mode standalone # inference only, no mesh
$ ./linus_ai --mode mesh # LAN peer discovery, no scheduler
$ ./linus_ai --scope private # no web search, no outbound (default)
$ ./linus_ai --scope lan # LAN mesh + local models only
$ ./linus_ai --scope open # allow web search in agent mode
$ RUST_LOG=info ./linus_ai # verbose startup log
$ ./linus_ai 2>/dev/null # completely silent terminal

Mode | What runs
standalone | Inference engine + HTTP API
mesh | + LAN peer discovery (mDNS multicast)
hive | + Federated learning across peers
task | + Distributed job scheduler
full | Everything. Recommended.

Key Environment Variables

Variable | Default | Description
LINUS_AI_MODE | full | Operating mode
LINUS_AI_API_PORT | 9480 | HTTP API + GUI port
LINUS_AI_MODEL_DIR | ~/models,~/.linus_ai/models | Model directories (comma-separated)
LINUS_AI_MESH_SECRET | linus-ai-default-mesh-secret | Shared HMAC secret for mesh auth — change this
LINUS_AI_SCOPE | private | private / lan / open
LINUS_AI_NODE_NAME | system hostname | Name shown in peer cards
LINUS_AI_LOG_LEVEL | warn | trace / debug / info / warn / error
LINUS_AI_TAILSCALE | false | Enable Tailscale peer routing
LINUS_AI_OVERLAY_SERVER | (unset) | WAN relay address, e.g. relay.example.com:9777
LINUS_AI_DATA_DIR | ~/.linus_ai/data | Vault, ledger, session storage
LINUS_AI_AUDIT_DIR | ~/.linus-ai/audit | Primary directory for compliance + RAG audit logs
LINUS_AI_AUDIT_EXPORT_DIRS | (unset) | Colon-separated secondary audit export dirs (enterprise SIEM routing)
RUST_LOG | (unset) | Overrides --log-level when set

Inference Backend

The active backend is selected at compile time via build.sh engine flags — not at runtime.

Engine flag | Backend | GPU | Performance | Cross-compile
--llama-cpp2 (default) | Bundled llama.cpp (static) | Metal / CUDA auto | Fastest | macOS→macOS only
--candle-only | Pure-Rust candle | Metal / CUDA auto | ~20–30% slower | Any host → any target
(no features) | Subprocess fallback | Depends on external tool | Slowest | Any

Subprocess Fallback Priority

Used when the binary was built without a bundled engine:

Priority | Backend | How to activate
1 | llama-server HTTP on :8181 | Start llama-server --model model.gguf --port 8181
2 | ollama HTTP on :11434 | Install Ollama and run ollama serve
3 | Auto-spawn llama-server | brew install llama.cpp — spawned for models ≤ 20 GB
4 | llama-cli subprocess | brew install llama.cpp — GPU first, CPU fallback

Model RAM guard: Models that exceed 85% of physical RAM are automatically marked unavailable and shown greyed-out in the GUI. Use the ⚠ Force button to override.

GPU → CPU fallback: On GPU error (e.g. Metal API mismatch) LINUS-AI retries with CPU-only automatically.
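The RAM guard above amounts to a simple threshold check. A minimal sketch, assuming the documented 85% cutoff (function and parameter names are illustrative, not the actual implementation):

```python
def fits_in_ram(model_bytes: int, physical_ram_bytes: int,
                threshold: float = 0.85, force: bool = False) -> bool:
    """RAM guard sketch: a model larger than 85% of physical RAM is
    marked unavailable unless the user forces the load (the GUI's
    "Force" button maps to force=True here)."""
    if force:
        return True
    return model_bytes <= threshold * physical_ram_bytes

GB = 1024 ** 3
```

A 30 GB model on a 32 GB machine is rejected (30 > 0.85 * 32 ≈ 27.2) unless forced.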

Pipeline Parallelism — Run 70B+ Models Across Multiple Machines

LINUS-AI ships with pipeline-parallel inference: a model too large for a single machine's RAM is automatically split across N mesh nodes, each holding a contiguous range of transformer layers.

Request
  │
  ▼ Head node (layers 0–15)
  │  embed tokens → run layers → send activations
  ▼ Mid node  (layers 16–47)
  │  receive activations → run layers → forward
  ▼ Tail node (layers 48–79)
     receive activations → compute logits → sample → return token

How it works

  1. The head node reads the GGUF tensor index (no weights loaded yet)
  2. LayerPlanner assigns layer ranges to each node proportional to its RAM
  3. Each node loads only its assigned layers from the GGUF file
  4. Activations flow node-to-node over HTTP /pipeline/forward using a compact binary wire format (LNSP magic, f16/f32, little-endian)
  5. KV cache is maintained per-node per-request
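The RAM-proportional assignment in step 2 can be sketched as a minimal allocator. The real LayerPlanner heuristics are not published; this sketch just splits layers proportionally to node RAM and hands leftover layers to the roomiest nodes:

```python
def plan_layers(n_layers: int, node_ram_gb: list[float]) -> list[range]:
    """Assign contiguous layer ranges to nodes proportional to RAM."""
    total = sum(node_ram_gb)
    counts = [int(n_layers * r / total) for r in node_ram_gb]
    # Integer division leaves a remainder of at most len(counts) - 1
    # layers; give them to the largest nodes first.
    for i in sorted(range(len(counts)), key=lambda i: -node_ram_gb[i]):
        if sum(counts) == n_layers:
            break
        counts[i] += 1
    ranges, start = [], 0
    for c in counts:
        ranges.append(range(start, start + c))
        start += c
    return ranges
```

For an 80-layer model over nodes with 16, 32, and 32 GB, this reproduces the head/mid/tail split in the diagram above: layers 0–15, 16–47, 48–79.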

Pipeline API
# Via API
$ curl -X POST http://localhost:9480/pipeline/plan \
-H 'Content-Type: application/json' \
-d '{"model_path": "/path/to/70B.gguf", "use_mesh_peers": true}'
 
$ curl -X POST http://localhost:9480/pipeline/infer \
-d '{"prompt": "Hello", "max_tokens": 256}'
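For illustration, a frame in the spirit of the /pipeline/forward wire format could be packed like this. Only the LNSP magic, f16/f32 payloads, and little-endianness are documented; the header layout (dtype tag, element count) is an assumption:

```python
import struct

MAGIC = b"LNSP"
DTYPE_F32, DTYPE_F16 = 0, 1  # assumed dtype tags, not from the spec

def encode_frame(values, dtype=DTYPE_F32):
    """Pack a 1-D activation tensor: magic, dtype tag, element count,
    then a little-endian f32 ('f') or f16 ('e') payload."""
    fmt = "<%d%s" % (len(values), "f" if dtype == DTYPE_F32 else "e")
    return MAGIC + struct.pack("<BI", dtype, len(values)) + struct.pack(fmt, *values)

def decode_frame(frame):
    """Inverse of encode_frame; rejects frames without the magic."""
    assert frame[:4] == MAGIC, "bad magic"
    dtype, n = struct.unpack_from("<BI", frame, 4)
    fmt = "<%d%s" % (n, "f" if dtype == DTYPE_F32 else "e")
    return list(struct.unpack_from(fmt, frame, 9))
```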

Internet Node Payment (Blockchain Billing)

LAN peers are always free. Internet peers must pay per inference unit.

Rule | Detail
LAN / loopback | Free — no account needed
New internet node | 5 free welcome units on first contact
Cost | 1 unit per inference call
Token rate | 1 unit = 1,000 output tokens
Denied | 402 Payment Required + JSON reason

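Under these rules, per-node accounting reduces to a few lines. A hedged sketch (class and method names are invented for illustration):

```python
class NodeAccount:
    """Billing-rule sketch: LAN peers bypass billing entirely, a new
    internet node starts with 5 welcome units, each call costs 1."""
    WELCOME_UNITS = 5
    COST_PER_CALL = 1

    def __init__(self, is_lan: bool):
        self.is_lan = is_lan
        self.units = 0 if is_lan else self.WELCOME_UNITS

    def charge(self) -> int:
        """Return an HTTP-style status: 200 on success, 402 when broke."""
        if self.is_lan:
            return 200                      # LAN / loopback: always free
        if self.units < self.COST_PER_CALL:
            return 402                      # Payment Required
        self.units -= self.COST_PER_CALL
        return 200
```

An internet node's sixth call without a top-up yields 402.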
Billing API
# Check balances
$ curl http://localhost:9480/billing
 
# Top up a node
$ curl -X POST http://localhost:9480/billing/topup \
-d '{"node_id": "abc123", "address": "203.0.113.5", "units": 50}'

GUI Tabs

Open http://localhost:9480/app after starting. The control panel includes these tabs:

Tab | What it does
Chat | Streaming tokens, markdown, document upload (.pdf/.docx/.txt), export as .md
Agent | ReAct loop with tools — thought/action/observation steps inline. Web search when scope=open.
Models | List GGUF files, load/unload, assign roles. RAM-unfit models shown disabled.
Setup | Node role, model role assignments, compliance profile selection, RAG document access control
Mesh | Peer topology map, hub-spoke visualisation, manual peer connect
Chorus | Federated learning controls, gradient aggregation dashboard
Tasks | Task submit/status/cancel
Thermal | 5-stage thermal state (NOMINAL → THROTTLE → HOT → CRITICAL → EMERGENCY)
Ledger | Blockchain transparency audit trail (SHA-256 hash chain + Merkle proofs)
Hardware | AI capability score, RAM/GPU/NPU detection
Compliance | Industry compliance profiles (14), PII scanning, injection detection, consent management, immutable audit log
Permissions | Node access controls, mesh auth, billing, model sharing
Launch | In-app shell terminal with command history and quick-launch buttons

API Reference

All endpoints on http://localhost:9480 (configurable via LINUS_AI_API_PORT).

Inference

Method | Path | Body | Description
POST | /infer | {prompt, system?, max_tokens?, temperature?} | Single inference
POST | /infer/stream | same | SSE token stream
POST | /agent/stream | {message, profile?, max_tokens?, session_id?} | Agent ReAct loop, SSE
POST | /pipeline/plan | {model_path, use_mesh_peers?} | Configure pipeline plan
POST | /pipeline/infer | {prompt, max_tokens?, temperature?} | Pipeline inference
POST | /pipeline/forward | binary frame | Node-to-node activation forwarding
POST | /pipeline/clear | {request_id} | Clear KV cache
GET/POST | /tensor/plan | TensorParallelPlan JSON | Get / set tensor parallel plan
GET | /tensor/status | - | TP plan + RPC worker health
POST | /tensor/infer | {prompt, max_tokens?, temperature?} | Tensor-parallel inference (coordinator)
POST | /tensor/allreduce | binary TNSR frame | Submit partial activation; returns reduced tensor
POST | /tensor/rpc/start | {rpc_port?} | Spawn llama-rpc-server on this node
POST | /tensor/rpc/stop | - | Stop llama-rpc-server

Models

Method | Path | Description
GET | /models | List models with fits_in_ram flag per model
POST | /models/select | Set active model — blocked if too large for RAM (pass force:true to override)
POST | /models/load | Load model — blocked if too large for RAM (pass force:true to override)
POST | /models/unload | Unload current model
GET/POST | /models/roles | Get/set role assignments
POST | /models/pull | Pull model from URL
PUT | /models/push | Receive model binary from a peer
GET | /models/recommend | Model recommendations for this node

Mesh & Peers

Method | Path | Description
GET | /peers | Active peer list with roles, RAM, thermal
POST | /mesh/push-model | Push a model to a specific peer
POST | /mesh/assign-roles | Reassign hub/spoke roles across the mesh
GET | /benchmark | Routing table and latency scores

Billing

Method | Path | Description
GET | /billing | All node accounts with balances
POST | /billing/topup | Add inference credits: {node_id, address, units}

Compliance & Security

Method | Path | Body / Params | Description
GET | /compliance/preflight | ?text=...&profile=... | Check text against active compliance profile
GET | /compliance/profiles | - | List all 14 compliance profiles
POST | /compliance/consent | {user_id, action, profile} | Record user consent
GET | /compliance/consent | ?user_id=... | Check consent status
GET | /compliance/audit | ?limit=N&profile=... | Query immutable audit log
GET | /compliance/audit/verify | - | Verify HMAC chain integrity
POST | /compliance/audit/seal | - | Seal all completed monthly log files
POST | /compliance/audit/export | {dest_dir} | Export audit snapshot to directory

RAG Document Access Control

Method | Path | Body / Params | Description
GET | /rag/documents | - | List registered documents with classifications
POST | /rag/documents/register | {title, path, owner_user_id, classification, ...} | Register document in registry
PUT | /rag/documents/{id}/acl | {allow_users, deny_users, allow_companies, ...} | Update document ACL
PUT | /rag/documents/{id}/classification | {classification} | Update document classification level
DELETE | /rag/documents/{id} | - | Remove document from registry
POST | /rag/access-check | {user_id, doc_id} | Check access and log decision
GET | /rag/principals | - | List registered principals
POST | /rag/principals | {user_id, name, clearance, company, ...} | Create or update principal
DELETE | /rag/principals/{user_id} | - | Remove principal
GET | /rag/audit | ?doc_id=...&user_id=...&denied_only=true&limit=N | Query RAG access audit log

Shell & System

Method | Path | Description
POST | /shell/exec | Run a shell command: {command, timeout_s?}
GET | /status | Node status, uptime, active model, backend
GET | /stats | Full stats: thermal, blockchain, mesh
GET | /health | {"ok": true} liveness probe
GET | /blockchain | Ledger stats, chain validity
GET | /thermal | Thermal state and throttle level
GET/POST | /settings | Runtime settings
GET | /endpoints | Full documented endpoint index

Launch Shell (In-App Terminal)

The Launch tab embeds a terminal in the browser — no external terminal needed.

  • Type any shell command and press Run ▶ or Enter
  • Arrow keys for command history
  • Quick-launch buttons: List models, Disk usage, RAM, GPU, llama procs, API status
  • Safety filter blocks destructive patterns (rm -rf /, mkfs.*, fork bomb)
  • Timeout capped at 120 s per command
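A safety filter of this kind can be sketched with a few regular expressions. The pattern list below is inferred from the examples above; it is not the shipped filter, which may cover more cases:

```python
import re

# Patterns inferred from the README's examples (rm -rf /, mkfs.*,
# fork bomb); illustrative only.
DESTRUCTIVE = [
    r"\brm\s+-[a-z]*r[a-z]*f?[a-z]*\s+/(\s|$)",  # rm -rf / and variants
    r"\bmkfs(\.\w+)?\b",                          # mkfs, mkfs.ext4, ...
    r":\(\)\s*\{.*\};\s*:",                       # classic shell fork bomb
]

def is_blocked(cmd: str) -> bool:
    """True when the command matches any destructive pattern."""
    return any(re.search(p, cmd) for p in DESTRUCTIVE)
```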

CLI Reference

linus_ai CLI
$ linus_ai # start server (full mode)
$ linus_ai --mode standalone # inference only
$ linus_ai --scope open # enable web search in agent mode
$ linus_ai info # hardware info and exit
$ linus_ai models # list AI models found on disk
$ linus_ai compile <script.py> # compile Python script to native binary
$ linus_ai run <script.py> # run Python script interpreted
$ linus_ai emit <script.py> # print generated C source
$ linus_ai targets # list cross-compilation targets

Compliance & Security Layer

LINUS-AI includes a built-in compliance engine (linus_ai/compliance.py) that enforces domain-specific governance before every inference request.

14 Compliance Profiles

Tier | Profiles
OPEN (permissive) | general, creative, reasoning, code, engineering
AUDIT (logged) | education, support, sales, data_science
REGULATED (strict) | medical, legal, finance, hr
RESTRICTED (hard-block) | security

What it checks

  • PII scanning — 12 types (email, phone, SSN, credit card, CVV, passport, IP, MAC, NHS number, IBAN, PAN-like, medical record). Credit card / CVV / SSN / PAN-like inputs are blocked outright; others are redacted.
  • Injection detection — 8 rule families (role override, prompt leak, jailbreak, DAN, system override, token smuggling, goal hijacking, multi-turn extraction). RESTRICTED profile hard-blocks; REGULATED profiles warn.
  • Consent gating — REGULATED/RESTRICTED profiles require explicit user consent before proceeding.
  • Immutable audit log — every request and decision recorded in HMAC-chained monthly .jsonl files; completed months sealed with chmod 0o400 + macOS UF_IMMUTABLE + Linux chattr +i.
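The HMAC chaining can be illustrated with a short sketch. The record layout and key handling are assumptions; the shipped format lives in linus_ai/compliance.py:

```python
import hashlib, hmac, json

def append_record(log: list, record: dict, key: bytes) -> None:
    """Chain each entry to its predecessor: the MAC covers the previous
    entry's MAC plus the canonical JSON of the new record."""
    prev = log[-1]["mac"] if log else ""
    payload = prev + json.dumps(record, sort_keys=True)
    mac = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    log.append({"record": record, "mac": mac})

def verify_chain(log: list, key: bytes) -> bool:
    """Recompute every MAC; a tampered or reordered entry breaks it."""
    prev = ""
    for entry in log:
        payload = prev + json.dumps(entry["record"], sort_keys=True)
        if hmac.new(key, payload.encode(), hashlib.sha256).hexdigest() != entry["mac"]:
            return False
        prev = entry["mac"]
    return True
```

Because each MAC folds in the previous one, editing any sealed record invalidates every entry after it.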

Enterprise Audit Routing

SIEM routing
$ export LINUS_AI_AUDIT_DIR=/mnt/compliance/audit
$ export LINUS_AI_AUDIT_EXPORT_DIRS=/siem/linus:/backup/audit

Records are written to all locations in real time. Use POST /compliance/audit/seal to seal completed months and POST /compliance/audit/export for point-in-time snapshots.

RAG Document Access Control

LINUS-AI implements fine-grained access control for Retrieval-Augmented Generation documents (linus_ai/rag_access.py).

Classification Levels

Level | Name | Access
0 | PUBLIC | Anyone
1 | INTERNAL | Authenticated company members
2 | CONFIDENTIAL | Explicit ACL permit required
3 | SECRET | Clearance ≥ 3 + explicit permit
4 | TOP_SECRET | Clearance 4 + named on explicit list

ACL Scopes

Access rules can allow or deny at five scopes: company, division, department, role, user. Deny rules always override allow rules.

7-Step Access Decision

  1. Owner override (always permit)
  2. Explicit user DENY
  3. PUBLIC bypass (always permit)
  4. Clearance gate
  5. TOP_SECRET explicit list
  6. ACL permit (company → division → dept → role → user)
  7. Default DENY
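The seven steps map directly onto a decision function. A sketch with illustrative field names (the real logic is in linus_ai/rag_access.py; the company/division/department/role ACL scopes are collapsed into the user scope here):

```python
PUBLIC, INTERNAL, CONFIDENTIAL, SECRET, TOP_SECRET = range(5)

def check_access(user: dict, doc: dict) -> bool:
    """Walk the 7-step decision order; deny always beats allow."""
    if user["id"] == doc["owner"]:                        # 1. owner override
        return True
    if user["id"] in doc.get("deny_users", ()):           # 2. explicit user DENY
        return False
    if doc["classification"] == PUBLIC:                   # 3. PUBLIC bypass
        return True
    if user.get("clearance", 0) < doc["classification"]:  # 4. clearance gate
        return False
    if doc["classification"] == TOP_SECRET:               # 5. explicit list only
        return user["id"] in doc.get("allow_users", ())
    if user["id"] in doc.get("allow_users", ()):          # 6. ACL permit
        return True
    return False                                          # 7. default DENY
```

Note that a user on the deny list is refused even with clearance 4, since the deny check runs before every allow path except owner override.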

All decisions are written to a tamper-evident HMAC-chained audit log.

License

LINUS-AI Source License v2.0 — free for personal / academic / small business (< $100K revenue). Buy once, own forever — updates are optional, not forced.

Tier | Annual (updates incl.) | Perpetual
Community | Free forever | Free
Professional (1 seat) | $99/yr | $499
Team (5 seats) | $199/yr | $1,499
Enterprise | $7,999/yr |
Enterprise Plus | $14,999/yr |

See Pricing page for full tier details, or LICENSE.md for license terms.