Democritus — Decentralized Swarm Inference Network

A distributed, infinite Mixture-of-Experts over a P2P protocol, where every consumer node contributes a distinct "expert" local model and the swarm converges into an emergent super-intelligence.


Overview

Democritus is a Rust-native, zero-dependency daemon that turns your local AI model into a node in a global reasoning collective. Unlike centralized LLM APIs or pipeline-parallel systems, Democritus creates a heterogeneous expert swarm — each node runs a different specialized model, and complex prompts are decomposed into parallel sub-tasks routed to the best-matching experts.

How It Differs

| Project | Architecture | Limitation |
|---|---|---|
| Petals | Pipeline-parallel sharding of one model | All nodes run the same model; Python-only |
| Bittensor | Token-incentivized subnet marketplace | Heavyweight on-chain overhead per inference |
| Democritus | Heterogeneous expert swarm with semantic routing | Rust-native, parallel sub-tasks, lightweight verification |

Key Features

  • Semantic Vector Routing — Prompts matched to experts via embedding-space similarity over Kademlia DHT with LSH
  • Macro-Task Parallelism — Complex prompts decomposed into parallel sub-tasks executed concurrently
  • Proof-of-Inference — Probabilistic log-probability auditing with KL divergence verification
  • Mid-Stream Failover — Automatic fallback to backup experts on node failure
  • Hot Model Swapping — LRU eviction for loading/unloading models on demand
  • Private Swarms — PSK-gated DHT partitions for invite-only clusters
  • Prompt Obfuscation — Shard-and-pad splitting so no single expert sees full context
  • Three Interfaces — gRPC (50051), REST/OpenAI-compatible (8080), WebSocket (8081)
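
To make the Hot Model Swapping feature concrete, here is a minimal sketch of LRU eviction over a fixed number of resident models. It is an illustration only: `ModelCache`, `LoadedModel`, and `acquire` are hypothetical names, not the daemon's actual types, and real loading and unloading of GGUF weights is replaced by a placeholder struct.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a loaded model; a real node would hold a GGUF handle here.
struct LoadedModel {
    path: String,
}

/// Keeps at most `capacity` models resident.
/// The front of the deque is the least recently used entry.
struct ModelCache {
    capacity: usize,
    resident: VecDeque<LoadedModel>,
}

impl ModelCache {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be at least 1");
        Self { capacity, resident: VecDeque::new() }
    }

    /// Marks `path` as most recently used, loading it if needed.
    /// Returns the path of the model that was evicted to make room, if any.
    fn acquire(&mut self, path: &str) -> Option<String> {
        // Cache hit: move the model to the most-recently-used end.
        if let Some(pos) = self.resident.iter().position(|m| m.path == path) {
            let model = self.resident.remove(pos).expect("position is in bounds");
            self.resident.push_back(model);
            return None;
        }
        // Cache miss: evict the least recently used model when the cache is full.
        let evicted = if self.resident.len() == self.capacity {
            self.resident.pop_front().map(|m| m.path)
        } else {
            None
        };
        self.resident.push_back(LoadedModel { path: path.to_string() });
        evicted
    }

    fn is_resident(&self, path: &str) -> bool {
        self.resident.iter().any(|m| m.path == path)
    }
}

fn main() {
    let mut cache = ModelCache::new(2);
    cache.acquire("models/a.gguf");
    cache.acquire("models/b.gguf");
    cache.acquire("models/a.gguf"); // touching `a` makes `b` the eviction candidate
    let evicted = cache.acquire("models/c.gguf");
    println!("evicted: {:?}, a resident: {}", evicted, cache.is_resident("models/a.gguf"));
}
```

A `VecDeque` keeps the sketch short; with only a handful of resident models the linear scan on each lookup is negligible compared to the cost of loading weights.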

The Big Picture

Today's AI is centralized: you send prompts to a provider's server, they return answers, and you pay per token. Democritus flips this model — instead of one giant model serving everyone, it creates a distributed Mixture-of-Experts (MoE) over a peer-to-peer network where every node contributes a unique specialist model.

When you ask a complex question, Democritus decomposes it into sub-tasks, finds the best-matching expert nodes via semantic vector routing, dispatches them in parallel, and synthesizes the results locally. The swarm becomes an emergent super-intelligence — no single node knows everything, but collectively they can reason about anything.
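
The decompose, dispatch, and synthesize flow described above can be sketched with scoped threads from the standard library. Everything here is a simplification under stated assumptions: `decompose` splits on `;` instead of classifying intent, `query_expert` is a local placeholder for a network call to a remote expert, and `run_swarm` joins answers with newlines rather than performing weighted aggregation.

```rust
use std::thread;

/// Hypothetical sub-task produced by decomposition.
struct SubTask {
    id: usize,
    prompt: String,
}

/// Placeholder for a remote expert call; a real node would send the prompt over the network.
fn query_expert(task: &SubTask) -> String {
    format!("[expert {} answer to: {}]", task.id, task.prompt)
}

/// Naive decomposition: one sub-task per clause separated by ';'.
fn decompose(prompt: &str) -> Vec<SubTask> {
    prompt
        .split(';')
        .map(str::trim)
        .filter(|clause| !clause.is_empty())
        .enumerate()
        .map(|(id, clause)| SubTask { id, prompt: clause.to_string() })
        .collect()
}

/// Runs every sub-task on its own thread and joins the answers in task order.
fn run_swarm(prompt: &str) -> String {
    let tasks = decompose(prompt);
    let answers: Vec<String> = thread::scope(|scope| {
        // Spawn all sub-tasks first so they run concurrently.
        let handles: Vec<_> = tasks
            .iter()
            .map(|task| scope.spawn(move || query_expert(task)))
            .collect();
        // Joining in spawn order keeps the synthesis deterministic.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("expert thread panicked"))
            .collect()
    });
    answers.join("\n")
}

fn main() {
    println!("{}", run_swarm("Define the CAP theorem; give a real-world example"));
}
```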

Why it matters:

  • No vendor lock-in — You're not dependent on a single provider's API or pricing
  • Privacy-preserving — Prompts can be obfuscated so no single node sees full context
  • Self-healing — Nodes come and go; the swarm adapts automatically
  • Community-driven — Anyone can contribute a model and earn credits by serving queries

Read the full vision document: The Big Picture


Quick Start

🚀 Quick Install (Pre-built Release)

Get the latest pre-compiled bundle from GitHub Releases:

  • Windows: Run the .exe or .msi installer.
  • macOS: Drag the .dmg bundle to your Applications folder.
  • Linux: Install the .deb package: sudo dpkg -i democritus-swarm_*.deb
  • Android: Download and sideload the .apk.

The installer automatically bundles both the GUI desktop app and the background inference daemon.


🛠️ Build & Install from Source

Prerequisites

  • Rust 1.75+ — Install via rustup
  • CMake + C++ toolchain — Required for llama-cpp-rs (native inference)
  • libclang — Required for ONNX Runtime bindings
  • (Optional) GGUF models — Place in models/ directory
  • (Optional) bge-small-en-v1.5.onnx — For production embeddings (~130 MB)

Build

git clone https://github.com/overcrash66/Democritus.git
cd Democritus
cargo build --release

Run (CLI)

# Start the daemon
./target/release/daemon daemon --config config.toml

# Query the swarm
./target/release/daemon ask "Explain the CAP theorem" --experts 3 --stream

# List connected peers
./target/release/daemon peers list --verbose

# View reputation table
./target/release/daemon reputation show

# Run benchmarks
./target/release/daemon benchmark --iterations 100

GUI (Desktop & Mobile)

The project includes a Tauri v2 + Svelte 5 application for managing the daemon, available on Windows, Linux, macOS, Android, and iOS.

Desktop (Windows example):

# Build from repo root
cargo build --release -p daemon
Copy-Item target/release/daemon.exe "gui/src-tauri/binaries/daemon-x86_64-pc-windows-msvc.exe"
cd gui
npm install
cargo tauri build

Output: gui/src-tauri/target/release/bundle/nsis/Democritus Swarm_0.1.1_x64-setup.exe

Android:

# Requires: JDK 17, Android SDK + NDK, Android Studio
cd gui
npm install
npx tauri android init
npx tauri android build --target aarch64-linux-android

Output: gui/src-tauri/gen/android/app/build/outputs/apk/release/app-release.apk

iOS (macOS only):

# Requires: macOS, Xcode, Apple Developer account
cd gui
npm install
npx tauri ios init
npx tauri ios build

Full GUI documentation: gui/README.md

Docker

docker compose up -d

Use Cases

Democritus is designed for a wide range of scenarios, from personal assistants to enterprise deployments:

| Use Case | Description | Link |
|---|---|---|
| Personal AI Assistant | Run a local model + query the swarm for specialized knowledge | Guide |
| Private Enterprise Swarm | PSK-gated private network for organizations with sensitive data | Guide |
| Research Collaboration | Share domain-specific models across institutions without exposing data | Guide |
| Developer Tooling | IDE plugins that route code questions to language-specific experts | Guide |
| Edge Computing | Lightweight local models with cloud swarm fallback | Guide |
| Content Moderation | Multi-expert moderation pipeline with consensus decisions | Guide |

See detailed examples with code: Use Cases


Architecture

flowchart TD
    subgraph Network [Layer 1: Network]
        direction LR
        P2P[libp2p Swarm - QUIC]
        gRPC[gRPC Server]
        REST[REST/WS Gateway]
    end

    subgraph Scheduler [Layer 2: Scheduler]
        Router[Request Router <br/> priority + rate limiter]
    end

    subgraph Inference [Layer 3: Inference]
        Engine[llama-cpp-rs <br/> dedicated thread pool]
    end

    subgraph Memory [VRAM / System RAM]
        Model[Loaded GGUF Model]
    end

    Network --> Scheduler
    Scheduler --> Inference
    Inference --> Memory

Core Components

| Component | Description |
|---|---|
| P2P Network | libp2p with QUIC transport, Kademlia DHT, Gossipsub, AutoNAT, DCUTR, Circuit Relay v2 |
| Embedding Engine | bge-small-en-v1.5 ONNX model (384-dim) with LSH quantization to 64-bit bitmasks |
| Decomposition Engine | Intent classification (11 categories) + complexity scoring + task DAG generation |
| Swarm Coordinator | Orchestrates DAG execution, LSH routing, PoI audits, MoA aggregation |
| MoA Aggregator | Weighted synthesis, conflict resolution, consensus centroid selection |
| Reputation Manager | Trust scoring with KL divergence PoI audits + gossip-based web-of-trust |
| Credit Ledger | Bilateral accounting: earn by serving, spend by querying |
| Sanitizer | ANSI strip, HTML sandbox, UTF-8 fence, repetition detector, length guard |
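
As an illustration of how a 384-dimensional embedding can be quantized to a 64-bit bitmask, the sketch below uses random-hyperplane LSH: each bit records which side of a random hyperplane the embedding falls on, and similar embeddings end up with a small Hamming distance. This is one common LSH scheme and may not match what `embedding.rs` does. `LshProjector`, the seed, and the SplitMix64 generator are assumptions made so the example needs no external crates, and the hyperplanes are drawn uniformly from [-1, 1) rather than from a Gaussian.

```rust
const DIM: usize = 384; // embedding width stated for bge-small-en-v1.5
const BITS: usize = 64; // signature width

/// SplitMix64: a small deterministic PRNG so the sketch needs no external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform value in [-1, 1), built from the top 24 bits of the generator.
    fn next_f32(&mut self) -> f32 {
        let unit = (self.next_u64() >> 40) as f32 / (1u64 << 24) as f32;
        unit * 2.0 - 1.0
    }
}

/// Holds one random hyperplane per signature bit.
struct LshProjector {
    planes: Vec<[f32; DIM]>,
}

impl LshProjector {
    /// All nodes must share the same seed so their signatures are comparable.
    fn new(seed: u64) -> Self {
        let mut rng = SplitMix64(seed);
        let planes = (0..BITS)
            .map(|_| {
                let mut plane = [0.0f32; DIM];
                for x in plane.iter_mut() {
                    *x = rng.next_f32();
                }
                plane
            })
            .collect();
        Self { planes }
    }

    /// Bit i is set when the embedding has a non-negative dot product with hyperplane i.
    fn signature(&self, embedding: &[f32; DIM]) -> u64 {
        let mut bits = 0u64;
        for (i, plane) in self.planes.iter().enumerate() {
            let dot: f32 = plane.iter().zip(embedding).map(|(a, b)| a * b).sum();
            if dot >= 0.0 {
                bits |= 1 << i;
            }
        }
        bits
    }
}

/// Number of differing bits; smaller means more similar embeddings.
fn hamming(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}

fn main() {
    let lsh = LshProjector::new(42);
    let mut query = [0.0f32; DIM];
    for (i, x) in query.iter_mut().enumerate() {
        *x = (i as f32 * 0.37).sin();
    }
    let mut expert = query;
    expert[0] += 0.05; // a slightly different embedding
    let distance = hamming(lsh.signature(&query), lsh.signature(&expert));
    println!("Hamming distance between signatures: {distance}");
}
```

Comparing two signatures is a single XOR plus a population count, which is what makes bitmask matching cheap enough to run against many candidate experts.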

Project Structure

democritus/
├── Cargo.toml          # Workspace definition
├── config.toml         # Default node configuration
├── daemon/             # Main daemon crate
│   ├── src/
│   │   ├── main.rs         # CLI entrypoint + subcommands
│   │   ├── lib.rs          # Public module re-exports
│   │   ├── config.rs       # Configuration loading + validation
│   │   ├── runner.rs       # Daemon startup orchestrator
│   │   ├── p2p.rs          # libp2p swarm (QUIC, Kademlia, Gossipsub)
│   │   ├── coordinator.rs  # Swarm orchestration + PoI audits
│   │   ├── inference.rs    # llama-cpp-rs inference + hot swapping
│   │   ├── embedding.rs    # ONNX embeddings + LSH projection
│   │   ├── decomposition.rs# Intent classification + task splitting
│   │   ├── dag.rs          # Task DAG + topological sorting
│   │   ├── aggregator.rs   # MoA aggregation strategies
│   │   ├── reputation.rs   # Peer trust + KL divergence audits
│   │   ├── ledger.rs       # Credit ledger + token accounting
│   │   ├── sanitizer.rs    # Input/output sanitization pipeline
│   │   ├── registry.rs     # Model registry + integrity verification
│   │   ├── obfuscation.rs  # Prompt shard-and-pad obfuscation
│   │   ├── routing_cache.rs# LRU routing cache for expert lookup
│   │   ├── lan_discovery.rs# mDNS / local network peer discovery
│   │   ├── auth.rs         # API key authentication middleware
│   │   ├── api.rs          # OpenAI-compatible REST API
│   │   ├── grpc.rs         # gRPC service implementation
│   │   ├── ws.rs           # WebSocket streaming gateway
│   │   ├── health.rs       # Deep diagnostics reporting
│   │   ├── benchmark.rs    # Performance benchmarking suite
│   │   ├── commands/       # CLI subcommand handlers
│   │   └── tests.rs        # Unit + integration tests
│   └── build.rs        # Protobuf codegen
├── proto/              # Protobuf definitions crate
│   └── democritus.proto
├── sdk/                # Client SDK crate
│   └── src/
│       ├── lib.rs          # SDK entrypoint
│       ├── client.rs       # DemocritusClient
│       └── types.rs        # Request/response types
├── gui/                # GUI — Desktop (Tauri v2 + Svelte 5) & Mobile (Android/iOS)
│   ├── src/                # Svelte frontend (views, components, stores)
│   └── src-tauri/          # Rust backend (IPC commands, gRPC client, sidecar)
│       ├── .cargo/config.toml  # Cross-compilation linker + CC/AR config for Android
│       └── binaries/           # Sidecar binaries (desktop) or placeholders (mobile)
├── scripts/            # Build, packaging & test scripts (see scripts/README.md)
│   ├── build.ps1       # Windows packaging
│   ├── build-linux.sh  # Linux .deb packaging
│   ├── build-macos.sh  # macOS .dmg packaging
│   └── ...             # Test scripts (e2e, ledger, parallelism, obfuscation, reputation)
├── docs/               # Documentation (see docs/README.md)
│   ├── README.md         # Documentation index and reading order
│   ├── getting-started.md# End-to-end first-run tutorial
│   ├── big-picture.md    # The vision: why decentralized swarm inference matters
│   ├── use-cases.md      # Practical scenarios with code examples
│   ├── sharing-models.md # How to contribute your model to the swarm
│   ├── architecture.md   # System architecture deep-dive
│   ├── api-reference.md  # API documentation
│   ├── configuration.md  # Configuration guide
│   ├── troubleshooting.md# Common issues and fixes
│   ├── debugging.md      # Logging, diagnostics, and debug workflows
│   └── faq.md            # Frequently asked questions
├── Dockerfile          # Container build
├── docker-compose.yml  # Multi-node compose
├── CHANGELOG.md        # Release history
└── democritus.service  # systemd service file

API Reference

REST API (OpenAI-Compatible)

POST http://localhost:8080/v1/chat/completions
Content-Type: application/json

{
  "model": "swarm",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain the CAP theorem."}
  ],
  "stream": true,
  "temperature": 0.7,
  "max_tokens": 2048
}

gRPC (port 50051)

| RPC | Description |
|---|---|
| Chat | Streaming chat completion with token-by-token responses |
| Health | Deep diagnostics health check |
| ListPeers | List known peers with reputation scores |
| GetReputation | Query full reputation table |

WebSocket (port 8081)

Connect to ws://localhost:8081/ws/chat for real-time token streaming.

Client SDK

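// Assumes both types are re-exported at the SDK crate root.
use democritus_sdk::{ChatCompletionRequest, DemocritusClient};
// Provides `.next()`; assumes the SDK stream implements `futures::Stream`.
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Client for the daemon running on this machine
    let client = DemocritusClient::local();

    // Simple query: wait for the complete answer
    let response = client.ask("What is Rust?").await?;
    println!("{}", response);

    // Streaming query: print tokens as they arrive
    let request = ChatCompletionRequest::new("Explain quantum computing")
        .with_stream(true);
    let mut stream = client.chat_completion_stream(request).await?;
    while let Some(token) = stream.next().await {
        print!("{}", token?);
    }

    // `main` returns a Result so the `?` operators above compile
    Ok(())
}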

Full API documentation: API Reference


Configuration

See docs/configuration.md for the full reference.

[node]
name = "democritus-node-1"
listen_address = "/ip4/0.0.0.0/udp/9000/quic-v1"
bootstrap_nodes = ["/ip4/127.0.0.1/udp/9001/quic-v1/p2p/..."]
state_dir = "./state"
private_swarm_key = "optional-psk-for-private-swarm"

[expert]
# Textual description of this node's AI specialization
description = "Specialized in Rust systems programming"
model_path = "./models/rustacean-7b-Q5_K_M.gguf"
max_context = 8192
temperature = 0.0
backend = "cpu"

[api]
rest_port = 8080
grpc_port = 50051
ws_port = 8081

Sharing Your Model

Want to contribute your model to the swarm? Here's the quick path:

  1. Prepare your model — Convert to GGUF format (recommended: Q5_K_M quantization)
  2. Write a capability description — Be specific: "Specialized in Rust systems programming, memory safety, and concurrent data structures"
  3. Configure your node — Set description and model_path in config.toml
  4. Join the swarm — Start the daemon; your node advertises on the DHT automatically
  5. Build reputation — Serve queries reliably to earn trust in the swarm

You can join the public swarm (open discovery), connect to an existing network (add bootstrap nodes), or create a private swarm (PSK-gated for invite-only).

Step-by-step guide: Sharing Models


Verification & Security

Proof-of-Inference (PoI)

  1. 5% Spotcheck — Random audit of inference requests with KL divergence verification
  2. Cross-Validation — High-priority tasks dispatched to 2 experts simultaneously
  3. Log-Probability Auditing: $D_{KL}(P \parallel Q) < 0.15$ threshold for pass/fail
  4. Ed25519 Signatures — All token packets signed for non-repudiation
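
The log-probability audit can be illustrated with a small pass/fail check. This is a sketch under assumptions: it takes P to be the distribution reported by the audited expert and Q the auditor's reference distribution over the same candidate tokens (the source does not say which is which), uses natural logarithms, and renormalizes both inputs. Only the 0.15 threshold comes from the list above; `kl_divergence` and `passes_audit` are hypothetical names.

```rust
/// Pass/fail threshold quoted in the Proof-of-Inference list.
const KL_THRESHOLD: f64 = 0.15;

/// Converts natural-log probabilities into a normalized distribution.
/// Subtracting the maximum first avoids underflow in `exp`.
fn normalize(logprobs: &[f64]) -> Vec<f64> {
    let max = logprobs.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = logprobs.iter().map(|lp| (lp - max).exp()).collect();
    let total: f64 = exps.iter().sum();
    exps.iter().map(|e| e / total).collect()
}

/// D_KL(P || Q) = sum_i p_i * ln(p_i / q_i), in nats.
/// Returns `None` when the inputs are empty or cover different numbers of tokens.
fn kl_divergence(p_logprobs: &[f64], q_logprobs: &[f64]) -> Option<f64> {
    if p_logprobs.is_empty() || p_logprobs.len() != q_logprobs.len() {
        return None;
    }
    let p = normalize(p_logprobs);
    let q = normalize(q_logprobs);
    let mut kl = 0.0;
    for (pi, qi) in p.iter().zip(&q) {
        if *pi > 0.0 {
            if *qi == 0.0 {
                // P puts mass where Q has none: the divergence is unbounded.
                return Some(f64::INFINITY);
            }
            kl += pi * (pi / qi).ln();
        }
    }
    Some(kl)
}

/// An audit passes only when the divergence is defined and below the threshold.
fn passes_audit(expert_logprobs: &[f64], reference_logprobs: &[f64]) -> bool {
    matches!(
        kl_divergence(expert_logprobs, reference_logprobs),
        Some(kl) if kl < KL_THRESHOLD
    )
}

fn main() {
    let expert = [0.5f64.ln(), 0.5f64.ln()];
    let reference = [0.6f64.ln(), 0.4f64.ln()];
    let kl = kl_divergence(&expert, &reference).expect("same length, non-empty");
    println!("KL = {kl:.4} nats, pass = {}", passes_audit(&expert, &reference));
}
```

Because KL divergence is asymmetric, swapping the two arguments generally changes the result, so an auditor and the audited node would need to agree on the direction.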

Privacy

  • Transport Encryption — QUIC/TLS 1.3 for all P2P traffic
  • Private Swarms — PSK-gated DHT partitions with salted protocol names
  • Prompt Obfuscation — Shard-and-pad splitting for sensitive workloads
  • No Data Retention — Expert nodes contractually required to discard prompts after inference
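
One way to picture shard-and-pad splitting is shown below: the prompt is cut into a fixed number of contiguous word groups, and every group is padded to the same length so that shard size reveals nothing about where the prompt was cut. This is a guess at the general idea, not the scheme implemented in `obfuscation.rs`; the function name, the word-level split, and the visible padding character are all assumptions.

```rust
/// Splits `prompt` into exactly `shards` contiguous word groups and pads each
/// group with `pad` so that all shards have the same character count.
fn shard_and_pad(prompt: &str, shards: usize, pad: char) -> Vec<String> {
    assert!(shards > 0, "at least one shard is required");
    let words: Vec<&str> = prompt.split_whitespace().collect();

    // Ceiling division so that no more than `shards` groups are produced.
    let per_shard = ((words.len() + shards - 1) / shards).max(1);
    let mut parts: Vec<String> = words
        .chunks(per_shard)
        .map(|group| group.join(" "))
        .collect();

    // Short prompts yield fewer groups; fill up with empty shards so the
    // number of shards is always the same.
    parts.resize(shards, String::new());

    // Pad every shard to the width of the longest one.
    let width = parts.iter().map(|p| p.chars().count()).max().unwrap_or(0);
    for part in &mut parts {
        let missing = width - part.chars().count();
        part.extend(std::iter::repeat(pad).take(missing));
    }
    parts
}

fn main() {
    let shards = shard_and_pad("Summarize the patient history and list drug interactions", 3, '_');
    for (i, shard) in shards.iter().enumerate() {
        println!("shard {i}: {shard:?}");
    }
}
```

Each shard would then go to a different expert. Contiguous word groups still leak local context, so a real scheme would likely need decoy content or finer-grained splitting for sensitive workloads.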

Benchmarking

Run the built-in benchmark suite:

./target/release/daemon benchmark --iterations 100

Measures: embedding latency, SHA-256 hashing, JSON serialization, LSH bitmask matching, Ed25519 signing — with P50/P95/P99 statistics.
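
For readers unfamiliar with the reported statistics, the following sketch shows how P50/P95/P99 can be derived from raw latency samples using the nearest-rank method. It is not taken from `benchmark.rs`; the function names and the choice of nearest-rank (rather than interpolation) are assumptions.

```rust
/// Nearest-rank percentile over an ascending-sorted slice.
/// Returns `None` for an empty slice or a percentile outside 0..=100.
fn percentile(sorted: &[f64], p: f64) -> Option<f64> {
    if sorted.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    // Rank is ceil(p * n / 100), clamped so that p = 0 maps to the first sample.
    let rank = (p * sorted.len() as f64 / 100.0).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}

/// Sorts the samples in place and returns (P50, P95, P99).
fn summarize(samples: &mut [f64]) -> Option<(f64, f64, f64)> {
    samples.sort_by(|a, b| a.total_cmp(b));
    Some((
        percentile(samples, 50.0)?,
        percentile(samples, 95.0)?,
        percentile(samples, 99.0)?,
    ))
}

fn main() {
    // Hypothetical latencies in milliseconds, one per benchmark iteration.
    let mut latencies: Vec<f64> = (1..=100).rev().map(|ms| ms as f64).collect();
    if let Some((p50, p95, p99)) = summarize(&mut latencies) {
        println!("P50 = {p50} ms, P95 = {p95} ms, P99 = {p99} ms");
    }
}
```

With 100 iterations, P99 is determined by the second-slowest sample under this method, so tail percentiles only become stable with larger iteration counts.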


Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Documentation

| Document | Description |
|---|---|
| Documentation Index | Reading order and navigation guide for all documentation |
| Getting Started | End-to-end tutorial: install, build, run, query the swarm |
| The Big Picture | Why decentralized swarm inference matters; the vision and roadmap |
| Use Cases | Practical scenarios with architecture diagrams and code examples |
| Sharing Models | Step-by-step guide to contributing your model to the swarm |
| Architecture | Deep technical dive into each component |
| API Reference | Full REST, gRPC, WebSocket, and SDK documentation |
| Configuration | Configuration file reference and examples |
| Linux Server Setup | Step-by-step guide for deploying on Linux servers |
| Windows Setup | Prerequisites, build, and service setup for Windows |
| macOS Setup | Homebrew, Metal GPU acceleration, and launchd service setup |
| Troubleshooting | Common issues, error messages, and fixes |
| Debugging | Log levels, diagnostics, and debug workflows |
| FAQ | Frequently asked questions |
| Desktop GUI | Tauri v2 + Svelte 5 desktop and mobile (Android/iOS) application |
| CHANGELOG | Release history and version changes |

License

Democritus is open-source software licensed under the GNU Affero General Public License v3 (AGPLv3).

Why AGPLv3?

Democritus is built for and with the community. We believe in keeping emergent decentralized intelligence open, collaborative, and non-profit. The AGPLv3 ensures that if anyone modifies this software or runs it as a service over a network (e.g., hosting a cloud gateway or P2P node routing), they must also share their modifications and source code with the community under the same copyleft terms.

See the LICENSE file for the full terms and conditions.


Acknowledgments
