MLE — Morpho-Logic Engine
A novel energy-based reasoning architecture: CPU-native, gradient-free, and built on hyperdimensional computing.
███╗   ███╗██╗     ███████╗
████╗ ████║██║     ██╔════╝
██╔████╔██║██║     █████╗       Morpho-Logic Engine v0.1.0
██║╚██╔╝██║██║     ██╔══╝       Energy-Based Reasoning AI
██║ ╚═╝ ██║███████╗███████╗
╚═╝     ╚═╝╚══════╝╚══════╝
🧠 What is MLE?
MLE is a new class of reasoning engine that replaces neural network backpropagation with energy-based dynamics operating on hyperdimensional binary vectors. It draws from:
- Kanerva's Sparse Distributed Memory — memory indexed by proximity in Hamming space
- Holographic Reduced Representations (Plate 1995) — circular convolution for semantic binding
- Modern Hopfield Networks (Ramsauer et al. 2020) — energy-based pattern completion
- Binary Spatter Codes — ultra-fast XOR binding for CPU-native computation
The result is a system that can reason about concepts, solve analogies, compose meanings, and traverse knowledge graphs — all without GPU, without gradients, without training — using pure bitwise operations optimized for CPU SIMD instructions.
🏗️ Architecture
┌─────────────┐    ┌──────────────┐    ┌──────────────┐    ┌─────────────┐
│    Query    │───▶│   Routing    │───▶│   Binding    │───▶│   Energy    │
│   Encoder   │    │  (JIT Beam)  │    │  (Compose)   │    │   (Relax)   │
│             │    │   Top-500    │    │  XOR / FFT   │    │  Hopfield + │
│  str→4096b  │    │  LSH+Expand  │    │  Conv Circ.  │    │  Bit-flip   │
└─────────────┘    └──────────────┘    └──────────────┘    └─────────────┘
       │                                                          │
       │           ┌──────────────┐    ┌──────────────┐           │
       └───────────│   Response   │◀───│    Decode    │◀──────────┘
                   │              │    │  NN + Role   │
                   └──────────────┘    └──────────────┘
5 Modules
| Module | File | Description |
|---|---|---|
| memory | sparse_address_table.py | Sparse Address Table: 4096-bit binary vectors, SoA layout, LSH index (32 tables × 8-bit signatures), multi-probe search, activation tracking |
| routing | recursive_jit_router.py | Recursive JIT Routing: LSH init → beam-500 refinement → neighbor expansion → convergence check. Multi-hop chaining for chain-of-thought |
| binding | semantic_binding.py | Dual binding: Binary (XOR) O(N/64) exact recovery + HRR (FFT) O(N log N) approximate. Triple encoding, analogy queries, sequence binding via permutation |
| energy | energy_model.py | Composite energy: compatibility + binding coherence + sparsity + smoothness. Hopfield (continuous, attention-based) + binary relaxation (simulated annealing). Hybrid mode for best results |
| inference | reasoning_engine.py | Full pipeline: encode → route → bind → relax → decode. Association, analogy, composition, structured queries. Multi-step reasoning with convergence detection |
SIMD-Optimized Core (utils/simd_ops.py)
All core operations are backed by a GCC-compiled C library with -march=native for automatic SIMD vectorization:
// Compiles to AVX-512 VPOPCNTQ or AVX2 POPCNT automatically
int hamming_single(const uint64_t *a, const uint64_t *b, int n) {
int cnt = 0;
for (int i = 0; i < n; i++)
cnt += __builtin_popcountll(a[i] ^ b[i]);
return cnt;
}
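Python can consume such a kernel through ctypes. The loader sketch below is illustrative only: the shared-object name, the build command, and the wrapper function are assumptions, not the project's actual build script or simd_ops.py API.
import ctypes
import numpy as np

# Assumed build step (hypothetical): gcc -O3 -march=native -shared -fPIC simd_ops.c -o libsimd_ops.so
lib = ctypes.CDLL("./libsimd_ops.so")
lib.hamming_single.restype = ctypes.c_int
lib.hamming_single.argtypes = [ctypes.POINTER(ctypes.c_uint64),
                               ctypes.POINTER(ctypes.c_uint64),
                               ctypes.c_int]

def hamming_c(a: np.ndarray, b: np.ndarray) -> int:
    """Hamming distance between two packed 4096-bit vectors (64 uint64 words each)."""
    pa = a.ctypes.data_as(ctypes.POINTER(ctypes.c_uint64))
    pb = b.ctypes.data_as(ctypes.POINTER(ctypes.c_uint64))
    return lib.hamming_single(pa, pb, a.size)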
| Operation | Throughput | Notes |
|---|---|---|
| Hamming distance (single) | ~100M ops/s | 64 × POPCNT per pair |
| Hamming batch (100K vectors) | 25-41M vecs/s | Vectorized XOR + popcount |
| Top-500 selection | O(N log K) | Max-heap in C |
| Binary bind (XOR) | ~95K ops/s | 64 × XOR per op |
| HRR bind (FFT) | ~10K ops/s | numpy.fft.rfft |
Fallback: Pure NumPy LUT-based popcount when GCC isn't available — portable across all platforms.
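For reference, a LUT-based popcount fallback can be as small as the sketch below (illustrative only; the actual fallback in utils/simd_ops.py may differ in details).
import numpy as np

# 256-entry lookup table: popcount of every possible byte value
_POPCNT_LUT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)

def hamming_numpy(a: np.ndarray, b: np.ndarray) -> int:
    """Pure-NumPy Hamming distance: XOR the uint64 words, reinterpret as bytes, sum per-byte popcounts."""
    return int(_POPCNT_LUT[np.bitwise_xor(a, b).view(np.uint8)].sum())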
📐 Core Concepts
4096-bit Binary Vectors
Every concept, relation, and memory address is a 4096-bit binary vector stored as 64 × uint64 words (512 bytes):
# Each vector: 4096 bits = 512 bytes = 64 × uint64
# Storage layout: Structure of Arrays (SoA) for cache locality
# addresses: (N, 64) uint64 — contiguous, cache-aligned
# contents: (N, 64) uint64 — separate for SIMD batch ops
Key property: Random vectors have Hamming distance ≈ 2048 (50%). Semantic similarity is encoded as deviation from this baseline. Vectors with distance << 2048 are "similar"; vectors at ~2048 are orthogonal.
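This concentration of distances is easy to check with plain NumPy (standalone sketch, independent of the package API):
import numpy as np

rng = np.random.default_rng(0)

def random_hv():
    """Random 4096-bit hypervector: 4096 random bits packed into 64 uint64 words (512 bytes)."""
    return np.packbits(rng.integers(0, 2, size=4096, dtype=np.uint8)).view(np.uint64)

def hamming(a, b):
    return int(np.unpackbits(np.bitwise_xor(a, b).view(np.uint8)).sum())

a, b = random_hv(), random_hv()
print(hamming(a, b))  # concentrates around 2048: unrelated vectors are near-orthogonal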
Sparse Address Table
Memory entries are indexed by binary vectors. Access uses Hamming distance as the proximity metric:
from mle import SparseAddressTable
sat = SparseAddressTable(capacity=100_000)
sat.store(address_vec, content_vec, metadata={'name': 'cat'})
results = sat.query_nearest(query_vec, k=10) # [(index, distance), ...]
LSH Index: 32 hash tables with 8-bit random-bit-sampling signatures. Multi-probe search (1-bit and 2-bit flips) for high recall. Sub-linear search time for large memories.
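Schematically, one table's signature and its 1-bit multi-probe neighborhood look like the sketch below (illustrative; the exact bucket layout in HammingLSH may differ).
import numpy as np

rng = np.random.default_rng(42)

# One of the 32 hash tables: 8 randomly chosen bit positions out of 4096
positions = rng.choice(4096, size=8, replace=False)

def signature(vec_words: np.ndarray) -> int:
    """8-bit signature: read 8 fixed bit positions from a packed 4096-bit vector."""
    bits = np.unpackbits(vec_words.view(np.uint8))
    return int(np.packbits(bits[positions])[0])  # 8 sampled bits -> one byte

def probes(sig: int):
    """Multi-probe (1-bit flips shown): the exact bucket plus every bucket one signature bit away."""
    yield sig
    for i in range(8):
        yield sig ^ (1 << i)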
Binding Operations
Binary (XOR): bind(A, B) = A ⊕ B. Self-inverse (exact recovery), quasi-orthogonal to inputs, O(N/64).
HRR (FFT): bind(A, B) = IFFT(FFT(A) · FFT(B)). Circular convolution, approximate recovery, similarity-preserving, O(N log N).
from mle.binding import BinaryBinding
# Encode: "king IS_A man"
triple = BinaryBinding.encode_triple(king_vec, is_a_vec, man_vec)
# Decode: recover man from triple
decoded = BinaryBinding.unbind(BinaryBinding.unbind(triple, king_vec), is_a_vec)
# decoded == man_vec (exact with XOR!)
# Analogy: king:man :: queen:?
query = BinaryBinding.create_analogy_query(king_vec, man_vec, queen_vec)
# query ≈ woman_vec (find nearest in codebook)
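For the HRR side, a minimal real-valued sketch with NumPy's FFT is shown below (illustrative; the HRRBinding class may handle normalization and inversion differently). Unbinding uses Plate's involution as an approximate inverse, so recovery is similar but not exact.
import numpy as np

def hrr_bind(a, b):
    """Circular convolution via FFT: bind(A, B) = IFFT(FFT(A) · FFT(B))."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))

def hrr_unbind(c, a):
    """Approximate recovery: convolve with the involution of A (Plate's approximate inverse)."""
    a_inv = np.concatenate(([a[0]], a[1:][::-1]))
    return hrr_bind(c, a_inv)

dim = 4096
rng = np.random.default_rng(0)
a = rng.normal(0, 1 / np.sqrt(dim), dim)
b = rng.normal(0, 1 / np.sqrt(dim), dim)
b_hat = hrr_unbind(hrr_bind(a, b), a)
print(np.dot(b_hat, b) / (np.linalg.norm(b_hat) * np.linalg.norm(b)))  # high cosine similarity, but not exact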
Energy-Based Reasoning
No backpropagation. No gradients. Reasoning is energy minimization:
E(state) = α·E_compat + β·E_binding + γ·E_sparse + δ·E_smooth
| Component | Formula | Purpose |
|---|---|---|
| Compatibility | -Σ wᵢ · sim(state, contextᵢ) | State agrees with activated memories |
| Binding coherence | Σ hamming(unbind(bᵢ, rᵢ), fᵢ) / N | Stored relations remain intact |
| Sparsity | ‖activations‖₁ | Focused, not diffuse, activation |
| Smoothness | hamming(current, previous) / N | Stable reasoning trajectory |
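Read as code, the composite energy is roughly the sketch below (weights, shapes, and helper names are illustrative, not the EnergyFunction API):
import numpy as np

def hamming(a, b):
    return int(np.unpackbits(np.bitwise_xor(a, b).view(np.uint8)).sum())

def composite_energy(state, contexts, weights, bindings, prev_state,
                     alpha=1.0, beta=1.0, gamma=0.1, delta=0.1, n_bits=4096):
    """E = α·E_compat + β·E_binding + γ·E_sparse + δ·E_smooth (illustrative)."""
    # Compatibility: reward agreement with activated memories
    sims = np.array([1.0 - hamming(state, c) / n_bits for c in contexts])
    e_compat = -float(np.dot(weights, sims))
    # Binding coherence: XOR-unbinding each stored triple should reproduce its filler
    e_binding = sum(hamming(np.bitwise_xor(bound, role), filler) / n_bits
                    for bound, role, filler in bindings)
    # Sparsity: prefer a few strong activations over many weak ones
    e_sparse = float(np.abs(weights).sum())
    # Smoothness: penalize large jumps between consecutive reasoning steps
    e_smooth = hamming(state, prev_state) / n_bits
    return alpha * e_compat + beta * e_binding + gamma * e_sparse + delta * e_smooth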
Two-phase minimization:
- Hopfield update: ξ_new = X @ softmax(β · X^T @ ξ) — fast coarse convergence via attention over patterns
- Binary relaxation: bit-flip search with simulated annealing — fine discrete refinement (both phases are sketched below)
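A minimal NumPy sketch of both phases (illustrative; the temperatures, schedules, and actual energy_model.py internals may differ):
import numpy as np

def hopfield_step(xi, X, beta=8.0):
    """Modern Hopfield update ξ_new = X @ softmax(β · X^T @ ξ): attention over the stored patterns (columns of X)."""
    scores = beta * (X.T @ xi)            # similarity of the current state to each pattern
    attn = np.exp(scores - scores.max())
    attn /= attn.sum()                    # softmax over patterns
    return X @ attn                       # attention-weighted mixture of patterns

def binary_relax(bits, energy_fn, steps=2000, t0=1.0, seed=0):
    """Simulated-annealing bit-flip refinement of a {0,1} state vector."""
    rng = np.random.default_rng(seed)
    e = energy_fn(bits)
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9  # linear cooling schedule
        i = rng.integers(len(bits))
        bits[i] ^= 1                             # propose a single bit flip
        e_new = energy_fn(bits)
        if e_new <= e or rng.random() < np.exp((e - e_new) / temp):
            e = e_new                            # accept (downhill always, uphill with annealed probability)
        else:
            bits[i] ^= 1                         # reject: undo the flip
    return bits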
🚀 Quick Start
from mle import MorphoLogicEngine
# Initialize
engine = MorphoLogicEngine(beam_width=500, energy_mode='hybrid')
# Build knowledge
engine.add_concept("cat")
engine.add_concept("dog")
engine.add_concept("animal")
engine.add_relation("cat", "is_a", "animal")
engine.add_relation("dog", "is_a", "animal")
# Reason
result = engine.reason("cat", max_steps=3)
print(result['response']['nearest_concepts'])
# → [('cat', 0.99), ('animal', 0.75), ...]
# Associations
assocs = engine.associate("cat", top_k=5)
# → [('cat_is_a_animal', 0.74), ('dog', 0.52), ...]
# Analogy: king:man :: queen:?
analogy = engine.solve_analogy("king", "man", "queen")
print(analogy['codebook_ranking'][:3])
# Composition: water + animal → ?
comp = engine.compose("water", "animal")
print(comp['response']['nearest_concepts'][:3])
📊 Benchmarks
Measured on a 2-vCPU machine (cpu-basic), single-threaded:
SIMD Throughput
| Corpus Size | Batch Hamming | Top-500 |
|---|---|---|
| 1,000 | 0.04ms (28M/s) | 0.06ms |
| 10,000 | 0.29ms (35M/s) | 0.32ms |
| 100,000 | 4.56ms (22M/s) | 4.79ms |
Routing Latency
| Memory Size | Avg Latency | P99 | Candidates |
|---|---|---|---|
| 1,000 | 3.8ms | 5.4ms | 953 |
| 10,000 | 2.5ms | 3.2ms | 3,335 |
| 50,000 | 2.7ms | 3.5ms | 2,679 |
Memory Efficiency
- 1,024 bytes/entry (512 address + 512 content)
- 1,000 entries = 1 MB
- 100,000 entries = 100 MB
Binding Performance
- Binary (XOR): 95,000 ops/sec
- HRR (FFT): 10,500 ops/sec
🧪 Tests
pip install numpy scipy
python -m mle.tests.test_full_system
7/7 test groups passing:
- ✅ SIMD Operations (correctness + performance)
- ✅ Memory & LSH (storage, retrieval, 100% cluster recall)
- ✅ Routing (beam width, convergence, scalability)
- ✅ Binding (XOR exact recovery, HRR approximate recovery, triple encoding)
- ✅ Energy Convergence (monotonic decrease, Hopfield attention concentration)
- ✅ Reasoning (association, query, analogy, composition, structured queries)
- ✅ Integration (500+ concept KB, batch queries, memory efficiency)
🎯 Demo
python -m mle.demo
Runs a full demonstration with 40+ concepts, 42 relations, and tests for concept queries, associations, analogies, compositions, structured queries, and multi-step reasoning.
📁 Project Structure
mle/
├── __init__.py # Package init, public API
├── demo.py # Interactive demonstration
├── utils/
│ ├── __init__.py
│ └── simd_ops.py # SIMD C library + NumPy fallback
├── memory/
│ ├── __init__.py
│ └── sparse_address_table.py # SparseAddressTable + HammingLSH
├── routing/
│ ├── __init__.py
│ └── recursive_jit_router.py # RecursiveJITRouter
├── binding/
│ ├── __init__.py
│ └── semantic_binding.py # HRRBinding + BinaryBinding + BindingEngine
├── energy/
│ ├── __init__.py
│ └── energy_model.py # EnergyFunction + Relaxation + Hopfield
├── inference/
│ ├── __init__.py
│ └── reasoning_engine.py # ReasoningEngine (full pipeline)
└── tests/
├── __init__.py
└── test_full_system.py # Comprehensive test suite
🔬 Theoretical Foundations
| Paper | Contribution to MLE |
|---|---|
| Kanerva (1988) "Sparse Distributed Memory" | Binary vector addressing, Hamming distance proximity |
| Plate (1995) "Holographic Reduced Representations" | Circular convolution binding, FFT implementation |
| Gayler (2003) "Vector Symbolic Architectures" | XOR binding (BSC), majority-vote bundling |
| Ramsauer et al. (2020) "Hopfield Networks Is All You Need" | Modern Hopfield energy, exponential capacity, attention ≡ update rule |
| Frady et al. (2021) "SDM and Transformers" | SDM Hamming threshold ≈ transformer attention |
| Thomas et al. (2023) "Efficient HDC with Static Optimization" | Optimal BSC dimensions, analytical thresholds |
| Langford et al. (2024) "Linear Codes for HDC" | GF(2) factorization, 100% XOR recovery |
🛤️ Roadmap
- Persistent storage: Serialize memory to disk (mmap for instant loading)
- Learned embeddings: Pre-encode concepts from text corpora (word2vec → binary projection)
- Multi-threaded SIMD: Parallel batch Hamming with OpenMP
- Graph walk reasoning: Follow relation chains for multi-hop inference
- Incremental learning: Hebbian-style weight updates from experience
- Benchmark suite: Standardized reasoning tasks (bAbI, CLUTRR, etc.)
📜 License
MIT
🙏 Acknowledgments
Inspired by the vision of frugal, explainable AI that reasons rather than retrieves. Built on decades of research in hyperdimensional computing, energy-based models, and sparse distributed memory.