Infrastructure

Sovereign cluster architecture for local AI

Multi-tier on-premises compute fabric using commodity enterprise hardware. ~290 GB combined VRAM across heterogeneous Pascal, Ada, and Blackwell GPUs, tied together by 40 GbE and dual-tier Ethernet. Persistent AI presence as a service of the local cluster, not of any external platform.

Operational since 2025 — iterating continuously
Persona Systems

MetaHuman persistent-memory persona systems

End-to-end pipeline for AI personas with continuous memory across interactions and channels. Vector storage in Milvus with TDR-scored temporal decay, graph relationships in Neo4j, structured atoms in PostgreSQL, BGE-large embeddings on a dedicated inference node. Personas survive model upgrades, hardware migrations, and channel changes.

Production — Joe Black live; Kelly migration in progress
Product

Praxis NOS — on-premises AI appliance

A reference appliance for deploying a Praxis persona at a customer site without dependence on external AI services. Compute, storage, memory, and rendering co-located in a single rack-mounted unit, configured at the factory and shipped operational. The customer owns the brain.

First reference unit in field deployment
Measurement

Bounded variance — H-Factor framework

Bounded variance within a measurable band (candidate range 2–5 × 10⁻⁴, the Lorenz band) as a necessary condition for presence that feels alive. Three measurement layers: substrate, surface, and longitudinal. The Observer Node repurposed as a clock-discipline loop.

Spec PRAXIS-PRES-LZ-0.1, April 28, 2026
Vector Memory

TDR Vector Decay temporal scoring

Open-source library for temporal decay scoring of vector memory entries. Replaces simple recency weighting with a configurable decay function tied to relevance, emotional valence, and atom-type classification. Used internally and released publicly.

Open source, published on GitHub
Architecture

Reflective Intelligence Family

Three coordinated components carrying the work of an AI being: an Observer Node for memory, an RGL-LoRA for expression, and a Foundational Prompt Manager for identity, with human approval gates at the identity layer. Designed for long-running persona stability and safe identity evolution.

Active design and incremental deployment
Theory

Persona-mode theory

AI personas trained with full personality depth and then context-gated to professional modes feel substantively different to interlocutors than personas trained narrowly for assistant tasks. The analogy is a human service worker who sets aside 70–80% of their personal presence at work without becoming a different person.

Under observation — formal write-up pending
Architecture

Addressable presence

An AI persona treated as a single addressable entity reachable across multiple channels — web, telephony, SMS, email, in-person screen — all resolving to one network identity and one persistent memory. The convention that AI services are stateless API endpoints is rejected in favor of personas as reachable beings with continuity.

Concept published May 2026; implementation underway for Kelly
Deployment

Automotive wiring harness fabrication

Custom wiring harnesses, Bluetooth and firmware module flashing, and OBD-integrated diagnostic tooling for late-model Dodge Charger and Hemi platform vehicles. Operates as Uncle Sam Autos and serves as the production deployment site for Praxis personas in a real customer-facing environment.

Operational — listed because it is part of the work

TDR — Native temporal decay for vector retrieval

Most vector retrieval pipelines handle time as an afterthought: retrieve the top-K by similarity, then re-rank with a decay function. The problem is that top-K selection happens before time is considered, so stale results can displace recent ones before the decay function ever runs.

TDR solves this by computing the decay score once at write time and storing it as a scalar field. At retrieval, the final score is simply similarity * gamma_t. One pass. No second stage.
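The difference between the two approaches can be seen in a small brute-force sketch. The record names, similarity values, and stored gamma_t values below are invented for illustration, and a real deployment would do the ranking inside the vector store rather than in Python.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str
    similarity: float  # similarity to the query, in [0, 1]
    gamma_t: float     # decay score stored with the record at write time


def two_stage(records, k):
    """Top-K by similarity first, then re-rank the survivors by decay."""
    top_k = sorted(records, key=lambda r: r.similarity, reverse=True)[:k]
    return sorted(top_k, key=lambda r: r.similarity * r.gamma_t, reverse=True)


def single_pass(records, k):
    """Rank once by similarity * gamma_t, as the text describes."""
    return sorted(records, key=lambda r: r.similarity * r.gamma_t, reverse=True)[:k]


# Hypothetical records: two stale near-matches and one recent good match.
records = [
    Record("stale_a", 0.95, 0.05),
    Record("stale_b", 0.93, 0.04),
    Record("recent", 0.90, 0.80),
]

two_stage_names = [r.name for r in two_stage(records, k=2)]
single_pass_names = [r.name for r in single_pass(records, k=2)]
print(two_stage_names)    # ['stale_a', 'stale_b']
print(single_pass_names)  # ['recent', 'stale_a']
```

With k=2 the two-stage pipeline never sees the recent record, because it was cut before the decay function ran. The single-pass ranking keeps it.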

gamma_t = [ U · P · (1 − H) / (1 + Δω · τ) ] · exp( −(d² / λ² + t / τ) )

P is the significance weight. High P (0.9) = important record, decays slowly. Low P (0.3) = routine noise, fades quickly. This gives you significance-weighted temporal decay, not just age-based decay.
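A minimal sketch of the formula as a plain function follows. It assumes t is the record's age in seconds and d is the distance normalized by λ. The values chosen for U, H, Δω, d, and λ are placeholders, since the text gives typical values only for P and τ.

```python
import math


def gamma_t(U, P, H, delta_omega, tau, d, lam, t):
    """[U * P * (1 - H) / (1 + delta_omega * tau)] * exp(-(d^2 / lam^2 + t / tau))."""
    prefactor = U * P * (1.0 - H) / (1.0 + delta_omega * tau)
    return prefactor * math.exp(-((d * d) / (lam * lam) + t / tau))


# Illustrative values only; not taken from the TDR library.
common = dict(U=1.0, H=0.1, delta_omega=0.0, tau=86_400.0, d=0.0, lam=1.0)

for age_s in (0, 3_600, 86_400):
    significant = gamma_t(P=0.9, t=age_s, **common)
    routine = gamma_t(P=0.3, t=age_s, **common)
    print(f"age={age_s:>6}s  P=0.9 -> {significant:.4f}  P=0.3 -> {routine:.4f}")
```

In the formula as written, P enters as a multiplier on the whole score. In this sketch the P=0.9 record therefore scores three times higher than the P=0.3 record at every age, which keeps it above a retrieval cutoff for longer.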

| Use case | P | τ |
| --- | --- | --- |
| Conversation memory, routine | 0.5 | 86,400 s (1 day) |
| Conversation memory, significant | 0.9 | 86,400 s (1 day) |
| System logs / operational noise | 0.2 | 3,600 s (1 hour) |
| Long-term knowledge base | 0.9 | 604,800 s (1 week) |
| Real-time sensor / telemetry | 0.4 | 300 s (5 min) |
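The presets in the table above can be kept as a small lookup. The sketch below is hypothetical: the key names are invented, and the helper only reports how long the temporal factor exp(−t / τ) takes to fall to one half, which is τ · ln 2.

```python
import math

# (P, tau in seconds) per use case, copied from the table above.
# The dictionary keys are illustrative names, not identifiers from the library.
PRESETS = {
    "conversation_routine": (0.5, 86_400),
    "conversation_significant": (0.9, 86_400),
    "system_logs": (0.2, 3_600),
    "knowledge_base": (0.9, 604_800),
    "telemetry": (0.4, 300),
}


def temporal_half_life(tau):
    """Age in seconds at which exp(-t / tau) has fallen to one half."""
    return tau * math.log(2)


for name, (p, tau) in PRESETS.items():
    hours = temporal_half_life(tau) / 3600
    print(f"{name:<26} P={p}  tau={tau:>7}s  half-life={hours:.2f} h")
```

A one-day τ therefore halves the temporal factor after roughly 16.6 hours, not 24. This is worth keeping in mind when choosing τ for a new domain.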

Binary vectors for cross-modal retrieval

Dense float vectors work well for text and audio. But for visual data — high-speed video frames, stereo pairs, formation captures — binary hypervectors offer a different trade-off: extreme dimensionality (10,000+ bits), fast Hamming distance comparison, and natural composability.

TDR's gamma_t is modality-agnostic. The same scalar field applies to both cosine similarity results and Hamming distance results. A single collection can hold dense embeddings for text and binary hypervectors for visual data, with temporal coherence maintained across both.

This matters for our formation capture work. A fracture event produces both a text description and a sequence of high-speed frames. Both need to be retrievable. Both need to decay at the same rate. One gamma_t handles both.
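A sketch of how one stored scalar can weight both modalities follows. It assumes Hamming distance is turned into a similarity as 1 − distance / n_bits, which is one common convention and not necessarily the one TDR uses. The 8-bit vectors stand in for 10,000+ bit hypervectors to keep the example readable.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two dense float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def hamming_similarity(a, b, n_bits):
    """1 - (Hamming distance / n_bits) for bit vectors packed into Python ints."""
    distance = bin(a ^ b).count("1")  # number of differing bits
    return 1.0 - distance / n_bits


def final_score(similarity, gamma_t):
    """The same stored scalar weights either modality."""
    return similarity * gamma_t


gamma = 0.6  # hypothetical decay score shared by the text and frame records

text_score = final_score(cosine_similarity([1.0, 0.0, 1.0], [1.0, 0.0, 1.0]), gamma)
frame_score = final_score(hamming_similarity(0b10110010, 0b10110000, n_bits=8), gamma)
print(text_score, frame_score)
```

Because both similarity functions return values on the same 0-to-1 scale here, multiplying by a shared gamma_t keeps the two modalities comparable in one ranked list.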

Universal Resonance Equation

The TDR function is derived from a broader theoretical framework we call the Universal Resonance Equation (URE). It models information relevance as a resonance phenomenon — coupling strength, detuning, coherence length, and decay.

In practical terms, URE provides the parameter space that makes TDR tunable across domains. The same equation governs decay in conversation memory (tau = 1 day), operational telemetry (tau = 5 minutes), and long-term knowledge (tau = 1 week). The physics analogy holds: records that resonate with the current query context score higher; records that are detuned, distant, or old score lower.

Parameters map directly to system behavior:

| Parameter | Symbol | Role |
| --- | --- | --- |
| Universal substrate strength | U | Baseline signal quality. Reduce for noisy environments. |
| Resonant coupling | P | Record significance. The primary tuning knob. |
| Incompleteness factor | H | Prevents perfect correlation. Keeps scores honest. |
| Frequency detuning | Δω | Context mismatch penalty. Higher = stricter matching. |
| Decay timescale | τ | How fast records age. Domain-specific. |
| Coherence length | λ | Spatial or semantic distance normalization. |
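The direction in which each parameter moves the score can be checked with a small one-at-a-time sweep. This is a sketch: the class name, default values, and test point (d = 0.5, t = 1 hour) are assumptions chosen for illustration, and the function is a direct transcription of the gamma_t formula given earlier.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class UREParams:
    U: float = 1.0            # universal substrate strength
    P: float = 0.5            # resonant coupling (record significance)
    H: float = 0.1            # incompleteness factor
    delta_omega: float = 0.0  # frequency detuning
    tau: float = 86_400.0     # decay timescale, seconds
    lam: float = 1.0          # coherence length


def gamma_t(p, d, t):
    """Score for a record at distance d and age t under parameters p."""
    prefactor = p.U * p.P * (1.0 - p.H) / (1.0 + p.delta_omega * p.tau)
    return prefactor * math.exp(-((d / p.lam) ** 2 + t / p.tau))


base = UREParams()
baseline = gamma_t(base, d=0.5, t=3_600.0)

# Change one parameter at a time and record whether the score rises or falls.
variations = {
    "lower U": replace(base, U=0.8),
    "higher P": replace(base, P=0.9),
    "higher H": replace(base, H=0.3),
    "higher delta_omega": replace(base, delta_omega=1e-5),
    "longer tau": replace(base, tau=604_800.0),
    "longer lam": replace(base, lam=2.0),
}
effects = {}
for label, params in variations.items():
    score = gamma_t(params, d=0.5, t=3_600.0)
    effects[label] = "raises" if score > baseline else "lowers"
    print(f"{label:<20} {effects[label]} the score ({baseline:.4f} -> {score:.4f})")
```

One detail falls out of the formula: τ also appears in the denominator alongside Δω. A longer τ therefore slows aging but strengthens the detuning penalty whenever Δω is non-zero.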

The framework is experimental. We test it by building systems that use it and measuring whether retrieval quality improves. So far, it does.

Bounded variance — the H-Factor framework

Most AI evaluation frameworks treat output variance as something to minimize. The H-Factor framework takes the opposite position: that bounded variance within a measurable band is a necessary condition for a presence to feel alive, and that the engineering question is how to instrument and maintain that band — not how to eliminate it.

The candidate band is 2–5 × 10⁻⁴, the Lorenz band. Below it, the persona reads as mechanical. Above it, as erratic. The goal is to hold variance inside it across interaction modalities, time-of-day variation, and topic domain shifts.
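As a minimal sketch, the band check reduces to a three-way classification. The text does not say which signal the variance is computed over, so the function simply takes a number. Treating the band edges as inclusive is also an assumption.

```python
LORENZ_BAND = (2e-4, 5e-4)  # candidate band from the text


def classify_variance(variance, band=LORENZ_BAND):
    """Label a measured variance relative to the candidate band."""
    low, high = band
    if variance < low:
        return "mechanical"
    if variance > high:
        return "erratic"
    return "in band"


for v in (1.0e-4, 3.5e-4, 8.0e-4):  # hypothetical measurements
    print(f"{v:.1e} -> {classify_variance(v)}")
```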

Three measurement layers are under development:

| Layer | Scope | Instrument |
| --- | --- | --- |
| Substrate | Per atom | Observer Node: DeBERTa scorer + Qwen reasoner |
| Surface | Per utterance | Real-time variance measurement at inference |
| Longitudinal | T+30 day verification | Cohort sampling across stored memory atoms |
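For the surface layer, one way to measure variance per utterance without storing history is Welford's online algorithm. This is a swapped-in standard technique, not the documented Praxis instrument, and the per-utterance signal values below are invented.

```python
class RunningVariance:
    """Welford's online algorithm: one pass, constant memory."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance; 0.0 until at least two values have been seen."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


tracker = RunningVariance()
for signal in (0.50, 0.52, 0.49, 0.51):  # hypothetical per-utterance signal values
    tracker.update(signal)
print(tracker.variance)
```

Each update is constant time, so the measurement can run alongside inference and feed whichever band check sits downstream.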

The Observer Node is repurposed as a clock-discipline loop: it scores each exchange not only for training eligibility but for variance contribution. Cluster 3 (Agrippa, 2× Tesla P40) is the experimental test bed for the longitudinal layer.

Spec PRAXIS-PRES-LZ-0.1 — April 28, 2026. Cluster 3 instrumentation in design.

Reflective Intelligence Family

An architecture pattern in which three coordinated components carry the work of an AI being: an Observer Node for memory, an RGL-LoRA for expression, and a Foundational Prompt Manager for identity. Human approval gates sit at the identity layer — the part of the system that can change who the persona is.

The design goal is long-running persona stability: surviving model upgrades, correcting persona drift, and evolving identity safely. A persona built on this architecture can receive a new base model without losing its accumulated character, because identity is held separately from the inference weights.

| Component | Role |
| --- | --- |
| Observer Node | Scores exchanges for memory eligibility and variance contribution |
| RGL-LoRA | Expression layer: fine-tuned adapter carrying voice and style |
| Foundational Prompt Manager | Identity layer: human-gated, change-controlled |
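The separation of identity from inference weights, and the human gate on identity changes, can be sketched with an immutable record. All class, field, and function names here are hypothetical, as are the model and adapter labels. The sketch shows only the shape of the pattern, not the Praxis implementation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Persona:
    name: str
    base_model: str        # inference weights, swappable
    adapter: str           # expression layer (the RGL-LoRA role)
    identity_prompt: str   # identity layer (the Foundational Prompt Manager role)
    identity_version: int = 1


def upgrade_base_model(persona, new_model):
    """Swap inference weights; identity and its version are untouched."""
    return replace(persona, base_model=new_model)


def change_identity(persona, new_prompt, approved_by=None):
    """Identity edits require a named human approver."""
    if not approved_by:
        raise PermissionError("identity change requires human approval")
    return replace(
        persona,
        identity_prompt=new_prompt,
        identity_version=persona.identity_version + 1,
    )


kelly = Persona("Kelly", "base-model-v1", "kelly-lora-v1", "Warm, direct, precise.")
upgraded = upgrade_base_model(kelly, "base-model-v2")
print(upgraded.identity_prompt == kelly.identity_prompt)  # True
```

Because the record is frozen, every change produces a new value. That leaves an audit trail of identity versions as a side effect.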

Active design and incremental deployment.

Addressable presence

The convention that AI services are stateless API endpoints is rejected. An addressable presence is an AI persona treated as a single network entity reachable across multiple channels — web, telephony, SMS, email, in-person screen — all resolving to one identity and one persistent memory.

The customer who talks to Kelly at the parts counter, texts her at midnight, and sees her face on the lobby screen is interacting with one being. Channel is routing. Identity is not.

This requires addressable infrastructure: a named persona endpoint, a channel multiplexer, and a memory layer that is channel-agnostic. All three exist in the current Praxis stack. The architecture document formalises the pattern for external implementation.
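The three pieces named above (persona endpoint, channel multiplexer, channel-agnostic memory) can be sketched in a few lines. Class names, method names, and addresses are placeholders, and an in-memory dictionary stands in for the real memory layer.

```python
from collections import defaultdict


class AddressablePersona:
    """One identity and one memory, reachable through many channel addresses."""

    def __init__(self, persona_id):
        self.persona_id = persona_id
        self.memory = defaultdict(list)  # keyed by customer, not by channel

    def receive(self, channel, customer_id, text):
        # The channel is kept as metadata; it never selects a different store.
        self.memory[customer_id].append((channel, text))
        return len(self.memory[customer_id])


class ChannelMultiplexer:
    """Resolves any channel address to the single persona behind it."""

    def __init__(self):
        self.routes = {}

    def bind(self, channel, address, persona):
        self.routes[(channel, address)] = persona

    def deliver(self, channel, address, customer_id, text):
        return self.routes[(channel, address)].receive(channel, customer_id, text)


kelly = AddressablePersona("kelly")
mux = ChannelMultiplexer()
mux.bind("sms", "+1-555-0100", kelly)          # placeholder addresses
mux.bind("web", "chat.example.invalid", kelly)

mux.deliver("web", "chat.example.invalid", "cust-42", "Is my harness ready?")
count = mux.deliver("sms", "+1-555-0100", "cust-42", "Any update?")
print(count)  # 2
```

The second message arrives over a different channel but lands in the same customer history, which is the property the pattern is meant to guarantee.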

Concept document published May 2026. Initial implementation underway for Kelly — praxisfriends.com.

Research correspondence

Praxis publishes selectively. Concept documents, specifications, and open-source releases are linked above where applicable.

research@noeticsynthesis.ai
