Independent research lab

Intelligence, measured. Intelligence, embodied.

Deep Pearl AI studies two sides of the same question: how do we know an AI system actually works, and how do we build systems that keep working in the physical world, on real hardware, with real people depending on them?

Evaluation: Multivac & Multivac Physics
Edge intelligence: ALEX-1 & EMOTE4D
Status: Active research, live deployments

The work

Four ventures, one thesis

Frontier models are judged by benchmarks they can game, then deployed on hardware they were never built for. We work both ends of that gap. Each instrument below is a working model of its research — open the drawer to see it run.

Multivac

Paper under review

Fig. 1 · The peer matrix

Independent, blind evaluation of frontier language models. Fresh questions models haven't memorized, judged by a cross-family peer matrix — no vendor grades its own homework.

Open Multivac

Multivac Physics

Live

Fig. 2 · Sensitive dependence

Graduate-level physics problems through the same blind peer matrix — measuring whether models reason about the physical world or pattern-match around it.

Open Physics

ALEX-1

Active prototype

Fig. 3 · The membrane

A home assistant that runs on the device, keeps personal data local, and learns from everyday use — voice, emotion, and smart home control without a cloud dependency at its core.

Open ALEX-1

EMOTE4D

In pilot

Fig. 4 · Pixels stop here

On-device computer vision that detects falls in eldercare without sending video anywhere. Skeleton-only processing, sanity gates between every stage, live on Raspberry Pi hardware.

Open EMOTE4D

How we work

First principles, honestly reported

No one grades their own homework

Evaluations are blind and judged across model families. A benchmark's design shapes its rankings, so we design against our own bias first.
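
As a rough illustration of judging across model families, the sketch below averages each model's score over judges from other families only. It is a minimal sketch under stated assumptions, not the Multivac implementation: the function name `peer_scores`, the model and family labels, and the score values are all hypothetical.

```python
from statistics import mean


def peer_scores(scores, family):
    """Average each candidate's score over judges from other model families.

    scores[judge][candidate] is the grade a judge gave a candidate.
    family[model] is the model's family label (hypothetical here).
    """
    results = {}
    for candidate in family:
        eligible = [
            judge_scores[candidate]
            for judge, judge_scores in scores.items()
            if family[judge] != family[candidate] and candidate in judge_scores
        ]
        # No cross-family judge means no score, rather than a self-graded one.
        results[candidate] = mean(eligible) if eligible else None
    return results


# Made-up models and grades, for illustration only.
family = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}
scores = {
    "a1": {"a1": 10, "a2": 9, "b1": 4, "c1": 6},
    "a2": {"a1": 9, "a2": 10, "b1": 5, "c1": 6},
    "b1": {"a1": 6, "a2": 7, "b1": 10, "c1": 8},
    "c1": {"a1": 8, "a2": 5, "b1": 6, "c1": 10},
}
ranking = peer_scores(scores, family)
```

In this toy data, the high grades that `a1` and `a2` give each other never reach the result, because both belong to family `A`.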

Production hardware is the truth

A number measured in the lab is a hypothesis. Nothing counts until it holds on the device, in the room, under the lighting it will actually face.

Trust is engineered, not assumed

Compositional systems fail when one stage trusts another blindly. We put explicit sanity gates between stages and design honest failure modes.
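
One way to picture a sanity gate is a check that runs after each stage and stops the pipeline with a named failure instead of passing doubtful data downstream. The sketch below is illustrative only and assumes a toy two-stage pipeline; the stage names, thresholds, keypoint format, and the `Outcome` type are hypothetical and are not taken from EMOTE4D or ALEX-1.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Outcome:
    ok: bool
    value: Any = None
    failed_at: Optional[str] = None  # name of the stage whose gate rejected


def run_pipeline(stages, data):
    """Run (name, stage, gate) triples; stop at the first gate that rejects."""
    for name, stage, gate in stages:
        data = stage(data)
        if not gate(data):
            # Honest failure: report where it stopped, return no guess.
            return Outcome(ok=False, failed_at=name)
    return Outcome(ok=True, value=data)


def estimate_pose(frame):
    # Stand-in for a pose estimator: keep only confident keypoints.
    return [kp for kp in frame["keypoints"] if kp["conf"] >= 0.5]


def enough_keypoints(keypoints):
    return len(keypoints) >= 3


def classify(keypoints):
    # Toy rule: y is normalized height above the floor; low means "fall".
    mean_y = sum(kp["y"] for kp in keypoints) / len(keypoints)
    return "fall" if mean_y < 0.3 else "upright"


def known_label(label):
    return label in {"fall", "upright"}


STAGES = [
    ("pose", estimate_pose, enough_keypoints),
    ("classify", classify, known_label),
]

clear_frame = {"keypoints": [{"y": 0.1, "conf": 0.9},
                             {"y": 0.2, "conf": 0.8},
                             {"y": 0.15, "conf": 0.7}]}
dark_frame = {"keypoints": [{"y": 0.1, "conf": 0.9},
                            {"y": 0.2, "conf": 0.2},
                            {"y": 0.15, "conf": 0.1}]}
```

Calling `run_pipeline(STAGES, dark_frame)` stops at the `pose` gate, so the classifier never sees a skeleton built from one reliable keypoint.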

Research

What we're studying

Six threads, from 3D perception to brain-inspired learning to agentic control — each feeding back into the ventures.

Currently learning

representation learning
interpretability
adaptive computation
edge deployment
robotic integration
perception and 3D

Explore the research agenda