Consulting

An independent research group.

A number you cannot check is a liability waiting to be paid. We build systems whose every claim can be checked against ground truth: grounded in measurement, world-model based, and verification-aware, so the number you act on is one you can defend.

1. The Bet

Each approach below fails the same way: it sounds confident with nothing underneath it. Every one needs a local world model before you can stake a decision on what it says.

Market Context: Industry Shift to 'Features'

[ I ] Post-Training LLMs · RL Infrastructure
[ II ] Neuro-Symbolic (NeSy)
[ III ] Small Models (SLMs)
[ IV ] Causal AI · Scientific Reasoning

Industry wants specialized intelligence (coding, root cause analysis, and the like), not generic Q&A. That demand has produced four main paradigms, and each one needs local context: a world model.

The Bottleneck

The "Cold Start" Problem

Specialized answers and verification require structured knowledge, a world model. Lacking one is the terminal constraint on deployment.

Failure Mode: Path A

Theoretical Route · God Model / Universal World Model

f(m) = ∂V/∂x ...

Computationally intractable.

Failure Mode: Path B

Wiring Knowledge from Experience

ŷ = θ₀ + θ₁x ...

Bespoke services. "Consultancy".

Zetesis Resolution

Principled Calculus of Discovery

URT
Universal Process of Discovery
Automated Discovery
Verifiable, Adaptable Construction

2. The Solution: Intelligence Stack

Layer 1: World Model Building

Construction of World Model G. Internal mechanism is encapsulated.

Using: Universal Representation & Reasoning Theory (URT).

Layer 2: Planning & Execution

Neuro-symbolic RCA circuit. Ingests fault context and performs fault isolation.

RCA

Root cause analysis with ranked suspects and minimum next checks.

4M Framework

Man: Operator context
Machine: Telemetry / configs
Method: Process recipes
Material: Batch provenance

incident → candidate causes → ranked suspects → minimum checks → resolution

Outputs are tethered to evidence, not narrative fluency.
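
The incident-to-checks flow above can be sketched in a few lines. This is a minimal illustration, not the deployed RCA circuit: the cause names, check names, and probabilities are invented, and "minimum next check" is assumed here to mean the check with the lowest expected posterior entropy.

```python
import math

def posterior(prior, likelihood, outcome):
    """Bayes update: P(cause | outcome) is proportional to P(outcome | cause) * P(cause)."""
    unnorm = {c: prior[c] * likelihood[c][outcome] for c in prior}
    total = sum(unnorm.values())
    return {c: p / total for c, p in unnorm.items()}

def entropy(dist):
    """Shannon entropy in bits of a distribution over causes."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected_entropy_after(prior, likelihood):
    """Expected entropy over causes after running one check."""
    outcomes = next(iter(likelihood.values())).keys()
    total = 0.0
    for o in outcomes:
        p_o = sum(prior[c] * likelihood[c][o] for c in prior)
        if p_o > 0:
            total += p_o * entropy(posterior(prior, likelihood, o))
    return total

def rank_suspects(dist):
    """Suspects sorted from most to least probable."""
    return sorted(dist.items(), key=lambda kv: kv[1], reverse=True)

def best_next_check(prior, checks):
    """The check expected to leave the least uncertainty behind."""
    return min(checks, key=lambda name: expected_entropy_after(prior, checks[name]))

# Hypothetical welding-cell example: all names and numbers are illustrative.
prior = {"worn_tip": 0.5, "wire_feed": 0.3, "gas_flow": 0.2}
checks = {
    "inspect_tip": {
        "worn_tip": {"fail": 0.9, "pass": 0.1},
        "wire_feed": {"fail": 0.1, "pass": 0.9},
        "gas_flow": {"fail": 0.1, "pass": 0.9},
    },
    "read_flow_meter": {
        "worn_tip": {"fail": 0.1, "pass": 0.9},
        "wire_feed": {"fail": 0.1, "pass": 0.9},
        "gas_flow": {"fail": 0.9, "pass": 0.1},
    },
}
next_check = best_next_check(prior, checks)
after_fail = posterior(prior, checks["inspect_tip"], "fail")
```

With these made-up numbers the tip inspection is selected, because it splits the probability mass most evenly; a failed inspection then moves most of the belief onto the worn tip.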

Calibration

A mapping maintained under changing conditions, not a one-time correction.

Multi-Layer Stack

1. Step tests (controlled lab)
2. Phantom validation (confounders explicit)
3. Field closure (drift accounting)

Bayesian calibration: drift and confounding treated as ignorance objects, not nuisances.
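
As a hedged illustration of what a Bayesian calibration step can look like, the sketch below fits a posterior over gain and offset from step-test pairs. It assumes a linear model, reading = gain × reference + offset + Gaussian noise with known standard deviation, and independent Gaussian priors. Drift and confounders are left out to keep it short, and every name and number is hypothetical.

```python
def calibrate(reference, reading, noise_sd, prior_mean=(1.0, 0.0), prior_sd=(0.1, 50.0)):
    """Conjugate Gaussian posterior over (gain, offset).

    Assumed model: reading = gain * reference + offset + N(0, noise_sd^2).
    """
    w = 1.0 / noise_sd ** 2
    # Start from the prior precision (diagonal) and precision-weighted prior mean.
    a11 = 1.0 / prior_sd[0] ** 2
    a22 = 1.0 / prior_sd[1] ** 2
    a12 = 0.0
    b1 = a11 * prior_mean[0]
    b2 = a22 * prior_mean[1]
    # Accumulate the data contribution to the 2x2 precision matrix.
    for x, y in zip(reference, reading):
        a11 += w * x * x
        a12 += w * x
        a22 += w
        b1 += w * x * y
        b2 += w * y
    # Posterior covariance is the inverse of the precision matrix.
    det = a11 * a22 - a12 * a12
    c11, c12, c22 = a22 / det, -a12 / det, a11 / det
    return {
        "gain": c11 * b1 + c12 * b2,
        "offset": c12 * b1 + c22 * b2,
        "gain_sd": c11 ** 0.5,
        "offset_sd": c22 ** 0.5,
    }

# Synthetic step test: noiseless readings generated with gain 0.98 and offset 12.
reference = [400.0, 600.0, 800.0, 1000.0, 1200.0]
reading = [0.98 * x + 12.0 for x in reference]
fit = calibrate(reference, reading, noise_sd=1.0)
```

The posterior standard deviations are the part that carries forward: they state how much is still unknown after each calibration layer, which is what the later layers would have to reduce.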

Calibration is where sensing startups die. Here it is made transferable, not artisanal.

Layer 3: Verification

Inductive reasoning gate. Checks claims against ground truth.

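Below is the No Free Lunch theorem stated in Lean 4: no learner is universally best, because summed across all possible target functions every learner makes the same number of off-training errors, so no method beats guessing. The proof rests on a counting lemma, uniformDistribution_symmetry, that is not reproduced in this excerpt. That is why the stack commits to checking each claim rather than trusting any one model's confidence.

import Mathlib

/-- A learner maps labelled training data to a hypothesis `X → Y`. -/
structure Learner (X Y : Type) where
  hypothesis : List (X × Y) → X → Y

variable {X Y : Type} [DecidableEq X] [DecidableEq Y]

/-- Number of test points outside the training set on which the learner's
    prediction disagrees with the target function `f`. -/
noncomputable def offTrainingError (L : Learner X Y) (f : X → Y)
    (train test : Finset X) : ℕ :=
  -- Filter the off-training points first, then count them.
  ((test \ train).filter fun x =>
    L.hypothesis (train.toList.map fun t => (t, f t)) x ≠ f x).card

/-- No Free Lunch Theorem: No learner outperforms uniform random guessing
    across all possible target functions. -/
theorem noFreeLunch [Fintype X] [Fintype Y]
    (L₁ L₂ : Learner X Y) (train test : Finset X) :
    ∑ f : X → Y, offTrainingError L₁ f train test =
    ∑ f : X → Y, offTrainingError L₂ f train test := by
  -- Only the sum over all target functions is invariant to learner choice;
  -- for one fixed `f` the two learners' errors generally differ.
  -- For x ∉ train the training labels do not constrain `f x`, so each
  -- off-training point contributes the same count for every learner.
  -- `uniformDistribution_symmetry` is assumed to be proved elsewhere and to
  -- state this sum-level symmetry; it is not shown here.
  exact uniformDistribution_symmetry L₁ L₂ train test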
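
The identity stated above can also be checked by brute force on a tiny domain, which is a useful sanity test independent of the Lean development. The sketch below is an assumption-laden toy: the three-point input space, binary labels, and the two learners are invented for illustration. It enumerates every target function and sums off-training error for each learner.

```python
from itertools import product

X = [0, 1, 2]   # toy input space
Y = [0, 1]      # binary labels
TRAIN = [0]     # points whose labels the learner sees
TEST = [0, 1, 2]

def always_zero(data, x):
    """Ignores the training data entirely."""
    return 0

def majority_label(data, x):
    """Predicts the most common training label (ties go to the first label in Y)."""
    labels = [y for _, y in data]
    return max(Y, key=labels.count)

def off_training_error(learner, f, train, test):
    """Count test points outside the training set where the learner is wrong."""
    data = [(t, f[t]) for t in train]
    return sum(1 for x in test if x not in train and learner(data, x) != f[x])

def total_error(learner, train=TRAIN, test=TEST):
    """Sum off-training error over every possible target function X -> Y."""
    return sum(
        off_training_error(learner, dict(zip(X, values)), train, test)
        for values in product(Y, repeat=len(X))
    )

totals = {name: total_error(fn) for name, fn in
          [("always_zero", always_zero), ("majority_label", majority_label)]}
```

Both learners accumulate the same total, which matches the closed-form count: (off-training points) × (|Y| − 1) × |Y|^(|X| − 1) = 2 × 1 × 4 = 8.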

One instinct runs under all of this: refuse to take a system's word for its own correctness, and check it against something that cannot be argued with. Formal verification in Lean4 is that check.


3. In Practice

The thesis runs through every engagement. Each of the cards below is one instance of the stack put to work on a problem the customer brought.

Industrial

Ather Energy

Ather Energy · Robotic welding reliability (K383)

Causal root-cause analysis under drift. Deployed, 2025.

Problem. Intermittent welding anomalies on a production line produced high MTTR because the linkage between alarm, defect, and action was weak and ad-hoc.

Approach. A causal world-model for the welding cell aligned to the 4Ms: Man (operator context), Machine (telemetry and configs), Method (process recipes), Material (batch provenance). Implemented as an OWL ontology, internally named K383.

Nodes: 2,586
Directed edges: 7,934
Model build time: ~30 days
Resolvable / evidence-gated / structurally unresolvable: 60 / 12 / 28%

Uncertainty diagnostic. A geometric belief-function analysis (Cuzzolin framework) classifies each failure path by resolvability: 60% are diagnosable from error codes alone, 12% require targeted additional evidence, and 28% are structurally unresolvable without live data integration. That 28% is not the model failing. It is the line where no amount of staring at error codes will help, and the only honest move is to instrument something new. The diagnostic tells the operator not just what happened, but what else needs to be known, and where the money actually has to go.
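
To make the three-way split concrete, here is a minimal Dempster-Shafer style sketch. It is not the geometric analysis used in the engagement: the mass functions, cause names, thresholds, and the classification rule itself are all assumptions chosen for illustration. Belief sums the mass committed inside a hypothesis, plausibility sums the mass that does not contradict it, and mass left on the whole frame represents ignorance.

```python
def belief(mass, hypothesis):
    """Total mass of focal sets contained in the hypothesis."""
    return sum(m for focal, m in mass.items() if focal <= hypothesis)

def plausibility(mass, hypothesis):
    """Total mass of focal sets that intersect the hypothesis."""
    return sum(m for focal, m in mass.items() if focal & hypothesis)

def classify(mass, frame, decide=0.8, ignorance=0.5):
    """Hypothetical rule: thresholds are illustrative, not calibrated values."""
    best = max(belief(mass, frozenset({c})) for c in frame)
    if best >= decide:
        return "resolvable"
    if mass.get(frame, 0.0) >= ignorance:
        return "structurally unresolvable"
    return "evidence-gated"

FRAME = frozenset({"tip", "wire", "gas"})

# Error codes point firmly at one cause.
path_a = {frozenset({"tip"}): 0.85, FRAME: 0.15}
# Error codes narrow the cause to a pair; one more observation could split it.
path_b = {frozenset({"tip", "wire"}): 0.7, frozenset({"gas"}): 0.1, FRAME: 0.2}
# Error codes say almost nothing; most mass stays on the whole frame.
path_c = {frozenset({"tip"}): 0.1, FRAME: 0.9}

labels = [classify(p, FRAME) for p in (path_a, path_b, path_c)]
```

The gap between belief and plausibility for a cause is the quantity to watch: a wide gap means the evidence has not ruled the cause in or out, which is the situation the evidence-gated and unresolvable classes describe.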

Output. Ranked suspects, fault-path narrative, and minimum next checks that collapse uncertainty fastest. Drift is treated as first-class; verification gates keep explanations tethered to evidence rather than LLM fluency.


Software & research infrastructure

KALIDEO by Satsure

Satsure · KALIDEO

Remote-sensing QC · classical and deep-learning on satellite hardware.

A dockerised pipeline testing classical and deep-learning approaches to cloud segmentation on satellite imagery. Optimized for on-device inference under bandwidth and power constraints, and packaged for a QC loop the operations team can run without us in the room.

Open Science Stack

Open Science Stack · Science Compiler

Autonomous science · robotic-lab orchestration · ontology-driven protocol synthesis.

OSS is volunteer-driven, non-profit digital public infrastructure for AI-powered self-driving labs. Its architectural thesis: treat experiments as cloud-dispatchable programs ("write once, run anywhere") across a network of programmable physical labs.

The Science Compiler is the contribution: a three-layer behaviour-tree architecture for robotic laboratories that compiles high-level scientific intent into robot-agnostic execution via ontology-driven protocol synthesis. It slots into OSS's inner experiment-optimization loop and feeds the ROS2-based Robotic Application Stack (RAS).
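
For readers unfamiliar with behaviour trees, the following is a deliberately small sketch of the idea, not the Science Compiler's architecture. It assumes two composite node types (sequence and fallback), omits the usual RUNNING state, and uses a hypothetical `driver` dictionary to stand in for the robot-specific layer; the step names are invented.

```python
SUCCESS, FAILURE = "success", "failure"

class Action:
    """Leaf node: runs one robot-specific primitive."""
    def __init__(self, name, fn):
        self.name, self.fn = name, fn

    def tick(self, log):
        log.append(self.name)
        return SUCCESS if self.fn() else FAILURE

class Sequence:
    """Succeeds only if every child succeeds, in order."""
    def __init__(self, children):
        self.children = children

    def tick(self, log):
        for child in self.children:
            if child.tick(log) == FAILURE:
                return FAILURE
        return SUCCESS

class Fallback:
    """Tries children in order and succeeds at the first success."""
    def __init__(self, children):
        self.children = children

    def tick(self, log):
        for child in self.children:
            if child.tick(log) == SUCCESS:
                return SUCCESS
        return FAILURE

def compile_protocol(steps, driver):
    """Map abstract protocol steps onto whatever primitives this robot's driver offers."""
    return Sequence([Action(step, driver[step]) for step in steps])

# Hypothetical driver for one robot; another robot would supply its own callables.
driver = {"aspirate": lambda: True, "dispense": lambda: True, "vortex": lambda: False}

log = []
status = compile_protocol(["aspirate", "dispense"], driver).tick(log)

recovery_log = []
recovery = Fallback([Action("vortex", driver["vortex"]),
                     Action("dispense", driver["dispense"])]).tick(recovery_log)
```

The protocol is written once as a list of abstract steps; swapping the `driver` is what retargets it to a different robot, which is the robot-agnostic property the paragraph above describes.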

Indian Institute of Science

LeanAide · Prof. Siddhartha Gadgil

Lean4 assistant for natural-language math · autoformalization.

Contributing to LeanAide, a Lean4 assistant for natural-language mathematics at Prof. Siddhartha Gadgil's lab at IISc. The project tackles autoformalization (natural-language mathematics into machine-checkable Lean4), one of the least-developed bridges between LLMs and formal verification, and the place where the Zetesis formal layer earns its keep on real research mathematics.


Biomedical

Temple wearable sensor

Temple · Sensor validation & causal calibration

Speckle plethysmography feasibility; phantom-first. Engagement with Continue Research (Eternal).

Three workstreams. (1) A stable SPG reference system with primary flow and perfusion metrics; (2) a reproducible multi-layer phantom with controlled confounders; (3) a calibration mapping with QC gates.

Thesis. Calibration is where sensing programmes quietly fail. Making confounders and interventions first-class (rather than nuisance) makes calibration transferable rather than artisanal.


Open-source instrumentation

Terrapulse · Research-grade soil CO2 flux

Open-source instrument · field-deployed Lahaul-Spiti 2024-25 · code: ARC-Net / CO2mmunity

Low-cost NDIR CO2 cell (K33-ELG) on embedded compute, designed to capture sub-minute flux pulses at research-grade precision. Three-layer Bayesian calibration chain: FTIR step-test → lab pulse simulation → multi-day field mass-closure via alkali trap. The 3.8% flux MAE is the number that catches the cell being honest: the sensor's flux measured against what the trap physically counted, not against its own report.

Bill of materials: ~$150
Precision: ±2 ppm
Flux MAE (mass-closure): 3.8%
Pulse capture: sub-min

Open methods, datasets, and analysis pipeline released at ARC-Net / CO2mmunity. Final 95% posteriors: gain = 0.993 ± 0.005; offset = 16 ± 10 ppm; drift = 0.002 ± 0.020 ppm h⁻¹. Current iteration is a miniaturised, solar-powered revision for unattended multi-month deployments in remote terrain.
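
A short sketch of how posterior values like these might be applied. The model form is an assumption, not taken from the released pipeline: it supposes reading = gain × true + offset + drift × hours, uses only the posterior centres quoted above, and defines flux MAE as mean absolute error relative to the mean reference flux. Function names are hypothetical.

```python
# Posterior centres quoted in the text; the intervals are ignored in this sketch.
GAIN = 0.993
OFFSET_PPM = 16.0
DRIFT_PPM_PER_H = 0.002

def correct_reading(raw_ppm, hours_since_calibration):
    """Invert the assumed linear model to estimate the true concentration."""
    drift = DRIFT_PPM_PER_H * hours_since_calibration
    return (raw_ppm - OFFSET_PPM - drift) / GAIN

def relative_mae(sensor_flux, reference_flux):
    """Mean absolute error as a percentage of the mean absolute reference flux."""
    abs_err = sum(abs(s - r) for s, r in zip(sensor_flux, reference_flux))
    return 100.0 * abs_err / sum(abs(r) for r in reference_flux)

# Round trip: generate a raw reading from a known concentration, then correct it.
true_ppm = 420.0
raw = GAIN * true_ppm + OFFSET_PPM + DRIFT_PPM_PER_H * 100.0
recovered = correct_reading(raw, hours_since_calibration=100.0)
```

The comparison that matters is the second function applied to an independent reference such as the alkali-trap mass closure, since a sensor scored against its own corrected output would always look accurate.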


Engage

Write when you have a number you are being asked to trust and cannot prove you should, a reading that drifts after you stop watching it, or a root cause everyone has a story for and no one has evidence for. Reach [email protected].