Proof Before Parts — AI-Assisted Hardware Startup Case Study

The Project

A hardware startup with a genuine problem: GPU failures in data center server racks are expensive, poorly predicted, and largely invisible until a compute job crashes mid-run. The vision was a passive, clip-on sensor package — electromagnetic emissions, current draw, thermal gradients — that could detect degradation signatures before any firmware alarm fired.

The physics were real. The business case was strong. The academic literature validated the approach. But three things had to happen in the right order before anyone spent money on hardware: the team needed to understand the research well enough to make defensible design decisions; the legal framework had to be watertight before prototype work began; and the sensor math had to be validated in software first — to confirm which signals were worth buying before the rack was instrumented.

None of those are traditional engineering tasks. They are the cognitive overhead that kills early-stage hardware companies. The AI compressed all three into a single working arc — from research onboarding through contract execution through a working, reproducible validation notebook.

"The notebook validated sensor selection and prototype criteria before spending a dollar on hardware. That is the whole game at pre-seed: prove the physics first, in software, at zero cost."

What the AI Was Fed

The AI wasn't used as a search engine. It was loaded with the actual project artifacts — papers, sensor specs, term sheets, technical architecture documents — and held that context across every phase.

Research Papers

GPU memory corruption at extreme scale — multi-cluster empirical study
Physics-informed modeling of GPU aging via NBTI reaction-diffusion
GPU resilience characterization across AI and HPC systems
Aging-aware GPU compilation methods
GPU burn stress testing methodology
Sensor vulnerability systematic review across industrial IoT
Supercapacitor reliability and degradation modeling (PINN baseline)

Technical Architecture

Sensor node hardware specification — channels, timing, quality flags
Example sensor data outputs — format and sample rate definitions
Physics-informed aging model — NBTI to macroscopic LSTM fusion spec
Reliability problem deck — stakeholder presentation framing
Full-stack theoretical architecture — FPGA through cloud MTBF service
POC execution plan — phase gates and binary success metrics

Business & Legal Context

Stealth-mode company proposal — go-to-market, revenue model, hardware BOM
Pre-meeting briefing and post-meeting decision log
Draft NDA — first-pass terms, scope, confidentiality obligations
Equity and stock option structure — vesting, cliff, strike price context
Recorded technical discussion transcript — sensor tradeoff alignment

The Driving Questions

Which sensor modalities give the most signal — in what priority order?
What sample rate and timing precision does MdRQA actually require from hardware?
Can we validate the math on synthetic data before committing to a design?
What do contracts need to cover before prototype work starts?
Does this method generalize beyond one hardware domain?

Four Pivots

The project didn't unfold in a straight line. Four times, a question that looked like a simple next step became a genuine change of direction. Each time the AI held the technical context well enough to make the pivot clean rather than disruptive.

Pivot 1 — Research Digestion

From "what should we read" to "here is what it means for your product"

The seven research papers were loaded as project context and the AI was asked one question: what does this mean for a passive side-channel sensor product? The synthesis ran across all papers simultaneously. Power intensity, not temperature, is the dominant predictor of double-bit memory errors. Per-GPU lifetime features matter more than real-time snapshots. The fault taxonomy from large-scale empirical studies maps directly to the sensor modalities already under consideration: EMI, current, thermal. A month of reading was compressed into actionable product decisions in a single session.

Pivot 2 — Methods and Hardware Requirements

From "we are using MdRQA" to "here is exactly what that demands from hardware"

MdRQA (Multidimensional Recurrence Quantification Analysis, here with optional delay embedding) was the chosen mathematical framework for turning multi-sensor time series into interpretable health metrics. The founders understood the concept. What they needed was the translation layer: what does MdRQA actually demand from a sensor node?

The core insight: MdRQA's sensitivity to inter-channel timing jitter scales with sample rate. At 10 Hz, a ±50 ms NTP tolerance is half a sample period, which is borderline acceptable. At 1 kHz, 50 ms is 50 samples of phase error, which collapses DET% and destroys the early-warning signal. This converted "which sensors" into a concrete hardware specification: what clock reference is mandatory at each operating frequency. The parameter sweep methodology emerged from the same analysis.
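The arithmetic behind that comparison is just the timing error divided by the sample period. A minimal sketch; the function name is illustrative and does not come from the notebook:

```python
def jitter_in_samples(jitter_ms: float, sample_rate_hz: float) -> float:
    """Express a timing error as a number of sample periods."""
    sample_period_ms = 1000.0 / sample_rate_hz
    return jitter_ms / sample_period_ms


# 50 ms of timing error at three sample rates:
# 10 Hz -> 0.5 samples, 100 Hz -> 5 samples, 1 kHz -> 50 samples
for rate_hz in (10, 100, 1_000):
    samples = jitter_in_samples(50, rate_hz)
    print(f"{rate_hz:>5} Hz: 50 ms jitter = {samples:g} samples")
```

The same NTP-level error that is a fraction of a sample at 10 Hz spans dozens of samples at 1 kHz, which is why the clock requirement depends on the chosen operating frequency.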

Pivot 3 — Contracts Before Continuation

From "let's keep building" to "not until the paperwork is right"

Prototype work was producing results. Then the correct call was made: pause and formalize the relationship before going further. A founding team without clear written agreements on IP ownership, payment terms, equity, and scope is one ambiguous conversation away from a dispute that ends the company.

The AI drafted a three-layer 30-page package from scratch: an NDA locking down what had already been shared; a stock option agreement defining vesting, cliff, strike price, and acceleration triggers; and a scope of work with phase-gated milestones, explicit deliverable definitions, and IP assignment language. Prototype paused — not because technology stalled, but because this is the correct order of operations.

Pivot 4 — Software Validation Before Hardware Spend

From "which sensors should we buy" to "prove the math first, then buy"

The original framing was: which sensors should we procure for the POC? That question was flipped: what does the math require sensors to look like — and can we prove the full pipeline on synthetic data first?

The result was a self-contained Jupyter notebook: pure NumPy and SciPy, auto-installing, with no external RQA library required. It generated synthetic multi-channel sensor data across four fault phases, ran parameter sweeps, visualized recurrence plots, computed sliding-window health metrics, and fired slope-based early warning alerts with a validated 37-second lead time before the synthetic fault boundary. Sensor selection became a constrained optimization with a known answer. All nine outputs are shown below.

The Notebook Outputs

Every chart below is a direct output of the validation notebook: synthetic data, pure Python, zero hardware. These nine images are the deliverable that replaced a multi-month sensor selection study, along with the thousands of dollars in wrong turns such a study can incur.

Step 1 — Synthetic sensor data generation

Four-channel synthetic GPU sensor data (EMI ×2, Current, Temperature) generated across four labeled fault phases: healthy baseline, degradation onset, thermal event, and recovery. The phase boundaries are the ground truth against which every downstream alert is validated.

Synthetic GPU Sensor Data — 4 channels × 200 seconds · EMI CH1, EMI CH2, Current, Temperature · Healthy → Degradation → Thermal Event → Recovery · Phase boundaries are ground truth for all alert validation
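A generator of this kind can be sketched in a few lines of NumPy. This is an illustration, not the notebook's generator: the 10 Hz sample rate, the noise levels, the signal shapes, and the 60–100 s and 140–200 s phase splits are assumptions (only the 0–60 s healthy window, the 100–140 s thermal event window, and the 200 s duration are stated in the text).

```python
import numpy as np

# Phase edges in seconds. 0-60 s (healthy) and 100-140 s (thermal event) match
# the windows quoted in the text; the other two splits are assumptions.
PHASE_NAMES = ("healthy", "degradation", "thermal_event", "recovery")
PHASE_EDGES_S = (60.0, 100.0, 140.0)


def make_synthetic(fs_hz=10.0, duration_s=200.0, seed=0):
    """Return time (s), a (samples x 4) signal matrix, and a phase index per sample."""
    rng = np.random.default_rng(seed)
    n = int(round(duration_s * fs_hz))
    t = np.arange(n) / fs_hz
    phase = np.searchsorted(PHASE_EDGES_S, t, side="right")  # values 0..3

    # Illustrative per-phase noise levels: structure degrades, then partly recovers.
    noise = np.array([0.05, 0.15, 0.50, 0.10])[phase]
    carrier = np.sin(2 * np.pi * 0.5 * t)

    emi1 = carrier + noise * rng.standard_normal(n)
    emi2 = np.sin(2 * np.pi * 0.5 * t + 0.3) + noise * rng.standard_normal(n)
    current = 4.0 + 0.5 * carrier + noise * rng.standard_normal(n)
    # Temperature: slow drift plus a step during the thermal event phase.
    temp = 60.0 + 0.01 * t + 8.0 * (phase == 2) + 0.2 * rng.standard_normal(n)

    X = np.column_stack([emi1, emi2, current, temp])
    return t, X, phase


t, X, phase = make_synthetic()
```

Returning the phase index alongside the signals keeps the ground-truth boundaries attached to the data, so any downstream alert time can be compared against them directly.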

Step 2 — Healthy baseline recurrence plot

The Recurrence Plot (RP) for the healthy window (0–60s) shows the structure of a deterministic system: long diagonal lines indicating the GPU revisits similar states in a regular, predictable pattern. REC = 3.03%, DET = 30.78%. These numbers become the reference against which all fault windows are compared. The bar chart on the right shows the full RQA metric set — MaxL at 400 indicates the system runs in a single coherent attractor for the entire healthy window.

Healthy GPU Baseline (0–60s) · RP: REC=3.03%, DET=30.78%, MaxL=400 · Long diagonal lines = deterministic attractor = healthy system · Bar chart shows complete RQA metric set used as reference baseline

Step 3 — Thermal event recurrence plot

The same analysis on the thermal event window (100–140s). The RP structure collapses: diagonal lines shorten and scatter. DET drops from 30.78% to 10.31% — a 20-percentage-point fall. LAM changes simultaneously. These two moves together are the fault signature the hardware pipeline is designed to detect in advance.

Thermal Event (100–140s) · RP: REC=2.47%, DET=10.31% · Diagonal structure collapses compared to healthy baseline · DET drop of 20+ percentage points is the primary early warning signal

Step 4 — Sliding window GPU health monitor

MdRQA computed on a rolling window across the full 200-second signal. Four panels: raw sensor signals, REC% and DET%, entropy metrics (EntrL and EntrV), and laminarity (LAM%). The DET% decline from ~28% to ~9% is a smooth, trackable trend — not a binary threshold crossing. This is the characteristic that makes MdRQA useful for early warning rather than after-the-fact diagnosis.

Cell 6 — GPU Health Monitor: Sliding MdRQA · 4-panel display: raw signals, DET+REC, entropy metrics, laminarity · Fault progression visible as smooth metric evolution · DET decline begins well before fault boundary

Step 5 — Dual window: fast event detection vs slow degradation tracking

Two window sizes running simultaneously. The fast window (10s) catches acute events — it swings rapidly when a fault occurs. The slow window (60s) tracks structural degradation — it begins declining well before the fast window fires. Running both is not redundant: they detect different failure signatures and together provide both lead time and confirmation.

Dual-Window MdRQA — Fast Events vs Slow Degradation · Fast (10s) DET in crimson, Slow (60s) DET+EntrL in blue/purple · Slow window shows structural decline beginning ~37 seconds before the thermal event phase boundary
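The rolling computation behind the fast and slow windows can be sketched as a generic helper that applies any per-window metric. The helper name, the 1-second step, and the placeholder metric are assumptions made so the example runs on its own; in the notebook the metric would be DET% from the MdRQA routine.

```python
import numpy as np


def sliding_metric(X, fs_hz, window_s, step_s, metric_fn):
    """Apply metric_fn to each trailing window; return (window end times, values)."""
    win = int(round(window_s * fs_hz))
    step = max(1, int(round(step_s * fs_hz)))
    ends, values = [], []
    for start in range(0, len(X) - win + 1, step):
        ends.append((start + win) / fs_hz)  # timestamp at the end of the window
        values.append(metric_fn(X[start:start + win]))
    return np.array(ends), np.array(values)


def placeholder_metric(segment):
    """Stand-in for DET%; only here to make the sketch self-contained."""
    return float(np.std(segment))


fs_hz = 10.0
X = np.random.default_rng(0).standard_normal((2000, 4))  # 200 s of dummy data

# Fast (10 s) and slow (60 s) windows evaluated over the same signal.
t_fast, fast = sliding_metric(X, fs_hz, window_s=10, step_s=1, metric_fn=placeholder_metric)
t_slow, slow = sliding_metric(X, fs_hz, window_s=60, step_s=1, metric_fn=placeholder_metric)
```

Stamping each value with the window's end time keeps the monitor causal: a value at time t uses only samples recorded up to t, which matters when lead time is the quantity being validated.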

Step 6 — Slope-based early warning alert

The derivative of the slow-window DET signal becomes the early warning trigger. When DET slope crosses below (mean − 2σ) of the healthy baseline slope, the alert fires. In this validation run, the alert fires at t=63s — 37 seconds before the thermal event boundary at t=100s. The alert is generated from the rate of structural change in the sensor signal, not from any threshold on raw sensor values.

Slope-Based Early Warning — DET Trend Detection · Alert fires at t=63s · Thermal event boundary: t=100s · Lead time validated: 37 seconds · Threshold: mean − 2σ of healthy baseline DET slope · Raw signals shown for reference
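The alert rule described here (slope below mean − 2σ of the healthy-baseline slope) can be sketched as follows. The function name and the toy DET% series are illustrative assumptions; the toy series is not the notebook's data and does not reproduce the 37-second result.

```python
import numpy as np


def slope_alert(times_s, det, baseline_end_s, n_sigma=2.0):
    """Return (alert time or None, threshold) for a DET% series."""
    slope = np.gradient(det, times_s)              # d(DET)/dt
    baseline = slope[times_s <= baseline_end_s]    # slopes in the healthy window
    threshold = baseline.mean() - n_sigma * baseline.std()
    hits = np.flatnonzero((times_s > baseline_end_s) & (slope < threshold))
    alert_time = float(times_s[hits[0]]) if hits.size else None
    return alert_time, threshold


# Toy DET% series: steady near 30% for 60 s, then a linear decline of
# 0.5 percentage points per second.
times_s = np.arange(0.0, 200.0, 1.0)
det = 30.0 + 0.1 * np.sin(times_s) - 0.5 * np.clip(times_s - 60.0, 0.0, None)

alert_time, threshold = slope_alert(times_s, det, baseline_end_s=60.0)
fault_boundary_s = 100.0
lead_time_s = fault_boundary_s - alert_time
```

Lead time is simply the ground-truth boundary minus the alert time, the same subtraction that gives 100 s − 63 s = 37 s in the validation run.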

Step 7 — Jitter degradation of early warning lead time

How much does inter-channel timing jitter reduce the warning lead time? The result is encouraging: even at 2,000ms of simulated jitter — far beyond any real NTP drift — the early warning system continues to fire with positive lead time at 10 Hz. The method is remarkably tolerant of timing imprecision at low sample rates. The specification tightens only when sample rate increases.

Cell 9 — Hardware Sync Requirement: How Jitter Degrades Early Warning Lead Time · Lead time remains positive across all tested jitter levels at 10 Hz operating frequency · Confirms NTP-level synchronization is acceptable starting point for a 10 Hz POC
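The text does not describe how the notebook simulates jitter, so the following is one plausible model rather than the notebook's method: each non-reference channel is resampled with a fixed random clock offset drawn uniformly from ±jitter. The function name and the choice of channel 0 as the timing reference are assumptions.

```python
import numpy as np


def apply_channel_skew(X, fs_hz, jitter_ms, seed=0):
    """Shift every channel except channel 0 by a random offset within +/- jitter_ms."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    t = np.arange(X.shape[0]) / fs_hz
    offsets_s = np.zeros(X.shape[1])
    offsets_s[1:] = rng.uniform(-1.0, 1.0, size=X.shape[1] - 1) * jitter_ms / 1000.0
    skewed = np.empty_like(X)
    for ch in range(X.shape[1]):
        # Linear interpolation stands in for a clock that is early or late.
        # np.interp holds the edge values where the shift runs past the data.
        skewed[:, ch] = np.interp(t + offsets_s[ch], t, X[:, ch])
    return skewed, offsets_s


fs_hz = 10.0
t = np.arange(600) / fs_hz
X = np.column_stack([np.sin(np.pi * t), np.cos(np.pi * t)])
X_skewed, offsets_s = apply_channel_skew(X, fs_hz, jitter_ms=200.0)
```

Sweeping `jitter_ms` and recomputing the alert time on each skewed copy is the kind of loop that would produce a lead-time-versus-jitter curve.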

Step 8 — True sync sensitivity with correlated transients

A more rigorous test using synthetic data with intentional correlated transient events injected across channels. DET% and REC% measured as inter-channel jitter increases from 0 to 1,000ms. The non-monotonic behavior reveals that at low jitter levels the cross-channel correlation structure is preserved — confirming that NTP-level synchronization at 10 Hz is an acceptable starting point for hardware POC work.

Cell 10 — True Sync Sensitivity: Correlated Transients · DET% (blue) and REC% (red) vs inter-channel jitter (ms) · Non-monotonic response confirms cross-channel structure survives NTP-level timing error at 10 Hz

Step 9 — Sync requirement tightens with sample rate

The critical hardware specification chart. At 100 Hz, DET% is nearly flat across all jitter levels. At 1 kHz, DET loss exceeds 20% above 5 ms jitter. At 10 kHz, catastrophic DET loss occurs above 10 ms jitter. The rule is explicit: jitter tolerance = 1 sample period at your chosen operating frequency, which works out to 10 ms at 100 Hz, 1 ms at 1 kHz, and 0.1 ms at 10 kHz. Choose your sample rate, and the timing specification follows directly from the math; no hardware testing is required to establish this requirement.

Cell 11 — Sync Requirement Tightens with Sample Rate · 100 Hz, 1 kHz, and 10 kHz compared side by side · Right panel is the hardware spec chart: DET Loss % vs Jitter · Red dashed line = 20% DET loss threshold · Rule: jitter must stay below 1 sample period at operating frequency

What the Numbers Mean

MdRQA produces five metrics from every window of sensor data. Understanding what each measures — and which fault type it tracks — is the prerequisite to building a real alert system. All five were validated on synthetic data before any hardware conversation was reopened.

Metric	Physical meaning	Healthy value	Fault signature	Alert role
REC	% of time the system revisits the same state (RP density)	3.03%	Spikes or collapses sharply	RAD calibration target; density baseline
DET	% of recurrence points forming diagonal lines (determinism)	30.78%	Drops ~20 pts during thermal event	Primary early warning — slope alert fires 37s before fault
MeanL	Mean diagonal line length — predictability horizon	3.73	Drops as predictability collapses	Corroborates DET; filters false positives
EntrL	Shannon entropy of diagonal line length distribution	1.03	Drops — structure simplifies before collapse	Regime change detector — leads DET collapse
LAM	% of recurrence forming vertical lines (laminarity)	38.48%	Rises — system trapped in degraded attractor	Independent fault signature — bearing / throttle trapping

# Five metrics from every window: pure NumPy/SciPy, no external RQA library.
# Auto-installs. Runs on any Python 3.10+ machine from a clean state.

def mdrqa(X, EMB=1, TAU=1, NORM='euc', RAD=0.3, ZSCORE=True):
    # delay-embed → distance matrix → threshold → recurrence plot
    # extract REC, DET, MeanL, EntrL, LAM from line structure
    ...  # metric extraction elided in this excerpt
    return {'REC': REC, 'DET': DET, 'MeanL': MeanL, 'EntrL': EntrL, 'LAM': LAM}

def find_rad(X, target_rec=3.0):
    # Binary search, 50 iterations: auto-selects RAD on the healthy baseline
    ...  # body elided in this excerpt

"Five numbers. That is the whole product at this stage. Every alert, every early warning, every lead time estimate comes from five numbers computed on a rolling window of sensor data. The notebook proves those five numbers work — before you buy a single sensor to generate them."

What Was Delivered

The contract package — 30 pages, three agreements

Non-Disclosure Agreement

Mutual confidentiality covering hardware sensor designs, analytics implementations, customer lists, and investor materials. Proprietary information defined with specificity. Survival clause covering all IP exchanged prior to signing. 3-year term with hardware trade-secret carveout.

Stock Option Agreement

Grant size, vesting schedule, cliff period, and strike price methodology. Acceleration triggers on change of control. Classification guardrails distinguishing consultant from employee status. Dilution protection consistent with SAFE-note cap table structure.

Scope of Work & Deliverables

Phase-gated payment milestones tied to explicit deliverables: sensor specification, validation notebook, POC execution plan, data format template. IP assignment clause for all work product. Out-of-scope statement protecting both parties. Acceptance criteria in measurable terms.

The validation notebook — 17 cells, fully self-contained

Auto-install cell · Status: Self-contained

Single cell installs all dependencies via subprocess. No environment setup, no requirements file, no module errors. Runs from a clean Python 3.10+ state.
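An auto-install cell of this kind typically checks which packages are importable and calls pip through the running interpreter for the rest. A minimal sketch; the package list is an assumption (matplotlib is a guess based on the charts) and the function names are illustrative:

```python
import importlib.util
import subprocess
import sys

# Import name -> pip name. Assumed list; the notebook's actual list is not given.
REQUIRED = {"numpy": "numpy", "scipy": "scipy", "matplotlib": "matplotlib"}


def missing_packages(required):
    """Return pip names for modules that cannot currently be imported."""
    return [pip_name for module, pip_name in required.items()
            if importlib.util.find_spec(module) is None]


def ensure_installed(required=REQUIRED):
    """Install whatever is missing using the interpreter running this cell."""
    missing = missing_packages(required)
    if missing:
        subprocess.check_call(
            [sys.executable, "-m", "pip", "install", "--quiet", *missing]
        )
    return missing
```

Calling `ensure_installed()` as the first line of the notebook would do the work. Using `sys.executable -m pip` rather than a bare `pip` command makes the install target the same environment the kernel is running in.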

Single config block · Status: Validated

All tunable parameters in one cell: data source flag, channel names, sample rate, RAD, window sizes, alert sigma. Edit once, run everything below it.
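A config cell covering those parameters might look like the sketch below. The field names are hypothetical. The values are taken from figures quoted elsewhere in this write-up (10 Hz, RAD=0.3 default, 3% target REC, 10 s and 60 s windows, 2σ alert), not from the notebook itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    use_synthetic: bool = True  # data source flag: synthetic vs recorded CSV
    channels: tuple = ("emi_ch1", "emi_ch2", "current", "temperature")
    sample_rate_hz: float = 10.0
    rad: float = 0.3            # recurrence radius before auto-selection
    target_rec_pct: float = 3.0
    fast_window_s: float = 10.0
    slow_window_s: float = 60.0
    alert_sigma: float = 2.0

    @property
    def fast_window_samples(self) -> int:
        return int(round(self.fast_window_s * self.sample_rate_hz))

    @property
    def slow_window_samples(self) -> int:
        return int(round(self.slow_window_s * self.sample_rate_hz))


cfg = Config()
```

Deriving sample counts from seconds and the sample rate means a change of operating frequency only has to be made in one place.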

MdRQA core math · Status: Validated

Complete implementation of delay embedding, distance matrix, recurrence threshold, and all five metric extractors in pure NumPy/SciPy. No external RQA dependency.

4-phase synthetic data · Status: Validated

Physics-grounded 4-channel signal generator across healthy, degradation, thermal event, and recovery phases with ground-truth boundaries for alert validation.

Parameter sweep + auto RAD · Status: Validated

Metric table across candidate RAD values. Binary search auto-selects the radius targeting 3% REC on the healthy baseline. Reproducible and transferable to real data.

37-second early warning · Status: Validated

Slope of slow-window DET crosses the alert threshold 37 seconds before the synthetic thermal event boundary. Alert derived from structural change, not raw sensor thresholds.

Jitter sensitivity analysis · Status: Validated

Full quantification of how timing error degrades DET% at 100 Hz, 1 kHz, and 10 kHz. The hardware timing spec is derived directly from the signal math; no hardware is needed to establish the requirement.

Domain generalization · Status: Demonstrated

Same five metrics extended to pump health (flow, pressure, vibration, current, bearing temp) and a pressure-differential emissions leak detector. The method is hardware-domain agnostic.

Sensor data format template · Status: Delivered

CSV column specification with hardware timestamp, sensor channels, quality flags (sync_ok, sample_index), and row-drop logic. Ready to hand to a firmware team as an output spec.
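To make the format concrete, here is a sketch of a loader for such a file. Only `sync_ok` and `sample_index` are named in the text; the other column names, the sample values, and the rule "drop any row whose sync_ok is not 1" are assumptions for illustration.

```python
import csv
import io

CHANNELS = ("emi_ch1", "emi_ch2", "current_a", "temp_c")  # hypothetical names

# Made-up sample rows, 10 Hz, with one unsynchronized row and one missing row.
SAMPLE_CSV = """\
timestamp_us,sample_index,emi_ch1,emi_ch2,current_a,temp_c,sync_ok
0,0,0.012,0.010,4.20,61.5,1
100000,1,0.015,0.011,4.21,61.5,1
200000,2,0.090,0.013,4.25,61.6,0
400000,4,0.014,0.012,4.22,61.6,1
"""


def load_rows(text):
    """Parse CSV text, dropping rows whose sync_ok flag is not 1."""
    kept, dropped = [], 0
    for row in csv.DictReader(io.StringIO(text)):
        if row["sync_ok"] != "1":
            dropped += 1
            continue
        record = {"timestamp_us": int(row["timestamp_us"]),
                  "sample_index": int(row["sample_index"])}
        record.update({ch: float(row[ch]) for ch in CHANNELS})
        kept.append(record)
    return kept, dropped


def missing_indices(rows):
    """Sample indices absent from the kept rows (dropped or never received)."""
    seen = {r["sample_index"] for r in rows}
    return [i for i in range(min(seen), max(seen) + 1) if i not in seen]


rows, n_dropped = load_rows(SAMPLE_CSV)
gaps = missing_indices(rows)  # [2, 3] for the sample above
```

Keeping `sample_index` alongside the timestamp lets the analysis distinguish a row dropped for bad sync from a sample the node never emitted, which matters because recurrence analysis assumes evenly spaced samples.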

What the AI could not do — the honest list

It never touched a rack. Every physical constraint — sensor placement, cable routing, connector type — requires a human in the room with the hardware.
It initially over-indexed on sample rate requirements from academic literature. The founder caught this; the AI recalibrated once the actual hardware specification was loaded. The correction was fast — but the first answer was wrong.
It cannot sign a contract, advise on jurisdiction-specific enforceability, or replace review by a licensed attorney. The 30-page package is a strong first draft — not a substitute for legal counsel on high-stakes terms.
It cannot attend a customer site, observe real rack behavior, or validate that synthetic fault signatures match actual hardware degradation curves. Ground truth only comes from the POC.
It cannot fundraise. The investor relationship, the narrative, the judgment calls — those belong to the founders. The AI made them faster and better-prepared. It replaced none of the decisions.

Where It Stands

Three work streams that would typically require a research firm, a startup attorney, and a senior signal processing engineer are substantially complete — driven by a founding team and an AI that held context across every session.

Research & Definitions · Status: Complete

Seven academic papers synthesized into a coherent product narrative. Physics of semiconductor aging, memory error taxonomy, MdRQA mathematical framework, and sensor signal chain requirements all documented at investor-conversation quality.

Legal Framework · Status: Pending execution

30-page contract package across three agreements: NDA, stock option agreement, and scope of work. Payment terms, IP assignment, equity structure, and prototype phase gates all defined. Ready for attorney review and execution.

Prototype Validation · Status: Paused, contracts pending

Self-contained MdRQA notebook validating the full sensor-to-alert pipeline on synthetic data. Hardware timing requirements quantified. Sensor format template delivered. Prototype paused pending contract execution; the technology did not stall.

What this replaced

No research firm. No signal processing consultant. No startup attorney for first-draft contracts. No data scientist to build the validation notebook. The AI handled all of those functions — not by replacing domain expertise, but by giving the founding team enough fluency to make every decision themselves, with the research and math visible and auditable at all times.

The notebook replaced what would typically be a multi-month sensor selection study. Running synthetic data through the full alert pipeline before committing to hardware is not just cheaper — it is the correct engineering sequence. Sensor selection becomes a constrained optimization once you know what the math requires. The hardware POC becomes confirmation of a known answer, not a discovery process.
