Tracking My Health Like a Lab Experiment with WHOOP + AI

Mar 04, 2026

Most OpenClaw skills are transactional — they do something once and report back. Check the weather, send an email, open a PR. This one is different. It's longitudinal. It tracks what's happening inside my body over weeks, interprets that data in light of my specific health context, and can run controlled experiments to test whether things I'm doing are actually working. Today it kicked off its first real experiment: measuring whether CoQ10 supplementation helps offset the muscle recovery penalty I've been paying for being on a statin.

Why WHOOP, and Why Integrate It?

WHOOP is a wearable that tracks recovery, sleep, strain, and HRV continuously — no button presses, no logging. The data is there whether you look at it or not. But a recovery score sitting in a health app only knows about your body. It doesn't know about your life.

That's the gap a personal AI assistant can close. Hank has access to my calendar, my to-do list, my Obsidian notes, our entire conversation history. He knows when I stayed up late working on something, when a stressful week was loading up my schedule, when a hard training block coincided with a run of high-pressure days. WHOOP captures the physiological output of all of that. Hank can cross-reference the two — and that's where it gets interesting.

A recovery score in isolation is a data point. A recovery score mapped against "you had back-to-back late nights this week, a high-stakes meeting Thursday, and your heaviest leg session in a month" is a story. That's the kind of synthesis a single-purpose health app is structurally unable to do.

What the Skill Does

The WHOOP skill is built around the official WHOOP Developer API (v2). It handles OAuth token management automatically — access token refresh, credential storage — so fetching data is a single command. Here's what's in the box:

Data fetching — Pull recovery, sleep, strain, and workout data via a general-purpose fetch script. Supports date ranges, pagination, and any API endpoint. Output is clean JSON.
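A minimal sketch of the pagination loop a fetch script like this might use. It assumes collection responses carry a `records` list and a `next_token` cursor that is sent back as a `nextToken` query parameter; check those names against the WHOOP docs. `fetch_page` is a hypothetical stand-in for the authenticated HTTP call, so the loop runs without network access.

```python
from typing import Callable, Iterator, Optional


def fetch_all(
    fetch_page: Callable[[dict], dict],
    start: str,
    end: str,
    limit: int = 25,
) -> Iterator[dict]:
    """Yield every record between start and end, following the cursor."""
    params = {"start": start, "end": end, "limit": limit}
    while True:
        page = fetch_page(params)
        yield from page.get("records", [])
        token: Optional[str] = page.get("next_token")
        if not token:
            break  # no cursor means this was the last page
        params = {**params, "nextToken": token}


# Stand-in for the real HTTP call: two fake pages keyed by cursor.
_PAGES = {
    None: {"records": [{"id": 1}, {"id": 2}], "next_token": "abc"},
    "abc": {"records": [{"id": 3}], "next_token": None},
}


def fake_fetch_page(params: dict) -> dict:
    return _PAGES[params.get("nextToken")]


records = list(
    fetch_all(fake_fetch_page, "2026-02-19T00:00:00Z", "2026-03-05T00:00:00Z")
)
```

Passing the page fetcher in as a callable keeps the cursor logic separate from auth and transport, which also makes it easy to test offline.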

Contextual interpretation — A health analysis reference document, which the assistant consults when explaining numbers, covers HRV ranges by age, RHR interpretation by fitness level, sleep stage targets, recovery zone guidance, and overtraining pattern recognition. The assistant applies that guidance adjusted for my specific health context.

Obsidian logging — A log script appends today's WHOOP stats to the daily note in my Obsidian vault, commits, and pushes to GitHub. Recovery, HRV, RHR, sleep performance, sleep duration, and day strain — formatted as a clean markdown table. Idempotent — safe to run multiple times without creating duplicate entries. Six months from now I can look back at any day and see recovery, sleep, and strain sitting right alongside whatever notes I made that day.
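One way to get that idempotency is a marker check before appending. The sketch below is an assumption about the approach, not the skill's actual code: `MARKER`, `render_table`, and `log_stats` are hypothetical names, and the git commit and push steps are left out.

```python
import tempfile
from pathlib import Path

MARKER = "<!-- whoop-stats -->"  # hypothetical marker, not the skill's real one


def render_table(stats: dict) -> str:
    """Render the stats as a two-column markdown table, tagged with MARKER."""
    header = "| Metric | Value |\n|---|---|\n"
    rows = "".join(f"| {name} | {value} |\n" for name, value in stats.items())
    return f"{MARKER}\n{header}{rows}"


def log_stats(note: Path, stats: dict) -> bool:
    """Append the table once; return False if the note already contains it."""
    text = note.read_text(encoding="utf-8") if note.exists() else ""
    if MARKER in text:
        return False  # already logged today, so do nothing
    if text and not text.endswith("\n"):
        text += "\n"
    note.write_text(text + render_table(stats), encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    note = Path(tmp) / "2026-03-04.md"
    stats = {"Recovery": "53%", "Sleep performance": "93%"}
    first = log_stats(note, stats)
    second = log_stats(note, stats)  # second run is a no-op
    marker_count = note.read_text(encoding="utf-8").count(MARKER)
```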

Chart generation — A charting script generates self-contained HTML files (no build step, just Chart.js from CDN) for recovery, sleep breakdown, HRV trend, strain, and a 2×2 dashboard combining all four. Dark theme with stat cards showing avg/min/max and trend arrows.
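The stat cards reduce each series to a few numbers. A rough sketch of how avg/min/max and a trend arrow could be computed, assuming the trend compares the later half of the window with the earlier half; that rule, the `flat_band` tolerance, and the sample values are illustrative guesses, not taken from the skill.

```python
from statistics import mean


def stat_card(values: list[float], flat_band: float = 0.02) -> dict:
    """Summarize a series (2+ values) as avg/min/max plus a trend arrow."""
    half = len(values) // 2
    earlier, later = mean(values[:half]), mean(values[half:])
    change = (later - earlier) / earlier  # relative change between halves
    if change > flat_band:
        arrow = "↑"
    elif change < -flat_band:
        arrow = "↓"
    else:
        arrow = "→"
    return {
        "avg": round(mean(values), 1),
        "min": min(values),
        "max": max(values),
        "trend": arrow,
    }


card = stat_card([42.0, 44.0, 46.0, 50.0])  # made-up HRV values in ms
```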

Experiment tracking — The most distinctive capability. The skill includes a full experiment engine — define a hypothesis, set start/end dates, select metrics to track, and the system auto-captures a baseline from the 14 days prior. Mid-experiment status checks and final reports compare current averages against baseline with % deltas. A post-workout segmentation mode measures recovery metrics specifically in the 24–48h window after qualifying training sessions, so you can evaluate interventions that affect post-strength recovery rather than general daily averages.
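The comparison against baseline comes down to a relative change per metric. Here is a small sketch of that calculation; `percent_delta` and `compare` are hypothetical helper names, the baseline values are the ones reported in this post, and the treatment readings are invented purely to exercise the code.

```python
from statistics import mean


def percent_delta(baseline: float, current: float) -> float:
    """Relative change of current vs baseline, in percent."""
    return (current - baseline) / baseline * 100


def compare(baseline: dict, readings: list[dict]) -> dict:
    """Average each tracked metric over readings and diff it against baseline."""
    report = {}
    for metric, base in baseline.items():
        current = mean(r[metric] for r in readings)
        report[metric] = {
            "baseline": base,
            "current": round(current, 1),
            "delta_pct": round(percent_delta(base, current), 1),
        }
    return report


baseline = {"hrv_ms": 45.5, "recovery_pct": 69.0}  # baseline from this post
readings = [  # invented treatment-window readings, for illustration only
    {"hrv_ms": 48.0, "recovery_pct": 72.0},
    {"hrv_ms": 52.0, "recovery_pct": 78.0},
]
report = compare(baseline, readings)
```

With these invented readings the HRV delta comes out at 9.9% and the recovery delta at 8.7%, both just short of a 10% relative target.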

Morning brief integration — Each morning, recovery and HRV are included in the daily Telegram briefing. If recovery is in the red (below 34%), it's flagged proactively regardless of whether I've asked.
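The flagging rule is simple enough to sketch. The 34% red cutoff is the one stated above; the 67% green boundary is WHOOP's usual zone split and is treated here as an assumption. `recovery_zone` and `brief_line` are hypothetical names, and sending the message to Telegram is omitted.

```python
def recovery_zone(score: float) -> str:
    """Map a recovery percentage to a zone name."""
    if score < 34:
        return "red"
    if score < 67:  # assumed yellow/green boundary
        return "yellow"
    return "green"


def brief_line(recovery: float, hrv_ms: float) -> str:
    """Build the recovery line for the morning brief, flagging red days."""
    zone = recovery_zone(recovery)
    line = f"Recovery {recovery:.0f}% ({zone}), HRV {hrv_ms:.1f} ms"
    if zone == "red":
        line = "FLAG: " + line
    return line


today = brief_line(53, 45.5)
```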

What Makes This Skill Different

Most OpenClaw skills optimize for a single interaction. The weather skill fetches conditions and reports them. The GitHub skill opens PRs. The Google Workspace skill sends emails. Each skill does its thing and gets out of the way.

The WHOOP skill differs along two dimensions. The first is time: it is longitudinal. A single recovery score is almost meaningless in isolation, but trends over weeks reveal real signal. The second is scope: it is cross-domain. Health data becomes more meaningful when it's correlated with everything else the assistant knows about your life.

Cross-domain correlation — When recovery dips, Hank can check whether there was a late-night work session in our conversation history, a run of back-to-back calendar commitments, or a heavy training block the day before. WHOOP sees the output. The assistant sees the inputs too.

Longitudinal tracking — Data accumulates daily. Value compounds over weeks. The baseline you establish today becomes the benchmark you measure against in April. No other OpenClaw skill works on that kind of time horizon.

Proactive synthesis — Hank doesn't wait to be asked. If recovery has been suppressed for several days and the calendar shows a heavy week ahead, that's worth flagging without prompting — because the assistant can see both sides of the equation.
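A proactive check like that could be as small as the sketch below. Everything in it is an assumption for illustration: the function names, the three-day streak, the event count that counts as a "heavy week", and the use of the 69% baseline as the suppression threshold.

```python
def suppressed_streak(recoveries: list[float], threshold: float) -> int:
    """Count the most recent consecutive days below threshold."""
    streak = 0
    for score in reversed(recoveries):
        if score >= threshold:
            break
        streak += 1
    return streak


def should_flag(
    recoveries: list[float],
    upcoming_events: int,
    baseline: float = 69.0,  # assumed: the 14-day average from this post
    min_streak: int = 3,     # assumed: "several days"
    busy_events: int = 15,   # assumed: what counts as a heavy week
) -> bool:
    """Flag only when recovery is suppressed AND the calendar is heavy."""
    streak = suppressed_streak(recoveries, baseline)
    return streak >= min_streak and upcoming_events >= busy_events


recent = [72, 60, 55, 53]  # oldest to newest; invented values
```

Requiring both conditions is the point: either signal alone is something a wearable or a calendar could raise by itself.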

The experiment tracker is the clearest expression of the longitudinal piece. No other OpenClaw skill stores a baseline, tracks progression, and produces a verdict. That's a fundamentally different kind of tool — closer to a research protocol than an API wrapper.

The CoQ10 Experiment

The conversation that kicked off the experiment this morning is a good example of how the skill is meant to be used in practice: not just pulling data on demand, but acting on it.

My legs were still sore from a recent training session. Recovery was 53%, below my 14-day average of 69%, despite a strong sleep night (93% performance, 8h 19m). I read that gap as the statin signature. Statins block the pathway the body uses to synthesize CoQ10, which muscles need for repair, so CoQ10 levels drop and DOMS runs harder and longer than it would otherwise. A 53% recovery after leg day isn't "bad sleep or lifestyle"; it's systemic inflammation that sleep alone can't account for.

The fix I ordered: Jarrow Ubiquinol 200mg. Ubiquinol is the reduced form of CoQ10 — more bioavailable than standard ubiquinone, especially important for statin users whose conversion pathway is compromised. It arrives tomorrow.

Rather than just starting it and noticing (or not noticing) vague improvements, we kicked off a formal experiment:

Experiment: Ubiquinol 200mg daily (Jarrow) — statin DOMS + recovery
Hypothesis: Post-strength HRV will stabilize higher and recovery scores
  in the 48–72h window after leg days will improve by 10%+ vs baseline
  after 6–8 weeks of daily ubiquinol supplementation.

Period: 2026-03-05 → 2026-04-30

Baseline (last 14 days):
  HRV: 45.5ms
  Recovery: 69%
  RHR: 47 bpm

The baseline was auto-captured from the 14 days prior to the start date — before the first pill. That's the control. Starting tomorrow, every recovery score and HRV reading is part of the treatment window. At 4 weeks I'll pull a mid-point status report. At 8 weeks, a full verdict.
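The dates can be checked with a few lines of date arithmetic. This sketch assumes the baseline window is the 14 days ending the day before the start date; `experiment_schedule` is a hypothetical helper, not part of the skill.

```python
from datetime import date, timedelta


def experiment_schedule(start: date, end: date, baseline_days: int = 14) -> dict:
    """Derive the baseline window and checkpoints from the experiment period."""
    return {
        "baseline_start": start - timedelta(days=baseline_days),
        "baseline_end": start - timedelta(days=1),
        "midpoint": start + timedelta(weeks=4),
        "end": end,
        "length_days": (end - start).days,
    }


schedule = experiment_schedule(date(2026, 3, 5), date(2026, 4, 30))
```

For this period the baseline runs 2026-02-19 through 2026-03-04, the 4-week checkpoint lands on 2026-04-02, and the end date is exactly 56 days (8 weeks) after the start.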

What the Experiment Measures

The hypothesis is specific: 10%+ improvement in recovery scores specifically in the 48–72 hour window after strength training sessions. Not overall recovery. Not just vibes. That window is where CoQ10's effect on muscle repair should show up first — if it's working, that's where HRV will stabilize faster post-workout.

HRV (RMSSD) — The most sensitive signal for systemic stress and recovery capacity. If CoQ10 reduces the inflammation load from DOMS, HRV should trend upward — specifically in the 2–3 days after leg sessions.

Recovery score — The composite output. Read as a relative change, a 10%+ improvement on the 69% baseline is roughly 76% or better. In practice that would mean consistently hitting the high-yellow to green range on days I previously spent in the low-yellow.

RHR — A secondary signal. Resting HR tends to elevate slightly under systemic inflammation. If muscle repair improves, RHR should hold steadier post-training.
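Restricting the metrics to the 48–72 hour window after a strength session is a filtering step. A sketch under stated assumptions: readings carry a `time` field, workouts are represented by their end timestamps, and all names and sample values are made up for illustration.

```python
from datetime import datetime, timedelta
from statistics import mean


def in_window(reading_time, workout_end, lo_h=48, hi_h=72):
    """True if the reading falls lo_h..hi_h hours after the workout ended."""
    delta = reading_time - workout_end
    return timedelta(hours=lo_h) <= delta <= timedelta(hours=hi_h)


def post_workout_mean(readings, workouts, metric, lo_h=48, hi_h=72):
    """Mean of metric over readings inside the window after any workout."""
    selected = [
        r[metric]
        for r in readings
        if any(in_window(r["time"], w, lo_h, hi_h) for w in workouts)
    ]
    return mean(selected) if selected else None


workouts = [datetime(2026, 3, 9, 18, 0)]  # invented leg session end time
readings = [  # invented morning readings on the following four days
    {"time": datetime(2026, 3, 10, 7, 0), "recovery": 50},  # 13h after
    {"time": datetime(2026, 3, 11, 7, 0), "recovery": 55},  # 37h after
    {"time": datetime(2026, 3, 12, 7, 0), "recovery": 60},  # 61h after
    {"time": datetime(2026, 3, 13, 7, 0), "recovery": 70},  # 85h after
]
window_mean = post_workout_mean(readings, workouts, "recovery")
```

The window bounds are parameters, so the same helper covers the 24–48h segmentation mode described earlier by passing `lo_h=24, hi_h=48`.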

What I'm Watching For

Eight weeks from now, I want to see a few things:

HRV creeping up — Baseline is 45.5ms. Even a 10% improvement to ~50ms would be meaningful and consistent with published CoQ10 supplementation studies in statin users.

Recovery bouncing back faster after leg days — Right now I spend 2–3 days in yellow after a hard leg session. If the soreness resolves faster, recovery should follow — I'd expect to see green within 48h instead of 72h.

Subjective soreness aligning better with the WHOOP score — If the CoQ10 is helping, the scores should feel more accurate, with fewer "I feel fine but it says yellow" mornings. The gap between perceived readiness and measured recovery is a signal in itself.

I'll report back at the 4-week mark with an interim status check, and again at 8 weeks with the full verdict. Either way it's useful data — a clear improvement validates the intervention, a null result tells me to look elsewhere (sleep debt, training volume, something else).

The Honest Limitations

This isn't a randomized controlled trial. There's no blinding, no control group. Life changes — training load, stress, sleep schedule, nutrition — will confound the signal. A bad week at the gym looks like declining recovery even if CoQ10 is doing its job.

What the experiment gives is directional signal, not clinical proof. If HRV trends up over 8 weeks and recovery after leg days improves consistently, that's meaningful evidence — not proof, but better than "I think I feel slightly better." That's the bar I'm holding it to.

The timeline matters too. Plasma CoQ10 levels start rising within 1–2 weeks of supplementation, but meaningful changes in muscle repair take 4–8 weeks. Judging at week 2 would be too early. The 8-week window was chosen specifically to give the supplement enough time to demonstrate an effect.

Health Context That Travels With You

Because Hank knows my full health context, including that I'm on a statin, which lowers the body's CoQ10 production and amplifies DOMS, he reads recovery scores with that in mind. A suppressed recovery after a hard leg session isn't "bad sleep or lifestyle." It's a predictable effect on muscle repair. That changes the interpretation, and it changes what I should do about it.

But health context is just one example of what the assistant carries across the full picture. The more interesting capability is that health data doesn't live in a silo — it connects to everything else. A rough week of recovery makes more sense when the assistant can also see the calendar that week, the late nights, the todos that didn't get done. That's context no wearable can generate on its own.

What We Learned Building It

Context transforms interpretation — The hardest part wasn't fetching the data — it was building the interpretation layer that knows what the numbers mean for my specific physiology. A recovery score without context is a number. With it, it's a signal.

Experiment infrastructure is worth building — It would have been easy to just start taking CoQ10 and see how I felt. That approach generates anecdotes. A baseline + defined hypothesis + tracked metrics generates evidence. The extra 10 minutes to kick off the experiment formally was obviously worth it.

Longitudinal tools need a different design philosophy — Most skills are built to minimize state. WHOOP needs state — a persisted experiment file, a baseline that survives session restarts, a log history in Obsidian. Embracing that from the start led to a cleaner design than trying to bolt it on later.

The best health tools ask why, not just what — The WHOOP app tells you what your recovery score is. The skill asks why it's there, whether today's 53% is meaningful, and what you should watch over the next 8 weeks. That shift from reporting to reasoning is the real product.
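The persisted experiment file mentioned in the third lesson might look something like the sketch below. The `Experiment` fields and the JSON layout are assumptions made for illustration, not the skill's actual schema; the values are the ones from this post.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Experiment:
    name: str
    hypothesis: str
    start: str  # ISO dates kept as strings so the file stays plain JSON
    end: str
    baseline: dict = field(default_factory=dict)


def save(exp: Experiment, path: Path) -> None:
    """Write the experiment to disk so it survives session restarts."""
    path.write_text(json.dumps(asdict(exp), indent=2), encoding="utf-8")


def load(path: Path) -> Experiment:
    """Rebuild the experiment from its JSON file."""
    return Experiment(**json.loads(path.read_text(encoding="utf-8")))


exp = Experiment(
    name="Ubiquinol 200mg daily (Jarrow)",
    hypothesis="Recovery 48-72h after leg days improves by 10%+ vs baseline",
    start="2026-03-05",
    end="2026-04-30",
    baseline={"hrv_ms": 45.5, "recovery_pct": 69, "rhr_bpm": 47},
)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "experiment.json"
    save(exp, path)
    restored = load(path)  # simulates a fresh session reading the file
```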

Tools: WHOOP Developer API v2 · Python 3 · Chart.js · Obsidian · OpenClaw

Supplement: Jarrow Ubiquinol 200mg (fat-soluble — take with food)

Follow-up: 4-week interim report coming early April

Install the skill: clawhub.ai/brennaman/whoop-lab

Originally published at https://www.paulbrennaman.me/lab/whoop-skill
