Approach

Engineering rigour, applied to machine learning.

We've replaced the slow, document-heavy CRISP-DM lifecycle with a production-first delivery model. Every project is wired end-to-end from sprint one (ingestion, feature engineering, modelling, deployment, and continuous monitoring), so capability accumulates sprint over sprint rather than decaying.

Ingest

APIs · streams · DBs · FRED · Finage · OANDA · MetaTrader

Engineer

Cleaning · alignment · resampling · feature stores · lookback windows

Model

MRN ensembles · LightGBM · LLMs / RAG · agentic workflows

Deploy

GitHub Actions CI/CD · Docker · AWS EC2/EBS · Cloudflare

Monitor

Drift detection · walk-forward validation · stability dashboards

Decision impact: every step is wired end-to-end. We do not deliver a notebook; we deliver a continuously running system you can trust on Monday morning.

Operating principles

Five non-negotiables.

Start from the decision, not the data

Every engagement begins by characterising the decision the model will inform: cadence, cost of error, stakeholder, and downstream system. Modelling choices flow from there — not the other way around.

Honest validation or it does not ship

Time-series problems demand walk-forward validation, leak-free splits, and out-of-sample stress tests. We never report metrics we wouldn't stake our own money on.
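As an illustration of what a walk-forward, leak-free split can look like, here is a minimal sketch in plain Python. The function name `walk_forward_splits` and its parameters (`initial_train`, `test_size`, `gap`) are assumptions made for this example, not a description of any particular production pipeline. It uses an expanding training window and an optional gap between train and test so that lookback features cannot straddle the boundary.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int,
    initial_train: int,
    test_size: int,
    gap: int = 0,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs in strict time order.

    The training window expands by `test_size` each fold. `gap` rows are
    skipped between train and test so no training row overlaps the test
    period through a lookback window (hypothetical parameter names).
    """
    if initial_train <= 0 or test_size <= 0 or gap < 0:
        raise ValueError("initial_train and test_size must be > 0, gap >= 0")

    train_end = initial_train
    while train_end + gap + test_size <= n_samples:
        test_start = train_end + gap
        train_idx = list(range(train_end))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx
        train_end += test_size  # roll forward by one test block


# Example: 10 observations, 4 used for the first fit, 2-step test blocks,
# and a 1-row gap between training and test data.
folds = list(walk_forward_splits(n_samples=10, initial_train=4, test_size=2, gap=1))
for train_idx, test_idx in folds:
    print("train up to", train_idx[-1], "-> test", test_idx)
```

Every test index is strictly later than every training index, which is the property that keeps future information out of the fit.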

Production from day one

Pipelines are versioned, containerised, and CI/CD-deployed from the first sprint. There is no "throw it over the wall" handover, because there is never a wall.

Monitor, retrain, defend

Drift detection, performance dashboards, and scheduled retraining are part of the deliverable. Models age — our systems are designed to know it before you do.
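One common way to quantify drift on a single feature is the Population Stability Index (PSI). The sketch below is a plain-Python illustration under stated assumptions: equal-width bins derived from the reference sample, a small `eps` floor to avoid taking the log of zero, and the function name `population_stability_index` are all choices made for this example. A frequently cited rule of thumb treats PSI above roughly 0.25 as a significant shift, but suitable thresholds depend on the problem.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    n_bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a reference sample and a current sample of one feature."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread to bin")
    width = (hi - lo) / n_bins

    def bin_fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        # Floor each fraction at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    ref = bin_fractions(reference)
    cur = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Synthetic example: the "current" sample is the reference shifted upward.
reference = [i / 100 for i in range(100)]
shifted = [x + 0.3 for x in reference]

print(population_stability_index(reference, reference))  # no drift
print(population_stability_index(reference, shifted))    # clear drift
```

In a monitored system, a check like this would typically run per feature on a schedule, with the result feeding a dashboard or a retraining trigger.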

Explainable by default

Every classifier and forecast we deliver is interrogated with SHAP (SHapley Additive exPlanations) as a standard part of our practice — quantifying the contribution of each feature both locally (for individual predictions) and globally (across the whole dataset). Models earn trust by showing their work.
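To make the local versus global distinction concrete, the sketch below computes exact Shapley values by brute force for a toy model. It is not the SHAP library: it enumerates every feature coalition (only feasible for a handful of features) and replaces "absent" features with a single baseline vector, which is a simplifying assumption. Libraries such as SHAP approximate the same quantity far more efficiently and usually average over a background dataset. The names `shapley_values`, `global_importance`, and `toy_model` are hypothetical.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Local attribution: exact Shapley value of each feature for one row."""
    n = len(x)

    def value(subset: frozenset) -> float:
        # Features in the coalition take the row's value, the rest the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def global_importance(predict, rows, baseline) -> List[float]:
    """Global view: mean absolute Shapley value per feature across rows."""
    totals = [0.0] * len(baseline)
    for row in rows:
        for i, p in enumerate(shapley_values(predict, row, baseline)):
            totals[i] += abs(p)
    return [t / len(rows) for t in totals]


def toy_model(v: Sequence[float]) -> float:
    # One additive term plus one interaction term.
    return 2 * v[0] + v[1] * v[2]


baseline = [0.0, 0.0, 0.0]
local = shapley_values(toy_model, [1.0, 2.0, 3.0], baseline)
overall = global_importance(toy_model, [[1.0, 2.0, 3.0], [1.0, 0.0, 0.0]], baseline)
print(local, overall)
```

The local values sum to the difference between the prediction for the row and the prediction for the baseline, and the interaction term is shared equally between the two features that produce it.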

What's new in our toolkit

GenAI, LLMs, RAG, and agentic AI — used where they earn it.

Retrieval-Augmented Generation

Grounded LLM systems that answer from your documents, warehouses, and knowledge bases — with citation, evaluation, and guardrails wired in from the start.

Agentic workflows

Multi-step LLM agents that orchestrate tools, APIs, and classical ML models — with explicit cost, latency, and reliability budgets, not magical promises.
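An explicit budget can be as simple as a counter that the agent loop checks after every step. The sketch below is illustrative only: `RunBudget`, `run_with_budget`, and the convention that each step returns `(output, cost_usd, latency_s)` are assumptions for this example. In a real system the cost and latency figures would come from provider usage metadata and timers rather than from the step itself.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# A step is any callable returning (output, cost in USD, latency in seconds).
Step = Callable[[], Tuple[str, float, float]]


@dataclass
class RunBudget:
    max_cost_usd: float
    max_latency_s: float
    max_steps: int
    spent_usd: float = 0.0
    elapsed_s: float = 0.0
    steps: int = 0

    def record(self, cost_usd: float, latency_s: float) -> None:
        self.spent_usd += cost_usd
        self.elapsed_s += latency_s
        self.steps += 1


def run_with_budget(steps: Sequence[Step], budget: RunBudget) -> Tuple[List[str], str]:
    """Run steps in order, stopping as soon as any budget is breached.

    Budgets are checked after each step completes, so the breaching step's
    output is discarded. A stricter variant would estimate cost up front.
    """
    outputs: List[str] = []
    for step in steps:
        if budget.steps >= budget.max_steps:
            return outputs, "step budget exhausted"
        output, cost_usd, latency_s = step()
        budget.record(cost_usd, latency_s)
        if budget.spent_usd > budget.max_cost_usd:
            return outputs, "cost budget exceeded"
        if budget.elapsed_s > budget.max_latency_s:
            return outputs, "latency budget exceeded"
        outputs.append(output)
    return outputs, "completed"


def fake_tool() -> Tuple[str, float, float]:
    # Stand-in for a tool or LLM call with a made-up cost and latency.
    return "ok", 0.02, 0.5


budget = RunBudget(max_cost_usd=0.05, max_latency_s=10.0, max_steps=5)
outputs, status = run_with_budget([fake_tool] * 3, budget)
print(outputs, status)
```

Returning a status string alongside the partial outputs lets the caller decide whether to fall back, retry with a cheaper plan, or surface the failure.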

LLMs alongside time-series ML

We use language models to augment — not replace — the rigorous time-series and classification systems that drive measurable business outcomes.

Tools we reach for

A modern, opinionated stack.

Modelling

MRN ensembles
LightGBM / XGBoost
PyTorch / TensorFlow
scikit-learn
Hugging Face Transformers

GenAI & LLMs

OpenAI / Anthropic / open-weight LLMs
RAG pipelines
LangChain / LlamaIndex
Agentic workflows
Evaluation harnesses

Data & storage

MariaDB / PostgreSQL
Pandas / NumPy / Polars
Feature stores
Time-series DBs
S3 / object storage

Sources & APIs

FRED / OECD macro data
Finage / OANDA / MetaTrader
Bloomberg / Refinitiv
Custom scrapers
Internal warehouses

MLOps & deploy

GitHub Actions CI/CD
Docker / containerisation
AWS EC2 / EBS / RDS
Cloudflare Pages / Workers
MLflow tracking

Explainability

SHAP (Shapley values)
Local feature attribution
Global feature importance
Partial-dependence plots
Calibration curves

Validation & monitoring

Walk-forward backtesting
Drift detection
Performance dashboards
Statistical significance tests
A/B and shadow deploys

Let's build something measurable

Have a forecasting or classification problem that needs to work in production?

Tell us about it. First conversations are confidential, no-obligation, and usually end with a clear view of feasibility, data needs, and time-to-value.

Book a confidential call: jtepper@perceptronix.net