The Science Behind the Platform

Splitifi’s predictions are not based on legal research databases, attorney surveys, or rule-of-thumb heuristics. They are built on primary source court records – 11.2 million of them – processed through a purpose-built machine learning infrastructure.

This page explains how it works, what makes it accurate, and why the approach produces results that generic legal tools cannot match.

The Training Data

The foundation is a dataset of 11.2 million court outcomes collected from jurisdictions across the United States. These are real case records: filings, rulings, orders, and dispositions from actual courts in the counties where cases are decided.

The data is structured at the county level. A child support prediction for a case in Palm Beach County, Florida is trained on Palm Beach County records – not Florida state data, not national aggregates. The system accounts for the variation between jurisdictions that drives real outcomes.

Every record is processed through a validation pipeline before entering the training set. Records with data integrity issues, leakage risk, or insufficient metadata are excluded or flagged. The 37 suspended models in the library represent cases where the pipeline identified leakage – they are suspended, not deleted, and subject to retraining when cleaner data is available.
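A validation gate like the one described can be sketched as follows. This is a minimal illustration, not the platform's actual pipeline: the field names (`REQUIRED_FIELDS`, `LEAKY_FEATURES`) and the record schema are assumptions made for the example.

```python
# Minimal sketch of a record-validation gate. All field names are
# hypothetical; real pipelines would apply many more checks.
REQUIRED_FIELDS = {"county", "case_type", "filing_date", "disposition"}

# Assumed leakage check: a feature that directly encodes the outcome
# (e.g. the final award) must never appear among model inputs.
LEAKY_FEATURES = {"final_award", "judgment_amount"}

def validate_record(record: dict) -> str:
    """Classify a raw court record before it can enter a training set.

    Returns "excluded" for integrity failures, "flagged" for leakage
    risk, and "accepted" otherwise.
    """
    if not REQUIRED_FIELDS.issubset(record):
        return "excluded"   # insufficient metadata
    if LEAKY_FEATURES & set(record.get("features", {})):
        return "flagged"    # leakage risk: outcome leaked into inputs
    return "accepted"

records = [
    {"county": "Palm Beach", "case_type": "child_support",
     "filing_date": "2024-03-01", "disposition": "ordered",
     "features": {"income_gap": 42000}},
    {"county": "Harris", "case_type": "spousal_support",
     "filing_date": "2023-11-12", "disposition": "ordered",
     "features": {"final_award": 1500}},   # outcome leaked into features
    {"county": "Miami-Dade"},              # missing required metadata
]

statuses = [validate_record(r) for r in records]
print(statuses)  # ['accepted', 'flagged', 'excluded']
```

Flagged records, like the suspended models they feed, stay in the system so they can be revisited when cleaner data arrives.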

The Model Architecture

Adapt Labs maintains 1,132 trained models across 10+ legal verticals. Of these:

  • 1,070 are in active production – serving live predictions to the platform
  • 25 are experimental – covering genuinely difficult prediction domains where training data is sparse or outcome variables are complex
  • 37 are suspended – held back due to data quality issues, not deleted

Models are jurisdiction-specific. A spousal support model for Los Angeles County is a different model from one trained on Harris County records. This specificity is the primary driver of prediction accuracy.
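One way to picture jurisdiction-specific modeling is a registry keyed by (county, case type), so no two counties ever share a model. The `ModelRegistry` class below is a hypothetical sketch for illustration, not the platform's actual API:

```python
# Hypothetical registry: one model per (county, case_type) pair, so
# Los Angeles County and Harris County never share weights.
class ModelRegistry:
    def __init__(self):
        self._models = {}

    def register(self, county: str, case_type: str, model) -> None:
        self._models[(county, case_type)] = model

    def lookup(self, county: str, case_type: str):
        key = (county, case_type)
        if key not in self._models:
            raise KeyError(f"no model trained for {key}")
        return self._models[key]

registry = ModelRegistry()
registry.register("Los Angeles County", "spousal_support", "LA-model")
registry.register("Harris County", "spousal_support", "Harris-model")

# The same case type resolves to different models in different counties.
print(registry.lookup("Los Angeles County", "spousal_support"))  # LA-model
print(registry.lookup("Harris County", "spousal_support"))       # Harris-model
```

Raising on a missing key, rather than silently falling back to a broader model, matches the design goal: a prediction is only as good as the jurisdiction it was trained on.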

The model library covers family law, criminal, employment, intellectual property, securities, healthcare, appellate, antitrust, tax, and behavioral verticals. Validation of the full library was completed in March 2026.

Accuracy Metrics

Three headline accuracy figures cover the core family law models. These are validated against held-out test sets – not training performance.

Child Support Prediction: R² = 99.9%

Child support calculations follow statutory formulas in most jurisdictions, which makes them highly predictable when the inputs are accurate. The model incorporates jurisdictional formula variations, deviation factors, and judge-specific tendencies to achieve near-perfect accuracy on cases with complete financial data.
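To see why statutory formulas make this tractable, consider a deliberately simplified income-shares style calculation. This is an illustration only: it is not any state's actual schedule, and the function name, parameters, and parenting-time credit are hypothetical.

```python
def child_support_income_shares(parent_a_income: float,
                                parent_b_income: float,
                                basic_obligation: float,
                                parent_a_overnights_pct: float) -> float:
    """Simplified income-shares style calculation (illustrative only;
    real statutory schedules vary by state and include many
    adjustments and deviation factors)."""
    combined = parent_a_income + parent_b_income
    share_a = parent_a_income / combined
    # The paying parent owes their income share of the basic
    # obligation, reduced here by a crude parenting-time credit.
    return basic_obligation * share_a * (1 - parent_a_overnights_pct)

# Parent A earns $6,000/mo, Parent B $4,000/mo, basic obligation
# $1,500/mo, Parent A has 20% of overnights.
print(round(child_support_income_shares(6000, 4000, 1500, 0.2), 2))  # 720.0
```

Because the output is a deterministic function of a handful of inputs, a model that learns each county's formula and its local deviation factors can approach the formula's own accuracy whenever the financial inputs are complete.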

Spousal Support Prediction: R² = 88.8%

Spousal support involves more judicial discretion than child support, which introduces more variance. The model captures the factors judges weigh most heavily in each jurisdiction – marriage length, income differential, contributions to the marriage, and prior standard of living. R² of 88.8% on held-out test data reflects strong predictive power in a domain where outcome variability is inherent.

Settlement Prediction: MAE = $18,420

Settlement prediction is the hardest problem – it requires modeling not just judicial behavior but negotiating behavior between parties and their counsel. A mean absolute error of $18,420 means the model's predictions land, on average, within $18,420 of the actual settlement amounts in the test set. In the context of cases that often involve hundreds of thousands of dollars in asset division, that precision materially affects negotiating strategy.
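Both headline metrics are standard and easy to state precisely. The sketch below computes R² and MAE on a toy held-out set (the data is invented for illustration); these are the same definitions the figures above refer to.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 minus the ratio of residual
    error to total variance around the mean."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def mean_absolute_error(actual, predicted):
    """Average absolute gap between prediction and outcome."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Toy held-out set: cases never seen during training.
actual    = [1200.0, 950.0, 1500.0, 800.0]
predicted = [1180.0, 930.0, 1525.0, 815.0]

print(round(r_squared(actual, predicted), 4))            # 0.9941
print(round(mean_absolute_error(actual, predicted), 2))  # 20.0
```

Evaluating on held-out cases rather than training cases is what makes the reported numbers a claim about generalization, not memorization.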

Judge Intelligence

The system contains 16,302 judge profiles; 6,251 of them carry full behavioral analytics.

A full analytics profile means the judge has a sufficient volume of case history to build a reliable behavioral model. For each judge with full analytics, the system tracks:

  • Ruling tendencies across case types and fact patterns
  • Factor weighting – which factors the judge treats as most predictive of outcomes
  • Deviation from jurisdictional norms – how much this judge diverges from the median outcome in their court
  • Consistency over time – whether the judge’s patterns have shifted
  • Outcome distributions – the range of outcomes for similar cases
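The "deviation from jurisdictional norms" signal, for example, can be made concrete with a toy metric. The formula and data here are illustrative assumptions, not the platform's actual analytics:

```python
from statistics import median

def deviation_from_norm(judge_outcomes, court_outcomes):
    """How far a judge's median outcome sits from the court-wide
    median, as a fraction of the court median (illustrative only)."""
    judge_med = median(judge_outcomes)
    court_med = median(court_outcomes)
    return (judge_med - court_med) / court_med

court = [800, 900, 1000, 1100, 1200]  # monthly awards across the court
judge = [1100, 1150, 1300]            # one judge's awards, similar cases

print(round(deviation_from_norm(judge, court), 3))  # 0.15 -> 15% above norm
```

A judge who runs 15% above the court median is exactly the kind of signal a county-average model would miss.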

Judge intelligence feeds directly into case-level predictions. When a case is assigned to a judge with a full profile, the prediction is built on that judge’s actual history – not a generic county average.
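That preference order can be sketched as a simple fallback rule. The profile and county structures below are hypothetical stand-ins for illustration:

```python
def pick_basis(judge_id, full_profiles, county_models, county):
    """Prefer the assigned judge's behavioral profile; fall back to
    the county-level model only when no full profile exists
    (hypothetical API sketch)."""
    if judge_id in full_profiles:
        return ("judge_profile", full_profiles[judge_id])
    return ("county_model", county_models[county])

profiles = {"J-4412": {"median_award": 1150, "cases": 312}}
county_models = {"Palm Beach": {"median_award": 1000}}

print(pick_basis("J-4412", profiles, county_models, "Palm Beach")[0])
print(pick_basis("J-9999", profiles, county_models, "Palm Beach")[0])
```

The first call resolves to the judge's own history; the second, for a judge without a full profile, falls back to the county model.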

The Patent Portfolio

Six patents are pending on the core methodology. The claims cover the data pipeline architecture, the jurisdiction-specific training approach, and the multi-model inference system that coordinates predictions across case types and jurisdictions.

The patents are defensive. They protect the infrastructure investment from replication, not the general concept of applying machine learning to legal data.

Why Jurisdiction-Specific Matters

Most legal analytics tools train on aggregated national or state-level data. This produces models that are accurate on average but wrong in ways that matter. A judge in Miami-Dade who consistently awards above-median spousal support is averaged away in a state-level model. A county with unusual deviation factors in child support calculations looks like a normal county in a national model.

Jurisdiction-specific training solves this. The predictions are accurate where accuracy is needed – at the level of the specific court, specific judge, and specific county where the case will be decided.
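A toy example makes the averaging-away effect concrete: two counties with very different norms produce a state-level average that is materially wrong in both. The numbers are invented for illustration.

```python
from statistics import mean

# Toy data: two counties in the same state with different local norms.
county_awards = {
    "Miami-Dade":   [1400, 1500, 1600],  # consistently above state norm
    "Rural County": [600, 700, 800],
}

# The state-level model's single view of "normal".
state_avg = mean(a for awards in county_awards.values() for a in awards)
print(round(state_avg, 1))  # 1100.0

# How far that view is from each county's actual center.
for county, awards in county_awards.items():
    error = mean(awards) - state_avg
    print(county, round(error, 1))  # off by +/-400 in each county
```

The state average is off by $400 in every county it covers, which is precisely the error a county-level model eliminates.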

The Compounding Moat

Every case processed through the platform adds to the training data. Judge profiles get sharper with each ruling. County-level models improve with each filed case. The accuracy compounds over time in a way that a competitor starting from scratch today cannot replicate on any near-term timeline.

The 11.2 million case head start is not just a number. It represents years of data collection, pipeline engineering, and model validation. It is the moat.