
Where the Model Loses — Honest Analysis of Where Ratethat.Dog Gets It Wrong
A 25.5% top-pick strike rate means the model is wrong 74.5% of the time. Here's where it loses most reliably — handicaps, low-confidence races, mid-grade Dunstall Park — and what to do about it.
Why does this post exist?
Because the strongest argument for trusting a model is showing where it doesn't work. Most rating providers publish strike rate when it suits them and stay quiet when it doesn't. We're going the other way: this is where the ratethat.dog top composite pick under-performs, drawn from honest Hindsight data.
If you're going to build a betting system using these ratings, you need to know the soft spots. Skipping them is one of the cleanest forms of edge improvement available.
Where does the model lose most reliably?
- **Handicap races** — top-pick strike rate drops by 4-6 percentage points vs scratch races at the same grade. The handicap distance adjustment introduces noise the model can't fully capture.
- **Low race confidence races** — below race confidence 50, top picks under-perform their model probability meaningfully. Below 30, results are barely better than random.
- **Dunstall Park A4-A6 over 480m before the model rebuild** — historically poor segment, now corrected by the grade-split rebuild but worth remembering as a case study in where the model failed.
- **Trial races (T-grade)** — excluded from analysis entirely; the model deliberately doesn't predict them because the dogs aren't racing competitively.
- **Late-card races at minor cards** — small fields, irregular grading, marginal ratings. Strike rates dip relative to the main-meeting cards.
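One way to check the low-confidence claim against your own exported results is to compare the realised top-pick strike rate with the average model probability inside each race confidence band. The sketch below is illustrative only: the record keys (`race_confidence`, `top_pick_prob`, `top_pick_won`) are assumed names rather than the site's actual export format, the band edges of 30 and 50 are taken from the list above, and confidence scores are assumed to be whole numbers.

```python
from collections import defaultdict


def calibration_by_band(races, low=30, mid=50):
    """Compare realised strike rate with mean model probability per confidence band.

    Each race is a dict with hypothetical keys:
      race_confidence  integer score from 0 to 100
      top_pick_prob    model win probability for the top pick (0 to 1)
      top_pick_won     True if the top pick won
    """
    def band(conf):
        if conf < low:
            return f"below {low}"
        if conf < mid:
            return f"{low} to {mid - 1}"
        return f"{mid} and above"

    totals = defaultdict(lambda: {"n": 0, "wins": 0, "prob_sum": 0.0})
    for race in races:
        t = totals[band(race["race_confidence"])]
        t["n"] += 1
        t["wins"] += int(race["top_pick_won"])
        t["prob_sum"] += race["top_pick_prob"]

    report = {}
    for label, t in totals.items():
        strike_rate = t["wins"] / t["n"]
        mean_prob = t["prob_sum"] / t["n"]
        report[label] = {
            "races": t["n"],
            "strike_rate": strike_rate,
            "mean_model_prob": mean_prob,
            # Negative gap: top picks won less often than the model implied.
            "gap": strike_rate - mean_prob,
        }
    return report
```

A consistently negative `gap` in the lower bands would match the pattern described above. With only a handful of races per band the numbers are mostly noise, so it is worth checking the `races` count before reading anything into a gap.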
Why does the model lose in handicap races?
Because handicap distance adjustments interact with form in ways that aren't fully captured by the rating components. A dog given a 6m start may run faster relative to its handicap-adjusted position than its raw form suggests — or it may not, depending on how the dog handles the unusual start. The variance is real and the model can't always read it.
Practical fix: filter out handicap races from your systems by default. We covered the rationale in the handicap explainer. Top systems on ratethat.dog typically exclude handicaps for ROI purposes.
Why does the model struggle at low race confidence?
Because low race confidence means the race itself is messy: narrow composite gaps, weak grading, handicap penalties, or all three. In those races, even a strong model can't assert its edge over six dogs that are functionally close in ability.
Practical fix: add a race confidence filter to every system. "Composite top pick AND race confidence ≥ 60" is a stricter version of "composite top pick alone" and consistently lifts ROI in backtests.
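As a rough sketch, the rule "composite top pick AND race confidence ≥ 60" could be expressed like this. The data layout (`race_id`, `race_confidence`, `runners`, `composite`, `name`) is an assumption made for illustration, not the site's system-builder format.

```python
def select_bets(races, min_race_confidence=60):
    """Return (race_id, dog_name) for the composite top pick in qualifying races.

    Races below the confidence threshold are skipped entirely, which is the
    whole point of the filter: no bet is better than a bet in a messy race.
    All field names here are hypothetical.
    """
    bets = []
    for race in races:
        if race["race_confidence"] < min_race_confidence:
            continue  # skip low-confidence races
        top = max(race["runners"], key=lambda runner: runner["composite"])
        bets.append((race["race_id"], top["name"]))
    return bets


# Made-up example data, for illustration only.
example_races = [
    {
        "race_id": "R1",
        "race_confidence": 72,
        "runners": [
            {"name": "Dog A", "composite": 58.0},
            {"name": "Dog B", "composite": 64.5},
        ],
    },
    {
        "race_id": "R2",
        "race_confidence": 41,
        "runners": [
            {"name": "Dog C", "composite": 61.0},
            {"name": "Dog D", "composite": 60.5},
        ],
    },
]

print(select_bets(example_races))  # [('R1', 'Dog B')]
```

Setting `min_race_confidence=0` reproduces the looser "composite top pick alone" system, which makes it easy to backtest both versions on the same set of races.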
What does the model do well?
Standard distance (380-480m) at the major UK tracks (Hove, Monmore, Sheffield, Yarmouth, Newcastle, Romford): this is where the 25.5% strike rate comes from. Field Speed dominates the composite and the rating shines. You can validate this yourself against Historic Results and the per-race Hindsight breakdowns.
- **Hot Dogs**: the composite-60+ filter consistently delivers a 28.34% strike rate across grades and distances.
- **Sprint Trap 6 reversal**: the model correctly handles the geometry flip in sub-300m races.
- **Race-confidence-aware system building**: when you skip the messy races, the underlying picks earn their keep.
What should I do with this information?
Use it as a system filter. The cleanest improvements aren't about picking better dogs; they're about skipping the races where any model picks worse. Exclude handicaps, exclude races with race confidence below 55-60, exclude trial races, and treat Dunstall Park mid-grade results with caution before mid-March 2026 (when the rebuild shipped).
A model that is honest about where it loses gets used better. Knowing the soft spots is the difference between trusting the rating and applying the rating.
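The skip rules above can be collected into a single check. This is a minimal sketch under several assumptions: the field names are hypothetical, trial grades are assumed to start with "T", and the rebuild date is passed in as a parameter because only "mid-March 2026" is known here. The Dunstall Park rule returns a reason like the others, although you may prefer to flag those races rather than drop them.

```python
from datetime import date


def skip_reason(race, min_race_confidence=55, rebuild_date=None):
    """Return a short reason to skip the race, or None if it passes every rule.

    Hypothetical race keys: is_handicap, grade, race_confidence, track,
    distance_m, date (a datetime.date).
    """
    if race["is_handicap"]:
        return "handicap"
    if race["grade"].startswith("T"):  # assumed naming for trial grades
        return "trial"
    if race["race_confidence"] < min_race_confidence:
        return "low race confidence"
    if (
        rebuild_date is not None
        and race["track"] == "Dunstall Park"
        and race["grade"] in {"A4", "A5", "A6"}
        and race["distance_m"] == 480
        and race["date"] < rebuild_date
    ):
        return "pre-rebuild Dunstall Park mid-grade"
    return None


# Placeholder cutoff: the exact rebuild date is an assumption.
ASSUMED_REBUILD_DATE = date(2026, 3, 15)

example = {
    "is_handicap": False,
    "grade": "A5",
    "race_confidence": 68,
    "track": "Dunstall Park",
    "distance_m": 480,
    "date": date(2026, 2, 1),
}
print(skip_reason(example, rebuild_date=ASSUMED_REBUILD_DATE))
# pre-rebuild Dunstall Park mid-grade
```

Returning a reason instead of a bare boolean makes it easy to count how many races each rule removes when you backtest historical data.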
Frequently asked questions
How often is the ratethat.dog top pick wrong?
74.5% of the time across all races, since the top pick wins 25.5% of them. Strike rate varies by distance band, grade and venue; standard distance at major tracks performs best.
Where does the model perform worst?
Handicap races, low race confidence races (below 50), and historically Dunstall Park A4-A6 over 480m before the March 2026 rebuild.
Can I see model performance broken down by race type?
Yes — the Hindsight page shows prediction-vs-actual analysis with breakdowns by track, grade, distance and race confidence band.
Should I exclude weak segments from my system?
Yes. Filtering out handicaps and low-confidence races is one of the highest-impact changes you can make to a saved system. ROI typically improves by several percentage points.
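To measure the effect on your own records, you could compare level-stakes ROI with and without the filter. The sketch below assumes one-unit stakes and decimal odds; the tuple layout and the sample bets are made up purely to show the arithmetic and are not real results.

```python
def level_stakes_roi(bets):
    """ROI at one-unit level stakes.

    bets: list of (won, decimal_odds, race_confidence) tuples (hypothetical layout).
    A winning bet returns its decimal odds; a losing bet returns nothing.
    """
    if not bets:
        return 0.0
    staked = len(bets)
    returned = sum(odds for won, odds, _conf in bets if won)
    return (returned - staked) / staked


# Made-up bets, for illustration only.
all_bets = [
    (True, 4.0, 72),
    (False, 3.0, 65),
    (False, 5.0, 38),
    (False, 2.5, 44),
]
filtered = [bet for bet in all_bets if bet[2] >= 60]

print(level_stakes_roi(all_bets))   # 0.0
print(level_stakes_roi(filtered))   # 1.0
```

A four-bet sample proves nothing by itself; the comparison only becomes meaningful across a few hundred bets or more.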
Why publish where the model loses?
Because it's the only honest way to claim where it wins. Strike rate in isolation isn't credible without the soft spots disclosed alongside.
