Whiskey Island Birding · whiskeyislandbirding.com · printed
accuracy
How the model has compared to actual eBird checklist diversity at Wendy Park. 79 days, 2024-04-26 → 2025-06-05.
Disclosure. The live 90-day score window doesn’t yet overlap the eBird history snapshot (ends Jun 2025). Rows below are scored by backtest.py against the same v0.2.0 rubric and joined to actual eBird counts for spring 2024 + 2025. Pearson r is computed live from that joined data.
pearson r — predicted vs actual
+0.707
Over 79 surveyed days. Days with zero checklists excluded.
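For readers who want to reproduce the number, here is a minimal sketch of how a Pearson r over surveyed days could be computed. It is not the code in backtest.py; the row keys `pred`, `species`, and `checklists` are hypothetical names chosen for illustration, and the sample rows are made up.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when either series is constant")
    return cov / sqrt(var_x * var_y)


def score_correlation(rows):
    """Correlate predicted score with actual species count.

    rows: iterable of dicts with hypothetical keys
      'pred'       - model score for the day
      'species'    - actual species count from eBird
      'checklists' - number of checklists submitted that day
    Days with zero checklists are dropped before correlating.
    """
    surveyed = [r for r in rows if r["checklists"] > 0]
    return pearson_r(
        [r["pred"] for r in surveyed],
        [r["species"] for r in surveyed],
    )


# Made-up rows, for illustration only (not data from this page).
example_rows = [
    {"pred": 1.0, "species": 10, "checklists": 2},
    {"pred": 2.0, "species": 20, "checklists": 1},
    {"pred": 3.0, "species": 30, "checklists": 3},
    {"pred": 9.0, "species": 0, "checklists": 0},  # excluded: no checklists
]
print(round(score_correlation(example_rows), 3))
```

Dropping zero-checklist days first matters: a day nobody surveyed would otherwise count as a zero-species day and drag the correlation down for reasons unrelated to the model.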
predicted vs actual
[Chart: predicted vs actual; legend: DEFINITELY_GO · GO · MARGINAL · SKIP]
calibration by verdict bin
verdict          n    mean spp   median   range
definitely go    15   77.6       78.0     36–115
go               33   66.1       66.0     30–106
marginal         28   43.8       37.5     17–82
skip             3    31.7       34.0     20–41
A useful model walks mean species down monotonically as the verdict gets gloomier. A SKIP bucket that out-birds GO suggests the veto layer is over-eager.
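The monotonicity check described above can be sketched as follows. This is an illustrative sketch, not the site's own code: the function names and the `(verdict, species_count)` row shape are assumptions, and only the four bin means are taken from this page.

```python
from statistics import mean, median

# Verdicts ordered from most to least optimistic.
VERDICT_ORDER = ["definitely go", "go", "marginal", "skip"]


def calibration_table(rows):
    """Summarize actual species counts per verdict bin.

    rows: iterable of (verdict, species_count) pairs (hypothetical shape).
    Returns one dict per non-empty bin, in VERDICT_ORDER.
    """
    by_verdict = {v: [] for v in VERDICT_ORDER}
    for verdict, species in rows:
        by_verdict[verdict].append(species)
    table = []
    for verdict in VERDICT_ORDER:
        counts = by_verdict[verdict]
        if counts:
            table.append({
                "verdict": verdict,
                "n": len(counts),
                "mean": mean(counts),
                "median": median(counts),
                "min": min(counts),
                "max": max(counts),
            })
    return table


def means_fall_monotonically(means):
    """True if each bin mean is strictly lower than the one before it."""
    return all(a > b for a, b in zip(means, means[1:]))


# Bin means as reported in the calibration table on this page.
reported_means = [77.6, 66.1, 43.8, 31.7]
print(means_fall_monotonically(reported_means))  # True
```

With only 3 days in the skip bin, its mean is the least stable of the four, so a check like this is more trustworthy for the larger bins.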
where the model misses (per-day)
under-predictions (model said low, actual was high)
2024-05-18 · go · pred 6.7 · act 106
2025-04-29 · marginal · pred 5.4 · act 82
2025-05-21 · marginal · pred 4.9 · act 74
2025-05-22 · marginal · pred 4.7 · act 64
2025-05-23 · marginal · pred 4.8 · act 66
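Because the predicted score and the actual species count are on different scales, a miss cannot be read off as a simple difference. One possible way to rank per-day misses, sketched here under that assumption, is to fit a least-squares line from score to species count and sort days by residual. This page does not say how its miss lists are built, so the approach, the function names, and the `(date, pred, actual)` row shape are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when all scores are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rank_misses(rows, top=5):
    """Rank days by how far actual species fell from the fitted line.

    rows: list of (date, pred, actual) tuples (hypothetical shape).
    Returns (under, over):
      under - days where actual beat the fit, largest residual first
      over  - days where actual fell short, most negative residual first
    """
    slope, intercept = fit_line(
        [pred for _, pred, _ in rows],
        [actual for _, _, actual in rows],
    )
    residuals = [
        (date, actual - (intercept + slope * pred))
        for date, pred, actual in rows
    ]
    residuals.sort(key=lambda item: item[1])
    over = [item for item in residuals[:top] if item[1] < 0]
    under = [item for item in reversed(residuals[-top:]) if item[1] > 0]
    return under, over
```

Residuals from a fitted line treat a 30-species miss the same on a quiet day as on a peak migration day; a ratio or rank-based measure would be a reasonable alternative if that matters.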
over-predictions (model said high, actual was low)