How Not to Present Forecasts as Current Values
Forecast workers, confidence, and backtests expose uncertainty before arrival
When I first tried to add parking forecasts, what made me most cautious was how much authority a number carries. “12 spaces left in 30 minutes” is convenient, but users can easily read that number as fact.
It is not a fact. Source data may update late, parking turnover differs by time of day, and events, weather, or controls can suddenly change the situation. Hangangjari’s forecasts therefore do not promise an answer. They are cautious numbers that help users judge the situation before they arrive.
I did not plan to build a large ML platform from the beginning. What I needed was a forecast the mobile app could read quickly, that could be evaluated later, and that did not speak too definitively on screen. So I treated baseline, confidence, reason codes, and backtests as one set.
Forecasts are made before requests
Hangangjari has two kinds of forecasts.
| Forecast | Unit | Purpose |
|---|---|---|
| Parking forecast | Parking-lot level | Compare risk of parking failure around arrival time |
| Park congestion forecast | Park level | Show congestion and confidence for outing decisions |
Both follow the same rule: do not do heavy calculation during a request. Workers compute forecasts ahead of time, and the API quickly reads the latest prepared value.
This can look like a longer path. But in a mobile app, any computation done at request time becomes user wait time. Forecasts were easier to operate when workers kept preparing them and the API read only the latest verified results.
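A minimal sketch of that split, assuming an in-memory store in place of the real storage. `ForecastStore`, `read_forecast`, the status strings, and the 10-minute `max_age` are all hypothetical names and values chosen for illustration, not Hangangjari's actual code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class PreparedForecast:
    lot_id: str
    generated_at: datetime
    payload: dict


class ForecastStore:
    """In-memory stand-in for wherever prepared forecasts live (assumption)."""

    def __init__(self) -> None:
        self._latest: dict[str, PreparedForecast] = {}

    def publish(self, forecast: PreparedForecast) -> None:
        # Worker side: called only after the computation has finished.
        self._latest[forecast.lot_id] = forecast

    def latest(self, lot_id: str) -> Optional[PreparedForecast]:
        return self._latest.get(lot_id)


def read_forecast(
    store: ForecastStore,
    lot_id: str,
    now: datetime,
    max_age: timedelta = timedelta(minutes=10),  # illustrative threshold
) -> dict:
    """API side: a lookup plus a freshness check, no forecasting work."""
    prepared = store.latest(lot_id)
    if prepared is None:
        # Nothing prepared is reported as such, never as "no risk".
        return {"status": "unavailable"}
    stale = now - prepared.generated_at > max_age
    return {
        "status": "stale" if stale else "ok",
        "generated_at": prepared.generated_at.isoformat(),
        **prepared.payload,
    }
```

The request handler never computes anything. It can only report what the worker prepared, how old it is, or that nothing exists yet.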
Parking forecasts: do not turn current values into future facts
A parking forecast is not a copy of the current remaining count. It looks at the current value, recent change, and historical movement for similar time slots to signal whether risk may rise within the forecast horizon.
The core of the parking forecast is less about the exact weight and more about direction of change. The order in which current values and historical changes are considered, and how confidence is reduced when evidence is weak, affects whether the number can be read safely.
flowchart TB
Snapshots["Parking status snapshots"] --> Buckets["5-minute changes"]
Buckets --> Features["Forecast features<br/>recent change · time baseline"]
Features --> Worker["Forecast worker"]
Worker --> Run["Forecast run<br/>version · generated time"]
Worker --> Result["Forecast result<br/>horizon · range · risk · confidence"]
Result --> Cache["Redis forecast cache"]
Result --> API["Forecast API"]
Cache --> API
API --> Client["iOS forecast screen"]
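To make "direction of change plus reduced confidence" concrete, here is a rough baseline sketch. The equal-weight average of the recent and historical rates, the `min_recent_buckets` threshold, and the reason-code strings are assumptions made for illustration. The actual weighting is not described here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ParkingForecast:
    predicted_available: int
    confidence: str  # "normal" or "low" (assumed labels)
    reason_codes: tuple


def baseline_forecast(
    current_available: int,
    recent_changes: Sequence[float],               # change per 5-minute bucket
    historical_change_per_bucket: Optional[float],  # same time slot, past days
    horizon_buckets: int,
    min_recent_buckets: int = 3,                    # illustrative threshold
) -> ParkingForecast:
    reasons = []
    confidence = "normal"

    # Recent trend: only trusted when enough buckets were observed.
    if len(recent_changes) >= min_recent_buckets:
        recent_rate = sum(recent_changes) / len(recent_changes)
    else:
        recent_rate = None
        confidence = "low"
        reasons.append("FEW_RECENT_SNAPSHOTS")

    # Time-of-day baseline: missing history also weakens the evidence.
    if historical_change_per_bucket is None:
        confidence = "low"
        reasons.append("NO_TIME_BASELINE")

    # Assumed combination: plain average of whichever rates are available.
    rates = [r for r in (recent_rate, historical_change_per_bucket) if r is not None]
    rate = sum(rates) / len(rates) if rates else 0.0

    predicted = max(0, round(current_available + rate * horizon_buckets))
    if rate < 0:
        reasons.append("AVAILABILITY_FALLING")
    return ParkingForecast(predicted, confidence, tuple(reasons))
```

With 30 spaces now, a steady loss of about 3 per bucket, and a 6-bucket horizon, the sketch predicts 12 spaces at normal confidence. With a single recent bucket and no history it falls back to the current count, but marks the result as low confidence rather than presenting it as a forecast.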
The response does not contain only values. Mobile UI needs metadata to decide how to present forecasts.
- generated_at: when it was calculated.
- model_version: which logic created it.
- horizon_minutes: how far ahead it looks.
- risk_level: a user-readable risk level.
- failure_probability: an app-level probability value from internal calculation.
- confidence: whether the evidence is strong enough.
- reason_codes: clues explaining why risk is considered high.
With these values, the app can speak separately about “risky,” “not enough evidence,” and “unavailable.” Telling the app how cautiously to read the number did more for the product than showing one more number.
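One way a client could map the metadata to those three states is sketched below. The field names follow the response fields listed above. The concrete values (`"high"`, `"low"`) and the `presentation` function are assumptions, and the real app is an iOS client, so this Python version only illustrates the decision order.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ForecastResponse:
    generated_at: datetime
    model_version: str
    horizon_minutes: int
    risk_level: str            # assumed values: "low", "medium", "high"
    failure_probability: float
    confidence: str            # assumed values: "normal", "low"
    reason_codes: tuple = ()


def presentation(response: Optional[ForecastResponse]) -> str:
    """Decide how the screen should talk about a forecast."""
    if response is None:
        # No forecast at all: say so instead of implying low risk.
        return "unavailable"
    if response.confidence == "low":
        # Weak evidence is checked before risk, so a shaky number
        # is never shown as a confident one.
        return "not_enough_evidence"
    if response.risk_level == "high":
        return "risky"
    return "normal"
```

Checking confidence before risk is the design choice that matters here: a low-confidence "high risk" and a low-confidence "low risk" both collapse into the same cautious message.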
Park congestion forecasts: no row is not the same as empty
Park congestion is more ambiguous than parking. Users want to know whether many people are there, but each source differs in current values, forecast values, and updated time.
Hangangjari uses current park context and official forecast values together. If a forecast row exists near the target time, it takes priority. If not, current context is used as fallback, but confidence must be lowered.
flowchart LR
Context["Current park context"] --> Resolver["Forecast resolver"]
Official["Official congestion forecast"] --> Resolver
Resolver --> Fresh{"Forecast for target time?"}
Fresh -->|Yes| Forecast["Use forecast value<br/>normal confidence"]
Fresh -->|No| Fallback["Use current value fallback<br/>low confidence"]
Forecast --> Response["Park forecast response"]
Fallback --> Response
No row does not mean the park is quiet. Missing information should not look like success.
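The resolver in the diagram could look roughly like this. `OfficialRow`, `ParkForecast`, the 30-minute `tolerance`, and the `source` labels are hypothetical. The sketch also adds a third branch the diagram does not show, for the case where neither an official row nor a current value exists.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Sequence


@dataclass(frozen=True)
class OfficialRow:
    target_time: datetime
    congestion_level: str


@dataclass(frozen=True)
class ParkForecast:
    congestion_level: Optional[str]
    confidence: str  # "normal", "low", or "none" (assumed labels)
    source: str


def resolve_park_forecast(
    target_time: datetime,
    official_rows: Sequence[OfficialRow],
    current_level: Optional[str],
    tolerance: timedelta = timedelta(minutes=30),  # illustrative window
) -> ParkForecast:
    # Prefer an official forecast row close enough to the target time.
    near = [r for r in official_rows if abs(r.target_time - target_time) <= tolerance]
    if near:
        best = min(near, key=lambda r: abs(r.target_time - target_time))
        return ParkForecast(best.congestion_level, "normal", "official_forecast")

    # Fallback: reuse the current value, but lower the confidence.
    if current_level is not None:
        return ParkForecast(current_level, "low", "current_fallback")

    # Nothing to go on: an explicit "unavailable", never "quiet".
    return ParkForecast(None, "none", "unavailable")
```

The missing-row case returns a distinct `source` and confidence, so the screen can tell a quiet park apart from a park it knows nothing about.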
This repeats across forecasting. Whether to leave the screen empty, substitute the current value, or show a low-confidence value is a screen decision. But whatever the screen chooses, “no forecast” must not appear as “no problem.”
The first screen changes when forecasts change
Forecasts are not used only on a separate screen. Hangangjari’s home screen also shows per-park summaries. After the forecast worker creates forecasts, it invalidates related caches and warms frequently read home summaries again.
sequenceDiagram
autonumber
participant Worker as Forecast worker
participant PG as Postgres
participant Redis as Redis
participant API as First-screen API
participant App as iOS app
Worker->>PG: Store forecast run and results
Worker->>Redis: Clear forecast cache
Worker->>Redis: Prewarm first-screen cache
App->>API: Request first-screen summary
API->>Redis: Read prepared first-screen value
Redis-->>API: Return cached response
API-->>App: Return decision-ready first-screen summary
This connects directly to mobile performance. The app does not repeat DB aggregation every time a user opens it. Workers finish expensive calculation ahead of time, and the API reads prepared values.
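The worker-side order of operations (store, clear, prewarm) can be sketched as follows. `FakeCache` is an in-memory stand-in for Redis and a plain list stands in for Postgres. The key names (`forecast:<lot_id>`, `home:<park_id>`) and the `build_home_summary` callback are assumptions for illustration.

```python
from typing import Callable


class FakeCache:
    """Dictionary-backed stand-in for the Redis cache (assumption)."""

    def __init__(self) -> None:
        self.data: dict = {}

    def delete(self, key: str) -> None:
        self.data.pop(key, None)

    def set(self, key: str, value: dict) -> None:
        self.data[key] = value

    def get(self, key: str):
        return self.data.get(key)


def publish_forecast_run(
    db_rows: list,
    cache: FakeCache,
    run_id: str,
    results: list,
    build_home_summary: Callable[[str], dict],
) -> list:
    # 1. Store the run and its results first, so they can be graded later.
    for result in results:
        db_rows.append({"run_id": run_id, **result})

    # 2. Clear forecast cache entries that the new run makes outdated.
    for result in results:
        cache.delete(f"forecast:{result['lot_id']}")

    # 3. Prewarm the first-screen summaries for every affected park,
    #    so the first reader does not pay for the aggregation.
    affected_parks = sorted({result["park_id"] for result in results})
    for park_id in affected_parks:
        cache.set(f"home:{park_id}", build_home_summary(park_id))

    # Returning the affected parks makes "which caches changed?" answerable.
    return affected_parks
```

Because the function returns the list of affected parks, a stale home summary can be traced to either a missing run or a failed warmup step.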
It also helped incident response. When a forecast run changes, I can say which caches are affected. When a home summary is old, I can look separately at the forecast worker and cache warmup.
Forecasts are graded after release
Forecasts do not end at release. Once data accumulates, they need continuous grading.
Hangangjari has jobs for forecast backtests and a way to store labels and metrics. Early on, a verifiable baseline mattered more than complex formulas. Changing calculation logic needed metrics against the previous version, not just a feeling that it looked better.
The ability to inspect a forecast later outlasts any particular formula. Whatever calculation is used, an improvement cannot be demonstrated without records of the run, model version, labels, and metrics. Forecasts take longer to observe after release than they took to launch.
When grading, I looked at:
- Where errors grow by time slot.
- Whether specific parks or parking lots are repeatedly underpredicted.
- Whether events, weather, or controls should lower confidence.
- How source-data incidents affect forecasts.
- Whether p10/p50/p90 ranges explain real variability.
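A few of these checks reduce to small functions over labeled forecasts. The sketch below assumes a hypothetical `LabeledForecast` record that pairs a stored forecast with the value observed later. The metric choices (mean absolute error of p50 by hour, p10 to p90 coverage, mean bias) are illustrative, not a description of the actual backtest jobs.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class LabeledForecast:
    model_version: str
    hour_of_day: int
    p10: float
    p50: float
    p90: float
    actual: float  # label observed after the forecast horizon passed


def mae_by_hour(rows: Sequence[LabeledForecast]) -> dict:
    """Mean absolute error of the p50 forecast, grouped by time slot."""
    errors = defaultdict(list)
    for row in rows:
        errors[row.hour_of_day].append(abs(row.p50 - row.actual))
    return {hour: sum(v) / len(v) for hour, v in sorted(errors.items())}


def interval_coverage(rows: Sequence[LabeledForecast]) -> Optional[float]:
    """Share of actual values that landed inside the p10..p90 range."""
    if not rows:
        return None
    hits = sum(1 for row in rows if row.p10 <= row.actual <= row.p90)
    return hits / len(rows)


def mean_bias(rows: Sequence[LabeledForecast]) -> Optional[float]:
    """Average of (p50 - actual); the sign shows the direction of error."""
    if not rows:
        return None
    return sum(row.p50 - row.actual for row in rows) / len(rows)
```

If the p10 to p90 range is well calibrated, coverage should sit near 80 percent. Coverage far below that suggests the range is too narrow to explain real variability. Running the same functions per `model_version` gives the comparison against the previous version that a logic change needs.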
How my view of forecasts changed
At first, creating forecast values looked like the core work. In practice, the longer-lasting parts were whether inputs could be rebuilt, whether calculation time and evidence were retained, and whether missing rows and low congestion were described differently on screen.
For a mobile app, reliably reading prepared values was better than being clever during the request. Model versions and backtest metrics were not research-only records; they were evidence that let the app speak more cautiously later.
In Hangangjari, forecasting was less a tool for predicting the future and more a way to move uncertainty honestly onto the screen. Confidence and freshness outlasted the single forecast number.