Push Outbox and Suppression Audit
I recorded sent and suppressed notifications, along with their reasons, in the same system
Push notifications reach users outside the app. They speak to people who did not open the app, so the first question is not “can we send this?” but “is it okay to send this?”
Hangangjari notifications pass through three stages.
First, create candidates. Second, decide by policy whether to send or suppress each one. Third, put them in an outbox, send them, and record the result in an audit trail.
When operating notifications, the question that appears most often is: “why did this notification go out, and why did that one not go out?” To answer that, storing only successful sends is not enough. Reasons for not sending also have to become records.
Source changes first become notification candidates
Push does not send raw source events directly. First, it turns changes into facts that can be considered as notifications. For example, a parking lot’s remaining spaces falling below a threshold, a park crowding change, or an event cancellation can become a fact.
Then each fact is matched with subscriptions. The system checks which parks or lots the user cares about, which notification types are enabled, and whether quiet hours apply.
flowchart LR
Source["Source change"] --> Fact["Notification fact"]
Fact --> Match["Candidate matching"]
Subscription["Subscriptions/settings"] --> Match
Match --> Policy["Send decision"]
Policy --> Decision["push_now or suppress"]
Decision --> Outbox["Push outbox"]
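The matching step can be sketched roughly as follows. This is a minimal illustration, not the Hangangjari implementation: the `Fact`, `Subscription`, and `Candidate` types, their field names, and the example kind strings are all assumptions made for this sketch. Quiet hours are only flagged on the candidate here, on the assumption that the later send decision chooses what to do with them.

```python
from dataclasses import dataclass
from datetime import time
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Fact:
    fact_id: str
    kind: str        # hypothetical names, e.g. "parking_threshold", "event_cancelled"
    target_id: str   # the park or lot the fact is about


@dataclass(frozen=True)
class Subscription:
    subscription_id: str
    target_ids: FrozenSet[str]      # parks or lots the user cares about
    enabled_kinds: FrozenSet[str]   # notification types the user enabled
    quiet_start: Optional[time] = None
    quiet_end: Optional[time] = None


@dataclass(frozen=True)
class Candidate:
    fact_id: str
    subscription_id: str
    quiet_hours: bool  # flagged only; the send decision is made later


def in_quiet_hours(sub: Subscription, now: time) -> bool:
    """Return True when `now` falls inside the subscription's quiet window."""
    if sub.quiet_start is None or sub.quiet_end is None:
        return False
    if sub.quiet_start <= sub.quiet_end:
        return sub.quiet_start <= now < sub.quiet_end
    # The window wraps past midnight, e.g. 22:00 to 07:00.
    return now >= sub.quiet_start or now < sub.quiet_end


def match_candidates(fact: Fact, subs: Iterable[Subscription], now: time) -> List[Candidate]:
    """Pair a fact with every subscription that follows its target and kind."""
    return [
        Candidate(fact.fact_id, sub.subscription_id, in_quiet_hours(sub, now))
        for sub in subs
        if fact.target_id in sub.target_ids and fact.kind in sub.enabled_kinds
    ]
```

A subscription that follows a different lot, or has the fact's kind disabled, produces no candidate at all, which is a different outcome from a candidate that is later suppressed.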
First, there are rules for not sending
Suppression is not failure. It is a choice that protects users.
Hangangjari has explicit suppression rules.
| Suppression rule | Reason |
|---|---|
| Suppress broad parking-lot state-change notifications | Users more likely want remaining-space threshold alerts |
| Suppress threshold alerts for lots without remaining-space values | There is no actionable number |
| Suppress broad park state-change notifications | Too broad, with unclear user action |
| Suppress unclear crowding changes | A safe message body cannot be written |
These rules are not about “sending fewer notifications.” They are closer to “send only notifications users can act on.”
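The four rules in the table could be expressed as one function that returns a reason code when a rule applies. This is a sketch under assumptions: the fact kinds, the `remaining_spaces` and `direction_clear` fields, and the reason-code strings are hypothetical, since the post does not list the real ones.

```python
from typing import Optional

# Hypothetical fact kinds mapped to hypothetical reason codes.
BROAD_KINDS = {
    "parking_lot_state_change": "broad_parking_state_change",
    "park_state_change": "broad_park_state_change",
}


def suppression_reason(fact: dict) -> Optional[str]:
    """Return a reason code if a suppression rule applies, otherwise None."""
    kind = fact["kind"]
    if kind in BROAD_KINDS:
        # Broad state changes give the user nothing specific to act on.
        return BROAD_KINDS[kind]
    if kind == "parking_threshold" and fact.get("remaining_spaces") is None:
        # A threshold alert without a number is not actionable.
        return "no_remaining_space_value"
    if kind == "crowding_change" and not fact.get("direction_clear", False):
        # No safe message body can be written for an unclear change.
        return "unclear_crowding_change"
    return None
```

Returning the reason code rather than a bare boolean means the caller never has to reconstruct why a candidate was stopped.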
User settings and impact are checked together
Even after a candidate passes suppression rules, it does not immediately enter the outbox. The system checks the user’s notification mode together with how important the fact is.
It asks roughly:
- Did the user turn notifications off?
- Is this change high urgency?
- In parking-first mode, is this an important parking signal?
- In outing-brief mode, is this an important park/event signal?
- Otherwise, is it important enough for smart mode?
The result is push_now or suppress, and it must leave a reason code.
flowchart TD
Candidate["Matched candidate"] --> Off{"Notifications off?"}
Off -->|Yes| Suppress["Record suppression reason"]
Off -->|No| Rule{"Stop rule applies?"}
Rule -->|Yes| SuppressRule["Suppress with reason"]
Rule -->|No| Tier{"Important or T0?"}
Tier -->|Yes| Push["push_now"]
Tier -->|No| Mode["Mode-specific decision"]
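One way to read the checks above is as a single function that always returns a decision together with a reason code. The sketch below follows the order of the question list. The mode names, the `domain` values, the boolean `important` flag, and every reason-code string are assumptions for illustration, and the real policy is likely more nuanced than a single flag.

```python
from typing import NamedTuple, Optional


class Decision(NamedTuple):
    decision: str     # "push_now" or "suppress"
    reason_code: str  # always set, for sends and suppressions alike


def decide(
    *,
    notifications_off: bool,
    mode: str,                 # assumed values: "parking_first", "outing_brief", "smart"
    domain: str,               # assumed values: "parking", "park", "event"
    high_urgency: bool,
    important: bool,
    rule_reason: Optional[str] = None,  # reason code from a suppression rule, if any
) -> Decision:
    if notifications_off:
        return Decision("suppress", "notifications_off")
    if rule_reason is not None:
        return Decision("suppress", rule_reason)
    if high_urgency:
        return Decision("push_now", "high_urgency")
    if mode == "parking_first":
        if important and domain == "parking":
            return Decision("push_now", "parking_first_important")
        return Decision("suppress", "parking_first_not_important")
    if mode == "outing_brief":
        if important and domain in ("park", "event"):
            return Decision("push_now", "outing_brief_important")
        return Decision("suppress", "outing_brief_not_important")
    # Any other mode is treated as smart mode in this sketch.
    if important:
        return Decision("push_now", "smart_important")
    return Decision("suppress", "smart_not_important")
```

Because every branch returns a reason code, a sent notification is as explainable as a suppressed one.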
Not-sent reasons are also one-line records
If a notification was not sent, that reason also has to be recorded.
“Why didn’t it go?” is a common notification question. To answer it, I need to distinguish cases such as no candidate, no matching subscription, a stop rule, and an outbox entry that failed delivery.
The delivery-decision record stores subscription, fact, target, decision, reason code, policy mode, and policy version. It also prevents the same fact/subscription pair from being processed twice.
This record is what makes it possible to judge whether notifications are working and to tune them.
In practice, when tuning notifications, stopped notifications are often more important to inspect than sent notifications. Too much suppression may mean the facts are vague or subscription matching is off. Too little suppression may make the app noisy. Suppression is not “failure”; it is an intentional choice.
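A delivery-decision record with those fields, plus protection against processing the same fact/subscription pair twice, might look like the sketch below. It uses SQLite only to stay self-contained. The table name, column names, and sample values are assumptions, and the actual storage engine is not stated in this post.

```python
import sqlite3

SCHEMA = """
CREATE TABLE delivery_decision (
    subscription_id TEXT NOT NULL,
    fact_id         TEXT NOT NULL,
    target_id       TEXT NOT NULL,
    decision        TEXT NOT NULL CHECK (decision IN ('push_now', 'suppress')),
    reason_code     TEXT NOT NULL,
    policy_mode     TEXT NOT NULL,
    policy_version  TEXT NOT NULL,
    UNIQUE (fact_id, subscription_id)  -- one decision per fact/subscription pair
)
"""


def record_decision(conn: sqlite3.Connection, row: dict) -> bool:
    """Insert a decision; return False if this pair was already processed."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO delivery_decision VALUES "
        "(:subscription_id, :fact_id, :target_id, :decision, "
        ":reason_code, :policy_mode, :policy_version)",
        row,
    )
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
row = {
    "subscription_id": "s1", "fact_id": "f1", "target_id": "lot-1",
    "decision": "suppress", "reason_code": "no_remaining_space_value",
    "policy_mode": "smart", "policy_version": "v1",
}
first = record_decision(conn, row)   # inserted
second = record_decision(conn, row)  # ignored by the UNIQUE constraint
```

Putting the uniqueness rule in the table, rather than in application code, keeps the guarantee intact even if two workers pick up the same fact.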
APNs delivery happens from the outbox
The outbox is a buffer that keeps APNs delivery out of the API request path.
sequenceDiagram
autonumber
participant Worker as Push worker
participant Outbox as Outbox table
participant Dispatcher as Push dispatcher
participant APNs as APNs
participant Repo as Delivery repository
Worker->>Outbox: Enqueue notification for delivery
Dispatcher->>Outbox: Fetch notification ready to send
Dispatcher->>APNs: Send message to APNs
alt Success
Dispatcher->>Repo: Record delivery success
else Invalid device token
Dispatcher->>Repo: Record terminal failure and token-cleanup candidate
else APNs auth error
Dispatcher->>Repo: Defer remaining work
else Retryable failure
Dispatcher->>Repo: Store next attempt time and backoff
end
When sending from the outbox, the system distinguishes these cases:
- Expired items are not sent.
- Messages without prepared translations become terminal failures.
- Invalid device tokens become cleanup candidates.
- When APNs authentication fails, remaining items are deferred instead of being forced through.
- Retryable failures reflect attempt count and backoff.
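The cases above can be sketched as one dispatch loop. Everything named here is hypothetical: the `OutboxItem` fields, the status strings, the `SendResult` enum, and the backoff constants are assumptions for illustration. Mapping real APNs responses onto `SendResult` is deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Callable, List, Optional


class SendResult(Enum):
    OK = "ok"
    INVALID_TOKEN = "invalid_token"
    AUTH_ERROR = "auth_error"
    RETRYABLE = "retryable"


@dataclass
class OutboxItem:
    item_id: str
    expires_at: datetime
    has_translation: bool
    attempts: int = 0
    status: str = "pending"
    next_attempt_at: Optional[datetime] = None
    cleanup_token: bool = False


def backoff(attempts: int, base_s: int = 30, cap_s: int = 3600) -> timedelta:
    """Exponential backoff with a cap; the constants are illustrative."""
    return timedelta(seconds=min(cap_s, base_s * 2 ** (attempts - 1)))


def dispatch(items: List[OutboxItem], send: Callable[[OutboxItem], SendResult], now: datetime) -> None:
    for index, item in enumerate(items):
        if item.expires_at <= now:
            item.status = "expired"          # expired items are not sent
            continue
        if not item.has_translation:
            item.status = "failed_terminal"  # retrying cannot fix a missing translation
            continue
        result = send(item)
        if result is SendResult.OK:
            item.status = "sent"
        elif result is SendResult.INVALID_TOKEN:
            item.status = "failed_terminal"
            item.cleanup_token = True        # candidate for token cleanup
        elif result is SendResult.AUTH_ERROR:
            # Stop the batch: defer this item and everything still pending.
            for rest in items[index:]:
                if rest.status == "pending":
                    rest.status = "deferred"
            return
        else:
            item.attempts += 1
            item.status = "retry"
            item.next_attempt_at = now + backoff(item.attempts)
```

Keeping each outcome as its own status, rather than a single failed flag, is what lets later metrics separate token problems from auth problems.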
I watch reasons, not only counts
For push, success count is not enough. Sent count, suppressed count, retry count, invalid device tokens, long-waiting outbox items, and backlog by priority all need to be seen together.
Suppression metrics are especially useful because they show whether notifications are too cautious or too noisy. If too many are suppressed, facts may be vague or subscription rules may be misaligned. If too few are suppressed and too many are sent, users will turn notifications off.
So a dashboard that only watches outbox backlog is insufficient. It also needs to show how much was filtered before the outbox, which reason codes are increasing, and where invalid tokens and APNs auth errors diverge.
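As a rough illustration of watching reasons rather than only counts, decision records can be grouped by decision and by reason code. The record shape (`decision` and `reason_code` keys) and the output keys are assumptions for this sketch, not the project's actual metrics pipeline.

```python
from collections import Counter
from typing import Iterable, Mapping


def summarize(decisions: Iterable[Mapping[str, str]]) -> dict:
    """Count decisions and break suppressions down by reason code."""
    decisions = list(decisions)
    by_decision = Counter(d["decision"] for d in decisions)
    by_reason = Counter(
        d["reason_code"] for d in decisions if d["decision"] == "suppress"
    )
    total = len(decisions)
    return {
        "by_decision": dict(by_decision),
        "suppress_by_reason": dict(by_reason),
        # Share of candidates filtered before they reached the outbox.
        "suppression_rate": by_decision["suppress"] / total if total else 0.0,
    }
```

Comparing `suppress_by_reason` across days shows which reason codes are growing, something a backlog-only view cannot reveal.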
Notification quality did not end at send count
Push had to ask “is it okay to send?” before “can we send?” Suppression is a choice that protects users, and reasons for not sending need to be audited with reason codes.
The outbox isolates API requests from APNs delivery failures. Missing translations, invalid device tokens, APNs auth errors, and retryable failures are different problems. If they are merged into one failure count, the next action becomes unclear.
The final question in notification work was not “how many did we send?” It was whether I could explain why we sent, why we did not send, and whether that choice helped users.
That explanation became possible only when stopped notifications and sent notifications were recorded together.