Keeping Stale Responses from Looking Fresh: Redis Cache

I separated fast responses from last-success values so the cache does not mislead users

In the Hangangjari backend, Redis is not just a performance tool. It lets the API read current values quickly, reuse responses that have already been built, and return the last confirmed value when an external source or the DB-side assembly of a response fails momentarily.

But a cache must not deceive users. The first rule, ahead of fast responses, is that stale values must not look current.

Adding Redis improves the numbers: latency goes down and the DB query count drops. But for users, a fast wrong answer can be worse than a slow correct one. In Hangangjari, I cared more about when a value was confirmed at its source than about the cache hit itself.

Redis keeps only rebuildable values

Hangangjari has caches with different purposes.

| Cache | Example | Purpose |
| --- | --- | --- |
| Status cache | Latest parking-lot value | Fast lot-level lookup |
| Forecast cache | Forecast overview/timeline | Reuse computed responses |
| Home summary hot cache | First-screen summary | Fast response with short TTL |
| Home summary stale cache | Last successful summary | Fallback when rebuild fails |
| Source status cache | Source-check result | Reduce repeated lookup cost |

Redis is not where reference data lives. Reference data and history live in Postgres; Redis keeps only query results that can be rebuilt.

Drawing this line simplifies incident response. If Redis is empty, no irreplaceable data has been lost. Only rebuildable query results have disappeared.

If Postgres reference data is damaged, the situation is completely different, so I chose not to give the two stores equal weight.

flowchart LR
  API["API read"] --> Redis["Redis cache"]
  Redis -->|Hit| Response["Response"]
  Redis -->|Miss| Postgres["Postgres rows"]
  Postgres --> Build["Build payload"]
  Build --> Redis
  Build --> Response

If parking cache breaks, the API falls back to DB

Parking values are read frequently by lot. The app, widgets, home summary, and forecast inputs all need the latest parking values.

In code, RedisStatusCache is responsible for the current-status cache. It stores JSON payloads keyed by lot ID. When a cached payload is malformed or Redis is unavailable, the entry is ignored and the query falls back to the DB.

I kept two rules here.

  • A cache incident should not become an API incident.
  • Even a cache hit needs a freshness check.

sequenceDiagram
  autonumber
  participant Query as Parking query
  participant Cache as Status cache
  participant DB as Postgres

  Query->>Cache: Check parking status cache
  alt Cache hit
    Cache-->>Query: Return latest parking status
    Query->>Query: Judge freshness by source time
  else Miss or cache unavailable
    Query->>DB: Check last stored status
    DB-->>Query: Stored status or none
    Query->>Query: Judge stale or empty state
  end

Staleness is judged by the time the value was confirmed at the source, not by the time it was stored in the cache. That keeps the meaning of the value stable whether the Redis TTL is long or short.
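The read path above can be sketched roughly as follows. This is a minimal illustration, not the actual RedisStatusCache code: the `status:{lot_id}` key format, the `source_confirmed_at` field, the ten-minute threshold, and the `load_from_db` callback are all assumptions, and a plain dict stands in for the Redis client.

```python
import json
from datetime import datetime, timedelta, timezone

# Hypothetical threshold; the real value is not stated in this article.
MAX_SOURCE_AGE = timedelta(minutes=10)


def read_cached_status(cache, lot_id):
    """Return the cached payload, or None on a miss, a malformed entry, or a cache failure."""
    try:
        raw = cache.get(f"status:{lot_id}")  # key format is an assumption
        payload = json.loads(raw) if raw is not None else None
    except (ConnectionError, ValueError):
        # A cache incident must not become an API incident.
        # With a real Redis client, catch that client's own error types here as well.
        return None
    if not isinstance(payload, dict) or "source_confirmed_at" not in payload:
        return None
    return payload


def get_status(cache, load_from_db, lot_id, now):
    """Read from cache, fall back to the DB, and judge freshness by source time."""
    payload = read_cached_status(cache, lot_id)
    if payload is None:
        payload = load_from_db(lot_id)  # last stored status, or None
    if payload is None:
        return {"lot_id": lot_id, "freshness": "empty"}
    confirmed_at = datetime.fromisoformat(payload["source_confirmed_at"])
    # Freshness depends on the source time, never on when the entry was cached.
    freshness = "fresh" if now - confirmed_at <= MAX_SOURCE_AGE else "stale"
    return {**payload, "freshness": freshness}


now = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
cached = {
    "status:A": json.dumps(
        {"available": 3, "source_confirmed_at": "2024-05-01T11:00:00+00:00"}
    )
}
result = get_status(cached, lambda lot_id: None, "A", now)
print(result["freshness"])  # prints "stale": the hit is served, but labeled as old
```

Because the freshness label is derived after the read, a cache hit and a DB fallback go through the same check.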

I separated fast cache from last-success cache

home-summary is an expensive response to build. It has to gather parking, forecasts, outing data, and source-check results.

So it uses two kinds of cache.

  • Hot cache: a short-TTL cache for quickly reusing normal responses.
  • Stale cache: a longer-lived backup cache for the last successful payload.

When rebuilding succeeds, both hot and stale cache are updated. If hot cache is empty and rebuilding fails, the stale cache is checked.

flowchart TD
  Request["home-summary request"] --> Hot{"Hot cache hit?"}
  Hot -->|Yes| ReturnHot["Return hot response"]
  Hot -->|No| Rebuild["Rebuild from DB/use cases"]
  Rebuild -->|Success| StoreBoth["Store hot + backup cache"]
  StoreBoth --> ReturnFresh["Return fresh response"]
  Rebuild -->|Failure| Stale{"Backup cache hit?"}
  Stale -->|Yes| ReturnStale["Return backup response"]
  Stale -->|No| Error["Raise error"]

This fallback is not for hiding an outage. The freshness state and source-check results inside the payload must remain intact so users can interpret it as “the last confirmed value.”

The purpose of the stale cache is not to look successful. It is to show the last trustworthy response while also saying that the response is old.

Without that distinction, fallback stops helping users and starts hiding the problem.
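As a rough sketch of the hot and backup flow described in this section: the key names, TTL values, `MemoryCache` stand-in, and `SummaryUnavailable` error below are hypothetical, chosen only to keep the example self-contained. The function returns the payload untouched together with a label saying where it came from, so the freshness state inside the payload stays intact.

```python
import json

HOT_KEY = "home-summary:hot"      # assumed key names
STALE_KEY = "home-summary:stale"
HOT_TTL_SECONDS = 30              # assumed: short TTL for normal reuse
STALE_TTL_SECONDS = 6 * 60 * 60   # assumed: longer-lived backup


class SummaryUnavailable(Exception):
    """Raised when rebuilding fails and no backup payload exists."""


class MemoryCache:
    """In-memory stand-in for Redis, used only to keep the sketch runnable."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, ttl_seconds):
        self.data[key] = value  # TTL is ignored here; Redis would expire the key

    def delete(self, key):
        self.data.pop(key, None)


def get_home_summary(cache, rebuild):
    """Return (payload, origin), where origin is 'hot', 'rebuilt', or 'stale'."""
    hot = cache.get(HOT_KEY)
    if hot is not None:
        return json.loads(hot), "hot"
    try:
        payload = rebuild()
    except Exception as exc:
        backup = cache.get(STALE_KEY)
        if backup is None:
            raise SummaryUnavailable("rebuild failed and no backup exists") from exc
        # The backup payload is returned unchanged, freshness fields included.
        return json.loads(backup), "stale"
    encoded = json.dumps(payload)
    cache.set(HOT_KEY, encoded, HOT_TTL_SECONDS)
    cache.set(STALE_KEY, encoded, STALE_TTL_SECONDS)
    return payload, "rebuilt"


def failing_rebuild():
    raise RuntimeError("source unavailable")


cache = MemoryCache()
summary = {"freshness": "fresh", "confirmed_at": "2024-05-01T12:00:00+00:00"}
first = get_home_summary(cache, lambda: summary)
cache.delete(HOT_KEY)  # simulate the short TTL expiring
fallback = get_home_summary(cache, failing_rebuild)
print(first[1], fallback[1])  # prints "rebuilt stale"
```

Returning the origin alongside the payload is one way to let the caller log or surface that a backup was served without rewriting the payload itself.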

HTTP cache only reduces repeated requests inside the app

Server-side Redis cache and HTTP cache have different jobs. A home-summary response is sent with private cache headers and an ETag. This lets the same app instance get a 304 Not Modified when it repeatedly asks for the same first-screen value within a short period.

It is not a public cache shared by all users. The response has to consider app access conditions and request context, so it is treated as private.
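A framework-free sketch of that exchange is shown here. The `max-age=15` value, the hash-based ETag, and the `respond` helper are assumptions for illustration; the sketch handles only a plain comma-separated list of strong tags in `If-None-Match`, not `*` or weak validators.

```python
import hashlib
import json


def make_etag(payload):
    """Derive a strong ETag from a canonical JSON encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


def respond(payload, if_none_match=None):
    """Return (status, headers, body) for a home-summary response."""
    etag = make_etag(payload)
    # "private" keeps shared caches from storing a response tied to one client.
    headers = {"Cache-Control": "private, max-age=15", "ETag": etag}
    client_tags = [tag.strip() for tag in (if_none_match or "").split(",")]
    if etag in client_tags:
        return 304, headers, b""  # the client already holds this exact payload
    return 200, headers, json.dumps(payload).encode("utf-8")


summary = {"freshness": "fresh", "lots": [{"id": "A", "available": 3}]}
status, headers, body = respond(summary)
repeat_status, _, repeat_body = respond(summary, if_none_match=headers["ETag"])
print(status, repeat_status)  # prints "200 304"
```

Since the ETag is computed from the payload, a change in freshness state inside the payload also changes the tag, so a client does not keep an outdated body after the server's view changes.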

I look at which responses a changed source affects

The Hangangjari cache holds no manually edited data, only rebuilt query output. So I think about invalidation by pattern rather than by individual key.

When forecasts change, forecast cache and home summary cache are affected. When outing source-check results change, home summary is affected too. The parking value cache is refreshed as the polling job writes the latest snapshots.

Invalidation is less about “delete exactly one key” and more about “which responses must reflect this source change?”
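One way to express that question in code is a table from source change to affected key patterns. Everything named here is hypothetical: the source names, the key patterns, and the choice to clear only the hot home-summary key while leaving the backup in place until the next successful rebuild overwrites it. A dict stands in for Redis; with redis-py, the matching step would typically use `scan_iter(match=...)` instead.

```python
from fnmatch import fnmatchcase

# Which cached responses must reflect a change in each source (assumed names).
AFFECTED_PATTERNS = {
    "forecast": ["forecast:*", "home-summary:hot"],
    "outing_source_check": ["home-summary:hot"],
    # Parking status is not listed: the polling job overwrites it with each snapshot.
}


def invalidate_for(source, store):
    """Delete every key affected by a change in `source`; return the deleted keys."""
    patterns = AFFECTED_PATTERNS.get(source, [])
    doomed = [key for key in store if any(fnmatchcase(key, p) for p in patterns)]
    for key in doomed:
        del store[key]
    return sorted(doomed)


store = {
    "forecast:overview": "...",
    "forecast:timeline": "...",
    "home-summary:hot": "...",
    "home-summary:stale": "...",
    "status:A": "...",
}
deleted = invalidate_for("forecast", store)
print(deleted)
```

Keeping the mapping in one place makes the review question explicit when a new cached response is added: which sources does it depend on?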

Fallback is useful only when it says the value is old

It was safer to keep only rebuildable query results in Redis and leave reference data out of it. Even after a cache hit, I had to check the source confirmation time to avoid presenting old values as current.

Separating hot cache from stale cache lets the server handle fast responses and outage fallback at the same time.

But stale fallback must preserve the freshness state in the payload, and cache incidents should be absorbed by DB fallback where possible and then recorded as metrics. HTTP private cache and server-side Redis cache also have different purposes.

After adding Redis, the conclusion stayed simple. Cache is a tool for making responses fast, not a reason for users to feel safe. Fallback becomes a useful experience only when an old value can say it is old.
