A walk through how a social platform builds, stores, and serves a personalized stream of posts from the accounts each user follows — and the tradeoffs that keep it fast at scale.
Before sketching boxes, pin down the small set of operations the feed actually has to support and the budgets it has to meet.
A user creates a text, image, or video post. The post becomes visible to everyone who follows that user.
Users build their own audience graph. The follow edge drives which posts appear in whose feed.
A user opens the app and sees a ranked, paginated list of recent posts from followed accounts within a few hundred milliseconds.
Hundreds of millions of users, each following tens to hundreds of accounts on average, with a long tail of accounts that have millions of followers.
Feed load should feel instant on mobile. Aim for p99 well under one second end to end, including network.
New posts from people you follow should show up in seconds, not minutes — without a manual refresh in most cases.
The hard part is not storing posts — it is deciding when to assemble a user's feed. That decision splits the system into a write path and a read path.
The write path is what happens at the moment someone hits "post." The new entry has to be persisted, indexed, and somehow made discoverable by every follower's future feed read.
The read path is what happens when a user opens the app. The system has to return a ranked page of posts quickly, drawing on whatever was pre-computed and whatever still has to be assembled on demand.
In fanout on write (push), the system actively pushes a reference to each new post into the inbox of every follower at the moment the post is created. By the time anyone opens the app, their feed is already assembled.
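A minimal sketch of the push model, assuming in-memory dictionaries stand in for the follow graph and the per-user inboxes. The names `followers`, `inboxes`, `publish`, and `read_feed` are illustrative, not taken from any real system.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the follow graph and per-user inboxes.
followers = {"alice": ["bob", "carol"], "bob": ["carol"]}
inboxes = defaultdict(list)  # user -> post IDs, newest first

def publish(author, post_id):
    """Fanout on write: copy the post ID into every follower's inbox."""
    for follower in followers.get(author, []):
        inboxes[follower].insert(0, post_id)

def read_feed(user, limit=20):
    """The read is a single lookup because the feed is already assembled."""
    return inboxes[user][:limit]

publish("alice", "p1")
publish("bob", "p2")
print(read_feed("carol"))  # ['p2', 'p1']
print(read_feed("bob"))    # ['p1']
```

The cost of `publish` grows with the author's follower count, while `read_feed` stays constant.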
In fanout on read (pull), nothing happens to other users' feeds when a post is published. Instead, when a viewer opens the app, the system pulls the latest posts from each followed account and merges them on the fly.
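The pull model can be sketched as a k-way merge at request time. This assumes each author's posts are already stored newest first as `(timestamp, post_id)` pairs; the data and function names are hypothetical.

```python
import heapq
from itertools import islice

# Hypothetical data: each author's posts as (timestamp, post_id), newest first.
posts_by_author = {
    "alice": [(105, "a2"), (100, "a1")],
    "bob": [(103, "b1")],
    "carol": [(104, "c2"), (101, "c1")],
}
following = {"dave": ["alice", "bob", "carol"]}

def read_feed(user, limit=20):
    """Fanout on read: merge each followed author's list at request time."""
    sources = [posts_by_author.get(a, []) for a in following.get(user, [])]
    # Each source is already newest first, so a k-way merge keeps that order.
    merged = heapq.merge(*sources, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]

print(read_feed("dave", limit=4))  # ['a2', 'c2', 'b1', 'c1']
```

Publishing costs nothing extra here, but every read touches one source per followed account.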
Neither model is universally better. The right answer depends on the shape of the social graph and the read-to-write ratio.
| Dimension | Fanout on write (push) | Fanout on read (pull) |
|---|---|---|
| Read latency | Fast — feed is already assembled, single lookup | Slow — must query and merge many sources per request |
| Write cost | High — one post triggers writes to every follower's inbox | Low — one post is one insert, nothing more |
| High-follower accounts | Painful — a post by a celebrity creates millions of fanout writes | Cheap — followers all read from the same source list |
| Inactive followers | Wasteful — feeds are built for users who may never log in | Efficient — no work is done until someone actually asks |
| Storage overhead | High — every user keeps a personalized cached list | Low — only the underlying posts and follow graph |
| Best when | Reads vastly outnumber writes; most accounts have modest follower counts | Hot accounts have enormous followings; most users read rarely |
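
To make the write-cost and read-cost rows concrete, here is a back-of-the-envelope calculation. The numbers are illustrative assumptions, not measurements from any platform.

```python
def push_writes_per_day(posts_per_day, followers):
    """Inbox writes caused by one author under fanout on write."""
    return posts_per_day * followers

def pull_reads_per_day(feed_loads_per_day, follows):
    """Source lookups caused by one viewer under fanout on read."""
    return feed_loads_per_day * follows

# Illustrative numbers only.
print(push_writes_per_day(posts_per_day=2, followers=200))        # 400
print(push_writes_per_day(posts_per_day=2, followers=5_000_000))  # 10000000
print(pull_reads_per_day(feed_loads_per_day=10, follows=300))     # 3000
```

The same two posts per day cost 400 inbox writes for an ordinary account and ten million for an account with five million followers, which is why the high-follower row dominates the decision.
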
Real systems mix both. Ordinary accounts use fanout on write, since their follower lists are small. A handful of high-fanout accounts are treated specially and pulled at read time.
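One way the mixed read might look, assuming the viewer's inbox was filled at write time and posts from high-fanout accounts are kept in a separate per-author list. All names and data here are hypothetical.

```python
import heapq

# Hypothetical state. Entries are (timestamp, post_id), newest first.
inbox = {"dave": [(104, "c2"), (101, "c1")]}           # pushed at write time
celebrity_posts = {"star": [(105, "s2"), (90, "s1")]}  # never fanned out
followed_celebrities = {"dave": ["star"]}

def read_feed(user, limit=20):
    """Merge the precomputed inbox with posts pulled from high-fanout authors."""
    sources = [inbox.get(user, [])]
    sources += [celebrity_posts.get(c, []) for c in followed_celebrities.get(user, [])]
    merged = heapq.merge(*sources, reverse=True)
    return [post_id for _, post_id in merged][:limit]

print(read_feed("dave"))  # ['s2', 'c2', 'c1', 's1']
```

The read stays cheap as long as each viewer follows only a few pull-mode accounts.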
A handful of focused services, each with a clear job. They communicate through queues and caches so the slow paths never block the fast ones.
Accepts new posts, validates them, stores them in the primary post database, and emits an event for downstream consumers. This service does not care about followers.
Subscribes to post events. For each new post, it loads the author's follower list and decides whether to push the post ID into each follower's feed cache or defer the work.
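The push-or-defer decision could be as simple as a follower-count threshold. The sketch below assumes a cutoff of 10,000 followers and a 500-entry cache; both values are made up for illustration.

```python
from collections import defaultdict, deque

PUSH_THRESHOLD = 10_000  # assumed cutoff; a real value would come from load testing
FEED_LIMIT = 500         # assumed cap on cached IDs per user

feed_cache = defaultdict(lambda: deque(maxlen=FEED_LIMIT))
pull_authors = set()     # authors whose posts are merged at read time instead

def handle_post_event(author, post_id, follower_ids):
    """Push to each follower's cache, or defer when the author's fanout is too large."""
    if len(follower_ids) > PUSH_THRESHOLD:
        pull_authors.add(author)
        return "deferred"
    for follower in follower_ids:
        feed_cache[follower].appendleft(post_id)
    return "pushed"

print(handle_post_event("alice", "p1", ["bob", "carol"]))          # pushed
print(handle_post_event("star", "p2", range(PUSH_THRESHOLD + 1)))  # deferred
```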
The feed cache is a per-user list of recent post IDs, kept in a fast key-value store. The read path hits this structure first. It is sized to roughly the visible window of the feed.
The read side handles the feed request. It pulls the cached list, hydrates each post ID into a full post, merges in any pull-mode sources, applies ranking and pagination, and returns the page.
Each user gets a bounded, ordered list of post IDs — not the posts themselves. The post bodies live elsewhere and are hydrated on demand.
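A small sketch of the bounded ID list and on-demand hydration, using a `deque` with a tiny cap so the eviction is visible. The `post_store` dictionary and `hydrate` helper are assumptions for illustration.

```python
from collections import deque

FEED_LIMIT = 3  # tiny cap for illustration only

post_store = {"p1": {"text": "first"}, "p2": {"text": "second"},
              "p3": {"text": "third"}, "p4": {"text": "fourth"}}
feed = deque(maxlen=FEED_LIMIT)  # holds post IDs only, newest first

for post_id in ["p1", "p2", "p3", "p4"]:
    feed.appendleft(post_id)  # the oldest ID falls off the right end when full

def hydrate(post_ids):
    """Resolve IDs to full posts, skipping any ID the store no longer has."""
    return [{"id": pid, **post_store[pid]} for pid in post_ids if pid in post_store]

print(list(feed))                # ['p4', 'p3', 'p2']
print(hydrate(["p4", "gone"]))   # [{'id': 'p4', 'text': 'fourth'}]
```

Storing only IDs keeps each cache entry small and means edits to a post body never require touching the cached lists.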
Feeds are constantly being prepended to. Offset-based pagination breaks the moment new posts arrive — items shift and pages start repeating or skipping. A cursor pins the read to a stable position in the stream.
Client asks for "page 2, 20 per page." If three new posts arrived between page 1 and page 2, the first three items of page 2 are the same items the client already saw at the end of page 1.
Each page response carries an opaque cursor pointing just past the last returned item. The next request asks for "items older than this cursor." New posts prepended at the top do not disturb the cursor.
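A minimal cursor sketch, assuming the feed is ordered by `(timestamp, post_id)` and the cursor is just that pair, base64-encoded so clients treat it as opaque. A real store would seek by index instead of scanning the list.

```python
import base64
import json

# Hypothetical feed, newest first, as (timestamp, post_id) pairs.
feed = [(t, f"p{t}") for t in range(10, 0, -1)]

def encode_cursor(item):
    return base64.urlsafe_b64encode(json.dumps(item).encode()).decode()

def decode_cursor(cursor):
    return tuple(json.loads(base64.urlsafe_b64decode(cursor)))

def get_page(cursor=None, limit=3):
    """Return items strictly older than the cursor, plus the next cursor."""
    older = feed if cursor is None else [i for i in feed if i < decode_cursor(cursor)]
    page = older[:limit]
    next_cursor = encode_cursor(page[-1]) if page else None
    return [pid for _, pid in page], next_cursor

page1, cursor = get_page()
feed.insert(0, (11, "p11"))  # a new post arrives between requests
page2, _ = get_page(cursor)
print(page1)  # ['p10', 'p9', 'p8']
print(page2)  # ['p7', 'p6', 'p5']
```

With an offset of 3, the second request would have returned `p8` again; the cursor skips it because the position is tied to an item, not to an index.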
Once the candidate set of posts is assembled, the system has to decide what order to show them in. The choice shapes user behavior more than almost any other design decision.
Newest first, oldest last. The merge is a simple sort by timestamp.
A scoring model assigns each candidate post a relevance score from features like recency, author affinity, predicted engagement, content type, and viewer history.
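As a rough illustration, a hand-weighted linear score over three of those features might look like the following. The weights, the six-hour half-life, and the feature names are assumptions; production rankers are usually learned models rather than fixed formulas.

```python
import math

# Assumed weights and half-life; real systems typically learn these from data.
WEIGHTS = {"recency": 0.5, "affinity": 0.3, "engagement": 0.2}
HALF_LIFE_HOURS = 6.0

def score(post, now_hours):
    """Weighted sum of a recency decay and two features already scaled to [0, 1]."""
    age = now_hours - post["created_hours"]
    recency = math.pow(0.5, age / HALF_LIFE_HOURS)
    return (WEIGHTS["recency"] * recency
            + WEIGHTS["affinity"] * post["affinity"]
            + WEIGHTS["engagement"] * post["predicted_engagement"])

candidates = [
    {"id": "fresh", "created_hours": 23.0, "affinity": 0.1, "predicted_engagement": 0.1},
    {"id": "close_friend", "created_hours": 12.0, "affinity": 0.9, "predicted_engagement": 0.8},
]
ranked = sorted(candidates, key=lambda p: score(p, now_hours=24.0), reverse=True)
print([p["id"] for p in ranked])  # ['close_friend', 'fresh']
```

The older post wins here because high affinity and predicted engagement outweigh its recency penalty, which is exactly the behavior a chronological feed cannot express.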
A feed system is mostly correct on the happy path. Production complexity lives in the messy interactions between deletes, privacy, and the social graph.
When a post is deleted, its ID may still sit in millions of feed caches. Either tombstone the post and filter on hydration, or rely on the hydration step to drop missing IDs gracefully.
When a viewer blocks another user, past posts from the blocked user must disappear from the viewer's feed. Filter at read time, and never trust pre-built feeds to reflect the latest relationship state.
A user makes their account private, or restricts a post to a subset of followers. The feed pipeline must re-check visibility per viewer, not just at fanout time.
A repost references another post. The repost and the original should not both appear in the same feed. Dedupe by underlying content, and decide whose voice anchors the feed entry.
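A sketch of deduping by underlying post, assuming each repost carries a hypothetical `repost_of` field. Here the first entry in feed order anchors the item; that tie-break is one policy among several.

```python
def dedupe(entries):
    """Keep the first entry per underlying post; entries arrive in feed order."""
    seen = set()
    result = []
    for entry in entries:
        root = entry.get("repost_of") or entry["id"]
        if root not in seen:
            seen.add(root)
            result.append(entry)
    return result

entries = [
    {"id": "r1", "author": "bob", "repost_of": "p1"},
    {"id": "p1", "author": "alice"},
    {"id": "p2", "author": "carol"},
]
print([e["id"] for e in dedupe(entries)])  # ['r1', 'p2']
```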
After an unfollow, already-cached post IDs from that author should fade out. Either purge on unfollow, or filter at read time against the current follow set.
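The read-time checks for deletes, blocks, and unfollows can share one predicate. This sketch assumes the current relationship state is available as simple sets; per-post audience rules would slot into the same function.

```python
# Hypothetical current relationship state, read at request time.
follows = {"dave": {"alice", "carol"}}
blocked = {"dave": {"carol"}}
deleted_posts = {"p3"}
post_authors = {"p1": "alice", "p2": "bob", "p3": "alice", "p4": "carol"}

def visible_to(viewer, post_id):
    """Re-check a cached post ID against the latest deletes, follows, and blocks."""
    author = post_authors.get(post_id)
    if author is None or post_id in deleted_posts:
        return False
    if author not in follows.get(viewer, set()):
        return False
    return author not in blocked.get(viewer, set())

cached = ["p4", "p3", "p2", "p1"]
print([p for p in cached if visible_to("dave", p)])  # ['p1']
```

The cached list still holds four IDs, but only one survives: `p4` is from a blocked author, `p3` was deleted, and `p2` is from an account the viewer no longer follows.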
Caches can be lost, partially written, or fall out of sync with the post store. A periodic rebuild job restores feeds for active users from the source of truth.
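A rebuild for one user might look like the sketch below, which recomputes the cached list from the post store and the follow graph. The data layout and the three-entry cap are assumptions for illustration.

```python
import heapq

FEED_LIMIT = 3  # assumed cache size

# Source of truth: the post store and the follow graph.
posts = [(100, "a1", "alice"), (101, "b1", "bob"), (102, "a2", "alice"),
         (103, "c1", "carol"), (104, "b2", "bob")]
follows = {"dave": {"alice", "bob"}}

def rebuild_feed(user):
    """Recompute a user's cached ID list from scratch, newest first."""
    mine = [(ts, pid) for ts, pid, author in posts if author in follows.get(user, set())]
    return [pid for _, pid in heapq.nlargest(FEED_LIMIT, mine)]

feed_cache = {"dave": ["stale"]}  # a drifted cache entry
for user in ["dave"]:             # in practice, iterate over recently active users
    feed_cache[user] = rebuild_feed(user)
print(feed_cache["dave"])  # ['b2', 'a2', 'b1']
```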
A news feed is a long-running optimization problem balancing freshness, cost, and latency across a graph with a brutal long tail.
Decide where the work happens. Every other choice flows from that one decision.
Push by default. Pull for accounts whose fanout would otherwise melt the write path.
Keep the hot per-user list small. Resolve post bodies on demand, against the latest privacy rules.
Anything mutable at the head needs an order-stable pointer, not an integer offset.
Blocks, privacy, and unfollows change faster than caches. Trust the read-side check, not the write-side snapshot.
Caches will drift. A background reconciliation job is part of the design, not an afterthought.