Building an in-app messaging system involves multiple layers working together: event detection on your backend, real-time delivery infrastructure, persistent storage, and UI components on the frontend. Understanding how these pieces fit together helps you make better architectural decisions, whether you're building from scratch or integrating a third-party service.
Understanding the architecture
At its core, an in-app messaging system follows a predictable flow. Something happens in your application that triggers a notification. Your backend processes this event, determines who should be notified and how, then stores the message and queues it for delivery. Real-time infrastructure pushes the notification to any connected clients, where UI components render it appropriately. When users interact with the message, those state changes sync back to your server.

Event triggers and message orchestration
Everything starts with an event. This could be a user action, such as posting a comment; a system event, such as a processed payment; or a scheduled task, such as a weekly digest. Your backend needs to capture these events and transform them into notifications.
In the simplest case, you might directly create a notification record when an event occurs. When a user comments on a post, your API endpoint creates a notification for the post author. But this quickly becomes insufficient for real applications. What if the author has muted notifications from this user? What if they've received five comments in the last minute? Should you batch them? What if it's 3 AM in their timezone?
This is where message orchestration becomes critical. Instead of hard-coding notification logic throughout your application, you centralize it in a notification service or workflow engine. This orchestration layer handles complex decision-making, including user preferences, delivery timing, message batching, channel selection, and failover logic.
Direct vs. orchestrated notifications
// Simple direct notification: the handler writes the record itself
async function handleNewComment(comment, post) {
  await createNotification({
    userId: post.authorId,
    type: "comment",
    message: `${comment.author} commented on your post`,
  });
}

// Orchestrated workflow: an alternative version of the same handler that
// only reports the event and lets the notification service decide the rest
async function handleNewComment(comment, post) {
  await notificationService.trigger("new-comment", {
    recipient: post.authorId,
    actor: comment.authorId,
    data: { postId: post.id, commentId: comment.id },
    preferences: ["in-app", "email"], // Check user preferences
    rules: {
      batch: { window: "5m", key: "post.id" }, // Batch by post
      quiet_hours: { start: "22:00", end: "08:00" },
      rate_limit: { max: 10, window: "1h" },
    },
  });
}
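The rules object above is declarative, so something in the orchestration layer has to evaluate it. As a minimal sketch of one such rule, the helper below checks a quiet_hours window. It assumes the times are "HH:MM" strings in the recipient's local time and that a window such as 22:00 to 08:00 wraps past midnight. The names toMinutes, isWithinQuietHours, and decideDelivery are hypothetical and not part of any particular notification service.

```javascript
// Hypothetical helper: parse "HH:MM" into minutes since midnight.
function toMinutes(hhmm) {
  const [hours, minutes] = hhmm.split(":").map(Number);
  return hours * 60 + minutes;
}

// Returns true when localTime ("HH:MM") falls inside the quiet window.
// The start is inclusive and the end is exclusive.
function isWithinQuietHours(localTime, quietHours) {
  const now = toMinutes(localTime);
  const start = toMinutes(quietHours.start);
  const end = toMinutes(quietHours.end);
  if (start === end) return false; // empty window: never quiet
  if (start < end) return now >= start && now < end; // same-day window
  return now >= start || now < end; // window crosses midnight
}

// Decide whether to deliver now or hold the message until quiet hours end.
function decideDelivery(localTime, rules) {
  if (rules.quiet_hours && isWithinQuietHours(localTime, rules.quiet_hours)) {
    return { action: "defer", until: rules.quiet_hours.end };
  }
  return { action: "deliver" };
}
```

A real orchestration layer would also need to convert the event time into the recipient's timezone before calling a check like this; that step is left out here.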
User segmentation and targeting
Modern orchestration systems also handle user segmentation and targeting. You might want to send a feature announcement only to users on a specific plan, or in a particular timezone, or who have used a related feature. The orchestration layer evaluates these conditions and ensures messages reach the right users at the right time.
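To make the targeting idea concrete, here is a small sketch of segment evaluation. The segment shape (plans, timezones, usedFeatures) and the user fields are assumptions chosen to mirror the examples in the paragraph above; real orchestration systems define their own condition formats.

```javascript
// Hypothetical segment check: every condition present on the segment must match.
function matchesSegment(user, segment) {
  if (segment.plans && !segment.plans.includes(user.plan)) return false;
  if (segment.timezones && !segment.timezones.includes(user.timezone)) {
    return false;
  }
  if (segment.usedFeatures) {
    const used = new Set(user.usedFeatures || []);
    if (!segment.usedFeatures.every((feature) => used.has(feature))) {
      return false;
    }
  }
  return true;
}

// Return the ids of all users who belong to the segment.
function selectRecipients(users, segment) {
  return users
    .filter((user) => matchesSegment(user, segment))
    .map((user) => user.id);
}
```

For example, selectRecipients(users, { plans: ["pro"], usedFeatures: ["reports"] }) would return only users on the pro plan who have used the reports feature.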
Real-time delivery infrastructure
Getting messages from your server to users' browsers in real time is one of the most complex parts of the system. You need persistent connections that can survive network hiccups, scale to thousands of concurrent users, and deliver messages with minimal latency.
Connection management at scale
WebSockets are a common choice for real-time communication. They establish a persistent, bidirectional connection between client and server, allowing instant message delivery in both directions. When a new notification is created, your server can immediately push it to all connected clients for that user. The challenge lies in managing these connections at scale. Each WebSocket connection consumes server resources, and you need to handle scenarios like users having multiple tabs open, connection drops, and server deployments that require graceful reconnection.

Message delivery guarantees
The infrastructure must also handle message delivery guarantees. What happens if a user's connection drops right as you're sending a notification? You need acknowledgment mechanisms, retry logic, and potentially fallback channels. Many systems implement an "at-least-once" delivery guarantee, where messages might be delivered multiple times but never lost, with deduplication handled on the client side.
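The two halves of an at-least-once scheme can be sketched as follows: the server remembers what it has sent until the client acknowledges it, and the client ignores ids it has already seen. This is an illustrative sketch, not a prescribed design. PendingDeliveries, SeenNotifications, and their default retry settings are hypothetical names and values.

```javascript
// Server side: track messages that were sent but not yet acknowledged.
class PendingDeliveries {
  constructor({ retryAfterMs = 5000, maxAttempts = 3 } = {}) {
    this.retryAfterMs = retryAfterMs;
    this.maxAttempts = maxAttempts;
    this.pending = new Map(); // notificationId -> { message, attempts, lastSentAt }
  }

  // Record a send attempt (the first send or a retry).
  recordSend(notificationId, message, now = Date.now()) {
    const entry = this.pending.get(notificationId) || { message, attempts: 0 };
    entry.attempts += 1;
    entry.lastSentAt = now;
    this.pending.set(notificationId, entry);
  }

  // Called when the client's acknowledgment arrives.
  acknowledge(notificationId) {
    return this.pending.delete(notificationId);
  }

  // Split overdue entries into ones to resend and ones that ran out of
  // attempts (candidates for a fallback channel such as email).
  collectOverdue(now = Date.now()) {
    const retry = [];
    const exhausted = [];
    for (const [id, entry] of this.pending) {
      if (now - entry.lastSentAt < this.retryAfterMs) continue;
      if (entry.attempts >= this.maxAttempts) {
        exhausted.push(id);
        this.pending.delete(id);
      } else {
        retry.push({ id, message: entry.message });
      }
    }
    return { retry, exhausted };
  }
}

// Client side: drop duplicates caused by retries.
class SeenNotifications {
  constructor(limit = 500) {
    this.limit = limit;
    this.ids = new Set();
  }

  // Returns true the first time an id is seen, false for a duplicate.
  markSeen(id) {
    if (this.ids.has(id)) return false;
    this.ids.add(id);
    if (this.ids.size > this.limit) {
      // Sets iterate in insertion order, so this removes the oldest id.
      this.ids.delete(this.ids.values().next().value);
    }
    return true;
  }
}
```

A server would call collectOverdue on a timer, resend each entry in retry (calling recordSend again), and route exhausted ids to a fallback channel. The bounded set on the client keeps memory flat at the cost of forgetting very old ids.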
Server-side WebSocket implementation
Most production systems use a WebSocket library such as ws, or a higher-level library like Socket.IO, to handle the complexity of connection management. Here's how a Node.js server might set up the real-time infrastructure using the ws library together with Express and JWT authentication:
const express = require("express");
const http = require("http");
const WebSocket = require("ws");
const jwt = require("jsonwebtoken");

const app = express();
const server = http.createServer(app);

// Create WebSocket server
const wss = new WebSocket.Server({
  server,
  verifyClient: (info) => {
    // Basic origin check
    return info.origin === process.env.CLIENT_URL;
  },
});

// Store active connections by user and by tenant
const userConnections = new Map();
const tenantConnections = new Map();

wss.on("connection", (ws, req) => {
  // Extract token from query parameters or headers
  const url = new URL(req.url, `http://${req.headers.host}`);
  const token =
    url.searchParams.get("token") ||
    req.headers.authorization?.replace("Bearer ", "");

  try {
    // Throws if the token is missing, malformed, or expired
    const decoded = jwt.verify(token, process.env.JWT_SECRET);
    ws.userId = decoded.userId;
    ws.tenantId = decoded.tenantId;
    console.log(`User ${ws.userId} connected`);

    // Store connection for targeted delivery
    if (!userConnections.has(ws.userId)) {
      userConnections.set(ws.userId, new Set());
    }
    userConnections.get(ws.userId).add(ws);

    if (ws.tenantId) {
      if (!tenantConnections.has(ws.tenantId)) {
        tenantConnections.set(ws.tenantId, new Set());
      }
      tenantConnections.get(ws.tenantId).add(ws);
    }

    // Handle incoming messages
    ws.on("message", (data) => {
      try {
        const message = JSON.parse(data);
        handleClientMessage(ws, message);
      } catch (err) {
        console.error("Invalid message format:", err);
      }
    });

    // Handle disconnection
    ws.on("close", () => {
      console.log(`User ${ws.userId} disconnected`);

      // Remove from connection maps
      if (userConnections.has(ws.userId)) {
        userConnections.get(ws.userId).delete(ws);
        if (userConnections.get(ws.userId).size === 0) {
          userConnections.delete(ws.userId);
        }
      }
      if (ws.tenantId && tenantConnections.has(ws.tenantId)) {
        tenantConnections.get(ws.tenantId).delete(ws);
        if (tenantConnections.get(ws.tenantId).size === 0) {
          tenantConnections.delete(ws.tenantId);
        }
      }
    });
  } catch (err) {
    console.error("Authentication failed:", err);
    ws.close(1008, "Authentication failed");
  }
});

function handleClientMessage(ws, message) {
  switch (message.type) {
    case "notification_ack":
      // markNotificationAsDelivered is assumed to be defined elsewhere in
      // your application (for example, in the storage layer)
      markNotificationAsDelivered(ws.userId, message.notificationId);
      break;
    case "ping":
      ws.send(JSON.stringify({ type: "pong" }));
      break;
  }
}

// Start accepting HTTP and WebSocket connections
server.listen(process.env.PORT || 3000);
Broadcasting notifications
When your orchestration layer determines that a notification should be sent, the server pushes it immediately to the appropriate recipients. Delivery is targeted through the connection maps rather than broadcast to every connected client:
async function deliverNotification(notification) {
  // Store the notification in your database first
  const savedNotification = await Notification.create({
    id: notification.id,
    userId: notification.userId,
    type: notification.type,
    title: notification.title,
    body: notification.body,
    data: notification.data,
    status: "pending",
    createdAt: new Date(),
  });

  // Determine delivery targets based on notification type
  const deliveryTargets = await determineDeliveryTargets(notification);
  let sentCount = 0;

  for (const target of deliveryTargets) {
    if (target.type === "user") {
      // Send to specific user across all their sessions
      const userSessions = userConnections.get(target.userId);
      if (userSessions) {
        const message = JSON.stringify({
          type: "notification",
          data: {
            id: savedNotification.id,
            type: savedNotification.type,
            title: savedNotification.title,
            body: savedNotification.body,
            data: savedNotification.data,
            timestamp: savedNotification.createdAt,
          },
        });
        userSessions.forEach((ws) => {
          if (ws.readyState === WebSocket.OPEN) {
            ws.send(message);
            sentCount++;
          }
        });
      }
    } else if (target.type === "tenant") {
      // Send to all users in an organization
      const tenantSessions = tenantConnections.get(target.tenantId);
      if (tenantSessions) {
        const message = JSON.stringify({
          type: "announcement",
          data: {
            id: savedNotification.id,
            type: savedNotification.type,
            title: savedNotification.title,
            body: savedNotification.body,
            priority: notification.priority || "normal",
          },
        });
        tenantSessions.forEach((ws) => {
          if (ws.readyState === WebSocket.OPEN) {
            ws.send(message);
            sentCount++;
          }
        });
      }
    }
  }

  // Update notification status. "sent" only means the message was written to
  // at least one open socket; the client's notification_ack is what confirms
  // delivery. With no open sockets, the record stays "pending" so the client
  // picks it up on its next fetch.
  if (sentCount > 0) {
    await savedNotification.update({ status: "sent" });
  }
}
The key advantage of this approach is instant delivery without polling. When a user comments on a post, the post author receives the notification immediately on every open session, regardless of which page they're viewing or how long they've been idle. Targeting through the per-user and per-tenant connection maps ensures notifications reach only the intended recipients, maintaining privacy and security boundaries.
Storage and state management
Unlike ephemeral push notifications, in-app messages require persistent storage. Users expect to see their notification history, reference old messages, and have their read/unread states sync across devices. This means every notification needs to be stored with sufficient metadata for querying and state management.
Notification data model
Your notification data model needs to capture not just the message content, but also its lifecycle. A typical notification record includes the recipient user ID, message type and content, creation timestamp, read/unread status, interaction history (including clicks, dismissals, and archiving), expiration date, and any custom data required for rendering or actions. You'll also need indexes for efficient querying: fetching all notifications for a user, filtering by read/unread status, sorting by timestamp, and potentially full-text search.
const exampleNotification = {
  "id": "notif_abc123",
  "userId": "user_456",
  "type": "comment",
  "title": "New comment on your post",
  "body": "Sarah said: \"Great article!\"",

  // State tracking
  "status": "unread",
  "createdAt": "2024-01-15T10:30:00Z",
  "readAt": null,
  "clickedAt": null,
  "archivedAt": null,

  // Delivery tracking
  "deliveredChannels": ["in-app", "email"],
  "deliveredAt": {
    "in-app": "2024-01-15T10:30:01Z",
    "email": "2024-01-15T10:32:00Z"
  },

  // Context for actions
  "data": {
    "postId": "post_789",
    "commentId": "comment_123",
    "actorId": "user_sarah"
  },

  // Cleanup
  "expiresAt": "2024-02-15T10:30:00Z"
};
Cross-device synchronization
State synchronization becomes complex when users have multiple sessions. If someone reads a notification on their phone, it should appear as read on their desktop immediately. This requires your real-time infrastructure to broadcast state changes to all of a user's connected clients, or at a minimum, to invalidate caches and refetch data.
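One way to push a read-state change to a user's other sessions is sketched below. It assumes the same shape as the userConnections map from the server example (user ID mapped to a Set of sockets) and the status_update message type that the client handles. The function name broadcastStatusUpdate and the originSocket parameter are hypothetical.

```javascript
// 1 is the numeric value of the OPEN ready state, both in browsers and in
// the ws library.
const SOCKET_OPEN = 1;

// Send a state change to every open session of a user, skipping the session
// that made the change (it has already updated its own UI).
// Returns the number of sessions the update was written to.
function broadcastStatusUpdate(userConnections, userId, update, originSocket = null) {
  const sessions = userConnections.get(userId);
  if (!sessions) return 0;

  const payload = JSON.stringify({ type: "status_update", data: update });
  let sent = 0;
  for (const socket of sessions) {
    if (socket === originSocket) continue;
    if (socket.readyState !== SOCKET_OPEN) continue;
    socket.send(payload);
    sent += 1;
  }
  return sent;
}
```

The server would call this after persisting the change, for example broadcastStatusUpdate(userConnections, ws.userId, { id: notificationId, status: "read" }, ws). Sessions that are offline at that moment catch up when they refetch on reconnect.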
Data retention and cleanup
Data retention and cleanup are often overlooked but critical for long-term system health. Old notifications should be archived or deleted based on your retention policy. This isn't just about database size; it's also about privacy regulations, such as GDPR, that may require you to delete user data upon request. Implement background jobs that regularly clean up expired notifications, archive old but essential messages, and compact your indexes.
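The core decision inside such a background job can be sketched as a pure function over notification records. It reuses the createdAt, archivedAt, and expiresAt fields from the data model above. The policy shape (archiveAfterDays) and the function names are assumptions for illustration, and a real job would run the equivalent logic as batched database queries rather than in memory.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Classify one record as "delete", "archive", or "keep".
// `now` is a millisecond timestamp; record dates are ISO 8601 strings.
function retentionAction(notification, now, policy) {
  if (notification.expiresAt && Date.parse(notification.expiresAt) <= now) {
    return "delete";
  }
  const ageMs = now - Date.parse(notification.createdAt);
  if (!notification.archivedAt && ageMs > policy.archiveAfterDays * DAY_MS) {
    return "archive";
  }
  return "keep";
}

// Group notification ids by the action the cleanup job should take.
function planCleanup(notifications, now, policy) {
  const plan = { delete: [], archive: [], keep: [] };
  for (const notification of notifications) {
    plan[retentionAction(notification, now, policy)].push(notification.id);
  }
  return plan;
}
```

Keeping the decision separate from the database calls makes the retention policy easy to unit test and to adjust when requirements change.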
Frontend integration
On the client side, your notification system needs to manage several responsibilities simultaneously. It must establish and maintain real-time connections, handling reconnection gracefully when networks fail. It needs to fetch and cache notification data, providing instant UI updates while syncing with the server. Different message formats require different rendering components, from simple toast notifications to complex interactive cards. User interactions must be captured and synced back to the server, updating read states and triggering any associated actions.
Connection management
The implementation typically starts with a connection manager that handles the WebSocket or Server-Sent Events (SSE) connection. This manager deals with authentication, reconnection logic with exponential backoff, message queuing during disconnections, and connection state synchronization across tabs. Modern browsers provide APIs, such as BroadcastChannel or SharedWorker, to help coordinate across multiple tabs, preventing duplicate connections and ensuring consistent state.
class NotificationManager {
  constructor(userId, token) {
    this.userId = userId;
    this.token = token;
    this.reconnectAttempts = 0;
    this.messageQueue = [];
    this.subscribers = new Set();
    this.connect();
  }

  // Register a UI callback; returns a function that unsubscribes it
  subscribe(callback) {
    this.subscribers.add(callback);
    return () => this.subscribers.delete(callback);
  }

  connect() {
    // Include token in query parameters for authentication
    this.ws = new WebSocket(
      `wss://api.app.com/notifications?token=${encodeURIComponent(this.token)}`,
    );

    this.ws.onopen = () => {
      console.log("WebSocket connected");
      this.reconnectAttempts = 0;
      this.flushMessageQueue();
      // Send ping to keep connection alive
      this.startHeartbeat();
    };

    this.ws.onmessage = (event) => {
      let message;
      try {
        message = JSON.parse(event.data);
      } catch (err) {
        console.error("Invalid message format:", err);
        return;
      }
      this.handleMessage(message);
    };

    this.ws.onclose = () => {
      console.log("WebSocket disconnected");
      this.stopHeartbeat();
      this.scheduleReconnect();
    };

    this.ws.onerror = (error) => {
      console.error("WebSocket error:", error);
    };
  }

  // Send now if connected; otherwise queue until the next successful connect
  send(message) {
    if (this.ws && this.ws.readyState === WebSocket.OPEN) {
      this.ws.send(JSON.stringify(message));
    } else {
      this.messageQueue.push(message);
    }
  }

  startHeartbeat() {
    this.heartbeatInterval = setInterval(() => {
      if (this.ws.readyState === WebSocket.OPEN) {
        this.ws.send(JSON.stringify({ type: "ping" }));
      }
    }, 30000); // Ping every 30 seconds
  }

  stopHeartbeat() {
    if (this.heartbeatInterval) {
      clearInterval(this.heartbeatInterval);
    }
  }

  scheduleReconnect() {
    // Exponential backoff: 1s, 2s, 4s, ... capped at 30s
    const delay = Math.min(1000 * 2 ** this.reconnectAttempts, 30000);
    this.reconnectAttempts++;
    setTimeout(() => this.connect(), delay);
  }

  // addNotification, addAnnouncement, and updateNotificationStatus update
  // local state; they are left to the application's state layer
  handleMessage(message) {
    switch (message.type) {
      case "notification":
        this.addNotification(message.data);
        // Acknowledge receipt (queued if the socket closed in the meantime)
        this.send({
          type: "notification_ack",
          notificationId: message.data.id,
        });
        break;
      case "announcement":
        this.addAnnouncement(message.data);
        break;
      case "status_update":
        this.updateNotificationStatus(message.data);
        break;
      case "pong":
        // Heartbeat response - connection is alive
        break;
    }

    // Notify all UI subscribers
    this.subscribers.forEach((callback) => callback(message));
  }

  flushMessageQueue() {
    // Stop early if the socket closes again while flushing
    while (
      this.messageQueue.length > 0 &&
      this.ws.readyState === WebSocket.OPEN
    ) {
      const message = this.messageQueue.shift();
      this.ws.send(JSON.stringify(message));
    }
  }
}
State management integration
Your UI components consume this notification stream through a state management layer, whether that's React Context, Redux, or another solution. The state layer maintains the current list of notifications, unread counts, and filter states, while handling optimistic updates for better perceived performance.
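As a framework-neutral sketch of that state layer, the reducer below tracks the notification list and unread count, applies a read-state change optimistically, and rolls it back if the server call fails. The action names (received, markRead, markReadFailed) and the state shape are assumptions. The same function could back React's useReducer or a Redux store.

```javascript
const initialState = { items: [], unreadCount: 0 };

function setStatus(items, id, status) {
  return items.map((item) => (item.id === id ? { ...item, status } : item));
}

function notificationsReducer(state = initialState, action) {
  switch (action.type) {
    case "received": {
      // Redelivered messages carry a known id, so ignore them.
      if (state.items.some((item) => item.id === action.notification.id)) {
        return state;
      }
      const status = action.notification.status || "unread";
      return {
        items: [{ ...action.notification, status }, ...state.items],
        unreadCount: state.unreadCount + (status === "unread" ? 1 : 0),
      };
    }
    case "markRead": {
      // Optimistic update: applied before the server confirms.
      const target = state.items.find((item) => item.id === action.id);
      if (!target || target.status === "read") return state;
      return {
        items: setStatus(state.items, action.id, "read"),
        unreadCount: state.unreadCount - 1,
      };
    }
    case "markReadFailed": {
      // Roll back the optimistic update when the server rejects it.
      const target = state.items.find((item) => item.id === action.id);
      if (!target || target.status !== "read") return state;
      return {
        items: setStatus(state.items, action.id, "unread"),
        unreadCount: state.unreadCount + 1,
      };
    }
    default:
      return state;
  }
}
```

A component would dispatch markRead as soon as the user opens a notification, send the change to the server, and dispatch markReadFailed only if that request fails.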
Security and privacy
Every aspect of your notification system must be built with security in mind. Authentication is the first line of defense: users should only receive their own notifications. This typically involves authenticating WebSocket connections with JWT tokens or session cookies, with the server validating these tokens before establishing the connection.
Authentication and authorization
But authentication alone isn't enough. You also need to consider data isolation in multi-tenant applications, where notifications must be scoped to the correct organization. Rate limiting prevents abuse and protects your infrastructure from denial-of-service attacks. Encryption ensures sensitive notification content can't be intercepted, both in transit (TLS) and potentially at rest for sensitive data.
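Rate limiting is the most mechanical of these controls, so a small sketch may help. The class below implements a per-key sliding-window limiter held in memory. The class name and its constructor parameters are hypothetical, and a multi-server deployment would need shared storage instead of a local Map.

```javascript
// Allows at most `max` events per key within any window of `windowMs`.
class SlidingWindowRateLimiter {
  constructor(max, windowMs) {
    this.max = max;
    this.windowMs = windowMs;
    this.hits = new Map(); // key -> timestamps of recent allowed events
  }

  // Returns true and records the event if the key is under its limit.
  allow(key, now = Date.now()) {
    const recent = (this.hits.get(key) || []).filter(
      (timestamp) => now - timestamp < this.windowMs,
    );
    const allowed = recent.length < this.max;
    if (allowed) recent.push(now);
    this.hits.set(key, recent);
    return allowed;
  }
}
```

For example, new SlidingWindowRateLimiter(10, 60 * 60 * 1000) would correspond to a limit of ten notifications per recipient per hour, with the recipient's user ID as the key.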

Input validation becomes critical when notifications include user-generated content. A notification displaying "John commented on your post" seems innocent, but what if John's username contains malicious JavaScript? Every piece of user-controlled data must be properly escaped before rendering. Similarly, notification actions (like "click to view") must be validated to prevent unauthorized access to resources.
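For code that builds HTML strings by hand, escaping can be sketched as below. Frameworks such as React escape text content by default, so this matters mostly when writing to innerHTML or rendering templates manually. The helper names are hypothetical, and the path check is a deliberately strict assumption that action targets are same-origin relative paths.

```javascript
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Escape the characters that can change how a string is parsed as HTML.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Build notification text with the user-controlled name escaped.
function renderCommentNotification(actorName) {
  return `${escapeHtml(actorName)} commented on your post`;
}

// Accept only same-origin relative paths as click targets. This rejects
// absolute URLs, protocol-relative URLs, and schemes such as javascript:.
function isSafeActionPath(path) {
  return (
    typeof path === "string" &&
    path.startsWith("/") &&
    !path.startsWith("//") &&
    !path.includes("\\")
  );
}
```

Escaping protects the rendered text, while the path check protects navigation. The server must still verify that the user is authorized to view the resource behind the link.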
Data protection and privacy controls
Privacy considerations extend beyond security. Users should have control over their notification data: the ability to delete notifications, export their notification history, and control retention periods. Consider whether notifications might leak sensitive information if someone gains access to a user's device while they're logged in. Some applications implement additional privacy features, such as automatic notification clearing after logout or encrypted notification content that requires re-authentication to view.
The complexity of building a robust, scalable in-app messaging system explains why many teams choose to integrate existing solutions rather than building from scratch. But understanding these technical foundations helps you make better architectural decisions, whether you're evaluating third-party services or designing your own implementation.