
Saga Pattern (Distributed Transactions)

In a monolithic application, if an order fails at the last step, the database can roll back everything with a single SQL transaction (e.g., START TRANSACTION; ... ROLLBACK;).

However, in a Microservices Architecture, the system is intentionally spread across multiple separate databases isolated by business domain (e.g., Order DB, Payment DB, Inventory DB). A single local SQL transaction cannot span three independent databases, and distributed protocols such as two-phase commit are usually avoided between microservices because they couple the services together and hold locks across the network. If the Payment succeeds but the Inventory Service then crashes, you are left with a serious problem: the user was charged, but no item was reserved.

To solve this, we use the Saga Pattern. A Saga is a sequence of local transactions in which each service updates its own database and then triggers the next step. If one step fails, the Saga executes a series of Compensating Transactions: explicit undo operations that reverse the effects of the steps that already succeeded, typically in reverse order.
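
The idea of "run each step, and undo the completed ones if a later step fails" can be sketched in a few lines. This is a minimal illustration, not a production implementation: runSaga, the step objects, and the in-memory log are hypothetical names invented for this example, and the three "services" are plain functions rather than real network calls.

```javascript
// Hypothetical helper: runs saga steps in order and, if one throws,
// runs the compensations of the already-completed steps in reverse order.
async function runSaga(steps) {
  const completed = [];
  try {
    for (const step of steps) {
      await step.action();
      completed.push(step); // only steps that succeeded need undoing
    }
    return { ok: true, compensated: [] };
  } catch (error) {
    const compensated = [];
    for (const step of completed.reverse()) {
      await step.compensate();
      compensated.push(step.name);
    }
    return { ok: false, failedWith: error.message, compensated };
  }
}

// In-memory stand-ins for the three services (assumptions for illustration).
const log = [];
const steps = [
  {
    name: "order",
    action: async () => log.push("order created"),
    compensate: async () => log.push("order cancelled"),
  },
  {
    name: "payment",
    action: async () => log.push("payment charged"),
    compensate: async () => log.push("payment refunded"),
  },
  {
    name: "inventory",
    action: async () => {
      throw new Error("out of stock"); // simulate the failing step
    },
    compensate: async () => log.push("inventory released"),
  },
];

const sagaResult = runSaga(steps);
sagaResult.then((result) => console.log(result, log));
```

The failing inventory step is never added to the completed list, so only the payment and the order are compensated. The sketch assumes the compensations themselves succeed; a real system also has to deal with a compensation that fails.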


1. How Sagas Work (Orchestration vs Choreography)

There are two ways to implement a Saga:

  1. Choreography: There is no central coordinator. Each microservice reacts to events that other microservices publish to a Message Queue, like dancers following each other's moves. (Loosely coupled, but the overall flow is harder to trace and debug.)
  2. Orchestration: A central "brain" (the Orchestrator API) explicitly tells each microservice what to do and owns the rollback logic. (Much easier to track, debug, and understand.)
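
To make the contrast concrete, here is a rough sketch of the choreography style. It uses Node's built-in EventEmitter as an in-process stand-in for a real message queue, and the event names (order.created, payment.charged, and so on) and the inStock flag are assumptions made up for this illustration. Notice that no single piece of code describes the whole flow.

```javascript
const { EventEmitter } = require("node:events");

// In-process stand-in for a message broker (assumption for illustration).
const bus = new EventEmitter();
const events = [];
const record = (name) => events.push(name);

// Payment service: reacts to new orders.
bus.on("order.created", (order) => {
  record("payment.charged");
  bus.emit("payment.charged", order);
});

// Inventory service: reacts to successful payments.
bus.on("payment.charged", (order) => {
  if (order.inStock) {
    record("inventory.reserved");
    bus.emit("inventory.reserved", order);
  } else {
    record("inventory.failed");
    bus.emit("inventory.failed", order);
  }
});

// Payment service again: compensates by refunding when inventory fails.
bus.on("inventory.failed", (order) => {
  record("payment.refunded");
  bus.emit("payment.refunded", order);
});

// Order service: approves on success, cancels once the refund is published.
bus.on("inventory.reserved", () => record("order.approved"));
bus.on("payment.refunded", () => record("order.cancelled"));

// Kick off the saga with an out-of-stock order.
record("order.created");
bus.emit("order.created", { orderId: 1, inStock: false });
console.log(events);
```

Each handler knows only about the events it consumes and publishes. That keeps the services decoupled, but to understand the full order flow you have to read every subscriber, which is the debugging cost mentioned above.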

We will use the Orchestrator model for our example below.


2. Architecture Diagram (Saga with Orchestration)

Let's look at a classic e-commerce order flow. If Step 3 (Inventory) fails, the Orchestrator catches the error and runs compensating transactions that refund the payment and cancel the pending order.


3. Architectural Code Example (Orchestrator)

Here is a simplified Node.js example of an Order Orchestrator. Notice how it tracks the IDs returned by each step and handles failures by explicitly calling the matching "undo" routes.

javascript
const express = require("express");
const axios = require("axios"); // Used to communicate with other microservices over HTTP

const app = express();
app.use(express.json());

// Fake Microservice Architecture URLs
const ORDER_API = "http://order-service/api";
const PAY_API = "http://payment-service/api";
const INV_API = "http://inventory-service/api";

// This is our central SAGA ORCHESTRATOR function
app.post("/api/checkout", async (req, res) => {
  const { userId, productId, amount } = req.body;

  // Track our progress so we know exactly what to undo if something fails
  let orderId = null;
  let paymentId = null;

  try {
    // ---------------------------------------------------------
    // STEP 1: CREATE PENDING ORDER
    // ---------------------------------------------------------
    console.log("Step 1: Creating Pending Order...");
    const orderRes = await axios.post(`${ORDER_API}/create`, { userId });
    orderId = orderRes.data.id;

    // ---------------------------------------------------------
    // STEP 2: PROCESS PAYMENT
    // ---------------------------------------------------------
    console.log("Step 2: Processing Payment...");
    const payRes = await axios.post(`${PAY_API}/charge`, { userId, amount });
    paymentId = payRes.data.id;

    // ---------------------------------------------------------
    // STEP 3: UPDATE INVENTORY
    // ---------------------------------------------------------
    console.log("Step 3: Reserving Inventory...");
    await axios.post(`${INV_API}/reserve`, { productId });

    // ---------------------------------------------------------
    // STEP 4: FINALIZE ORDER (Success!)
    // ---------------------------------------------------------
    console.log("Step 4: Approving Order...");
    await axios.post(`${ORDER_API}/approve`, { orderId });

    return res.status(200).json({ message: "Checkout completely successful!" });
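  } catch (error) {
    console.error(
      `❌ SAGA FAILED (${error.message}). Running compensating transactions...`
    );

    // ---------------------------------------------------------
    // COMPENSATING TRANSACTIONS (The Rollback Phase)
    // ---------------------------------------------------------
    // Undo in reverse order. Each compensation has its own try/catch so that
    // a failed refund neither skips the order cancellation nor leaves the
    // request hanging on an unhandled rejection.
    // NOTE: if Step 4 fails after Step 3 succeeded, the inventory reservation
    // would also need releasing; this simplified example does not cover that.
    let fullyCompensated = true;

    // The payment succeeded but a later step failed: refund it
    if (paymentId) {
      try {
        console.log(`↪️ Compensating: refunding payment ${paymentId}...`);
        await axios.post(`${PAY_API}/refund`, { paymentId });
      } catch (refundError) {
        fullyCompensated = false;
        console.error(`Refund of ${paymentId} failed: ${refundError.message}`);
      }
    }

    // The order was created but the saga failed: cancel it
    if (orderId) {
      try {
        console.log(`↪️ Compensating: cancelling order ${orderId}...`);
        await axios.post(`${ORDER_API}/cancel`, { orderId });
      } catch (cancelError) {
        fullyCompensated = false;
        console.error(`Cancel of ${orderId} failed: ${cancelError.message}`);
      }
    }

    return res.status(500).json({
      error: fullyCompensated
        ? "Checkout failed (out of stock or system error). All charges were reversed."
        : "Checkout failed and the rollback is incomplete. Manual follow-up is required.",
    });
  }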
});

app.listen(8080, () =>
  console.log("Saga Orchestrator successfully running...")
);

4. Key Takeaways

  1. Eventual Consistency: A Saga does not provide the atomicity and isolation of a single ACID transaction in a relational database. For a short time, the Payment DB may already show the charge while the Inventory DB has not yet reserved the shoes, and other requests can observe that intermediate state. This is eventual consistency, and it is normal in distributed systems.
  2. Compensating Transactions MUST Eventually Succeed: If your "Undo / Refund" API call fails because of a network glitch, you are left in an inconsistent state (user charged, order cancelled) until someone repairs it. To guard against this, compensating transactions are usually published to durable Message Queues (like Kafka or RabbitMQ) so they survive a crash of the Orchestrator, and are retried in the background until they succeed. Because a retried message may be delivered more than once, the compensation endpoints should be idempotent: refunding the same payment twice must have the same effect as refunding it once.
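
The retry-until-success idea from the second takeaway can be sketched as follows. This is only an in-process approximation: a real system would persist the pending compensation in a durable queue so that it survives a restart, which this sketch does not do. The names refundPayment, retryCompensation, and the payment ID pay_123 are hypothetical, and the refund service is simulated in memory with two artificial failures.

```javascript
// Hypothetical in-memory refund service. It is idempotent: a repeated
// refund for the same paymentId is recognised and not applied twice.
const refunded = new Set();
let attempts = 0;

async function refundPayment(paymentId) {
  attempts += 1;
  if (attempts < 3) {
    throw new Error("network glitch"); // simulate two transient failures
  }
  if (refunded.has(paymentId)) {
    return "already refunded";
  }
  refunded.add(paymentId);
  return "refunded";
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retries a compensation with exponential backoff. After maxAttempts the
// last error is rethrown so the failure can be escalated (e.g. alerting).
async function retryCompensation(task, { maxAttempts = 5, baseDelayMs = 10 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      if (attempt === maxAttempts) throw error;
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}

const retryResult = retryCompensation(() => refundPayment("pay_123"));
retryResult.then((outcome) =>
  console.log(outcome, "after", attempts, "attempts")
);
```

The idempotency check is what makes blind retries safe: if a refund actually went through but its response was lost, the next attempt finds the payment already refunded instead of refunding it a second time.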

Released under the ISC License.