Fintech Application Architecture Patterns

Why these patterns matter for building reliable, compliant, and scalable financial software today

A schematic ledger with rows of transactions and timestamped entries, representing the core record keeping in fintech systems

Building software for finance feels different. A bug is not just an inconvenience; it’s a ledger entry that can’t be rolled back, a compliance flag, or a customer support call at 2 a.m. I’ve shipped fintech features that were a few lines of code away from moving real money, and that gravity shaped how I think about architecture. Patterns in fintech aren’t about buzzwords; they’re about choices that keep systems stable, auditable, and adaptable as regulations and business models evolve.

In this post, I’ll walk through architecture patterns that show up again and again in fintech, why they are chosen, and what they look like in practice. You’ll see code examples for event-driven payments, sagas for orchestration, and safe pattern matching for transactions. We’ll also discuss tradeoffs, where patterns shine, and where they add more complexity than value. If you’re an engineer building or maintaining financial applications, you’ll leave with a mental toolbox and references to dig deeper.

Where fintech architecture sits in the modern stack

Fintech is a broad category, but most products boil down to a few primitives: accounts, ledgers, payments, risk scoring, and customer data. These primitives power banking apps, payment processors, lending platforms, investment tools, and crypto exchanges. The architecture is often a blend of classic microservices, domain-driven boundaries, and event-driven flows, because money movement is inherently asynchronous and cross-domain.

You’ll see two broad camps:

  • Monoliths with modular boundaries: Often used by smaller teams or early-stage products. A well-structured monolith with clean domain boundaries can ship fast and stay consistent. It’s easier to reason about transactions and data integrity when you’re not distributed.
  • Microservices or service-oriented architectures: Common in larger teams where multiple squads own domain boundaries (e.g., payments, compliance, reporting). They need strong contracts, observability, and resilience patterns to avoid cascading failures.

Who typically uses these patterns? Backend engineers building core banking or payment services, platform engineers supporting regulated workloads, and data engineers ensuring that transactional data is both analyzable and compliant. In real projects, patterns like CQRS, event sourcing, sagas, and the transactional outbox are almost mandatory when you move beyond simple CRUD.

Compared to alternatives, event-driven patterns often replace polling and tight coupling. Orchestration (via a workflow engine or saga orchestrator) is chosen over pure choreography when you need explicit retry and compensation semantics. In data-heavy domains, event sourcing with a read model is chosen over classic OLTP when auditability and historical replay matter more than raw write throughput.
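The replay property behind event sourcing can be sketched in a few lines: the balance is never stored, only derived by folding over immutable events. The event names and shapes below are illustrative, not from any particular system.

```typescript
// Minimal event-sourcing sketch: the current balance is derived by
// replaying the immutable event log; replaying a prefix gives the
// balance at any past point in time.
type LedgerEvent =
  | { type: 'Credited'; amountMinor: number }
  | { type: 'Debited'; amountMinor: number };

function replayBalance(events: LedgerEvent[]): number {
  return events.reduce((balance, e) => {
    switch (e.type) {
      case 'Credited':
        return balance + e.amountMinor;
      case 'Debited':
        return balance - e.amountMinor;
    }
  }, 0);
}

const history: LedgerEvent[] = [
  { type: 'Credited', amountMinor: 10_000 },
  { type: 'Debited', amountMinor: 2_500 },
];
// replayBalance(history) === 7500
// replayBalance(history.slice(0, 1)) === 10000 (balance after first event)
```

The tradeoff mentioned above is visible even here: every read pays the cost of a replay (or of maintaining a projection), which is why write-heavy systems often prefer classic OLTP.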

Core patterns used in fintech, in practice

The transactional ledger

At the heart of most fintech systems is a double-entry ledger. Even if your domain is payments or investments, a ledger gives you a canonical source of truth for balances and movements.

Here is a minimal ledger in a Node.js + PostgreSQL stack. We’ll use a strict transaction boundary and double-entry accounting rules to ensure balances never break.

-- Ledger tables
CREATE TABLE accounts (
  id UUID PRIMARY KEY,
  user_id UUID NOT NULL,
  type VARCHAR(32) NOT NULL, -- 'cash', 'savings', 'pending_escrow'
  currency CHAR(3) NOT NULL,
  balance BIGINT NOT NULL, -- minor units (e.g., cents)
  version INT NOT NULL DEFAULT 0,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE transactions (
  id UUID PRIMARY KEY,
  account_id UUID NOT NULL REFERENCES accounts(id),
  amount BIGINT NOT NULL, -- positive for credit, negative for debit
  description VARCHAR(255) NOT NULL,
  reference_id VARCHAR(128), -- external reference like payment_id
  posted_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_transactions_account ON transactions(account_id);
// src/core/ledger.ts
// A minimal domain function that enforces double-entry rules
// and ensures account balance consistency via optimistic locking.

import { Pool } from 'pg';
import * as crypto from 'node:crypto'; // for crypto.randomUUID()

export async function transferFunds(
  pool: Pool,
  fromAccountId: string,
  toAccountId: string,
  amount: number, // in minor units
  description: string,
  referenceId?: string
) {
  const client = await pool.connect();
  try {
    await client.query('BEGIN');

    // Read current balances and lock both rows. Note: production code
    // should lock accounts in a deterministic order (e.g., sorted by id)
    // to avoid deadlocks between concurrent transfers.
    const fromRes = await client.query(
      `SELECT balance, version FROM accounts WHERE id = $1 FOR UPDATE`,
      [fromAccountId]
    );
    const toRes = await client.query(
      `SELECT balance, version FROM accounts WHERE id = $1 FOR UPDATE`,
      [toAccountId]
    );

    if (fromRes.rows.length === 0 || toRes.rows.length === 0) {
      throw new Error('ACCOUNT_NOT_FOUND');
    }

    const fromAccount = fromRes.rows[0];
    const toAccount = toRes.rows[0];

    // Enforce business rules (e.g., positive amounts, no overdraft)
    if (amount <= 0) {
      throw new Error('INVALID_AMOUNT');
    }
    if (fromAccount.balance < amount) {
      throw new Error('INSUFFICIENT_FUNDS');
    }

    // Apply double-entry: debit from, credit to
    await client.query(
      `INSERT INTO transactions (id, account_id, amount, description, reference_id)
       VALUES ($1, $2, $3, $4, $5)`,
      [crypto.randomUUID(), fromAccountId, -amount, description, referenceId]
    );

    await client.query(
      `INSERT INTO transactions (id, account_id, amount, description, reference_id)
       VALUES ($1, $2, $3, $4, $5)`,
      [crypto.randomUUID(), toAccountId, amount, description, referenceId]
    );

    // Update account balances with optimistic locking
    await client.query(
      `UPDATE accounts
       SET balance = balance - $1, version = version + 1
       WHERE id = $2 AND version = $3`,
      [amount, fromAccountId, fromAccount.version]
    );

    await client.query(
      `UPDATE accounts
       SET balance = balance + $1, version = version + 1
       WHERE id = $2 AND version = $3`,
      [amount, toAccountId, toAccount.version]
    );

    await client.query('COMMIT');
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}

This example uses a database transaction to keep ledger changes atomic and consistent. In real projects, you also add idempotency keys (to avoid duplicate payments) and a separate audit trail for regulatory compliance. For larger scale, many teams move to a distributed ledger or a specialized accounting engine, but the core principles stay the same.

Event-driven payments and the transactional outbox

Payments are rarely synchronous. You integrate with PSPs (Payment Service Providers), bank rails, or blockchain networks, all of which are external and flaky. An event-driven architecture fits naturally here: publish a PaymentRequested event, then consume and process it asynchronously. The challenge is ensuring the event is published only if the local database transaction commits. That’s where the transactional outbox pattern shines.

Here’s a practical example using PostgreSQL and Node.js. We write events to an outbox table within the same transaction as business changes, then relay them to a message broker (e.g., Kafka or RabbitMQ) with a background worker.

-- Outbox table
CREATE TABLE outbox (
  id UUID PRIMARY KEY,
  aggregate_type VARCHAR(64) NOT NULL, -- e.g., 'payment'
  aggregate_id VARCHAR(128) NOT NULL,
  event_type VARCHAR(64) NOT NULL,
  payload JSONB NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  published_at TIMESTAMPTZ
);

CREATE INDEX idx_outbox_unpublished ON outbox(published_at) WHERE published_at IS NULL;
// src/outbox/publisher.ts
// Relay outbox events to a message broker safely.

import { Pool } from 'pg';
import { Kafka } from 'kafkajs'; // or any broker client

export async function publishOutboxEvents(pool: Pool, kafka: Kafka) {
  const producer = kafka.producer();
  await producer.connect();

  const client = await pool.connect();
  try {
    // Select a batch of unpublished events. With multiple relay workers,
    // add FOR UPDATE SKIP LOCKED here; either way delivery is
    // at-least-once, so downstream consumers must be idempotent.
    const res = await client.query(
      `SELECT id, event_type, payload FROM outbox
       WHERE published_at IS NULL ORDER BY created_at LIMIT 100`
    );

    for (const row of res.rows) {
      await producer.send({
        topic: `fintech.events.${row.event_type}`,
        messages: [{ value: JSON.stringify(row.payload) }],
      });

      await client.query(
        `UPDATE outbox SET published_at = NOW() WHERE id = $1`,
        [row.id]
      );
    }
  } finally {
    client.release();
    await producer.disconnect();
  }
}

In the payment domain, you might emit PaymentInitiated, PaymentAuthorized, PaymentCaptured, and PaymentFailed events. Consumers update read models, trigger webhooks, and orchestrate retries. This pattern decouples domains and improves resilience; the payment processor can retry safely without re-running business logic.

Orchestration vs. choreography for multi-step flows

Multi-step flows like KYC checks, multi-leg payments, or funding sequences often need either orchestration (a central coordinator) or choreography (each service reacts to events). In practice, fintech tends to favor orchestration for flows that require strict SLA and compensation logic. A saga orchestrator can track state, execute retries, and run compensating actions if a step fails.

Consider a funding flow: 1) validate user, 2) reserve funds, 3) call PSP to fund, 4) confirm ledger update, 5) notify user. If step 3 fails, you need to release the reserved funds. Using a saga makes this explicit.

Below is a compact saga orchestrator in TypeScript. It’s not production-grade, but it shows the structure you often build with workflow engines like Temporal or Camunda.

// src/orchestrator/fundingSaga.ts

type StepResult = 'ok' | 'fail' | 'retry';
type SagaStep = (ctx: FundingContext) => Promise<StepResult>;

interface FundingContext {
  userId: string;
  amount: number;
  PSPRequestId: string;
  reserved: boolean;
  funded: boolean;
  attempts: number;
}

const steps = {
  validateUser: async (ctx: FundingContext): Promise<StepResult> => {
    // Call user service and compliance service
    // Return 'fail' if blocked
    return 'ok';
  },
  reserveFunds: async (ctx: FundingContext): Promise<StepResult> => {
    // Call ledger service to reserve (mark as pending)
    ctx.reserved = true;
    return 'ok';
  },
  callPSP: async (ctx: FundingContext): Promise<StepResult> => {
    // External PSP call
    // Simulate transient failure
    if (ctx.attempts++ < 2) return 'retry';
    ctx.funded = true;
    return 'ok';
  },
  confirmLedger: async (ctx: FundingContext): Promise<StepResult> => {
    // Move reserved to posted
    return 'ok';
  },
  notifyUser: async (ctx: FundingContext): Promise<StepResult> => {
    // Send email or push
    return 'ok';
  },
};

const compensations = {
  reserveFunds: async (ctx: FundingContext) => {
    // Release reserved funds
    ctx.reserved = false;
  },
};

async function runSaga(ctx: FundingContext) {
  const sequence: Array<keyof typeof steps> = [
    'validateUser',
    'reserveFunds',
    'callPSP',
    'confirmLedger',
    'notifyUser',
  ];

  const executed: Array<keyof typeof steps> = [];

  for (const stepName of sequence) {
    const step = steps[stepName];
    let result: StepResult = 'fail';

    try {
      result = await step(ctx);
    } catch (e) {
      result = 'fail';
    }

    if (result === 'ok') {
      executed.push(stepName);
      continue;
    }

    if (result === 'retry') {
      // Simple retry policy
      try {
        await new Promise((res) => setTimeout(res, 500));
        const retryResult = await step(ctx);
        if (retryResult === 'ok') {
          executed.push(stepName);
          continue;
        }
      } catch {
        // fall through to compensation
      }
    }

    // Compensation: run backwards
    for (let i = executed.length - 1; i >= 0; i--) {
      const name = executed[i];
      const comp = compensations[name as keyof typeof compensations];
      if (comp) {
        await comp(ctx);
      }
    }
    throw new Error(`Saga failed at ${stepName}`);
  }

  return ctx;
}

This pattern is ubiquitous in fintech. Orchestration gives you a clear audit trail and explicit failure handling. Choreography can work for simpler flows but becomes hard to debug when failures cross domain boundaries.

CQRS and read models for reporting and compliance

Fintech requires fast writes (for payments) and complex reads (for statements, audit, analytics). Command Query Responsibility Segregation (CQRS) separates these concerns. You write events to a ledger and project them into read-optimized models.

Here’s a simplified projection into a user statement view:

// src/read-models/statementProjection.ts

interface StatementEntry {
  userId: string;
  accountId: string;
  date: string;
  amount: number;
  description: string;
  balanceAfter: number;
}

// Example projection from PaymentCaptured event
function projectPaymentCaptured(
  event: { userId: string; accountId: string; amount: number; description: string; timestamp: string },
  currentBalance: number
): StatementEntry {
  const nextBalance = currentBalance + event.amount;
  return {
    userId: event.userId,
    accountId: event.accountId,
    date: event.timestamp,
    amount: event.amount,
    description: event.description,
    balanceAfter: nextBalance,
  };
}

In production, you’ll store projections in a separate database (often read-optimized, like a columnar store for analytics). You’ll also handle schema evolution carefully because events are immutable. This pattern is key for regulatory reporting and customer-facing statements that must be consistent and fast.
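One common way to handle that schema evolution is upcasting: old event versions are converted to the current shape at read time, before projection, so the stored events stay immutable. The version layout and the USD default below are assumptions for the sketch.

```typescript
// Hypothetical upcaster: v1 PaymentCaptured events lacked a `currency`
// field; we default it during projection instead of mutating the
// stored, immutable events.
interface PaymentCapturedV1 {
  version: 1;
  amount: number;
}

interface PaymentCapturedV2 {
  version: 2;
  amount: number;
  currency: string;
}

function upcast(event: PaymentCapturedV1 | PaymentCapturedV2): PaymentCapturedV2 {
  if (event.version === 1) {
    // Assumption for this sketch: all v1 events were denominated in USD.
    return { version: 2, amount: event.amount, currency: 'USD' };
  }
  return event;
}
```

Upcasters compose: as events reach v3, v4, and so on, each version gets a step that lifts it one version forward, and projections only ever see the latest shape.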

Idempotency and deduplication

Fintech APIs are called repeatedly. Network retries, mobile apps, and webhook handlers all send duplicate requests. Idempotency is a requirement, not a nice-to-have. Use idempotency keys at the API boundary and in event processing.

A typical approach:

  • Accept an idempotency key in the API header.
  • Store it with the resource creation (e.g., payment record) in a unique index.
  • On duplicate request, return the stored result.
CREATE TABLE payments (
  idempotency_key VARCHAR(128) PRIMARY KEY,
  payment_id UUID NOT NULL,
  amount BIGINT NOT NULL,
  currency CHAR(3) NOT NULL,
  status VARCHAR(32) NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_payments_id ON payments(payment_id);
// src/payments/create.ts
// Idempotent payment creation

import { Pool } from 'pg';
import * as crypto from 'node:crypto'; // for crypto.randomUUID()

export async function createPayment(
  pool: Pool,
  idempotencyKey: string,
  amount: number,
  currency: string
) {
  const client = await pool.connect();
  try {
    await client.query('BEGIN');

    // Insert; on a duplicate idempotency key, return the stored row.
    // The no-op DO UPDATE makes RETURNING yield the existing row
    // without overwriting its real status with 'pending' on a retry.
    const res = await client.query(
      `INSERT INTO payments (idempotency_key, payment_id, amount, currency, status)
       VALUES ($1, $2, $3, $4, 'pending')
       ON CONFLICT (idempotency_key) DO UPDATE
       SET idempotency_key = EXCLUDED.idempotency_key
       RETURNING payment_id, status`,
      [idempotencyKey, crypto.randomUUID(), amount, currency]
    );

    await client.query('COMMIT');

    const row = res.rows[0];
    return { paymentId: row.payment_id, status: row.status };
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}

Idempotency keys also protect event consumers. Since exactly-once delivery is difficult to guarantee in practice, idempotent processing is the pragmatic choice.

API contracts and versioning

Fintech services are used by mobile apps, partner integrations, and internal systems. Changing APIs without care will break integrations and cause compliance issues. Use semantic versioning for public APIs and prefer additive changes. For breaking changes, maintain multiple versions in parallel.

  • Prefer PUT or POST with explicit fields over overloading GET parameters.
  • Use strongly typed schemas (OpenAPI) and enforce them at the gateway.
  • Version endpoints (e.g., /v1/payments -> /v2/payments).
  • Deprecate gradually and communicate timelines.

For event contracts, use a schema registry. Avro or JSON Schema are common. Evolution rules: add fields with defaults, never remove fields without a migration plan, and mark fields as deprecated.
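The "add fields with defaults" rule looks like this in a consumer: the decoder tolerates old payloads by defaulting the added field, so producers and consumers never have to upgrade in lockstep. The field names and the `'standard'` default are illustrative.

```typescript
// Additive contract evolution: a v2 consumer reads both old and new
// payloads because the field added in v2 has a default.
interface PaymentEventV2 {
  paymentId: string;
  amount: number;
  // Added in v2; optional on the wire, defaulted at decode time.
  settlementMethod: string;
}

function decodePaymentEvent(
  raw: { paymentId: string; amount: number; settlementMethod?: string }
): PaymentEventV2 {
  return {
    paymentId: raw.paymentId,
    amount: raw.amount,
    settlementMethod: raw.settlementMethod ?? 'standard',
  };
}

// An old payload without settlementMethod still decodes cleanly:
// decodePaymentEvent({ paymentId: 'p1', amount: 100 }).settlementMethod === 'standard'
```

A schema registry enforces the same rule mechanically: a new schema version is only accepted if it is backward compatible with the previous one.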

Strengths, weaknesses, and tradeoffs

Strengths:

  • Event-driven patterns scale well for payments and notifications, and they provide natural decoupling.
  • Ledger models produce auditability and correctness, which are critical for compliance.
  • Orchestration (sagas) makes complex flows understandable and compensations explicit.
  • CQRS improves read performance and separates concerns for reporting.

Weaknesses:

  • Event systems introduce operational complexity: you need observability, dead-letter queues, and idempotent consumers.
  • CQRS can lead to eventual consistency, which is sometimes unacceptable (e.g., immediate balance checks). You’ll need hybrid models with read-through caches or strong consistency checks.
  • Orchestration can become a single point of failure if not designed with redundancy. The orchestrator itself must be resilient and observable.
  • Ledger-first designs require careful modeling for refunds, chargebacks, and fees. It’s easy to get the sign convention wrong.
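The sign-convention pitfall has a simple defense: model a refund as a new pair of reversing entries rather than an edit of the originals, and assert that all legs sum to zero. A minimal sketch, using the same minor-unit sign convention as the ledger code earlier:

```typescript
// A refund is a new pair of reversing ledger entries, never a mutation
// of the original ones. The four entries must sum to zero.
interface Entry {
  accountId: string;
  amountMinor: number; // positive for credit, negative for debit
}

function paymentEntries(from: string, to: string, amountMinor: number): Entry[] {
  return [
    { accountId: from, amountMinor: -amountMinor }, // debit payer
    { accountId: to, amountMinor: amountMinor },    // credit payee
  ];
}

function refundEntries(original: Entry[]): Entry[] {
  // Reverse each leg by flipping the sign.
  return original.map((e) => ({ ...e, amountMinor: -e.amountMinor }));
}

const payment = paymentEntries('alice', 'merchant', 1_000);
const refund = refundEntries(payment);
const total = [...payment, ...refund].reduce((s, e) => s + e.amountMinor, 0);
// total === 0 — the invariant a reconciliation job would assert
```

Fees and partial refunds add more legs, but the zero-sum invariant stays the same and makes sign mistakes fail loudly instead of silently corrupting balances.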

When to choose these patterns:

  • If you process payments or move money, start with a ledger and outbox. Event-driven flows will follow naturally.
  • For multi-step flows with external dependencies, use a saga orchestrator or a workflow engine. Temporal (https://temporal.io/) is a popular choice for building reliable workflows.
  • For reporting-heavy products, add CQRS projections to dedicated read stores.
  • For consumer-facing APIs, prioritize idempotency and clear versioning.

When to skip them:

  • Early prototypes with no money movement can use a simple monolith and avoid the overhead.
  • Single-user tools or internal dashboards may not need CQRS or a saga. Complexity should map to business value.
  • High-frequency trading systems might prefer a single, highly optimized service with strict latency guarantees rather than distributed events.

Real-world project structure and workflow

A realistic fintech backend often looks like this. This is from a payment service I helped build. We started with a monolith and later split out payment processing and reporting.

/src
  /core
    /ledger          # account and transaction domain
    /payments        # payment orchestration and PSP adapters
    /events          # outbox and event definitions
  /read-models       # projections for statements and dashboards
  /orchestrator      # saga workflows and compensations
  /adapters
    /psp             # external provider clients (e.g., Stripe, Adyen)
    /db              # data access and migrations
  /api
    /v1              # stable public API
    /v2              # new API with additive changes
  /worker
    outboxRelay.ts   # publishes outbox events
    statementProj.ts # builds read models

Workflow:

  • Define domain events in /core/events.
  • Use the outbox pattern to publish events reliably.
  • Build projections in /read-models for fast queries.
  • Implement orchestration in /orchestrator for complex flows.
  • Add PSP adapters in /adapters/psp to abstract provider differences.
  • Version API endpoints and maintain backward compatibility.
  • Write integration tests that simulate failures and retries.
  • Deploy with blue/green or canary to reduce risk.
  • Monitor with structured logs, metrics, and distributed tracing.

A small end-to-end example: payment flow with event and saga

Imagine a payment flow that requires a KYC check, a risk score, and a PSP call. If the PSP fails, we want to retry and eventually compensate by releasing funds.

  • Step 1: The API receives a payment request with an idempotency key.
  • Step 2: Persist the payment intent and an outbox event within the same transaction.
  • Step 3: Orchestrate KYC, risk scoring, and PSP capture via a saga.
  • Step 4: On success, update the ledger and emit PaymentCaptured.
  • Step 5: On failure, run compensations.

// src/api/payments.ts
import { Router } from 'express';
import { Pool } from 'pg';
import { Kafka } from 'kafkajs';
import { createPayment } from '../payments/create';

export function paymentsRouter(pool: Pool, kafka: Kafka) {
  const router = Router();

  router.post('/v1/payments', async (req, res) => {
    const { amount, currency, idempotencyKey } = req.body;

    try {
      const result = await createPayment(pool, idempotencyKey, amount, currency);

      // Enqueue orchestration via event. (In production, reuse one
      // connected producer instead of creating one per request.)
      const producer = kafka.producer();
      await producer.connect();
      await producer.send({
        topic: 'fintech.events.PaymentInitiated',
        messages: [{ value: JSON.stringify({ idempotencyKey, amount, currency }) }],
      });
      await producer.disconnect();

      res.status(202).json({ paymentId: result.paymentId, status: result.status });
    } catch (e) {
      res.status(500).json({ error: 'payment_failed' });
    }
  });

  return router;
}
// src/worker/paymentOrchestrator.ts
// A simple orchestrator listening to PaymentInitiated events.

import { Kafka } from 'kafkajs';
import { runSaga } from '../orchestrator/fundingSaga';

export async function startPaymentOrchestrator(kafka: Kafka) {
  const consumer = kafka.consumer({ groupId: 'payment-orchestrator' });
  await consumer.connect();
  await consumer.subscribe({ topic: 'fintech.events.PaymentInitiated' });

  await consumer.run({
    eachMessage: async ({ message }) => {
      const payload = JSON.parse(message.value?.toString() || '{}');
      const ctx = {
        userId: 'user-123', // fetch from user service
        amount: payload.amount,
        PSPRequestId: '',
        reserved: false,
        funded: false,
        attempts: 0,
      };

      try {
        await runSaga(ctx);
        // Emit final event
        const producer = kafka.producer();
        await producer.connect();
        await producer.send({
          topic: 'fintech.events.PaymentCaptured',
          messages: [{ value: JSON.stringify({ amount: payload.amount, currency: payload.currency }) }],
        });
        await producer.disconnect();
      } catch (e) {
        // Emit failure event for retries or manual review
        const producer = kafka.producer();
        await producer.connect();
        await producer.send({
          topic: 'fintech.events.PaymentFailed',
          messages: [{ value: JSON.stringify({ error: String(e), payload }) }],
        });
        await producer.disconnect();
      }
    },
  });
}

This is simplified, but it mirrors real patterns: async flow, idempotency at the API, outbox for reliable events, and an orchestrator for compensations. In practice, you’d add circuit breakers, metrics, and alerting for PSP calls.
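A circuit breaker for those PSP calls can be sketched in a few lines. This version opens after a threshold of consecutive failures and fails fast until a success resets it; real deployments add half-open probes and time-based recovery (libraries such as opossum provide this), so treat it as a sketch of the state machine, not a production implementation.

```typescript
// Minimal circuit breaker: after `threshold` consecutive failures,
// requests are rejected immediately instead of hitting the flaky PSP.
class CircuitBreaker {
  private failures = 0;
  constructor(private readonly threshold: number) {}

  allowRequest(): boolean {
    return this.failures < this.threshold;
  }
  recordSuccess(): void {
    this.failures = 0; // a success closes the breaker
  }
  recordFailure(): void {
    this.failures++;
  }
}

// Wrapping an external call (pspCharge is a stand-in for a real client):
async function chargeWithBreaker(
  breaker: CircuitBreaker,
  pspCharge: () => Promise<string>
): Promise<string> {
  if (!breaker.allowRequest()) throw new Error('CIRCUIT_OPEN'); // fail fast
  try {
    const result = await pspCharge();
    breaker.recordSuccess();
    return result;
  } catch (err) {
    breaker.recordFailure();
    throw err;
  }
}
```

Failing fast matters in payments: a stuck PSP should surface as a quick, retriable error in the saga rather than a pile-up of hung requests.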

Common mistakes I’ve seen

  • Skipping idempotency. Duplicate payments will happen. If your API or webhook doesn’t handle them, reconciliation becomes a nightmare.
  • Treating events as commands. Events should reflect facts that have happened, not instructions. Keep them immutable and append-only.
  • Adopting event sourcing too early. It’s powerful but adds complexity. Start with a ledger and simple projections; add event sourcing when you need replayability.
  • Tight coupling via synchronous calls between domains. A payment service shouldn’t depend on user service being up for non-critical reads. Use asynchronous events or caches with TTL.
  • Not planning for schema evolution. Events and API contracts must be versioned. Changing a field’s meaning breaks downstream consumers.

Getting started: setup and mental models

If you’re building a fintech service, start with three pillars:

  • A ledger for balances and movement.
  • Idempotent APIs with clear contracts.
  • A reliable eventing layer (outbox pattern).

For a Node.js + PostgreSQL + Kafka stack, your setup workflow looks like this:

  • Define domain events and API schemas first.
  • Implement the ledger with strict transaction boundaries.
  • Add the outbox table and a relay worker.
  • Build projections for read models.
  • Choose orchestration or choreography for multi-step flows.
  • Add observability: structured logs, metrics for PSP latency, and traces across services.
  • Write integration tests that simulate network failures and duplicate requests.
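The duplicate-request test in that last step can be sketched against an in-memory fake of the payment store; the class below is a stand-in for the real createPayment + Postgres path, not production code.

```typescript
// Fake payment store mirroring the idempotency contract: two requests
// with the same key must yield the same payment record.
interface PaymentRecord {
  paymentId: string;
  status: string;
}

class FakePaymentStore {
  private byKey = new Map<string, PaymentRecord>();
  private nextId = 1;

  create(idempotencyKey: string): PaymentRecord {
    const existing = this.byKey.get(idempotencyKey);
    if (existing) return existing; // duplicate: return the stored result
    const record = { paymentId: `pay-${this.nextId++}`, status: 'pending' };
    this.byKey.set(idempotencyKey, record);
    return record;
  }
}

// The test scenario: a retried request must not create a second payment.
const store = new FakePaymentStore();
const first = store.create('key-1');
const second = store.create('key-1'); // simulated network retry
// first.paymentId === second.paymentId
```

The real integration test runs the same scenario through the HTTP API against a disposable Postgres, but the assertion is identical.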

A minimal Docker setup helps with local development:

# docker-compose.yml (excerpt)
version: '3.8'
services:
  db:
    image: postgres:15
    environment:
      POSTGRES_USER: dev
      POSTGRES_PASSWORD: dev
      POSTGRES_DB: fintech
    ports:
      - "5432:5432"
  kafka:
    image: confluentinc/cp-kafka:7.5.0
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    ports:
      - "9092:9092"
  zookeeper:
    image: confluentinc/cp-zookeeper:7.5.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

For a project structure, start with clear boundaries and grow gradually:

/src
  /core
    ledger.ts
    payments.ts
  /events
    outbox.ts
    definitions.ts
  /read-models
    statementProjection.ts
  /api
    v1
      payments.ts
    v2
      payments.ts
  /worker
    outboxRelay.ts
    orchestrator.ts
  /adapters
    psp
      providerA.ts
      providerB.ts

Mental model:

  • Treat money movement as a ledger-first domain.
  • Events are facts, commands are actions, and sagas orchestrate compensation.
  • Read models are derived, never the source of truth.
  • Idempotency is a boundary, not an afterthought.

What stands out about this approach

  • Correctness: Ledger models and double-entry rules prevent data corruption.
  • Auditability: Events and outbox provide a clear trail for regulators and reconciliation.
  • Adaptability: Swapping PSPs or adding new rails is easier with adapters and events.
  • Maintainability: Clear domain boundaries and versioned contracts reduce friction.
  • Resilience: Orchestration with retries and compensations handles inevitable failures.

The developer experience is improved because the code mirrors business concepts: payments, accounts, ledgers, and risks. You can trace a problem from a failed PSP call to a saga step to an event to a ledger entry. That traceability is invaluable during incidents.

Summary: Who should use these patterns and who might skip them

You should adopt ledger-first design, idempotency, and event-driven patterns if you:

  • Build core banking, payments, or any money-moving service.
  • Need auditability and regulatory reporting.
  • Operate with external integrations and unreliable networks.
  • Want resilience across multiple failure modes.

You might skip or defer these patterns if you:

  • Are early-stage and not yet handling real money movement.
  • Have a single-purpose tool without multi-step orchestration.
  • Need ultra-low-latency, single-service architectures for specialized workloads (e.g., trading).

In the end, fintech architecture is about balance: correctness vs speed, consistency vs availability, simplicity vs scale. Start with a ledger and idempotent APIs, add events when you need decoupling, and bring in orchestration when flows get complex. These patterns won’t eliminate bugs, but they’ll give you a clear path to find, fix, and prevent them.