rponte/avoid-distributed-transactions.md

Last active May 5, 2025 05:19

Star (15) You must be signed in to star a gist
Fork (3) You must be signed in to fork a gist

Learn more about clone URLs
Clone this repository at <script src="https://gist.github.com/rponte/9477858e619d8b986e17771c8be7827f.js"></script>
Save rponte/9477858e619d8b986e17771c8be7827f to your computer and use it in GitHub Desktop.

Download ZIP

THEORY: Distributed Transactions and why you should avoid them (2 Phase Commit , Saga Pattern, TCC, Idempotency etc)

Raw

avoid-distributed-transactions.md

Distributed Transactions and why you should avoid them

Modern technologies won't support it (RabbitMQ, Kafka, etc.);
This is a form of using Inter-Process Communication in a synchronized way and this reduces availability;
All participants of the distributed transaction need to be avaiable for a distributed commit, again: reduces availability.

Implementing business transactions that span multiple services is not straightforward. Distributed transactions are best avoided because of the CAP theorem. Moreover, many modern (NoSQL) databases don’t support them. The best solution is to use the Saga Pattern.

[...]

One of the most well-known patterns for distributed transactions is called Saga. The first paper about it was published back in 1987 and has it been a popular solution since then.

There are a couple of different ways to implement a saga transaction, but the two most popular are:

Events/Choreography: When there is no central coordination, each service produces and listen to other service’s events and decides if an action should be taken or not;
Command/Orchestration: when a coordinator service is responsible for centralizing the saga’s decision making and sequencing business logic;

Author

rponte commented Sep 2, 2024

YugabyteDB avoids bloat with the outbox pattern

Author

rponte commented Sep 5, 2024 •

edited

Loading

⭐️ Fault-Tolerance and Data Consistency Using Distributed Sagas
- It’s important to note that a compensating transaction Ci does not necessarily return the database to the original state it was in before the transaction Ti was run. The compensation only needs to be semantically equivalent.

Author

rponte commented Sep 10, 2024 •

edited

Loading

Exactly-once message processsing

Distributed algorithms are difficult. If you find yourself struggling to understand one of them, we assure you – you are not alone. We have spent last couple of years researching ways to ensure exactly-once message processing in systems that exchange messages in an asynchronous and durable way (a.k.a. message queues) and you know what? We still struggle and make silly mistakes. The reason is that even a very simple distributed algorithm generates vast numbers of possible execution paths.

Very good article, Exactly-once intuition, about a set of heuristics that are very helpful in sketching the structure of an algorithm to achieve exactly-once message processing. Below there're a summary of those heuristics:

The transaction and the side effects: The outcome of processing a message consists of two parts. There is a transactional part and a side effects part. The transaction consists of application state change and of marking the incoming message as processed. The side effects include things like creating objects in non-transactional data stores (e.g. uploading a blob) and sending messages.;
Until the transaction is committed, nothing happened: In order for an algorithm to behave correctly, it has to guarantee that until a transaction is committed, no effects of the processing are visible to the external observers.
Prepare - Commit - Publish: [...] For this reason any correct algorithm has to make sure the side effects are made durable, but not visible (prepared), before the transaction is committed. Then, after the commit, the side affects are published.
Side effects stored in each processing attempt are isolated: [...] In our PDF example each processing attempt would generate its own PDF document but only the attempt that succeeded to commit would publish its outgoing messages, announcing to the world the true location of the PDF.
Register - Cleanup: Although we can’t avoid generating garbage, a well-behaved algorithm ensures that the garbage is eventually cleaned up.
Concurrency control ensures serialization of processing: [...] the outbox record also contains the side effects information. It can exist in only two states: created and dispatched. The transition from created to dispatched does not generate any new information so it does not require concurrency control to prevent lost writes.

Author

rponte commented Sep 16, 2024

Não confie em Exactly-Once Message Processing

Author

rponte commented Sep 18, 2024

Scaling Shared Data in Distributed Systems

Consistency, by definition, requires linearizability. In multi-threaded programs, we achieve this with mutexes. In distributed systems, we use transactions and distributed locking. Intuitively, both involve performance trade-offs.
There are several different strategies, each with their own pros and cons: Immutable Data > Last-Write Wins > Application-Level Conflict Resolution > Causal Ordering > Distributed Data Types
Use weakly consistent models when you can because they afford you high availability and low latency, and rely on stronger models only when absolutely necessary. Do what makes sense for your system.

Author

rponte commented Oct 11, 2024 •

edited

Loading

Some interesting articles from LittleHorse blog:

While Saga is very hard to implement, it's simple to describe:

Try to perform the actions across the multiple systems.

If one of the actions fails, then run a compensation for all previously-executed tasks.

The compensation is simply an action that "undoes" the previous action. For example, the compensation for a payment task might be to issue a refund.

The Basics of Workflow

But what is a workflow engine?

It is a system that allows you to reliably execute a series of steps while being robust to technical failures (network outages, crashes) and business process failures. A step in a workflow can be calling a piece of code on a server, reaching out to an external API, waiting for a callback from a person or external system, or more.

A core challenge when automating a business process is Failure and Exception Handling: figuring out what to do when something doesn't happen, happens with an unexpected outcome, or plain simply fails. This is often difficult to reason about, leaving your applications vulnerable to uncaught exceptions, incomplete business workflows, or data loss.

A workflow engine standardizes how to throw an exception, where the exception is logged, and the logic around when/how to retry. This gives you peace of mind that once a workflow run is started, it will reliably complete.

Author

rponte commented Nov 6, 2024 •

edited

Loading

Sequin

Sequin is a tool for capturing changes and streaming data out of your Postgres database.

No such thing as exactly-once delivery
- Processing is the the full message lifecycle: the message was delivered to the receiver, the receiver did its job, and then the receiver acknowledged the message.
  
  With that definition, SQS, Kafka, and Sequin are all systems that guarantee exactly-once processing. The term processing captures both the delivery of the message and the successful acknowledgment of the message.
- In my mind, the terms at-most-once and at-least-once delivery help us distinguish between delivery mechanics. And the term "exactly-once processing" indicates it's a messaging system with at-least-once delivery and acknowledgments.
- A debate over a Github issue - At the end of the day, perfect exactly-once mechanics are a platonic ideal. And a system can only bring you so far, at some point you must implement idempotency on the client if your requirements demand it.

Author

rponte commented Nov 18, 2024

The Two Generals problem in TCP

Author

rponte commented Nov 18, 2024 •

edited

Loading

Excellent content about Oubox Pattern and Change-Data-Capture (CDC) written and recommended by Gunnar Morling:

⭐️ Revisiting the Outbox Pattern (2024) - by Gunnar Morling

💡 Reliable Microservices Data Exchange With the Outbox Pattern
💡 Event-driven architecture on the modern stack of Java technologies
Tweet by Martin Kleppmann about "How do you define the order of items in the outbox?"
How Postgres sequences issues can impact your messaging guarantees
Debezium docs: Basic outbox table example
The Wonders of Postgres Logical Decoding Messages for CDC: pg_logical_emit_message()
- What if we send the outbox event directly to WAL instead of having an outbox table? 🤔
Incremental Snapshots in Debezium
- To learn more about this approach to concurrent incremental backfilling;
How-To: Enable the transactional outbox pattern
- Article about how Dapr implements the Outbox pattern without an Outbox table;
Stop Overusing the Outbox Pattern
- Instead of writing to the database, write only to the broker;
- The downside of this approach is that there are no synchronous read-your-own-write guarantees.
  This can make for confusing user experiences, if for instance a user can’t see their appointment right after creating it.
  Also other tasks which are trivial when writing to a database first—such as enforcing unique constraints—are becoming challenging with this approach.
  While some techniques for mitigation exist, it’s not an approach I’d recommend due to the inherent complexities.
⭐️ InfoQ: Saga Orchestration for Microservices Using the Outbox Pattern - by Gunnar Morling
- Using the outbox pattern and a bit of glue code, it is possible to orchestrate Sagas, long-running business transactions, across distributed services.
- [...] from the perspective of the overall Saga, the isolation level is comparable to “read uncommitted.”
- As said above, a Saga coordinator should preferably communicate asynchronously with participating services, via request and reply message channels. Apache Kafka is a super-popular choice for implementing these channels.
- When implementing complex patterns like Sagas, it’s vital to exactly understand their constraints and semantics. Two things to be aware of in the context of the proposed solution are the inherent eventual consistency and the limited isolation level of the overarching business transaction.
- Slides: Practical Change Data Streaming Use Cases With Apache Kafka and Debezium (QCon San Francisco 2019)
- ⭐️ Github: Repository with example of implementing a Saga pattern via Outbox pattern, Kafka e Debezium
"Friends don’t let friends do dual writes" - by Gunnar Morling

Author

rponte commented Nov 19, 2024

Tweet by Raul Junco - "Idempotency is underrated and under-implemented."

Author

rponte commented Nov 26, 2024 •

edited

Loading

Confluent content

This article presents a very didactic step-by-step text about dual-write, and Outbox Pattern:

⭐️ Confluent: Solving the Dual-Write Problem: Effective Strategies for Atomic Updates Across Systems

Problem: Trying to write to both a database and Kafka topic at the same time;
Solution 1: Order of operations;
Solution 2: Database transactions;
Solution 3: Retries in memory (fails on crashes or restarts);
Solution 4: Retries with different storage (same issue: dual write);
Solution 5: Transactional Outbox Pattern;
Solution 6: Event sourcing;
Solution 7: The listen-to-yourself pattern;
Solution 8: Other solutions - 2PC, XA Transactions and Saga;

⭐️ Some very good videos from Designing Event-Driven Microservices course about distributed system patterns:

⭐️ The Dual Write Problem
- Solution 1: The Transactional Outbox Pattern
- Solution 2: Event Sourcing
- Solution 4: The Listen to Yourself Pattern
- Case Study - Solving the Dual-Write Problem: very good video (~7min) explaining the dual write problem and how to solve it with Outbox pattern or Event Sourcing;

Author

rponte commented Nov 26, 2024 •

edited

Loading

💡 Nobody Needs Reliable Messaging
- if reliability is important on the business level, do it on the business level.
⭐️ Reliable Messaging Without Distributed Transactions - by Udi Dahan | video
💡 Outbox Pattern: Avoiding dual writes in event-driven applications
- Red Hat wrote article on how to implement the Outbox pattern with Java, JPA/Hibernate, CloudEvents, PostgreSQL, and Kafka with Debezium;
The Challenges of Event-Driven Architecture: Dealing with the Dual Write Anti-Pattern

Author

rponte commented Dec 2, 2024

Video: Reliable Messaging Without Distributed Transactions

Udi Dahan explains how you can still do reliable messaging even if you can't use (or don't want) distributed transactions (DTC).

Author

rponte commented Dec 2, 2024 •

edited

Loading

Reliable Messaging in Distributed Systems by Aleksei (@fairday)

In this article, we reviewed several approaches for building reliable messaging in distributed systems. There are several recommendations we might consider while building systems with these characteristics

Always develop idempotent consumers since network failure is unavoidable.
Carefully use the First-Local-Commit-Then-Publish with a clear understanding of guarantee requirements.
Never use the First-Publish-Then-Local-Commit approach since it may lead to severe data inconsistency in your system.
If existing database choice decision very likely may change or technical strategy implies to select the best storage solution for the problem — don’t build shared libraries by binding to database solutions like CDC.
Use the Transactional Outbox approach as a standard solution for achieving at least once guarantees.
Consider using the Listen to yourself approach when No-SQL databases are leveraged.

Author

rponte commented Dec 3, 2024 •

edited

Loading

Strict Ordering and Best-Effort Ordering

Tweet do Marcelo Costa no Bluesky:

Regras que adoto:

Strict ordering: Se a ordem das mensagens for crítica para o sucesso do processamento;

Best effort ordering: Se vc prioriza desempenho e escalabilidade e pode lidar com reordenação;

Lidar com reordenação —> aplicar alguma técnica de SEDA.
Tem um mundo ai para explorar

Comparação entre Strict Ordering e Best-Effort Ordering

Tweet do Marcelo Costa no Bluesky

Tem uma tentativa de livro que eu e uns amigos estamos tentando publicar que a gente fala sobre isso:

Author

rponte commented Dec 4, 2024 •

edited

Loading

A good example of designing an Outbox(er) component in Go:
Achieving reliable dual writes in distributed systems

In my opinion, they basically implemented a job scheduler under the hood:

Author

rponte commented Dec 25, 2024 •

edited

Loading

The Reactive Principles

These are the foundational principles of implementing distributed systems that make an application Reactive:

II. Accept Uncertainty

As soon as we cross the boundary of the local machine, or of the container, we enter a vast and endless ocean of nondeterminism: the world of distributed systems. It is a scary world in which systems can fail in the most spectacular and intricate ways, where information becomes lost, reordered, and corrupted, and where failure detection is a guessing game. It’s a world of uncertainty.

This has a lot of implications: we can’t always trust time as measured by clocks and timestamps, or order (causality new tab might not even exist). Accepting this uncertainty, we have to use strategies to cope with it. For example: rely on logical clocks new tab (such as vector clocks new tab); when appropriate use eventual consistency new tab (e.g. certain NoSQL new tab databases and CRDTs new tab); and make sure our communication protocols are associative new tab (batch-insensitive), commutative new tab (order-insensitive), and idempotent new tab (duplication-insensitive).

The key is to manage uncertainty directly in the application architecture. To design resilient autonomous components that publish their protocols to the world—protocols that clearly define what they can promise, what commands and events will be accepted, and, as a result of that, what behavior will trigger and how the data model should be used. The timeliness and assessed accuracy of underlying information should be visible to other components where appropriate so that they — or the end-user — can judge the reliability of the current system state.

VI. Decouple Time

[...] This is a fragile assumption in the context of distributed systems, where we can’t ensure the availability or reachability of all components in a system at all times. By introducing temporal decoupling in our communication protocols, one component does not need to assume and require the availability of the other components. It makes the components more independent and autonomous and, as a consequence, the overall system more reliable. Popular techniques to implement temporal decoupling include durable message queues, append-only journals, and publish-subscribe topics with a retention duration.

With temporal decoupling, we give the caller the option to perform other work, asynchronously new tab, rather than be blocked waiting on the resource to become available. This can be achieved by allowing the caller to put its request on a queue, register a callback new tab to be notified later, return immediately, and continue execution (e.g., non-blocking I/O new tab). A great way to orchestrate callbacks is to use a Finite State Machine new tab (FSM), other techniques include Futures/Promises new tab, Dataflow Variables, new tab Async/Await new tab, Coroutines new tab, and composition of asynchronous functional combinators new tab in streaming libraries.

Author

rponte commented Dec 25, 2024 •

edited

Loading

💡 Top insights (and one point I don't agree with) about the Gregor Hohpe's article "Event-driven = Loosely coupled? Not so fast!":

TLDR
This is fantastic post from Gregor for reading slowly and deep thinking. It gets to the core of buzzwords such as EDA and Coupling used daily without really analysing direct and second-level consequences.

https://www.linkedin.com/posts/bibryam_i-love-blog-posts-that-make-me-think-gregor-activity-7218910524107878400-ZMy_/

I love blog posts that make me think. Gregor Hohpe hits the nail on the head 100% with this one. Here are the top insights (and one point I don't agree with):

Messages vs. Events
Messaging is an interaction style, whereas "Event" describes the semantics of a message.
Commands, Events, Documents are all Messages.

Messages vs. Channels
We must delineate which characteristics of an event-driven system derive from the properties of the channel rather than the message intent.
Message + Channel = Some Decoupling

Events vs. Coupling
This is the core of the article, introduces a unique perspective about coupling for the different interaction styles. We can now explore the Interaction Styles (RPC, P2P, Pub/Sub) with different Coupling types (change propagation).

Now vs Later
I love the second-level effects analysis too

Folks pitching EDA as decoupled may be the ones early in the lifecycle (like vendors) and won't have to live with the consequences of their (coupling) decisions. The more you take advantage of your ability to add recipients, the harder it will become to make a change. Thus, taking advantage of one dimension (add recipients) of decoupling exposes you more to another dimension of coupling (ex: schema changes).

⭐️ 5. What I don't fully agree with Gregor

"Location Coupling propagates location changes of a recipient to a sender (or vice versa). Message channels decouple location changes because they are (mostly) logical constructs... RPC can achieve some decoupling through DNS or active load balancers, but it's much more likely that a change affects the sender."

⭐️ 5.1.
IMO, one of the main reasons for using RPC is the decoupling of service consumer and provider, so they can evolve in isolation but also start/stop, scale, move, independently. Otherwise, there’s no reason to split a monolith into services and in-memory calls should be preferred over RPC.

⭐️ 5.2.
Today, no caller has a hard-coded callee address; it’s almost always a logical name:

Kubernetes service name, not Pod IP

Dapr sidecar/Istio proxy, with the logical name of the target service

AWS API Gateway address, rather than Lambda URL

Client-side service discovery that looks up the location, still not affecting the client

IMO, RPC today always involves some kind of proxy

⭐️ 5.3.
In all these cases, the provider can scale based on demand or move without affecting the client. You might say the client needs to know the seed location (K8s service, AWS gateway, sidecar, etc.), but that’s true for the location of the messaging channel too. Even if the topic is a logical name the broker URL is with fixed location.

Author

rponte commented Dec 31, 2024

How Complex Systems Fail

5. Complex systems run in degraded mode.
A corollary to the preceding point is that complex systems run as broken systems. The system continues to function because it contains so many redundancies and because people can make it function, despite the presence of many flaws. [...]

16. Safety is a characteristic of systems and not of their components
Safety is an emergent property of systems; it does not reside in a person, device or department of an organization or system. Safety cannot be purchased or manufactured; it is not a feature that is separate from the other components of the system.

Author

rponte commented Jan 3, 2025 •

edited

Loading

Outbox Pattern - by Unico

The interesting part is they use Protobuf as a content type when sending events to the broker. Still, for some reason that's unclear in the article, they serialize this Protobuf data into JSON format before persisting it in the outbox table. I guess they do so because they use Debezium under the hood.
They also use the CloudEvents (v1.0.2) spec for defining the format of event data;

This is the Protobuf message using the CloudEvent spec:

syntax = "proto3";
import "google/protobuf/timestamp.proto";
import "google/protobuf/any.proto";  
message OutboxEvent {
  string specversion = 1;  
  string type = 2;  
  string source = 3;  
  string subject = 4;  
  string id = 5;  
  google.protobuf.Timestamp time = 6;  
  string datacontenttype = 7;  
  string dataschema = 8;  
  google.protobuf.Any data = 9;
}

And this is an example:

{
  "specversion": "1.0",
  "type": "someevent",
  "source": "integration",
  "subject": "1ec07712-79b7-485a-a0e2-0a1c33fd1016",
  "time": "2020-04-30T04:00:00Z",
  "datacontenttype": "application/json",
  "dataschema": "http://<schemapath>",
  "data": {
    "transactionId": "1ec07712-79b7-485a-a0e2-0a1c33fd1016",
    "doc": "123.123.123-00",
    "image_id": "ea02254f-28f4-4b31-99a5-957bb024f78d"
  }
}

Author

rponte commented Jan 16, 2025

Fidelis blog: System Design - Saga Pattern 🇧🇷 - artigo sobre Saga e Outbox Pattern escrito pelo Matheus Fidelis.

Author

rponte commented Feb 7, 2025

Do Caos a Consistência: A Ordem das Mensagens em Sistemas Distribuídos

Author

rponte commented Mar 26, 2025

River's blog: Building an idempotent email API with River unique jobs

Author

rponte commented May 3, 2025 •

edited

Loading

⭐️ (slides) Definition of Insanity: timeouts, retries and idempotency - by Sam Newman

Author

rponte commented May 3, 2025 •

edited

Loading

Thread on Twitter (X) by Qian Li:

Durable workflow timeouts

Timeouts are essential for building efficient and resilient systems. They help prevent systems from waiting indefinitely and free up resources while maintaining responsiveness under heavy load.

For example, suppose your server must finish a task within 30 minutes, but some operations are taking much longer to complete. Even if they eventually succeed, the response will still miss the deadline — wasting resources in the process. In such cases, proactively cancelling on timeout is the right choice.

DBOS docs: Workflow Timeouts

rponte/avoid-distributed-transactions.md

Distributed Transactions and why you should avoid them

rponte commented Sep 2, 2024

Uh oh!

rponte commented Sep 5, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rponte commented Sep 10, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Exactly-once message processsing

Uh oh!

rponte commented Sep 16, 2024

Uh oh!

rponte commented Sep 18, 2024

Uh oh!

rponte commented Oct 11, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rponte commented Nov 6, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Sequin

Uh oh!

rponte commented Nov 18, 2024

Uh oh!

rponte commented Nov 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rponte commented Nov 19, 2024

Uh oh!

rponte commented Nov 26, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Confluent content

Uh oh!

rponte commented Nov 26, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rponte commented Dec 2, 2024

Uh oh!

rponte commented Dec 2, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rponte commented Dec 3, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Strict Ordering and Best-Effort Ordering

Comparação entre Strict Ordering e Best-Effort Ordering

Uh oh!

rponte commented Dec 4, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rponte commented Dec 25, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

The Reactive Principles

II. Accept Uncertainty

VI. Decouple Time

Uh oh!

rponte commented Dec 25, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

💡 Top insights (and one point I don't agree with) about the Gregor Hohpe's article "Event-driven = Loosely coupled? Not so fast!":

Uh oh!

rponte commented Dec 31, 2024

Uh oh!

rponte commented Jan 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rponte commented Jan 16, 2025

Uh oh!

rponte commented Feb 7, 2025

Uh oh!

rponte commented Mar 26, 2025

Uh oh!

rponte commented May 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rponte commented May 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rponte commented Sep 5, 2024 •

edited

Loading

rponte commented Sep 10, 2024 •

edited

Loading

rponte commented Oct 11, 2024 •

edited

Loading

rponte commented Nov 6, 2024 •

edited

Loading

rponte commented Nov 18, 2024 •

edited

Loading

rponte commented Nov 26, 2024 •

edited

Loading

rponte commented Nov 26, 2024 •

edited

Loading

rponte commented Dec 2, 2024 •

edited

Loading

rponte commented Dec 3, 2024 •

edited

Loading

rponte commented Dec 4, 2024 •

edited

Loading

rponte commented Dec 25, 2024 •

edited

Loading

rponte commented Dec 25, 2024 •

edited

Loading

rponte commented Jan 3, 2025 •

edited

Loading

rponte commented May 3, 2025 •

edited

Loading

rponte commented May 3, 2025 •

edited

Loading