@rponte
Last active May 3, 2025 15:54
THEORY: Distributed Transactions and why you should avoid them (2 Phase Commit, Saga Pattern, TCC, Idempotency, etc.)

Distributed Transactions and why you should avoid them

  1. Many modern technologies (RabbitMQ, Kafka, etc.) don't support them;
  2. They are a form of synchronous inter-process communication, which reduces availability;
  3. All participants in the distributed transaction need to be available for a distributed commit, which again reduces availability.

Implementing business transactions that span multiple services is not straightforward. Distributed transactions are best avoided because of the CAP theorem: a commit that requires every participant to agree gives up availability whenever one of them is unreachable. Moreover, many modern (NoSQL) databases don’t support them. The best solution is to use the Saga Pattern.

[...]

One of the most well-known patterns for distributed transactions is called Saga. The first paper about it was published back in 1987, and it has been a popular solution ever since.

There are a couple of different ways to implement a saga transaction, but the two most popular are:

  • Events/Choreography: there is no central coordination; each service produces events, listens to other services’ events, and decides whether an action should be taken;
  • Command/Orchestration: a coordinator service is responsible for centralizing the saga’s decision making and sequencing the business logic.
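
As a rough illustration of the Command/Orchestration style, here is a minimal in-process sketch in Python. It is not taken from any library: `SagaStep`, `run_saga`, and the three example steps are hypothetical names, and the "services" are plain functions rather than remote calls. It assumes every step has a compensating action and that compensations themselves do not fail.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]        # forward operation owned by one service
    compensation: Callable[[], None]  # undoes the action if a later step fails


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception:
            # The failed step made no durable change in this sketch, so only
            # the steps that already succeeded are compensated.
            for done in reversed(completed):
                done.compensation()
            return False
    return True


log: List[str] = []


def fail_payment() -> None:
    # Hypothetical failure used to trigger the compensation path.
    raise RuntimeError("payment declined")


steps = [
    SagaStep("reserve_stock",
             lambda: log.append("stock reserved"),
             lambda: log.append("stock released")),
    SagaStep("charge_payment",
             fail_payment,
             lambda: log.append("payment refunded")),
    SagaStep("ship_order",
             lambda: log.append("order shipped"),
             lambda: log.append("shipment cancelled")),
]

succeeded = run_saga(steps)
```

In a real system the orchestrator would also persist its progress so that it can resume or compensate after a crash, and the actions and compensations would need to be idempotent because they may be retried.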

rponte commented Jan 3, 2025

Outbox Pattern - by Unico

  • The interesting part is they use Protobuf as a content type when sending events to the broker. Still, for some reason that's unclear in the article, they serialize this Protobuf data into JSON format before persisting it in the outbox table. I guess they do so because they use Debezium under the hood.
  • They also use the CloudEvents (v1.0.2) spec for defining the format of event data;

This is the Protobuf message that follows the CloudEvents spec:

syntax = "proto3";
import "google/protobuf/timestamp.proto";
import "google/protobuf/any.proto";  
message OutboxEvent {
  string specversion = 1;  
  string type = 2;  
  string source = 3;  
  string subject = 4;  
  string id = 5;  
  google.protobuf.Timestamp time = 6;  
  string datacontenttype = 7;  
  string dataschema = 8;  
  google.protobuf.Any data = 9;
}

And this is an example:

{
  "specversion": "1.0",
  "type": "someevent",
  "source": "integration",
  "subject": "1ec07712-79b7-485a-a0e2-0a1c33fd1016",
  "time": "2020-04-30T04:00:00Z",
  "datacontenttype": "application/json",
  "dataschema": "http://<schemapath>",
  "data": {
    "transactionId": "1ec07712-79b7-485a-a0e2-0a1c33fd1016",
    "doc": "123.123.123-00",
    "image_id": "ea02254f-28f4-4b31-99a5-957bb024f78d"
  }
}
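
To make the outbox idea concrete, here is a minimal sketch using Python's standard `sqlite3` module. It is not Unico's implementation: the `orders` and `outbox` tables, `place_order`, and `relay_once` are hypothetical, and a simple polling relay stands in for the Debezium-based change data capture mentioned above. The point it tries to show is that the business row and the event row are written in the same local transaction, so neither exists without the other.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone


def create_schema(conn):
    # Hypothetical tables: one for business data, one for pending events.
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, doc TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id TEXT PRIMARY KEY, payload TEXT NOT NULL, "
        "published INTEGER NOT NULL DEFAULT 0)"
    )


def place_order(conn, doc):
    order_id = str(uuid.uuid4())
    # Envelope shaped like the JSON example above, plus an "id" attribute.
    event = {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "type": "someevent",
        "source": "integration",
        "subject": order_id,
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": {"transactionId": order_id, "doc": doc},
    }
    with conn:  # commits both inserts together, or rolls both back on error
        conn.execute("INSERT INTO orders (id, doc) VALUES (?, ?)", (order_id, doc))
        conn.execute(
            "INSERT INTO outbox (id, payload) VALUES (?, ?)",
            (event["id"], json.dumps(event)),
        )
    return order_id


def relay_once(conn, publish):
    """Publish every pending event, then mark it as published."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0"
    ).fetchall()
    for event_id, payload in rows:
        publish(json.loads(payload))  # assumption: publish() sends to a broker
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (event_id,))
    return len(rows)


conn = sqlite3.connect(":memory:")
create_schema(conn)
order_id = place_order(conn, "123.123.123-00")
sent = []  # stands in for the message broker
relayed = relay_once(conn, sent.append)
```

If the relay crashes after publishing but before marking the row, the event is sent again on the next run, so this gives at-least-once delivery and consumers should handle duplicates.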


rponte commented Jan 16, 2025

Fidelis blog: System Design - Saga Pattern 🇧🇷 - an article (in Portuguese) about the Saga and Outbox patterns, written by Matheus Fidelis.


rponte commented Feb 7, 2025


rponte commented Mar 26, 2025


rponte commented May 3, 2025


rponte commented May 3, 2025

Thread on Twitter (X) by Qian Li:

Durable workflow timeouts

Timeouts are essential for building efficient and resilient systems. They help prevent systems from waiting indefinitely and free up resources while maintaining responsiveness under heavy load.

For example, suppose your server must finish a task within 30 minutes, but some operations are taking much longer to complete. Even if they eventually succeed, the response will still miss the deadline — wasting resources in the process. In such cases, proactively cancelling on timeout is the right choice.
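
The idea of cancelling proactively on timeout can be sketched with Python's standard `asyncio.wait_for`, which cancels the awaited work once the deadline passes. This is only an in-process illustration: `slow_operation` and `run_with_deadline` are hypothetical names, the durations are shortened to fractions of a second, and nothing here is durable, so the deadline would not survive a process restart the way a durable workflow timeout is meant to.

```python
import asyncio


async def slow_operation(seconds: float) -> str:
    # Stands in for a long-running task; sleeping is an assumption.
    await asyncio.sleep(seconds)
    return "done"


async def run_with_deadline(seconds: float, deadline: float) -> str:
    try:
        # wait_for cancels slow_operation if it exceeds the deadline.
        return await asyncio.wait_for(slow_operation(seconds), timeout=deadline)
    except asyncio.TimeoutError:
        # The work was cancelled instead of running past the deadline.
        return "cancelled"


fast = asyncio.run(run_with_deadline(0.01, deadline=1.0))
slow = asyncio.run(run_with_deadline(1.0, deadline=0.05))
```

A durable variant would additionally need to persist the deadline alongside the workflow state, so that a restarted process can still tell that the time budget has run out.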

DBOS docs: Workflow Timeouts
