Shadowing practice: Distributed Transactions Explained: 2 Phase Commit vs Saga Pattern - Learn English speaking with YouTube

1
When you're first building an application, transactions are easy.
2
You have a database, and when a customer places an order,
3
you wrap the whole thing in the transaction.
4
Charge their card, reserve the inventory,
5
record a ledger entry for accounting.
6
If any of those writes fails,
7
the database rolls everything back automatically,
8
and it's just like nothing happened.
9
You probably don't even think about it much.
10
Your database gives you what are called ACID guarantees,
11
which basically means two things that matter here.
12
First, atomicity.
13
Either all three of those writes happen together,
14
or none of them do.
15
There's no world where the card gets charged,
16
but the inventory doesn't get reserved.
17
Second is isolation.
18
While that transaction is in progress,
19
no other part of your system can see the half-finished state.
20
Another query checking the customer's balance won't see a charge for an order that hasn't fully processed yet.
21
The database handles all of this behind the scenes,
22
and you just write your SQL and move on.
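As a minimal sketch of the single-database case described above, the following uses Python's built-in `sqlite3` module. The table names, the `place_order` and `make_db` functions, and the out-of-stock rule are assumptions made for illustration, not something taken from the video.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical three-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE charges (customer_id INTEGER, amount_cents INTEGER);
        CREATE TABLE inventory (item_id TEXT PRIMARY KEY, stock INTEGER);
        CREATE TABLE ledger (customer_id INTEGER, amount_cents INTEGER);
        INSERT INTO inventory VALUES ('widget', 1);
        """
    )
    return conn


def place_order(conn, customer_id, item_id, amount_cents):
    """Charge, reserve, and record inside one local transaction."""
    try:
        # "with conn" commits if the block succeeds and rolls back
        # every write in the block if an exception escapes it.
        with conn:
            conn.execute(
                "INSERT INTO charges (customer_id, amount_cents) VALUES (?, ?)",
                (customer_id, amount_cents),
            )
            cur = conn.execute(
                "UPDATE inventory SET stock = stock - 1 "
                "WHERE item_id = ? AND stock > 0",
                (item_id,),
            )
            if cur.rowcount == 0:
                raise RuntimeError("out of stock")
            conn.execute(
                "INSERT INTO ledger (customer_id, amount_cents) VALUES (?, ?)",
                (customer_id, amount_cents),
            )
        return True
    except RuntimeError:
        return False
```

With one unit in stock, a second call to `place_order` fails at the inventory step, and the charge it had already inserted is rolled back with it.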
23
But when your application grows,
24
you start to get more traffic, more data, more writes.
25
And eventually, that single database starts hitting its limits.
26
So you do what everyone does at this point.
27
You split things up.
28
Maybe you shard the database to spread write load across multiple machines,
29
or maybe you break up your monolith into microservices,
30
where each service now owns its own database.
31
The specifics can vary, but the result is the same.
32
Your data now lives on multiple independent machines instead of one.
33
And at this point, everything changes.
34
The payment flow that used to be one transaction against one database
35
is now three completely separate operations against three separate databases on three separate machines.
36
The card gets charged in the payment database,
37
inventory gets reserved in the inventory database,
38
and that ledger entry gets recorded in the accounting database.
39
You can't wrap a transaction across these independent databases because they don't know about each other.
40
So if the card charge commits,
41
but then the inventory reservation fails because the item is out of stock,
42
there's no database-level rollback that can undo that charge.
43
It's already committed in a completely different database on a completely different machine.
44
When you're processing thousands of transactions a second across distributed infrastructure,
45
partial failures like these aren't edge cases,
46
they actually become pretty routine.
47
Now this whole class of problem is what's called a distributed transaction.
48
A single logical operation that needs to span multiple independent databases or services,
49
where all the steps need to either succeed together or be cleaned up when something goes wrong.
50
The textbooks give us two approaches to distributed transactions,
51
two-phase commit and the saga pattern.
52
In practice, the industry has overwhelmingly chosen one over the other,
53
and understanding why will save you a lot of pain.
54
Two-phase commit is the classic academic solution to distributed transactions.
55
The idea is to introduce a new component called a coordinator whose entire job is to make sure
56
that all participants in a transaction agree on the outcome before any make their changes permanent.
57
It works in two phases,
58
which is where the name comes from, of course.
59
In the first phase, called the prepare phase,
60
the coordinator sends a message to every participant asking,
61
can you commit this transaction?
62
Each participating database then does the actual work.
63
It processes the request, durably records the changes so that nothing is lost if it crashes,
64
and locks the affected rows so that no other transactions can modify them in the meantime.
65
Then it responds to the coordinator with either yes,
66
I'm ready to commit, or no, something went wrong.
67
If any single participant votes no,
68
the coordinator tells everyone to abort and release their locks.
69
If every participant votes yes on the other hand,
70
then the coordinator moves to phase 2.
71
It sends a commit message to everyone.
72
Each participant makes its changes permanent and releases those locks,
73
and the transaction is now complete.
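The two phases just described can be sketched in a few lines of Python. This is an in-memory illustration only: the `Participant` class, its `can_commit` flag, and the `two_phase_commit` function are hypothetical names, and a real coordinator would also need durable logging, timeouts, and network calls.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    name: str
    can_commit: bool = True   # hypothetical switch to simulate a "no" vote
    state: str = "idle"       # idle -> prepared -> committed / aborted
    locked: bool = False

    def prepare(self) -> bool:
        """Phase 1: do the work, hold locks, and vote."""
        if not self.can_commit:
            self.state = "aborted"
            return False
        self.state = "prepared"
        self.locked = True
        return True

    def commit(self) -> None:
        """Phase 2: make the change permanent and release locks."""
        self.state = "committed"
        self.locked = False

    def abort(self) -> None:
        """Discard the prepared work and release locks."""
        self.state = "aborted"
        self.locked = False


def two_phase_commit(participants) -> bool:
    # Phase 1: ask every participant to prepare and collect the votes.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if every vote was yes, otherwise abort everyone.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

Note that between `prepare()` and `commit()` every participant sits in the `prepared` state with its lock held, which is the window the rest of the discussion focuses on.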
74
What this gives you is strong consistency,
75
the same guarantee you had with a single database.
76
Every participant agrees on the outcome before anything is finalized,
77
so there's no window where the system is in a partial or inconsistent state.
78
On paper, this is exactly what you want,
79
but the problems show up when you try to run this in production.
80
The fundamental problem with two-phase commit or 2PC is that it's a blocking protocol,
81
and blocking in a distributed system is dangerous because you're now dependent on multiple machines all staying healthy at the same time.
82
I want you to picture this.
83
The coordinator collects all three yes votes from the participants,
84
but then it crashes.
85
Right there after collecting the votes,
86
but before it gets a chance to send the commit decision.
87
Now the participants are all stuck.
88
They're all sitting there with locks held on the rows they prepared and they have no idea what to do next.
89
They can't go ahead and commit on their own,
90
because maybe the coordinator was about to tell them to abort.
91
They can't abort on their own either,
92
because maybe the coordinator was about to tell them to commit and the other participants already went through with it.
93
So they just wait.
94
And every other transaction in your system that needs to touch any of those locked rows is now blocked too,
95
waiting for the locks that nobody can release.
96
Crashes aren't even the only problem with 2PC.
97
A single slow participant holds up the entire transaction.
98
So if the ledger service,
99
for example, were to take 10 seconds to respond to the prepare message,
100
the card service
101
and the inventory service are both just sitting there with their locks held for those full 10 seconds doing nothing.
102
That means the entire system moves at the speed of the slowest participant.
103
And if a network partition means the coordinator can't reach the participant at all,
104
there's no safe default.
105
It can't tell whether the message got through or not.
106
This is why almost nobody uses two-phase commit across services in production.
107
Pat Helland wrote a really influential paper called Life Beyond Distributed Transactions.
108
In that paper, he argues exactly this point.
109
Distributed transactions across autonomous services don't work at internet scale.
110
The industry took this lesson to heart.
111
2PC does exist in production,
112
but only inside distributed databases like Google Spanner or YugabyteDB,
113
where the coordinator and the participants are tightly coupled within the same system.
114
This is where the database handles the complexity internally so that you as the caller don't have to.
115
But across independent services with different deployment schedules and different failure characteristics,
116
that's where it all falls apart.
117
So what should you do instead?
118
Well, when companies need to coordinate work across multiple services,
119
the saga pattern is what they reach for.
120
Uber, Netflix, Amazon, DoorDash, they all use this pattern in production.
121
Sagas start from a very different assumption than 2PC.
122
You don't actually need all-or-nothing atomicity spanning multiple services.
123
You just need a way to eventually get to a consistent state,
124
even when things go wrong along the way.
125
Instead of coordinating one big distributed transaction with locks held across services,
126
you break the work into a chain of independent local transactions.
127
So each service does its piece of work and commits to its own database on its own terms.
128
When something fails further down the chain,
129
there's no way to roll back the earlier steps, since they've already been committed to separate databases.
130
So instead, you run what is called a compensating action.
131
These are business-level undos that reverse the effects of what already happened.
132
So a refund instead of a rollback,
133
a cancellation instead of an abort.
134
Something needs to detect that failure and trigger those compensations
135
and how that works is a key design decision we'll get into in just a moment.
136
But the trade-off is that instead of getting the strong consistency you get with 2PC,
137
sagas give you what's called eventual consistency.
138
The system might be temporarily in an inconsistent state while compensations are running.
139
A customer might briefly see a charge on their card before the refund goes through,
140
but it always converges to a correct state and nothing is blocked while that convergence is happening.
141
Other transactions can keep flowing normally that entire time.
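The core saga idea, local steps paired with compensating actions that run in reverse order on failure, can be sketched as follows. The `run_saga` function and the `(name, action, compensation)` tuple shape are assumptions chosen for illustration; production saga frameworks add persistence and retries on top of this.

```python
def run_saga(steps):
    """Run each step's action in order.

    steps is a list of (name, action, compensation) tuples, where action
    and compensation are zero-argument callables. If an action raises,
    the compensations of the steps that already completed are run in
    reverse order. Returns (succeeded, log).
    """
    completed = []  # (name, compensation) for steps that have committed
    log = []
    for name, action, compensation in steps:
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            # Undo earlier steps newest-first, e.g. refund after a charge.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated:{done_name}")
            return False, log
        log.append(f"done:{name}")
        completed.append((name, compensation))
    return True, log
```

For example, if the first step charges a card and the second step raises because the item is out of stock, the log reads `done:charge_card`, `failed:reserve_inventory`, `compensated:charge_card`. The failed step itself is not compensated, since it never committed anything.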
142
Now there are two ways to implement sagas,
143
and the choice between them determines who is responsible for detecting failures and running compensations.
144
The first approach is called choreography,
145
and it's the decentralized option.
146
It uses a publish-subscribe pattern where each service broadcasts an event when it finishes its work,
147
and any interested service can pick it up and react.
148
So the card service charges the card and then publishes a card charged event.
149
The inventory service is listening for that event,
150
so when it arrives, it reserves the stock and publishes an inventory reserved event.
151
The ledger service picks that up and then records the entry.
152
If something fails, the failing service publishes a failure event,
153
and the upstream services react by running their own compensations.
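A choreographed version of the order flow might look like the sketch below. It uses a tiny in-memory, synchronous `EventBus` in place of a real message broker; the class, the event names such as `InventoryReservationFailed`, and the `wire_services` helper are all hypothetical.

```python
from collections import defaultdict


class EventBus:
    """Minimal synchronous publish-subscribe bus (stand-in for a broker)."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.history = []  # every event type published, in order

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        self.history.append(event_type)
        for handler in list(self.handlers[event_type]):
            handler(payload)


def wire_services(bus, stock):
    """Subscribe each service to the event it reacts to."""

    def card_service_on_order(order):
        bus.publish("CardCharged", order)

    def inventory_service_on_charge(order):
        if stock.get(order["item"], 0) > 0:
            stock[order["item"]] -= 1
            bus.publish("InventoryReserved", order)
        else:
            bus.publish("InventoryReservationFailed", order)

    def ledger_service_on_reserved(order):
        bus.publish("LedgerRecorded", order)

    def card_service_on_failure(order):
        # Compensation: the upstream service undoes its own work.
        bus.publish("CardRefunded", order)

    bus.subscribe("OrderPlaced", card_service_on_order)
    bus.subscribe("CardCharged", inventory_service_on_charge)
    bus.subscribe("InventoryReserved", ledger_service_on_reserved)
    bus.subscribe("InventoryReservationFailed", card_service_on_failure)
```

No single function here knows the whole flow. The only record of what happened is `bus.history`, which hints at why tracing a failed transaction gets harder as more services join.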
154
This works well when you have a simple flow with just two or three steps.
155
But once you get to five or six services all publishing and reacting to each other's events,
156
figuring out the current state of any given transaction becomes really difficult.
157
Where exactly did it fail?
158
Which compensating actions have already run?
159
Did the refund actually go through?
160
Without a central place tracking all of this,
161
you end up digging through logs across a dozen different services,
162
trying to piece together what happened.
163
The second approach is called orchestration,
164
and it's what most teams end up using once they reach any serious scale.
165
Instead of services reacting to each other's events,
166
you have a dedicated orchestrator service that controls the entire flow.
167
It tells each service what to do one step at a time.
168
Card service, charge the card.
169
It waits for confirmation.
170
Inventory service, reserve the stock.
171
Wait for confirmation.
172
If something fails, the orchestrator knows exactly which step failed and can run the right compensating actions in the right order.
173
Tools like Temporal, which was created by the engineers behind Uber's Cadence workflow engine,
174
or AWS Step Functions, are purpose-built for exactly this kind of orchestration.
175
It's what we use here at Hello Interview to coordinate our payment and fulfillment flows,
176
and it's the pattern we'd recommend that most teams use.
177
The important difference between a saga orchestrator and a 2PC coordinator is what happens when it crashes.
178
The orchestrator doesn't leave locks dangling across your system.
179
It's durable.
180
When it restarts, it reads its own state from a database and it picks up exactly where it left off.
181
So no rows are locked in the meantime and no other transactions are blocked during that recovery period.
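The recovery behaviour described above can be illustrated with a small orchestrator that saves its progress after every step. This is a sketch under assumptions: the `saga_state` table, the `STEPS` list, and the `crash_before` parameter (used only to simulate a crash) are invented for the example, and tools such as Temporal or AWS Step Functions handle this persistence in their own ways.

```python
import sqlite3

STEPS = ["charge_card", "reserve_inventory", "record_ledger"]


def init_store(conn):
    """Create the table where the orchestrator keeps its own progress."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS saga_state "
        "(saga_id TEXT PRIMARY KEY, next_step INTEGER NOT NULL)"
    )
    conn.commit()


def run_orchestrator(conn, saga_id, handlers, crash_before=None):
    """Run the remaining steps of a saga, resuming from saved state.

    handlers maps a step name to a callable taking saga_id.
    crash_before names a step at which to simulate a crash.
    """
    row = conn.execute(
        "SELECT next_step FROM saga_state WHERE saga_id = ?", (saga_id,)
    ).fetchone()
    next_step = row[0] if row else 0
    for index in range(next_step, len(STEPS)):
        name = STEPS[index]
        if name == crash_before:
            raise RuntimeError("simulated orchestrator crash")
        handlers[name](saga_id)
        # Record progress durably. A crash between the handler call and
        # this write would repeat the step on restart, which is why the
        # steps themselves should be idempotent.
        with conn:
            conn.execute(
                "INSERT OR REPLACE INTO saga_state (saga_id, next_step) "
                "VALUES (?, ?)",
                (saga_id, index + 1),
            )
    return "completed"
```

If the first run crashes before `reserve_inventory`, a second call with the same `saga_id` skips `charge_card` and continues from the saved position. No participant holds a lock while the orchestrator is down.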
182
Sagas solve the blocking problem that makes 2PC impractical,
183
but they introduce a different kind of complexity.
184
the compensating actions themselves.
185
The idea of just undo the previous step sounds clean,
186
but in practice, it gets pretty messy quickly.
187
Say the card charge went through and committed,
188
and then the inventory reservation fails because the item is out of stock.
189
The compensation is to issue a refund on the card,
190
but unlike a database rollback,
191
that refund is visible to the customer.
192
They see an actual charge show up on their card,
193
and then a few seconds later, they see a refund.
194
Their bank might even send them a push notification for each one.
195
It works correctly, but it's not the invisible cleanup that a database rollback gives you.
196
And some actions are genuinely hard to undo at all.
197
If one of the steps in your flow is to send a confirmation email,
198
you can't unsend that email.
199
If you fired a webhook to a third-party payment system,
200
you can send a follow-up cancellation,
201
but you can't guarantee that they'll process it in time or at all.
202
Each step in your saga needs a well-defined compensating action,
203
and some of those compensations are inherently imperfect.
204
On top of that, compensating actions can themselves fail.
205
What if the refund API is down when you need to issue that refund?
206
Now you need retry logic for your compensations.
207
And if you're retrying a refund,
208
you need the operation to be idempotent,
209
meaning it produces the same result whether you run it once or 10 times,
210
so that a retry doesn't accidentally refund the customer twice.
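One common way to make a refund safe to retry is to attach an idempotency key to each request, so the service can recognize a repeat. The sketch below assumes a hypothetical in-memory `RefundService`; the `lose_first_response` flag exists only to simulate a refund that succeeds while its response is lost.

```python
class RefundService:
    """Hypothetical refund API that deduplicates by idempotency key."""

    def __init__(self, lose_first_response=False):
        self.processed = {}            # idempotency_key -> stored result
        self.total_refunded_cents = 0
        self._lose_next_response = lose_first_response

    def refund(self, idempotency_key, amount_cents):
        # Apply the refund only the first time this key is seen.
        if idempotency_key not in self.processed:
            self.total_refunded_cents += amount_cents
            self.processed[idempotency_key] = {
                "status": "refunded",
                "amount_cents": amount_cents,
            }
        # Simulate the refund being applied but the reply never arriving.
        if self._lose_next_response:
            self._lose_next_response = False
            raise TimeoutError("response lost in transit")
        return self.processed[idempotency_key]


def refund_with_retry(service, idempotency_key, amount_cents, max_attempts=3):
    """Retry with the same key so repeated attempts cannot double-refund."""
    for attempt in range(1, max_attempts + 1):
        try:
            return service.refund(idempotency_key, amount_cents)
        except TimeoutError:
            if attempt == max_attempts:
                raise
```

The important detail is that the caller reuses the same key on every attempt. A fresh key per retry would defeat the deduplication.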
211
You end up needing the same level of reliability engineering for your failure handling as you do for your happy paths.
212
Even with solid compensation logic in place,
213
there's one more failure mode that catches teams off guard.
214
When your card service finishes charging the card,
215
it needs to do two things.
216
Save the result to its own database
217
and publish an event to a message broker so that the next service in the chain knows it's time to proceed.
218
The problem is that those are two completely separate writes to two completely separate systems.
219
This is called the dual write problem.
220
If the database write succeeds but the event publish fails,
221
the next step in your saga never gets triggered and the whole flow stalls.
222
If the event publishes successfully but the database write fails,
223
now downstream services are reacting to something that didn't actually happen in your database.
224
You can fix this with something called a transactional outbox pattern.
225
Instead of writing to your database and publishing an event as two separate operations,
226
you write both your data and the outgoing event into the same database via a single local transaction.
227
The event goes into a special outbox table right alongside your regular data write,
228
so they either both commit or neither do.
229
Then, a separate background process watches the outbox table and publishes those events to your message broker.
230
That background process can use change data capture,
231
which means it tails the database's own transaction logs to pick up new entries,
232
or it can simply poll the outbox table on a regular interval.
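Here is a minimal sketch of the polling variant of the outbox, again using Python's built-in `sqlite3`. The `charges` and `outbox` table layouts, the `CardCharged` event name, and the `publish` callback (a stand-in for a real broker client) are assumptions made for the example.

```python
import json
import sqlite3


def make_outbox_db():
    """Create a hypothetical schema: business data plus an outbox table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE charges (order_id TEXT PRIMARY KEY, amount_cents INTEGER);
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
        """
    )
    return conn


def charge_card(conn, order_id, amount_cents):
    """Write the charge and its event in ONE local transaction."""
    with conn:  # both inserts commit together or roll back together
        conn.execute(
            "INSERT INTO charges (order_id, amount_cents) VALUES (?, ?)",
            (order_id, amount_cents),
        )
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            (
                "CardCharged",
                json.dumps({"order_id": order_id, "amount_cents": amount_cents}),
            ),
        )


def relay_once(conn, publish):
    """One polling pass: publish unpublished events, then mark them sent."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox "
        "WHERE published = 0 ORDER BY id"
    ).fetchall()
    for event_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute(
                "UPDATE outbox SET published = 1 WHERE id = ?", (event_id,)
            )
    return len(rows)
```

If the relay crashes after publishing but before marking a row as published, that event is sent again on the next pass. This gives at-least-once delivery, so consumers still need to be idempotent.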
233
Before you reach for either of these patterns,
234
the first question to ask yourself is whether you actually need a distributed transaction at all.
235
If you can design your service boundaries so that the data that transacts together lives in the same database, do that.
236
This is easier to get right up front than to try to retrofit later,
237
so it's worth thinking about early.
238
For example, move the inventory and ledger tables into the same database that has the payments table,
239
if they always need to be updated together.
240
A local database transaction is simpler,
241
faster, and more reliable than any distributed alternative,
242
and this is always the best answer when you can make it work.
243
If you genuinely can't avoid distributing the transaction across services,
244
you're going to use a saga.
245
That's not really a debate in the industry anymore.
246
The question is which flavor of saga makes sense for your situation.
247
Choreography is usually where teams start,
248
and for simple flows it works great.
249
If you have three or four steps,
250
the services are truly independent,
251
and you don't need centralized visibility into where each transaction stands,
252
choreography keeps things simple.
253
Something like an e-commerce notification system, where an order-placed
254
event triggers an email or push notification independently,
255
this is a natural fit.
256
Most teams eventually outgrow it as their flows get more complex,
257
but there's no reason to over-engineer from day one.
258
For anything more complex than that,
259
orchestration is the way to go.
260
Complex flows with branching logic,
261
flows where you need to see exactly where a transaction is stuck,
262
flows where the compensation logic is tricky
263
and you want it defined in one clear place rather than scattered across half a dozen services.
264
Most teams end up here,
265
and tools like Temporal or AWS Step Functions make it very practical to implement.
266
One last note, if eventual consistency truly is not acceptable for a particular piece of your system,
267
consider whether that data can live in a single distributed database like Spanner
268
or YugabyteDB that handles strong consistency internally for you.
269
That's a very different thing from trying to build 2PC yourself across independent services.
270
At the end of the day,
271
the pattern that you'll see at most companies operating at scale is saga with orchestration,
272
idempotent operations at every step so that retries are always safe,
273
and a transactional outbox to make sure events are as reliable as database writes.
274
It means accepting eventual consistency,
275
but that's a trade-off the industry has made deliberately.
276
And it's the architecture that Uber,
277
Netflix, and Amazon actually run in production today.

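Context and background

This video explains the basics of transaction management when you first start building an application. You rely on the database: every operation runs inside a single transaction and is rolled back automatically when an error occurs. As the application grows and the data is split across multiple databases, however, transaction management becomes complicated and partial failures become routine. Here you can learn about the importance of distributed transactions and the two solutions to them, two-phase commit and the saga pattern.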

Top 5 phrases for everyday communication

  • Transaction: "When a customer places an order, you wrap the whole thing in the transaction."
  • Consistency: "You can't wrap a transaction across these independent databases because they don't know about each other."
  • Rollback: "The database rolls everything back automatically."
  • Distributed transaction: "This whole class of problem is what's called a distributed transaction."
  • Operational consistency: "All the steps need to either succeed together or be cleaned up when something goes wrong."

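Step-by-step shadowing guide

The content of this video includes technical English and is challenging, but it works well as material for effective speaking practice. The steps below describe how to practice English speaking with this video.

  1. First, watch the video once all the way through to get a feel for the whole.
  2. Next, note down the key phrases in the video and repeat them out loud. In particular, use the shadow speak technique to master accurate rhythm and intonation.
  3. Break the phrases down one sentence at a time and check your pronunciation while pausing the video. The key point in this step is to understand how partial failures are expressed in English.
  4. Finally, as IELTS speaking preparation, explain the phrases again in your own words to deepen your understanding.

Continuing this process will improve your shadowing ability and help you gain natural fluency in English. Keep practicing with shadowspeak and enjoy conversations with confidence.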

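What is shadowing? Why it is effective for improving your English

Shadowing is a language-learning method that was originally developed in training programs for professional interpreters and was widely popularized by Dr. Alexander Arguelles, who is known as a polyglot. The method is simple but very effective: while listening to a native speaker's English, you repeat it out loud right away with a delay of 1 to 2 seconds, following the speaker like a shadow. Unlike grammar drills or passive listening, shadowing forces your brain and mouth muscles to process and reproduce English simultaneously in real time. Research has confirmed significant improvements in pronunciation accuracy, intonation, rhythm, linking, listening ability, and conversational fluency. It is especially recommended for those preparing for IELTS speaking or aiming for natural English communication.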