Shadowing practice: Learn how to SOLVE a data analytics case study problem - Learn to speak English with YouTube

1
And then we ask the interviewer, hey, is that good
2
or not?
3
And they're like, no, do something a little bit more complicated.
4
Hi everyone, today I'm going to solve a data analytics case
5
study problem. First, we're going to go over what constitutes a
6
data analytics problem, a framework for how to tackle it,
7
and then a step-by-step explanation of diving into it.
8
So first off, what is a data analytics case study? For our purposes, I've defined data analytics case study problems
9
as any problem that requires (1) formulating metrics for a hypothetical scenario
10
and (2) writing a SQL query or analyzing a dataset in Python to retrieve those metrics.
11
For example,
12
the problem we're tackling today involves understanding how to build a
13
new audio chat feature to improve the match rate between car buyers and sellers in a marketplace app.
14
Now if you're thinking that data analytics case study questions sound very similar to product metrics type questions,
15
you're probably right.
16
And for all intents and purposes,
17
most product metrics questions fall under the umbrella of data analytics.
18
It's just that product metrics questions are more focused around product case studies.
19
So for example, it'd be something like,
20
how would you investigate a 10% drop in Uber ride requests, right?
21
Or let's say that we want to launch a new feature for Uber,
22
how would you analyze results?
23
So all those product metrics questions are focused around a specific product itself
24
and so data analytics can be a little bit more broad than
25
that. The second main difference is
26
that in data analytics case study questions, we actually expect the candidate to use SQL
27
or Python to actually implement their metrics, whereas in product metrics
28
case study questions, you're usually just kind of brainstorming different ideas,
29
clarifying the question, giving out structured analysis without actually diving into the data itself,
30
because there is no data. It's a hypothetical scenario.
31
So with data analytics case studies, what you'll find is
32
that they'll actually give you an actual dataset. Many times this will happen in a take-home assignment
33
or an official phone screen
34
or even in an on-site, where the interviewer will basically just give you a dataset
35
and a Python Jupyter notebook and be like, go ahead. It's kind of nice
36
because basically it kills two birds with one stone.
37
The interviewer is trying to assess how you do on SQL,
38
and at the same time they can also get a sense of your product intuition.
39
The main issue with these is that if you have bad product intuition and you then work on a SQL query,
40
you're either going to make your life really difficult by working on a SQL problem that's extremely difficult
41
or on the entirely wrong problem.
42
So let's just dive into this question and then see how it goes.
43
All right.
44
An online marketplace company has introduced a new feature
45
that allows potential buyers and sellers to conduct audio chats with each other prior to transacting.
46
Let's say we have two tables that represent this data.
47
So we have a chats table and we have a marketplace purchases table.
48
How would you measure the success of this new feature?
49
Write a query that can represent if this feature is successful or not.
50
How do we tackle this problem?
51
So the first thing we have to do is to treat this like it's a classic product metrics question,
52
or just any kind of analytics case study, right?
53
So we're doing stuff like we're clarifying,
54
we're assessing requirements, we're validating our solution.
55
Before that, we're proposing a solution and then we're validating it.
56
Yeah, let's just start out with that first.
57
And, you know, given the time constraints of most of these interviews,
58
I would say the best thing
59
that we can do is actually propose one metric and immediately code that up.
60
I think in every instance,
61
we also want to keep this metric pretty simple.
62
I see a lot of candidates shoot themselves in the foot by first proposing like a really complicated metric
63
and then
64
when they actually have to code it up in SQL, they're
65
just spending, you know, like 40 minutes, the rest of the time, trying to get to a solution.
66
The best way, and I think always the best way, to approach this is iteration, right?
67
So we iterate on our approaches. We propose a simple solution,
68
and then we code it up, and we do that,
69
and we do that fast, and then we ask the interviewer, hey, is
70
that good or not? And they're like, no, do something a little bit more complicated.
71
And then you'll be like, okay, yeah, so I'll do something new now.
72
So again, let's try to iterate on all of our solutions here
73
and start out by keeping it simple. KISS framework, right? Keep
74
it simple, stupid. The second tip is to keep the metric super flexible for further analysis, okay?
75
So if we start out with just one simple solution, then we can actually make it flexible
76
so that later down the line
77
we are also not shooting ourselves in the foot again by having to rewrite our entire SQL query.
78
We want to be able to make it so
79
that we can actually write it in a way that is flexible for further solutions down the line.
80
Given this problem, and given
81
that we have to understand how a feature basically allows users to conduct audio chats with each other,
82
what I'd try, again, is just to visualize exactly what the output is that I want.
83
So I want something that says used audio chat,
84
right, and then it is a one or a zero. Then I also want purchase completion rates,
85
and then this is going to be something like, you know, 50 percent, and then this will be something like
86
25 percent, right? So this is the output that I want to see, right?
87
If I'm a PM, I want to see that, oh, people
88
that used the audio chat completed their purchases at a higher rate than people
89
that did not use audio chat, okay?
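
The target output described here can be sketched in a few lines of Python. This is only an illustration with made-up rows, chosen so the rates come out to the 50 and 25 percent used as the spoken example; the field names `used_audio_chat` and `purchased` are assumptions, not columns from the actual dataset.

```python
from collections import defaultdict

# Hypothetical buyer-seller pairs: (used_audio_chat, purchased).
# The values are invented to reproduce the 50% / 25% example.
pairs = [
    (1, 1), (1, 1), (1, 0), (1, 0),
    (0, 1), (0, 0), (0, 0), (0, 0),
]

def completion_rate_by_group(rows):
    """Return {used_audio_chat: share of pairs with a purchase}."""
    totals = defaultdict(int)
    purchases = defaultdict(int)
    for used_audio_chat, purchased in rows:
        totals[used_audio_chat] += 1
        purchases[used_audio_chat] += purchased
    return {group: purchases[group] / totals[group] for group in totals}

rates = completion_rate_by_group(pairs)
for group in sorted(rates, reverse=True):
    print(f"used_audio_chat={group}  purchase_completion_rate={rates[group]:.0%}")
```

Fixing the shape of the result first (one row per group, one rate per row) makes it easier to judge whether a later SQL query is answering the right question.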
90
So awesome, we visualized this output, we came up with this
91
metric. One thing I want to note in our analysis is
92
that, you know,
93
if we'd like to expand the analysis, we can actually analyze it by the call duration
94
or the number of calls. And so, uh, conversations
95
that have on average three audio chats
96
or 2x the total length of call time than other conversations are probably more likely to have transactions completed than something
97
that has, you know, on average one audio chat
98
and then, like, 30 seconds of call time instead of one minute of call time on average,
99
and those have a lower transaction rate. Let's say we're analyzing
100
call chat time, basically like 30 seconds, 60 seconds, 90 seconds,
101
and then we look at the metric that we care about, which is purchase
102
completion rates: 30 percent, 50 percent, 69 percent. Basically, we're seeing
103
that, you know, on average, if we bucket our call chat time
104
and, as the chat time goes longer, the purchase completion rates
105
go higher, then probably this audio feature is working for us.
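
The bucketing idea can be sketched as follows. This is a minimal example on invented data: the 30-second bucket width follows the spoken example, while the tuple layout `(total_call_seconds, purchased)` and the numbers themselves are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical conversations: (total_call_seconds, purchased).
conversations = [
    (12, 0), (25, 1), (28, 0),            # under 30 seconds
    (35, 1), (50, 0), (59, 1),            # 30 to 59 seconds
    (61, 1), (75, 1), (88, 0), (80, 1),   # 60 to 89 seconds
]

BUCKET_SECONDS = 30

def completion_rate_by_bucket(rows, bucket_seconds=BUCKET_SECONDS):
    """Group conversations into fixed-width call-time buckets and
    return {bucket_start_seconds: purchase completion rate}."""
    totals = defaultdict(int)
    purchases = defaultdict(int)
    for seconds, purchased in rows:
        bucket = (seconds // bucket_seconds) * bucket_seconds
        totals[bucket] += 1
        purchases[bucket] += purchased
    return {b: purchases[b] / totals[b] for b in sorted(totals)}

bucket_rates = completion_rate_by_bucket(conversations)

# True when every bucket has a strictly higher rate than the one before it.
ordered = list(bucket_rates.values())
trend_up = all(a < b for a, b in zip(ordered, ordered[1:]))

for bucket, rate in bucket_rates.items():
    print(f"{bucket:>3}s to {bucket + BUCKET_SECONDS - 1}s: {rate:.0%}")
print("rate rises with call time:", trend_up)
```

A rising trend across buckets is suggestive rather than conclusive, since buyers who talk longer may already have been more interested.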
106
So I just want to say that, out of all of our metrics
107
that we put together, one thing
108
that we do have to keep in mind is normalizing the data.
109
So usually we have to compare two equally interested groups of conversational buyers
110
so that we know
111
that the audio chat feature is actually making the difference. It's not the fact
112
that, because I make more calls, I'm already inherently more likely to complete a transaction.
113
And so a lot of that normalization, you know, we can't really do
114
that with our existing dataset, so I would just state
115
that assumption up front, so that you get extra brownie points with your interviewer.
116
And, you know, most analytics problems are all about causal inference,
117
so
118
if you mention, you know, causal inference, they're like,
119
oh cool, you know what causal inference is, let me check off this box, right? To be honest,
120
that is really what a lot of interviewers and interviews are like,
121
because of the fact that they have to do
122
so many interviews, they're just trying to speed it up so they can get to their lunch break.
123
All right, so let's just solve for this, uh, simple calculation first, right? We just want to know,
124
if they used the audio chat, did they actually increase their
125
purchase completion rate? First I'm going to write SELECT ... FROM
126
chats AS c LEFT JOIN marketplace_purchases AS
127
mp ON c.buyer_user_id = mp.buyer_user_id. Look at
128
that autocomplete. Oh my god, we need to pay our engineers more.
129
And c.seller_user_id is equal to mp.seller_user_id, uh, GROUP BY call_connected.
130
So when we group by call_connected, uh,
131
so as you can see, we're doing a double join, right? We
132
want to join the buyer of the chat
133
to the buyer of the marketplace purchase, and we want to join
134
the seller of the chat to the seller of the marketplace purchase.
135
Then we want to group by if the call was connected or not.
136
And then we want to grab the count of distinct marketplace purchase IDs divided by the count of distinct chats.
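
The first-pass query dictated above might look roughly like this when written out. This is a sketch run against an in-memory SQLite database so it is self-contained; the table layout, the `id` columns, and the sample rows are assumptions, since the transcript only names `buyer_user_id`, `seller_user_id`, `call_connected`, and `mp.id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chats (
    id INTEGER PRIMARY KEY,
    buyer_user_id INTEGER,
    seller_user_id INTEGER,
    call_connected INTEGER  -- 1 if the audio call connected, else 0
);
CREATE TABLE marketplace_purchases (
    id INTEGER PRIMARY KEY,
    buyer_user_id INTEGER,
    seller_user_id INTEGER
);
-- Made-up rows: exactly one chat per buyer-seller pair.
INSERT INTO chats VALUES
    (1, 10, 20, 1), (2, 11, 21, 1), (3, 12, 22, 0), (4, 13, 23, 0);
INSERT INTO marketplace_purchases VALUES
    (1, 10, 20), (2, 11, 21), (3, 12, 22);
""")

# Join on both buyer and seller, then compare distinct purchases
# to distinct chats within each call_connected group.
FIRST_PASS_QUERY = """
SELECT
    c.call_connected,
    COUNT(DISTINCT mp.id) * 1.0 / COUNT(DISTINCT c.id) AS purchase_completion_rate
FROM chats AS c
LEFT JOIN marketplace_purchases AS mp
    ON c.buyer_user_id = mp.buyer_user_id
   AND c.seller_user_id = mp.seller_user_id
GROUP BY c.call_connected
ORDER BY c.call_connected
"""

first_pass = conn.execute(FIRST_PASS_QUERY).fetchall()
for call_connected, rate in first_pass:
    print(call_connected, rate)
```

The `* 1.0` forces floating-point division, because SQLite would otherwise divide the two integer counts and truncate the result.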
137
And then I can just run this query.
138
Hopefully it'll run.
139
I have an error.
140
Of course, I have an error.
141
I didn't comment this out.
142
Let me comment this out.
143
Cool.
144
Awesome.
145
So as you can see below,
146
and I'll with my video.
147
The number of calls that were connected resulted in a higher conversion rate.
148
Obviously, this is fake data,
149
so of course it was going to.
150
So we're good, except I spot an error in our SQL query.
151
And if you want to pause the video,
152
maybe you guys can try to find it first before I go on.
153
Okay, let's unpause it.
154
So where is the error in my SQL query?
155
Basically, what I noticed was that if you look at chats,
156
technically you could have more than one chat with the same person, right?
157
So if I call this person once,
158
twice, they don't pick up.
159
Then I call the seller three times, they finally pick up.
160
Then technically, I'm actually grouping by call connected,
161
which would then mess up my data, right?
162
Because I'd have two zeros and then a one.
163
So if I group by that,
164
then this wouldn't really work out and this data is wrong.
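
The problem described here can be reproduced on a tiny example. The sketch below uses the same assumed table layout as the first-pass query (the `id` columns and all rows are invented) and shows a single purchase being credited to both the connected and the not-connected group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chats (id INTEGER PRIMARY KEY, buyer_user_id INTEGER,
                    seller_user_id INTEGER, call_connected INTEGER);
CREATE TABLE marketplace_purchases (id INTEGER PRIMARY KEY,
                    buyer_user_id INTEGER, seller_user_id INTEGER);
-- One buyer-seller pair: two missed calls, then a connected call.
INSERT INTO chats VALUES (1, 10, 20, 0), (2, 10, 20, 0), (3, 10, 20, 1);
-- The pair completes exactly one purchase.
INSERT INTO marketplace_purchases VALUES (1, 10, 20);
""")

rows = conn.execute("""
SELECT c.call_connected,
       COUNT(DISTINCT mp.id) AS purchases_counted,
       COUNT(DISTINCT c.id)  AS chats_counted
FROM chats AS c
LEFT JOIN marketplace_purchases AS mp
    ON c.buyer_user_id = mp.buyer_user_id
   AND c.seller_user_id = mp.seller_user_id
GROUP BY c.call_connected
ORDER BY c.call_connected
""").fetchall()

# Purchases credited across both groups versus purchases that really exist.
purchases_attributed = sum(purchases for _, purchases, _ in rows)
actual_purchases = conn.execute(
    "SELECT COUNT(*) FROM marketplace_purchases").fetchone()[0]

print(rows)
print("attributed:", purchases_attributed, "actual:", actual_purchases)
```

Because the join matches the purchase to every chat of the pair, the missed calls pick up a purchase that only happened after the connected call.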
165
So what I have to do instead is group by the distinct number of chats and then rejoin it.
166
So let's do that again.
167
So instead, let me think.
168
I still think I can reuse some of my data here.
169
So I'd say let's group by the buyer user ID,
170
the seller user ID.
171
And then let's actually run a function in here, where we want to know if there was at least one connection.
172
So instead, our actual formula means used audio chat at least once.
173
Because here, you know, technically,
174
if I'm like talking to this guy and I make two purchases,
175
you know, without the audio chat,
176
but then I use the audio chat in a separate purchase and I use it at least once,
177
but then I chat with him,
178
like create like three different chats with that same guy.
179
I want to know if using it at least once influences that
180
and gives me a higher purchase completion rate than never using it at all.
181
This is the bare bones analysis.
182
If I use this feature at least once, does it do anything?
183
Does it change up the completion rates at all?
184
So here we're going to just run a MAX on call_connected.
185
So this means that for every buyer-seller combo,
186
did they connect on a call at least once?
187
So I'll label this at least one connection, call connected.
188
Then next I'll just do COUNT(DISTINCT mp.id), same as this one, divided by the...
189
Actually, no. So instead I'll just use... I will also just group by 1 and 2. So, AS total purchases.
190
Then I'll wrap this in a subquery, WITH distinct chats AS this.
191
So we're grabbing the total purchases,
192
this is actually completed transaction.
193
So we know that we completed this transaction.
194
So now I want to group by this value and I can do that with the CTE.
195
So I'm grouping by this.
196
This is going to be distinct chats.
197
And then group by this value, one.
198
And then now I can actually just look at the average number of completed transactions.
199
And then the sum of completed transactions.
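
Put together, the corrected query might look something like the sketch below. It is an approximation of what is dictated, not a copy of the on-screen code: the names `distinct_chats`, `at_least_one_connection`, and `completed_transaction` follow the spoken labels, the 0/1 `CASE` flag is an assumption based on the later remark about averaging ones and zeros, and the tables and rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chats (id INTEGER PRIMARY KEY, buyer_user_id INTEGER,
                    seller_user_id INTEGER, call_connected INTEGER);
CREATE TABLE marketplace_purchases (id INTEGER PRIMARY KEY,
                    buyer_user_id INTEGER, seller_user_id INTEGER);
INSERT INTO chats VALUES
    (1, 10, 20, 0), (2, 10, 20, 0), (3, 10, 20, 1),  -- connects on third try
    (4, 11, 21, 1),                                  -- connects, no purchase
    (5, 12, 22, 0),
    (6, 13, 23, 0), (7, 13, 23, 0),
    (8, 14, 24, 0),                                  -- never connects, buys
    (9, 15, 25, 0);
INSERT INTO marketplace_purchases VALUES (1, 10, 20), (2, 14, 24);
""")

CORRECTED_QUERY = """
WITH distinct_chats AS (
    -- One row per buyer-seller pair (the video writes GROUP BY 1, 2).
    SELECT
        c.buyer_user_id,
        c.seller_user_id,
        MAX(c.call_connected) AS at_least_one_connection,
        CASE WHEN COUNT(DISTINCT mp.id) > 0 THEN 1 ELSE 0 END
            AS completed_transaction
    FROM chats AS c
    LEFT JOIN marketplace_purchases AS mp
        ON c.buyer_user_id = mp.buyer_user_id
       AND c.seller_user_id = mp.seller_user_id
    GROUP BY c.buyer_user_id, c.seller_user_id
)
SELECT
    at_least_one_connection,
    AVG(completed_transaction) AS purchase_completion_rate,
    SUM(completed_transaction) AS completed_transactions
FROM distinct_chats
GROUP BY at_least_one_connection
ORDER BY at_least_one_connection
"""

result = conn.execute(CORRECTED_QUERY).fetchall()
for connected, rate, completed in result:
    print(connected, rate, completed)
```

Collapsing to one row per pair before grouping means each pair lands in exactly one group, so a purchase can no longer be credited to both.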
200
Okay, cool.
201
So why did I make this completed transaction?
202
So basically for every combo,
203
what this query does is that it basically finds exactly if there is at least one call connected,
204
and then if there is a completed transaction.
205
So for example, you know,
206
one buyer and seller, they didn't connect in a call, no completed transaction.
207
Let's say they did complete at least one call and they did complete a transaction.
208
So this would be effectively just,
209
you know, one, because they're only doing one transaction either way. So once we take the average of this,
210
we're taking the average of ones and zeros, and
211
so that's how we're getting this value of 0.435, or sorry, 4.35
212
and 3.92. That's this. That's the data analytics case question. I
213
think this is a great kind of solution for now. I
214
think one last thing I'd like to do is, like, why don't we try
215
to do an analysis and see if, like, more call completions lead to higher purchase transaction rates.
216
So kind of that scenario that I was drawing above,
217
except this one is like if we have three audio chats instead of two audio chats instead of one audio chat,
218
like does the three audio chats lead to more completed transactions than the one audio chat?
219
If it does, then it probably means that audio chats are beneficial for this business.
220
So I'll actually leave that for the exercise for you guys.
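
One possible starting point for this exercise is sketched below. It is not the video's official solution: it simply counts connected calls per buyer-seller pair and compares completion rates across those counts, with the table layout, the CTE names `pair_calls` and `pair_purchases`, and every row invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chats (id INTEGER PRIMARY KEY, buyer_user_id INTEGER,
                    seller_user_id INTEGER, call_connected INTEGER);
CREATE TABLE marketplace_purchases (id INTEGER PRIMARY KEY,
                    buyer_user_id INTEGER, seller_user_id INTEGER);
INSERT INTO chats VALUES
    (1, 10, 20, 1), (2, 10, 20, 1), (3, 10, 20, 1),   -- 3 connected calls
    (4, 11, 21, 1), (5, 11, 21, 1), (6, 11, 21, 1),   -- 3 connected calls
    (7, 12, 22, 1),                                   -- 1 connected call
    (8, 13, 23, 1),                                   -- 1 connected call
    (9, 14, 24, 0),                                   -- 0 connected calls
    (10, 15, 25, 0), (11, 15, 25, 0);                 -- 0 connected calls
INSERT INTO marketplace_purchases VALUES (1, 10, 20), (2, 11, 21), (3, 12, 22);
""")

CALL_COUNT_QUERY = """
WITH pair_calls AS (
    -- Aggregate chats first so the later join cannot duplicate rows.
    SELECT buyer_user_id, seller_user_id,
           SUM(call_connected) AS connected_calls
    FROM chats
    GROUP BY buyer_user_id, seller_user_id
),
pair_purchases AS (
    SELECT DISTINCT buyer_user_id, seller_user_id
    FROM marketplace_purchases
)
SELECT
    pc.connected_calls,
    AVG(CASE WHEN pp.buyer_user_id IS NOT NULL THEN 1 ELSE 0 END)
        AS purchase_completion_rate,
    COUNT(*) AS pairs
FROM pair_calls AS pc
LEFT JOIN pair_purchases AS pp
    ON pc.buyer_user_id = pp.buyer_user_id
   AND pc.seller_user_id = pp.seller_user_id
GROUP BY pc.connected_calls
ORDER BY pc.connected_calls
"""

by_call_count = conn.execute(CALL_COUNT_QUERY).fetchall()
for connected_calls, rate, pairs in by_call_count:
    print(connected_calls, rate, pairs)
```

Reporting the number of pairs next to each rate helps flag buckets that are too small to read much into.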
221
If you guys want to see the answer,
222
just go to Interview Query,
223
go to the link below,
224
and then you'll see the answer in the solution bar right here,
225
because I'm going to finish that before this video comes out officially.
226
And I hope this was helpful.
227
I hope this video isn't all over the place.
228
I hope I have a really good video editor that helps me out with this stuff.
229
And please remember to like and subscribe on this video
230
so I can go through this exercise of doing more videos for you,
231
even though, you know, I'm jumping back into it.
232
So it's been a painful experience so far,
233
but I'm hoping I'll get better and better at it.
234
I just need your guys' support.
235
Awesome.
236
All right?
237
Cool.
238
Thanks.
239
Bye.


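Why practice speaking with this video

This video covers how to solve a data analytics case study problem. Through this topic, English learners can pick up the technical terms and phrases used in a real problem-solving process, and it also offers a good opportunity to simulate conversations in a professional setting. Using the shadow speak technique, you can improve your English conversation skills by repeating and practicing the video's content. This process is also very helpful for preparing for formal exams such as the IELTS Speaking test. In addition, the data analytics topic lets you learn business-related English expressions, which makes more productive conversations possible.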

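Grammar and expressions in context

Let's look at a few important grammatical structures and expressions from this video:

  • “How to tackle it”: This expression is used to explain how to solve a problem. 'Tackle' generally means dealing with a problem once it has come up, and it is common in everyday conversation.
  • “Measure the success”: This expression is used when discussing how to measure success. The verb 'measure' is very useful when evaluating the effect of an ongoing activity.
  • “Write a query”: Writing a query is one of the basic skills in data analytics. Technical terms like this can matter a great deal in real conversations as well.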

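Common pronunciation pitfalls

Let's look at a few pitfalls in pronunciation and intonation for words that come up often in the video:

  • “SQL”: This term can be pronounced 'ess-cue-el' or 'sequel'. Both pronunciations are in use, so what matters is saying it with confidence in context.
  • “Analytics”: This word can be especially hard to pronounce. Saying it as 'an-a-lyt-ics', with natural breaks between the syllables, makes it easier to understand.
  • “Feature”: Korean speakers sometimes render this word as 'pi-chyeo' or 'pi-chwo-eo'. Clear pronunciation is important, so it needs practice.

By recognizing and practicing these pitfalls, you can use the shadow speak technique effectively. Practical English conversation practice will help you build more confidence.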

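What is shadowing? A scientific method for building English skills quickly

Shadowing is a language learning technique originally developed for training professional interpreters and popularized by the polyglot scholar Dr. Alexander Arguelles. The core principle is simple but very powerful: while listening to a native speaker's English, you immediately repeat it aloud with a short delay of 1 to 2 seconds, following the speaker like a shadow. Unlike grammar study or passive listening, shadowing trains your brain and mouth muscles to process and reproduce English simultaneously in real time. According to research, this method greatly improves pronunciation accuracy, intonation, rhythm, linking, listening comprehension, and speaking fluency. It is especially effective for those preparing for IELTS Speaking and those who want to communicate naturally in English.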
