Shadowing Practice: Learn how to SOLVE a data analytics case study problem - Learn Spoken English with YouTube

C1
239 phrases
1
And then we ask the interviewer, "Hey, is that good
2
or not?"
3
And they're like, "No, do something a little bit more complicated."
4
Hi everyone, today I'm going to solve a data analytics case
5
study problem. First, we're going to go over what constitutes a
6
data analytics problem, a framework for how to tackle it,
7
and then a step-by-step explanation of diving into it.
8
So first off, what is a data analytics case study? For our purposes, I've defined data analytics case study problems
9
as any problem that requires (1) formulating metrics for a hypothetical scenario
10
and (2) writing a SQL query or analyzing a dataset in Python to retrieve those metrics.
11
For example,
12
the problem we're tackling today involves understanding how to build a
13
new audio chat feature to improve the match rate between car buyers and sellers in a marketplace app.
14
Now if you're thinking that data analytics case study questions sound very similar to product metrics type questions,
15
you're probably right.
16
And for all intents and purposes,
17
most product metrics questions fall under the umbrella of data analytics.
18
It's just that product metrics questions are more focused around product case studies.
19
So for example, it'd be something like,
20
how would you investigate a 10% drop in Uber ride requests, right?
21
Or let's say that we want to launch a new feature for Uber,
22
how would you analyze results?
23
So all those product metrics questions are focused around a specific product itself
24
and so data analytics can be a little bit more broad than
25
that. The second main difference is
26
that in data analytics case study questions, we actually expect the candidate to use SQL
27
or Python to actually implement their metrics, whereas in product metrics
28
case study questions, you're usually just kind of brainstorming different ideas,
29
clarifying the question, and giving out structured analysis without actually diving into the data itself,
30
because there is no data. It's a hypothetical scenario.
31
So in data analytics case studies, what you'll find is
32
that they'll actually give you an actual dataset. Many times this will happen in a take-home assignment
33
or an official phone screen,
34
or even at an on-site, where the interviewer will basically just give you a dataset
35
and a Python Jupyter notebook and be like, "Go to town." It's kind of nice
36
because basically it kills two birds with one stone.
37
For an interviewer, they're trying to assess how you do on SQL,
38
and at the same time they can also get a sense of your product intuition.
39
The main issue with these is that if you have bad product intuition and you then work on a SQL query,
40
you're either going to make your life really difficult by working on a SQL problem that's extremely difficult
41
or on the entirely wrong problem.
42
So let's just dive into this question and then see how it goes.
43
All right.
44
An online marketplace company has introduced a new feature
45
that allows potential buyers and sellers to conduct audio chats with each other prior to transacting.
46
Let's say we have two tables that represent this data.
47
So we have a chats table and we have a marketplace purchases table.
48
How would you measure the success of this new feature?
49
Write a query that can represent if this feature is successful or not.
50
How do we tackle this problem?
51
So the first thing we have to do is to treat this like it's a classic product metrics,
52
just any kind of analytics case study, right?
53
So we're doing stuff like we're clarifying,
54
we're assessing requirements, we're validating our solution.
55
Before that, we're proposing a solution and then we're validating it.
56
Yeah, let's just start out with that first.
57
And, you know, given the time constraints of most of these interviews,
58
I would say the most optimal thing
59
that we're going to be doing is actually proposing one metric and immediately coding that up.
60
I think in every instance,
61
we also want to keep this metric pretty simple.
62
I see a lot of candidates shoot themselves in the foot by first proposing like a really complicated metric
63
and then
64
when they actually have to code it up in SQL, they're
65
just spending, you know, like 40 minutes, the rest of the time, trying to get to a solution.
66
The best way, and I think always the best way, to approach this is iteration, right?
67
So we iterate on our approaches. We propose a simple solution,
68
and then we code it up, and we do that,
69
and we do that fast, and then we ask the interviewer, "Hey, is
70
that good or not?" And they're like, "No, do something a little bit more complicated."
71
And then you'll be like, "Okay, yeah, so I'll do something new now."
72
So again, let's try to iterate on all of our solutions here
73
and start out by keeping it simple. KISS framework, right? Keep
74
it simple, stupid. The second tip is to keep the metric super flexible for further analysis, okay?
75
So if we start out with just one simple solution, then we can actually make it flexible
76
so that later down the line
77
we are also not shooting ourselves in the foot again by having to rewrite our entire SQL query.
78
We want to be able to make it so
79
that we can actually write it in a way that is flexible for further solutions down the line.
80
Given this problem, and given
81
that we have to understand how a feature basically allows users to conduct audio chats with each other,
82
what I'd try here again is just to visualize exactly what the output is that I want.
83
So I want something that says "used audio chat,"
84
right? And then that is a one or a zero. Then I also want purchase completion rates,
85
and then this is going to be something like, you know, 50, and then this will be something like
86
25, right? So this is the output that I want to see, right?
87
If I'm a PM, I want to see that, oh, people
88
that used the audio chat completed their purchases at a higher rate than people
89
that did not use audio chat. Okay.
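The target output described here can be sketched in a few lines of Python before any SQL is written. This is only an illustration: the rows are made up, and the names `used_audio_chat`, `purchased`, and `completion_rate_by_group` are assumptions chosen for the sketch, not names from the video. The toy data is arranged so that it reproduces the speaker's example values of 50 and 25.

```python
# Toy rows, one per buyer-seller conversation (made-up data, not from the video).
# used_audio_chat: 1 if the pair ever connected on an audio call, else 0.
# purchased: 1 if the pair completed a purchase, else 0.
conversations = [
    {"used_audio_chat": 1, "purchased": 1},
    {"used_audio_chat": 1, "purchased": 0},
    {"used_audio_chat": 0, "purchased": 1},
    {"used_audio_chat": 0, "purchased": 0},
    {"used_audio_chat": 0, "purchased": 0},
    {"used_audio_chat": 0, "purchased": 0},
]


def completion_rate_by_group(rows):
    """Return {used_audio_chat: purchase completion rate in percent}."""
    totals, purchases = {}, {}
    for row in rows:
        key = row["used_audio_chat"]
        totals[key] = totals.get(key, 0) + 1
        purchases[key] = purchases.get(key, 0) + row["purchased"]
    return {key: 100.0 * purchases[key] / totals[key] for key in totals}


target_output = completion_rate_by_group(conversations)
for used, rate in sorted(target_output.items(), reverse=True):
    print(f"used_audio_chat={used}  purchase_completion_rate={rate:.0f}%")
```

Having the two-row shape fixed up front makes it easier to check later that a SQL query returns one row per group and a rate between 0 and 100.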
90
So awesome, we visualized this output and we came up with this
91
metric. One thing I want to note in our analysis is
92
that, you know,
93
if we'd like to expand the analysis, we can actually analyze it by the call duration
94
or the number of calls. And so, uh, conversations
95
that have on average three audio chats,
96
or 2x the total length of call time than other conversations, are probably more likely to have transactions completed than something
97
that has, you know, on average one audio chat
98
and then like 30 seconds of call time instead of one minute of call time on average,
99
and those have a lower transaction rate. Let's say we're analyzing
100
call chat time, basically like 30 seconds, 60 seconds, 90 seconds,
101
and then we look at the metric that we care about, which is purchase
102
completion rates: 30 percent, 50 percent, 69 percent. Basically, we're seeing
103
that, you know, on average, if we bucket our call chat time
104
and as the chat time goes longer the purchase completion rates
105
go higher, then probably this audio feature is working for us.
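The bucketing idea can be sketched as follows. The data is invented for illustration, and the 30-second bucket width, the tuple layout, and the function name `completion_rate_by_bucket` are assumptions for this sketch rather than anything shown in the video.

```python
from collections import defaultdict

# Made-up conversations: (total call time in seconds, purchased flag).
conversations = [
    (20, 0), (25, 1), (28, 0),   # fall in the 0-29s bucket
    (35, 1), (50, 0),            # 30-59s
    (65, 1), (70, 1), (85, 0),   # 60-89s
]

BUCKET_SECONDS = 30  # bucket width; an assumption for this sketch


def completion_rate_by_bucket(rows, width=BUCKET_SECONDS):
    """Map each bucket's lower bound (seconds) to its purchase completion rate."""
    counts = defaultdict(int)
    purchases = defaultdict(int)
    for call_seconds, purchased in rows:
        bucket = (call_seconds // width) * width
        counts[bucket] += 1
        purchases[bucket] += purchased
    return {bucket: purchases[bucket] / counts[bucket] for bucket in sorted(counts)}


rates = completion_rate_by_bucket(conversations)
for lower, rate in rates.items():
    print(f"{lower:>3}-{lower + BUCKET_SECONDS - 1}s: {rate:.0%}")

# The pattern the speaker is looking for: the rate never drops as calls get longer.
values = list(rates.values())
is_monotonic = all(a <= b for a, b in zip(values, values[1:]))
print("rate rises with call time:", is_monotonic)
```

A rising trend like this is suggestive only; as the speaker notes next, buyers who talk longer may simply have been more interested to begin with.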
106
So I just want to say that, out of all of the metrics
107
that we put together, one thing
108
that we do have to keep in mind is normalizing the data.
109
So usually we have to compare two equally interested groups of conversational buyers
110
so that we know
111
that the audio chat feature is actually making the difference. It's not the fact
112
that because I make more calls, I'm already inherently more likely to complete a transaction.
113
And so a lot of that normalization, you know, we can't really do
114
that with our existing dataset, so I would just make
115
that assumption up front so that you get extra brownie points with your interviewer.
116
And, you know, most of the analytics problems are all causal inference,
117
so
118
if you mention, you know, causal inference, they're like,
119
"Oh cool, you know what causal inference is, let me check off this box," right? To be honest,
120
that is really what a lot of interviewers and interviews are like,
121
because of the fact that they have to do
122
so many interviews, they're just trying to speed it up so they can get to their lunch break.
123
All right, so let's just solve for this, uh, simple calculation first, right? We just want to know,
124
if they used the audio chat, did they actually increase their
125
purchase completion rate? First I'm gonna write select from
126
chats as c left join marketplace purchases as
127
mp on c.buyer user id equals mp.buyer user id. Look at
128
that autocomplete. Oh my god, we need to pay our engineers more.
129
And c.seller user id is equal to mp.seller user id. Uh, group by call connected.
130
So when we group by the call connected, uh,
131
so as you can see, we're doing a double join, right? We
132
want to join the buyer of the chat
133
to the buyer of the marketplace purchase. We want to join
134
the seller of the chat to the seller of the marketplace purchase.
135
Then we want to group by if the call was connected or not.
136
And then we want to grab the count of distinct marketplace purchase IDs divided by the distinct number of chats.
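A minimal sketch of this first-pass query, run against an in-memory SQLite database through Python's `sqlite3` module. The table names `chats` and `marketplace_purchases` follow the problem statement; the column names (`id`, `buyer_user_id`, `seller_user_id`, `call_connected`) are inferred from what the speaker reads aloud, and the schema details and rows are made up. The SQL engine used in the video is not stated, so SQLite is a stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE chats (
        id INTEGER PRIMARY KEY,
        buyer_user_id INTEGER,
        seller_user_id INTEGER,
        call_connected INTEGER  -- 1 if the call connected, else 0
    );
    CREATE TABLE marketplace_purchases (
        id INTEGER PRIMARY KEY,
        buyer_user_id INTEGER,
        seller_user_id INTEGER
    );
    -- Made-up rows: exactly one chat per buyer-seller pair.
    INSERT INTO chats VALUES (1, 10, 20, 1), (2, 11, 21, 1), (3, 12, 22, 0), (4, 13, 23, 0);
    INSERT INTO marketplace_purchases VALUES (100, 10, 20), (101, 11, 21), (102, 12, 22);
""")

# Double join on buyer and seller, grouped by whether the call connected.
FIRST_PASS_QUERY = """
    SELECT
        c.call_connected,
        COUNT(DISTINCT mp.id) * 1.0 / COUNT(DISTINCT c.id) AS purchase_completion_rate
    FROM chats AS c
    LEFT JOIN marketplace_purchases AS mp
        ON c.buyer_user_id = mp.buyer_user_id
       AND c.seller_user_id = mp.seller_user_id
    GROUP BY c.call_connected
    ORDER BY c.call_connected
"""

first_pass = conn.execute(FIRST_PASS_QUERY).fetchall()
for call_connected, rate in first_pass:
    print(f"call_connected={call_connected}  purchase_completion_rate={rate:.2f}")
```

The `* 1.0` forces floating-point division in SQLite, which would otherwise truncate the ratio of two integer counts. The query only behaves as intended while each pair has a single chat, which is the weakness the speaker goes on to find.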
137
And then I can just run this query.
138
Hopefully it'll run.
139
I have an error.
140
Of course, I have an error.
141
I didn't comment this out.
142
Let me comment this out.
143
Cool.
144
Awesome.
145
So as you can see below,
146
and I'll move my video.
147
The number of calls that were connected resulted in a higher conversion rate.
148
Obviously, this is fake data,
149
so of course it was going to.
150
So we're good, except I spot an error in our SQL query.
151
And if you want to pause the video,
152
maybe you guys can try to find it first before I go on.
153
Okay, let's unpause it.
154
So where is the error in my SQL query?
155
Basically, what I noticed was that if you look at chats,
156
technically you could have more than one chat with the same person, right?
157
So if I call this person once,
158
twice, they don't pick up.
159
Then I call the seller a third time, they finally pick up.
160
Then technically, I'm actually grouping by call connected,
161
which would then mess up my data, right?
162
Because I'd have two zeros and then a one.
163
So if I group by that,
164
then this wouldn't really work out and this data is wrong.
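The problem can be reproduced with a single buyer-seller pair. The sketch below uses an assumed schema (column names inferred from the spoken query, rows made up) and SQLite through Python's `sqlite3` as a stand-in engine. The pair has two missed calls, one connected call, and exactly one purchase, yet grouping by `call_connected` credits that purchase to both groups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE chats (id INTEGER PRIMARY KEY, buyer_user_id INTEGER,
                        seller_user_id INTEGER, call_connected INTEGER);
    CREATE TABLE marketplace_purchases (id INTEGER PRIMARY KEY,
                        buyer_user_id INTEGER, seller_user_id INTEGER);
    -- One buyer-seller pair: two missed calls, then one connected call.
    INSERT INTO chats VALUES (1, 10, 20, 0), (2, 10, 20, 0), (3, 10, 20, 1);
    -- The pair completed exactly one purchase.
    INSERT INTO marketplace_purchases VALUES (100, 10, 20);
""")

rows = conn.execute("""
    SELECT c.call_connected,
           COUNT(DISTINCT mp.id) AS purchases,
           COUNT(DISTINCT c.id) AS chats
    FROM chats AS c
    LEFT JOIN marketplace_purchases AS mp
        ON c.buyer_user_id = mp.buyer_user_id
       AND c.seller_user_id = mp.seller_user_id
    GROUP BY c.call_connected
    ORDER BY c.call_connected
""").fetchall()

# Purchases credited across the two groups versus purchases that really exist.
purchases_credited = sum(purchases for _, purchases, _ in rows)
actual_purchases = conn.execute(
    "SELECT COUNT(*) FROM marketplace_purchases"
).fetchone()[0]

print(rows)
print("credited:", purchases_credited, "actual:", actual_purchases)
```

The same purchase shows up in the "not connected" group and the "connected" group, so the two rates are not comparing separate sets of conversations.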
165
So what I have to do instead is group by the distinct number of chats and then rejoin it.
166
So let's do that again.
167
So instead, let me think.
168
I still think I can reuse some of my data here.
169
So I'd say let's group by the buyer user ID,
170
the seller user ID.
171
And then let's run actually a function in here where we want to know if there was at least one connection.
172
So instead, our actual formula means "used audio chat at least once."
173
Because here, you know, technically,
174
if I'm like talking to this guy and I make two purchases,
175
you know, without the audio chat,
176
but then I use the audio chat in a separate purchase and I use it at least once,
177
but then I chat with him,
178
like create like three different chats with that same guy.
179
I want to know if at least using it once influenced
180
and gives me a higher purchase completion rate than never using it ever.
181
This is the bare bones analysis.
182
If I use this feature at least once, does it do anything?
183
Does it change up the completion rates at all?
184
So here we're going to just run a max on call connected.
185
So this means that for every buyer-seller combo,
186
did they at least connect on a call at least once?
187
So I'll label this at least one connection, call connected.
188
Then next I'll just do count distinct mp.id, same as this one, divided by the...
189
Actually, no. So instead I'll just use... I will also just group by 1 and 2. So, as total purchases.
190
Then I'll wrap this in a subquery with distinct chat as this.
191
So we're grabbing the total purchases,
192
this is actually completed transaction.
193
So we know that we completed this transaction.
194
So now I want to group by this value and I can do that with the CTE.
195
So I'm grouping by this.
196
This is going to be distinct chats.
197
And then group by this value, one.
198
And then now I can actually just look at the average number of completed transactions.
199
And then the sum of completed transactions.
200
Okay, cool.
201
So why did I make this completed transaction?
202
So basically for every combo,
203
what this query does is that it basically finds exactly if there is at least one call connected,
204
and then if there is a completed transaction.
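A sketch of how the corrected query might look, again using SQLite through Python's `sqlite3` with an assumed schema and made-up rows. The CTE name `distinct_chats` and the labels `at_least_one_connection` and `completed_transaction` follow the speaker's wording. Treating `completed_transaction` as a 0/1 flag per pair is an assumption based on the remark about averaging ones and zeros, and the explicit `GROUP BY` columns stand in for the positional `group by 1 and 2` mentioned in the video.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE chats (id INTEGER PRIMARY KEY, buyer_user_id INTEGER,
                        seller_user_id INTEGER, call_connected INTEGER);
    CREATE TABLE marketplace_purchases (id INTEGER PRIMARY KEY,
                        buyer_user_id INTEGER, seller_user_id INTEGER);
    -- Made-up rows. Pair (10, 20) has two missed calls and one connected call.
    INSERT INTO chats VALUES
        (1, 10, 20, 0), (2, 10, 20, 0), (3, 10, 20, 1),
        (4, 11, 21, 1),
        (5, 12, 22, 0), (6, 13, 23, 0), (7, 14, 24, 0), (8, 15, 25, 0);
    INSERT INTO marketplace_purchases VALUES (100, 10, 20), (101, 12, 22);
""")

PER_PAIR_QUERY = """
    WITH distinct_chats AS (
        -- One row per buyer-seller pair.
        SELECT
            c.buyer_user_id,
            c.seller_user_id,
            MAX(c.call_connected) AS at_least_one_connection,
            CASE WHEN COUNT(DISTINCT mp.id) > 0 THEN 1 ELSE 0 END AS completed_transaction
        FROM chats AS c
        LEFT JOIN marketplace_purchases AS mp
            ON c.buyer_user_id = mp.buyer_user_id
           AND c.seller_user_id = mp.seller_user_id
        GROUP BY c.buyer_user_id, c.seller_user_id
    )
    SELECT
        at_least_one_connection,
        AVG(completed_transaction) AS purchase_completion_rate,
        SUM(completed_transaction) AS completed_transactions
    FROM distinct_chats
    GROUP BY at_least_one_connection
    ORDER BY at_least_one_connection
"""

results = conn.execute(PER_PAIR_QUERY).fetchall()
for connected, rate, completed in results:
    print(f"at_least_one_connection={connected}  rate={rate:.2f}  completed={completed}")
```

Collapsing to one row per pair first means each pair lands in exactly one group, so the missed calls of pair (10, 20) no longer leak its purchase into the "never connected" group.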
205
So for example, you know,
206
one buyer and seller, they didn't connect in a call, no completed transaction.
207
Let's say they did complete at least one call and they did complete a transaction.
208
So this would be effectively just,
209
you know, one, because they're only doing one transaction either way. So once we take the average of this,
210
we're taking the average of ones and zeros, and
211
so that's how we're getting this value of 0.435, or sorry, 4.35
212
and 3.92. That's this. That's the data analytics case question. I
213
think this is a great kind of solution for now. I
214
think one last thing I'd like to do is, like, why don't we try
215
and do an analysis and see if, like, more call completions lead to higher purchase transaction rates.
216
So kind of that scenario that I was drawing above,
217
except this one is like if we have three audio chats instead of two audio chats instead of one audio chat,
218
like does the three audio chats lead to more completed transactions than the one audio chat?
219
If it does, then it probably means that audio chats are beneficial for this business.
220
So I'll actually leave that as an exercise for you guys.
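One possible attempt at this exercise is sketched below. It is not the solution from the video or from Interview Query, which is not shown in the transcript. The schema, the rows, and the CTE names `calls_per_pair` and `purchasing_pairs` are all assumptions, and the made-up data is deliberately arranged to show an upward trend.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE chats (id INTEGER PRIMARY KEY, buyer_user_id INTEGER,
                        seller_user_id INTEGER, call_connected INTEGER);
    CREATE TABLE marketplace_purchases (id INTEGER PRIMARY KEY,
                        buyer_user_id INTEGER, seller_user_id INTEGER);
    -- Made-up rows: pairs with 0, 1, or 2 connected calls.
    INSERT INTO chats VALUES
        (1, 10, 20, 1),
        (2, 11, 21, 1),
        (3, 12, 22, 1), (4, 12, 22, 1),
        (5, 13, 23, 1), (6, 13, 23, 1),
        (7, 14, 24, 0),
        (8, 15, 25, 0);
    INSERT INTO marketplace_purchases VALUES (100, 11, 21), (101, 12, 22), (102, 13, 23);
""")

CALL_COUNT_QUERY = """
    WITH calls_per_pair AS (
        -- Aggregate chats first so the later join cannot multiply rows.
        SELECT buyer_user_id, seller_user_id, SUM(call_connected) AS connected_calls
        FROM chats
        GROUP BY buyer_user_id, seller_user_id
    ),
    purchasing_pairs AS (
        SELECT DISTINCT buyer_user_id, seller_user_id
        FROM marketplace_purchases
    )
    SELECT
        cp.connected_calls,
        COUNT(*) AS pairs,
        AVG(CASE WHEN pp.buyer_user_id IS NOT NULL THEN 1.0 ELSE 0.0 END)
            AS purchase_completion_rate
    FROM calls_per_pair AS cp
    LEFT JOIN purchasing_pairs AS pp
        ON cp.buyer_user_id = pp.buyer_user_id
       AND cp.seller_user_id = pp.seller_user_id
    GROUP BY cp.connected_calls
    ORDER BY cp.connected_calls
"""

by_call_count = conn.execute(CALL_COUNT_QUERY).fetchall()
for connected_calls, pairs, rate in by_call_count:
    print(f"connected_calls={connected_calls}  pairs={pairs}  rate={rate:.2f}")
```

If the rate climbs as `connected_calls` grows, that would be consistent with the speaker's reading that audio chats help, subject to the same caveat about more interested buyers making more calls.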
221
If you guys want to see the answer,
222
just go to Interview Query,
223
go to the link below,
224
and then you'll see the answer in the solution bar right here,
225
because I'm going to finish that before this video comes out officially.
226
And I hope this was helpful.
227
I hope this video isn't all over the place.
228
I hope I have a really good video editor that helps me out with this stuff.
229
And please remember to like and subscribe on this video
230
so I can go through this exercise of doing more videos for you,
231
even though, you know, I'm jumping back into it.
232
So it's been a painful experience so far,
233
but I'm hoping I'll get better and better at it.
234
I just need your guys' support.
235
Awesome.
236
All right?
237
Cool.
238
Thanks.
239
Bye.


Context & Background

In this video, the speaker dives into the intricacies of solving a data analytics case study problem, particularly in the context of a new audio chat feature implemented in an online marketplace for buying and selling cars. The focus is on formulating metrics and utilizing SQL and Python to analyze datasets, which aligns perfectly with the growing need for data literacy in various fields. By exploring this topic, learners not only strengthen their analytical skills but also enhance their English language proficiency through specialized vocabulary and structured problem-solving techniques.

Top 5 Phrases for Daily Communication

  • Formulating metrics: This phrase is essential for discussing how to measure success in any given scenario.
  • Dive into it: A common expression suggesting a deep exploration of a subject or problem.
  • Analyze a dataset: This phrase is crucial when discussing data-driven decision-making.
  • Calculate success: A straightforward way to inquire about the effectiveness of an initiative.
  • Piece together the information: This expression indicates the process of gathering and understanding data.

Step-by-step Shadowing Guide

To effectively improve your English-speaking skills while grasping the complexities of data analytics, follow this structured shadowing technique:

  1. Watch and Listen: Start by watching the video a few times. Focus on the phrases and terminology used, such as "formulating metrics" and "analyze a dataset." This will help build your vocabulary related to data analytics.
  2. Repeat and Shadow: Use the shadowspeak method by repeating phrases immediately after hearing them. This not only improves your pronunciation but also reinforces your understanding of each term's context in a practical scenario like a data analytics case study.
  3. Break It Down: Take the time to analyze how the speaker approaches the problem. Write down the main steps taken to solve the case study, as this reinforces your ability to articulate similar processes in English.
  4. Practice Speaking: Engage in IELTS speaking practice by discussing the case study with a partner or speaking aloud. Focus on conveying the logic behind metrics formulation and how to analyze success with SQL and Python.
  5. Record and Review: Record yourself explaining the case study and the metrics used. Listen to the recording to identify areas for improvement in fluency and clarity, adapting your shadow speech skills to communicate effectively.

By applying this structured approach, you’ll not only enhance your English proficiency but also gain valuable analytical skills essential in today's data-driven world.

What Is the Shadowing Technique?

Shadowing is a science-based language learning technique originally developed for training professional interpreters. The principle is simple but powerful: you listen to native English and immediately repeat it aloud, like a shadow following the speaker with a delay of 1 to 2 seconds. Research shows significant improvement in pronunciation accuracy, intonation, rhythm, linking, listening comprehension, and fluency.
