Shadowing practice: Learn how to SOLVE a data analytics case study problem - Learn English speaking with YouTube


Hi everyone. Today I'm going to solve a data analytics case study problem.

First we're going to go over what constitutes a data analytics problem, then a framework for how to tackle it,

and then a step-by-step explanation of diving into it.

So first off, what is a data analytics case study? For our purposes, I've defined data analytics case study problems

as any problem that requires (1) formulating metrics for a hypothetical scenario

and (2) writing a SQL query or analyzing a dataset in Python to retrieve those metrics.

For example,

the problem we're tackling today involves understanding how to build a

new audio chat feature to improve the match rate between car buyers and sellers in a marketplace app.

Now if you're thinking that data analytics case study questions sound very similar to product metrics type questions,

you're probably right.

And for all intents and purposes,

most product metrics questions fall under the umbrella of data analytics.

It's just that product metrics questions are more focused around product case studies.

So for example, it'd be something like,

how would you investigate a 10% drop in Uber ride requests, right?

Or let's say that we want to launch a new feature for Uber,

how would you analyze results?

So all those product metrics questions are focused around a specific product itself,

and so data analytics can be a little bit more broad than that.

The second main difference is that in data analytics case study questions, we actually expect the candidate to use SQL

or Python to actually implement their metrics, whereas in product metrics

case study questions you're usually just brainstorming different ideas,

clarifying the question, and giving a structured analysis without actually diving into the data itself,

because there is no data; it's a hypothetical scenario.

So with data analytics case studies, what you'll find is

that they'll actually give you an actual dataset. Many times this will happen in a take-home assignment,

an official phone screen,

or even an on-site, where the interviewer will basically just give you a dataset

and a Python Jupyter notebook and be like, "Go to town." It's kind of nice

because basically it kills two birds with one stone:

if you're an interviewer, you're trying to assess how the candidate does on SQL,

and at the same time you can also get a sense of their product intuition.

The main issue with these is that if you have bad product intuition and you then work on a SQL query,

you're either going to make your life really difficult by working on a SQL problem that's extremely difficult

or on the entirely wrong problem.

So let's just dive into this question and then see how it goes.

All right.

An online marketplace company has introduced a new feature

that allows potential buyers and sellers to conduct audio chats with each other prior to transacting.

Let's say we have two tables that represent this data.

So we have a chats table and we have a marketplace purchases table.

How would you measure the success of this new feature?

Write a query that can represent if this feature is successful or not.

How do we tackle this problem?

So the first thing we have to do is treat this like it's a classic product metrics question,

just like any kind of analytics case study, right?

So we're doing stuff like clarifying,

we're assessing requirements, we're validating our solution.

Before that, we're proposing a solution and then we're validating it.

Yeah, let's just start out with that first.

And, you know, given the time constraints of most of these interviews,

I would say the most optimal thing

that we can do is actually propose one metric and immediately code that up.

I think in every instance,

we also want to keep this metric pretty simple.

I see a lot of candidates shoot themselves in the foot by first proposing a really complicated metric,

and then,

when they actually have to code it up in SQL, they're

just spending, you know, like 40 minutes, the rest of the time, trying to get to a solution.

The best way, and I think always the best way, to approach this is iteration, right?

So we iterate on our approaches: we propose a simple solution,

then we code it up, and we do that fast.

Then we ask the interviewer, "Hey, is that good or not?" And they're like, "No, do something a little bit more complicated,"

and then we'll be like, "Okay, yeah, so I'll do something new now."

So again, let's try to iterate on all of our solutions here

and start out by keeping it simple. The KISS framework, right? Keep it simple, stupid.

The second tip is to keep the metric super flexible for further analysis. Okay?

So if we start out with just one simple solution, then we can actually make it flexible,

so that later down the line

we are also not shooting ourselves in the foot again by having to rewrite our entire SQL query.

We want to be able to write it in a way that is flexible for further solutions down the line.

Given this problem, and given

that we have to understand how a feature that basically allows users to conduct audio chats with each other is performing,

what I try to do is just visualize exactly what the output is that I want.

So I want something that says "used audio chat,"

right, and that is a one or a zero. Then I also want purchase completion rates:

this is going to be something like, you know, 50 percent, and this will be something like

25 percent, right? So this is the output that I want to see.

If I'm a PM, I want to see that people

that used the audio chat completed their purchases at a higher rate than people

that did not use audio chat. Okay.

So, awesome. We visualized this output and we came up with this metric.

The thing I want to note in our analysis is

that, you know,

if we'd like to expand the analysis, we can actually analyze it by the call duration

or the number of calls. So conversations

that have, on average, three audio chats

or 2x the total length of call time of other conversations are probably more likely to have transactions completed than something

that has, you know, on average one audio chat

and like 30 seconds of call time instead of one minute of call time on average,

and those have a lower transaction rate.

Let's say we're analyzing call chat time, basically like 30 seconds, 60 seconds, 90 seconds,

and then we look at the metric that we care about, which is purchase completion rates: 30 percent, 50 percent, 69 percent.

Basically, we're seeing that, you know, on average, if we bucket our call chat time

and the purchase completion rates go higher as the chat time gets longer,

then this audio feature is probably working for us.
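
A minimal sketch of the bucketing idea described above, run through Python's built-in sqlite3 module. The rows are made up, and the `call_duration_seconds` column is an assumption for illustration, since the video does not show the real schema. Completion rate here is computed per buyer-seller pair, which is one possible reading of the metric.

```python
import sqlite3

# Toy, made-up data. call_duration_seconds is a hypothetical column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chats (
    id INTEGER PRIMARY KEY,
    buyer_user_id INTEGER,
    seller_user_id INTEGER,
    call_connected INTEGER,
    call_duration_seconds INTEGER
);
CREATE TABLE marketplace_purchases (
    id INTEGER PRIMARY KEY,
    buyer_user_id INTEGER,
    seller_user_id INTEGER
);
INSERT INTO chats VALUES
    (1, 10, 20, 1, 10), (2, 10, 20, 1, 15),
    (3, 11, 21, 1, 30),
    (4, 12, 22, 1, 50),
    (5, 13, 23, 1, 60),
    (6, 14, 24, 1, 80),
    (7, 15, 25, 1, 90);
INSERT INTO marketplace_purchases VALUES
    (100, 12, 22), (101, 14, 24), (102, 15, 25);
""")

BUCKET_QUERY = """
WITH pair_calls AS (
    -- Total call time per buyer-seller pair.
    SELECT buyer_user_id, seller_user_id,
           SUM(call_duration_seconds) AS total_call_seconds
    FROM chats
    GROUP BY buyer_user_id, seller_user_id
),
pair_purchases AS (
    -- One row per pair that completed at least one purchase.
    SELECT DISTINCT buyer_user_id, seller_user_id
    FROM marketplace_purchases
)
SELECT
    -- Round total call time up to the next 30-second bucket.
    ((pc.total_call_seconds + 29) / 30) * 30 AS call_time_bucket_seconds,
    AVG(CASE WHEN pp.buyer_user_id IS NOT NULL THEN 1.0 ELSE 0.0 END)
        AS purchase_completion_rate
FROM pair_calls AS pc
LEFT JOIN pair_purchases AS pp
    ON pc.buyer_user_id = pp.buyer_user_id
   AND pc.seller_user_id = pp.seller_user_id
GROUP BY call_time_bucket_seconds
ORDER BY call_time_bucket_seconds
"""

bucket_rows = conn.execute(BUCKET_QUERY).fetchall()
for bucket, rate in bucket_rows:
    print(f"up to {bucket}s of call time: completion rate {rate:.2f}")
```

Aggregating chats and purchases separately before joining keeps the summed call time from being multiplied when a pair has more than one purchase.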

So I just want to say that, out of all of the metrics

that we put together, one thing

that we do have to keep in mind is normalizing the data.

Usually we have to compare two equally interested groups of conversational buyers,

so that we know

that the audio chat feature is actually making the difference. It's not the fact

that because I make more calls, I'm already inherently more likely to complete a transaction.

And a lot of that normalization, you know, we can't really do

with our existing dataset, so I would just make

that assumption up front so that you get extra brownie points with your interviewer.

And, you know, most analytics problems are all causal inference.

If you mention that you know causal inference, they're like,

"Oh cool, you know what causal inference is. Let me check off this box," right? To be honest,

that is really what a lot of interviewers and interviews are like,

because they have to do

so many interviews that they're just trying to speed it up so they can get to their lunch break.

All right, so let's just solve for this simple calculation first, right? We just want to know:

if they used the audio chat, did they actually increase their purchase completion rate?

First I'm going to write SELECT, FROM chats AS c, LEFT JOIN marketplace_purchases AS mp

ON c.buyer_user_id = mp.buyer_user_id (look at that autocomplete; oh my god, we need to pay our engineers more)

AND c.seller_user_id = mp.seller_user_id, then GROUP BY call_connected.

So as you can see, we're doing a double join, right?

We want to join the buyer of the chat to the buyer of the marketplace purchase,

and we want to join the seller of the chat to the seller of the marketplace purchase.

Then we want to group by whether the call was connected or not.

And then we want to grab the distinct marketplace purchase IDs divided by the distinct number of chats.

And then I can just run this query.
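
A rough reconstruction of this first-pass query, run against a few made-up rows with Python's built-in sqlite3 module. The table and column names follow what is spoken in the video where possible; the `id` column on chats and the sample rows are assumptions for illustration, not the real dataset.

```python
import sqlite3

# Toy, made-up data: one chat per buyer-seller pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chats (
    id INTEGER PRIMARY KEY,
    buyer_user_id INTEGER,
    seller_user_id INTEGER,
    call_connected INTEGER
);
CREATE TABLE marketplace_purchases (
    id INTEGER PRIMARY KEY,
    buyer_user_id INTEGER,
    seller_user_id INTEGER
);
INSERT INTO chats VALUES
    (1, 10, 20, 1), (2, 11, 21, 0), (3, 12, 22, 1), (4, 13, 23, 0);
INSERT INTO marketplace_purchases VALUES
    (100, 10, 20), (101, 12, 22), (102, 13, 23);
""")

NAIVE_QUERY = """
SELECT
    c.call_connected,
    -- Multiply by 1.0 so the division is not truncated to an integer.
    COUNT(DISTINCT mp.id) * 1.0 / COUNT(DISTINCT c.id)
        AS purchase_completion_rate
FROM chats AS c
LEFT JOIN marketplace_purchases AS mp
    ON c.buyer_user_id = mp.buyer_user_id
   AND c.seller_user_id = mp.seller_user_id
GROUP BY c.call_connected
ORDER BY c.call_connected
"""

naive_rows = conn.execute(NAIVE_QUERY).fetchall()
for call_connected, rate in naive_rows:
    print(f"call_connected={call_connected}: completion rate {rate:.2f}")
```

With this toy data every pair has exactly one chat, so grouping by `call_connected` behaves as intended.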

Hopefully it'll run.

I have an error.

Of course I have an error.

I didn't comment this out.

Let me comment this out.

Cool.

Awesome.

So, as you can see below (and I'll move my video),

the calls that were connected resulted in a higher conversion rate.

Obviously, this is fake data,

so of course it was going to.

So we're good, except I spot an error in our SQL query.

And if you want to pause the video,

maybe you guys can try to find it first before I go on.

Okay, let's unpause it.

So where is the error in my SQL query?

Basically, what I noticed was that if you look at chats,

technically you could have more than one chat with the same person, right?

So if I call this person once, twice, and they don't pick up,

then I call the seller a third time and they finally pick up,

then technically I'm grouping by call connected,

which would then mess up my data, right?

Because I'd have two zeros and then a one.

So if I group by that,

then this wouldn't really work out, and this data is wrong.
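
To make the problem concrete, here is a small sketch with one made-up buyer-seller pair: two missed calls, one connected call, and a single purchase. The schema (including the `id` column on chats) is the same assumption as before. Grouping by `call_connected` credits that one purchase to both groups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chats (
    id INTEGER PRIMARY KEY,
    buyer_user_id INTEGER,
    seller_user_id INTEGER,
    call_connected INTEGER
);
CREATE TABLE marketplace_purchases (
    id INTEGER PRIMARY KEY,
    buyer_user_id INTEGER,
    seller_user_id INTEGER
);
-- Buyer 10 calls seller 20 three times; only the third call connects.
INSERT INTO chats VALUES (1, 10, 20, 0), (2, 10, 20, 0), (3, 10, 20, 1);
-- The pair completes exactly one purchase.
INSERT INTO marketplace_purchases VALUES (100, 10, 20);
""")

buggy_rows = conn.execute("""
SELECT
    c.call_connected,
    COUNT(DISTINCT mp.id) AS purchases,
    COUNT(DISTINCT c.id) AS chats
FROM chats AS c
LEFT JOIN marketplace_purchases AS mp
    ON c.buyer_user_id = mp.buyer_user_id
   AND c.seller_user_id = mp.seller_user_id
GROUP BY c.call_connected
ORDER BY c.call_connected
""").fetchall()

# The same purchase (id 100) shows up under call_connected = 0 and = 1.
for call_connected, purchases, chats in buggy_rows:
    print(f"call_connected={call_connected}: {purchases} purchase(s), {chats} chat(s)")
```

The "not connected" group appears to have a 50 percent completion rate even though the purchase only happened for a pair that did eventually connect.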

So what I have to do instead is group by the distinct chats and then rejoin it.

So let's do that again.

So instead, let me think.

I still think I can reuse some of my data here.

So I'd say let's group by the buyer user ID

and the seller user ID.

And then let's actually run a function in here, where we want to know if there was at least one connection.

So instead, our actual formula means "used audio chat at least once."

Because here, you know, technically,

if I'm talking to this guy and I make two purchases

without the audio chat,

but then I use the audio chat in a separate purchase, and I use it at least once,

and I create like three different chats with that same guy,

I want to know if using it at least once had an influence

and gives me a higher purchase completion rate than never using it ever.

This is the bare-bones analysis.

If I use this feature at least once, does it do anything?

Does it change up the completion rates at all?

So here we're going to just run a MAX on call_connected.

So this means that for every buyer-seller combo,

did they connect on a call at least once?

So I'll label this at_least_one_connection, from call_connected.

Then next I'll just do COUNT(DISTINCT mp.id), same as this one, divided by the...

Actually, no. Instead I'll just group by 1 and 2, and label this as total purchases.

Then I'll wrap this in a subquery, with distinct_chats as its name.

So we're grabbing the total purchases;

this is actually completed transactions.

So we know that we completed this transaction.

So now I want to group by this value, and I can do that with the CTE.

So I'm grouping by this.

This is going to be distinct_chats.

And then group by this value, 1.

And then now I can actually just look at the average number of completed transactions.

And then the sum of completed transactions.

Okay, cool.

So why did I make this completed transactions?

So basically, for every combo,

what this query does is find whether there is at least one call connected,

and then whether there is a completed transaction.
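
A sketch of the corrected query as described, again using sqlite3 and made-up rows. The CTE name `distinct_chats` and the label `at_least_one_connection` follow the narration; `completed_transactions`, the `id` column on chats, and the data are assumptions. The narration groups by positions 1 and 2; the sketch spells the column names out instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chats (
    id INTEGER PRIMARY KEY,
    buyer_user_id INTEGER,
    seller_user_id INTEGER,
    call_connected INTEGER
);
CREATE TABLE marketplace_purchases (
    id INTEGER PRIMARY KEY,
    buyer_user_id INTEGER,
    seller_user_id INTEGER
);
INSERT INTO chats VALUES
    (1, 10, 20, 0), (2, 10, 20, 0), (3, 10, 20, 1),  -- connects on 3rd try
    (4, 11, 21, 0),
    (5, 12, 22, 1),
    (6, 13, 23, 0),
    (7, 14, 24, 0),
    (8, 15, 25, 0);
INSERT INTO marketplace_purchases VALUES (100, 10, 20), (101, 14, 24);
""")

FIXED_QUERY = """
WITH distinct_chats AS (
    -- One row per buyer-seller pair.
    SELECT
        c.buyer_user_id,
        c.seller_user_id,
        MAX(c.call_connected) AS at_least_one_connection,
        COUNT(DISTINCT mp.id) AS completed_transactions
    FROM chats AS c
    LEFT JOIN marketplace_purchases AS mp
        ON c.buyer_user_id = mp.buyer_user_id
       AND c.seller_user_id = mp.seller_user_id
    GROUP BY c.buyer_user_id, c.seller_user_id
)
SELECT
    at_least_one_connection,
    AVG(completed_transactions) AS avg_completed_transactions,
    SUM(completed_transactions) AS total_completed_transactions
FROM distinct_chats
GROUP BY at_least_one_connection
ORDER BY at_least_one_connection
"""

fixed_rows = conn.execute(FIXED_QUERY).fetchall()
for connected, avg_completed, total_completed in fixed_rows:
    print(f"at_least_one_connection={connected}: "
          f"avg={avg_completed:.2f}, total={total_completed}")
```

Because each pair collapses to a single row before the final grouping, the pair with two missed calls and one connected call is counted only once, in the connected group.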

So for example, you know,

one buyer and seller: they didn't connect on a call, no completed transaction.

Let's say another pair did complete at least one call and they did complete a transaction.

So this would effectively be just,

you know, one, because they're only doing one transaction either way. So once we take the average of this,

we're taking the average of ones and zeros,

and that's how we're getting this value of 0.435, or sorry, 4.35

and 3.92. That's the data analytics case question.

I think this is a great kind of solution for now.

I think one last thing I'd like to do is, why don't we try to

do an analysis and see if more call completions lead to higher purchase transaction rates?

So kind of that scenario that I was drawing above,

except this one is like, if we have three audio chats instead of two audio chats instead of one audio chat,

do the three audio chats lead to more completed transactions than the one audio chat?

If they do, then it probably means that audio chats are beneficial for this business.

So I'll actually leave that as an exercise for you guys.
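
For anyone attempting the exercise, one possible starting point (not the official solution) is to count connected calls per buyer-seller pair and compare completion rates across those counts. The schema and rows below are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chats (
    id INTEGER PRIMARY KEY,
    buyer_user_id INTEGER,
    seller_user_id INTEGER,
    call_connected INTEGER
);
CREATE TABLE marketplace_purchases (
    id INTEGER PRIMARY KEY,
    buyer_user_id INTEGER,
    seller_user_id INTEGER
);
INSERT INTO chats VALUES
    (1, 10, 20, 1), (2, 10, 20, 1), (3, 10, 20, 1),
    (4, 11, 21, 1),
    (5, 12, 22, 1), (6, 12, 22, 1), (7, 12, 22, 1),
    (8, 13, 23, 1),
    (9, 14, 24, 0);
INSERT INTO marketplace_purchases VALUES
    (100, 10, 20), (101, 12, 22), (102, 13, 23);
""")

CALL_COUNT_QUERY = """
WITH pair_calls AS (
    -- Number of connected calls per buyer-seller pair.
    SELECT buyer_user_id, seller_user_id,
           SUM(call_connected) AS connected_calls
    FROM chats
    GROUP BY buyer_user_id, seller_user_id
),
pair_purchases AS (
    -- One row per pair that completed at least one purchase.
    SELECT DISTINCT buyer_user_id, seller_user_id
    FROM marketplace_purchases
)
SELECT
    pc.connected_calls,
    AVG(CASE WHEN pp.buyer_user_id IS NOT NULL THEN 1.0 ELSE 0.0 END)
        AS purchase_completion_rate
FROM pair_calls AS pc
LEFT JOIN pair_purchases AS pp
    ON pc.buyer_user_id = pp.buyer_user_id
   AND pc.seller_user_id = pp.seller_user_id
GROUP BY pc.connected_calls
ORDER BY pc.connected_calls
"""

call_count_rows = conn.execute(CALL_COUNT_QUERY).fetchall()
for connected_calls, rate in call_count_rows:
    print(f"{connected_calls} connected call(s): completion rate {rate:.2f}")
```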

If you guys want to see the answer,

just go to Interview Query,

go to the link below,

and then you'll see the answer in the solution bar right here,

because I'm going to finish that before this video comes out officially.

And I hope this was helpful.

I hope this video isn't all over the place.

I hope I have a really good video editor that helps me out with this stuff.

And please remember to like and subscribe

so I can go through this exercise of doing more videos for you,

even though, you know, I'm jumping back into it.

So it's been a painful experience so far,

but I'm hoping I'll get better and better at it.

I just need your guys' support.

Awesome.

All right?

Cool.

Thanks.

Bye.
