Beyond Code Coverage: Functionality Testing with Playwright (Marlene Mhangami, Microsoft)

Okay.

Hi everyone.

My name is Marlene and I am a senior developer advocate at both Microsoft and GitHub.

I work in a group called Core AI,

which looks at how developers are using AI across our products.

So this is kind of new.

To start off today, I wanted to show you some stats about GitHub from last year's GitHub Octoverse report in 2025,

which shows data about how developers are using GitHub.

What we saw from our report was that more code was added to GitHub last year than ever before.

So about a billion commits were pushed to the platform in 2025,

which is GitHub's most active year ever.

Okay?

What we know now in 2026 is that this growth is accelerating.

We haven't actually released any official stats yet, but a couple of days ago our COO, Kyle Daigle, tweeted that we're seeing about 275 million commits to the platform every week.

If we extrapolate that over time, we're going to see about 14 billion commits by the end of the year.

So that's 14 times the number of commits we saw last year, which, by the way, was our biggest year ever with a billion commits.
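The extrapolation is simple arithmetic and can be checked in a few lines. This is a minimal sketch that assumes the weekly rate stays flat at the quoted 275 million for a full 52 weeks; the constant and function names are illustrative and not from the talk.

```python
# Back-of-the-envelope check of the figures quoted in the talk.
# Assumption: the weekly rate stays flat at 275 million for a 52-week year.

WEEKLY_COMMITS = 275_000_000        # weekly figure quoted from the tweet
COMMITS_LAST_YEAR = 1_000_000_000   # "about a billion" commits in 2025
WEEKS_PER_YEAR = 52


def projected_annual_commits(weekly: int, weeks: int = WEEKS_PER_YEAR) -> int:
    """Extrapolate a flat weekly commit rate over a whole year."""
    return weekly * weeks


projected = projected_annual_commits(WEEKLY_COMMITS)
growth_factor = projected / COMMITS_LAST_YEAR

print(f"Projected commits: {projected / 1e9:.1f} billion")
print(f"Multiple of last year: {growth_factor:.1f}x")
```

Under that flat-rate assumption the projection lands slightly above the rounded "14 billion" figure used in the talk.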

One thing that we know is that there's a growing share of these commits that are co-authored by AI agents.

We haven't released the data yet, but we can actually track and see this.

Claude, for example, co-signs commits, and so does Copilot.

Codex doesn't, but we can also kind of track that based on some of the wording in the code.

I had a question when I saw all of this growth in terms of how much code we were seeing.

And that question is, does AI actually make developers more productive?

We're seeing all of this code.

Does it actually correlate with productivity?

One of the best resources I've seen that tries to answer this question is actually from a talk at AI Engineer last year.

This graph shares findings from that talk, which come from a Stanford University study of 120,000 developers.

The study found that while, yes, AI can make developers more productive, it's actually how the developers are using AI that matters the most.

So this graph from the study actually shows us that clean code bases amplify AI gains and AI productivity,

while unchecked AI in a code base is going to amplify entropy.

To illustrate this point, the speaker from this talk gave a case study of a company that used AI in an unchecked way in their codebase.

What you can see is that the number of PRs the team was pushing out increased, but at the same time the code quality decreased, and they actually spent a lot more time reworking and refactoring that code.

So overall, though there was an effective output increase of about 1%, AI didn't really improve this team's productivity.

So what we learned from this study is that a lot of the value that we want to see from AI as developers hinges on us having a clean codebase.

So for developers that are using AI tools, we want to focus on things like good test coverage, type coverage, good documentation, modularity, and so on.

So I'd actually also argue that we need to start standardizing some practices across our teams and across our industry.

This is a bit of a controversial topic, because some people at this conference believe in just closing their eyes and shipping.

And that's also okay.

But in my ideal world, and from the study we've seen, I would recommend standardized practices for keeping a codebase clean.

So how can developers create and maintain clean code?

This question is actually not a new question.

In our industry over time,

we have seen several methods that have tried to make maintaining a clean codebase a central part of their philosophy.

One of those approaches, which I've actually seen a lot of developers doing agentic coding with coding agents talk about, is test-driven development, or TDD.

Simon Willison, who's very popular,

just recently published a blog post about how he's using this specific flavor of TDD called red-green TDD.

Here, the first thing that happens is the developer gets an incoming feature request.

As soon as they get the request, they immediately start by writing a failing test, because the feature doesn't exist yet.

After that, the developer focuses on getting the test to pass.

In this green phase, historically, you should not be focusing on the quality of the code.

All you're focusing on is speed and getting the test to go green.

So in the past, developers would maybe copy code from Stack Overflow and so on to get the test to pass.

After that, the final phase is the refactor phase.

In this phase, you're just focusing on code quality.

So you're taking the code that you made pass and refactoring it so that it follows all the best practices.
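The three phases can be sketched in a few lines of plain Python. Everything here is hypothetical: the toy names, the `search_*` functions, and the single check are made up for illustration and are not code from the talk.

```python
# Red-green-refactor in miniature. All names and data are hypothetical.

TOYS = ["Furby", "Simon", "Etch A Sketch"]


def check_search(search) -> bool:
    """The behavioral test: searching 'fur' should find Furby, in any case."""
    try:
        return search("fur") == ["Furby"]
    except NotImplementedError:
        return False


# Red: the feature does not exist yet, so the test fails.
def search_red(term: str) -> list[str]:
    raise NotImplementedError("search is not built yet")


# Green: the quickest code that makes the test pass, quality ignored.
def search_green(term: str) -> list[str]:
    out = []
    for t in TOYS:
        if term.lower() in t.lower():
            out.append(t)
    return out


# Refactor: same behavior, cleaner code, guarded by the same test.
def search_refactored(term: str, toys: list[str] = TOYS) -> list[str]:
    needle = term.casefold()
    return [toy for toy in toys if needle in toy.casefold()]


for phase, impl in [("red", search_red), ("green", search_green),
                    ("refactor", search_refactored)]:
    print(phase, "passes" if check_search(impl) else "fails")
```

The same check runs against all three versions, which is what lets the refactor step change the code freely without changing the behavior.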

So not everyone is a fan of TDD.

And like many things in this industry,

TDD was pronounced dead in 2014.

One of the most common complaints that I've seen on the internet about TDD is that it focuses too much on code coverage with unit tests and that it doesn't actually test the system.

DHH, who created Rails, published a blog post in 2014 talking about this: that there is an over-focus on unit tests.

And we know that when we over-index on code coverage,

there are several issues that come up.

One of the issues is that there's a tendency to test implementation details.

Take an example like the one on the screen, where we have an order calculation with a discount.

If the test is tied directly to a method like calculate, simply renaming that method, even if the functionality is still fine, is going to break those unit tests.

So that's not going to be great.

But if we test the behavior of the system specifically, like the final price we're looking for, or we test against something like a stable contract, like our API or a module that we export and that doesn't change, the test should survive any refactors of our internal code.
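The difference between the two kinds of test can be sketched as follows. This is not the code shown on screen; the `OrderV1` and `OrderV2` classes, the helper names, and the prices are assumptions chosen to mirror the order-with-discount example.

```python
# Hypothetical order-with-discount example; all names are illustrative.


class OrderV1:
    def __init__(self, prices, discount):
        self.prices = prices
        self.discount = discount  # fraction, e.g. 0.10 for 10% off

    def _calculate(self):
        return sum(self.prices) * (1 - self.discount)

    def total(self):
        return round(self._calculate(), 2)


class OrderV2:
    """Same behavior after a refactor that renamed the internal helper."""

    def __init__(self, prices, discount):
        self.prices = prices
        self.discount = discount

    def _discounted_sum(self):  # renamed from _calculate
        return sum(self.prices) * (1 - self.discount)

    def total(self):
        return round(self._discounted_sum(), 2)


def implementation_test(order_cls) -> bool:
    """Brittle: reaches into an internal helper by name."""
    try:
        return round(order_cls([20.0, 30.0], 0.10)._calculate(), 2) == 45.0
    except AttributeError:
        return False


def behavior_test(order_cls) -> bool:
    """Stable: checks only the final price exposed by the public method."""
    return order_cls([20.0, 30.0], 0.10).total() == 45.0


for cls in (OrderV1, OrderV2):
    print(cls.__name__, implementation_test(cls), behavior_test(cls))
```

Both classes charge the same price, yet only the behavior test keeps passing after the rename.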

If you're interested in learning more about this and about behavior-driven TDD, I would recommend the talk by Ian Cooper called "TDD, Where Did It All Go Wrong."

It's a very good talk.

Another thing we see in the age of AI is that many developers are using AI to generate tests.

What they've noticed is that AI sometimes generates self-affirming tests.

So while the code coverage check might pass and your unit test suite is all green, the behavior of the system is not being validated.

And that's where the problem lies.
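A self-affirming test can be illustrated with a deliberately buggy function. This is a minimal sketch with made-up names and numbers; it only shows the pattern described above, not any test an agent actually produced.

```python
# Hypothetical example of a self-affirming test; names are illustrative.


def apply_discount(price: float, percent: float) -> float:
    """Buggy on purpose: subtracts the percent as if it were an amount."""
    return price - percent


def self_affirming_test() -> bool:
    # The expected value is derived from the code under test,
    # so this check passes no matter what the function returns.
    expected = apply_discount(200.0, 10.0)
    return apply_discount(200.0, 10.0) == expected


def behavior_test() -> bool:
    # The expected value comes from the requirement: 10% off 200 is 180.
    return apply_discount(200.0, 10.0) == 180.0


print("self-affirming:", self_affirming_test())
print("behavior:", behavior_test())
```

Both checks execute every line of `apply_discount`, so coverage looks identical, but only the second one notices the bug.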

So for the rest of this talk, I'm going to focus on showing you how you can avoid these problems and start to test for functionality using Playwright.

Playwright is an open source testing framework that's built by Microsoft.

It automates end-to-end testing in the browser by simulating user interactions.

The link that you see on the screen there is the documentation.

Playwright supports a number of different languages right now, including Python, TypeScript, and C#.

The example script that you can see on the screen is what a test would typically look like.

You have the line that says page.goto, telling the script that the toys page is where we want to start.

Then we're going to look for the search placeholder, and we're going to fill that search bar with the word Furby.

That will actually run the search for us automatically in the browser, for example.

You can also use headed or headless mode.

So you don't necessarily have to look at the browser while your tests are running.

You can actually just have them running in the background as well.
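The on-screen script is not part of this transcript, so the following is only a sketch of what such a test could look like with Playwright's Python sync API. The base URL, the `/toys` path, and the "Search" placeholder text are assumptions; `page.goto`, `get_by_placeholder`, `fill`, and the `headless` launch option are real Playwright calls.

```python
# Sketch of the steps described above using Playwright's Python sync API.
# Assumptions: the app runs at BASE_URL, the toys page lives at /toys, and
# the search input has the placeholder "Search". None of this is confirmed
# by the talk.

BASE_URL = "http://localhost:3000"  # hypothetical local dev server


def search_for_toy(page, term: str, base_url: str = BASE_URL) -> None:
    """Open the toys page and type a term into the search bar."""
    page.goto(f"{base_url}/toys")
    page.get_by_placeholder("Search").fill(term)


def run(headless: bool = True) -> None:
    """Launch Chromium and run the search.

    Requires `pip install playwright` followed by `playwright install`.
    Pass headless=False to watch the browser while the steps run.
    """
    from playwright.sync_api import sync_playwright  # imported lazily

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=headless)
        page = browser.new_page()
        search_for_toy(page, "Furby")
        browser.close()
```

Calling `run()` performs the steps in the background, while `run(headless=False)` opens a visible browser window.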

So going back to that idea of TDD, when we're using Playwright with AI, it should actually speed up the full TDD process for us.

A lot of developers in the past really complained that TDD is slow and that it's not effective for teams that want to move fast.

But if we have AI, then the red part and the green part are fast.

First we focus on getting our agents to generate these behavioral tests, the Playwright tests.

Then we focus on getting the agent to generate, as fast as it can, the code that's going to make the tests pass.

And then I would recommend that developers spend the most time on that refactoring stage, so it grows bigger.

So they're spending time looking at the code the agent has generated and making that code better.

There are a number of ways you can connect your coding agents with Playwright today.

One of those ways is through the Playwright MCP server.

You can use the CLI tool if you'd prefer that instead, or you can use something called Playwright Agents.

When you're using Playwright Agents, you'll run the command that you can see on the screen.

Once you run that command, it's going to install three agent.md files for you.

The first one is a planner, the second is a generator, and the third is a healer.

The planner will plan which tests to run, the generator is going to actually generate the tests, and then the healer will fix those tests for you.

So I do want to show you a demo, and I am going to hope the demo gods are smiling today.

So we will give this a try.

Oh, oh no. Okay.

Here we go.

So I want to give us a scenario.

The scenario is that... oh, you can't see my... you are only looking at my PowerPoint right now.

And I don't know how to stop that.

Let me close the PowerPoint maybe and see if that will help.

I don't want to just show this screen.

Sorry.

Hopefully they'll give me more time.

Okay.

Perfect.

That's working.

Okay.

Perfect.

So the scenario we're going to imagine today is that I'm a developer working at a toy company called Tailspin Toys.

A few days ago, I got an email from the search product management team, and they asked me to add some new search and filter features to the site.

They asked me to add a search bar with text search for simple searches and Azure AI Search for more complex ones.

They've also asked me to add a sidebar so customers can filter by category and price.

I'd like Copilot to help me with this task, and for us to use this TDD-first style of development.

So this is GitHub Copilot CLI.

The first thing we're going to do is try to get the agent to fetch the information we saw in that email and bring the features here into our terminal.

For this, we're going to use something called WorkIQ, which is a skill that Microsoft has developed that lets developers connect to the M365 suite, so Outlook, PowerPoint, whatever you would like, and bring that information here into the terminal.

So if you're using the M365 suite for work, I can definitely recommend it.

What I will also mention with TDD is that in the past, when we've done things like unit tests, what would typically trigger writing a unit test is adding a new method to a class.

But actually, in this new world, what we want to focus on is the behavior.

So we want to focus on a feature.

If a feature request comes in, that is the trigger for the test to be written.

So now we have our list of what actually needs to be developed.

I'm tossing in the second prompt, asking Copilot to help me develop these features using red-green TDD, starting by writing Playwright tests that fail for each feature.

And I'm telling it not to commit the changes, just for the sake of this example.

I do want to point out that the first thing the agent is going to do is start to examine my codebase.

So it's going to understand what's in my codebase.

I have the Playwright MCP server already installed in Copilot CLI, and it knows what it needs to do to create the tests for this functionality.

So the agent is going to understand what the codebase looks like and then write the tests for it.

212

So in the meantime, I'm going to switch over to a new tab.

213

And I'm going to run the command to get the playwright test.

214

So earlier today, I got the agent to generate those failing tests.

215

And then I got it to do the green phase where

216

the agent just creates the code to get the test to pass.

217

And then now I'm asking my agent to go ahead

218

and run the playwright tests to actually test for that search bar and filter feature for us.

219

So like I mentioned before,

220

I have the Playwright MCP server already installed.

221

You can see it installed here.

222

And our agent is just going to look for the test file.

223

And if everything works correctly,

224

it's going to start writing some tests, running some tests.

225

So we see it's opened the correct page.

226

typing in different inputs, which is testing for the search bar is working.

227

We saw Furby was correctly found,

228

Simon was correctly found, and now it's clicking buttons,

229

so also testing the category filter is working correctly.

230

Again, my hands are not on the keyboard.

231

This is all playwright and copilot, so super cool.

232

And now it's correctly finding all of the toys in the specific price range.

233

So when I run these functionality tests,

234

all I can see actively that,

235

OK, the agent has written this code.

236

The code is working as I expected.

237

The app is working as I expected.

238

So there's so many different ways that you can test your app by functionality,

239

and all of our tests pass.

240

Now, once our tests have passed,

241

that's when I would say we step into the next phase of actually going ahead and writing our refactor,

242

so refactoring the code the agent has created to generate these tests that pass.

So a final thing I will do is give you some best practices with Playwright.

The first thing I will say is that when Playwright runs those functionality tests, it's going to take screenshots of all the tests that it's run.

I've gotten into the practice of adding those screenshots to a PR.

So if I've made some changes, I'll add them to a PR.

The second thing is that you don't have to run it so that it launches the browser, like you saw in the example.

You can run it in headless mode so it runs in the background.

And then a final thing is, I would say commit your code before you actually get the agent to fix the tests.

Or commit before it starts to make changes to your code, because if you don't commit, it might not remember what happened in the past.

So that's something to do.

And then I would also say to generate one test per feature as well.

As a final note, these are some resources you could take a look at.

Ah, I forgot to add the link to the GitHub repo, but all of the slides are going to be available at that link there.

You can check out the documentation, and you can connect with me on social media as well.

So yeah, thanks everyone.

I think that's all the time.

I think we have two minutes for questions.

Does anyone have any questions about this?

Yes, I see a question there.

I used to say, I forget it with Storybook, because you need to set up a computer state.

How would you, like now, this example, people websites by [inaudible].

Yeah.

How would you... what are tips for more complex SaaS and admin panels where you have a lot of state management?

I mean, I think if you have a lot of state management, I would recommend using Playwright Agents, where it downloads the specific agent.md files, because those are going to have some specialized instructions that are better at handling state and things like that.

I've found that Playwright Agents specifically has a lot of good instructions already built in that should help with that.

Another thing you could do, if you didn't want to use Playwright for everything, is directly test your APIs.

If there is an API available, that's something you could do.
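One way to test an API directly is Playwright's own request context, which sends HTTP requests without opening a page. This is a sketch under assumptions: the `/api/toys` endpoint, its `q` parameter, and the JSON shape are hypothetical, since the talk does not describe the demo app's API.

```python
# Sketch of testing an API directly with Playwright's request context.
# Assumptions: a hypothetical GET /api/toys?q=<term> endpoint that returns
# a JSON list of objects with a "name" field.


def api_search_names(api, term: str) -> list[str]:
    """Query the search endpoint and return the toy names it reports."""
    response = api.get("/api/toys", params={"q": term})
    assert response.ok, f"unexpected status {response.status}"
    return [toy["name"] for toy in response.json()]


def run(base_url: str = "http://localhost:3000") -> None:
    """Run the check against a live server (base_url is hypothetical)."""
    from playwright.sync_api import sync_playwright  # imported lazily

    with sync_playwright() as p:
        api = p.request.new_context(base_url=base_url)
        assert "Furby" in api_search_names(api, "furby")
        api.dispose()
```

Because no browser is launched, a check like this tends to run faster than a full end-to-end test, at the cost of not exercising the UI.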

So yeah, that's what I would recommend.

Are there any other questions?

Maybe one more question.

I'm not sure.

Yeah.

Can Playwright also check different sizes, like from desktop size to, like...?

Yeah.

Yes, yes, it can.

It can check mobile versus desktop.

It should just work.
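Checking several screen sizes can be sketched by opening one browser context per viewport. The URL, the two sizes, and the screenshot file names below are assumptions for illustration; `new_context(viewport=...)`, `new_page`, `goto`, and `screenshot` are real Playwright calls, and `browser` is expected to be a Playwright Browser object.

```python
# Sketch: load the same page at a desktop and a mobile viewport size.
# Assumptions: the sizes and screenshot file names are illustrative only.

VIEWPORTS = {
    "desktop": {"width": 1280, "height": 720},
    "mobile": {"width": 390, "height": 844},
}


def capture_at_sizes(browser, url: str, viewports: dict = VIEWPORTS) -> list[str]:
    """Open `url` once per viewport and save a screenshot for each."""
    saved = []
    for name, size in viewports.items():
        context = browser.new_context(viewport=size)
        page = context.new_page()
        page.goto(url)
        path = f"toys-{name}.png"
        page.screenshot(path=path)
        saved.append(path)
        context.close()
    return saved
```

The saved screenshots are also the kind of artifact that can be attached to a PR.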

Yeah, one more.

Yeah.

Is it browser-based checking, or also for a Mac app or an iPhone app?

It's browser-based for the moment.

Yeah, for the moment, it's only browser-based.

Yeah.

OK, I think that is all the time I have for today.

Thanks, everyone.

Sorry about that.

No link to the GitHub.

Thank you.
