Shadowing Practice: Introduction to Dimensionality Reduction (a.k.a. Manifold Learning) - Learn to Speak English with YouTube

Hi everyone, my name is Bing Brunton and I'm a professor at the University of Washington in Seattle.

In this next series of lectures we're going to be talking about one of my favorite topics, nonlinear dimensionality reduction, sometimes also known as manifold learning.

So what is this field about?

What is actually a manifold?

We're going to be breaking it down and going through some of the most popular ways of performing nonlinear dimensionality reduction.

We're also going to be giving some examples and talking about different ways in which you can, and also cannot, use manifold learning as a tool in your work.

So the basic tenet of manifold learning and nonlinear dimensionality reduction is that even if you have really, really big and complicated data, patterns do in fact exist in the data.

We believe this to be true because, after all, if the patterns don't exist, why are we bothering to collect this data in the first place?

So we believe, as a starting point, that patterns exist in complicated data; we have to believe this.

Now, when we're talking about dimensionality reduction, part of the problem is that we have to be able to visualize the data.

Now, I and hopefully most of you are humans who are constrained to walk around in this three-dimensional world.

And a lot of my visual intuition is in two dimensions, so things that can be done on a piece of paper or at least on a computer screen.

And so if I have data that is higher than two- or three-dimensional, and that's most data sets (the next video is going to be all about examples and intuition about high-dimensional data sets), we have a problem, which is that the data is there and the patterns, we believe, do exist in the data, but I can't actually see them.

And for me, because I'm a really visual person, I like seeing the data.

So a lot of the challenge in manifold learning is figuring out what the patterns actually are so that we can actually see them and gain some intuition for the data.

So I found these rare earth magnets in my office and it's one of my favorite office toys to play with.

So I brought them as a prop to show you what kind of high dimensional data might look like.

So let's say your data is like this little ball of little magnets here.

It's roughly smashed into a little ball.

And so in order to describe all of the data points on this little ball, you kind of need all three dimensions because we exist in a three-dimensional world.

Now on the other hand, let's say your data looked more like something that I smashed up into this little ribbon here.

Okay, now if your data looks more like this, where it's a little ribbon, what you can see here is that even though this toy, just like the other toy, exists in a three-dimensional world, it is in fact lying on a surface.

Okay?

Now, the surface could be a flat surface, like you can describe it by a plane.

All right, so in linear algebra we call this a subspace because it's planar and it's flat.

And that's great because you can use linear dimensionality reduction techniques to describe this plane.

So I can rotate it however I want in three-dimensional space.

It's still kind of on a plane, and I can describe that plane.

This is a simpler way of describing my data.
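As a rough illustration of describing a tilted plane with fewer numbers, here is a minimal sketch. It assumes the plane passes through the origin and that an orthonormal basis for it (the hypothetical vectors `u` and `v`) is already known; finding such a basis from the data is the job of a linear method such as PCA, which is not shown here.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical orthonormal basis for a tilted plane through the origin in 3D.
u = (1 / math.sqrt(2), 1 / math.sqrt(2), 0.0)
v = (0.0, 0.0, 1.0)

def embed(a, b):
    """Place plane coordinates (a, b) into 3D as a*u + b*v."""
    return tuple(a * ui + b * vi for ui, vi in zip(u, v))

def to_plane_coords(p):
    """Recover the 2D plane coordinates of a 3D point by projecting onto u and v."""
    return (dot(p, u), dot(p, v))

# Three made-up data points that all lie on the plane.
points_3d = [embed(a, b) for a, b in [(1.0, 2.0), (-0.5, 0.3), (3.0, -1.0)]]
coords_2d = [to_plane_coords(p) for p in points_3d]

# Rebuilding the 3D points from only two numbers each loses nothing.
reconstructed = [embed(a, b) for a, b in coords_2d]
max_error = max(
    abs(x - y) for p, q in zip(points_3d, reconstructed) for x, y in zip(p, q)
)
```

Each point needs three numbers in the original space but only two once the plane is known, and `max_error` stays at floating-point noise because the points really do lie on the plane.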

The problem arises if the plane becomes warped in some way.

So let's say I can make a little bracelet out of it, and now it's a little ring, okay?

Or maybe it's not connected, and it's just kind of curvy like this, okay?

This does not fundamentally change the fact that all of the data points on my little toy are still on a flat-ish surface.

I can flatten it, it's locally flat, just like the surface of the Earth is locally flat, this Earth that we all walk on.

But if you zoom out and look at it, you can see that you actually do need three dimensions to describe the data set, but locally, it's lying on this flat-ish curved surface.

That's kind of roughly speaking, if I'm waving my hands around, what a manifold is.

It's a description of something that is approximately flat, if you look closely enough, but globally, it might be curved.

And so if I can learn what this curved surface is, then I'm able to describe my data much more simply by describing the curve and then figuring out where my data points are on this curve, without having to use all three dimensions.
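To make the idea of "describing the curve, then locating points on it" concrete, here is a small sketch. It simplifies the bracelet to an idealized thin ring of known radius lying in the x-y plane; the radius, the function names, and the sample angles are all assumptions made for illustration.

```python
import math

RADIUS = 2.0  # assumed known radius of the idealized ring

def on_ring(theta):
    """3D position of the point at angle theta on the ring."""
    return (RADIUS * math.cos(theta), RADIUS * math.sin(theta), 0.0)

def ring_coordinate(p):
    """Single number (angle in [0, 2*pi)) locating a 3D point on the ring."""
    return math.atan2(p[1], p[0]) % (2 * math.pi)

thetas = [0.1, 1.5, 3.0, 5.5]           # made-up positions along the ring
points = [on_ring(t) for t in thetas]   # three numbers per point
recovered = [ring_coordinate(p) for p in points]  # one number per point
```

Once the curve itself is known, one angle per point carries the same information as the three original coordinates.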

Now, the idea here is that we need to be able to reduce and visualize the data.

So here's like a physical prop of visualizing my data set.

Most data sets are not something that you can play with as a desk toy.

And so the goal of manifold learning is to reduce the data.

We need to reduce and visualize the data.

We want to reduce it because we suspect that patterns do in fact exist, so we can describe it more simply.

And we want to visualize it because humans are really intuitive visual creatures.

And so when we can see something, we believe in it and we can actually see patterns in it that wouldn't have been obvious otherwise.

And the reason we want to do this is because we want to gain intuition.

And we want to communicate to ourselves and to each other about what we've actually got.

One of the most compact ways of communicating your data is being able to make a really compelling visualization.

Now, the trick here with dimensionality reduction and manifold learning is how do we do this, right?

How do we actually pick out patterns that exist in the data set and reduce and visualize them?

So it turns out that, hopefully I can demonstrate again with this little toy here, it has to do with this notion of what's actually close, like what's similar to each other.

Like, am I similar to my cousin more than some random person on the street?

Probably.

But, like, how do you define that?

So let's say that we have this curved surface here, okay, my little toy.

And you can see that it's lying on this flattish surface, this curved surface.

And so two neighboring points on the surface are close to each other because they're actually touching each other.

They're close to each other, right?

So we kind of want to say that if we're going to reduce the dimensionality of my data set here, my little ring, my little bracelet I've made, I want the points that are closer in the original data set to also end up closer in my reduced learned manifold, in my reduced dimensionality space.

And points that are farther apart should also end up farther apart in my reduced space, so that I haven't lost information.

So things that used to be similar should end up closer together in my reduced space.

And things that are farther apart, less similar, should also end up less similar and farther apart in my reduced space.
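One simple way to check whether close points stay close is to compare each point's nearest neighbor before and after the reduction. The sketch below is only one possible check, not a standard named metric; the helper names, the arc of sample points, and the deliberately scrambled reduction are assumptions for illustration.

```python
import math

def nearest_neighbor(points, i):
    """Index of the point closest to points[i], excluding itself."""
    others = [j for j in range(len(points)) if j != i]
    return min(others, key=lambda j: math.dist(points[i], points[j]))

def neighbor_agreement(original, reduced):
    """Fraction of points whose nearest neighbor is the same in both spaces."""
    n = len(original)
    same = sum(
        nearest_neighbor(original, i) == nearest_neighbor(reduced, i)
        for i in range(n)
    )
    return same / n

# Made-up data: unevenly spaced points along an arc in 3D.
thetas = [0.0, 0.3, 0.7, 1.2, 1.8, 2.5]
original = [(math.cos(t), math.sin(t), 0.0) for t in thetas]

# A reduction that keeps the position along the arc as the single coordinate.
good = [(t,) for t in thetas]
# A reduction that assigns 1D positions with no regard for the original layout.
scrambled = [(0.0,), (5.0,), (1.0,), (4.0,), (2.0,), (3.0,)]

good_score = neighbor_agreement(original, good)
scrambled_score = neighbor_agreement(original, scrambled)
```

The reduction that follows the arc keeps every nearest neighbor intact, while the scrambled one does not, which is the sense in which the second has lost information.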

The problem then becomes, how do I actually define that?

So how do you actually compute distances in high dimensional spaces?

And what is the most compact way of doing it?

What's the most convenient way of doing it?

These are decisions we have to make.

So part of what we're going to be learning in the next couple of lectures is common ways of defining distances and similarity.

So the punchline here is that there's no one right way of doing it.

This is a decision that one makes, and I'm going to tell you about some of the most common ways of doing it that seem to work well for different types of data sets, and how you make these kinds of decisions.
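To see why the choice of distance is a real decision, here is a small sketch comparing two common measures, Euclidean distance and cosine distance, on made-up 2D points. The points and their labels are hypothetical and chosen so that the two measures disagree.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.dist(a, b)

def cosine_distance(a, b):
    """One minus the cosine of the angle between two nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))

query = (1.0, 0.0)
candidates = {
    "same_direction_far": (3.0, 0.0),  # points the same way, but is far away
    "nearby_tilted": (1.0, 1.0),       # physically close, but points elsewhere
}

closest_euclidean = min(candidates, key=lambda k: euclidean(query, candidates[k]))
closest_cosine = min(candidates, key=lambda k: cosine_distance(query, candidates[k]))
```

Euclidean distance picks the nearby tilted point and cosine distance picks the far point in the same direction, so which one counts as "more similar" depends entirely on the measure chosen.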

And then also, what do we mean by more similar?

Like all similarities, are they all equally important?

So for example here, I can compute kind of like grid size distances between any two of these points on my data set here, okay?

But you can kind of see that the data points over here are close in physical 3D space, in this studio space, to the points over here, right?

Because they're actually really close to each other.

Is that the same?

Does that matter as much as the fact that you actually have to go...

They're not actually connected.

They're not actually touching each other as little magnets.

Does that matter?

Because if you had to count connected magnets, you'd have to go all the way up here to get to the other side.

And is that notion of distance more important than the fact that, as the fly flies, you can get right over there?

These are all valid notions of similarity and distance, but are they all equally important in the context of manifold learning?

That's something that we're going to be talking about.
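The two notions of distance on the open bracelet can be compared numerically. This is a minimal sketch that models the magnets as 60 points on a unit ring with a small gap between the two ends; the number of magnets, the ring radius, and the gap size are all assumptions.

```python
import math

N_MAGNETS = 60   # assumed number of magnets in the chain
GAP = 0.2        # assumed angular gap (radians) between the two ends

# Magnets spaced evenly around a unit ring, leaving a small opening.
angles = [
    GAP / 2 + i * (2 * math.pi - GAP) / (N_MAGNETS - 1) for i in range(N_MAGNETS)
]
magnets = [(math.cos(a), math.sin(a), 0.0) for a in angles]

# Distance straight across the gap, through the air.
straight_line = math.dist(magnets[0], magnets[-1])

# Distance hopping from magnet to touching magnet, all the way around.
along_chain = sum(
    math.dist(magnets[i], magnets[i + 1]) for i in range(N_MAGNETS - 1)
)
```

The two end magnets are very close through the air yet almost a full circumference apart along the chain, and which of those two numbers an algorithm uses changes what it treats as neighbors.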

And then there's this idea that points that start out close together should end up close together.

Well, what does ending up close together mean?

How do we interpret the fact that we end up with some kind of visualization of a beautiful manifold?

And can we actually interpret two points that are closer together in the manifold space as being actually more similar in an interpretable, meaningful, engineering-relevant way?

This is something that we'll discuss as well, because it varies depending on the algorithm you use and also on your notion of distance.

So we're going to dig right into it in the next lecture.

But I'm going to leave you with the idea that manifolds are everywhere.

We're going to do some manifold explaining right now.

But there's no one right way of doing manifold learning.

We're going to start by looking at some linear methods first and trying to draw connections between our notions of similarity and distance and some linear algebra and some linear dimensionality reduction methods that you've already heard about earlier in the series.

And then we're going to generalize these concepts to talk about nonlinear dimensionality reduction, with some examples, and to build some intuition.

What Is the Shadowing Technique?

Shadowing is a research-backed language learning technique originally developed for training professional interpreters. The method is simple but powerful: you listen to native English audio and immediately repeat it aloud, like a shadow following the speaker with a 1-2 second delay. Research shows significant improvement in pronunciation accuracy, intonation, rhythm, connected speech, listening comprehension, and speaking fluency.
