Shadowing Practice: Introduction to Dimensionality Reduction (a.k.a. Manifold Learning) - Learn to Speak English with YouTube

C1
123 sentences
1
Hi everyone, my name is Bing Brunton and I'm a professor at the University of Washington in Seattle.
2
In this next series of lectures we're going to be talking about one of my favorite topics, nonlinear dimensionality reduction, sometimes also known as manifold learning.
3
So what is this field about?
4
What is actually a manifold?
5
We're going to be breaking it down and going through some of the most popular ways of performing nonlinear dimensionality reduction.
6
We're also going to be giving some examples
7
and talking about different ways in which you can
8
and also cannot use manifold learning as a tool in your work.
9
So the basic tenet of manifold learning and nonlinear dimensionality reduction is
10
that even if you have really, really big
11
and complicated data, patterns do in fact exist in the data. We believe this to be true
12
because, you know, after all,
13
if the patterns don't exist, why are we bothering to collect this data in the first place?
14
So we believe, as a starting point,
15
that patterns exist in complicated data. So we have to believe this.
16
Now, when we're talking about dimensionality reduction, part of the problem is that we have to be able to visualize the data.
17
Now, I and hopefully most of you are humans who are constrained to walk around in this three-dimensional world.
18
And a lot of my visual intuition is in two dimensions, so things that can be done on a piece of paper
19
or at least on a computer screen. And so
20
if I have data that is higher than two- or three-dimensional,
21
and that's most data sets we have (the next video is going to be all about examples
22
and intuition about high-dimensional data sets), then we have a problem,
23
which is that the data is there
24
and the patterns do exist in the data, we believe that to be true, but I can't actually see it.
25
And for me, because I'm a really visual person, I like seeing the data.
26
So a lot of the challenge in manifold learning is figuring out what the patterns actually are
27
so that we can actually see them and gain some intuition for the data.
28
So I found these rare earth magnets in my office and it's one of my favorite office toys to play with.
29
So I brought them as a prop to show you what kind of high dimensional data might look like.
30
So let's say your data is like this little ball of little magnets here.
31
It's roughly smashed into a little ball.
32
And so in order to describe all of the data points on this little ball, you kind of need all three dimensions because we exist in a three-dimensional world.
33
Now on the other hand, let's say your data looked more like something that I smashed up into this little ribbon here.
34
Okay, now if your data looks more like this, okay, where it's a little ribbon, what you can see here is that even though this toy,
35
just like the other toy, exists in a three-dimensional world, it is, in fact, lying on a surface.
36
Okay?
37
Now, the surface could be a flat surface, like you can describe it by a plane.
38
All right, so in linear algebra we call this a subspace, because it's planar and it's flat.
39
And that's great because you can use linear dimensionality reduction techniques to describe this plane.
40
So I can rotate it however I want in three-dimensional space.
41
It's still kind of on a plane, and I can describe that plane.
42
This is a simpler way of describing my data.
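As a rough illustration of describing planar data more simply, the sketch below takes 3-D points that lie on a flat plane through the origin and rewrites each one with only two numbers. The plane's spanning vectors `u` and `v`, the sample points, and all function names are assumptions made up for this example. A linear dimensionality reduction method would estimate the plane from the data, whereas here it is simply given.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthonormal_basis(u, v):
    """Gram-Schmidt: turn two spanning vectors into an orthonormal pair."""
    norm_u = math.sqrt(dot(u, u))
    e1 = [x / norm_u for x in u]
    overlap = dot(v, e1)
    w = [x - overlap * y for x, y in zip(v, e1)]
    norm_w = math.sqrt(dot(w, w))
    e2 = [x / norm_w for x in w]
    return e1, e2

def to_plane_coords(p, e1, e2):
    """Two numbers that locate a 3-D point within the plane."""
    return (dot(p, e1), dot(p, e2))

def from_plane_coords(c, e1, e2):
    """Rebuild the 3-D point from its two plane coordinates."""
    return [c[0] * a + c[1] * b for a, b in zip(e1, e2)]

# Hypothetical tilted plane through the origin, spanned by u and v.
u = [1.0, 0.0, 1.0]
v = [0.0, 1.0, 1.0]
e1, e2 = orthonormal_basis(u, v)

# Hypothetical data: 3-D points that all lie on that plane.
points = [[a * x + b * y for x, y in zip(u, v)]
          for a, b in [(0.5, 1.0), (-2.0, 0.3), (1.5, -1.2)]]

coords = [to_plane_coords(p, e1, e2) for p in points]
rebuilt = [from_plane_coords(c, e1, e2) for c in coords]

# Largest coordinate-wise difference between original and rebuilt points.
max_error = max(abs(x - y)
                for p, r in zip(points, rebuilt)
                for x, y in zip(p, r))
```

Because the points really are on the plane, the two-number description loses nothing: `max_error` is zero up to floating-point rounding.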
43
The problem comes when the plane becomes warped in some way.
44
So let's say I can make a little bracelet out of it, and now it's a little ring, okay?
45
Or maybe it's not connected, and it's just kind of curvy like this, okay?
46
This does not fundamentally change the fact
47
that all of the data points on my little toy are still on a flat-ish surface.
48
I can flatten it, it's locally flat, just like the surface of the Earth is locally flat, this Earth that we all walk on.
49
But if you zoom out and look at it, you can see that you actually do need three dimensions to describe the data set,
50
but locally, it's lying on this flat-ish curved surface.
51
That's kind of roughly speaking, if I'm waving my hands around, what a manifold is.
52
It's a description of something that is approximately flat, if you look closely enough, but globally, it might be curved.
53
And so if I can learn what this curved surface is, then I'm able to describe my data much more simply
54
by describing the curve and then figuring out where my data points are on this curve
55
without having to use all three dimensions.
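A simplified one-dimensional analogue of this idea can be sketched in code: points on a ring are stored with three coordinates, yet a single number (the angle along the ring) is enough to say where each one sits. The ring's radius, its placement in the z = 0 plane, and the function names are assumptions for illustration only. Real manifold learning has to discover the curve from data rather than being told its formula.

```python
import math

def ring_point(theta, radius=1.0):
    """A 3-D point on a ring of the given radius lying in the z = 0 plane."""
    return (radius * math.cos(theta), radius * math.sin(theta), 0.0)

def ring_coordinate(p):
    """Recover the single coordinate (angle in [0, 2*pi)) of a point on the ring."""
    return math.atan2(p[1], p[0]) % (2 * math.pi)

# Hypothetical positions along the ring, in radians.
thetas = [0.1, 1.0, 2.5, 4.0, 6.0]

# Each point is stored with three numbers...
points = [ring_point(t) for t in thetas]

# ...but one number per point is enough to locate it on the curve.
recovered = [ring_coordinate(p) for p in points]
```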
56
Now, the idea here is that we need to be able to reduce and visualize the data.
57
So here's like a physical prop of visualizing my data set.
58
Most data sets are not something that you can play with as a desk toy.
59
And so the goal of manifold learning is to reduce the data.
60
We need to reduce and visualize the data.
61
We want to reduce it because we suspect that patterns do in fact exist, so we can describe it more simply.
62
And we want to visualize it because humans are really intuitive visual creatures.
63
And so when we can see something, we believe in it and we can actually see patterns in it that wouldn't have been obvious otherwise.
64
And the reason we want to do this is because we want to gain intuition.
65
And we want to communicate to ourselves
66
and to each other about what we've actually got. One
67
of the most compact ways of communicating your data is being able to make a really compelling visualization.
68
Now, the trick here with dimensionality reduction and manifold learning is how do we do this, right?
69
How do we actually pick out patterns that exist in the data set and reduce and visualize them?
70
So it turns out that, hopefully I can demonstrate again with this little toy here, it has to do with this notion of what's actually close, like what's similar to each other.
71
Like, am I similar to my cousin more than some random person on the street?
72
Probably.
73
But, like, how do you define that?
74
So let's say that we have this curved surface here, okay, my little toy.
75
And you can see that it's lying on this flattish surface, this curved surface.
76
And so two neighboring points on the surface are close to each other because they're actually touching each other.
77
They're close to each other, right?
78
So we kind of want to say that if we're going to reduce the dimensionality of my data set here, my little ring, my little bracelet I've made,
79
I want the points that are closer in the original data set to also end up closer in my reduced learned manifold,
80
in my reduced dimensionality space.
81
And points that are farther apart should also end up farther apart in my reduced space, so that I haven't lost information.
82
So things that used to be similar should end up closer together in my reduced space.
83
And things that are farther apart, less similar, should also end up less similar and farther apart in my reduced space.
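One way to check whether close points stay close and far points stay far is to compare all pairwise distances before and after the reduction. The sketch below uses a normalized stress-style score, which is only one of several conventions and is an assumption here, not a method named in the lecture. The ring data and the two candidate reductions are also made up for illustration.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]

def stress(original, reduced):
    """Normalized mismatch between pairwise distances before and after
    reduction; 0 means every distance was preserved."""
    d_hi = pairwise_distances(original)
    d_lo = pairwise_distances(reduced)
    num = sum((a - b) ** 2 for a, b in zip(d_hi, d_lo))
    den = sum(a ** 2 for a in d_hi)
    return math.sqrt(num / den)

# Hypothetical data: a flat ring of 12 points stored with three coordinates.
ring = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12), 0.0)
        for k in range(12)]

# Reduction 1: drop the unused z coordinate (keeps every distance).
keep_xy = [(x, y) for x, y, z in ring]

# Reduction 2: keep only x (squashes opposite sides of the ring together).
keep_x = [(x,) for x, y, z in ring]

stress_xy = stress(ring, keep_xy)
stress_x = stress(ring, keep_x)
```

Dropping the unused coordinate gives a stress of zero, while keeping only `x` gives a clearly positive stress because points on opposite sides of the ring collapse onto each other.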
84
The problem then becomes, how do I actually define that?
85
So how do you actually compute distances in high dimensional spaces?
86
And what is the most compact way of doing it?
87
What's the most convenient way of doing it?
88
These are decisions we have to make.
89
So part of what we're going to be learning in the
90
next couple of lectures are common ways of defining distances and similarity.
91
So the punchline here is that there's no one right way of doing it.
92
This is a decision that one makes, and I'm gonna tell you about some of the most common ways of doing it
93
that seem to work well for different types of data sets, and how you make these kinds of decisions.
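To make the point that the choice of distance is a decision, here is a small sketch with three common definitions (Euclidean, Manhattan, and cosine distance). The query point and the three candidates are invented for this example, and they are chosen so that each definition picks a different nearest neighbor.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    """Straight-line distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 minus the cosine of the angle between the two vectors."""
    return 1.0 - dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

# Hypothetical query point and candidate neighbors.
query = (1.0, 1.0)
candidates = {
    "a": (6.0, 6.0),   # far away, but pointing in the same direction
    "b": (2.8, 1.0),   # differs from the query along one axis only
    "c": (2.0, 2.2),   # moderately close along both axes
}

metrics = {
    "euclidean": euclidean,
    "manhattan": manhattan,
    "cosine": cosine_distance,
}

# For each metric, find the candidate with the smallest distance to the query.
nearest = {
    name: min(candidates, key=lambda k: fn(query, candidates[k]))
    for name, fn in metrics.items()
}
```

With these particular points, the Euclidean nearest neighbor is `c`, the Manhattan nearest neighbor is `b`, and the cosine nearest neighbor is `a`, so "most similar" depends on the definition chosen.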
94
And then also, what do we mean by more similar?
95
Like, are all similarities equally important?
96
So for example here, I can compute kind of like, grid size distances between any two of these points on my data set here, okay?
97
But you can kind of see that the data points over here are close in physical 3D space, in this studio space, to the points over here, right?
98
Because they're actually really close to each other.
99
Is that the same?
100
Does that matter as much as the fact that you actually have to go?
101
They're not actually connected.
102
They're not actually touching each other as little magnets.
103
Does that matter?
104
Because if you had to count connected magnets, you'd have to go all the way up here to get to the other side.
105
And is that notion of distance more important than the fact that as the fly flies, you can get right over there?
106
These are all valid notions of similarity and distance, but are they all equally important in the context of manifold learning?
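The two notions of distance in the magnet example can be compared numerically. In the sketch below, the straight-line distance between the two ends of an open ring is measured directly, and the connected-magnets distance is measured as a shortest path through a graph that only links magnets close enough to touch. The number of magnets, the arc angles, and the touching radius of 0.35 are all assumptions chosen for illustration.

```python
import heapq
import math

def neighbor_graph(points, radius):
    """Link every pair of points whose Euclidean distance is at most radius."""
    graph = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d <= radius:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph

def shortest_path_length(graph, start, goal):
    """Dijkstra's algorithm; returns math.inf if goal is unreachable."""
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            return d
        if d > best.get(node, math.inf):
            continue  # stale queue entry
        for nxt, w in graph[node]:
            nd = d + w
            if nd < best.get(nxt, math.inf):
                best[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return math.inf

# Hypothetical open ring: 20 magnets on a unit circle from 15 to 345 degrees,
# so the two ends are near each other in space but not touching.
n = 20
angles = [math.radians(15 + 330 * k / (n - 1)) for k in range(n)]
magnets = [(math.cos(t), math.sin(t), 0.0) for t in angles]

# Distance straight across the gap between the two ends.
straight = math.dist(magnets[0], magnets[-1])

# Distance counted along touching magnets only (radius is an assumed value
# that links adjacent magnets but not the two ends).
graph = neighbor_graph(magnets, radius=0.35)
along_chain = shortest_path_length(graph, 0, n - 1)
```

For this layout the path along the chain is more than ten times longer than the straight-line gap, which is the tension between the two notions of distance described above.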
107
That's something that we're going to be talking about.
108
And then this idea of points that start out closer together should end up close together.
109
Well, what does ending up close together mean?
110
How do we interpret the fact that we end up with some kind of visualization of a beautiful manifold?
111
And can we actually interpret two points
112
that are closer together in the manifold space as being actually more similar in an interpretable, meaningful, engineering-relevant way?
113
This is something that we'll discuss as well
114
because it varies depending on the algorithm you use and also on your notion of distance.
115
So we're going to dig right into it in the next lecture.
116
But I'm going to leave you with the idea that manifolds are everywhere.
117
We're going to do some manifold explaining right now.
118
But there's no one right way of doing manifold learning.
119
We're going to start by looking at some linear methods first
120
and trying to draw connections between our notions of similarity
121
and distance with some linear algebra and some linear dimensionality reduction techniques
122
that you've already heard about earlier in the series,
123
and then we're going to generalize these concepts to talk about nonlinear dimensionality reduction with some examples and to build some intuition.


About This Lesson

In this lesson, you will practice understanding concepts related to dimensionality reduction, also known as manifold learning. We will explore how patterns can be identified in complex data and how that data can be visualized effectively. By analyzing visual metaphors, such as representing data as shapes like balls and ribbons, you will be able to develop better intuition about multidimensional data. In addition, by engaging with this educational content, you will improve your ability to talk about mathematical concepts in English, which is fundamental for English conversation practice.

Vocabulary and Key Phrases

  • Dimensionality Reduction: A technique used to simplify complex data while keeping its main characteristics.
  • Manifold Learning: Another way of referring to nonlinear dimensionality reduction.
  • Patterns in Data: The working belief that patterns exist even in complex data.
  • Data Visualization: The process of representing information graphically to make analysis easier.
  • Subspace: A flat space in which the data can be easily described.
  • Locally Flat Surface: A concept referring to data that looks flat at a small scale, although it is complex from a larger perspective.
  • High Dimensions: Refers to data sets that have more than three dimensions, which makes visualization difficult.

Practice Tips

To strengthen your learning and improve your English pronunciation, I suggest using the shadowing technique. Watch the video and repeat the sentences aloud, trying to imitate the speaker's rhythm and intonation. The speaker, Bing Brunton, has clear diction and a cadence that is ideal for practicing new words and phrases. Focus on reproducing not only the words but also the emotion and emphasis the speaker places on certain parts of the talk. This will help you internalize not only the vocabulary but also the lively expression of the language. Use resources such as learning English with YouTube and the shadowspeaks method to make this practice a daily habit, leading to significant development in your English conversation skills.

What Is the Shadowing Technique?

Shadowing is a science-based language learning technique, originally developed for training professional interpreters. The method is simple but powerful: you listen to native English audio and immediately repeat it aloud, like a shadow following the speaker with a 1-2 second delay. Research shows significant improvement in pronunciation accuracy, intonation, rhythm, connected sounds, listening comprehension, and speaking fluency.
