Shadowing Practice: Introduction to Dimensionality Reduction (a.k.a. Manifold Learning) - Learn Spoken English with YouTube
C1
123 sentences
1
Hi everyone, my name is Bing Brunton and I'm a professor at the University of Washington in Seattle.
2
In this next series of lectures we're going to be talking about one of my favorite topics, nonlinear dimensionality reduction, sometimes also known as manifold learning.
3
So what is this field about?
4
What is actually a manifold?
5
We're going to be breaking it down and going through some of the most popular ways of performing nonlinear dimensionality reduction.
6
We're also going to be giving some examples
7
and talking about different ways in which you can
8
and also cannot use manifold learning as a tool in your work.
9
So the basic tenet of manifold learning and nonlinear dimensionality reduction is
10
that even if you have really, really big
11
and complicated data, patterns do in fact exist in the data. We believe this to be true
12
because, you know, after all,
13
if the patterns don't exist, why are we bothering to collect this data in the first place?
14
So we believe, as a starting point,
15
that patterns exist in complicated data. We have to believe this.
16
Now, when we're talking about dimensionality reduction, part of the problem is that we have to be able to visualize the data.
17
Now, I and hopefully most of you are humans who are constrained to walk around in this three-dimensional world.
18
And a lot of my visual intuition is in two dimensions, so things that can be done on a piece of paper
19
or at least on a computer screen. And so,
20
if I have data that is higher than two- or three-dimensional,
21
and that's most data sets (the next video is going to be all about examples
22
and intuition about high-dimensional data sets), well, we have a problem,
23
which is that the data is there
24
and the patterns do exist in the data, we believe that to be true, but I can't actually see them.
25
And for me, because I'm a really visual person, I like seeing the data.
26
So a lot of the challenge in manifold learning is figuring out what the patterns actually are
27
so that we can actually see them and gain some intuition for the data.
28
So I found these rare earth magnets in my office and it's one of my favorite office toys to play with.
29
So I brought them as a prop to show you what kind of high dimensional data might look like.
30
So let's say your data is like this little ball of little magnets here.
31
It's roughly smashed into a little ball.
32
And so in order to describe all of the data sets on this little ball, you kind of need all three dimensions because we exist in a three dimensional world.
33
Now on the other hand, let's say your data looked more like something that I smashed up into this little ribbon here.
34
Okay, now if your data looks more like this, okay, where it's a little ribbon, what you can see here is that even though this toy,
35
just like the other toy, exists in a three-dimensional world, it is, in fact, lying on a surface.
36
Okay?
37
Now, the surface could be a flat surface, like you can describe it by a plane.
38
All right, so in linear algebra we call this a subspace because it's planar and it's flat.
39
And that's great because you can use linear dimensionality reduction techniques to describe this plane.
40
So I can rotate it however I want in three-dimensional space.
41
It's still kind of on a plane, and I can describe that plane.
42
This is a simpler way of describing my data.
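
As a rough illustration of describing a flat plane more simply, here is a minimal Python sketch. The data, the spanning directions `d1` and `d2`, and all helper names are made up for this example. It also assumes the two directions spanning the plane are already known and uses Gram-Schmidt to build coordinates on the plane; a linear method such as PCA would instead estimate those directions from the data.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def scale(a, s):
    return [x * s for x in a]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def normalize(a):
    return scale(a, 1.0 / math.sqrt(dot(a, a)))

# Hypothetical data: 50 points on a tilted plane through the origin in 3D.
random.seed(0)
d1, d2 = [1.0, 2.0, 0.5], [-1.0, 0.5, 2.0]  # assumed directions spanning the plane
points = [add(scale(d1, random.uniform(-1, 1)), scale(d2, random.uniform(-1, 1)))
          for _ in range(50)]

# Gram-Schmidt turns the two directions into an orthonormal basis (e1, e2).
e1 = normalize(d1)
e2 = normalize(sub(d2, scale(e1, dot(d2, e1))))

# Each 3D point is now described by only two coordinates on the plane.
coords_2d = [(dot(p, e1), dot(p, e2)) for p in points]

# Rebuild the 3D points from the 2D coordinates and measure the worst error.
rebuilt = [add(scale(e1, a), scale(e2, b)) for a, b in coords_2d]
max_error = max(math.dist(p, q) for p, q in zip(points, rebuilt))
print(max_error)  # tiny, limited only by floating-point rounding
```

Because the points really do lie on a plane, two numbers per point recover all three original coordinates, and rotating the plane in 3D would not change that.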
43
The problem arises if the plane becomes warped in some way.
44
So let's say I can make a little bracelet out of it, and now it's a little ring, okay?
45
Or maybe it's not connected, and it's just kind of curvy like this, okay?
46
This does not fundamentally change the fact
47
that all of the data points on my little toy are still on a flat-ish surface.
48
I can flatten it, it's locally flat, just like the surface of the Earth is locally flat, this Earth that we all walk on.
49
But if you zoom out and look at it, you can see that you actually do need three dimensions to describe the data set,
50
but locally, it's lying on this flat-ish curved surface.
51
That's kind of roughly speaking, if I'm waving my hands around, what a manifold is.
52
It's a description of something that is approximately flat, if you look closely enough, but globally, it might be curved.
53
And so if I can learn what this curved surface is, then I'm able to describe my data much more simply
54
by describing the curve and then figuring out where my data points are on this curve
55
without having to use all three dimensions.
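
To make the idea of locating points on a curve concrete, here is a small Python sketch under strong assumptions. The data lies on a helix-shaped curve rather than a surface, and the curve (`curve_point`, with a made-up `PITCH`) is known in advance. In manifold learning the curve would have to be learned from the data, which this sketch does not attempt.

```python
import math

PITCH = 0.3  # hypothetical rise per radian of turn

def curve_point(t):
    """3D location of the point at position t along a helix-shaped curve."""
    return (math.cos(t), math.sin(t), PITCH * t)

# Hypothetical data: 25 points along two turns of the curve.
true_positions = [4 * math.pi * k / 24 for k in range(25)]
data_3d = [curve_point(t) for t in true_positions]

# Because the curve is known here, the height z alone tells us
# where each point sits along it.
positions = [z / PITCH for (_, _, z) in data_3d]

# One number per point is enough to rebuild all three coordinates.
rebuilt = [curve_point(t) for t in positions]
max_error = max(math.dist(p, q) for p, q in zip(data_3d, rebuilt))
print(max_error)  # tiny, limited only by floating-point rounding
```

The data needs three coordinates when viewed globally, but one position along the curve is enough once the curve itself is described.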
56
Now, the idea here is that we need to be able to reduce and visualize the data.
57
So here's like a physical prop of visualizing my data set.
58
Most data sets are not something that you can play with as a desk toy.
59
And so the goal of Manifold Learning is to reduce the data.
60
We need to reduce and visualize the data.
61
We want to reduce it because we suspect that patterns do in fact exist, so we can describe it more simply.
62
And we want to visualize it because humans are really intuitive visual creatures.
63
And so when we can see something, we believe in it and we can actually see patterns in it that wouldn't have been obvious otherwise.
64
And the reason we want to do this is because we want to gain intuition.
65
And we want to communicate to ourselves
66
and to each other about what we've actually got. One
67
of the most compact ways of communicating your data is being able to make a really compelling visualization.
68
Now, the trick here with dimensionality reduction and manifold learning is how do we do this, right?
69
How do we actually pick out patterns that exist in the data set and reduce and visualize them?
70
So it turns out that, hopefully I can demonstrate again with this little toy here, it has to do with this notion of what's actually close, like what's similar to each other.
71
Like, am I similar to my cousin more than some random person on the street?
72
Probably.
73
But, like, how do you define that?
74
So let's say that we have this curved surface here, okay, my little toy.
75
And you can see that it's lying on this flattish surface, this curved surface.
76
And so two neighboring points on the surface are close to each other because they're actually touching each other.
77
They're close to each other, right?
78
So we kind of want to say that if we're going to reduce the dimensionality of my data set here, my little ring, my little bracelet I've made,
79
I want the points that are closer in the original data set to also end up closer in my reduced learned manifold,
80
in my reduced dimensionality space.
81
And points that are farther apart should also end up farther apart in my reduced space, so that I haven't lost information.
82
So things that used to be similar should end up closer together in my reduced space.
83
And things that are farther apart, less similar, should also end up less similar and farther apart in my reduced space.
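
One simple way to check whether close points stay close is to ask how often each point keeps the same nearest neighbor after the reduction. The Python sketch below does this for a made-up C-shaped arc and two candidate one-dimensional descriptions. The function names and the data are assumptions for illustration, and nearest-neighbor agreement is only one of many possible measures.

```python
import math

def nearest_neighbor(index, points):
    """Index of the point closest to points[index], excluding itself."""
    others = [j for j in range(len(points)) if j != index]
    return min(others, key=lambda j: math.dist(points[index], points[j]))

def neighbor_agreement(original, reduced):
    """Fraction of points whose nearest neighbor is the same in both spaces."""
    same = sum(nearest_neighbor(i, original) == nearest_neighbor(i, reduced)
               for i in range(len(original)))
    return same / len(original)

# Hypothetical data: 20 unevenly spaced points on a C-shaped arc (3/4 of a circle).
positions = [1.5 * math.pi * (k / 19) ** 1.5 for k in range(20)]
arc_3d = [(math.cos(t), math.sin(t), 0.0) for t in positions]

along_curve = [(t,) for t in positions]  # candidate 1: position along the arc
x_only = [(p[0],) for p in arc_3d]       # candidate 2: keep only the x coordinate

good = neighbor_agreement(arc_3d, along_curve)
bad = neighbor_agreement(arc_3d, x_only)
print(good, bad)
```

In this setup the position along the arc keeps every nearest neighbor, while keeping only the x coordinate pushes some far-apart points next to each other, so its agreement is lower.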
84
The problem then becomes, how do I actually define that?
85
So how do you actually compute distances in high dimensional spaces?
86
And what is the most compact way of doing it?
87
What's the most convenient way of doing it?
88
These are decisions we have to make.
89
So part of what we're going to be learning in the
90
next couple of lectures is common ways of defining distances and similarity.
91
So the punchline here is that there's no one right way of doing it.
92
This is a decision that one makes, and I'm gonna tell you about some of the most common ways of doing it
93
that seem to work well for different types of data sets, and how you make these kinds of decisions.
94
And then also, what do we mean by more similar?
95
Like, are all similarities equally important?
96
So for example here, I can compute kind of like, grid size distances between any two of these points on my data set here, okay?
97
But you can kind of see that the data points over here are close in physical 3D space, in this studio space, to the points over here, right?
98
Because they're actually really close to each other.
99
Is that the same?
100
Does that matter as much as the fact that you actually have to go?
101
They're not actually connected.
102
They're not actually touching each other as little magnets.
103
Does that matter?
104
Because if you had to count connected magnets, you'd have to go all the way up here to get to the other side.
105
And is that notion of distance more important than the fact that, as the fly flies, you can get right over there?
106
These are all valid notions of similarity and distance, but are they all equally important in the context of manifold learning?
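
The two notions of distance described here can be compared directly. The Python sketch below uses a made-up chain of 36 magnets on an almost-closed ring and computes both the straight-line distance between the two ends and the distance obtained by hopping between touching neighbors. The geometry and variable names are assumptions for illustration.

```python
import math

# Hypothetical chain: 36 magnets on an open ring of radius 1 covering 350 degrees.
angles = [math.radians(350 * k / 35) for k in range(36)]
chain = [(math.cos(a), math.sin(a), 0.0) for a in angles]

first, last = chain[0], chain[-1]

# Straight line through 3D space between the two ends of the chain.
straight_line = math.dist(first, last)

# Counting connected magnets: sum the hops between touching neighbors.
along_chain = sum(math.dist(chain[k], chain[k + 1])
                  for k in range(len(chain) - 1))

print(straight_line, along_chain)
```

The two ends are very close in straight-line terms but far apart along the chain, which is why the choice of distance matters so much for the result.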
107
That's something that we're going to be talking about.
108
And then this idea of points that start out closer together should end up close together.
109
Well, what does ending up close together mean?
110
How do we interpret the fact that we end up with some kind of visualization of a beautiful manifold?
111
And can we actually interpret two points
112
that are closer together in the manifold space as being actually more similar in an interpretable, meaningful, engineering-relevant way?
113
This is something that we'll discuss as well
114
because it varies depending on the algorithm you use and also on your notion of distance.
115
So we're going to dig right into it in the next lecture.
116
But I'm going to leave you with the idea that manifolds are everywhere.
117
We're going to do some manifold explaining right now.
118
But there's no one right way of doing manifold learning.
119
We're going to start by looking at some linear methods first
120
and trying to draw connections between our notions of similarity
121
and distance with some linear algebra and some linear dimensionality reduction methods
122
that you've already heard about earlier in the series,
123
and then we're going to generalize these concepts to talk about non-linear dimensionality reduction with some examples and to build some intuitions.
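Why practice speaking with this video?
By practicing speaking with this video, learners can not only improve their ability to express themselves but also build their understanding of the complex topic of nonlinear dimensionality reduction. The speaker introduces the concepts of data visualization and pattern recognition in an accessible way, with clear demonstrations and examples, so watching and shadowing the video helps improve English listening comprehension and speaking fluency. Practicing in this setting gradually improves practical language skills, especially with specialized vocabulary and expressions. Using the "English shadowing" technique of repeating the speaker's sentences helps strengthen memory and also improves the accuracy of pronunciation and intonation.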
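Grammar and expressions in context
- "nonlinear dimensionality reduction": this term expresses a key concept in data processing; by practicing specialized terms like this repeatedly, learners can improve their English in the data science field.
- "describe all of the data sets": this sentence structure highlights how the speaker lays out complex concepts clearly, which is important for improving English expression.
- "patterns do exist in the data": this expression reinforces the logic of the speaker's message; learners can imitate the structure to improve their own logical expression.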
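Common pronunciation pitfalls
Some English words in this video may be hard for beginners to pronounce. For example, "dimensionality" is especially difficult for non-native speakers; listen to the speaker's pronunciation repeatedly and try to imitate it. The word "manifold" may also sound different in different accents, and becoming familiar with its pronunciation helps learners use it more comfortably in different contexts. By "learning English with YouTube" and practicing "shadow speak", learners can overcome these pronunciation challenges and improve their accuracy and confidence.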
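What is shadowing?
Shadowing is a language-learning technique with a scientific basis, originally developed for training professional interpreters and popularized by the polyglot Dr. Alexander Arguelles. The method is simple but powerful: you listen to native English audio and immediately repeat it aloud, like a shadow following the speaker with a delay of 1-2 seconds. Unlike passive listening or grammar drills, shadowing forces your brain and mouth muscles to process and imitate real speech patterns at the same time. Research suggests that it can significantly improve pronunciation accuracy, intonation, rhythm, connected speech, listening comprehension, and speaking fluency, making it one of the most effective methods for IELTS Speaking preparation and real-world English communication.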
