Shadowing practice: Large Language Models explained briefly - Learn to speak English with YouTube

C2
54 sentences
1
Imagine you happen across a short movie script that describes a scene between a person and their AI assistant.
2
The script has what the person asks the AI, but the AI's response has been torn off.
3
Suppose you also have this powerful magical machine that can take any text and provide a sensible prediction of what word comes next.
4
You could then finish the script by feeding in what you have to the machine, seeing what it would predict to start the AI's answer, and then repeating this over and over with a growing script completing the dialogue.
5
When you interact with a chatbot, this is exactly what's happening.
6
A large language model is a sophisticated mathematical function that predicts what word comes next for any piece of text.
7
Instead of predicting one word with certainty, though, what it does is assign a probability to all possible next words.
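As a minimal sketch of what "assign a probability to all possible next words" can look like, the snippet below turns raw scores into probabilities with a softmax. The words and scores in `toy_scores` are made up for illustration, and softmax is a common choice rather than something this sentence of the transcript names.

```python
import math

def softmax(scores):
    """Convert arbitrary real-valued scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {word: math.exp(s - peak) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

# Hypothetical scores a model might assign to three candidate next words.
toy_scores = {"the": 2.0, "a": 1.0, "banana": -1.0}
probs = softmax(toy_scores)

for word, p in probs.items():
    print(f"{word}: {p:.3f}")
```

Every word gets a nonzero probability, and higher scores map to higher probabilities.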
8
To build a chatbot, you lay out some text that describes an interaction between a user and a hypothetical AI assistant, add on whatever the user types in as the first part of the interaction, and then have the model repeatedly predict the next word that such a hypothetical AI assistant would say in response, and that's what's presented to the user.
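The loop described here can be sketched roughly as follows. `toy_next_word_probs` is a hypothetical stand-in for a real model (it is only a lookup table keyed on the last word), and the `<end>` marker and the prompt wording are assumptions made for the example. For simplicity this sketch always takes the most likely word.

```python
def toy_next_word_probs(text):
    """Hypothetical stand-in for a language model: looks only at the last word."""
    table = {
        "AI:": {"Hello": 0.9, "there": 0.1},
        "Hello": {"there": 0.8, "<end>": 0.2},
        "there": {"<end>": 1.0},
    }
    last = text.split()[-1]
    return table.get(last, {"<end>": 1.0})

def complete(prompt, max_words=10):
    """Repeatedly append the most likely next word until the model predicts <end>."""
    text = prompt
    for _ in range(max_words):
        probs = toy_next_word_probs(text)
        word = max(probs, key=probs.get)
        if word == "<end>":
            break
        text += " " + word  # the growing script is fed back in on the next pass
    return text

# Framing text plus whatever the user typed, ending where the AI's reply begins.
system_text = "A conversation between a user and a helpful AI assistant."
user_text = "User: Hi!"
reply = complete(system_text + " " + user_text + " AI:")
print(reply)
```

Only the words after "AI:" would be shown to the user.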
9
In doing this, the output tends to look a lot more natural if you allow it to select less likely words along the way at random.
10
So what this means is even though the model itself is deterministic, a given prompt typically gives a different answer each time it's run.
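A small sketch of that idea: the probabilities stay fixed (the deterministic part), and the randomness comes only from how a word is drawn from them. The `temperature` knob is a commonly used control that the transcript does not mention, so treat it as an assumption here, along with the made-up probabilities.

```python
import math
import random

def sample_next_word(probs, temperature=1.0, rng=random):
    """Draw one word at random; higher temperature favors unlikely words more."""
    words = list(probs)
    # Rescale log-probabilities by the temperature; choices() normalizes the weights.
    weights = [math.exp(math.log(probs[w]) / temperature) for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

probs = {"cat": 0.7, "dog": 0.2, "ferret": 0.1}  # hypothetical, fixed model output
rng = random.Random(0)                            # seeded so the run is repeatable

samples = [sample_next_word(probs, temperature=1.0, rng=rng) for _ in range(1000)]
counts = {w: samples.count(w) for w in probs}
print(counts)
```

With an unseeded generator, the same prompt would produce a different sequence of draws on each run.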
11
Models learn how to make these predictions by processing an enormous amount of text, typically pulled from the internet.
12
For a standard human to read the amount of text that was used to train GPT-3, for example, if they read non-stop 24/7, it would take over 2,600 years.
13
Larger models since then train on much, much more.
14
You can think of training a little bit like tuning the dials on a big machine.
15
The way that a language model behaves is entirely determined by these many different continuous values, usually called parameters or weights.
16
Changing those parameters will change the probabilities that the model gives for the next word on a given input.
17
What puts the large in large language model is how they can have hundreds of billions of these parameters.
18
No human ever deliberately sets those parameters.
19
Instead, they begin at random, meaning the model just outputs gibberish, but they're repeatedly refined based on many example pieces of text.
20
One of these training examples could be just a handful of words, or it could be thousands, but in either case, the way this works is to pass in all but the last word from that example into the model and compare the prediction that it makes with the true last word from the example.
21
An algorithm called backpropagation is used to tweak all of the parameters in such a way that it makes the model a little more likely to choose the true last word and a little less likely to choose all the others.
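To make the effect of one such tweak concrete, here is a toy sketch with just three "parameters", one score per candidate word. For a model this small the gradient can be written by hand. Backpropagation is the procedure that computes the same kind of gradient through the many layers of a real network. The vocabulary, the learning rate, and the cross-entropy loss are assumptions of the example.

```python
import math

VOCAB = ["mat", "moon", "dog"]

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def training_step(scores, target_index, learning_rate=0.5):
    """One gradient-descent step on the cross-entropy loss, -log p[target]."""
    probs = softmax(scores)
    # For softmax with cross-entropy: d(loss)/d(score_i) = p_i - 1 if i is the target, else p_i.
    grads = [p - (1.0 if i == target_index else 0.0) for i, p in enumerate(probs)]
    return [s - learning_rate * g for s, g in zip(scores, grads)]

# Toy parameters for the context "the cat sat on the"; the true last word is "mat".
scores = [0.0, 0.0, 0.0]
target = VOCAB.index("mat")

before = softmax(scores)
scores = training_step(scores, target)
after = softmax(scores)
print(before, after)
```

After the step, the probability of "mat" rises and the probability of every other word falls.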
22
When you do this for many, many trillions of examples, not only does the model start to give more accurate predictions on the training data, but it also starts to make more reasonable predictions on text that it's never seen before.
23
Given the huge number of parameters and the enormous amount of training data, the scale of computation involved in training a large language model is mind-boggling.
24
To illustrate, imagine that you could perform one billion additions and multiplications every single second.
25
How long do you think it would take for you to do all of the operations involved in training the largest language models?
26
Do you think it would take a year?
27
Maybe something like 10,000 years?
28
The answer is actually much more than that.
29
It's well over 100 million years.
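The arithmetic behind that figure can be checked in a few lines. Taking "100 million years" as a lower bound and a 365.25-day year (both assumptions of this sketch), the implied operation count is on the order of 10^24.

```python
OPS_PER_SECOND = 1e9                   # from the transcript: one billion operations per second
YEARS = 100e6                          # "well over 100 million years", used as a lower bound
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumes an average year of 365.25 days

total_ops = OPS_PER_SECOND * SECONDS_PER_YEAR * YEARS
print(f"{total_ops:.2e}")  # about 3.16e+24 operations
```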
30
This is only part of the story, though.
31
This whole process is called pre-training.
32
The goal of auto-completing a random passage of text from the internet is very different from the goal of being a good AI assistant.
33
To address this, chatbots undergo another type of training, just as important, called reinforcement learning with human feedback.
34
Workers flag unhelpful or problematic predictions, and their corrections further change the model's parameters, making them more likely to give predictions that users prefer.
35
Looking back at the pre-training, though, this staggering amount of computation is only made possible by using special computer chips that are optimized for running many operations in parallel, known as GPUs.
36
However, not all language models can be easily parallelized.
37
Prior to 2017, most language models would process text one word at a time, but then a team of researchers at Google introduced a new model known as the transformer.
38
Transformers don't read text from the start to the finish; they soak it all in at once, in parallel.
39
The very first step inside a transformer, and most other language models for that matter, is to associate each word with a long list of numbers.
40
The reason for this is that the training process only works with continuous values, so you have to somehow encode language using numbers, and each of these lists of numbers may somehow encode the meaning of the corresponding word.
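A rough sketch of this first step: a lookup table from words to lists of numbers. The vocabulary, the four-number length, and the random starting values are all assumptions chosen to keep the example readable. Real models use far longer lists and learn the values during training.

```python
import random

rng = random.Random(0)
EMBED_DIM = 4  # illustrative only; real models use far more numbers per word
vocab = ["the", "river", "bank", "money"]

# Each word gets its own list of numbers. They start random and training adjusts them.
embedding_table = {w: [rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for w in vocab}

def embed(sentence):
    """Replace each word in the sentence with its list of numbers."""
    return [embedding_table[w] for w in sentence.split()]

vectors = embed("the river bank")
for word, vec in zip("the river bank".split(), vectors):
    print(word, [round(x, 2) for x in vec])
```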
41
What makes transformers unique is their reliance on a special operation known as attention.
42
This operation gives all of these lists of numbers a chance to talk to one another and refine the meanings they encode based on the context around, all done in parallel.
43
For example, the numbers encoding the word bank might be changed based on the context surrounding it to somehow encode the more specific notion of a riverbank.
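The "bank" example can be imitated with a heavily simplified attention step. This sketch uses scaled dot-product weights but omits the learned query, key, and value projections and the masking that real transformers use. The two-number vectors and the meaning assigned to each axis are invented for illustration.

```python
import math

def softmax(xs):
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(vectors):
    """Each vector becomes a weighted average of all vectors, weighted by similarity."""
    dim = len(vectors[0])
    updated = []
    for query in vectors:
        scores = [dot(query, key) / math.sqrt(dim) for key in vectors]
        weights = softmax(scores)
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        updated.append(mixed)
    return updated

# Hypothetical 2-number embeddings: axis 0 ~ "water-related", axis 1 ~ "finance-related".
river = [2.0, 0.0]
bank = [1.0, 1.0]  # ambiguous on its own

context = self_attention([river, bank])
bank_in_context = context[1]
print(bank_in_context)
```

Next to "river", the updated vector for "bank" moves toward the water-related axis and away from the finance-related one.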
44
Transformers typically also include a second type of operation known as a feed-forward neural network, and this gives the model extra capacity to store more patterns about language learned during training.
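For a sense of what a feed-forward step computes, here is a minimal two-layer sketch applied to a single vector. The weights are hand-picked rather than learned, and the ReLU nonlinearity is an assumption of the example, since the transcript does not specify one.

```python
def relu(x):
    return max(0.0, x)

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def feed_forward(vector, w_in, b_in, w_out, b_out):
    """Two-layer network on one vector: expand, apply a nonlinearity, project back."""
    hidden = [relu(h + b) for h, b in zip(matvec(w_in, vector), b_in)]
    return [o + b for o, b in zip(matvec(w_out, hidden), b_out)]

# Hypothetical weights: 2 inputs, 3 hidden units, 2 outputs.
w_in = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b_in = [0.0, 0.0, 0.0]
w_out = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
b_out = [0.0, 0.0]

out = feed_forward([1.5, 0.5], w_in, b_in, w_out, b_out)
print(out)
```

Unlike attention, this step treats each vector independently; the learned weights are where the stored patterns would live.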
45
All of this data repeatedly flows through many different iterations of these two fundamental operations, and as it does so, the hope is that each list of numbers is enriched to encode whatever information might be needed to make an accurate prediction of what word follows in the passage.
46
At the end, one final function is performed on the last vector in this sequence, which now has had a chance to be influenced by all the other context from the input text, as well as everything the model learned during training, to produce a prediction of the next word.
47
Again, the model's prediction looks like a probability for every possible next word.
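A sketch of that final function, under the assumption that it is a matrix of per-word weights followed by a softmax. The two-word vocabulary, the `unembedding` weights, and the input vectors are all invented for the example.

```python
import math

def softmax(xs):
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def predict_next(vectors, unembedding, vocab):
    """Score every vocabulary word against the LAST vector, then normalize."""
    last = vectors[-1]
    scores = [sum(w * x for w, x in zip(row, last)) for row in unembedding]
    return dict(zip(vocab, softmax(scores)))

vocab = ["water", "loan"]                # hypothetical two-word vocabulary
unembedding = [[1.0, 0.0], [0.0, 1.0]]   # one row of weights per vocabulary word
context_vectors = [[2.0, 0.0], [1.5, 0.5]]

prediction = predict_next(context_vectors, unembedding, vocab)
print(prediction)
```

Only the last vector is used, which is why it matters that earlier steps let it absorb information from the rest of the context.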
48
Although researchers design the framework for how each of these steps work, it's important to understand that the specific behavior is an emergent phenomenon based on how those hundreds of billions of parameters are tuned during training.
49
This makes it incredibly challenging to determine why the model makes the exact predictions that it does.
50
What you can see is that when you use large language model predictions to autocomplete a prompt, the words that it generates are uncannily fluent, fascinating, and even useful.
51
If you're a new viewer and you're curious about more details on how transformers and attention work, boy do I have some material for you.
52
One option is to jump into a series I made about deep learning, where we visualize and motivate the details of attention and all the other steps in a transformer.
53
Also, on my second channel I just posted a talk I gave a couple months ago about this topic for the company TNG in Munich.
54
Sometimes I actually prefer the content I make as a casual talk rather than a produced video, but I leave it up to you which one of these feels like the better follow-on.


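Why practice speaking with this video?

The video opens with a short movie script depicting a conversation between a person and their AI assistant. Understanding that context makes it very useful for English conversation practice. Using the English shadowing technique, you can practice speaking by imitating the speaker's intonation and pronunciation. Because the video explains how conversational AI works while keeping a natural spoken flow, it gives you a chance to pick up expressions that come up often in real conversation. Through shadow speech, you can improve not only your listening but also your pronunciation and sentence construction at the same time.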

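Grammar and expression analysis

  • Imagine that...: This expression invites the listener to picture a situation. Practicing it helps when you want to present an idea.
  • It's important to understand that...: This expression is used to convey information to the other person. Practicing this sentence opener helps you deliver the point you want to emphasize.
  • What this means is...: This expression is useful when explaining something. Being able to use this structure will noticeably improve how well you explain things.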

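Everyday pronunciation pitfalls

Some words and intonation patterns in this video can be hard to pronounce. For example, a compound such as 'language model' can be hard to understand if it is not pronounced smoothly. Technical terms such as 'GPT-3' can also cause confusion because of the pronunciation differences between Korean and English. When you practice pronunciation, it helps to repeat the words you tend to get wrong. When watching for English conversation practice, pay particular attention to these words and say them along with the speaker. Shadowspeak not only improves your pronunciation but also builds confidence.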

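What is shadowing? A scientific method for quickly building English skills

Shadowing is a language-learning technique originally developed for training professional interpreters and popularized by the polyglot scholar Dr. Alexander Arguelles. The core principle is simple but very powerful: while listening to a native speaker's English, you repeat it aloud immediately with a short delay of one to two seconds, following the speaker like a shadow. Unlike grammar study or passive listening, shadowing trains your brain and mouth muscles to process and reproduce English simultaneously in real time. According to research, this method greatly improves pronunciation accuracy, intonation, rhythm, linking, listening comprehension, and speaking fluency. It is especially effective for those preparing for IELTS Speaking and those who want to communicate naturally in English.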
