by 라인하트, Sep 24, 2020

Andrew Ng's Machine Learning (1-4): Unsupervised Learning

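Professor Andrew Ng, co-founder of the online learning platform Coursera, is a leading figure in artificial intelligence. The introductory machine learning course he taught at Stanford University is available for free on Coursera (Coursera.org). It is an essential course for machine learning beginners, and one you naturally run into when studying AI and machine learning on your own.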

Welcome

Introduction

Unsupervised Learning

In this video, we'll talk about the second major type of machine learning problem, called Unsupervised Learning.

This lecture covers Unsupervised Learning, the second major type of machine learning problem.

In the last video, we talked about Supervised Learning. Back then, recall data sets that look like this, where each example was labeled either as a positive or negative example, whether it was a benign or a malignant tumor. So for each example in Supervised Learning, we were told explicitly what is the so-called right answer, whether it's benign or malignant.

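In the last lecture we covered Supervised Learning. Recall the data set from that lecture: every training example was labeled positive or negative, indicating whether the tumor was malignant or benign. In Supervised Learning, each training example explicitly comes with the right answer.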

In Unsupervised Learning, we're given data that looks different: data that doesn't have any labels, or that all has the same label. So we're given the data set and we're not told what to do with it, and we're not told what each data point is. Instead we're just told, "Here is a data set. Can you find some structure in the data?" Given this data set, an Unsupervised Learning algorithm might decide that the data lives in two different clusters. So there's one cluster and there's a different cluster. And yes, an Unsupervised Learning algorithm may break these data into these two separate clusters. This is called a clustering algorithm, and it turns out to be used in many places.

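Unsupervised Learning deals with a different kind of data set: one with no labels at all, or with the same label on every example. We are not told what the data means or what to do with it. Instead, we simply ask the algorithm, "Here is some data. Can you find structure in it?" For this data set, an Unsupervised Learning algorithm might decide that the data forms two different clusters, one lower down and one higher up. An algorithm that splits data into separate clusters like this is called a clustering algorithm, and it is used in a great many places.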
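To make "finding two clusters" concrete, here is a minimal sketch using k-means, one common clustering algorithm. The lecture does not say which algorithm it has in mind, so k-means, the toy points, and the function name `kmeans` are all assumptions made for illustration.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Group 2-D points into k clusters with plain k-means."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)      # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k),
                key=lambda c: (p[0] - centers[c][0]) ** 2 + (p[1] - centers[c][1]) ** 2)
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                  # keep the old center if a cluster is empty
                centers[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels, centers

# Unlabeled toy data (hypothetical): one blob near (1, 1), another near (8, 8).
points = [(1.0, 1.2), (0.8, 0.9), (1.3, 1.1), (8.0, 8.2), (7.9, 7.7), (8.3, 8.1)]
labels, centers = kmeans(points, k=2)
print(labels)
```

Note that no label is ever passed in: the algorithm only sees coordinates, and the cluster numbers it returns are arbitrary names for the groups it found.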

One example where clustering is used is Google News. If you have not seen this before, you can go to news.google.com to take a look. What Google News does is, every day, it goes and looks at tens of thousands or hundreds of thousands of news stories on the web and groups them into cohesive news stories. For example, let's look here. The URLs here link to different news stories about the BP Oil Well story.

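One place clustering is used is Google News. If you have not seen it before, you can take a look at news.google.com. Every day Google News examines tens or hundreds of thousands of new articles on the web and groups related ones together. The URLs here link to different articles on the "BP Oil Well" topic.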

So, let's click on one of these URLs. What I get to is a web page like this. Here's a Wall Street Journal article about the BP oil well spill, "BP Kills Macondo", which is the name of the spill. If you click on a different URL from that group, you might get a different story. Here's the CNN story about, again, the BP oil spill. And if you click on yet a third link, you might get a different story again. Here's the UK Guardian story about the BP oil spill. So what Google News has done is look at tens of thousands of news stories and automatically cluster them together, so the news stories that are all about the same topic get displayed together.

Click one of these URLs and you land on a web page. This one is a Wall Street Journal article about the BP oil well, titled "BP Kills Macondo", Macondo being the name of the spill. Clicking another URL in the same group brings up a different story: here, CNN's article on the BP oil spill. A third link leads to yet another, the UK Guardian's article on the spill. Google News automatically clusters tens of thousands of articles, so that stories on the same topic are grouped together.
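The grouping of stories by topic can be sketched in a few lines. This is only an illustration and not how Google News works internally: it assumes headlines can be compared by word overlap (Jaccard similarity) and uses a simple greedy grouping rule. The headlines other than "BP Kills Macondo", the threshold, and the function names are hypothetical.

```python
def jaccard(a, b):
    """Similarity of two word sets: shared words divided by all words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def group_headlines(headlines, threshold=0.2):
    """Greedily put each headline into the first group it resembles."""
    groups = []                          # each group: list of (headline, word_set)
    for text in headlines:
        words = set(text.lower().split())
        for group in groups:
            # Join the group if any member shares enough words.
            if any(jaccard(words, other) >= threshold for _, other in group):
                group.append((text, words))
                break
        else:
            groups.append([(text, words)])   # nothing similar: start a new group
    return [[text for text, _ in group] for group in groups]

headlines = [
    "BP kills Macondo oil well",
    "BP oil spill well finally sealed",
    "Local team wins championship game",
    "Oil well in Gulf sealed by BP",
    "Championship game draws record crowd",
]
groups = group_headlines(headlines)
for group in groups:
    print(group)
```

With these five made-up headlines the three oil-well stories end up in one group and the two sports stories in another, without anyone telling the program what the topics are.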

It turns out that clustering algorithms and Unsupervised Learning algorithms are used in many other problems as well. Here's an example of DNA microarray data. The idea is to take a group of different individuals and, for each of them, measure how much they do or do not have a certain gene. Technically, you measure how much certain genes are expressed. So these colors, red, green, gray and so on, show the degree to which different individuals do or do not have a specific gene. What you can do is then run a clustering algorithm to group individuals into different categories or different types of people. This is one approach to understanding genomics. It is Unsupervised Learning because we're not telling the algorithm in advance that these are type 1 people, those are type 2 people, those are type 3 people and so on. Instead, what we're saying is, "Here's a bunch of data. I don't know what's in this data. I don't know who's in what type. I don't even know what the different types of people are, but can you automatically find structure in the data, and automatically cluster the individuals into these types that I don't know in advance?" Because we're not giving the algorithm the right answer for the examples in the data set, this is Unsupervised Learning. Unsupervised Learning, or clustering, is used for a bunch of other applications.

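Clustering and Unsupervised Learning algorithms can be applied to many other problems. A typical example is DNA microarray data. We take a group of people and measure how much of a particular gene each of them has or, strictly speaking, how strongly the gene is expressed. The colors (red, green, gray and so on) show how much of a given gene each individual has. A clustering algorithm then groups the people into different categories or types, which is one way of making sense of the genetic data. This is Unsupervised Learning because we never tell the algorithm in advance that these people are type 1, those are type 2, and others are type 3. Instead we say: "Here is a large amount of data. I don't know what is in it. I don't know who belongs to which type, or even what the types are. Can you find structure in this data automatically, and cluster each person into types I don't know in advance?" Because the algorithm is not given the right answers for the data set, this is Unsupervised Learning. Unsupervised Learning, and clustering in particular, has many other applications.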
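As a rough sketch of grouping individuals by gene expression, the snippet below uses single-linkage agglomerative clustering: start with one cluster per person and keep merging the two closest clusters. The lecture does not name an algorithm for this example, so the method is an assumption, and the expression values and person IDs are invented.

```python
import math

def distance(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(profiles, k):
    """Merge the two closest clusters repeatedly until only k remain."""
    clusters = [[name] for name in profiles]     # start: one cluster per person
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: cluster distance = closest pair of members.
                d = min(distance(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters.pop(j)                 # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters

# Hypothetical expression levels of three genes for six individuals.
profiles = {
    "p1": [0.90, 0.10, 0.20], "p2": [0.80, 0.20, 0.10],
    "p3": [0.10, 0.90, 0.80], "p4": [0.20, 0.80, 0.90],
    "p5": [0.15, 0.85, 0.85], "p6": [0.85, 0.15, 0.15],
}
clusters = agglomerate(profiles, k=2)
print(clusters)
```

The two "types" that come out are not named or defined in advance; they are simply whatever grouping the distances between profiles suggest.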

It's used to organize large computer clusters. I had some friends looking at large data centers, that is, large computer clusters, and trying to figure out which machines tend to work together. If you can put those machines together, you can make your data center work more efficiently.

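The first application is large computer clusters. Some of my friends are specialists who work on data centers. A clustering algorithm can work out which machines in a large computer cluster tend to work together. If the machines that work together are placed together, the data center can run more efficiently.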

This second application is on social network analysis. So given knowledge about which friends you email the most or given your Facebook friends or your Google+ circles, can we automatically identify which are cohesive groups of friends, also which are groups of people that all know each other?

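The second application is social network analysis. Given information about which friends you email most, or your Facebook friends or Google+ circles, the algorithm automatically identifies which groups are cohesive groups of friends and which are groups of people who all know each other.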
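One way to read "groups of people that all know each other" is as maximal cliques in a friendship graph. The sketch below uses the basic Bron-Kerbosch procedure to list them. The lecture does not specify a method, so this choice is an assumption, and the names and friendships are hypothetical.

```python
def maximal_cliques(graph):
    """List every maximal group in which everyone knows everyone else."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))      # cannot be extended: maximal
            return
        for person in list(candidates):
            # Only mutual friends of `person` can still join the group.
            expand(current | {person},
                   candidates & graph[person],
                   excluded & graph[person])
            candidates = candidates - {person}
            excluded = excluded | {person}

    expand(set(), set(graph), set())
    return sorted(cliques)

# Hypothetical friendship graph (each edge is listed in both directions).
friends = {
    "ann":  {"bob", "cara"},
    "bob":  {"ann", "cara"},
    "cara": {"ann", "bob", "dan"},
    "dan":  {"cara", "eve"},
    "eve":  {"dan"},
}
groups = maximal_cliques(friends)
print(groups)
```

Here ann, bob, and cara form a group in which everyone knows everyone, while dan only links that group to eve.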

Market segmentation. Many companies have huge databases of customer information. So, can you look at this customer data set and automatically discover market segments, and automatically group your customers into different market segments, so that you can more efficiently sell or market to each segment? Again, this is Unsupervised Learning because we have all this customer data, but we don't know in advance what the market segments are, and for the customers in our data set we don't know in advance who is in market segment one, who is in market segment two, and so on. We have to let the algorithm discover all this just from the data.

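The third application is market segmentation. Many companies hold huge databases of customer information. A clustering algorithm can automatically discover market segments in the customer data set and assign each customer to a segment, so that sales and marketing can target each segment more efficiently. This, too, is Unsupervised Learning: we have all the customer data, but we are not told in advance what the market segments are, nor which customers belong to segment 1, which to segment 2, and so on. The clustering algorithm has to discover all of this from the data alone.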

Finally, it turns out that Unsupervised Learning is also used, surprisingly, for astronomical data analysis, and these clustering algorithms give surprisingly interesting and useful theories of how galaxies are formed. All of these are examples of clustering, which is just one type of Unsupervised Learning.

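Finally, Unsupervised Learning is also used, perhaps surprisingly, in astronomical data analysis, where clustering algorithms have led to interesting and useful theories about how galaxies form. All of these are examples of clustering, which is one type of Unsupervised Learning.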

Let me tell you about another one. I'm going to tell you about the cocktail party problem. So, you've been to cocktail parties before, right? Well, you can imagine there's a party, a room full of people, all sitting around, all talking at the same time, and there are all these overlapping voices because everyone is talking at the same time, and it is hard to hear the person in front of you. So maybe take a cocktail party with two people, two people talking at the same time; it's a somewhat small cocktail party. We're going to put two microphones in the room, and because these microphones are at two different distances from the speakers, each microphone records a different combination of the two speakers' voices. Maybe speaker one is a little louder in microphone one and speaker two is a little louder in microphone two, because the two microphones are at different positions relative to the two speakers, but each microphone records an overlapping combination of both speakers' voices. So here's an actual recording of two speakers, recorded by a researcher.

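Here is another example: the cocktail party problem. You have probably been to a party. Picture a room full of people, all sitting around and talking at the same time. Because everyone talks at once, the voices overlap and it is hard to hear what anyone is saying. Suppose a small cocktail party with just two people talking at the same time, and two microphones placed in the room. The two microphones are at different distances from the speakers, so each one records a different combination of the two voices. Speaker 1 may be a little louder in microphone 1 and speaker 2 a little louder in microphone 2, because the microphones sit at different positions relative to the speakers. Either way, each microphone records both voices overlapping. What follows is an actual recording of two speakers.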
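The "different combination of the two voices" at each microphone can be written as a weighted sum. The snippet below is a minimal sketch under that assumption (simple linear mixing with no echo or delay). The stand-in signals and the gain values are invented for illustration.

```python
import math

def record(sources, gains):
    """One microphone: a weighted sum of all speakers' signals, sample by sample."""
    return [sum(g * s[t] for g, s in zip(gains, sources))
            for t in range(len(sources[0]))]

n = 8
speaker1 = [math.sin(2 * math.pi * t / n) for t in range(n)]    # stand-in for voice 1
speaker2 = [1.0 if t % 2 == 0 else -1.0 for t in range(n)]       # stand-in for voice 2

# Hypothetical gains: each speaker is louder in the nearer microphone.
mic1 = record([speaker1, speaker2], gains=[1.0, 0.4])
mic2 = record([speaker1, speaker2], gains=[0.5, 1.0])
print([round(v, 2) for v in mic1])
print([round(v, 2) for v in mic2])
```

Both recordings contain both voices; only the proportions differ, and that difference is what a separation algorithm can exploit.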

Let me play for you the first, what the first microphone sounds like. "One (uno), two (dos), three (tres), four (cuatro), five (cinco), six (seis), seven (siete), eight (ocho), nine (nueve), ten (y diez)" All right, maybe not the most interesting cocktail party, there's two people counting from one to ten in two languages but you know. What you just heard was the first microphone recording. Here's the second recording. "Uno (one), dos (two), tres (three), cuatro (four), cinco (five), seis (six), siete (seven), ocho (eight), nueve (nine) y diez (ten)".

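Let's play what the first microphone picked up. It may not be the most interesting cocktail party: two people count from one to ten in two languages. What you just heard was the first microphone's recording. Here is the second recording, again with two people counting from one to ten in two languages.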

So what we can do is take these two microphone recordings and give them to an Unsupervised Learning algorithm called the cocktail party algorithm, and tell the algorithm: find structure in this data. What the algorithm will do is listen to these audio recordings and say, it sounds like two audio sources are being added together, or summed together, to produce these recordings that we had. Moreover, what the cocktail party algorithm will do is separate out the two audio sources that were added or summed together to form our recordings.

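We feed the two microphone recordings to an Unsupervised Learning algorithm called the cocktail party algorithm and tell it, "Find structure in these recordings." The algorithm listens to the two recordings and concludes that each is a sum of two audio sources. It can then separate the two sources out of the summed recordings.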
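The following sketch shows one way such a separation can work on synthetic signals. It is not the lecture's one-line Octave program: it assumes simple linear mixing and uses a basic form of independent component analysis (whiten the two recordings, then search for the rotation whose outputs look least Gaussian, measured by excess kurtosis). The stand-in signals, gains, and all function names are made up for illustration.

```python
import math

def center(x):
    """Subtract the mean so the signal averages to zero."""
    m = sum(x) / len(x)
    return [v - m for v in x]

def whiten(x1, x2):
    """Rotate and rescale two centered signals: uncorrelated, unit variance."""
    n = len(x1)
    a = sum(u * u for u in x1) / n                 # variance of x1
    c = sum(v * v for v in x2) / n                 # variance of x2
    b = sum(u * v for u, v in zip(x1, x2)) / n     # covariance
    theta = 0.5 * math.atan2(2 * b, a - c)         # principal-axis angle
    cs, sn = math.cos(theta), math.sin(theta)
    p1 = [cs * u + sn * v for u, v in zip(x1, x2)]
    p2 = [-sn * u + cs * v for u, v in zip(x1, x2)]
    s1 = math.sqrt(sum(u * u for u in p1) / n)
    s2 = math.sqrt(sum(v * v for v in p2) / n)
    return [u / s1 for u in p1], [v / s2 for v in p2]

def excess_kurtosis(y):
    """Fourth moment minus 3 for a zero-mean, unit-variance signal."""
    return sum(v ** 4 for v in y) / len(y) - 3.0

def separate(mic1, mic2, steps=180):
    """Try rotations of the whitened mixtures; keep the least Gaussian pair."""
    z1, z2 = whiten(center(mic1), center(mic2))
    best = None
    for i in range(steps):
        angle = (math.pi / 2) * i / steps
        cs, sn = math.cos(angle), math.sin(angle)
        y1 = [cs * u + sn * v for u, v in zip(z1, z2)]
        y2 = [-sn * u + cs * v for u, v in zip(z1, z2)]
        score = abs(excess_kurtosis(y1)) + abs(excess_kurtosis(y2))
        if best is None or score > best[0]:
            best = (score, y1, y2)
    return best[1], best[2]

def corr(a, b):
    """Correlation coefficient, used only to check the result."""
    a, b = center(a), center(b)
    return sum(u * v for u, v in zip(a, b)) / math.sqrt(
        sum(u * u for u in a) * sum(v * v for v in b))

n = 1000
speaker1 = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]   # stand-in voice 1
speaker2 = [2 * ((13 * t / n) % 1.0) - 1 for t in range(n)]        # stand-in voice 2
mic1 = [1.0 * a + 0.4 * b for a, b in zip(speaker1, speaker2)]     # hypothetical gains
mic2 = [0.5 * a + 1.0 * b for a, b in zip(speaker1, speaker2)]

out1, out2 = separate(mic1, mic2)
for out in (out1, out2):
    print([round(abs(corr(out, s)), 2) for s in (speaker1, speaker2)])
```

The algorithm never sees `speaker1` or `speaker2`; they appear only in the final check, which prints how strongly each output matches each original signal. The order and sign of the outputs are arbitrary, which is typical for this kind of method.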

And, in fact, here's the first output of the cocktail party algorithm. "One, two, three, four, five, six, seven, eight, nine, ten." So, I separated out the English voice in one of the recordings.

This is the first output of the cocktail party algorithm. It has separated the English voice out of the recording.

And here's the second output. "Uno, dos, tres, cuatro, cinco, seis, siete, ocho, nueve y diez." Not too bad. To give you one more example:

This is the second output. Not bad, is it?

Here's another recording of a similar situation. Here's the first microphone: "One, two, three, four, five, six, seven, eight, nine, ten." OK, so the poor guy's gone home from the cocktail party and he's now sitting in a room by himself talking to his radio. Here's the second microphone recording: "One, two, three, four, five, six, seven, eight, nine, ten." When you give these two microphone recordings to the same algorithm, what it does is again say that it sounds like there are two audio sources.

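Here is a recording of another, similar situation. First we play the first microphone's recording. The poor guy has now gone home from the cocktail party and is sitting in a room by himself, talking to his radio. Then we play the second microphone's recording. When you give these recordings to the cocktail party algorithm, it again reports that there are two audio sources.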

The algorithm says, here is the first of the audio sources I found. "One, two, three, four, five, six, seven, eight, nine, ten." So that wasn't perfect: it got the voice, but it also got a little bit of the music in there. Then here's the second output of the algorithm. Not too bad: in that second output it managed to get rid of the voice entirely, and just cleaned up the music and got rid of the counting from one to ten.

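This is the first audio source. It is not perfect: it captured the voice, but a little of the music came through as well. This is the algorithm's second output. Not bad: the second output manages to remove the voice entirely, erasing the counting from one to ten and leaving clean music.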

So you might look at an Unsupervised Learning algorithm like this and ask how complicated it is to implement, right? It seems like in order to build this application, to do this audio processing, you would need to write a ton of code, or maybe link in a bunch of C++ or Java libraries that process audio. It seems like a really complicated program to separate out audio and so on. It turns out that the algorithm to do what you just heard can be done with one line of code, shown right here. It took researchers a long time to come up with this line of code, and I'm not saying this is an easy problem. But it turns out that when you use the right programming environment, many learning algorithms can be really short programs.

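You may wonder how complicated it is to implement an Unsupervised Learning algorithm that separates audio like this. You might expect that separating the sources requires writing a huge amount of code, or pulling in C++ or Java libraries, and that any program doing this must be very complex. In fact, the algorithm fits in a single line of code. Here it is.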

[W, s, v] = svd((repmat(sum(x.*x,1), size(x,1), 1).*x)*x');

Researchers invested a long time to arrive at this one line of code. This is not to say the problem is easy, only that with a suitable programming environment many learning algorithms can be programmed very compactly.
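To see what the one-liner hands to `svd`, the sketch below spells out the matrix `(repmat(sum(x.*x,1), size(x,1), 1) .* x) * x'` with plain loops. It only illustrates the array operations, reading `x` as a matrix with one signal per row and one sample per column (an assumption about the layout). It does not perform the decomposition or any audio separation, and the function name and tiny example matrix are invented.

```python
def weighted_second_moment(x):
    """Build (repmat(sum(x.*x,1), size(x,1), 1) .* x) * x' with plain loops.

    x is a list of rows; each column is one sample.
    """
    rows, cols = len(x), len(x[0])
    # sum(x.*x, 1): squared length of every column (sample).
    sq_norm = [sum(x[r][t] ** 2 for r in range(rows)) for t in range(cols)]
    # repmat(...) .* x scales each sample by its squared length; "* x'" then
    # sums the outer products of the scaled samples with the originals.
    return [[sum(sq_norm[t] * x[i][t] * x[j][t] for t in range(cols))
             for j in range(rows)]
            for i in range(rows)]

x = [[1, 0, 1],
     [0, 2, 1]]                      # 2 signals, 3 samples
M = weighted_second_moment(x)
print(M)                             # [[3, 2], [2, 18]]
```

The result is a small symmetric matrix, one row and column per signal, so the expensive-looking expression reduces to a handful of built-in array operations followed by one standard linear algebra routine.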

So this is also why in this class we're going to use the Octave programming environment. Octave is free open source software, and using a tool like Octave or Matlab, many learning algorithms become just a few lines of code to implement. Later in this class, I'll teach you a little bit about how to use Octave, and you'll be implementing some of these algorithms in Octave. Or if you have Matlab, you can use that too. It turns out that in Silicon Valley, for a lot of machine learning algorithms, what we do is first prototype our software in Octave, because Octave makes it incredibly fast to implement these learning algorithms. Here, each of these functions, for example the SVD function, which stands for singular value decomposition, turns out to be a linear algebra routine that is just built into Octave. If you were trying to do this in C++ or Java, this would be many, many lines of code linking complex C++ or Java libraries. So you can implement this stuff in C++ or Java or Python; it's just much more complicated to do so in those languages.

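This is why we use the Octave programming environment in this class. Octave is free, open source software. With a tool like Octave or Matlab, many learning algorithms can be implemented in just a few lines of code. Later in the class I will teach a little about how to use Octave, and you will implement some of the algorithms in Octave, or in Matlab if you have it. In Silicon Valley, many machine learning algorithms are first prototyped in Octave, because learning algorithms can be implemented very quickly there. The functions used here, such as the SVD function (short for singular value decomposition, a linear algebra routine), are built into Octave. Doing the same in C++ or Java would take many lines of code and complex libraries. You can implement this in C++, Java, or Python, but it is much more complicated to do so.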

What I've seen after having taught machine learning for almost a decade now is that you learn much faster if you use Octave as your programming environment, and if you use Octave as your learning tool and as your prototyping tool, it'll let you learn and prototype learning algorithms much more quickly. In fact, what many people in the large Silicon Valley companies will do is use a tool like Octave to first prototype the learning algorithm, and only after they've gotten it to work migrate it to C++ or Java or whatever. It turns out that by doing things this way, you can often get your algorithm to work much faster than if you were starting out in C++. So, I know that as an instructor I get to say "trust me on this one" only a finite number of times, but for those of you who've never used these Octave-type programming environments before, I am going to ask you to trust me on this one. I think your development time is one of the most valuable resources. And having seen lots of people do this, I think you as a machine learning researcher or machine learning developer will be much more productive if you learn to start prototyping in Octave rather than in some other language.

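Here is something I have learned from teaching machine learning for almost ten years: students learn much faster when they use Octave as their programming environment. Using Octave as a learning and prototyping tool lets you learn algorithms and build prototypes more quickly. In fact, people at the large Silicon Valley companies prototype a learning algorithm in a language like Octave and, once it works, port it to C++ or Java. Working this way often gets an algorithm running sooner than writing it in C++ from the start. As an instructor I can only say "just trust me on this" so many times, but to those who have never used an Octave-like environment, I want to ask exactly that. Your development time is one of your most valuable resources, and this advice comes from watching many people work this way. As a machine learning researcher or developer, you will be much more productive if you prototype in a language like Octave.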

Finally, to wrap up this video, I have one quick review question for you. We talked about Unsupervised Learning, which is a learning setting where you give the algorithm a ton of data and just ask it to find structure in the data for us. Of the following four examples, which do you think would be an Unsupervised Learning problem as opposed to a Supervised Learning problem? For each of the four check boxes on the left, check the ones for which you think an Unsupervised Learning algorithm would be appropriate, and then click the button on the lower right to check your answer. So when the video pauses, please answer the question on the slide.

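Finally, to wrap up this lecture, here is a quick review question. So far we have covered Unsupervised Learning, in which you give the algorithm a large amount of data and let it find the structure. Which of the following four examples are Unsupervised Learning problems? Check the appropriate boxes among the four on the left and click the button at the lower right. Please answer when the video pauses.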

So, hopefully, you've remembered the spam folder problem. If you have labeled data, with spam and non-spam e-mail, we'd treat this as a Supervised Learning problem. The news story example is exactly the Google News example that we saw in this video: we saw how you can use a clustering algorithm to cluster these articles together, so that's Unsupervised Learning. The market segmentation example I talked about a little earlier: you can do that as an Unsupervised Learning problem, because I am just going to give my algorithm data and ask it to discover market segments automatically. And the final example, diabetes, is actually just like our breast cancer example from the last video. Only instead of good and bad cancer tumors, or benign and malignant tumors, we have diabetes or not, and so we will solve that as a Supervised Learning problem, just like we did for the breast tumor data.

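Hopefully you remembered the spam folder problem. If you have data labeled as spam or not spam, it is a Supervised Learning problem. The news story example is exactly the Google News example from this lecture: we saw a clustering algorithm group news articles together, so it is Unsupervised Learning. The market segmentation example was mentioned a little earlier. It is an Unsupervised Learning problem because the algorithm looks at the data and discovers the market segments automatically. The last one, diabetes, is just like the breast cancer example from the previous lecture. Instead of good or bad tumors, benign or malignant, we ask whether the patient has diabetes or not. It is a Supervised Learning problem, just like the breast cancer example.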

So, that's it for Unsupervised Learning. In the next video, we'll delve more into specific learning algorithms and start to talk about just how these algorithms work and how you can go about implementing them.

That is all for Unsupervised Learning. In the next lecture we will look at more specific learning algorithms and talk about how they work and how to implement them.

Summary - Unsupervised Learning

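A Supervised Learning algorithm learns from training examples that carry the right answer (a label). An Unsupervised Learning algorithm, by contrast, learns from a data set whose examples have no labels or all share the same label. The algorithm is not told what the data means or what can be done with it. It simply looks for structure in the data set.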

1) Clustering algorithms

A cluster is a group. A clustering algorithm uncovers the structure of a data set by gathering its data points into a few groups. Examples where clustering algorithms are used:

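- Google News: groups news stories on the same topic and shows them together
- DNA microarray: groups people by how particular genes are expressed in their DNA microarray data
- Server placement in data centers: groups servers that mainly communicate with each other, reducing power and network costs
- Social network analysis: groups friends to reveal friend groups and their characteristics
- Market segmentation: groups customers in a company's customer data for more efficient sales and marketing
- Astronomical data analysis: groups data obtained from space to find new theories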

2) The cocktail party problem

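The cocktail party effect refers to how, in a large event hall or party where everyone is talking loudly, people can still selectively pick out the voice of the person they are talking to. Humans are capable of this selective perception even when the voice is hard to hear. A machine, by contrast, receives all the sounds arriving at once as a single mixed signal, so it cannot directly listen to only the sound it cares about. This is called the cocktail party problem. An Unsupervised Learning algorithm is used to address it: in the lecture's example, the algorithm separates the two sounds from each other.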

Summary - The Octave Programming Language

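People tend to assume that solving the cocktail party problem, that is, separating one sound out of a recording in which several sounds overlap, requires a very complicated program. With the Octave programming language, however, it takes a single line of code. The point is not that the problem is easy, but that choosing a suitable language matters when working with learning algorithms.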

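Octave is a high-level programming language for numerical computation. It was developed with compatibility with MATLAB, the engineering software package, in mind, so the free Octave is often used as an alternative to it. Besides being free, Octave also offers packages for things such as genetic algorithms and image analysis. Its drawback is that it runs slowly.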

In fact, people at the large Silicon Valley companies prototype a learning algorithm in a language like Octave and, once it works, port it to C++ or Java.

Our goal here is not to become skilled at a programming language but to understand and apply learning algorithms. Octave is a programming language that fits that goal precisely.