Andrew Ng's Machine Learning (1-1): Machine Learning Examples

by 라인하트 Sep 21. 2020

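Professor Andrew Ng, co-founder of the online learning platform Coursera, is a leading figure in the AI field. The introductory machine learning course he taught at Stanford University is available for free on Coursera (Coursera.org). The course is essential for machine learning beginners, and anyone studying AI and machine learning on their own will naturally come across it. This post briefly summarizes the lectures.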

Welcome


Welcome to Machine Learning!

(Welcome to the world of machine learning!)

What is machine learning?

You probably use it dozens of times a day without even knowing it. Each time you do a web search on Google or Bing, that works so well because their machine learning software has figured out how to rank web pages.

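What is machine learning? You probably use machine learning dozens of times a day without realizing it. For example, machine learning software is at work every time you search the internet on Google or Bing. It figures out how to rank each site and displays the results in that order.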

When Facebook or Apple's photo application recognizes your friends in your pictures, that's also machine learning. Each time you read your email and a spam filter saves you from having to wade through tons of spam, again, that's because your computer has learned to distinguish spam from non-spam email. So, that's machine learning. It's the science of getting computers to learn without being explicitly programmed.

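The photo applications from Facebook and Apple find your friends in your pictures. Every time you read your email, a spam filter saves you from a flood of spam, because the spam filter algorithm has learned to tell ordinary mail from spam. Machine learning is the technology and science of getting computers to learn without being explicitly programmed.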
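As a rough illustration of what "learning without being explicitly programmed" can look like for the spam example, here is a minimal sketch of a word-count classifier in the naive Bayes style. The training messages, the `train` and `predict` names, and the choice of naive Bayes are all assumptions made for this illustration; the lecture does not say which algorithm a real spam filter uses.

```python
import math
from collections import Counter

def train(examples):
    """Count word occurrences per label from (text, label) pairs."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()  # number of messages seen per label
    for text, label in examples:
        counts[label].update(text.lower().split())
        totals[label] += 1
    return counts, totals

def predict(model, text):
    """Return the label with the higher naive Bayes log score."""
    counts, totals = model
    vocab = set(counts["spam"]) | set(counts["ham"])
    scores = {}
    for label in ("spam", "ham"):
        n_words = sum(counts[label].values())
        # Log prior plus log likelihood of each word, with add-one smoothing
        score = math.log(totals[label] / sum(totals.values()))
        for word in text.lower().split():
            score += math.log((counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy training data: no spam rule is written by hand,
# the word statistics are learned from the labeled examples.
examples = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]
model = train(examples)
print(predict(model, "claim your free prize"))
print(predict(model, "agenda for the team meeting"))
```

Nothing in the code says that "free" or "prize" indicates spam; that association comes only from the labeled examples, which is the point the lecture is making.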

One of the research projects that I'm working on is getting robots to tidy up the house. How do you go about doing that? Well, what you can do is have the robot watch you demonstrate the task and learn from that. The robot can then watch what objects you pick up and where you put them, and try to do the same thing even when you aren't there. For me, one of the reasons I'm excited about this is the AI, or artificial intelligence, problem: building truly intelligent machines that can do just about anything that you or I can do. Many scientists think the best way to make progress on this is through learning algorithms called neural networks, which mimic how the human brain works, and I'll teach you about that, too.

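One of the research projects I am involved in is getting robots to tidy up the house. How is that possible? You can demonstrate the tidying yourself and let the robot learn from it. The robot watches which objects you pick up and where you put them, and can then do the same even when you are not there. This project is very exciting because it is an AI, or artificial intelligence, problem. Scientists think the best way to build truly intelligent machines, robots that can do everything a human can do, is through learning algorithms called artificial neural networks, which mimic the human brain.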

In this class, you learn about machine learning techniques and get to implement them yourself. I hope you sign up on our website and join us.

In this course you will learn machine learning together with hands-on practice. Please sign up on the Coursera website and then enroll in this course.

Introduction

Welcome

Welcome to this free online class on machine learning. Machine learning is one of the most exciting recent technologies. And in this class, you learn about the state of the art and also gain practice implementing and deploying these algorithms yourself.

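Welcome to this free online course on machine learning. Machine learning is one of the most exciting of the recent technologies. In this course you will learn the state of the art and practice implementing the algorithms yourself.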

You probably use a learning algorithm dozens of times a day without knowing it. Every time you use a web search engine like Google or Bing to search the internet, one of the reasons that works so well is because a learning algorithm, one implemented by Google or Microsoft, has learned how to rank web pages. Every time you use Facebook or Apple's photo tagging application and it recognizes your friends' photos, that's also machine learning. Every time you read your email and your spam filter saves you from having to wade through tons of spam email, that's also a learning algorithm. For me, one of the reasons I'm excited is the AI dream of someday building machines as intelligent as you or me. We're a long way away from that goal, but many AI researchers believe that the best way to make progress towards that goal is through learning algorithms that try to mimic how the human brain learns. I'll tell you a little bit about that too in this class.

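You use learning algorithms dozens of times a day without knowing it. Every time you use a web search engine such as Google or Bing, one of the reasons you get good results is a learning algorithm: one implemented by Google or Microsoft that has learned how to rank web pages. Every time you use Facebook or Apple's photo application, a learning algorithm recognizes your friends in the photos. That is machine learning. Every time you read your email, a spam filter saves you from piles of spam. That is also a learning algorithm. The dream of AI is to someday build machines as intelligent as you or me. We are still a long way from that goal, but many AI researchers are working towards it, and they believe the best approach is to build learning algorithms that learn the way the human brain does. This is covered in more detail later in the course.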

In this class you learn about state-of-the-art machine learning algorithms. But it turns out just knowing the algorithms and knowing the math isn't that much good if you don't also know how to actually get this stuff to work on problems that you care about. So, we've also spent a lot of time developing exercises for you to implement each of these algorithms and see how they work for yourself. So why is machine learning so prevalent today? It turns out that machine learning is a field that had grown out of the field of AI, or artificial intelligence. We wanted to build intelligent machines and it turns out that there are a few basic things that we could program a machine to do such as how to find the shortest path from A to B. But for the most part we just did not know how to write AI programs to do the more interesting things such as web search or photo tagging or email anti-spam. There was a realization that the only way to do these things was to have a machine learn to do it by itself. So, machine learning was developed as a new capability for computers and today it touches many segments of industry and basic science.

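In this course you will learn about state-of-the-art machine learning algorithms. Knowing the algorithms and the math is of little use if you do not know how to apply machine learning to the problems you care about. So a lot of time has gone into exercises that let you implement each algorithm and see for yourself how it works. Why is machine learning so widespread today? Machine learning is a field that grew out of AI. We wanted to build intelligent machines, and we could program a machine to do a few basic things, such as finding the shortest path from A to B. But we did not know how to write AI programs for more interesting tasks such as web search, photo tagging, or spam filtering. The only way to do these things was to have the machine learn by itself. So machine learning was developed as a new capability for computers, and today it is used across basic science and many industries.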
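For contrast, the shortest path from A to B is the kind of task that can be programmed explicitly, with no learning involved. The following is a minimal sketch using breadth-first search on an unweighted graph; the `shortest_path` function, the example graph, and the choice of breadth-first search are assumptions made for this illustration and do not come from the lecture.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search: return a path with the fewest edges, or None."""
    queue = deque([[start]])  # each queue entry is a path from start
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # goal is not reachable from start

# Hypothetical directed graph given as an adjacency list
graph = {
    "A": ["C", "D"],
    "C": ["E"],
    "D": ["B"],
    "E": ["B"],
}
print(shortest_path(graph, "A", "B"))  # ['A', 'D', 'B']
```

Every step here is written by hand, which works because the rules of the problem are fully known. Tasks like spam filtering or photo tagging have no such complete set of rules to write down, which is where learning from data comes in.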

For me, I work on machine learning and in a typical week I might end up talking to helicopter pilots, biologists, a bunch of computer systems people (so my colleagues here at Stanford) and averaging two or three times a week I get email from people in industry from Silicon Valley contacting me who have an interest in applying learning algorithms to their own problems. This is a sign of the range of problems that machine learning touches. There is autonomous robotics, computational biology, tons of things in Silicon Valley that machine learning is having an impact on. Here are some other examples of machine learning. There's database mining.

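I work in machine learning, and I talk with helicopter pilots, biologists, and computer systems people who are my colleagues at Stanford. On average, two or three times a week I get email from people working in Silicon Valley who want to apply learning algorithms to their own problems. This shows how wide the range of problems touched by machine learning is. Machine learning is having an impact on autonomous robotics, computational biology, and countless things in Silicon Valley.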

One of the reasons machine learning has become so pervasive is the growth of the web and the growth of automation. All this means that we have much larger data sets than ever before. So, for example, tons of Silicon Valley companies are today collecting web click data, also called clickstream data, and are trying to use machine learning algorithms to mine this data to understand the users better and to serve the users better. That's a huge segment of Silicon Valley right now. Medical records. With the advent of automation, we now have electronic medical records, so if we can turn medical records into medical knowledge, then we can start to understand disease better. Computational biology. With automation again, biologists are collecting lots of data about gene sequences, DNA sequences, and so on, and machines running algorithms are giving us a much better understanding of the human genome, and what it means to be human. And in engineering as well, in all fields of engineering, we have larger and larger data sets that we're trying to understand using learning algorithms.

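One of the reasons machine learning is so widely used is the growth of the web and of automation, which means we have far larger data sets than ever before. Many Silicon Valley companies collect the clickstream data generated whenever users click on the web, and try to use machine learning algorithms to mine that data in order to understand users better and serve them better. This is a huge segment of Silicon Valley right now. With the automation of medical records, we now have electronic medical records, and if we can turn medical records into medical knowledge, we can understand disease better. With automation in computational biology, biologists are collecting large amounts of data on gene sequences, DNA sequences, and so on, and machine learning algorithms are helping us understand the human genome better. In every field of engineering, too, data sets keep getting larger, and we use learning algorithms to make sense of them.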

A second range of machine learning applications is ones that we cannot program by hand. So for example, I've worked on autonomous helicopters for many years. We just did not know how to write a computer program to make this helicopter fly by itself. The only thing that worked was having a computer learn by itself how to fly this helicopter.

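A second range of machine learning applications is those that cannot be programmed by hand. For example, I worked on autonomous helicopters for many years. We did not know how to write a computer program that would make a helicopter fly by itself. The only thing that worked was having a computer learn by itself how to fly the helicopter.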

Handwriting recognition. It turns out one of the reasons it's so inexpensive today to route a piece of mail across the country, in the US and internationally, is that when you write an envelope like this, there's a learning algorithm that has learned how to read your handwriting so that it can automatically route this envelope on its way, and so it costs us a few cents to send this thing thousands of miles.

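Handwriting recognition. Sending a piece of mail across the US or around the world is inexpensive today. A learning algorithm has learned to read the handwritten postal code and address on your envelope and routes the envelope automatically. This lets us send mail thousands of kilometers for a few cents.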

And in fact if you've seen the fields of natural language processing or computer vision, these are the fields of AI pertaining to understanding language or understanding images. Most of natural language processing and most of computer vision today is applied machine learning.

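There are also the fields of natural language processing and computer vision, the areas of AI concerned with understanding language and understanding images. Most of natural language processing and computer vision today is applied machine learning.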

Learning algorithms are also widely used for self-customizing programs. Every time you go to Amazon or Netflix or iTunes Genius, and it recommends movies or products or music to you, that's a learning algorithm. If you think about it, they have a million users; there is no way to write a million different programs for your million users. The only way to have software give these customized recommendations is to have it learn by itself to customize itself to your preferences.

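Learning algorithms are also used for programs that customize themselves. Every time you visit Amazon, Netflix, or iTunes Genius, you get recommendations for movies, products, and music. That is a learning algorithm. Think of a service with millions of users: there is no way to write a separate program for each of them. The only way for software to give each user customized recommendations is for it to learn your preferences by itself and customize itself accordingly.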

Finally, learning algorithms are being used today to understand human learning and to understand the brain. We'll talk about how researchers are using this to make progress towards the big AI dream.

Finally, learning algorithms are used today to understand human learning and the brain. We will talk about how researchers are using machine learning to make progress towards the big AI dream.

A few months ago, a student showed me an article on the top twelve IT skills. The skills that information technology hiring managers cannot say no to. It was a slightly older article, but at the top of this list of the twelve most desirable IT skills was machine learning. Here at Stanford, the number of recruiters that contact me asking if I know any graduating machine learning students is far larger than the machine learning students we graduate each year.

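A few months ago, a student showed me an article on the top twelve IT skills, the skills that IT hiring managers cannot say no to. It was a slightly older article, but at the top of the list of the twelve most desirable IT skills was machine learning. Recruiters often contact me here at Stanford, and the demand for machine learning graduates is far larger than the number of machine learning students who graduate each year.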

So I think there is a vast, unfulfilled demand for this skill set, and this is a great time to be learning about machine learning, and I hope to teach you a lot about machine learning in this class. In the next video, we'll start to give a more formal definition of what is machine learning. And we'll begin to talk about the main types of machine learning problems and algorithms. You'll pick up some of the main machine learning terminology, and start to get a sense of what are the different algorithms, and when each one might be appropriate.

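So this is a great time to be learning about machine learning, and I hope to teach you a lot about it in this course. In the next video, we will start talking about the main types of machine learning problems and algorithms. You will pick up the main machine learning terminology, and start to get a sense of how the different algorithms differ and when each one is appropriate.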