by 라인하트 Oct 26. 2020

Andrew Ng's Machine Learning (8-6): Logical Operations in Neural Networks II

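Professor Andrew Ng, co-founder of the online learning platform Coursera, is a leading figure in the artificial intelligence field. The lectures he gave to machine learning beginners at Stanford University are available for free on Coursera (Coursera.org). The course is a must for anyone new to machine learning, and one that people studying AI and machine learning on their own naturally come across.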

Neural Networks: Representation

Application

Examples and Intuitions II

In this video I'd like to keep working through our example to show how a neural network can compute complex nonlinear hypotheses.

This lecture continues with the logical-operation examples to show how a neural network computes complex nonlinear hypotheses.

In the last video we saw how a neural network can be used to compute the function x1 AND x2 and the function x1 OR x2 when x1 and x2 are binary, that is, when they take on the values 0 or 1. We can also have a network to compute negation, that is, to compute the function NOT x1. Let me just write down the weights associated with this network. We have only one input feature x1 in this case and the bias unit +1. And if I associate this with the weights +10 and -20, then my hypothesis is computing h(x) = g(10 - 20 x1), where g is the sigmoid function. So when x1 is equal to 0, my hypothesis would be computing g(10 - 20 * 0), which is just g(10). And so that's approximately 1, and when x1 is equal to 1, this will be g(-10), which is approximately equal to 0. And if you look at what these values are, that's essentially the NOT x1 function.

The previous lecture explained how a neural network computes the logical operations 'x1 AND x2' and 'x1 OR x2'. Here is a neural network that computes the 'NOT x1' function.

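The network has only the feature x1 and the bias unit +1. Attaching the weights 10 and -20 to them, respectively, gives the hypothesis hθ(x) = g(10 - 20x1). So when x1 = 0, hθ(x) = g(10 - 20 * 0) = g(10), which is close to 1. When x1 = 1, hθ(x) = g(10 - 20 * 1) = g(-10), which is close to 0. This function is therefore NOT x1.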
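The NOT x1 unit can be checked numerically. The following is a minimal sketch in Python, assuming the standard logistic function for g; the function names `sigmoid` and `not_unit` are illustrative and do not come from the lecture.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def not_unit(x1):
    """Single unit with bias weight +10 and input weight -20: g(10 - 20*x1)."""
    return sigmoid(10 - 20 * x1)

# Print the raw activation and its rounded value for both binary inputs.
for x1 in (0, 1):
    activation = not_unit(x1)
    print(x1, activation, round(activation))
```

The raw activations are not exactly 0 or 1, only very close to them, which is why the lecture says "approximately".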

To include negations, the general idea is to put a large negative weight in front of the variable you want to negate. Minus 20 multiplied by x1, and that's the general idea of how you end up negating x1. And so here's an example that I hope you can figure out yourself. If you want to compute a function like (NOT x1) AND (NOT x2), part of that will probably be putting large negative weights in front of x1 and x2, but it should be feasible.

So you get a neural network with just one output unit to compute this as well. All right, so this logical function, NOT x1 AND NOT x2, is going to be equal to 1 if and only if x1 equals x2 equals 0. All right since this is a logical function, this says NOT x1 means x1 must be 0 and NOT x2, that means x2 must be equal to 0 as well. So this logical function is equal to 1 if and only if both x1 and x2 are equal to 0 and hopefully you should be able to figure out how to make a small neural network to compute this logical function as well.

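To negate a variable, the general idea is to put a large negative weight in front of it, which is why x1 is multiplied by -20. You should be able to work out the next example yourself. To compute a function such as (NOT x1) AND (NOT x2), put large negative weights in front of both x1 and x2. A neural network with a single output unit can compute this function too. NOT x1 is 1 when x1 is 0, and NOT x2 is 1 when x2 is 0, so (NOT x1) AND (NOT x2) is 1 if and only if both x1 and x2 are 0. You should be able to figure out how to build a small neural network that computes this logical function.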
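One possible answer to this exercise, sketched in Python. The weights 10, -20, -20 are the ones the lecture later copies from its cyan network; the function names are illustrative assumptions.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def not_x1_and_not_x2(x1, x2):
    """Single unit g(10 - 20*x1 - 20*x2): large negative weights on both inputs."""
    return sigmoid(10 - 20 * x1 - 20 * x2)

# The unit is close to 1 only when both inputs are 0.
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(not_x1_and_not_x2(x1, x2)))
```

With a positive bias of 10, either input being 1 is enough to push the weighted sum down to -10 or below, so the output falls to approximately 0.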

Now, taking the three pieces that we have put together, the network for computing x1 AND x2, the network for computing (NOT x1) AND (NOT x2), and one last network for computing x1 OR x2, we should be able to put these three pieces together to compute this x1 XNOR x2 function. And just to remind you, if this is x1 and this is x2, the function that we want to compute would have negative examples here and here, and we'd have positive examples there and there. And so clearly this will need a nonlinear decision boundary in order to separate the positive and negative examples.

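Computing x1 XNOR x2 takes three pieces.

The three functions are x1 AND x2, (NOT x1) AND (NOT x2), and x1 OR x2. On a graph with x1 on the horizontal axis and x2 on the vertical axis, the x1 XNOR x2 function has y = 1 at the points (0,0) and (1,1) and y = 0 at the points (1,0) and (0,1). Separating the 0 examples from the 1 examples therefore requires a nonlinear decision boundary.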

Let's draw the network. I'm going to take my inputs +1, x1, x2 and create my first hidden unit here. I'm gonna call this a^(2)1 because that's my first hidden unit. And I'm gonna copy the weights over from the red network, the x1 AND x2 network, so that's -30, 20, 20. Next let me create a second hidden unit, which I'm going to call a^(2)2. That is the second hidden unit of layer two.

I'm going to copy over the cyan network in the middle, so I'm gonna have the weights 10, -20, -20. And so, let's fill in some of the truth table values. For the red network, we know that was computing x1 AND x2, and so this will be approximately 0, 0, 0, 1, depending on the values of x1 and x2. And for a^(2)2, the cyan network, what do we know? The function (NOT x1) AND (NOT x2), which outputs 1, 0, 0, 0 for the four values of x1 and x2.

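Let's draw the network for the x1 XNOR x2 function. From the inputs +1, x1, and x2, create the first hidden unit. The first hidden unit is a^(2)1, and it copies the weights -30, 20, 20 from the red network. The second hidden unit, a^(2)2, is the second unit of layer two and takes the weights 10, -20, -20 from the cyan network. Now fill in the truth table. The hidden unit a^(2)1 computes x1 AND x2, so its values for the four combinations of x1 and x2 are 0, 0, 0, 1. The hidden unit a^(2)2 computes (NOT x1) AND (NOT x2), so its values are 1, 0, 0, 0.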
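The hidden-layer truth table can be reproduced with a short Python sketch. It uses the red-network weights (-30, 20, 20) and cyan-network weights (10, -20, -20) from the lecture; the function name `hidden_layer` is an illustrative assumption.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def hidden_layer(x1, x2):
    """Return (a^(2)1, a^(2)2) for binary inputs x1 and x2."""
    a21 = sigmoid(-30 + 20 * x1 + 20 * x2)  # red network: x1 AND x2
    a22 = sigmoid(10 - 20 * x1 - 20 * x2)   # cyan network: (NOT x1) AND (NOT x2)
    return a21, a22

# One row per input combination: x1, x2, rounded a^(2)1, rounded a^(2)2.
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    a21, a22 = hidden_layer(x1, x2)
    print(x1, x2, round(a21), round(a22))
```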

Finally, I'm going to create my output node, my output unit, that is a^(3)1. This is what will output h(x), and I'm going to copy over the OR network for that. And I'm going to need a +1 bias unit here, so I draw that in, and I'm going to copy over the weights from the green network. So that's -10, 20, 20, and we know from earlier that this computes the OR function. So let's fill in the truth table entries. The first entry is 0 OR 1, which is 1; then 0 OR 0, which is 0; 0 OR 0, which is 0; and 1 OR 0, which is 1. And thus h(x) is equal to 1 when either both x1 and x2 are 0 or when x1 and x2 are both 1, and concretely h(x) outputs 1 exactly at these two locations and outputs 0 otherwise.

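Finally, create the output node. The output unit is a^(3)1.

The hypothesis hθ(x) is the logical operation a^(2)1 OR a^(2)2. Add a bias unit to the second layer, copy the weights -10, 20, 20 from the green network, apply the sigmoid function, and complete the truth table.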

The hypothesis hθ(x) is 1 when x1 and x2 are both 0 or both 1, and 0 when exactly one of them is 1.
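Putting the three pieces together, the full forward pass can be sketched as follows. The weights are those given in the lecture; the names `unit`, `THETA1`, `THETA2`, and `xnor_network` are illustrative assumptions, not part of the course material.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def unit(weights, inputs):
    """One sigmoid unit; weights[0] multiplies the +1 bias unit."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return sigmoid(z)

# Layer 1 -> layer 2: one row of weights per hidden unit.
THETA1 = [[-30, 20, 20],    # a^(2)1: x1 AND x2
          [10, -20, -20]]   # a^(2)2: (NOT x1) AND (NOT x2)
# Layer 2 -> layer 3: the OR unit applied to the hidden activations.
THETA2 = [-10, 20, 20]

def xnor_network(x1, x2):
    """Forward pass: inputs -> hidden layer -> output a^(3)1 = h(x)."""
    hidden = [unit(row, (x1, x2)) for row in THETA1]
    return unit(THETA2, hidden)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(xnor_network(x1, x2)))
```

Note that the output unit receives the real-valued hidden activations rather than exact 0s and 1s; the weights are large enough that the result still saturates near 0 or 1.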

And thus with this neural network, which has an input layer, one hidden layer, and one output layer, we end up with a nonlinear decision boundary that computes this XNOR function. And the more general intuition is that in the input layer, we just have our raw inputs. Then we have a hidden layer, which computes some slightly more complex functions of the inputs, as shown here. And then by adding yet another layer we end up with an even more complex nonlinear function.

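A neural network with an input layer, a hidden layer, and an output layer thus produces a nonlinear decision boundary that computes the XNOR function. The input layer holds only the raw inputs. The hidden layer computes slightly more complex functions of those inputs. Adding yet another layer yields an even more complex nonlinear function.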

And this is a sort of intuition about why neural networks can compute pretty complicated functions. When you have multiple layers, you have relatively simple functions of the inputs in the second layer. But the third layer can build on that to compute even more complex functions, and then the layer after that can compute even more complex functions.

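This is why neural networks can compute very complicated functions. With multiple layers, the second layer computes relatively simple functions of the inputs. The third layer builds on those to compute more complex functions, and the fourth layer can compute functions that are more complex still.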
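For the XNOR network in this lecture, the layering can be written as a composition of sigmoid layers. The matrix notation Θ^(1), Θ^(2) below is an assumption borrowed from the course's general convention; the weight values are the ones used in the lecture.

```latex
\begin{aligned}
a^{(2)} &= g\!\left(\Theta^{(1)} \begin{bmatrix} 1 \\ x_1 \\ x_2 \end{bmatrix}\right),
&\Theta^{(1)} &= \begin{bmatrix} -30 & 20 & 20 \\ 10 & -20 & -20 \end{bmatrix}, \\
h_\theta(x) = a^{(3)}_1 &= g\!\left(\Theta^{(2)} \begin{bmatrix} 1 \\ a^{(2)}_1 \\ a^{(2)}_2 \end{bmatrix}\right),
&\Theta^{(2)} &= \begin{bmatrix} -10 & 20 & 20 \end{bmatrix}.
\end{aligned}
```

Each layer applies g elementwise to a linear function of the previous layer's activations, so every added layer composes one more nonlinearity on top of the features already computed.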

To wrap up this video, I want to show you a fun example of an application of a neural network that captures this intuition of the deeper layers computing more complex features. I want to show you a video from a good friend of mine, Yann LeCun. Yann is a professor at New York University, NYU, and he was one of the early pioneers of neural network research. He is sort of a legend in the field now, and his ideas are used in all sorts of products and applications throughout the world.

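To wrap up this lecture, here is an example of a neural network that captures the intuition of deeper layers computing more complex features. It is a video from my friend Yann LeCun. Yann is a professor at New York University and one of the pioneers of neural network research. He is a legend in the field, and his ideas are used in all sorts of products and applications around the world.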

So I wanna show you a video from some of his early work in which he was using a neural network to recognize handwriting, to do handwritten digit recognition. You might remember that at the start of this class I said that one of the earliest successes of neural networks was using them to read zip codes, to help the USPS read postal codes. So this is one of the attempts, one of the algorithms used to try to address that problem. In the video that I'll show you, this area here is the input area that shows a handwritten character shown to the network. This column here shows a visualization of the features computed by the first hidden layer of the network. So for the first hidden layer, this visualization shows different features, different edges and lines and so on, being detected. This is a visualization of the next hidden layer. It's kinda harder to see, harder to understand the deeper hidden layers, and that's a visualization of the next hidden layer after that. You probably have a hard time seeing what's going on much beyond the first hidden layer, but then finally, all of these learned features get fed to the upper layer. And shown over here is the final answer, the final predicted value for what handwritten digit the neural network thinks it is being shown. So let's take a look at the video.

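This is a video clip of Yann's early work using a neural network to recognize handwriting. Early in the course, reading zip codes was mentioned as one of the first successes of neural networks, and this is one of the algorithms used to tackle that problem. In the video, the box containing a digit is the input area showing the character presented to the network. The leftmost column visualizes the features computed by the first hidden layer, which detects different edges and lines. The second column from the left visualizes the second hidden layer. Deeper hidden layers are harder to interpret, and the deeper they are, the harder it gets. The third column visualizes the third hidden layer. It is hard to tell what is going on much beyond the first hidden layer, but in the end all of the learned features are passed to the upper layer. The value shown at the top right is the final result: the network's prediction of which handwritten digit it is being shown. Let's watch the video.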

So I hope you enjoyed the video and that this gave you some intuition about the sorts of pretty complicated functions neural networks can learn. It takes as input this image, just the raw pixels, and the first hidden layer computes some set of features. The next hidden layer computes even more complex features, and the next even more complex features. And these features can then be used by essentially the final layer of logistic classifiers to make accurate predictions about the numbers that the network sees.

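I hope you enjoyed the video and that it gave you a feel for the kinds of very complicated functions a neural network can learn. The network takes the image, just the raw pixels, as input, and the first hidden layer computes a set of features. The next hidden layers compute increasingly complex features. The final layer of logistic classifiers then uses these features to make accurate predictions.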

Andrew Ng's Machine Learning video lecture

Summary

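The logical operation NOT x1, which outputs the opposite of its input, can be implemented in a neural network.

That is, the network outputs the opposite of the input x1, so the logical function it computes is NOT x1.

Next is the logical operation (NOT x1) AND (NOT x2).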

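Unlike the logical operations covered so far, XNOR cannot be computed by a single unit. The operations are applied in sequence: x1 AND x2 and (NOT x1) AND (NOT x2) are computed first in the hidden layer, and x1 OR x2 is then applied to their outputs.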

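Finally, we watched Yann LeCun's neural network for recognizing handwritten zip codes. The left column visualizes the features computed by the network's first hidden layer, followed by visualizations of the second and third hidden layers. The deeper the hidden layer, the harder it is to interpret. It is hard to tell what is going on in the network beyond the first hidden layer, but the final output layer still produces the desired value.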