by 라인하트 Oct 06. 2020

Andrew Ng's Machine Learning (3-4): Matrix-Matrix Multiplication

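Professor Andrew Ng, co-founder of the online learning platform Coursera, is a leading figure in the AI field. The introductory machine learning course he taught at Stanford University is available for free on Coursera (Coursera.org). It is an essential course for machine learning beginners, and one you naturally come across when studying AI and machine learning on your own.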

Linear Algebra Review

Matrix-Matrix Multiplication

In this video we'll talk about matrix-matrix multiplication, or how to multiply two matrices together. When we talk about the method in linear regression for how to solve for the parameters theta 0 and theta 1 all in one shot, without needing an iterative algorithm like gradient descent, it turns out that matrix-matrix multiplication is one of the key steps that you need to know.

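This lecture explains matrix-matrix multiplication. Linear regression has a method that finds the optimal values of the parameters θ0 and θ1 in one shot, without an iterative algorithm like gradient descent. Matrix-matrix multiplication is one of the key steps of that method, so we need to understand it first.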

So let's, as usual, start with an example. Let's say I have two matrices and I want to multiply them together. Let me again just run through this example and then I'll tell you a little bit of what happened.

Here are two matrices. Let's see what happens when we multiply them together.

So the first thing I'm gonna do is I'm going to pull out the first column of this matrix on the right. And I'm going to take this matrix on the left and multiply it by a vector that is just this first column. And it turns out, if I do that, I'm going to get the vector [11, 9]. So this is the same matrix-vector multiplication as you saw in the last video. I worked this out in advance, so I know it's 11, 9. And then the second thing I want to do is I'm going to pull out the second column of this matrix on the right. And I'm then going to take this matrix on the left, so take that matrix, and multiply it by that second column on the right. So again, this is a matrix-vector multiplication step which you saw from the previous video. And it turns out that if you multiply this matrix and this vector you get 10, 14. And by the way, if you want to practice your matrix-vector multiplication, feel free to pause the video and check this product yourself. Then I'm just gonna take these two results and put them together, and that'll be my answer. So it turns out the outcome of this product is gonna be a two by two matrix. And the way I'm gonna fill in this matrix is just by taking my elements 11, 9, and plugging them here. And taking 10, 14 and plugging them into the second column, okay? So that was the mechanics of how to multiply a matrix by another matrix. You basically look at the second matrix one column at a time and you assemble the answers.

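First, we use the matrix-vector multiplication from the previous lecture. Take the first column of the right-hand matrix as the vector [1; 0; 5]. Multiplying the 2 x 3 matrix on the left by this vector gives the two-dimensional vector [11; 9]. Next, take the second column of the right-hand matrix as the vector [3; 1; 2]. Multiplying the 2 x 3 matrix on the left by this vector gives the two-dimensional vector [10; 14]. It is worth pausing the video to practice these matrix-vector multiplications yourself.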

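Now we have two results, [11; 9] and [10; 14]. Put the first result vector [11; 9] into the first column of the result matrix and the second result vector [10; 14] into the second column. The product is therefore a 2 x 2 matrix. These are the mechanics of multiplying a matrix by a matrix: split the second matrix into column vectors, multiply the first matrix by each one, and put the results together.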
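The column-by-column procedure above can be sketched in plain Python. The text gives only the columns of the right-hand matrix and the two results; the 2 x 3 left-hand matrix `A` below is an assumption, chosen to be consistent with those results, and the helper name `mat_vec` is hypothetical.

```python
# Assumed left matrix (2 x 3): not given in the text, chosen to match its results.
A = [[1, 3, 2],
     [4, 0, 1]]

# Columns of the right matrix (3 x 2), as given in the text.
b1 = [1, 0, 5]
b2 = [3, 1, 2]

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

c1 = mat_vec(A, b1)  # first column of the product
c2 = mat_vec(A, b2)  # second column of the product

# Place the two result vectors side by side as the columns of C.
C = [[c1[i], c2[i]] for i in range(len(c1))]
print(C)  # [[11, 10], [9, 14]]
```

Each row of `C` holds one element from each result vector, so the vectors [11; 9] and [10; 14] appear as the columns of the 2 x 2 product.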

And again, we'll step through this much more carefully in a second. But I just want to point out also, this first example is a 2x3 matrix. Multiply that by a 3x2 matrix, and the outcome of this product turns out to be a 2x2 matrix. And again, we'll see in a second why this was the case. All right, that was the mechanics of the calculation. Let's actually look at the details and look at what exactly happened. Here are the details. I have a matrix A and I want to multiply that with a matrix B and the result will be some new matrix C. It turns out you can only multiply together matrices whose dimensions match. So A is an m x n matrix, so m rows, n columns. And we multiply with an n x o matrix. And it turns out this n here must match this n here. So the number of columns in the first matrix must equal to the number of rows in the second matrix. And the result of this product will be a m x o matrix, like the matrix C here. And in the previous video everything we did corresponded to the special case of o being equal to 1. That was to the case of B being a vector. But now we're gonna deal with the case of values of o larger than 1.

Let's go over the calculation once more. In the first example, matrix A is a 2 x 3 matrix and matrix B is a 3 x 2 matrix. Their product is a 2 x 2 matrix. The reason is explained next.

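Let's check the calculation in detail. Multiplying matrix A by matrix B produces matrix C. Two matrices can be multiplied only when their dimensions match. Matrix A is an m x n matrix, with m rows and n columns. Matrix B is an n x o matrix, with n rows and o columns. The n of A and the n of B must be equal: the number of columns of the first matrix must equal the number of rows of the second. The result is the m x o matrix C. Matrix-vector multiplication is the special case where B is a vector, that is, o = 1. Matrix-matrix multiplication is the case where o is greater than 1.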
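The dimension rule can be written as a small check. This is a minimal sketch; the function name `product_shape` and the tuple representation of a shape are assumptions made for illustration.

```python
def product_shape(shape_a, shape_b):
    """Return the shape of A times B, or raise if the inner dimensions differ."""
    m, n_a = shape_a  # A is m x n
    n_b, o = shape_b  # B is n x o
    if n_a != n_b:
        raise ValueError(
            f"cannot multiply {m}x{n_a} by {n_b}x{o}: inner dimensions differ"
        )
    return (m, o)

print(product_shape((2, 3), (3, 2)))  # (2, 2): the first example
print(product_shape((2, 3), (3, 1)))  # (2, 1): matrix times vector, o = 1
```

Calling `product_shape((2, 3), (2, 3))` raises a `ValueError`, because the 3 columns of the first matrix do not match the 2 rows of the second.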

So here's how you multiply together the two matrices. What I'm going to do is I'm going to take the first column of B and treat that as a vector, and multiply the matrix A by the first column of B. And the result of that will be an m by 1 vector, and I'm gonna put that over here. Then I'm gonna take the second column of B, right? So this is another n by 1 vector. So this column here, this is n by 1. It's an n-dimensional vector. Gonna multiply this matrix with this n by 1 vector. The result will be an m-dimensional vector, which we'll put there, and so on. And then I'm gonna take the third column, multiply it by this matrix. I get an m-dimensional vector. And so on, until you get to the last column. The matrix times the last column gives you the last column of C. Just to say that again, the ith column of the matrix C is obtained by taking the matrix A and multiplying the matrix A with the ith column of the matrix B for the values of i = 1, 2, up through o. So this is just a summary of what we did up there in order to compute the matrix C.

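Here is how to multiply the two matrices. The first column of matrix B is a vector. Multiply matrix A by this first column vector of B. The result is an m x 1 vector, which becomes the first column of matrix C. This is the green box in matrix C.

The second column of matrix B is also a vector. Multiply matrix A by the vector formed from the second column of B. The result is an m x 1 vector, which becomes the second column of matrix C. This is the pink box in matrix C.

The third column of matrix B is also a vector. Multiply matrix A by the vector formed from the third column of B. The result is an m x 1 vector, which becomes the third column of matrix C. This is the orange box in matrix C.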

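Continue in the same way up to the last column, placing each m-dimensional vector into matrix C. In other words, the i-th column of matrix C is obtained by multiplying matrix A by the i-th column of matrix B. Repeat this for i = 1 through o. This is how matrix C is computed.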
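The general procedure can be sketched as a short function. This is an illustrative version under stated assumptions, not an optimized implementation: matrices are lists of rows, the name `mat_mul_by_columns` is hypothetical, and the column index runs from 0 to o - 1 because Python counts from zero.

```python
def mat_mul_by_columns(A, B):
    """Compute C = A B by multiplying A with each column of B in turn."""
    m, n = len(A), len(A[0])
    if len(B) != n:
        raise ValueError("columns of A must equal rows of B")
    o = len(B[0])

    C = [[0] * o for _ in range(m)]  # the result is m x o
    for i in range(o):
        # Treat the i-th column of B as an n-dimensional vector.
        column = [B[k][i] for k in range(n)]
        # A times that vector is an m-dimensional vector.
        result = [sum(A[r][k] * column[k] for k in range(n)) for r in range(m)]
        # Store it as the i-th column of C.
        for r in range(m):
            C[r][i] = result[r]
    return C

identity = [[1, 0], [0, 1]]
print(mat_mul_by_columns([[1, 2], [3, 4]], identity))  # [[1, 2], [3, 4]]
```

Multiplying by the identity matrix returns the original matrix, which is a quick sanity check that each column lands in the right place.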

Let's look at just one more example. Let's say I want to multiply together these two matrices.

Let's look at the next example. Suppose we want to multiply these two matrices.

So what I'm going to do is first pull out the first column of my second matrix. That was my matrix B on the previous slide and I therefore have this matrix times that vector. And so, oh, let's do this calculation quickly. This is going to be equal to the 1, 3 x 0, 3, so that gives 1 x 0 + 3 x 3. And the second element is going to be 2, 5 x 0, 3, so that's gonna be 2 x 0 + 5 x 3. And that is 9, 15. Oh, actually let me write that in green. So this is 9, 15. And then next I'm going to pull out the second column of this and do the corresponding calculations. So that's this matrix times this vector 1, 2. Let's also do this quickly, so that's 1 x 1 + 3 x 2, so that was that row. And let's do the other one. So let's see, that gives me 2 x 1 + 5 x 2 and so that is going to be equal to, let's see, 1 x 1 + 3 x 2 is 7 and 2 x 1 + 5 x 2 is 12. So now I have these two and so my outcome, the product of these two matrices, is going to be this goes here and this goes here. So I get 9, 15 and 4, 12. [It should be 7, 12] And you may notice also that the result of multiplying a 2x2 matrix with another 2x2 matrix, the resulting dimension is going to be that first 2 times that second 2. So the result is itself also a 2x2 matrix.

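Let's work it out.

First, multiply the first matrix by the first column vector of the second matrix.

The first element is 1x0 + 3x3 = 9.

The second element is 2x0 + 5x3 = 15.

The result is [9; 15], which goes into the first column of the result matrix.

Next, multiply the first matrix by the second column vector of the second matrix.

The first element is 1x1 + 3x2 = 7.

The second element is 2x1 + 5x2 = 12.

The result is [7; 12], which goes into the second column of the result matrix.

The dimension of the product of a 2 x 2 matrix and a 2 x 2 matrix is given by the first 2 of the first matrix and the last 2 of the second matrix. The result is a 2 x 2 matrix.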
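The arithmetic in this example can be checked with a few lines of Python. The two matrices are read off from the calculation in the text (rows 1, 3 and 2, 5 for the first matrix; columns 0, 3 and 1, 2 for the second); the variable names are assumptions.

```python
A = [[1, 3],
     [2, 5]]
B = [[0, 1],
     [3, 2]]

# zip(*B) iterates over the columns of B: (0, 3) and then (1, 2).
# For each column, multiply A by that column vector.
columns = [[sum(a * b for a, b in zip(row, col)) for row in A] for col in zip(*B)]
print(columns)  # [[9, 15], [7, 12]]

# Turn the list of result columns back into a list of rows.
C = [list(row) for row in zip(*columns)]
print(C)  # [[9, 7], [15, 12]]
```

The second column comes out as [7; 12], which agrees with the bracketed correction in the transcript.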

Finally, let me show you one more neat trick that you can do with matrix-matrix multiplication. Let's say, as before, that we have four houses whose prices we wanna predict. Only now, we have three competing hypotheses shown here on the right. So if you want to apply all three competing hypotheses to all four of your houses, it turns out you can do that very efficiently using a matrix-matrix multiplication. So here on the left is my usual matrix, same as from the last video where these values are my housing prices [he means housing sizes] and I've put 1s here on the left as well.

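Finally, let's see how matrix-matrix multiplication is used in machine learning. We have size data for four houses whose prices we want to predict, and three competing hypotheses shown on the right. Matrix-matrix multiplication is very useful for applying every hypothesis to all four data points: we write the computation as a data matrix times a parameter matrix.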

And what I am going to do is construct another matrix where here, the first column is this -40 and 0.25 and the second column is this 200, 0.1 and so on. And it turns out that if you multiply these two matrices, what you find is that this first column, I'll draw that in blue. Well, how do you get this first column? Our procedure for matrix-matrix multiplication is, the way you get this first column is you take this matrix and you multiply it by this first column. And we saw in the previous video that this is exactly the predicted housing prices of the first hypothesis, right, of this first hypothesis here. And how about the second column? Well, [INAUDIBLE] second column. The way you get the second column is, well, you take this matrix and you multiply it by this second column. And so the second column turns out to be the predictions of the second hypothesis up there, and similarly for the third column.

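On the left, build the data matrix from the house sizes. The second matrix is the matrix of parameters θ.

How is the first column computed? Multiply the first matrix by the first column of the second matrix. This is the calculation for the vector [-40; 0.25] from the previous lecture. It is the blue box in the result matrix.

How is the second column computed? Multiply the first matrix by the second column of the second matrix. It is the red box in the result matrix. The third column is computed the same way. It is the pink box in the result matrix.

Not every step is worked out here, so it is a good idea to compute the values yourself and check them. To summarize, there are two matrices: the parameter matrix built from the three hypotheses, and the data matrix built from the four house sizes.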
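The data-matrix-times-parameter-matrix idea can be sketched as follows. Only the first two parameter columns, (-40, 0.25) and (200, 0.1), come from the text; the four house sizes and the third hypothesis are illustrative assumptions, since the text does not state them.

```python
# House sizes: illustrative values, not given in the text.
sizes = [2104, 1416, 1534, 852]

# Data matrix (4 x 2): a column of 1s, then the house sizes.
X = [[1, s] for s in sizes]

# Each hypothesis h(x) = theta0 + theta1 * x is one (theta0, theta1) pair.
# The first two pairs are from the text; the third is an assumed placeholder.
hypotheses = [(-40, 0.25), (200, 0.1), (-150, 0.4)]

# Parameter matrix (2 x 3): one column per hypothesis.
Theta = [[h[0] for h in hypotheses],
         [h[1] for h in hypotheses]]

# predictions[i][j] is the price predicted for house i by hypothesis j.
predictions = [
    [sum(x * t for x, t in zip(row, col)) for col in zip(*Theta)]
    for row in X
]

count = sum(len(row) for row in predictions)
print(count)  # 12 predictions from a single matrix product
```

Column j of `predictions` holds the four predictions of hypothesis j, which matches the column-by-column reading of the product described above.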

So with just one matrix multiplication step you managed to make 12 predictions.

And even better, it turns out that in order to do that matrix multiplication, there are lots of good linear algebra libraries in order to do this multiplication step for you. And so pretty much any reasonable programming language that you might be using. Certainly all the top ten most popular programming languages will have great linear algebra libraries. And there'll be good linear algebra libraries that are highly optimized in order to do that matrix-matrix multiplication very efficiently. Including taking advantage of any sort of parallel computation that your computer may be capable of, whether your computer has multiple cores or multiple processors. Or within a processor sometimes there's parallelism as well called SIMD parallelism that your computer can take care of. And there are very good free libraries that you can use to do this matrix-matrix multiplication very efficiently, so that you can very efficiently make lots of predictions with lots of hypotheses.

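A single matrix multiplication produced 12 predicted prices. Even better, there are many linear algebra libraries that do the matrix multiplication for you. Any reasonable programming language you might use has good linear algebra libraries, and they are highly optimized to compute matrix multiplication efficiently. These libraries take advantage of parallel computation, for example on computers with multiple cores or multiple processors. Even within a single processor there is a form of parallelism, called SIMD, that they can exploit. There are also many good free libraries for efficient matrix multiplication, so you can make many predictions with many hypotheses efficiently.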