by 라인하트 Oct 09. 2020

Andrew Ng's Machine Learning (3-6): Inverse and Transpose

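Professor Andrew Ng, a co-founder of the online course platform Coursera, is a leading figure in the AI field. The machine learning course he taught to beginners at Stanford University is available for free on Coursera (Coursera.org). The course is essential for newcomers to machine learning, and anyone studying AI and machine learning on their own will naturally come across it.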

Linear Algebra Review

Inverse and Transpose

In this video, I want to tell you about a couple of special matrix operations, called the matrix inverse and the matrix transpose operation.

This lecture explains the matrix inverse and the matrix transpose.

Let's start by talking about the matrix inverse, and as usual we'll start by thinking about how it relates to real numbers. In the last video, I said that the number one plays the role of the identity in the space of real numbers, because one times anything is equal to itself. It turns out that real numbers have this property that numbers have inverses. For example, given the number three, there exists some number, which happens to be three inverse, so that three times that number gives you back the identity element one. And three inverse, of course, is just one third. And given some other number, maybe twelve, there is some number which is the inverse of twelve, written as twelve to the minus one, or really this is just one twelfth, so that when you multiply these two things together, the product is equal to the identity element one again. Now it turns out that in the space of real numbers, not everything has an inverse. For example, the number zero does not have an inverse, right? Because zero inverse would be one over zero, and that's undefined. One over zero is not well defined.

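We start with the matrix inverse. To build intuition, first consider inverses (reciprocals) of real numbers. In the last lecture we saw that 1 is the identity for real numbers: multiplying any number by 1 leaves it unchanged. Multiplying a real number by its inverse gives the identity 1. For example, the inverse of 3 is 1/3, since 3 × 1/3 = 1. Likewise, the inverse of 12 is written 12^-1, which is 1/12, and 12 × 1/12 = 1. Not every real number has an inverse, however. For example, 0 has no inverse, because 0^-1 = 1/0 is undefined.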

And what we want to do, in the rest of this slide, is figure out what it means to compute the inverse of a matrix. Here's the idea: if A is an m by m matrix, and if it has an inverse (I will say a bit more about that later), then the inverse is going to be written A to the minus one, and A times this inverse, A to the minus one, is going to equal A inverse times A, and that is going to give us back the identity matrix. Okay? Only matrices that are m by m, for some value of m, have an inverse. A matrix that is m by m is also called a square matrix, and it's called square because the number of rows is equal to the number of columns. It turns out only square matrices have inverses, so if A is a square matrix, m by m, its inverse satisfies this equation over here.

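The inverse of a matrix is the matrix analogue of a reciprocal: multiplying a matrix by its inverse gives the identity matrix. If A is an m × m matrix that has an inverse, the inverse is written A^-1, and 'A × A^-1 = A^-1 × A = I'. Only m × m matrices can have an inverse. An m × m matrix is called a square matrix because its number of rows equals its number of columns. Only square matrices have inverses.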
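In symbols, the definition above can be written as follows. This is only a restatement of the lecture's equation; the notation I_{m×m} for the m × m identity matrix is the one used in this review.

```latex
A \in \mathbb{R}^{m \times m}, \qquad A A^{-1} = A^{-1} A = I_{m \times m}
```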

Let's look at a concrete example. Let's say I have a matrix, three, four, two, sixteen. This is a two by two matrix, so it's a square matrix, and so this matrix may have an inverse. It turns out that I happen to know the inverse of this matrix is zero point four, minus zero point one, minus zero point zero five, zero point zero seven five. And if I take these two matrices and multiply them together, it turns out what I get is the two by two identity matrix, I, this is I two by two. Okay? So on this slide, this matrix is the matrix A, and this matrix is the matrix A-inverse. And whether you compute A times A-inverse or A-inverse times A, you get back the identity matrix. So how did I find this inverse, or how did I come up with this inverse over here? It turns out that sometimes you can compute inverses by hand, but almost no one does that these days. There is very good numerical software for taking a matrix and computing its inverse. So again, this is one of those things where there are lots of open source libraries that you can link to from any of the popular programming languages to compute inverses of matrices.

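Here is a more concrete example.

The matrix A = [3, 4; 2, 16] is a 2 × 2 square matrix, so it can have an inverse, and its inverse is A^-1 = [0.4, -0.1; -0.05, 0.075]. Multiplying the two matrices gives the 2 × 2 identity matrix, in either order. So how is the inverse of a square matrix computed? It can be computed by hand, but almost no one does that, because software computes it for you. Every popular programming language has many open source libraries for computing matrix inverses.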
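To make the example concrete, here is a minimal Python sketch that reproduces the inverse of the 2 × 2 matrix from the slide and checks both products. It is not the lecture's method (the lecture uses Octave): the helpers `inverse_2x2` and `matmul` are hypothetical names written for this illustration, the closed-form 2 × 2 inverse formula is a standard result that the lecture does not derive, and `Fraction` is used only so that the check is exact.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Return the inverse of a 2x2 matrix given as nested lists (illustrative helper)."""
    (a, b), (c, d) = m
    det = a * d - b * c  # determinant; zero means no inverse exists
    if det == 0:
        raise ValueError("matrix is singular; no inverse exists")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(x, y):
    """Multiply two matrices given as nested lists (rows of x times columns of y)."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


# The matrix A from the slide, stored as exact fractions.
A = [[Fraction(3), Fraction(4)], [Fraction(2), Fraction(16)]]
A_inv = inverse_2x2(A)
identity = [[1, 0], [0, 1]]

print([[float(v) for v in row] for row in A_inv])
print(matmul(A, A_inv) == identity, matmul(A_inv, A) == identity)
```

With exact fractions both products equal the identity exactly; with floating-point numbers, as in the Octave session, the off-diagonal entries are only approximately zero.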

Let me show you a quick example of how I actually computed this inverse. What I did was use software called Octave. So let me bring that up. We will see a lot about Octave later. Let me just quickly show you an example. I set my matrix A to be equal to that matrix on the left, typing three four two sixteen, so that's my matrix A. This is the matrix 3, 4, 2, 16 that I have down here on the left. And the software lets me compute the inverse of A very easily: it's pinv(A). And so this is right, this matrix here is zero point four, minus zero point one, and so on. This gives the numerical solution to what the inverse of A is. So let me just write inverseOfA equals pinv of A, so that I can now verify that A times A inverse is the identity. I type A times the inverse of A, and the result is this matrix, with ones on the diagonal and essentially ten to the minus seventeen, ten to the minus sixteen elsewhere. So up to numerical precision, up to a little bit of round-off error that my computer had in computing the inverse, these numbers off the diagonal are essentially zero, so A times the inverse is essentially the identity matrix. You can also verify that the inverse of A times A is equal to the identity: ones on the diagonal and values that are essentially zero, except for a little bit of round-off error, on the off-diagonals.

Here is a short example of computing an inverse in Octave. First, open the Octave window.

A = [3, 4; 2, 16]; % define the entries of matrix A

inverseOfA = pinv(A) % pass A to pinv(); pinv() computes the pseudoinverse, which equals the inverse when A is invertible

A single line of code computes the inverse. Next, check that multiplying A by its inverse gives the identity matrix.

A * inverseOfA % multiply matrix A by its inverse inverseOfA

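The output here is [1, 0; -0, 1]. The diagonal entries are 1, and the off-diagonal entries are 0 or essentially 0 (on the order of 10^-17 to 10^-16), apart from a little round-off error. Because the off-diagonal entries are zero up to numerical precision, this matrix is essentially the identity matrix.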
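The round-off behaviour described above can be sketched in Python with ordinary floating-point numbers. This is an illustration under stated assumptions rather than a reproduction of the Octave session: `matmul` and `is_identity` are hypothetical helpers, and the tolerance `1e-12` is an arbitrary choice for this example.

```python
import math


def matmul(x, y):
    """Multiply two matrices given as nested lists."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def is_identity(m, tol=1e-12):
    """Check that m equals the identity matrix up to a small absolute tolerance."""
    return all(
        math.isclose(v, 1.0 if i == j else 0.0, abs_tol=tol)
        for i, row in enumerate(m)
        for j, v in enumerate(row)
    )


A = [[3.0, 4.0], [2.0, 16.0]]
A_inv = [[0.4, -0.1], [-0.05, 0.075]]  # inverse values quoted in the lecture

product = matmul(A, A_inv)
print(product)               # entries may differ from 1 and 0 by tiny round-off amounts
print(is_identity(product))  # compares with a tolerance instead of exact equality
```

Comparing with a tolerance rather than with `==` is the usual way to treat values such as 10^-17 as zero.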

In the definition of the inverse of a matrix, I had this caveat: first, it must always be a square matrix, and second, A must actually have an inverse. Exactly which matrices have an inverse is beyond the scope of this linear algebra review, but one intuition you might take away is that, just as the number zero doesn't have an inverse, if A is, say, the matrix of all zeros, then this matrix A also does not have an inverse, because there's no A-inverse matrix such that this matrix times some other matrix will give you the identity matrix. So there is this matrix of all zeros, and there are a few other matrices with properties similar to this that also don't have an inverse. In this review I don't want to go too deeply into what it means for a matrix to have an inverse, but it turns out that for our machine learning applications this shouldn't be an issue. Or, more precisely, for the learning algorithms where this may be an issue, namely where it matters whether or not an inverse matrix exists, I will tell you when we get to those learning algorithms just what it means for a matrix to have or not have an inverse, and how to fix things in case you are working with matrices that don't have inverses. But the intuition, if you want, is that you can think of matrices that do not have an inverse as being somehow too close to zero in some sense. So, just to wrap up the terminology, a matrix that doesn't have an inverse is sometimes called a singular matrix or a degenerate matrix, and so this matrix over here, the zero zero zero zero matrix, is an example of a matrix that is singular, or a matrix that is degenerate.

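Only square matrices can have an inverse, and the inverse is itself a square matrix. Exactly which matrices have an inverse is beyond the scope of this linear algebra review. Among real numbers, 0 has no inverse. Similarly, a matrix A whose entries are all 0 has no inverse, because no matrix multiplied by it can produce the identity matrix. There are a few other kinds of matrices without an inverse, but we will not study them in detail here. This is not an obstacle to understanding learning algorithms in machine learning; the course will explain, for the algorithms where it matters, what to do when a matrix does or does not have an inverse. Intuitively, a matrix without an inverse can be thought of as being too close to zero in some sense. One piece of terminology: a matrix that has no inverse is called a singular matrix or a degenerate matrix. For example, the matrix [0, 0; 0, 0] is singular.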
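One way to make "singular" concrete for the 2 × 2 case is the determinant test sketched below. The lecture does not cover determinants, so this is a swapped-in standard technique rather than the course's method: a 2 × 2 matrix [a, b; c, d] has no inverse exactly when ad − bc = 0. The helper names and the `dependent_rows` example are assumptions made for this illustration.

```python
def det_2x2(m):
    """Determinant ad - bc of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    return a * d - b * c


def is_singular_2x2(m):
    """True when a 2x2 integer matrix has no inverse (zero determinant).

    With floating-point entries, compare against a small tolerance instead of 0.
    """
    return det_2x2(m) == 0


zero = [[0, 0], [0, 0]]            # the all-zeros matrix from the lecture
dependent_rows = [[1, 2], [2, 4]]  # hypothetical example: second row is twice the first
A = [[3, 4], [2, 16]]              # the invertible matrix from the earlier slide

print(is_singular_2x2(zero), is_singular_2x2(dependent_rows), is_singular_2x2(A))
```

The second matrix illustrates that a matrix need not be all zeros to be singular.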

Finally, the last special matrix operation I want to tell you about is the matrix transpose. So suppose I have a matrix A. If I compute the transpose of A, that's what I get here on the right. This is the transpose, which is written A superscript T, and the way you compute the transpose of a matrix is as follows.

Finally, we cover the matrix transpose. The transpose of a matrix A is the matrix obtained by turning its rows into columns and its columns into rows. The transpose of A is written A^T. It is constructed as follows.

To get the transpose, I am going to first take the first row of A, one two zero. That becomes the first column of the transpose. And then I'm going to take the second row of A, 3 5 9, and that becomes the second column of the matrix A transpose.

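The first row of A, [1, 2, 0], becomes the first column of the transpose, [1; 2; 0] (circled in green on the slide). The second row of A, [3, 5, 9], becomes the second column of the transpose, [3; 5; 9] (circled in blue on the slide).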
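Written out as matrices, the example described above looks like this. The entries are taken directly from the rows quoted in the lecture.

```latex
A = \begin{bmatrix} 1 & 2 & 0 \\ 3 & 5 & 9 \end{bmatrix}
\quad\Longrightarrow\quad
A^{T} = \begin{bmatrix} 1 & 3 \\ 2 & 5 \\ 0 & 9 \end{bmatrix}
```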

And another way of thinking about how to compute the transpose is as if you're taking this sort of 45 degree axis and you are mirroring, or flipping, the matrix along that 45 degree axis. So here's the more formal definition of a matrix transpose. Let's say A is an m by n matrix, and let's let B equal A transpose, B = A^T, like so. Then B is going to be an n by m matrix, with the dimensions reversed. So here we have a 2x3 matrix, and so the transpose becomes a 3x2 matrix. Moreover, B_ij is equal to A_ji. So the ij element of this matrix B is going to be the ji element of that earlier matrix A. So for example, B_12 is going to be equal to, look at this matrix, this element 3, first row, second column. And that is equal to this, which is A_21, second row, first column, which is equal to 3. As another example, B_32 is this element 9, and that's equal to A_23, which is this element up here, nine. And so that wraps up the definition of what it means to take the transpose of a matrix, and that in fact concludes our linear algebra review.

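There is another way to think about the transpose: mirror (flip) the matrix along the 45 degree axis, that is, its main diagonal. The formal definition is as follows. If A is an m × n matrix, its transpose B = A^T is an n × m matrix. For example, if A is a 2 × 3 matrix, its transpose B is a 3 × 2 matrix.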

Moreover, B_ij = A_ji. For example, B_12, the entry in row 1, column 2 of B, equals A_21, the entry in row 2, column 1 of A; both are 3. Likewise, B_32 equals A_23, and both are 9. That is the matrix transpose.
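The relation B_ij = A_ji can be checked with a short Python sketch. The helper `transpose` is a hypothetical name for this illustration, and note that Python indexes from 0 while the lecture indexes from 1, so B_12 in the lecture is `B[0][1]` here.

```python
def transpose(m):
    """Return the transpose of a matrix given as nested lists (rows become columns)."""
    return [list(col) for col in zip(*m)]


A = [[1, 2, 0],
     [3, 5, 9]]    # 2 x 3 matrix from the lecture
B = transpose(A)   # 3 x 2 matrix

print(B)                 # [[1, 3], [2, 5], [0, 9]]
print(B[0][1], A[1][0])  # B_12 and A_21 in the lecture's 1-based notation: 3 3
print(B[2][1], A[1][2])  # B_32 and A_23 in the lecture's 1-based notation: 9 9

# B_ij == A_ji for every valid index pair.
print(all(B[i][j] == A[j][i] for i in range(len(B)) for j in range(len(A))))  # True
```

Transposing twice returns the original matrix, which is a quick sanity check on any transpose routine.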

So by now hopefully you know how to add and subtract matrices as well as multiply them, and you also know the definitions of the inverse and the transpose of a matrix. These are the main operations used in linear algebra for this course. In case this is the first time you are seeing this material, I know this was a lot of linear algebra presented very quickly, and it's a lot to absorb. But there's no need to memorize all the definitions we just went through. If you download a copy of either these slides or the lecture notes from the course website, and use either the slides or the lecture notes as a reference, then you can always refer back to the definitions to figure out what these matrix multiplications, transposes and so on are. The lecture notes on the course website also have pointers to additional linear algebra resources, which you can use to learn more about linear algebra by yourself.

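So far we have learned how to add, subtract, and multiply matrices, and we have learned the inverse and the transpose. This was a quick, brief tour of linear algebra. All of it matters, but there is no need to memorize it. You can refer back to it whenever you need it; what matters is knowing where to find it.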

And next, with these new tools, we'll be able in the next few videos to develop more powerful forms of linear regression that can deal with a lot more data, a lot more features, and a lot more training examples. Later on, after linear regression, we'll continue using these linear algebra tools to derive more powerful learning algorithms as well.

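With these linear algebra tools, the next few lectures develop more powerful forms of linear regression that can handle more data, more features, and more training examples. We will keep using linear algebra to derive more powerful learning algorithms.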