brunch

Magazine: 데이터 사이언티스트가 되자 (Let's Become a Data Scientist)


by 라인하트 Oct 13. 2020

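Andrew Ng's Machine Learning Notes (5-2): Managing Data in Octave

The 'A([1 3], :)' command returns every element whose first index is 1 or 3

Professor Andrew Ng, co-founder of the online learning platform Coursera, is a leading figure in the AI industry. The machine learning course he taught to beginners at Stanford University is available for free as an online course on Coursera (Coursera.org). It is an essential course for machine learning beginners, and one that you naturally come across when studying AI and machine learning on your own.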

Octave / Matlab Tutorial

Moving Data Around

In this second tutorial video on Octave, I'd like to start to tell you how to move data around in Octave. So, if you have data for a machine learning problem, how do you load that data in Octave? How do you put it into matrix? How do you manipulate these matrices? How do you save the results? How do you move data around and operate with data? Here's my Octave window as before, picking up from where we left off in the last video.

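This second tutorial lecture explains how to move data around in Octave. How do you load data for a machine learning problem into Octave? How do you put it into a matrix? How do you manipulate those matrices? How do you save the results? How do you move data around and operate on it? Here is the Octave window, picking up where the last lecture left off.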

If I type A, that's the matrix we generated with this command, A equals one, two; three, four; five, six, and this is a three by two matrix. The size command in Octave tells you the size of a matrix. So size(A) returns three, two. It turns out that this size command itself is actually returning a one by two matrix. So you can set sz equals size(A), and sz is now a one by two matrix where the first element is three and the second element is two. So if you type size(sz), you get one, two: sz is a one by two matrix whose two elements contain the dimensions of the matrix A. You can also type size(A, 1) to give you back the first dimension of A, which is the number of rows, and size(A, 2) to give you back two, which is the number of columns in the matrix A.

A = [1 2; 3 4; 5 6] % declare matrix A and assign its values

sz = size(A) % A is 3 x 2, so this returns 3 2

size(sz) % sz is 1 x 2, so this returns 1 2

size(A,1) % returns the number of rows of A

size(A,2) % returns the number of columns of A

Matrix A is a 3 x 2 matrix. The 'size()' command reports the size of a matrix, so size(A) returns 3 2. The value that size() returns is itself a 1 x 2 matrix. After 'sz = size(A)', sz is a 1 x 2 matrix whose first element is 3 and whose second element is 2, so 'size(sz)' returns 1 2. In addition, 'size(A,1)' returns the first dimension of A, the number of rows (3), and 'size(A,2)' returns the second dimension, the number of columns (2).
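As a small sketch of the size() behaviour described above, the two-output form of size() returns both dimensions in one call. The variable names rows and cols are my own and do not appear in the lecture.

```octave
A = [1 2; 3 4; 5 6];      % the 3 x 2 matrix from the lecture
sz = size(A);             % 1 x 2 row vector holding [3 2]
[rows, cols] = size(A);   % two-output form: rows = 3, cols = 2
fprintf('A is %d x %d\n', rows, cols);
fprintf('sz is %d x %d\n', size(sz, 1), size(sz, 2));
```

The two-output form is handy when the row and column counts are needed separately, for example when looping over the rows of a data set.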

If you have a vector V, so let's say V equals one, two, three, four, and you type length V. What this does is it gives you the size of the longest dimension. So you can also type length A and because A is a three by two matrix, the longer dimension is of size three, so this should print out three. But usually we apply length only to vectors. So you know, length one, two, three, four, five, rather than apply length to matrices because that's a little more confusing.

v = [1 2 3 4] % declare vector v and assign its values

length(v) % returns the size of the longest dimension, here 4

length([1; 2; 3; 4; 5]) % a 5 x 1 column vector, so this returns 5

Suppose there is a vector v = [1 2 3 4]. The 'length(v)' command returns the size of the longest dimension, here 4. For example, 'length(A)' returns 3, because A is a 3 x 2 matrix and the longer dimension is 3. Usually, though, length() is applied only to vectors, as in 'length([1; 2; 3; 4; 5])'. Applying it to matrices is rarely done because the result is confusing.
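To make the point about matrices concrete, here is a short sketch comparing length() with max(size()) and numel(). The numel() call is an addition of mine for contrast; the lecture does not use it.

```octave
v = [1 2 3 4];            % 1 x 4 row vector
A = [1 2; 3 4; 5 6];      % 3 x 2 matrix
len_v = length(v);        % 4
len_A = length(A);        % 3, the larger of the two dimensions
same = (len_A == max(size(A)));   % true for this non-empty matrix
n_elems = numel(A);       % 6, the total element count, which differs from length(A)
```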

Now, let's look at how to load data and find data on the file system. When we start Octave, we're often in a path that is, you know, the location where Octave is installed. So the pwd command shows the current directory, or the current path that Octave is in. So right now we're in this maybe somewhat obscure directory.

The CD command stands for change directory, so I can go to C:/Users/Ang/Desktop, and now I'm in, you know, in my Desktop and if I type ls, ls is, it comes from a Unix or a Linux command. But, ls will list the directories on my desktop and so these are the files that are on my Desktop right now.

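pwd % show the current directory, i.e. the path Octave is currently in

cd C:/Users/Ang/Desktop % change directory

ls % list the contents of the current directory

Let's look at how to load data and how to find data on the file system. When Octave starts, you are usually in the directory where Octave is installed. The 'pwd' command shows the current directory, that is, the path Octave is currently in. Right now that is a somewhat obscure directory. The 'cd' command changes the directory.

'cd C:/Users/Ang/Desktop' moves to the Desktop. The 'ls' command comes from Unix and Linux and lists the contents of the current directory, so the files on the Desktop are now shown.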
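The same navigation can be sketched with the function-call forms of these commands, which is convenient inside scripts. The temporary directory is only a stand-in target chosen so the sketch runs on any machine, and start_dir is a name of my own.

```octave
start_dir = pwd();        % remember where we started
cd(tempdir());            % function-call form of cd; tempdir() is only a stand-in target
disp(pwd());              % show the directory we are in now
ls                        % list its contents
cd(start_dir);            % go back to the starting directory
```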

In fact, on my desktop are two files: Features X and Price Y that's maybe come from a machine learning problem I want to solve. So, here's my desktop. Here's Features X, and Features X is this window, excuse me, is this file with two columns of data. This is actually my housing prices data. So I think, you know, I think I have forty-seven rows in this data set.

And so the first house has size twenty-one hundred four square feet, has three bedrooms; second house has sixteen hundred square feet, has three bedrooms; and so on. And Price Y is this file that has the prices of the data in my training set. So, Features X and Price Y are just text files with my data. How do I load this data into Octave?

There are two files on the Desktop: FeatureX and PriceY, data for a machine learning problem. FeatureX is a file with two columns of data; it is the housing price data set, with 47 rows. The first house is 2104 square feet with 3 bedrooms, the second is 1600 square feet with 3 bedrooms, and so on. PriceY holds the prices for the training set. Both are plain text files. How do we load this data into Octave?

Well, I just type the command load featuresX.dat, and if I do that, I load featuresX, and I can load priceY.dat. And by the way, there are multiple ways to do this. If you put featuresX.dat in a string and load it like so (there was a typo there), this is an equivalent command. This way I'm just putting the file name in a string, and in Octave you use single quotes to represent strings, like so. So that's a string, and we can load the file whose name is given by that string.

load FeatureX.txt % load FeatureX.txt from the current directory into the workspace

load PriceY.txt % load PriceY.txt from the current directory into the workspace

load('FeatureX.txt') % equivalent form: the file name is passed as a string

The 'load FeatureX.txt' command reads the FeatureX.txt file in the current directory into a variable named FeatureX. 'load PriceY.txt' works the same way. There is more than one way to load data: load('FeatureX.txt') passes the file name as a string and is an equivalent command. Octave uses single quotes to denote strings, so this loads the file whose name is given by that string.
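A self-contained sketch of the function form of load follows. Because the real data files may not be at hand, it first writes a tiny stand-in file; the file name demoX.txt, the variable X, and the third data row are hypothetical, and only the first two rows echo the values quoted in the lecture.

```octave
% Write a tiny stand-in data file with two columns, like FeatureX.txt
fname = fullfile(tempdir(), 'demoX.txt');   % hypothetical file name
fid = fopen(fname, 'w');
fprintf(fid, '2104 3\n1600 3\n2400 3\n');   % third row is made up for illustration
fclose(fid);

X = load(fname);          % function form with an output: the data goes into X
disp(size(X));            % prints the dimensions of X
delete(fname);            % clean up the temporary file
```

Assigning the result to a variable keeps control over the variable name, whereas the command form names the variable after the file.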

Now the who command shows me what variables I have in my Octave workspace. So who shows me the variables that Octave has in memory currently. Features X and Price Y are among them, as well as the variables that, you know, we created earlier in this session.

who % list the variables in the Octave workspace

The 'who' command shows the variables Octave currently has in memory. FeatureX and PriceY are among them, along with the variables created earlier.

So I can type FeatureX to display featureX. And there's my data.

FeatureX % display the contents of FeatureX (the data loaded from FeatureX.txt)

Typing 'FeatureX' prints the data that was loaded from the FeatureX.txt file.

And I can type size(featuresX) and that's my 47 by two matrix. And similarly, size(priceY) gives me my 47 by one vector. This is a 47 dimensional vector. This is a column vector that has all the prices Y in my training set.

size(FeatureX) % FeatureX is a 47 x 2 matrix, so this returns 47 2

size(PriceY) % PriceY is a 47 x 1 matrix, so this returns 47 1

Here PriceY is a 47 x 1 matrix, that is, a 47-dimensional column vector. It holds every house price y in the training set.

Now the who function shows you the variables that are in the current workspace. There's also the whos command that gives you the detailed view. And so this, with an s at the end, also lists my variables except that it now lists the sizes as well. So A is a three by two matrix and features X is a 47 by 2 matrix. Price Y is a 47 by one matrix, meaning this is just a vector. And it shows, you know, how many bytes of memory it's taking up, as well as what type of data this is. Double means double precision floating point, so that just means that these are real values, the floating point numbers.

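who % list the variables in the Octave workspace

whos % show a detailed view of each variable

The 'who' function shows the variables in the current workspace. The 'whos' function gives a detailed view of each variable: with the 's' added, it lists the variables along with their sizes. Matrix A is 3 x 2, FeatureX is 47 x 2, and PriceY is 47 x 1, that is, a vector. It also shows how many bytes of memory each variable takes up and what type of data it holds. 'double' means double precision floating point, in other words real-valued floating point numbers.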
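The details that whos prints can also be read programmatically. This is a minimal sketch assuming the struct-returning form of whos and the class() function; the names info, cls, nbytes, and dims are my own.

```octave
A = [1 2; 3 4; 5 6];
cls = class(A);           % 'double', the type that whos reports
info = whos('A');         % struct form of the whos listing for A
nbytes = info.bytes;      % 48: six doubles at 8 bytes each
dims = info.size;         % [3 2]
```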

Now if you want to get rid of a variable you can use the clear command. So clear featuresX and type whos again. You notice that the featuresX variable has now disappeared.

clear FeatureX % remove the variable (matrix) FeatureX

To remove a variable, use the clear command. After entering 'clear FeatureX' and then whos again, the FeatureX variable is gone.

And how do we save data? Let's see. Let's take the variable v and say that it's priceY(1:10). This sets v to be the first 10 elements of the vector Y. So let's type who or whos. Whereas Y was a 47 by 1 vector, v is now 10 by 1.

v = PriceY(1:10) % declare v and assign elements 1 through 10 of the vector PriceY

So how do we save data? The 'v = PriceY(1:10)' command sets v to elements 1 through 10 of the vector PriceY. Entering 'whos' shows that PriceY is a 47 x 1 vector and v is a 10 x 1 vector.

v equals priceY(1:10), one colon ten, sets it to just the first ten elements of Y. Let's say I want to save this data to disk. The command save hello.mat v will save the variable v into a file called hello.mat. So let's do that. And now a file has appeared on my Desktop, you know, called hello.mat. I happen to have MATLAB installed on this computer, which is why, you know, this icon looks like this, because Windows recognizes it as a MATLAB file, but don't worry about it if this file looks like it has a different icon on your machine.

save hello.mat v; % save the value of variable v to the file hello.mat

After 'v = PriceY(1:10)', v holds the first 10 elements of PriceY. To save v to disk, the 'save hello.mat v;' command writes the variable v to a file named hello.mat. After running it, a file named hello.mat appears on the Desktop. Because MATLAB is installed on this computer, Windows recognizes the file as a MATLAB file and shows the MATLAB icon. There is no need to worry if the icon looks different on your machine.

And let's say I clear all my variables. So, if you type clear without anything then this actually deletes all of the variables in your workspace. So there's now nothing left in the workspace.

clear % remove every variable in the workspace

Now clear all the variables. The 'clear' command with no arguments removes every variable in the workspace. Nothing is left.

And if I load hello.mat, I can now load back my variable v, which is the data that I previously saved into the hello.mat file. So, what we did just now, save hello.mat v, saved the data in a binary format, a somewhat more compressed binary format. So if v is a lot of data, this, you know, will be somewhat more compressed and will take up less space.

Entering 'load hello.mat' loads the variable v back. v is the data that was saved to the hello.mat file earlier. According to the lecture, that save wrote the data in a somewhat compressed binary format, so when v holds a lot of data the file takes up less space.
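The save, clear, load round trip can be sketched as follows. Two assumptions are worth flagging: the vector here is a stand-in for PriceY(1:10), and the '-binary' option is passed explicitly because, as far as I know, GNU Octave writes a text format by default unless configured otherwise. The '-binary' option is Octave-specific, and the file name hello_demo.mat is hypothetical.

```octave
v = (1:10)';                                     % stand-in for PriceY(1:10)
fname = fullfile(tempdir(), 'hello_demo.mat');   % hypothetical file name
save('-binary', fname, 'v');                     % request Octave's binary format explicitly
clear v;                                         % v is gone from the workspace
load(fname);                                     % v comes back from the file
disp(size(v));                                   % prints the dimensions of v
delete(fname);                                   % clean up the temporary file
```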

If you want to save your data in a human readable format then you type save hello.txt, the variable v, and then -ascii. So, this will save it as text, in ASCII format. And now, once I've done that, I have this file. hello.txt has just appeared on my desktop, and if I open this up, we see that this is a text file with my data saved away. So that's how you load and save data.

save hello.txt v -ascii % save the data in a human readable format

This command saves the data as ASCII text. The file hello.txt appears in the directory, and opening it shows a text file containing the saved data. That is how to load and save data.
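A short sketch of the text round trip: a file written with -ascii contains only numbers, so it can be read straight back into a matrix. The values, the variable w, and the file name hello_demo.txt are illustrative and not taken from the lecture data.

```octave
v = [1.5; 2.5; 3.5];                             % illustrative values, not the lecture data
fname = fullfile(tempdir(), 'hello_demo.txt');   % hypothetical file name
save('-ascii', fname, 'v');                      % plain text, one row per line, no header
w = load(fname);                                 % numbers-only text loads back as a matrix
delete(fname);                                   % clean up the temporary file
err = max(abs(w - v));                           % difference after the round trip
```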

Now let's talk a bit about how to manipulate data. Let's set A equal to that matrix again, so this is my three by two matrix. So let's do indexing. If I type A(3,2), this indexes into the (3,2) element of the matrix A. So this is what, you know, we would normally write as A subscript 3, 2, and that's the element in the third row and second column of A, which is the element six. I can also type A(2,:) to fetch everything in the second row. So, the colon means every element along that row or column. So, A(2,:) is the second row of A. Right. And similarly, if I do A(:,2), then this means get everything in the second column of A. So, this gives me 2 4 6. Right, this means of A, everything, second column. So, this is my second column of A, which is 2 4 6.

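A = [1 2; 3 4; 5 6] % declare matrix A and assign its values

A(3,2) % returns the element in row 3, column 2 of A

A(2,:) % returns every element in row 2 of A

A(:,2) % returns every element in column 2 of A

Now let's talk a little about how to manipulate data. Define A = [1 2; 3 4; 5 6] again as a 3 x 2 matrix and index into it. The 'A(3,2)' command returns the element in row 3, column 2 of A, which is 6. In mathematical notation this is usually written as A with the subscript 3,2. The 'A(2,:)' command returns every element in the second row; ':' means every element along that row or column, so 'A(2,:)' is the second row, [3 4]. Similarly, the 'A(:,2)' command returns every element in the second column, which is [2; 4; 6].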
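The indexing forms above can be collected into one runnable sketch. The last line uses the end keyword, which is an addition of mine and is not covered in the lecture; the variable names are also my own.

```octave
A = [1 2; 3 4; 5 6];
e = A(3, 2);        % 6: row 3, column 2
r = A(2, :);        % [3 4]: all of row 2
c = A(:, 2);        % [2; 4; 6]: all of column 2
last = A(end, :);   % [5 6]: end stands for the last index along that dimension
```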

Now, you can also use somewhat more sophisticated indexing operations. So, let me just quickly show an example. You do this maybe less often, but let me do this: A([1 3],:). This means get all of the elements of A whose first index is one or three. This means I get everything from the first and third rows of A and from all columns. So, this was the matrix A, and so A([1 3],:) means get everything from the first row and from the third row, and the colon means, you know, both the first and the second columns, and so this gives me 1 2; 5 6. Although you use these sorts of more sophisticated index operations maybe somewhat less often.

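A = [1 2; 3 4; 5 6] % declare matrix A and assign its values

A(3,2) % returns the element in row 3, column 2 of A

A([1 3],:) % returns every element in rows 1 and 3 of A

A(2,:) % returns every element in row 2 of A

More sophisticated indexing is also possible. Here is one quick example. It is probably used less often, but the 'A([1 3],:)' command returns every element in rows 1 and 3: the first row, 1 2, and the third row, 5 6. This kind of advanced subscript indexing tends to be used somewhat less often.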
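Indexing with a vector of row numbers can be sketched a little further. The reordering and the logical mask below are extras of mine that go beyond the lecture, shown only to illustrate how the index vector drives the result.

```octave
A = [1 2; 3 4; 5 6];
rows13 = A([1 3], :);         % [1 2; 5 6]
swapped = A([3 1], :);        % [5 6; 1 2]: the index order controls the output order
mask = logical([1 0 1]);      % logical mask picking rows 1 and 3 (not in the lecture)
same = A(mask, :);            % [1 2; 5 6] again
```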

To show you what else we can do. Here's the A matrix, and this A(:,2) gives me the second column. You can also use this to do assignments. So I can take the second column of A and assign that to 10, 11, 12, and if I do that I'm now, you know, taking the second column of A and I'm assigning this column vector 10, 11, 12 to it. So, now A is this matrix whose first column is 1, 3, 5, and the second column has been replaced by 10, 11, 12. And here's another operation. Let's set A to be equal to [A, [100; 101; 102]] like so, and what this will do is append another column vector to the right. So, now, oops. I think I made a little mistake. Should have put semicolons there, and now A is equal to this. Okay? I hope that makes sense. So this 100, 101, 102, this is a column vector, and what we did was take A, set it to the original A, and then put that column vector to the right. So we took the matrix A, which was these six elements on the left, and we appended another column vector to the right, which is why A is now a three by three matrix that looks like that. And finally, one neat trick that I sometimes use: if you do just A(:) like so, this is a somewhat special case syntax. What this means is put all elements of A into a single column vector, and this gives me a 9 by 1 vector. It's just all the elements combined together.

A = [1 2; 3 4; 5 6] % declare matrix A and assign its values

A(3,2) % returns the element in row 3, column 2 of A

A(:,2) = [10; 11; 12] % reassign column 2 of A to [10; 11; 12]

A = [A, [100; 101; 102]] % append the column vector [100; 101; 102] to the right of A

A(:) % convert all elements of A into a single column vector

There is more that can be done. With the matrix A, the 'A(:,2)' command returns the second column. It can also be used for assignment: 'A(:,2) = [10; 11; 12]' replaces the second column of A with the column vector [10; 11; 12]. The first column of A is still 1 3 5, and the second column is now 10 11 12. Another operation: 'A = [A, [100; 101; 102]]' appends the column vector [100; 101; 102] to the right of A, so A is now a 3 x 3 matrix. Finally, a trick that is occasionally useful: 'A(:)' puts all the elements of A into a single column vector. It is a special-case syntax, and here it returns a 9 x 1 vector containing all the elements of A.
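The three steps above can be run in sequence to see what A(:) actually produces. The sketch also shows that the elements come out column by column, and uses reshape() to reverse the operation; reshape() and the names col and B are additions of mine, not part of the lecture.

```octave
A = [1 2; 3 4; 5 6];
A(:, 2) = [10; 11; 12];       % A is now [1 10; 3 11; 5 12]
A = [A, [100; 101; 102]];     % A is now 3 x 3
col = A(:);                   % 9 x 1, read column by column
% col' is [1 3 5 10 11 12 100 101 102]
B = reshape(col, 3, 3);       % reshape undoes A(:) when given the original size
```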

Just a couple more examples. Let's see. Let's say I set A to be equal to [1 2; 3 4; 5 6], okay? And let's say I set B to be equal to [11 12; 13 14; 15 16]. I can create a new matrix C as [A B]. Here's my matrix A, here's my matrix B, and I've set C to be equal to [A B]. What I'm doing is taking these two matrices and just concatenating them onto each other. So I have the matrix A on the left and the matrix B on the right, and that's how I formed this matrix C, by putting them together. I can also do C equals [A; B]. The semicolon notation means that I go put the next thing at the bottom. So C equals [A; B] also puts the matrices A and B together, except that it now puts them on top of each other. So now I have A on top and B at the bottom, and C here is now a 6 by 2 matrix. So, the semicolon usually means, you know, go to the next line. So C is comprised of A, and then go to the bottom of that and put B at the bottom. And by the way, this [A B] is the same as [A, B], and so, you know, either of these gives you the same result.

A = [1 2; 3 4; 5 6]; % declare matrix A and assign its values

B = [11 12; 13 14; 15 16]; % declare matrix B and assign its values

C = [A B] % join A and B side by side to create matrix C

C = [A; B] % stack A on top of B to create matrix C

A new matrix can be built from matrices A and B. The 'C = [A B]' command joins A and B side by side to create a new matrix C, with A on the left and B on the right. Likewise, the 'C = [A; B]' command joins A and B vertically to create a new matrix C, with A on top and B at the bottom; C is then a 6 x 2 matrix. The semicolon usually means moving to the next row, so B is placed below A. [A B] is the same as [A, B]; both give the same result.
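As a quick check of the two concatenation forms, the sketch below compares the bracket syntax with the equivalent horzcat() and vertcat() functions. Those two functions and the names C1, C2, H, and V are additions of mine; the lecture only uses brackets.

```octave
A = [1 2; 3 4; 5 6];
B = [11 12; 13 14; 15 16];
C1 = [A B];            % 3 x 4: A on the left, B on the right
C2 = [A; B];           % 6 x 2: A on top, B at the bottom
H = horzcat(A, B);     % same result as [A B] and [A, B]
V = vertcat(A, B);     % same result as [A; B]
```

Side-by-side joining requires the same number of rows, and stacking requires the same number of columns, which both hold here because A and B are 3 x 2.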

So, with that, hopefully you now know how to construct matrices, and hopefully this starts to show you some of the commands that you use to quickly put together matrices and take matrices and, you know, slam them together to form bigger matrices, and with just a few lines of code, Octave is very convenient in terms of how quickly we can assemble complex matrices and move data around. So that's it for moving data around. In the next video we'll start to talk about how to actually do complex computations on our data. So, hopefully that gives you a sense of how, with just a few commands, you can very quickly move data around in Octave. You know, you load and save vectors and matrices, load and save data, put together matrices to create bigger matrices, index into or select specific elements of the matrices. I know I went through a lot of commands, so I think the best thing for you to do is afterward, to look at the transcript of the things I was typing. You know, look at the course website and download the transcript of the session from there and look through the transcript and type some of those commands into Octave yourself and start to play with these commands and get them to work. And obviously, you know, there's no point at all in trying to memorize all these commands. But hopefully from this video you have gotten a sense of the sorts of things you can do, so that later on, when you are trying to program learning algorithms yourself, if you are trying to find a specific command that maybe you think Octave can do because you think you might have seen it here, you should refer to the transcript of the session and look through that in order to find the commands you want to use.

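With the commands covered so far, you can create, modify, and delete matrices, and join or split them. With just a few lines of code, Octave lets you build complex matrices and manipulate data quickly and easily. That is all for moving data around.

The next lecture covers how to actually carry out complex computations on data. This lecture should give you a feel for how quickly data can be moved around in Octave with just a few commands: we loaded and saved vectors and matrices, joined matrices into bigger ones, and indexed into matrices to select specific elements. That was a lot of commands. The best way to study is to go through the transcript of the session and type the commands yourself. Download the transcript from the course website and try the commands in Octave until they work. There is certainly no point in trying to memorize every command. The aim is to get a sense of what Octave can do, so that later, when you program a learning algorithm yourself, you can look through the transcript to find the command you need.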

So, that's it for moving data around and in the next video what I'd like to do is start to tell you how to actually do complex computations on our data, and how to compute on the data, and actually start to implement learning algorithms.

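That is all for moving data around. The next lecture explains how to carry out complex computations on data and how to start implementing learning algorithms.

Andrew Ng's machine learning lecture video

Files needed to follow along in Octave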

PriceY.txt

FeatureX.txt

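Summary - Data Management

1) Managing matrices

size(A) % ans = 3 2 : dimensions of A = [1 2; 3 4; 5 6] as rows x columns (a 1 x 2 vector)

size(A,1) % ans = 3 : first dimension of A, the number of rows

size(A,2) % ans = 2 : second dimension of A, the number of columns

length(A) % ans = 3 : size of the longest dimension of A

length([1;2;3]) % ans = 3 : size of the longest dimension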

2) Managing matrix elements

A(3,2) % ans = 6 : returns the element in row 3, column 2 of A

A(2,:) % : returns every element in row 2 of A

A(:,1) % : returns every element in column 1 of A

A([1 3],:) % : returns every element in rows 1 and 3 of A

A(:,2) = [10; 11; 12] % : replaces column 2 of A with [10; 11; 12]

A = [A, [100; 101; 102]] % : appends the column vector [100; 101; 102] to the right of A

A(:) % : converts all elements of A into a single column vector

C = [A B] % : joins A and B side by side to create a new matrix C

C = [A; B] % : stacks A on top of B to create a new matrix C

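3) Managing data

pwd % ans = /User/bywoo : current directory

cd '/User/bywoo/Desktop' % : change directory

ls % : contents of the current directory

load FeatureX.txt % : loads FeatureX.txt from the current directory into the workspace

load('FeatureX.txt') % : same as above, with the file name given as a string

save hello.mat v % : saves the value of variable v to hello.mat (compressed binary format, per the lecture)

save hello.txt v -ascii % : saves the data as ASCII text (uncompressed, human readable)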

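4) Managing variables and functions

who % : lists the variables currently in use

whos % : lists detailed information about the variables currently in use

clear FeatureX % : removes the variable FeatureX

clear % : removes every variable currently in use

v = PriceY(1:10) % : assigns elements 1 through 10 of PriceY to the variable v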
