by 라인하트 Jan 07. 2021

Machine Learning Octave Exercise (3-3): Handwritten Digit Recognition

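Professor Andrew Ng, co-founder of the online learning platform Coursera, is a leading figure in the AI field. The machine learning course he taught to beginners at Stanford University is available for free, in the same form, as an online course on Coursera (Coursera.org). It is an essential course for machine learning beginners, and one that anyone studying AI and machine learning on their own will naturally come across.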

Programming Exercise 3 : Multi-class Classification and Neural Networks

2. Neural Networks

In the previous part of this exercise, you implemented multi-class logistic regression to recognize handwritten digits. However, logistic regression cannot form more complex hypotheses, as it is only a linear classifier.

In this part of the exercise, you will implement a neural network to recognize handwritten digits using the same training set as before. The neural network will be able to represent complex models that form non-linear hypotheses. For this week, you will be using parameters from a neural network that has already been trained. Your goal is to implement the feedforward propagation algorithm so that those weights can be used for prediction. In next week's exercise, you will write the backpropagation algorithm for learning the neural network parameters.

The provided script, ex3_nn.m, will help you step through this exercise.
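To make the point about non-linear hypotheses concrete, here is a minimal sketch of a network with one hidden layer computing XNOR, a function that no single linear decision boundary can represent. The weights are hand-picked for illustration and are an assumption of this sketch; they are not the trained parameters from ex3weights.mat. The sigmoid is defined inline so the snippet does not depend on the course's sigmoid.m.

```octave
% Sigmoid activation, defined inline so the sketch is self-contained
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Four input pairs (x1, x2), one per row
X = [0, 0; 0, 1; 1, 0; 1, 1];
m = size(X, 1);

% Hand-picked illustrative weights (NOT the course's trained parameters)
% Row 1 of Theta1 approximates x1 AND x2
% Row 2 of Theta1 approximates (NOT x1) AND (NOT x2)
Theta1 = [-30, 20, 20; 10, -20, -20];
% Theta2 approximates (hidden unit 1) OR (hidden unit 2)
Theta2 = [-10, 20, 20];

a1 = [ones(m, 1) X];                       % add the bias unit: 4 x 3
a2 = [ones(m, 1) sigmoid(a1 * Theta1')];   % hidden layer plus bias: 4 x 3
h  = sigmoid(a2 * Theta2');                % network output: 4 x 1

xnor_out = round(h)'                       % 1 exactly when x1 == x2
```

The rounded outputs are 1 0 0 1 for the four inputs: the hidden layer builds two intermediate features, and the output layer combines them.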

2.1 Model representation

Our neural network is shown in Figure 2. It has 3 layers: an input layer, a hidden layer and an output layer. Recall that our inputs are pixel values of digit images. Since the images are grayscale and of size 20 × 20, this gives us 400 input layer units (excluding the extra bias unit, which always outputs +1). As before, the training data is loaded from ex3data1.mat into the variables X and y.

You have been provided with a set of network parameters (Θ^(1), Θ^(2)) that have already been trained. These are stored in ex3weights.mat and are loaded by ex3_nn.m into Theta1 and Theta2. The parameters have dimensions sized for a neural network with 25 units in the second layer and 10 output units (corresponding to the 10 digit classes).
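The layer sizes above fix the shapes of the two weight matrices: each row of a weight matrix holds one unit's weights, including the weight for the bias unit. The sketch below only checks shapes. It uses random placeholder matrices, which is an assumption made so that it runs without ex3weights.mat, and it leaves out the sigmoid because the activation does not change any dimensions.

```octave
input_layer_size  = 400;   % 20 x 20 pixels
hidden_layer_size = 25;
num_labels        = 10;

% Random placeholders with the expected shapes (NOT the trained weights)
Theta1 = rand(hidden_layer_size, input_layer_size + 1);   % 25 x 401
Theta2 = rand(num_labels, hidden_layer_size + 1);         % 10 x 26

x  = rand(1, input_layer_size);   % one fake image unrolled into a row
a1 = [1 x];                       % add the bias unit: 1 x 401
z2 = a1 * Theta1';                % hidden layer input: 1 x 25
a2 = [1 z2];                      % add the bias unit: 1 x 26
z3 = a2 * Theta2';                % output layer input: 1 x 10

disp(size(Theta1));
disp(size(Theta2));
disp(size(z3));
```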

<Explanation>

(1) Load the data and set basic variables

clear; close all; clc

load('ex3data1.mat'); % load X (5000 x 400 matrix of grayscale handwritten-digit images) and y (labels)

[m, n] = size(X); % X is 5000 x 400, so m = 5000 and n = 400

(2) Set the neural network variables

input_layer_size = 400; % number of input units for a 20 x 20 image

hidden_layer_size = 25; % number of hidden units

num_labels = 10; % number of classes; the digit 0 is labeled 10
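Because Octave indexes from 1, the digit 0 is stored as label 10. A small sketch of mapping labels back to digits with mod, the same trick the example-by-example loop later in this post uses when printing a prediction (the label values here are made up for illustration):

```octave
labels = [1 2 9 10];          % example class labels as they appear in y
digits = mod(labels, 10)      % label 10 maps back to the digit 0
```

This prints 1 2 9 0.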

(3) Visualize the data

rand_index = randperm(m, 100); % pick 100 distinct random row indices

sel = X(rand_index, :); % the 100 selected images, one per row

displayData(sel);
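randperm(m, k) returns k distinct indices drawn from 1..m, so the selected rows never repeat. A minimal sketch on a small made-up matrix (X_demo and its size are assumptions of this example, standing in for the 5000 x 400 image matrix):

```octave
m = 12;                               % pretend there are 12 examples
X_demo = reshape(1:(m * 3), m, 3);    % hypothetical 12 x 3 data matrix
rand_index = randperm(m, 5);          % 5 distinct row indices from 1..m
sel = X_demo(rand_index, :);          % 5 x 3 subset, one example per row
disp(size(sel));
```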

(4) Load the parameter matrices Θ^(1) and Θ^(2)

load('ex3weights.mat'); % load the pre-trained parameter matrices Theta1 and Theta2

(5) Analysis of predict.m, which makes the predictions

function p = predict(Theta1, Theta2, X)

%PREDICT Predict the label of an input given a trained neural network

% p = PREDICT(Theta1, Theta2, X)

% X : data matrix, one example per row

% Theta1, Theta2 : weight matrices learned by the neural network

% Initialize variables

m = size(X, 1);

num_labels = size(Theta2, 1);

% Initialize p, the m x 1 vector of predictions that the function returns

p = zeros(size(X, 1), 1);

% ====================== YOUR CODE HERE ======================

% Instructions:

% Write the code that makes predictions using the trained neural network

% Each prediction is a value between 1 and num_labels

% Hint: the max function might come in useful for finding, for each training

% example, the label with the largest output probability

% ===========================================================

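(6) Write the prediction code

X = [ones(5000,1) X]; % add the bias unit; X becomes a 5000 x 401 matrix

p = zeros(size(X, 1), 1); % initialize the predictions p

a2 = sigmoid(X * Theta1'); % compute the activations of the second (hidden) layer

a2 = [ones(m,1) a2]; % add the bias unit; a2 becomes a 5000 x 26 matrix

h = sigmoid(a2 * Theta2'); % multiply the 5000 x 26 matrix a2 by the transpose of the 10 x 26 matrix Theta2, giving 5000 x 10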

for i = 1:m

[dummyp, p(i)] = max(h(i,:));

end
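The steps above can be sketched end to end on a tiny network whose result is easy to check by hand. Everything in this example is hypothetical: a 2-input, 2-hidden-unit, 3-class network with hand-picked weights, and an inline sigmoid instead of the course's sigmoid.m. It also shows that the per-row loop and a single call to max(h, [], 2) pick the same labels.

```octave
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Hypothetical tiny network (weights chosen by hand for illustration)
Theta1 = [-5, 10, 0; -5, 0, 10];            % 2 hidden units x (1 bias + 2 inputs)
Theta2 = [2, -4, -4; -2, 4, 0; -2, 0, 4];   % 3 classes x (1 bias + 2 hidden)

X = [0, 0; 1, 0; 0, 1];                     % three examples, one per row
m = size(X, 1);

a1 = [ones(m, 1) X];                        % add the bias unit: 3 x 3
a2 = [ones(m, 1) sigmoid(a1 * Theta1')];    % hidden activations plus bias: 3 x 3
h  = sigmoid(a2 * Theta2');                 % class scores: 3 x 3

% Row-by-row: index of the largest score in each row
p_loop = zeros(m, 1);
for i = 1:m
  [~, p_loop(i)] = max(h(i, :));
end

% Vectorized: max along dimension 2 returns one index per row
[~, p_vec] = max(h, [], 2);

disp([p_loop p_vec]);
```

Both columns read 1, 2, 3: the first example activates neither hidden unit and lands in class 1, while the other two each activate one hidden unit.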

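(7) Measure the accuracy

The mean function was covered in the previous exercise.

mean(p == y)*100 ; % training-set accuracy in percent

>> mean(p == y)*100

ans = 97.520

The neural network classifies the training set with 97.5% accuracy, compared with 94.9% for multi-class logistic regression. The pre-trained parameter matrices Theta1 and Theta2 fit the training data somewhat better than the logistic regression parameters did.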
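The accuracy figure is simply the fraction of predictions that equal the true labels. A minimal sketch with made-up vectors (the values of p and y are hypothetical): p == y yields a vector of 0s and 1s, and its mean is the fraction of matches.

```octave
p = [1; 2; 3; 10; 5];     % hypothetical predictions
y = [1; 2; 3; 10; 4];     % hypothetical true labels

matches  = (p == y);                     % 1 where the prediction is correct
accuracy = mean(double(matches)) * 100   % 4 of 5 correct
```

Four of the five predictions match, so the result is 80.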

(8) Predict randomly chosen training examples one at a time

rp = randperm(m); % random ordering of the indices 1 to 5000

pred = predict(Theta1, Theta2, X(rp(i),:));

% call predict.m with a single example as input

>> rp = randperm(m);

>> pred = predict(Theta1, Theta2, X(rp(i),:));

error: horizontal dimensions mismatch (5000x1 vs 1x400)

error: called from:

error: /Users/bywoo/Desktop/Study/03_Machine_Learning/Lab/machine-learning-ex(Answer)/ex3/predict.m at line 24, column 3

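Running the code this way raises an error. The cause is the following line in predict.m, which creates the bias unit.

X = [ones(5000,1) X]; % add the bias unit; X becomes a 5000 x 401 matrix

Adding the bias unit works when the whole matrix X is passed in, but fails when only a single training example is processed: a 5000 x 1 column of ones cannot be concatenated with a 1 x 400 row. predict.m already computes the number of examples in its input X with the following line.

m = size(X, 1);

Change the code so that the bias unit is created using the variable m.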

X = [ones(m,1) X];
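The difference between the two versions can be reproduced without the course files. In this sketch x_single is a made-up 1 x 400 row standing in for one training example; the hard-coded version fails, while the version sized from the input works for any number of rows.

```octave
x_single = rand(1, 400);          % one hypothetical example as a 1 x 400 row

% Hard-coded row count: concatenating 5000 x 1 with 1 x 400 fails
try
  bad = [ones(5000, 1) x_single];
  failed = false;
catch err
  failed = true;
  disp(err.message);              % dimension mismatch message
end

% Row count taken from the input itself
m = size(x_single, 1);            % m = 1 here
good = [ones(m, 1) x_single];     % 1 x 401
disp(size(good));
```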

The code that then checks the examples one at a time is as follows.

rp = randperm(m); % random ordering of the indices 1 to 5000

for i = 1:m

fprintf('\nDisplaying Example Image\n');

displayData(X(rp(i), :)); % display the i-th randomly chosen training example

pred = predict(Theta1, Theta2, X(rp(i),:)); % predict the i-th randomly chosen training example

fprintf('\nNeural Network Prediction: %d (digit %d)\n', pred, mod(pred, 10));

% read keyboard input

s = input('Paused - press enter to continue, q to exit:','s');

if s == 'q'

break

end

end % end of the for loop

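The input() function reads keyboard input. Type a value and press Enter, and input() returns that value. When it is called with 's' as the second argument, as in the loop above, it returns the typed text as a string without evaluating it, which is why the loop can compare s with 'q'.

>> s = input("Type a number : ")

Type a number : 555

s = 555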
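One detail of the quit check in the loop: == compares strings character by character, so a reply such as 'qq' also satisfies if s == 'q'. A possible stricter alternative is strcmp, which returns a single logical value and is true only for an exact match. The sketch below simulates a few replies instead of reading the keyboard; the replies list is made up for illustration.

```octave
% Simulated keyboard replies (hypothetical); input(prompt, 's') returns strings like these
replies = {'', 'q', 'qq'};

stop = false(1, numel(replies));
for k = 1:numel(replies)
  stop(k) = strcmp(replies{k}, 'q');   % true only when the reply is exactly 'q'
end

disp(stop);
```

Only the second reply stops the loop; an empty reply (just pressing Enter) and 'qq' both continue.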

<Checking the result>

clear; close all; clc

load ('ex3data1.mat');

[m, n] = size(X);

input_layer_size = 400;

hidden_layer_size = 25;

num_labels = 10;

load('ex3weights.mat');

X = [ones(m,1) X];

p = zeros(size(X, 1), 1);

a2 = sigmoid(X * Theta1') ;

a2 = [ones(m,1) a2];

h = sigmoid(a2 * Theta2');

for i = 1:m

[dummyp, p(i)] = max(h(i,:));

end

mean(p == y)*100

Octave prints the following result when the same steps run through ex3_nn.m.

Loading Saved Neural Network Parameters ...

Training Set Accuracy: 97.520000

Program paused. Press enter to continue.

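<Answer>

function p = predict(Theta1, Theta2, X)

%PREDICT Predict the label of an input given a trained neural network

% p = PREDICT(Theta1, Theta2, X)

% X : data matrix, one example per row

% Theta1, Theta2 : weight matrices learned by the neural network

% Initialize variables

m = size(X, 1);

num_labels = size(Theta2, 1);

% Initialize p, the m x 1 vector of predictions that the function returns

p = zeros(size(X, 1), 1);

% ====================== YOUR CODE HERE ======================

% Instructions:

% Write the code that makes predictions using the trained neural network

% Each prediction is a value between 1 and num_labels

% Hint: the max function might come in useful for finding, for each training

% example, the label with the largest output probability

X = [ones(m,1) X]; % add the bias unit to the input layer

a2 = sigmoid(X * Theta1'); % hidden layer activations

a2 = [ones(m,1) a2]; % add the bias unit to the hidden layer

a3 = sigmoid(a2 * Theta2'); % output layer activations, one row per example

[max_p, p] = max(a3, [], 2); % p is the index of the largest output in each row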

% ===========================================================

Run ex3_nn.

>> ex3_nn

Loading and Visualizing Data ...

Program paused. Press enter to continue.

Loading Saved Neural Network Parameters ...

Training Set Accuracy: 97.520000

Program paused. Press enter to continue.

Displaying Example Image

Neural Network Prediction: 7 (digit 7)

Paused - press enter to continue, q to exit:

Displaying Example Image

Neural Network Prediction: 5 (digit 5)

Paused - press enter to continue, q to exit:q
