
Python: m.l with Keras #12

Taking on CNN + transfer learning with face recognition.

by 유윤식

Face recognition, the final part.

I'm 유윤식 ^^


With deep learning, a person's face can be identified (face recognition).


Here are five volunteers.

David is me...


The directory structure and the number of photos,

(224, 224) is the size of the input images.
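To check the photo volume per person, a small helper can walk a split directory and count image files in each class folder. This is a minimal sketch: `count_images_per_class` and the suffix list are hypothetical, and only the `./datasets/faces/test` path is taken from the post itself.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of image extensions


def count_images_per_class(split_dir):
    """Return {class_name: number_of_image_files} for one dataset split."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


if __name__ == "__main__":
    # The test split path appears later in the post; it is skipped if missing.
    split = Path("./datasets/faces/test")
    if split.is_dir():
        for name, n in count_images_per_class(split).items():
            print(f"{name}: {n}")
```

Each subfolder name doubles as the class label, which is the layout Keras directory-based loaders expect.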


The model,

model_summary.jpg


It looks complicated that way, so,


'''
Layer (type)                 Output Shape              Param #
=================================================================
zero_padding2d_53_input (Inp (None, 224, 224, 3)       0
zero_padding2d_53 (ZeroPaddi (None, 226, 226, 3)       0
conv2d_65 (Conv2D)           (None, 224, 224, 64)      1792
zero_padding2d_54 (ZeroPaddi (None, 226, 226, 64)      0
conv2d_66 (Conv2D)           (None, 224, 224, 64)      36928
max_pooling2d_20 (MaxPooling (None, 112, 112, 64)      0
zero_padding2d_55 (ZeroPaddi (None, 114, 114, 64)      0
conv2d_67 (Conv2D)           (None, 112, 112, 128)     73856
zero_padding2d_56 (ZeroPaddi (None, 114, 114, 128)     0
conv2d_68 (Conv2D)           (None, 112, 112, 128)     147584
max_pooling2d_21 (MaxPooling (None, 56, 56, 128)       0
zero_padding2d_57 (ZeroPaddi (None, 58, 58, 128)       0
conv2d_69 (Conv2D)           (None, 56, 56, 256)       295168
zero_padding2d_58 (ZeroPaddi (None, 58, 58, 256)       0
conv2d_70 (Conv2D)           (None, 56, 56, 256)       590080
zero_padding2d_59 (ZeroPaddi (None, 58, 58, 256)       0
conv2d_71 (Conv2D)           (None, 56, 56, 256)       590080
max_pooling2d_22 (MaxPooling (None, 28, 28, 256)       0
zero_padding2d_60 (ZeroPaddi (None, 30, 30, 256)       0
conv2d_72 (Conv2D)           (None, 28, 28, 512)       1180160
zero_padding2d_61 (ZeroPaddi (None, 30, 30, 512)       0
conv2d_73 (Conv2D)           (None, 28, 28, 512)       2359808
zero_padding2d_62 (ZeroPaddi (None, 30, 30, 512)       0
conv2d_74 (Conv2D)           (None, 28, 28, 512)       2359808
max_pooling2d_23 (MaxPooling (None, 14, 14, 512)       0
zero_padding2d_63 (ZeroPaddi (None, 16, 16, 512)       0
conv2d_75 (Conv2D)           (None, 14, 14, 512)       2359808
zero_padding2d_64 (ZeroPaddi (None, 16, 16, 512)       0
conv2d_76 (Conv2D)           (None, 14, 14, 512)       2359808
zero_padding2d_65 (ZeroPaddi (None, 16, 16, 512)       0
conv2d_77 (Conv2D)           (None, 14, 14, 512)       2359808
max_pooling2d_24 (MaxPooling (None, 7, 7, 512)         0
conv2d_78 (Conv2D)           (None, 1, 1, 4096)        102764544
dropout_14 (Dropout)         (None, 1, 1, 4096)        0
conv2d_79 (Conv2D)           (None, 1, 1, 4096)        16781312
dropout_15 (Dropout)         (None, 1, 1, 4096)        0
conv2d_80 (Conv2D)           (None, 1, 1, 2622)        10742334
flatten (Flatten)            (None, 2622)              0
cl1 (Dense)                  (None, 256)               671488
cl2 (Dense)                  (None, 128)               32896
dropout_16 (Dropout)         (None, 128)               0
cl3 (Dense)                  (None, 5)                 645
=================================================================
Total params: 145,707,907
Trainable params: 705,029
Non-trainable params: 145,002,878
'''
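The parameter counts in the summary can be reproduced by hand, which also shows where the trainable/non-trainable split comes from: only the three new Dense layers (cl1, cl2, cl3) are trained, and everything before them is frozen. The sketch below assumes 3x3 kernels for the first convolution, a 7x7 kernel for conv2d_78, and bias terms everywhere; those sizes are not stated in the post, but they are consistent with the shapes and counts shown.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights of every kernel plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(in_features, units):
    """One weight per input-unit pair plus one bias per unit."""
    return (in_features + 1) * units


# Kernel sizes below are assumptions inferred from the summary.
first_conv = conv2d_params(3, 3, 3, 64)        # conv2d_65
fc_as_conv = conv2d_params(7, 7, 512, 4096)    # conv2d_78
head = [
    dense_params(2622, 256),   # cl1
    dense_params(256, 128),    # cl2
    dense_params(128, 5),      # cl3
]

print(first_conv)        # 1792
print(fc_as_conv)        # 102764544
print(head, sum(head))   # [671488, 32896, 645] 705029
```

The head sums to 705,029, matching "Trainable params", and 145,707,907 - 705,029 = 145,002,878 matches the frozen base.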



After training,

tf_keras_face_acc.png Is this overfitting?
tf_keras_face_loss.png Is this really overfitting??!
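One rough way to put a number on the overfitting question is to look at the gap between training and validation accuracy per epoch. The sketch below is only an illustration: `accuracy_gap` is a hypothetical helper, the key names `acc` / `val_acc` are an assumption (newer Keras versions use `accuracy` / `val_accuracy`), and the values are made-up placeholders, not the curves from this post.

```python
def accuracy_gap(history, train_key="acc", val_key="val_acc"):
    """Per-epoch difference between training and validation accuracy."""
    return [
        round(train - val, 4)
        for train, val in zip(history[train_key], history[val_key])
    ]


# Placeholder numbers for illustration only (not results from the post).
toy_history = {
    "acc": [0.70, 0.90, 0.99],
    "val_acc": [0.68, 0.85, 0.86],
}

gaps = accuracy_gap(toy_history)
print(gaps)
```

A gap that keeps widening while validation accuracy stalls is the usual sign of overfitting; with a real run, the dictionary would be `history.history` as returned by `model.fit`.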


The results are almost the same as in the previous post.

There does not seem to be any real performance improvement.


Testing,


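# Assumes tf.keras; custom_vgg_model and yhat_list (class labels) come from the earlier training code.
import numpy as np
from tensorflow.keras.preprocessing.image import load_img, img_to_array

path = [
    './datasets/faces/test/david/david_404.jpg',
    './datasets/faces/test/MrLee/mrlee_496.jpg',
    './datasets/faces/test/jhok/jhok_491.jpg',
    './datasets/faces/test/jinsu/jinsu_428.jpg',
    './datasets/faces/test/swchoi/swchoi_409.jpg',
]

def preprocess_image(image_path):
    # Load at 224x224, add a batch dimension, and scale pixels to [0, 1].
    img = load_img(image_path, target_size=(224, 224))
    img = img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img /= 255.
    return img

for i in path:
    # IPython magic (Jupyter only); on its own line it times an empty statement, not predict().
    %time
    # Take the first (only) row of the batch and map the top class index to its label.
    res = custom_vgg_model.predict(preprocess_image(i))[0,:]
    ans = yhat_list[np.argmax(res, axis=0)]
    print(ans)
    print(i)
    print()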
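The `yhat_list[np.argmax(res)]` step only works if the label list is in the same order as the model's output units. The post does not show how `yhat_list` is built, so the sketch below assumes it follows the sorted folder names (Keras directory iterators sort class folders this way, with uppercase before lowercase). `decode_prediction` is a hypothetical helper and the probability vector is illustrative.

```python
import numpy as np

# Assumed label order: folder names sorted as Python sorts strings.
# The post's actual yhat_list is not shown.
class_dirs = ["david", "MrLee", "jhok", "jinsu", "swchoi"]
yhat_list = sorted(class_dirs)


def decode_prediction(probs, labels):
    """Return (label, probability) for the highest-scoring class."""
    probs = np.asarray(probs)
    idx = int(np.argmax(probs))
    return labels[idx], float(probs[idx])


# Illustrative softmax output, not a result from the post.
example = [0.05, 0.80, 0.05, 0.05, 0.05]
print(yhat_list)
print(decode_prediction(example, yhat_list))
```

Returning the probability alongside the label makes it easy to reject low-confidence predictions instead of always naming one of the five people.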



Results,


CPU times: user 5 µs, sys: 3 µs, total: 8 µs Wall time: 16.2 µs
david
./datasets/faces/test/david/david_404.jpg

CPU times: user 11 µs, sys: 0 ns, total: 11 µs Wall time: 22.4 µs
MrLee
./datasets/faces/test/MrLee/mrlee_496.jpg

CPU times: user 8 µs, sys: 0 ns, total: 8 µs Wall time: 16.7 µs
jhok
./datasets/faces/test/jhok/jhok_491.jpg

CPU times: user 11 µs, sys: 0 ns, total: 11 µs Wall time: 21.9 µs
jinsu
./datasets/faces/test/jinsu/jinsu_428.jpg

CPU times: user 7 µs, sys: 5 µs, total: 12 µs Wall time: 24.8 µs
swchoi
./datasets/faces/test/swchoi/swchoi_409.jpg

All five test photos are labeled correctly. (The microsecond timings come from the bare %time line, so they do not include the predict call.)
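Because `%time` sits on its own line in the loop, the figures above measure an empty statement. To time the prediction itself outside of IPython, a plain wrapper around `time.perf_counter` is enough. This is a sketch: `timed_call` is a hypothetical helper and `fake_predict` is a stand-in for `custom_vgg_model.predict`, used only so the example runs without a model.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_predict(n):
    # Stand-in for custom_vgg_model.predict; it only simulates some work.
    return sum(i * i for i in range(n))


result, seconds = timed_call(fake_predict, 100_000)
print(f"{seconds * 1e3:.2f} ms")
```

In the notebook, `res, seconds = timed_call(custom_vgg_model.predict, preprocess_image(i))` would report the real inference time; putting `%time` on the same line as the `predict` call does the same thing in IPython.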



Validation Score,

tf_keras_face_valscore.png


Not bad.


Finally,

the highlight: let's look at how each layer picked out features.


Testing with the ugliest photo

of myself that I could find,


tf_keras_face_sample.png


From here,


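# Assumes tf.keras; custom_vgg_model, preprocess_image, and path are defined above.
import matplotlib.pyplot as plt
from tensorflow.keras.models import Model

# Collect the outputs of every layer except the first one and the last five (flatten + classifier head).
layer_outputs = [layer.output for layer in custom_vgg_model.layers[1:-5]]
layer_model = Model(inputs=custom_vgg_model.layers[0].input, outputs=layer_outputs)

# One forward pass returns a list with one activation array per selected layer.
res = layer_model.predict(preprocess_image(path[0]))

for i, v in enumerate(res):
    # Show channel 1 of each layer's activation map.
    plt.matshow(v[0, :, :, 1], cmap='viridis')
    plt.show()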
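The loop above draws only channel 1 of each layer. To see several channels of one layer at once, the activation can be tiled into a single 2D grid first. The sketch below uses NumPy only; `tile_channels` is a hypothetical helper, and the random array stands in for one entry of `res` with the (1, 56, 56, 256) shape listed in the summary.

```python
import numpy as np


def tile_channels(feature_map, n_channels=16, n_cols=4):
    """Arrange the first n_channels of a (1, H, W, C) activation into one 2D grid."""
    _, h, w, c = feature_map.shape
    n = min(n_channels, c)
    n_rows = -(-n // n_cols)  # ceiling division
    grid = np.zeros((n_rows * h, n_cols * w), dtype=feature_map.dtype)
    for k in range(n):
        row, col = divmod(k, n_cols)
        grid[row * h:(row + 1) * h, col * w:(col + 1) * w] = feature_map[0, :, :, k]
    return grid


# Random stand-in for one layer's output, not a real activation.
fake_activation = np.random.rand(1, 56, 56, 256).astype("float32")
grid = tile_channels(fake_activation)
print(grid.shape)  # (224, 224)
```

The resulting array can be passed to `plt.matshow(grid, cmap='viridis')` in place of the single-channel slice.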



Running the code like this gives the following

(only a few samples shown).

tf_keras_face_layer01.png
tf_keras_face_layer06.png
tf_keras_face_layer07.png
tf_keras_face_layer08.png
tf_keras_face_layer09.png
tf_keras_face_layer10.png
tf_keras_face_layer11.png
tf_keras_face_layer12.png
tf_keras_face_layer13.png
tf_keras_face_layer14.png
tf_keras_face_layer15.png
tf_keras_face_layer16.png


The images become more and more abstract and strange-looking.

The network learned 'me' and found my features, so there is no reason to take it badly.
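Part of why the deeper maps look so coarse is simply resolution: each pooling step halves the spatial size, so the last convolutional block works on 14x14 maps. The sizes in the model summary can be traced with the sketch below, which assumes 3x3 unpadded convolutions after each ZeroPadding2D layer and 2x2 max-pooling; those kernel sizes are not stated in the post but match the listed shapes. `trace_spatial_size` is a hypothetical helper.

```python
def trace_spatial_size(size, blocks):
    """Follow one spatial dimension through VGG-style blocks.

    blocks: number of (zero-pad 1 + 3x3 conv) pairs in each block;
    every block ends with a 2x2 max-pool.
    """
    sizes = [size]
    for n_convs in blocks:
        for _ in range(n_convs):
            size = size + 2   # ZeroPadding2D adds 1 pixel on each side
            size = size - 2   # 3x3 unpadded convolution removes 1 per side
        size //= 2            # 2x2 max-pooling halves the size
        sizes.append(size)
    return sizes


print(trace_spatial_size(224, [2, 2, 3, 3, 3]))  # [224, 112, 56, 28, 14, 7]
```

The final 7x7 map is what conv2d_78 collapses to 1x1 before the classifier head.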


And that wraps up

image recognition and classification.


The end.


** Note: another post in this series includes a video showing real-time face recognition.
