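Taking on CNN + Transfer Learning with face recognition.
Face recognition, final part.
This is 유윤식 ^^
With deep learning, a person's face can be identified (face recognition).
Here are five volunteers.
The directory structure and the number of photos,
The model,
Viewed like this it looks complicated, so,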
'''
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
zero_padding2d_53_input (Inp (None, 224, 224, 3)       0
zero_padding2d_53 (ZeroPaddi (None, 226, 226, 3)       0
conv2d_65 (Conv2D)           (None, 224, 224, 64)      1792
zero_padding2d_54 (ZeroPaddi (None, 226, 226, 64)      0
conv2d_66 (Conv2D)           (None, 224, 224, 64)      36928
max_pooling2d_20 (MaxPooling (None, 112, 112, 64)      0
zero_padding2d_55 (ZeroPaddi (None, 114, 114, 64)      0
conv2d_67 (Conv2D)           (None, 112, 112, 128)     73856
zero_padding2d_56 (ZeroPaddi (None, 114, 114, 128)     0
conv2d_68 (Conv2D)           (None, 112, 112, 128)     147584
max_pooling2d_21 (MaxPooling (None, 56, 56, 128)       0
zero_padding2d_57 (ZeroPaddi (None, 58, 58, 128)       0
conv2d_69 (Conv2D)           (None, 56, 56, 256)       295168
zero_padding2d_58 (ZeroPaddi (None, 58, 58, 256)       0
conv2d_70 (Conv2D)           (None, 56, 56, 256)       590080
zero_padding2d_59 (ZeroPaddi (None, 58, 58, 256)       0
conv2d_71 (Conv2D)           (None, 56, 56, 256)       590080
max_pooling2d_22 (MaxPooling (None, 28, 28, 256)       0
zero_padding2d_60 (ZeroPaddi (None, 30, 30, 256)       0
conv2d_72 (Conv2D)           (None, 28, 28, 512)       1180160
zero_padding2d_61 (ZeroPaddi (None, 30, 30, 512)       0
conv2d_73 (Conv2D)           (None, 28, 28, 512)       2359808
zero_padding2d_62 (ZeroPaddi (None, 30, 30, 512)       0
conv2d_74 (Conv2D)           (None, 28, 28, 512)       2359808
max_pooling2d_23 (MaxPooling (None, 14, 14, 512)       0
zero_padding2d_63 (ZeroPaddi (None, 16, 16, 512)       0
conv2d_75 (Conv2D)           (None, 14, 14, 512)       2359808
zero_padding2d_64 (ZeroPaddi (None, 16, 16, 512)       0
conv2d_76 (Conv2D)           (None, 14, 14, 512)       2359808
zero_padding2d_65 (ZeroPaddi (None, 16, 16, 512)       0
conv2d_77 (Conv2D)           (None, 14, 14, 512)       2359808
max_pooling2d_24 (MaxPooling (None, 7, 7, 512)         0
conv2d_78 (Conv2D)           (None, 1, 1, 4096)        102764544
dropout_14 (Dropout)         (None, 1, 1, 4096)        0
conv2d_79 (Conv2D)           (None, 1, 1, 4096)        16781312
dropout_15 (Dropout)         (None, 1, 1, 4096)        0
conv2d_80 (Conv2D)           (None, 1, 1, 2622)        10742334
flatten (Flatten)            (None, 2622)              0
cl1 (Dense)                  (None, 256)               671488
cl2 (Dense)                  (None, 128)               32896
dropout_16 (Dropout)         (None, 128)               0
cl3 (Dense)                  (None, 5)                 645
=================================================================
Total params: 145,707,907
Trainable params: 705,029
Non-trainable params: 145,002,878
'''
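As a sanity check on the summary above, the parameter counts can be reproduced by hand. This is only a rough sketch: the kernel sizes (3x3, 7x7, 1x1) are not stated in the post and are inferred here from the reported parameter counts, and the helper names are made up for illustration. It shows that only the three new Dense layers (cl1, cl2, cl3) are trained, while the whole pretrained convolution stack stays frozen.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter for a Conv2D layer."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit for a Dense layer."""
    return in_units * out_units + out_units


# Frozen convolution stack as (kernel_h, kernel_w, in_channels, filters).
# Kernel sizes are an assumption inferred from the summary's Param # column.
frozen_convs = [
    (3, 3, 3, 64), (3, 3, 64, 64),
    (3, 3, 64, 128), (3, 3, 128, 128),
    (3, 3, 128, 256), (3, 3, 256, 256), (3, 3, 256, 256),
    (3, 3, 256, 512), (3, 3, 512, 512), (3, 3, 512, 512),
    (3, 3, 512, 512), (3, 3, 512, 512), (3, 3, 512, 512),
    (7, 7, 512, 4096), (1, 1, 4096, 4096), (1, 1, 4096, 2622),
]

# Newly added classifier head: cl1, cl2, cl3 as (in_units, out_units).
new_head = [(2622, 256), (256, 128), (128, 5)]

non_trainable = sum(conv2d_params(*c) for c in frozen_convs)
trainable = sum(dense_params(*d) for d in new_head)
total = non_trainable + trainable

print("non-trainable:", non_trainable)
print("trainable:    ", trainable)
print("total:        ", total)
```

The three printed numbers match the "Non-trainable params", "Trainable params" and "Total params" lines of the summary, so less than 0.5% of the weights are actually updated during training.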
Running the training,
the results are almost the same as in the previous post.
There seems to be no real performance improvement.
Test,
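import numpy as np
# Import path assumes tf.keras; standalone Keras uses keras.preprocessing.image.
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# One test image per person.
path = [
    './datasets/faces/test/david/david_404.jpg',
    './datasets/faces/test/MrLee/mrlee_496.jpg',
    './datasets/faces/test/jhok/jhok_491.jpg',
    './datasets/faces/test/jinsu/jinsu_428.jpg',
    './datasets/faces/test/swchoi/swchoi_409.jpg',
]

def preprocess_image(image_path):
    # Resize to the 224x224 input the model expects and scale pixels to [0, 1].
    img = load_img(image_path, target_size=(224, 224))
    img = img_to_array(img)
    img = np.expand_dims(img, axis=0)  # add the batch dimension
    img /= 255.
    return img

# custom_vgg_model and yhat_list (the list of class names) are assumed to be defined already.
for i in path:
    # Note: %time on a line by itself times an empty statement, not predict().
    %time
    res = custom_vgg_model.predict(preprocess_image(i))[0, :]
    ans = yhat_list[np.argmax(res, axis=0)]
    print(ans)
    print(i)
    print()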
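The last step of the loop above, turning the model's 5-element output into a name, can be sketched without the model. The label order and the score vector below are made up for illustration (the real `yhat_list` and predictions come from the trained model), and `top_label` is a hypothetical helper, not part of the original code.

```python
# Hypothetical label order; the real yhat_list is built elsewhere.
yhat_list = ['david', 'MrLee', 'jhok', 'jinsu', 'swchoi']


def top_label(scores, labels):
    """Return (label, score) for the highest-scoring class."""
    if len(scores) != len(labels):
        raise ValueError("one score per label is required")
    # Plain-Python equivalent of np.argmax over a 1-D score vector.
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best], scores[best]


# Made-up class scores for one image (not a real prediction).
example_scores = [0.02, 0.91, 0.03, 0.01, 0.03]
label, score = top_label(example_scores, yhat_list)
print(label, score)
```

Returning the score along with the label makes it easy to spot low-confidence predictions, which a bare `argmax` hides.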
Results,
CPU times: user 5 µs, sys: 3 µs, total: 8 µs
Wall time: 16.2 µs
david
./datasets/faces/test/david/david_404.jpg

CPU times: user 11 µs, sys: 0 ns, total: 11 µs
Wall time: 22.4 µs
MrLee
./datasets/faces/test/MrLee/mrlee_496.jpg

CPU times: user 8 µs, sys: 0 ns, total: 8 µs
Wall time: 16.7 µs
jhok
./datasets/faces/test/jhok/jhok_491.jpg

CPU times: user 11 µs, sys: 0 ns, total: 11 µs
Wall time: 21.9 µs
jinsu
./datasets/faces/test/jinsu/jinsu_428.jpg

CPU times: user 7 µs, sys: 5 µs, total: 12 µs
Wall time: 24.8 µs
swchoi
./datasets/faces/test/swchoi/swchoi_409.jpg
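A note on the timings: in the loop, `%time` sits on a line by itself, so the microsecond figures most likely measure an empty statement rather than the `predict` call. One way to time the call itself, outside of notebook magics, is sketched below. `timed` is a hypothetical helper and `fake_predict` is a stand-in so the snippet runs without the model.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_predict(x):
    """Stand-in for custom_vgg_model.predict; sleeps briefly, returns its input."""
    time.sleep(0.01)
    return x


result, elapsed = timed(fake_predict, "image")
print(f"{elapsed * 1000:.1f} ms")
```

With the real model this would be called as `timed(custom_vgg_model.predict, preprocess_image(i))`. In a notebook, putting the call on the same line as the magic (`%time res = ...`) should have the same effect.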
Validation Score,
Not bad.
Finally,
the highlight: checking how each layer picked out features.
Taking, from my own photos,
the ugliest one as the test input,
here,
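import matplotlib.pyplot as plt
from tensorflow.keras.models import Model  # assumes tf.keras

# Skip the input layer and the last five layers (flatten, cl1, cl2, dropout_16, cl3),
# whose outputs are not 4-D feature maps.
layer_outputs = [layer.output for layer in custom_vgg_model.layers[1:-5]]
layer_model = Model(inputs=custom_vgg_model.layers[0].input, outputs=layer_outputs)
res = layer_model.predict(preprocess_image(path[0]))
for i, v in enumerate(res):
    # Plot channel 1 of each layer's activation for the single input image.
    plt.matshow(v[0, :, :, 1], cmap='viridis')
    plt.show()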
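Running the code like this,
(Only a few samples are checked.)
the images become increasingly abstract and strange-looking.
The model learned 'me' and found my features, so there is no reason to take it badly.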
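Part of why the deeper maps look so abstract is that their resolution keeps shrinking. The sketch below traces the spatial size through the network using the usual padding, convolution and pooling formulas. The kernel and pool sizes (3x3 convolutions after 1-pixel padding, 2x2 pooling with stride 2, a final 7x7 convolution) are assumptions inferred from the output shapes in the model summary, not values stated in the post.

```python
def zero_pad(size, pad=1):
    """Spatial size after padding `pad` pixels on each side."""
    return size + 2 * pad


def conv_valid(size, kernel=3, stride=1):
    """Spatial size after a convolution without implicit padding."""
    return (size - kernel) // stride + 1


def max_pool(size, pool=2, stride=2):
    """Spatial size after max pooling."""
    return (size - pool) // stride + 1


size = 224
trace = [size]
# Five blocks with 2, 2, 3, 3, 3 (pad + conv) pairs, each followed by a pool.
for convs_in_block in (2, 2, 3, 3, 3):
    for _ in range(convs_in_block):
        size = conv_valid(zero_pad(size))  # pad 1 + 3x3 conv keeps the size
    size = max_pool(size)                  # each pool halves the size
    trace.append(size)

# Assumed 7x7 convolution that collapses the 7x7 map to a single position.
size = conv_valid(size, kernel=7)
trace.append(size)

print(trace)  # [224, 112, 56, 28, 14, 7, 1]
```

By the last pooling layer each plotted map is only 7x7 pixels, so every pixel summarizes a large region of the original face.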
That's it for
image recognition and classification.
The end.
** Note: another post in this series shows real-time face recognition running on actual video.