System Testing

Training the Neural Network

The training parameters are set as follows:

# Initialize parameters
num_classes = 7
width, height = 48, 48
num_epochs = 300
batch_size = 128
num_features = 64
rate_drop = 0.1

Run the training:

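from keras.callbacks import EarlyStopping, ReduceLROnPlateau

# Stop when val_loss has not improved for 10 epochs and restore the best weights
es = EarlyStopping(monitor='val_loss', patience=10, mode='min', restore_best_weights=True)

# Multiply the learning rate by 0.75 when val_accuracy has not improved for 5 epochs
reduce_lr = ReduceLROnPlateau(monitor='val_accuracy', factor=0.75, patience=5, verbose=1)

# The generator already yields batches, so batch_size goes to flow() rather than fit()
history = model.fit(data_generator.flow(train_X, train_Y, batch_size=batch_size),
                    # steps_per_epoch=len(train_X) / batch_size,
                    epochs=num_epochs,
                    verbose=2,
                    callbacks=[es, reduce_lr],
                    validation_data=(val_X, val_Y))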

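Note that the code above uses two callbacks to monitor the network:

  1. Overfitting monitor (EarlyStopping): if the validation loss has not reached a new minimum for 10 consecutive epochs, training stops and the best weights are restored.
  2. Learning-rate monitor (ReduceLROnPlateau): if the validation accuracy has not improved for 5 consecutive epochs, the learning rate is multiplied by 0.75.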
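
The patience rule behind the first callback can be illustrated with a small pure-Python sketch. This is a simplified model of the idea, not the Keras implementation; the function name `early_stopping_epoch` and the loss values are hypothetical and are not taken from the run in this post.

```python
def early_stopping_epoch(val_losses, patience=10):
    """Simulate patience-based early stopping on a list of validation losses.

    Returns (stop_epoch, best_epoch), both 1-based. stop_epoch is None when
    the patience limit is never reached.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Hypothetical validation losses, chosen only to illustrate the rule
losses = [1.9, 1.7, 1.6, 1.65, 1.62, 1.61]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print(stop_epoch, best_epoch)
```

With `restore_best_weights=True`, the model kept after stopping corresponds to `best_epoch` rather than to the last epoch that was run.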

Part of the training output is shown below:

Output
2021-12-26 05:35:09.313687: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)

Epoch 1/300

2021-12-26 05:35:10.991111: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8005

225/225 - 20s - loss: 2.0907 - accuracy: 0.2503 - val_loss: 1.9219 - val_accuracy: 0.2494

Epoch 2/300

225/225 - 12s - loss: 1.8205 - accuracy: 0.2714 - val_loss: 1.8866 - val_accuracy: 0.2611

Epoch 3/300

225/225 - 12s - loss: 1.6999 - accuracy: 0.3240 - val_loss: 1.8933 - val_accuracy: 0.3090

……

Epoch 00020: ReduceLROnPlateau reducing learning rate to 0.007499999832361937.
……

Epoch 00037: ReduceLROnPlateau reducing learning rate to 0.005624999874271452.
……

Epoch 00048: ReduceLROnPlateau reducing learning rate to 0.004218749818392098.

Epoch 49/300
225/225 - 13s - loss: 0.7174 - accuracy: 0.7382 - val_loss: 1.0229 - val_accuracy: 0.6559

We observe that the learning rate was reduced three times during training, and that training was stopped early at epoch 49.
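
The three logged learning rates are consistent with repeated multiplication by `factor=0.75`. The sketch below reproduces them, assuming an initial learning rate of 0.01. That value is an inference from the first logged rate (0.0075 / 0.75) and is not stated in this section.

```python
initial_lr = 0.01  # assumption: inferred from the first logged value, 0.0075 / 0.75
factor = 0.75      # from ReduceLROnPlateau(factor=0.75)

# Learning rate after the 1st, 2nd and 3rd reduction
lrs = [initial_lr * factor ** k for k in range(1, 4)]
for k, lr in enumerate(lrs, start=1):
    print(f"after reduction {k}: {lr:.8f}")
```

The results agree with the logged values up to small floating-point differences, which is expected when the optimizer stores the rate in single precision.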

Visualizing the Training Results

Figure: accuracy and loss curves

The code is as follows:

%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2, figsize=(18, 6))
# Plot training and validation accuracy
axes[0].plot(history.history['accuracy'])
axes[0].plot(history.history['val_accuracy'])
axes[0].set_title('Model accuracy')
axes[0].set_ylabel('Accuracy')
axes[0].set_xlabel('Epoch')
axes[0].legend(['Train', 'Validation'], loc='upper left')

# Plot training and validation loss
axes[1].plot(history.history['loss'])
axes[1].plot(history.history['val_loss'])
axes[1].set_title('Model loss')
axes[1].set_ylabel('Loss')
axes[1].set_xlabel('Epoch')
axes[1].legend(['Train', 'Validation'], loc='upper left')
plt.show()

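The curves show that the network overfits slightly in the later epochs: the training accuracy keeps rising while the validation accuracy lags behind it.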
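
The same trend can be read off the training log by computing the gap between training and validation accuracy. The short sketch below uses only the four epochs that appear in the log excerpt above. It is a rough indicator and not a formal overfitting test.

```python
# (epoch, training accuracy, validation accuracy) copied from the training log above
logged = [
    (1, 0.2503, 0.2494),
    (2, 0.2714, 0.2611),
    (3, 0.3240, 0.3090),
    (49, 0.7382, 0.6559),
]

# Gap between training and validation accuracy for each logged epoch
gaps = {epoch: round(train - val, 4) for epoch, train, val in logged}
for epoch, gap in gaps.items():
    print(f"epoch {epoch:>2}: train - val = {gap:.4f}")
```

The gap grows from under 0.001 in epoch 1 to about 0.08 in epoch 49.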

Evaluating on the Test Set

We evaluate the model on the test set with the following code:

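import numpy as np
from sklearn.metrics import accuracy_score

# Convert one-hot labels and predicted probabilities to class indices
test_true = np.argmax(test_Y, axis=1)
test_pred = np.argmax(model.predict(test_X), axis=1)
print("CNN Model Accuracy on test set: {:.4f}".format(accuracy_score(test_true, test_pred)))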

The output is:

Output
CNN Model Accuracy on test set: 0.6704
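
To make the evaluation step concrete, here is a minimal NumPy-only sketch of what happens to the labels and predictions. The arrays are hypothetical toy data with 3 classes (the real model has 7), and the accuracy is computed as a plain mean, which gives the same result as `accuracy_score` for single-label classification.

```python
import numpy as np

# Hypothetical toy data: 4 samples, 3 classes
y_onehot = np.array([[1, 0, 0],
                     [0, 1, 0],
                     [0, 0, 1],
                     [0, 1, 0]])
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.4, 0.3],
                   [0.2, 0.5, 0.3]])

y_true = np.argmax(y_onehot, axis=1)  # index of the 1 in each one-hot row
y_pred = np.argmax(y_prob, axis=1)    # index of the most probable class
accuracy = float(np.mean(y_true == y_pred))
print(y_true, y_pred, accuracy)
```

In this toy example the third sample is misclassified, so three of the four predictions are correct.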

The final accuracy of our VGGNet network on each dataset is shown in the table below.

Dataset      Accuracy
Train        73.28%
Validation   65.59%
Test         67.04%

Analysis with a Confusion Matrix

We plot the confusion matrix to see which expressions are confused with one another. The code is as follows:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix


def plot_confusion_matrix(y_true, y_pred, classes,
                          normalize=False,
                          title=None,
                          cmap=plt.cm.Blues):
    """
    Compute and plot the confusion matrix.
    Set normalize=True to normalize each row by its true-class count.
    """
    if not title:
        if normalize:
            title = 'Normalized confusion matrix'
        else:
            title = 'Confusion matrix, without normalization'

    # Compute the confusion matrix
    cm = confusion_matrix(y_true, y_pred)
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]

    fig, ax = plt.subplots(figsize=(12, 6))
    im = ax.imshow(cm, interpolation='nearest', cmap=cmap)
    ax.figure.colorbar(im, ax=ax)
    # Show all ticks...
    ax.set(xticks=np.arange(cm.shape[1]),
           yticks=np.arange(cm.shape[0]),
           # ... and label them with the corresponding class names
           xticklabels=classes, yticklabels=classes,
           title=title,
           ylabel='True label',
           xlabel='Predicted label')

    # Rotate the x-axis tick labels and set their alignment
    plt.setp(ax.get_xticklabels(), rotation=45, ha="right",
             rotation_mode="anchor")

    # Loop over the data dimensions and create text annotations
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            ax.text(j, i, format(cm[i, j], fmt),
                    ha="center", va="center",
                    color="white" if cm[i, j] > thresh else "black")
    fig.tight_layout()
    return ax

# %%
# Plot the normalized confusion matrix
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
plot_confusion_matrix(test_true, test_pred, classes=emotion_labels, normalize=True, title='Normalized confusion matrix')
plt.show()

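The resulting confusion matrix is shown in the figure below. It shows that Disgust is the expression most easily confused with the others, which is because the Disgust class has very few samples to begin with.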

Figure: confusion matrix
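
The row normalization applied by `normalize=True` can be seen in isolation with a small NumPy sketch. The counts below are hypothetical and describe a two-class problem with one frequent and one rare class. They are not results from this model.

```python
import numpy as np

# Hypothetical counts: rows = true class, columns = predicted class
cm = np.array([[90, 10],   # frequent class: 100 samples
               [4, 6]])    # rare class: 10 samples

# Divide each row by the number of true samples in that class
cm_norm = cm.astype(float) / cm.sum(axis=1, keepdims=True)
print(cm_norm)
```

After normalization every row sums to 1 and the diagonal holds the per-class recall. The 4 misclassified samples of the rare class look negligible as raw counts, but they amount to 40% of that class, which is why a normalized matrix is the more informative view when class sizes differ a lot.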

Real-Time Facial Expression Recognition

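We save the trained model locally, then use a webcam to capture faces in real time and recognize their expressions. The approach is as follows: for each captured frame, a face detector first locates the face region; that region is then converted to grayscale and resized to 48×48; finally it is fed to our model, which predicts the corresponding expression. The code is as follows: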

import sys

import cv2 as cv
import numpy as np
from keras import models

model = models.load_model('./FER_Model.h5')

emotion_map = {0: 'Angry', 1: 'Disgust', 2: 'Fear', 3: 'Happy', 4: 'Sad', 5: 'Surprise', 6: 'Neutral'}

# Load the face detector once, outside the capture loop
classifier = cv.CascadeClassifier("./haarcascade_frontalface_default.xml")
color = (0, 0, 255)

cap = cv.VideoCapture(0)
if not cap.isOpened():
    print("Can not open camera!")
    sys.exit(1)

while True:
    # Capture frame by frame
    ret, frame = cap.read()
    if not ret:
        # Stop if no frame could be read
        break
    # Convert to grayscale
    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
    faceRects = classifier.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=3, minSize=(32, 32))

    for (x, y, w, h) in faceRects:  # handle each detected face separately
        # Draw a box around the face (w is the width, h is the height)
        cv.rectangle(frame, (x, y), (x + w, y + h), color, 2)
        # Crop the face region: rows span the height, columns span the width
        src = gray[y:y + h, x:x + w]
        # Resize to 48*48
        img = cv.resize(src, (48, 48))
        # Normalize to [0, 1]
        img = img / 255.
        # Add the batch and channel dimensions
        inputs = np.array(img, dtype='float32').reshape(-1, 48, 48, 1)
        # Predict the expression
        preds = model.predict(inputs)
        output_class = int(np.argmax(preds[0]))
        cv.putText(frame, emotion_map[output_class], (200, 100), cv.FONT_HERSHEY_COMPLEX,
                   2.0, (0, 0, 250), 5)
    cv.imshow("frame", frame)
    if cv.waitKey(1) == ord('q'):
        break
cap.release()
cv.destroyAllWindows()

In the code above, haarcascade_frontalface_default.xml is the face detector provided by OpenCV.
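
The preprocessing applied to each detected face (resize, normalize, reshape) can be checked without a camera. The sketch below is a NumPy-only approximation: the helper name `preprocess_face` is hypothetical, and the nearest-neighbour indexing is a stand-in for `cv.resize`, used here only so that the example has no OpenCV dependency. Its pixel values will differ from those of OpenCV's default interpolation.

```python
import numpy as np


def preprocess_face(gray_face, size=48):
    """Turn a 2-D uint8 grayscale crop into a (1, size, size, 1) float32 batch."""
    h, w = gray_face.shape
    # Nearest-neighbour index maps (stand-in for cv.resize in this sketch)
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = gray_face[rows[:, None], cols]
    # Scale pixel values to [0, 1]
    scaled = resized.astype("float32") / 255.0
    # Add the batch and channel dimensions expected by the model
    return scaled.reshape(1, size, size, 1)


# Hypothetical 120x100 crop filled with random pixel values
rng = np.random.default_rng(0)
face = rng.integers(0, 256, size=(120, 100), dtype=np.uint8)
batch = preprocess_face(face)
print(batch.shape, batch.dtype)
```

The resulting array has the shape that the real-time loop passes to `model.predict`: one sample, 48 by 48 pixels, one channel.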

A sample recognition result is shown in the figure below.