System Testing

Training the Neural Network

The training parameters are set as follows:

# Initialize parameters
num_classes = 7
width, height = 48, 48
num_epochs = 300
batch_size = 128
num_features = 64
rate_drop = 0.1

Run the training:

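from keras.callbacks import EarlyStopping, ReduceLROnPlateau

# Stop when val_loss has not improved for 10 epochs and restore the best weights
es = EarlyStopping(monitor='val_loss', patience=10, mode='min', restore_best_weights=True)

# Multiply the learning rate by 0.75 when val_accuracy has not improved for 5 epochs
reduce_lr = ReduceLROnPlateau(monitor='val_accuracy', factor=0.75, patience=5, verbose=1)

# The generator already yields batches of batch_size, so fit() is not given batch_size
history = model.fit(data_generator.flow(train_X, train_Y, batch_size=batch_size),
                    epochs=num_epochs,
                    verbose=2,
                    callbacks=[es, reduce_lr],
                    validation_data=(val_X, val_Y))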

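Note that the code above uses two callbacks to monitor the network:

Overfitting monitor (EarlyStopping): if the validation loss has not reached a new minimum for 10 consecutive epochs, training stops and the weights from the best epoch are restored.
Learning-rate monitor (ReduceLROnPlateau): if the validation accuracy has not improved for 5 consecutive epochs, the learning rate is multiplied by 0.75.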
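
The two rules above can be sketched in plain Python. This is a simplified replay of the patience logic, not the Keras implementation: it ignores options such as min_delta and cooldown, and the function name simulate_callbacks, the default lr=0.01, and the toy numbers are assumptions made for illustration.

```python
def simulate_callbacks(val_losses, val_accs, lr=0.01,
                       stop_patience=10, lr_patience=5, factor=0.75):
    """Replay the patience logic of early stopping and learning-rate reduction.

    Returns (stop_epoch, best_epoch, final_lr). Epochs are 1-based and
    stop_epoch is None if training would run through every epoch.
    """
    best_loss, best_epoch, stop_wait = float("inf"), None, 0
    best_acc, lr_wait = float("-inf"), 0
    for epoch, (loss, acc) in enumerate(zip(val_losses, val_accs), start=1):
        # Learning-rate rule: watch validation accuracy (higher is better).
        if acc > best_acc:
            best_acc, lr_wait = acc, 0
        else:
            lr_wait += 1
            if lr_wait >= lr_patience:
                lr *= factor
                lr_wait = 0
        # Early-stopping rule: watch validation loss (lower is better).
        if loss < best_loss:
            best_loss, best_epoch, stop_wait = loss, epoch, 0
        else:
            stop_wait += 1
            if stop_wait >= stop_patience:
                return epoch, best_epoch, lr
    return None, best_epoch, lr


# Toy curves with short patience values so the effect is visible quickly.
toy_losses = [1.9, 1.5, 1.2, 1.3, 1.25, 1.4]
toy_accs = [0.30, 0.45, 0.55, 0.54, 0.55, 0.53]
print(simulate_callbacks(toy_losses, toy_accs, stop_patience=3, lr_patience=2))
```

With these toy curves, the learning rate is reduced once, training stops at epoch 6, and epoch 3 is the one whose weights would be restored.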

Part of the training output is shown below:

Output

2021-12-26 05:35:09.313687: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)

Epoch 1/300

2021-12-26 05:35:10.991111: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8005

225/225 - 20s - loss: 2.0907 - accuracy: 0.2503 - val_loss: 1.9219 - val_accuracy: 0.2494

Epoch 2/300

225/225 - 12s - loss: 1.8205 - accuracy: 0.2714 - val_loss: 1.8866 - val_accuracy: 0.2611

Epoch 3/300

225/225 - 12s - loss: 1.6999 - accuracy: 0.3240 - val_loss: 1.8933 - val_accuracy: 0.3090

……

Epoch 00020: ReduceLROnPlateau reducing learning rate to 0.007499999832361937.
……

Epoch 00037: ReduceLROnPlateau reducing learning rate to 0.005624999874271452.
……

Epoch 00048: ReduceLROnPlateau reducing learning rate to 0.004218749818392098.

Epoch 49/300
225/225 - 13s - loss: 0.7174 - accuracy: 0.7382 - val_loss: 1.0229 - val_accuracy: 0.6559

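We observe that the learning rate was reduced three times during training (at epochs 20, 37, and 48) and that early stopping ended training at epoch 49.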
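
The logged learning rates are consistent with repeated multiplication by the factor 0.75. A quick check, assuming an initial learning rate of 0.01 (this value is inferred from 0.0075 / 0.75 and is not stated in the text; lr_after is a hypothetical helper):

```python
initial_lr = 0.01  # assumption: inferred from the first logged reduction
factor = 0.75      # from ReduceLROnPlateau(factor=0.75)


def lr_after(reductions, initial_lr=initial_lr, factor=factor):
    """Learning rate after the given number of reductions."""
    return initial_lr * factor ** reductions


# Epoch numbers are taken from the training log above.
for k, epoch in enumerate([20, 37, 48], start=1):
    print(f"reduction {k} (epoch {epoch}): lr = {lr_after(k):.8f}")
```

The small differences from the logged values (for example 0.007499999832361937) come from the 32-bit floating-point representation used during training.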

Visualizing Training Results

Accuracy and Loss Curves

The code is as follows:

%matplotlib inline
%config InlineBackend.figure_format = 'svg'
fig, axes = plt.subplots(1, 2, figsize=(18, 6))
# Plot the training and validation accuracy curves
axes[0].plot(history.history['accuracy'])
axes[0].plot(history.history['val_accuracy'])
axes[0].set_title('Model accuracy')
axes[0].set_ylabel('Accuracy')
axes[0].set_xlabel('Epoch')
axes[0].legend(['Train', 'Validation'], loc='upper left')

# Plot the training and validation loss curves
axes[1].plot(history.history['loss'])
axes[1].plot(history.history['val_loss'])
axes[1].set_title('Model loss')
axes[1].set_ylabel('Loss')
axes[1].set_xlabel('Epoch')
axes[1].legend(['Train', 'Validation'], loc='upper left')
plt.show()

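The curves show that the network overfits slightly in the later epochs: in the last logged epoch, training accuracy (73.82%) is clearly above validation accuracy (65.59%).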
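
The same observation can be read from the numbers instead of the plot. The sketch below computes the epoch with the lowest validation loss and the train/validation accuracy gap per epoch. The helper summarize_history and the toy values are made up for illustration and are not the run reported above; in the notebook, history.history would be passed in instead.

```python
def summarize_history(history):
    """Return the best epoch by validation loss and the accuracy gap per epoch."""
    val_loss = history["val_loss"]
    # Epochs are reported 1-based, as in the Keras log.
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
    gaps = [train - val
            for train, val in zip(history["accuracy"], history["val_accuracy"])]
    return best_epoch, gaps


# Made-up numbers for illustration only.
toy_history = {
    "accuracy": [0.30, 0.50, 0.65, 0.74],
    "val_accuracy": [0.31, 0.48, 0.62, 0.66],
    "val_loss": [1.90, 1.40, 1.05, 1.10],
}
best_epoch, gaps = summarize_history(toy_history)
print("best epoch by val_loss:", best_epoch)
print("accuracy gap per epoch:", [round(g, 2) for g in gaps])
```

A gap that keeps widening after the best epoch is the usual numerical sign of the overfitting seen in the curves.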

Evaluating on the Test Set

We evaluate the model on the test set with the following code:


test_true = np.argmax(test_Y, axis=1)
test_pred = np.argmax(model.predict(test_X), axis=1)
print("CNN Model Accuracy on test set: {:.4f}".format(accuracy_score(test_true, test_pred)))

The output is as follows:

Output

CNN Model Accuracy on test set: 0.6704

The final accuracy of our VGGNet network on each dataset is shown in the table below.

Dataset	Accuracy
Train	73.28%
Validation	65.59%
Test	67.04%

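Analysis with a Confusion Matrix

We plot the confusion matrix to analyze whether expressions are confused with one another. The code is as follows: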

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix


def plot_confusion_matrix(y_true, y_pred, classes,
                          normalize=False,
                          title=None,
                          cmap=plt.cm.Blues):
    """
    Compute and plot the confusion matrix.
    Set `normalize=True` to normalize each row (true label) so that it sums to 1.
    """
    if not title:
        if normalize:
            title = 'Normalized confusion matrix'
        else:
            title = 'Confusion matrix, without normalization'

    # Compute the confusion matrix
    cm = confusion_matrix(y_true, y_pred)
    if normalize:
        # Divide each row by its total so that entries are per-class fractions
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]

    fig, ax = plt.subplots(figsize=(12, 6))
    im = ax.imshow(cm, interpolation='nearest', cmap=cmap)
    ax.figure.colorbar(im, ax=ax)
    # Show all ticks...
    ax.set(xticks=np.arange(cm.shape[1]),
           yticks=np.arange(cm.shape[0]),
           # ... and label them with the respective list entries
           xticklabels=classes, yticklabels=classes,
           title=title,
           ylabel='True label',
           xlabel='Predicted label')

    # Rotate the x-axis tick labels and set their alignment.
    plt.setp(ax.get_xticklabels(), rotation=45, ha="right",
             rotation_mode="anchor")

    # Loop over the data dimensions and create text annotations
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            ax.text(j, i, format(cm[i, j], fmt),
                    ha="center", va="center",
                    color="white" if cm[i, j] > thresh else "black")
    fig.tight_layout()
    return ax

# %%
# Plot the normalized confusion matrix
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
plot_confusion_matrix(test_true, test_pred, classes=emotion_labels, normalize=True, title='Normalized confusion matrix')
plt.show()

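The resulting confusion matrix is shown in the figure below. It shows that Disgust is easily confused with other expressions, because the dataset contains very few Disgust samples.

Confusion matrix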
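
To make the normalization step concrete, here is a minimal sketch of row normalization in plain Python. The helper normalize_rows and the three-class counts are hypothetical and are not taken from the experiment above; they only illustrate why a class with few samples produces unstable rows.

```python
def normalize_rows(cm):
    """Divide each row of a count matrix by its row sum (rows are true labels)."""
    normalized = []
    for row in cm:
        total = sum(row)
        # A class with no samples has a zero row sum; keep that row as zeros.
        normalized.append([count / total if total else 0.0 for count in row])
    return normalized


# Hypothetical counts for three classes.
labels = ["Happy", "Sad", "Disgust"]
counts = [
    [90, 8, 2],   # true Happy
    [10, 85, 5],  # true Sad
    [3, 3, 4],    # true Disgust: only 10 samples in total
]
for label, row in zip(labels, normalize_rows(counts)):
    print(f"{label:8s}", [round(v, 2) for v in row])
```

After normalization, each diagonal entry is the fraction of that class predicted correctly. In a row built from only 10 samples, a single misclassified image shifts an entry by 0.1, so small classes look noisier than large ones.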

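Real-Time Facial Expression Recognition

The trained model is saved locally, a webcam captures faces in real time, and the corresponding expression is recognized. The approach is as follows: each captured frame is converted to grayscale, a face detector locates the face regions, each region is resized to 48×48, and the result is fed to our model, which predicts the expression. The code is as follows: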

import cv2 as cv
import numpy as np
from keras import models

model = models.load_model('./FER_Model.h5')

emotion_map = {0: 'Angry', 1: 'Disgust', 2: 'Fear', 3: 'Happy', 4: 'Sad', 5: 'Surprise', 6: 'Neutral'}

# Load the face detector once, outside the capture loop
classifier = cv.CascadeClassifier("./haarcascade_frontalface_default.xml")
color = (0, 0, 255)

cap = cv.VideoCapture(0)
if not cap.isOpened():
    print("Can not open camera!")
    exit()

while True:
    # Capture frame by frame
    ret, frame = cap.read()
    if not ret:
        # Stop if no frame could be read
        break
    # Convert to grayscale
    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
    faceRects = classifier.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=3, minSize=(32, 32))

    # The result is empty when no face is found
    for (x, y, w, h) in faceRects:  # handle each detected face separately
        # Draw a box around the face (w is the width, h is the height)
        cv.rectangle(frame, (x, y), (x + w, y + h), color, 2)
        # Crop the face region; rows are indexed by y, columns by x
        src = gray[y:y + h, x:x + w]
        # Resize to 48*48
        img = cv.resize(src, (48, 48))
        # Scale pixel values to [0, 1]
        img = img / 255.
        # Add the batch and channel dimensions: (1, 48, 48, 1)
        inputs = np.array(img, dtype='float32').reshape(-1, 48, 48, 1)
        # Predict the expression
        probs = model.predict(inputs)
        output_class = int(np.argmax(probs[0]))
        cv.putText(frame, emotion_map[output_class], (200, 100), cv.FONT_HERSHEY_COMPLEX,
                   2.0, (0, 0, 250), 5)
    cv.imshow("frame", frame)
    if cv.waitKey(1) == ord('q'):
        break
cap.release()
cv.destroyAllWindows()

In the code above, haarcascade_frontalface_default.xml is the pretrained frontal-face detector provided by OpenCV.
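
The last step of the loop, turning the model output into a label, can be isolated as a small function. This is a sketch under the assumption that the model outputs one score per class in the same order as emotion_map (the output layer is not shown in this section). The helper decode_prediction is hypothetical; it also returns the winning score, which the code above does not display.

```python
EMOTION_MAP = {0: 'Angry', 1: 'Disgust', 2: 'Fear', 3: 'Happy',
               4: 'Sad', 5: 'Surprise', 6: 'Neutral'}


def decode_prediction(probs, emotion_map=EMOTION_MAP):
    """Return (label, score) for the highest-scoring class."""
    if len(probs) != len(emotion_map):
        raise ValueError("expected one score per emotion class")
    # Index of the largest score, equivalent to an argmax over the vector.
    best = max(range(len(probs)), key=lambda i: probs[i])
    return emotion_map[best], float(probs[best])


# Made-up score vector for a single face.
print(decode_prediction([0.05, 0.05, 0.10, 0.60, 0.10, 0.05, 0.05]))
```

In the capture loop, the first row of the prediction (one face per call) would be passed to this function.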

A sample recognition result is shown in the figure below.