Multi-Modal Emotion Recognition
This project builds a multi-modal emotion recognition system from facial and audio features. The two modalities are fused in a common subspace: each feature set is first z-score normalized, then combined via Canonical Correlation Analysis (CCA). SVM classifiers are trained on spatiotemporal features extracted from 50 videos of 5 participants and evaluated on the remaining data, with Leave-One-Subject-Out (LOSO) cross-validation used for reliable performance estimation. A sketch of this pipeline is shown below.
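The following is a minimal sketch of the normalize-fuse-classify-evaluate pipeline using scikit-learn, not the project's actual implementation. The feature dimensions, the 6-participants-by-10-videos split (chosen so each LOSO fold trains on 50 videos from 5 participants, matching the description), the number of emotion classes, the CCA component count, and the RBF kernel are all assumptions for illustration.

```python
# Hypothetical sketch: z-score normalization, CCA fusion, SVM, LOSO CV.
# Feature arrays and dataset sizes below are placeholders, not project data.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_videos, facial_dim, audio_dim = 60, 136, 40      # assumed sizes
X_face = rng.normal(size=(n_videos, facial_dim))   # facial spatiotemporal features
X_audio = rng.normal(size=(n_videos, audio_dim))   # audio features
y = rng.integers(0, 6, size=n_videos)              # emotion labels (6 classes assumed)
groups = np.repeat(np.arange(6), 10)               # participant ID per video (6 x 10 assumed)

logo = LeaveOneGroupOut()
accuracies = []
for train_idx, test_idx in logo.split(X_face, y, groups):
    # z-score normalize each modality using training-fold statistics only,
    # so no information leaks from the held-out participant
    face_scaler, audio_scaler = StandardScaler(), StandardScaler()
    F_tr = face_scaler.fit_transform(X_face[train_idx])
    F_te = face_scaler.transform(X_face[test_idx])
    A_tr = audio_scaler.fit_transform(X_audio[train_idx])
    A_te = audio_scaler.transform(X_audio[test_idx])

    # CCA projects both modalities into a maximally correlated subspace;
    # concatenating the projections gives the fused feature vector
    cca = CCA(n_components=20)
    F_tr_c, A_tr_c = cca.fit_transform(F_tr, A_tr)
    F_te_c, A_te_c = cca.transform(F_te, A_te)
    Z_tr = np.hstack([F_tr_c, A_tr_c])
    Z_te = np.hstack([F_te_c, A_te_c])

    # train an SVM on the fused features and score the held-out subject
    clf = SVC(kernel="rbf", C=1.0)
    clf.fit(Z_tr, y[train_idx])
    accuracies.append(clf.score(Z_te, y[test_idx]))

print(f"LOSO accuracy: {np.mean(accuracies):.3f}")
```

Fitting the scalers and the CCA projection inside each fold keeps the held-out participant's data out of the fusion step, which is what makes the LOSO estimate trustworthy.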