Preprint Article · Version 1 · Preserved in Portico · This version is not peer-reviewed

SL-Swin: A Transformer-Based Deep Learning Approach for Macro- and Micro-Expression Spotting on Small-Size Expression Datasets

Version 1 : Received: 30 May 2023 / Approved: 1 June 2023 / Online: 1 June 2023 (10:58:46 CEST)
Version 2 : Received: 8 June 2023 / Approved: 8 June 2023 / Online: 8 June 2023 (15:06:19 CEST)

A peer-reviewed article of this Preprint also exists.

He, E.; Chen, Q.; Zhong, Q. SL-Swin: A Transformer-Based Deep Learning Approach for Macro- and Micro-Expression Spotting on Small-Size Expression Datasets. Electronics 2023, 12, 2656.

Abstract

In recent years, the analysis of macro- and micro-expressions has drawn the attention of researchers, since these expressions provide visual cues to an individual’s emotions for a broad range of potential applications such as lie detection and criminal investigation. In this paper, we address the challenge of spotting facial macro- and micro-expressions in videos and present compelling results obtained by applying a deep learning approach to optical flow features. Unlike other deep learning approaches, which are mainly based on Convolutional Neural Networks (CNNs), we propose a Transformer-based approach that predicts, for each frame, a score indicating the probability that the frame lies within an expression interval. Whereas other Transformer-based models achieve high performance only after pre-training on large datasets, our model, called SL-Swin, which applies Shifted Patch Tokenization and Locality Self-Attention to the Swin Transformer backbone, effectively spots macro- and micro-expressions when trained from scratch on small-size expression datasets. Our evaluation results surpass the MEGC 2022 spotting baseline, with an overall F1-score of 0.1366. Our approach also performs well on the MEGC 2021 spotting task, with overall F1-scores of 0.1824 and 0.1357 on CAS(ME)2 and SAMM Long Videos, respectively. The code is publicly available on GitHub (https://github.com/eddiehe99/pytorch-expression-spotting).
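For readers unfamiliar with the two techniques named in the abstract, the sketch below is a minimal, illustrative PyTorch rendering of the standard formulations of Shifted Patch Tokenization (patch embedding applied to the input concatenated with four diagonally shifted copies) and Locality Self-Attention (scaled attention with a learnable temperature and a masked diagonal). It is not the authors’ implementation (see the linked GitHub repository for that); all module names, dimensions, and hyperparameters here are assumptions chosen for clarity.

import torch
import torch.nn as nn

class ShiftedPatchTokenization(nn.Module):
    """Concatenate the input with four diagonally shifted copies before
    patch embedding, enlarging each token's effective receptive field.
    (Illustrative sketch; sizes are assumptions, not the paper's values.)"""
    def __init__(self, in_chans=3, embed_dim=96, patch_size=4):
        super().__init__()
        self.patch_size = patch_size
        # 5x the channels: the original image plus four shifted copies
        self.proj = nn.Conv2d(in_chans * 5, embed_dim,
                              kernel_size=patch_size, stride=patch_size)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, x):                        # x: (B, C, H, W)
        s = self.patch_size // 2
        shifts = [(-s, -s), (-s, s), (s, -s), (s, s)]
        shifted = [torch.roll(x, shift, dims=(2, 3)) for shift in shifts]
        x = torch.cat([x] + shifted, dim=1)      # (B, 5C, H, W)
        x = self.proj(x)                         # (B, D, H/p, W/p)
        x = x.flatten(2).transpose(1, 2)         # (B, N, D) token sequence
        return self.norm(x)

class LocalitySelfAttention(nn.Module):
    """Self-attention with a learnable temperature and a masked diagonal,
    so each token is pushed to attend to tokens other than itself."""
    def __init__(self, dim, num_heads=3):
        super().__init__()
        self.num_heads = num_heads
        head_dim = dim // num_heads
        # learnable temperature, initialized to the usual 1/sqrt(d) scale
        self.temperature = nn.Parameter(torch.tensor(head_dim ** -0.5))
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                        # x: (B, N, D)
        B, N, D = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads,
                                  D // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)     # each: (B, heads, N, d)
        attn = (q @ k.transpose(-2, -1)) * self.temperature
        # mask the diagonal so a token cannot attend to itself
        eye = torch.eye(N, dtype=torch.bool, device=x.device)
        attn = attn.masked_fill(eye, float('-inf')).softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, D)
        return self.proj(out)

# Hypothetical usage on a small input (optical-flow-like tensor assumed):
spt = ShiftedPatchTokenization(in_chans=3, embed_dim=96, patch_size=4)
tokens = spt(torch.randn(1, 3, 64, 64))          # -> (1, 256, 96)
lsa = LocalitySelfAttention(dim=96, num_heads=3)
out = lsa(tokens)                                # -> (1, 256, 96)

In SL-Swin these two modifications are applied to a Swin Transformer backbone; the sketch above omits Swin’s window partitioning and shows the techniques in their plain ViT-style form only.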

Keywords

Macro- and Micro-Expression Spotting; Image Processing; Computer Vision; Artificial Intelligence; Deep Learning; Swin Transformer; Shifted Patch Tokenization; Locality Self-Attention

Subject

Computer Science and Mathematics, Computer Vision and Graphics
