Preprint Article, Version 2, Preserved in Portico. This version is not peer-reviewed.

SL-Swin: A Transformer-Based Deep Learning Approach for Macro- and Micro-Expression Spotting on Small-Size Expression Datasets

Version 1 : Received: 30 May 2023 / Approved: 1 June 2023 / Online: 1 June 2023 (10:58:46 CEST)
Version 2 : Received: 8 June 2023 / Approved: 8 June 2023 / Online: 8 June 2023 (15:06:19 CEST)

A peer-reviewed article of this Preprint also exists.

He, E.; Chen, Q.; Zhong, Q. SL-Swin: A Transformer-Based Deep Learning Approach for Macro- and Micro-Expression Spotting on Small-Size Expression Datasets. Electronics 2023, 12, 2656.

Abstract

In recent years, the analysis of macro- and micro-expressions has drawn the attention of researchers, since these expressions provide visual cues to an individual's emotions for a broad range of potential applications such as lie detection and criminal detection. In this paper, we address the challenge of spotting facial macro- and micro-expressions in videos and present compelling results from a deep learning approach that analyzes optical flow features. Unlike other deep learning approaches, which are mainly based on Convolutional Neural Networks (CNNs), we propose a Transformer-based approach that predicts a score indicating the probability of a frame lying within an expression interval. In contrast to other Transformer-based models, which achieve high performance by being pre-trained on large datasets, our model, called SL-Swin, incorporates Shifted Patch Tokenization and Locality Self-Attention into the Swin Transformer backbone and effectively spots macro- and micro-expressions when trained from scratch on small-size expression datasets. Our results surpass the MEGC 2022 spotting baseline with an overall F1-score of 0.1366. Our approach also performs well on the MEGC 2021 spotting task, achieving overall F1-scores of 0.1824 and 0.1357 on CAS(ME)^2 and SAMM Long Videos, respectively. The code is publicly available on GitHub (https://github.com/eddiehe99/pytorch-expression-spotting).
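The abstract names two components grafted onto the Swin Transformer backbone: Shifted Patch Tokenization (SPT) and Locality Self-Attention (LSA). The following is a minimal PyTorch sketch of these two ideas, not the authors' released implementation (see the GitHub repository above for that); class names, tensor shapes, and default hyper-parameters here are illustrative assumptions.

# Minimal sketch (not the authors' released code) of the two components the
# abstract names: Shifted Patch Tokenization (SPT) and Locality Self-Attention
# (LSA). Shapes and hyper-parameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ShiftedPatchTokenization(nn.Module):
    """Concatenate the input with four half-patch diagonal shifts before
    patch embedding, enlarging each token's spatial context."""

    def __init__(self, in_chans=3, embed_dim=96, patch_size=4):
        super().__init__()
        self.patch_size = patch_size
        self.norm = nn.LayerNorm(in_chans * 5 * patch_size * patch_size)
        self.proj = nn.Linear(in_chans * 5 * patch_size * patch_size, embed_dim)

    def forward(self, x):                       # x: (B, C, H, W)
        s = self.patch_size // 2
        shifts = [(-s, -s), (-s, s), (s, -s), (s, s)]  # four diagonal shifts
        shifted = [torch.roll(x, shift, dims=(2, 3)) for shift in shifts]
        x = torch.cat([x] + shifted, dim=1)     # (B, 5C, H, W)
        # non-overlapping patch partition -> flattened patch vectors
        patches = F.unfold(x, kernel_size=self.patch_size, stride=self.patch_size)
        patches = patches.transpose(1, 2)       # (B, num_patches, 5C*P*P)
        return self.proj(self.norm(patches))    # (B, num_patches, embed_dim)


class LocalitySelfAttention(nn.Module):
    """Self-attention with a learnable temperature and a diagonal mask that
    suppresses each token's attention to itself."""

    def __init__(self, dim, num_heads=3):
        super().__init__()
        self.num_heads = num_heads
        head_dim = dim // num_heads
        self.temperature = nn.Parameter(torch.tensor(head_dim ** -0.5))
        self.qkv = nn.Linear(dim, dim * 3, bias=True)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                       # x: (B, N, dim)
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)    # each: (B, heads, N, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.temperature
        attn = attn.masked_fill(                # mask the diagonal (self-relations)
            torch.eye(N, dtype=torch.bool, device=x.device), float("-inf"))
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

Roughly speaking, SPT concatenates half-patch diagonal shifts of the input before patch embedding so each token covers a larger spatial context, while LSA's diagonal mask and learnable temperature sharpen the attention distribution; both are intended to help a Transformer train from scratch on small datasets.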

Keywords

Macro- and Micro-Expression Spotting; Image Processing; Computer Vision; Artificial Intelligence; Deep Learning; Swin Transformer; Shifted Patch Tokenization; Locality Self-Attention

Subject

Computer Science and Mathematics, Computer Vision and Graphics

Comments (2)

Comment 1
Received: 8 June 2023
Commenter: Erheng He
Commenter's Conflict of Interests: Author
Comment: 1- In the introduction section, we have added a discussion of Transformers and their prior uses in expression spotting. Although the amount of work employing Transformers for expression spotting is limited, we briefly describe three prior uses. Please kindly find the last paragraph on page 2.

2- We have redrawn Figure 1 with enlarged text. Please kindly see Figure 1 on page 3.

3- We have redrawn Figures 2 and 3 with aligned text. Please kindly see Figure 2 on page 5 and Figure 3 on page 8.

4- In Section 3.2 (Performance Metrics), we have added descriptions and equations for the other evaluation metrics used in the Macro-Expression (MaE) and Micro-Expression (ME) spotting experiments, including Precision, Recall, and F1-score (a brief sketch of these definitions follows this list).

5- We have deleted the redundant text "(F1-score))" from the caption of Table 5 in Section 4.3.2 (Labeling) on page 14.

6- We have added the results achieved by our approach to the conclusion section (page 15).

7- We have uploaded our code to the given GitHub repository (https://github.com/eddiehe99/pytorch-expression-spotting).

8- In Section 3.1.1 (MEGC 2022 Datasets), we have added more details on both the CAS(ME)^3 and SAMM Challenge datasets, including their composition, diversity, and size.

9- In Section 3.1.2 (MEGC 2021 Datasets), we have added more details on both the CAS(ME)^2 and SAMM Long Videos datasets, including their composition, diversity, and size.

10- Based on the review report, we have also made minor English-language edits, mainly in the Abstract (page 1), Introduction (page 3), and Conclusion (page 15) sections.
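For reference on item 4, here is a minimal sketch of the standard spotting metrics. The interval-matching criterion noted in the comment below (a prediction counts as a true positive when its Intersection over Union with a ground-truth interval is at least 0.5) is the commonly used MEGC convention and is stated here as an assumption, not taken from this page.

# Minimal sketch of the Precision, Recall, and F1-score definitions referred to
# in item 4. The IoU >= 0.5 true-positive criterion is an assumption based on
# the commonly used MEGC spotting protocol.
def spotting_metrics(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # TP / (TP + FP)
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # TP / (TP + FN)
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1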
Response 1 to Comment 1
Received: 9 June 2023
Commenter:
The commenter has declared there is no conflict of interests.
Comment: version 2 revisions:
  1. We have changed the label of Figure A1 to B1. Please kindly see Figure B1 on page 16.
  2. We have changed the label of Figure A2 to C1. Please kindly see Figure C1 on page 17.
  3. We have changed the file extension of Figure C1 from .svg to .png, which should fix the display problem.
  4. We have added the ORCID iD of the corresponding author on page 1 (just under the title).
