Submitted: 10 October 2024
Posted: 12 October 2024
Abstract
Keywords:
1. Introduction
- We construct a novel, high-quality, diverse and unskewed dataset of 2184 images covering the ten most popular dog breeds worldwide, which vary in shape and size. These dog breeds are ‘Siberian Husky’, ‘Rottweiler’, ‘Golden Retriever’, ‘Labrador Retriever’, ‘Pug’, ‘Poodle’, ‘Beagle’, ‘German Shepherd’, ‘Pembroke Welsh Corgi’ and ‘French Bulldog’. We divide these images into equally sized groups, one for each of the seven Panksepp emotion labels: “Exploring”, “Sadness”, “Playing”, “Rage”, “Fear”, “Affectionate”, and “Lust”.
- We apply the contrastive learning frameworks SimCLR (A Simple Framework for Contrastive Learning of Visual Representations) and MoCo (Momentum Contrast for Unsupervised Visual Representation Learning) to our dataset to predict the seven Panksepp emotions using unsupervised learning. We significantly modify the MoCo framework to obtain the best possible results on our hardware. We also test the unsupervised learning models on a publicly available dog emotion dataset to compare their performance against baseline accuracies.
- We build a supervised learning model based on the ResNet50 architecture and run it on our dataset as well as the publicly available dataset to obtain benchmark results.
2. Method
2.1. Creating the Dataset
2.1.1. Methodology for Labelling Images
2.1.2. Dataset Description


2.2. Supervised Learning Benchmarks
2.3. Unsupervised Learning with Contrastive Learning Frameworks
2.3.1. Why Contrastive Learning?
2.3.2. Experimenting with the SimCLR Framework
2.3.3. Frameworks That Require Less GPU Memory
2.4. Momentum Contrast for Unsupervised Visual Representation Learning
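As a rough illustration of the mechanics that give Momentum Contrast its name, the sketch below shows a momentum update of the key encoder, a first-in-first-out queue of keys, and an InfoNCE-style contrastive loss in plain Python. The momentum (0.99) and temperature (0.1) are taken from the run table in this paper; the tiny queue size, the toy two-dimensional vectors and all function names are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import deque

MOMENTUM = 0.99     # momentum value reported for most runs in the run table
TEMPERATURE = 0.1   # temperature reported in the run table
QUEUE_SIZE = 4      # hypothetical and tiny on purpose; real queues hold many more keys


def momentum_update(key_params, query_params, m=MOMENTUM):
    """Move each key-encoder weight slowly towards the matching query-encoder weight."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce_loss(query, positive_key, negative_keys, temperature=TEMPERATURE):
    """Cross-entropy over similarities, with the positive key as the target class."""
    q = normalize(query)
    logits = [dot(q, normalize(positive_key)) / temperature]
    logits += [dot(q, normalize(n)) / temperature for n in negative_keys]
    max_logit = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum_exp - logits[0]


# Keys from earlier batches act as negatives; appending beyond maxlen drops the oldest key.
queue = deque(maxlen=QUEUE_SIZE)
for old_key in ([0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]):
    queue.append(old_key)

query = [1.0, 0.1]      # embedding of one augmented view (toy values)
positive = [1.0, 0.0]   # embedding of the other view of the same image (toy values)
loss = info_nce_loss(query, positive, list(queue))
queue.append(positive)  # the current key becomes a negative for later batches

key_weights = momentum_update([0.0, 0.0], [1.0, 1.0])
```

Because the negatives come from the queue rather than from the current batch, the number of negatives is decoupled from the batch size, which is the usual argument for this design on memory-limited GPUs.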
2.4.1. Data Augmentations
- Random Resizing and Cropping.
- Random Horizontal Flip (probability = 0.5).
- Random application of Color Jitter with jitter values of brightness = 0.4, contrast = 0.4, saturation = 0.4, hue = 0.1 and probability = 0.8.
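The randomness in the three augmentations above can be sketched, independently of any framework, as a per-view parameter draw. The flip probability, jitter strengths and jitter probability come from the list; the crop scale range, the function name and the dictionary layout are assumptions made for illustration. In practice a library such as torchvision provides ready-made transforms for these operations (`RandomResizedCrop`, `RandomHorizontalFlip`, `ColorJitter`, `RandomApply`).

```python
import random


def sample_view_params(rng, crop_scale=(0.2, 1.0)):
    """Draw one set of augmentation parameters for a single view of an image."""
    params = {
        # Fraction of the original image area kept by the random crop (assumed range).
        "crop_area_fraction": rng.uniform(*crop_scale),
        # Horizontal flip with probability 0.5.
        "horizontal_flip": rng.random() < 0.5,
        # Stays None when color jitter is skipped for this view.
        "color_jitter": None,
    }
    if rng.random() < 0.8:  # color jitter is applied with probability 0.8
        params["color_jitter"] = {
            "brightness": rng.uniform(1 - 0.4, 1 + 0.4),  # multiplicative factor
            "contrast": rng.uniform(1 - 0.4, 1 + 0.4),    # multiplicative factor
            "saturation": rng.uniform(1 - 0.4, 1 + 0.4),  # multiplicative factor
            "hue": rng.uniform(-0.1, 0.1),                # additive hue shift
        }
    return params


# Contrastive training draws two independent views of the same image.
rng = random.Random(0)
query_view = sample_view_params(rng)
key_view = sample_view_params(rng)
```

Drawing the parameters twice per image is what produces the positive pair: both views show the same dog, but with different crops, orientations and colors.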

2.4.2. Modification of ResNet-34 and ResNet-18 Architectures
- Replaces the first convolution layer with kernel size = 3 and stride = 1.
- Removes the first pooling layer.
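One plausible motivation for these two changes is the small 96-pixel input listed in the run table: the standard ResNet stem (a 7×7 stride-2 convolution followed by a 3×3 stride-2 max pool) shrinks the image fourfold before the first residual stage. The arithmetic sketch below compares feature-map sizes for the standard and modified stems. The padding of 1 for the new 3×3 convolution is an assumption, as are the function names.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_sizes(input_size, modified_stem):
    """Spatial size after the stem and after each of the four residual stages."""
    if modified_stem:
        # 3x3 convolution with stride 1 (padding=1 assumed), no pooling layer.
        size = conv_output_size(input_size, kernel=3, stride=1, padding=1)
    else:
        # Standard stem: 7x7 stride-2 convolution, then 3x3 stride-2 max pool.
        size = conv_output_size(input_size, kernel=7, stride=2, padding=3)
        size = conv_output_size(size, kernel=3, stride=2, padding=1)
    sizes = [size]
    # ResNet-18 and ResNet-34 stages start with 3x3 convolutions of stride 1, 2, 2, 2.
    for stride in (1, 2, 2, 2):
        size = conv_output_size(size, kernel=3, stride=stride, padding=1)
        sizes.append(size)
    return sizes


standard = feature_map_sizes(96, modified_stem=False)
modified = feature_map_sizes(96, modified_stem=True)
print("standard stem:", standard)
print("modified stem:", modified)
```

With a 96×96 input, the standard stem leaves only a 3×3 map before global pooling, while the modified stem leaves 12×12, at the cost of more activation memory per image.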
2.5. KNN Classifier
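A k-nearest-neighbour classifier over frozen encoder embeddings can be sketched as follows. This is a minimal illustration that assumes cosine similarity and similarity-weighted voting, as is common in instance-discrimination work; the weighting scheme, the `temperature` parameter, the function names and the toy two-dimensional features are assumptions rather than the authors' exact procedure. The run table reports k values of 150 and 200; the toy example uses k = 3.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_predict(query, bank_features, bank_labels, k=200, temperature=0.1):
    """Predict a label from the k most similar training embeddings."""
    scored = [
        (cosine_similarity(query, feature), label)
        for feature, label in zip(bank_features, bank_labels)
    ]
    top_k = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    votes = defaultdict(float)
    for similarity, label in top_k:
        # Closer neighbours get exponentially larger votes (assumed weighting).
        votes[label] += math.exp(similarity / temperature)
    return max(votes, key=votes.get)


# Toy feature bank standing in for embeddings of labelled training images.
bank_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
bank_labels = ["Playing", "Playing", "Fear", "Fear"]

prediction = knn_predict([0.8, 0.2], bank_features, bank_labels, k=3)
print(prediction)
```

No encoder weights are updated in this step, so the resulting accuracy reflects how well the unsupervised embeddings already separate the emotion classes.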
3. Results
3.1. Results Produced by the SimCLR Framework
3.2. Results Produced by the MoCo Framework
3.2.1. Testing Accuracy and Training Loss for Our Unsupervised Learning Models




3.3. Comparison of Supervised and Unsupervised Results
3.4. Generalizability of Our Results
4. Discussion
4.1. Broader Impact to the Field
4.2. Further Scope of This Research
4.3. Ethical Considerations
5. Conclusions

| Run | Image Resolution | Encoder | GPU | Batch Size | Learning Rate | KNN k | Momentum | Epochs | NT-Xent Temperature | Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 96 | ResNet-18 | NVIDIA RTX-3070 | 256 | 0.3 | 150 | 0.99 | 1200 | 0.1 | 40.24% |
| 2 | 96 | ResNet-18 | NVIDIA RTX-3070 | 128 | 0.3 | 200 | 0.99 | 1800 | 0.1 | 42.19% |
| 3 | 96 | ResNet-34 | NVIDIA Tesla P100 | 128 | 0.25 | 150 | 0.95 | 1200 | 0.1 | 39.71% |
| 4 | 96 | ResNet-34 | NVIDIA Tesla P100 | 128 | 0.3 | 200 | 0.99 | 1800 | 0.1 | 43.42% |

| Emotion | Accuracy ResNet50 | Accuracy MoCo |
|---|---|---|
| Caring | 94.74% | 34.61% |
| Exploring | 83.75% | 40.40% |
| Fear | 28.95% | 35.51% |
| Lust | 47.05% | 62.79% |
| Playing | 46.34% | 38.88% |
| Rage | 87.09% | 45.91% |
| Sadness | 78.57% | 44.37% |

| Emotion | Precision | Recall | F1-Score |
|---|---|---|---|
| Caring | 0.9796 | 0.7059 | 0.8205 |
| Exploring | 0.7609 | 0.6364 | 0.6931 |
| Fear | 0.5156 | 0.9429 | 0.6667 |
| Lust | 0.8125 | 0.7091 | 0.7573 |
| Playing | 0.7778 | 0.9403 | 0.8514 |
| Rage | 1.0000 | 0.7037 | 0.8216 |
| Sadness | 0.7600 | 0.5352 | 0.6281 |

| Emotion | Accuracy ResNet50 | Accuracy MoCo |
|---|---|---|
| Angry | 81.08% | 55.50% |
| Happy | 96.47% | 45.65% |
| Relaxed | 91.13% | 35.90% |
| Sad | 80.00% | 54.55% |

| Emotion | Precision | Recall | F1-Score |
|---|---|---|---|
| Caring | 0.9796 | 0.7059 | 0.8205 |
| Exploring | 0.7609 | 0.6364 | 0.6931 |
| Fear | 0.5156 | 0.9429 | 0.6667 |
| Lust | 0.8125 | 0.7091 | 0.7573 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).