Submitted: 23 May 2025
Posted: 23 May 2025
Abstract
Keywords:
Deciphering the Complexity of Bovine Communication
Biological and Behavioral Significance of Cattle Vocalizations
Traditional Approaches: The Foundations of Vocalization Research
Acoustic Decomposition of Bovine Vocalizations
Breed and Environment Acoustic Variation
Integrating Multimodal Data: Contextualizing Vocalizations
AI-Driven Analysis of Bovine Vocalizations
- They are computationally light, suiting edge devices,
- Their feature weights or decision trees are interpretable,
- And they set baseline expectations for deeper networks (a minimal baseline sketch follows below).
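The appeal of these classical baselines is easy to demonstrate. The sketch below is not drawn from any of the reviewed studies: it summarises each labelled clip by MFCC statistics and trains a Random Forest whose feature importances can be inspected; the folder layout, sample rate, and class names are illustrative assumptions.

```python
# Minimal baseline sketch: MFCC statistics + Random Forest.
# Assumes labelled 1-s WAV clips in class-named folders, e.g. clips/distress/0001.wav
# (all paths and label names are illustrative).
import glob, os
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def mfcc_stats(path, sr=16000, n_mfcc=13):
    """Summarise a clip as the per-coefficient mean and std of its MFCCs."""
    y, _ = librosa.load(path, sr=sr, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

X, labels = [], []
for path in glob.glob("clips/*/*.wav"):
    X.append(mfcc_stats(path))
    labels.append(os.path.basename(os.path.dirname(path)))  # folder name as class label

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("5-fold accuracy:", cross_val_score(clf, np.array(X), labels, cv=5).mean())

clf.fit(np.array(X), labels)
# Feature importances indicate which MFCC statistics drive the decisions.
print(sorted(zip(clf.feature_importances_, range(len(X[0]))), reverse=True)[:5])
```

On a few hundred clips this trains in seconds on a CPU, which is what makes such models attractive as edge-friendly baselines before deeper networks are considered.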
Deep-Learning Approaches: Spectrograms, Sequences, and Self-Supervision
Event Detection and Edge Deployment
Sequence Modelling and Hybrid Architectures
Self-Supervised and Transfer Learning
Beyond Single Modality
Individual Identification and Spatial Audio
Evaluating Model Robustness and Practicality
Domain Shift: When “One Size” Fits Only One Farm
Class Imbalance Impact
Noise, Overlap, and the “False-Alarm Tax”
Overfitting and Data Poverty
Interpretability and User Trust
Practical Pilots and Edge Constraints
NLP and Large Language Models (LLMs): Revolutionizing Animal Bioacoustics?
The NLP Advantage in Decoding Animal Vocalizations
Translating Vocalizations into Human-Interpretable Language
Leveraging Few-Shot and Zero-Shot Learning
Addressing Technical and Ethical Challenges
Controversies and Anthropomorphism Risks
Industrial Applications and Animal Welfare Impacts
AI-Enhanced Precision Livestock Farming
Real-World Deployments and Use Cases
Early Intervention through Vocalization Analysis
Economic and Productivity Implications
Ethical and Welfare Implications
Identified Gaps and Strategic Research Directions
Addressing the Data Scarcity Challenge
Contextual and Individual Variability in Vocalizations
Overfitting, Reproducibility, and Sensor Calibration
Toward a Cattle-Specific Benchmark
Explainability and Trust: Gaining User Confidence
- Bias reduction and transparency: Proactively identify and correct biases so that the model performs reliably across different herds and conditions (Stowell, 2022). This includes diversifying training data and auditing model outputs for systematic errors, ensuring no group of scenarios is overlooked.
- Output calibration: Calibrate the AI’s confidence scores so that probability outputs align with actual accuracy (Stowell, 2022). Users can then interpret a “90%” distress prediction as a truly high likelihood, and the system can flag low-confidence detections for human review rather than acting on a guess (a minimal calibration sketch follows after this list).
- User-friendly interfaces with explanations: Design intuitive dashboards that present alerts in context. For example, the interface might display a timeline of detected calls and highlight the acoustic features that led to a “distress” classification, along with a brief explanation. (Looking ahead, Figure 8 sketches a future hybrid model that links incremental sensor upgrades to long-term welfare goals.) By visualizing what the model “heard,” such tools let users verify the AI’s reasoning (Stowell, 2022) and learn to trust its alerts.
- Preventing over-reliance: Clearly communicate the AI’s supportive role so that users treat it as an aid, not a decision-making authority (Bendel et al., 2024). Farmers should be encouraged to continue observing their animals. The AI is an assistant that might catch subtle cues, but human judgment remains crucial for context.
- Field validation and iteration: Rigorously test the system in real farm conditions and iteratively improve it based on those results (Prestegaard-Wilson et al., 2024).
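As a concrete illustration of the calibration point above, the sketch below recalibrates a binary distress classifier with isotonic regression and routes mid-confidence detections to human review. The stand-in features, labels, and review band are illustrative assumptions, not values from the reviewed systems.

```python
# Minimal calibration sketch (synthetic stand-in data; thresholds illustrative).
import numpy as np
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X = np.random.rand(2000, 26)                                   # stand-in acoustic feature vectors
y = (X[:, 0] + 0.3 * np.random.rand(2000) > 0.8).astype(int)   # stand-in distress labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Wrap the classifier so its probability outputs are mapped to observed frequencies.
clf = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=200, random_state=0),
    method="isotonic", cv=3,
).fit(X_tr, y_tr)

prob = clf.predict_proba(X_te)[:, 1]
frac_pos, mean_pred = calibration_curve(y_te, prob, n_bins=10)
print(np.round(np.c_[mean_pred, frac_pos], 2))   # predicted vs. observed probability per bin

# Mid-confidence detections are flagged for human review instead of auto-alerting.
needs_review = (prob > 0.4) & (prob < 0.7)
print(f"{needs_review.mean():.0%} of detections routed to review")
```

A well-calibrated “90%” then really does correspond to roughly nine true distress events out of ten such alerts, which is what lets farmers take the score at face value.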
Proposed Framework for Future Research
Conceptual Model for AI-Augmented Bovine Bioacoustics
- Collar sensors: Wearable units with 3-axis accelerometers, gyroscopes, GPS/RFID, and optionally microphones or temperature sensors (Lamanna et al., 2025; El Moutaouakil et al., 2023).
- Barn monitors: Fixed microphone arrays for group vocalizations, infrared cameras for thermal imaging, environmental sensors (temp/humidity) and video feeds.
- Edge AI modules: Embedded microcontrollers running efficient CNNs (e.g., MobileNet) for on-device detection of calls or behavior (Noda et al., 2024).
- Fusion interfaces: Middleware that time-aligns audio, motion, and thermal data streams for joint analysis (a minimal alignment sketch follows after this list).
- Cloud/Server analytics: Powerful AI models (possibly including transformer-based audio classifiers) that integrate fused data across animals and time.
- Farmer dashboard: Web/mobile apps presenting alerts and summaries. These interfaces should support user feedback or annotation and allow parameter tuning (e.g., alert thresholds) by the farmer (El Moutaouakil et al., 2023; Noda et al., 2024).
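As an illustration of the fusion-interface component, the sketch below time-aligns per-call audio detections with collar accelerometer summaries using a nearest-timestamp join. The file names, column names, and the 15-second tolerance are illustrative assumptions rather than a published schema.

```python
# Minimal fusion sketch: align audio events with IMU summaries by timestamp.
import pandas as pd

audio = pd.read_csv("audio_events.csv", parse_dates=["timestamp"])  # e.g. call_type, f0_hz
imu = pd.read_csv("collar_imu.csv", parse_dates=["timestamp"])      # e.g. activity_index

audio = audio.sort_values("timestamp")
imu = imu.sort_values("timestamp")

# For each detected call, attach the nearest IMU summary within 15 s.
fused = pd.merge_asof(audio, imu, on="timestamp",
                      direction="nearest", tolerance=pd.Timedelta("15s"))

# A simple joint rule: high-pitched calls coinciding with unusually high activity.
alerts = fused[(fused["f0_hz"] > 450) &
               (fused["activity_index"] > fused["activity_index"].quantile(0.9))]
print(alerts[["timestamp", "call_type", "f0_hz", "activity_index"]].head())
```

The same join pattern extends to thermal and video descriptors, giving the cloud-side models one synchronized table per animal and time window.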


Data Acquisition and Annotation Strategy
Ethical Tensions in Deployment
Model Development: Toward Adaptive and Explainable Systems
- Adaptive AI systems: Cattle vocalization models will encounter varying acoustic environments, herds, and even individual idiosyncrasies. A model trained in one scenario can drop in performance when deployed in another if it cannot adapt. Indeed, cross-study evaluations have shown notable drops when applying a classifier to unseen conditions. As seen earlier, a vocalization detector trained solely on Farm A’s data saw its accuracy dip when tested on Farm B, which had a different noise profile and herd composition (Li et al., 2024). This finding highlights both the challenge and a solution: models need exposure to diverse data and mechanisms to adapt to new inputs. One approach is to collect a small set of calibration recordings for any new deployment; for example, record a few hours of ambient sound and some typical calls when installing the system on a farm, then fine-tune the model on that. Another approach is transfer learning, where a base model trained on a large corpus of cow calls is lightly retrained on a specific herd’s data to personalize it. Even a few dozen labelled samples from the target farm can significantly boost performance to near-native levels (Nolasco et al., 2023) (a minimal fine-tuning sketch follows after these items). Beyond this, future AI systems might employ online learning, continuously updating their models as new data arrive.
- Explainable AI (XAI): While powerful, these adaptive deep-learning models do not inherently explain themselves. Farmers are more likely to trust and adopt AI if they can understand the basis of its alerts or recommendations. Imagine an AI alert that simply says, “Cow #108 is distressed.” The farmer’s natural response is to ask why the AI concluded that. Was it a certain type of moo, a pattern of vocalizations over time, or a combination of vocal and movement signals? If the system cannot provide a clear rationale, the farmer may be skeptical or unsure how to act on the alert. On one hand, interpretable models might use human-comprehensible features (e.g., call rate, pitch range, duration) in simpler algorithms to output decisions that align with expert knowledge, for example, “high-pitched repeated calls + restless movement -> likely separation anxiety”. In fact, one approach was to build an explainable model that used defined acoustic features and AutoML to produce rules for classifying calls, allowing the contribution of each feature to be assessed. The “white-box” model could distinguish high- vs. low-frequency calls with around 90% accuracy, and importantly, it could highlight which features contributed to each classification. The downside was that a more complex deep-learning model slightly outperformed the explainable model on accuracy, a common trade-off in AI.
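A minimal sketch of the herd-level fine-tuning idea from the first item above: freeze a pretrained call classifier’s feature extractor and retrain only its head on a few dozen labelled clips from the target farm. The network definition, tensor shapes, and class count are illustrative assumptions; in practice the weights would come from a multi-farm pretraining run.

```python
# Minimal transfer-learning sketch: freeze the backbone, retrain the head.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class CallCNN(nn.Module):
    """Stand-in for a CNN pretrained on a large multi-farm call corpus."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = CallCNN()
# model.load_state_dict(torch.load("multi_farm_pretrain.pt"))  # hypothetical checkpoint

# Freeze the feature extractor; only the classification head adapts to the new farm.
for p in model.features.parameters():
    p.requires_grad = False

spectrograms = torch.randn(48, 1, 64, 101)   # stand-in batch: a few dozen mel-spectrograms
labels = torch.randint(0, 4, (48,))
loader = DataLoader(TensorDataset(spectrograms, labels), batch_size=8, shuffle=True)

optimiser = torch.optim.Adam(model.classifier.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
model.train()
for _ in range(10):                            # a few quick epochs, seconds on CPU
    for x, y in loader:
        optimiser.zero_grad()
        loss_fn(model(x), y).backward()
        optimiser.step()
```

Because only the small head is updated, the calibration pass can run on the farm’s own gateway hardware without retraining the full network.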
1. Acoustic front-end – A lightweight CNN (e.g., MobileNet-Spectro (Vidana-Vila et al., 2023)) transforms each 1-s Mel-spectrogram into a 256-D embedding that captures subtle spectral patterns of individual call types.
2. Complementary sensor streams –
3. Feature bridge – For each 15-s window, the acoustic embedding is concatenated with handcrafted audio features (median F0, call rate) and low-dimensional descriptors from video/IMU/THI streams, yielding a unified vector that retains human-intuitive cues (pitch, activity) while injecting multimodal context (Peng et al., 2024).
4. Surrogate decision-tree layer – A shallow gradient-boosted decision tree (GBDT) trained on the unified vector yields if-then rules (e.g., high-pitched call + pacing + THI > 72 → heat-stress risk). The tree enforces monotonic splits that follow domain logic (a minimal sketch of this rule layer follows after the list).
5. LLM reasoning agent – The rule output and sensor summary are passed to a lightweight LLM (prompt-orchestration concept adapted from AudioGPT (Huang et al., 2023)). The LLM rewrites the rule in plain language, adds context (“pacing + THI 78”), and suggests actions (e.g., activate fans).
6. Explanation & feedback – The LLM returns a concise rationale for the alert.
7. Farmer UI – A dashboard (mobile/web) displays the alert, underlying rule, and LLM recommendation. Farmers can accept, snooze, or label the alert, generating feedback for online adaptation: the CNN fine-tunes on new audio, and the tree/LLM prompt templates update incrementally, sustaining accuracy across changing farm conditions.
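A minimal sketch of the surrogate rule layer (step 4), using a single shallow decision tree in place of a full GBDT so the extracted rules stay short. The feature names, thresholds, and synthetic training data are illustrative assumptions.

```python
# Minimal surrogate-rule sketch: a shallow tree over human-readable features.
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame({
    "median_f0_hz": rng.uniform(100, 900, n),
    "call_rate_per_min": rng.uniform(0, 12, n),
    "pacing_index": rng.uniform(0, 1, n),
    "thi": rng.uniform(55, 85, n),
})
# Stand-in label: heat-stress risk when pitch, pacing and THI are all elevated.
y = ((X.median_f0_hz > 450) & (X.pacing_index > 0.6) & (X.thi > 72)).astype(int)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))
# The printed if-then paths (e.g. thi > 72 and pacing_index > 0.6 ...) are the
# rules that the LLM layer can rewrite in plain language for the farmer.
```

The same extraction applies to the individual trees of a boosted ensemble; depth is kept small so each rule fits on one dashboard line.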

Future Directions and Vision
Towards Two-Way Communication Systems
Cross-Species AI Communication Frameworks
Integration into IoT and Precision Farming Ecosystems
From Isolated Sensors to the Cow’s “Digital Twin”
Cost–Benefit: Accuracy vs. Silicon
Farmer Adoption and Participatory Design
Policy and Regulatory Alignment
Edge AI and Infrastructure Gaps
Farmer Adoption Barriers and Workflow Redesign
- Year 1 (2025): Initiate large-scale data collection, prototype sensor hardware, and establish initial partnerships with farms. Develop a basic audio recognition model (e.g., a CNN for call detection) and a pilot dashboard.
- Year 2 (2026): Enhance model robustness via transfer learning and domain adaptation. Begin creating open benchmark datasets (inspired by BEANS (Hagiwara et al., 2022)) for cattle vocalizations to standardize performance evaluation. Conduct limited field trials to tune algorithms to on-farm conditions.
- Year 3 (2027): Focus on cross-domain generalization and explainability. Integrate multimodal learning (fusing sound with video and accelerometer/thermal inputs) to reduce false alarms. Develop explainable AI interfaces so farmers see “why” an alert was raised. Begin miniaturization of sensing hardware and optimizing power (e.g. sub-1W edge chips).
- Year 4 (2028): Scale to multiple farms and conditions. Publish comprehensive performance benchmarks on unseen herds. Engage with standard-setting bodies to define best practices for animal sound datasets. Refine edge/cloud orchestration for low-latency alerts. Work toward certifying systems for herd health monitoring.
- Year 5 (2029): Achieve widespread adoption. Demonstrate that AI audio systems reduce animal welfare issues (e.g. earlier sickness detection). Further miniaturize and reduce cost of sensors. Ensure models handle real-world variability (weather, new breeds). Launch farmer education programs to interpret AI feedback correctly.
Benchmark Dataset Proposal
Policy Integration Proposal
Ethical Considerations in Deployment
Conclusions
Abbreviations
- AI – Artificial Intelligence
- ACI / HCI – Animal-Computer / Human-Computer Interaction
- ASR – Automatic Speech Recognition
- AVES – Animal Vocalization Encoder based on Self-Supervision (self-supervised transformer pre-training model)
- BEANS – Benchmark of Animal Sounds (cross-species evaluation suite)
- BRD – Bovine Respiratory Disease
- CNN – Convolutional Neural Network
- FN / TP – False Negative / True Positive (classification metrics)
- GAN – Generative Adversarial Network
- GBDT – Gradient-Boosted Decision Tree
- HEAM – Hybrid Explainable Acoustic Model
- HMM – Hidden Markov Model
- IMU – Inertial Measurement Unit (accelerometer + gyroscope)
- IoT – Internet of Things
- k-NN – k-Nearest Neighbour
- LLM – Large Language Model
- LSTM – Long Short-Term Memory Network
- MFCC – Mel-Frequency Cepstral Coefficient
- MobileNet – Mobile-Optimised Convolutional Network (lightweight CNN architecture for edge devices)
- NRFAR – Noise-Robust Foraging Activity Recognition
- PLF – Precision Livestock Farming
- PRISMA – Preferred Reporting Items for Systematic Reviews and Meta-Analyses
- RF – Random Forest
- RNN – Recurrent Neural Network
- SNR – Signal-to-Noise Ratio
- SSL – Self-Supervised Learning
- SVM – Support Vector Machine
- THI – Temperature–Humidity Index (heat-stress metric)
- TTS – Text-to-Speech (synthesis module in AudioGPT)
- XAI – Explainable Artificial Intelligence
References
- Alonso, S. (2020). “An Intelligent Edge-IoT Platform for Monitoring Livestock and Crops in a Dairy Farming Scenario.” Ad Hoc Networks, vol. 98, Mar. 2020, p. 102047. DOI.org (Crossref). [CrossRef]
- Alsina-Pagès, M., Llonch, P., Ginovart-Panisello, G. J., Guevara, R., Freixes, M., Castro, M., & Mainau, E. (2021). Dairy Cattle Welfare through Acoustic Analysis: Preliminary Results of Acoustic Environment Description. Proceedings of Euronoise 2021, 25–27 October, Madeira, Portugal.
- Arablouei, Reza, al., e. (2024). “Cattle Behavior Recognition from Accelerometer Data: Leveraging in-Situ Cross-Device Model Learning.” Computers and Elec-tronics in Agriculture, vol. 227, Dec. 2024, p. 109546. [CrossRef]
- Araújo, M. (2025). “AI-Powered Cow Detection in Complex Farm Environments.” Smart Agricultural Technology, vol. 10, Mar. 2025, p. 100770. [CrossRef]
- Aubé, L. (2025). “Validation of Qualitative Behaviour Assessment for Dairy Cows at Pasture.” Applied Animal Behaviour Science, vol. 283, Feb. 2025, p. 106489. [CrossRef]
- Avanzato, R., Avondo, M., Beritelli, F., Franco, D. F., & Tumino, S. (2023). Detecting the Number of Bite Prehension of Grazing Cows in an Extensive System Using an Audio Recording Method. In Proceedings of the 8th International Conference of Yearly Reports on Informatics, Mathematics, and Engineering (ICYRIME 2023) (pp. 27–31). Naples, Italy. CEUR-WS.org, Vol. 3684.
- Bendel, O., & Zbinden, N. (2024). “The Animal Whisperer Project.” Proceedings of the International Conference on Animal-Computer Interaction, ACM, 2024, pp. 1–9. [CrossRef]
- Bertelsen, M., & Jensen, M. B. (2023). “Comparing Weaning Methods in Dairy Calves with Different Dam Contact Levels.” Journal of Dairy Science, vol. 106, no. 12, Dec. 2023, pp. 9598–612. [CrossRef]
- Bloch, Victor, al., e. (2023). “Development and Analysis of a CNN- and Transfer-Learning-Based Classification Model for Automated Dairy Cow Feeding Behavior Recognition from Accelerometer Data.” Sensors, vol. 23, no. 5, Feb. 2023, p. 2611. [CrossRef]
- Brady, Beth, al., e. (2022). “Manatee Calf Call Contour and Acoustic Structure Varies by Species and Body Size.” Scientific Reports, vol. 12, no. 1, Nov. 2022, p. 19597. [CrossRef]
- Burnham, Rianna. (2023). “Animal Calling Behaviours and What This Can Tell Us about the Effects of Changing Soundscapes.” Acoustics, vol. 5, no. 3, July 2023, pp. 631–52. DOI.org (Crossref). [CrossRef]
- Castillejo, Pedro, al., e. (2019). “The AFarCloud ECSEL Project.” 2019 22nd Euromicro Conference on Digital System Design (DSD), IEEE, 2019, pp. 414–19. DOI.org (Crossref). [CrossRef]
- Cetintav, Bekir, al., e. (2025). “Generative AI Meets Animal Welfare: Evaluating GPT-4 for Pet Emotion Detection.” Animals, vol. 15, no. 4, Feb. 2025, p. 492. [CrossRef]
- Chelotti, J. O., Martinez-Rau, L. S., et al. (2024). “Livestock Feeding Behaviour: A Review on Automated Systems for Ruminant Monitoring.” Biosystems Engineering, vol. 246, Oct. 2024, pp. 150–77. [CrossRef]
- Chelotti, J. O., Vanrell, S. R., Martinez-Rau, L. S., et al. (2023). “Using Segment-Based Features of Jaw Movements to Recognise Foraging Activities in Grazing Cattle.” Biosystems Engineering, vol. 229, May 2023, pp. 69–84. [CrossRef]
- Chelotti, J. O., Vanrell, S. R., Rau, L. S. M., et al. (2020). “An Online Method for Estimating Grazing and Rumination Bouts Using Acoustic Signals in Grazing Cattle.” Computers and Electronics in Agriculture, vol. 173, June 2020, p. 105443. [CrossRef]
- Chen, Y., et al. (2024). “VoiceBench: Benchmarking LLM-Based Voice Assistants.” arXiv, 2024. [CrossRef]
- Clarke, A. (2024). “Bison Mother–Offspring Acoustic Communication.” Journal of Mammalogy, edited by Timothy Smyser, vol. 105, no. 5, Sept. 2024, pp. 1182–89. [CrossRef]
- Cornips, Leonie. (2024). “The Semiotic Repertoire of Dairy Cows.” Language in Society, Oct. 2024, pp. 1–25. [CrossRef]
- Dimov, Dimo, al., e. (2023). “Importance of Noise Hygiene in Dairy Cattle Farming—A Review.” Acoustics, vol. 5, no. 4, Nov. 2023, pp. 1036–45. [CrossRef]
- Dixhoorn, V. D., al., e. (2023). “Behavioral Patterns as Indicators of Resilience after Parturition in Dairy Cows.” Journal of Dairy Science, vol. 106, no. 9, Sept. 2023, pp. 6444–63. [CrossRef]
- Eckhardt, Regina, al., e. (2024). “Modelling Climate Change Impacts on Cattle Behavior Using Generative Artificial Intelligence: A Pathway to Adaptive Live-stock Management.” 2024 Anaheim, California July 28-31, 2024, American Society of Agricultural and Biological Engineers, 2024. [CrossRef]
- Eddicks, Matthias, al., e. (2024). “Monitoring of Respiratory Disease Patterns in a Multimicrobially Infected Pig Population Using Artificial Intelligence and Aggregate Samples.” Viruses, vol. 16, no. 10, Oct. 2024, p. 1575. DOI.org (Crossref). [CrossRef]
- Eriksson, H. (2022). “Strategies for Keeping Dairy Cows and Calves Together – a Cross-Sectional Survey Study.” Animal, vol. 16, no. 9, Sept. 2022, p. 100624. [CrossRef]
- Ferrero, Mariano, al., e. (2023). “A Full End-to-End Deep Approach for Detecting and Classifying Jaw Movements from Acoustic Signals in Grazing Cattle.” Engi-neering Applications of Artificial Intelligence, vol. 121, May 2023, p. 106016. [CrossRef]
- Fuchs, Patricia, al., e. (2024). “Stress Indicators in Dairy Cows Adapting to Virtual Fencing.” Journal of Animal Science, vol. 102, Jan. 2024, p. skae024. [CrossRef]
- Gavojdian, D., Lazebnik, T., et al. (2023). “BovineTalk: Machine Learning for Vocalization Analysis of Dairy Cattle under Negative Affective States.” arXiv, 2023. [CrossRef]
- Gavojdian, Dinu, Mincu, M., al., e. (2024). “BovineTalk: Machine Learning for Vocalization Analysis of Dairy Cattle under the Negative Affective State of Isolation.” Frontiers in Veterinary Science, vol. 11, Feb. 2024, p. 1357109. [CrossRef]
- Geng, Hongbo, al., e. (2024). “Motion Focus Global–Local Network: Combining Attention Mechanism with Micro Action Features for Cow Behavior Recognition.” Computers and Electronics in Agriculture, vol. 226, Nov. 2024, p. 109399. [CrossRef]
- Grandin, T. (2021). “The Visual, Auditory, and Physical Environment of Livestock Handling Facilities and Its Effect on Ease of Movement of Cattle, Pigs, and Sheep.” Frontiers in Animal Science, vol. 2, Oct. 2021, p. 744207. [CrossRef]
- Green, A. C., Clark, C. E. F., et al. (2020). “Context-Related Variation in the Peripartum Vocalisations and Phonatory Behaviours of Holstein-Friesian Dairy Cows.” Applied Animal Behaviour Science, vol. 231, Oct. 2020, p. 105089. [CrossRef]
- Green, A. C., Lidfors, L. M., et al. (2021). “Vocal Production in Postpartum Dairy Cows: Temporal Organization and Association with Maternal and Stress Behaviors.” Journal of Dairy Science, vol. 104, no. 1, Jan. 2021, pp. 826–38. [CrossRef]
- Hagiwara, M. (2022). “AVES: Animal Vocalization Encoder Based on Self-Supervision.” arXiv, 2022. [CrossRef]
- Hagiwara, M., Cusimano, M., et al. (2022). “Modeling Animal Vocalizations through Synthesizers.” arXiv, 2022. [CrossRef]
- Hagiwara, M., Hoffman, B., et al. (2022). “BEANS: The Benchmark of Animal Sounds.” arXiv, 2022. [CrossRef]
- Hasenpusch, P., al. (2024). “Dairy Cow Personality: Consistency in a Familiar Testing Environment.” JDS Communications, vol. 5, no. 5, Sept. 2024, pp. 511–15. [CrossRef]
- Holinger, Mirjam, al., e. (2024). “Behavioural Changes to Moderate Heat Load in Grazing Dairy Cows under On-Farm Conditions.” Livestock Science, vol. 279, Jan. 2024, p. 105376. [CrossRef]
- Huang, R., et al. (2023). “AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head.” arXiv, 2023. [CrossRef]
- Islam, M. M., & Scott, S. D. (2022). “Exploring the Effects of Precision Livestock Farming Notification Mechanisms on Canadian Dairy Farmers.” Science and Technologies for Smart Cities, edited by Sara Paiva et al., vol. 442, Springer International Publishing, 2022, pp. 247–66. DOI.org (Crossref). [CrossRef]
- Jobarteh, Bubacarr, Acoustic, e. a. M. M. I. F. o., arXiv, L. D. f. D. D. C. V. i. A. W. A., DOI.org, 2. (2024). (Datacite). [CrossRef]
- Johnsen, Føske, J., Johanssen, J. R. E., al., e. (2021). “Investigating Cow−calf Contact in Cow-Driven Systems: Behaviour of the Dairy Cow and Calf.” Journal of Dairy Research, vol. 88, no. 1, Feb. 2021, pp. 52–55. [CrossRef]
- Johnsen, Føske, J., Sørby, J., al., e. (2024). “Effect of Debonding on Stress Indicators in Cows and Calves in a Cow-Calf Contact System.” JDS Communica-tions, vol. 5, no. 5, Sept. 2024, pp. 426–30. [CrossRef]
- Jung, Dae-Hyun, Kim, N. Y., Moon, S. H., Jhin, C., al., e. (2021). “Deep Learning-Based Cattle Vocal Classification Model and Real-Time Livestock Monitoring System with Noise Filtering.” Animals, vol. 11, no. 2, Feb. 2021, p. 357. [CrossRef]
- Jung, Dae-Hyun, Kim, N. Y., Moon, S. H., Kim, H. S., al., e. (2021). “Classification of Vocalization Recordings of Laying Hens and Cattle Using Convolutional Neural Network Models.” Journal of Biosystems Engineering, vol. 46, no. 3, Sept. 2021, pp. 217–24. [CrossRef]
- Karmiris, Ilias, al., e. (2021). “Estimating Livestock Grazing Activity in Remote Areas Using Passive Acoustic Monitoring.” Information, vol. 12, no. 8, July 2021, p. 290. [CrossRef]
- Kim, Eunbeen, al., e. (2023). “DualDiscWaveGAN-Based Data Augmentation Scheme for Animal Sound Classification.” Sensors, vol. 23, no. 4, Feb. 2023, p. 2024. DOI.org (Crossref). [CrossRef]
- Kok, Akke, al., e. (2023). “Do You See the Pattern? Make the Most of Sensor Data in Dairy Cows.” Journal of Dairy Research, vol. 90, no. 3, Aug. 2023, pp. 252–56. [CrossRef]
- Lamanna, M., Bovo, M., et al. (2025). “Wearable Collar Technologies for Dairy Cows: A Systematized Review of the Current Applications and Future Innovations for Precision Livestock Farming.” Animals, vol. 15, no. 3, 2025, p. 458. [CrossRef]
- Lange, Annika, al., e. (2020). “Talking to Cows: Reactions to Different Auditory Stimuli During Gentle Human-Animal Interactions.” Frontiers in Psychology, vol. 11, Oct. 2020, p. 579346. [CrossRef]
- Lardy, R., al. (2022). “Understanding Anomalies in Animal Behaviour: Data on Cow Activity in Relation to Health and Welfare.” Animal - Open Space, vol. 1, no. 1, Dec. 2022, p. 100004. [CrossRef]
- Laurijs, A., K. (2021). “Vocalisations in Farm Animals: A Step towards Positive Welfare Assessment.” Applied Animal Behaviour Science, vol. 236, Mar. 2021, p. 105264. [CrossRef]
- Lefèvre, A., R. (2025). “Machine Learning Algorithms Can Predict Emotional Valence across Ungulate Vocalizations.” iScience, vol. 28, no. 2, Feb. 2025, p. 111834. [CrossRef]
- Lenner, Ádám, al., e. (2023). “Calming Hungarian Grey Cattle in Headlocks Using Processed Nasal Vocalization of a Mother Cow.” Animals, vol. 14, no. 1, Dec. 2023, p. 135. [CrossRef]
- Lenner, Ádám, al., e. (2025). “Analysis of Sounds Made by Bos Taurus and Bubalus Bubalis Dams to Their Calves.” Frontiers in Veterinary Science, vol. 12, Mar. 2025, p. 1549100. DOI.org (Crossref). [CrossRef]
- Li, Baihan, arXiv, e. a. D. L. A. T. C. f. D. A. G., 2024, J., (2024). [CrossRef]
- Li, Chao, Minati, L., al., e. (2022). “Integrated Data Augmentation for Accelerometer Time Series in Behavior Recognition: Roles of Sampling, Balancing, and Fourier Surrogates.” IEEE Sensors Journal, vol. 22, no. 24, Dec. 2022, pp. 24230–41. [CrossRef]
- Li, Chao, Tokgoz, K., al., e. (2021). “Data Augmentation for Inertial Sensor Data in CNNs for Cattle Behavior Classification.” IEEE Sensors Letters, vol. 5, no. 11, Nov. 2021, pp. 1–4. [CrossRef]
- Li, Guoming, al., e. (2021). “Classifying Ingestive Behavior of Dairy Cows via Automatic Sound Recognition.” Sensors, vol. 21, no. 15, Aug. 2021, p. 5231. [CrossRef]
- Linstädt, Jenny, al., e. (2024). “Animal-Based Welfare Indicators for Dairy Cows and Their Validity and Practicality: A Systematic Review of the Existing Literature.” Frontiers in Veterinary Science, vol. 11, July 2024, p. 1429097. [CrossRef]
- Liu, Jiefei, al., e. (2024). “Development of a Novel Classification Approach for Cow Behavior Analysis Using Tracking Data and Unsupervised Machine Learning Techniques.” Sensors, vol. 24, no. 13, June 2024, p. 4067. [CrossRef]
- Liu, ting, T., al., e. (2020). “Development Process of Animal Image Recognition Technology and Its Application in Modern Cow and Pig Industry.” IOP Conference Series: Earth and Environmental Science, vol. 512, no. 1, June 2020, p. 012090. [CrossRef]
- Mac, E., S., al., e. (2023). “Behavioral Responses to Cow and Calf Separation: Separation at 1 and 100 Days after Birth.” Animal Bioscience, vol. 36, no. 5, May 2023, pp. 810–17. [CrossRef]
- Mahmud, Sultan, M., al., e. (2021). “A Systematic Literature Review on Deep Learning Applications for Precision Cattle Farming.” Computers and Electronics in Agriculture, vol. 187, Aug. 2021, p. 106313. [CrossRef]
- Martinez-Rau, S., L. , Chelotti, J. O., Ferrero, M., Galli, J. R., al., e. (2025). “A Noise-Robust Acoustic Method for Recognizing Foraging Activities of Grazing Cattle.” Computers and Electronics in Agriculture, vol. 229, Feb. 2025, p. 109692. [CrossRef]
- Martinez-Rau, S., L. , Chelotti, J. O., Ferrero, M., Utsumi, S. A., al., e. (2023). “Daylong Acoustic Recordings of Grazing and Rumination Activi-ties in Dairy Cows.” Scientific Data, vol. 10, no. 1, Nov. 2023, p. 782. [CrossRef]
- Martinez-Rau, S., L. , Chelotti, J. O., Giovanini, L. L., al., e. (2024). “On-Device Feeding Behavior Analysis of Grazing Cattle.” IEEE Transactions on Instrumentation and Measurement, vol. 73, 2024, pp. 1–13. [CrossRef]
- Martinez-Rau, S., L. , Chelotti, J. O., Vanrell, S. R., al., e. (2022). “A Robust Computational Approach for Jaw Movement Detection and Classification in Grazing Cattle Using Acoustic Signals.” Computers and Electronics in Agriculture, vol. 192, Jan. 2022, p. 106569. [CrossRef]
- Martinez-Rau, Sebastian, L., al., e. (2023). “Real-Time Acoustic Monitoring of Foraging Behavior of Grazing Cattle Using Low-Power Embedded Devices.” 2023 IEEE Sensors Applications Symposium (SAS), IEEE, 2023, pp. 01–06. [CrossRef]
- Marumo, L., J. (2024). “Behavioural Variability, Physical Activity, Rumination Time, and Milk Characteristics of Dairy Cattle in Response to Regrouping.” Animal, vol. 18, no. 3, Mar. 2024, p. 101094. [CrossRef]
- McManus, Rosemary, al., e. (2022). “Thermography for Disease Detection in Livestock: A Scoping Review.” Frontiers in Veterinary Science, vol. 9, Aug. 2022, p. 965622. DOI.org (Crossref). [CrossRef]
- Meen, H., G. (2015). “Sound Analysis in Dairy Cattle Vocalisation as a Potential Welfare Monitor.” Computers and Electronics in Agriculture, vol. 118, Oct. 2015, pp. 111–15. DOI.org. [CrossRef]
- Mehdizadeh, A. , Saman, al., e. (2023). “Classifying Chewing and Rumination in Dairy Cows Using Sound Signals and Machine Learning.” Animals, vol. 13, no. 18, Sept. 2023, p. 2874. [CrossRef]
- Miao, Zhongqi, Review, e. a. Z. T. f. W. B. D. I., 2023, A., (2023). [CrossRef]
- Miron, Marius, arXiv, e. a. B. A. V. D. w. A. t. C. D., 2025, M., (2025). [CrossRef]
- Moreira, Madruga, S., al., e. (2023). “Auditory Sensitivity in Beef Cattle of Different Genetic Origin.” Journal of Veterinary Behavior, vol. 59, Jan. 2023, pp. 67–72. DOI.org (Crossref). [CrossRef]
- El Moutaouakil, K., & Falih, N. (2023). “A Design of a Smart Farm System for Cattle Monitoring.” Indonesian Journal of Electrical Engineering and Computer Science, vol. 32, no. 2, Nov. 2023, p. 857. DOI.org (Crossref). [CrossRef]
- Neethirajan, 6. Neethirajan, 6.4, S. ". a. L. T. M. t. D. C. V. A. N. A. A. t. P. W. A. (2025). (2025): 65. [CrossRef]
- Neethirajan, Suresh. (2022). “Affective State Recognition in Livestock—Artificial Intelligence Approaches.” Animals, vol. 12, no. 6, Mar. 2022, p. 759. [CrossRef]
- Noda, T., Koizumi, T., Yukitake, N., et al. (2024). Scientific Reports, vol. 14, 2024, p. 6394. [CrossRef]
- Nolasco, Ines, al., e. (2023). “Learning to Detect an Animal Sound from Five Examples.” Ecological Informatics, vol. 77, Nov. 2023, p. 102258. [CrossRef]
- Ntalampiras, Stavros, al., e. (2020). “Automatic Detection of Cow/Calf Vocalizations in Free-Stall Barn.” 2020 43rd International Conference on Telecommunica-tions and Signal Processing (TSP), IEEE, 2020, pp. 41–45. DOI.org. [CrossRef]
- Nurcholis, & Sumaryanti, L. (2021). “Reproductive Behaviors: Audiovisual Detection of Oestrus after Synchronization Using Prostaglandin F2 Alpha (PGF2α).” E3S Web of Conferences, edited by I.H.A. Wahab et al., vol. 328, 2021, p. 04021. [CrossRef]
- Oestreich, K., W. (2024). “Listening to Animal Behavior to Understand Changing Ecosystems.” Trends in Ecology & Evolution, vol. 39, no. 10, Oct. 2024, pp. 961–73. [CrossRef]
- Özmen, Güzin, al., e. (2022). “Sound Analysis to Recognize Cattle Vocalization in a Semi-Open Barn.” Gazi Journal of Engineering Sciences, vol. 8, no. 1, May 2022, pp. 158–67. [CrossRef]
- Pandeya, Y. R., Bhattarai, B., & Lee, J. (2020). “Visual Object Detector for Cow Sound Event Detection.” IEEE Access, vol. 8, 2020, pp. 162625–33. [CrossRef]
- Pandeya, Y. R., Bhattarai, B., Afzaal, U., et al. (2022). “A Monophonic Cow Sound Annotation Tool Using a Semi-Automatic Method on Audio/Video Data.” Livestock Science, vol. 256, Feb. 2022, p. 104811. [CrossRef]
- Page, J., M. (2020). “PRISMA 2020 Explanation and Elaboration: Updated Guidance and Exemplars for Reporting Systematic Reviews.” BMJ, Mar. 2021, p. n160. DOI.org (Crossref). [CrossRef]
- Pallottino, Federico, al., e. (2025). “Applications and Perspectives of Generative Artificial Intelligence in Agriculture.” Computers and Electronics in Agriculture, vol. 230, Mar. 2025, p. 109919. [CrossRef]
- Patil, Ruturaj, al., e. (2024). “Identifying Indian Cattle Behaviour Using Acoustic Biomarkers:” Proceedings of the 13th International Conference on Pattern Recog-nition Applications and Methods, SCITEPRESS - Science and Technology Publications, 2024, pp. 594–602. [CrossRef]
- Pei, Te, Clustering, e. a. G. A. D. T. F. I. H., arXiv, L. L. M. f. E. C., DOI.org, 2. (2025). (Datacite). [CrossRef]
- Peng, Yingqi, Chen, Y., al., e. (2024). “A Multimodal Classification Method: Cow Behavior Pattern Classification with Improved EdgeNeXt Using an Inertial Measurement Unit.” Computers and Electronics in Agriculture, vol. 226, Nov. 2024, p. 109453. [CrossRef]
- Peng, Y., Wulandari, et al. (2023). “Japanese Black Cattle Call Patterns Classification Using Multiple Acoustic Features and Machine Learning Models.” Computers and Electronics in Agriculture, vol. 204, Jan. 2023, p. 107568. [CrossRef]
- Prestegaard-Wilson, J., & Vitale, J. (2024). “Generative Artificial Intelligence in Extension: A New Era of Support for Livestock Producers.” Animal Frontiers, vol. 14, no. 6, Dec. 2024, pp. 57–59. [CrossRef]
- Pérez-Granados, C., & Schuchmann, K.-L. (2023). “The Sound of the Illegal: Applying Bioacoustics for Long-Term Monitoring of Illegal Cattle in Protected Areas.” Ecological Informatics, vol. 74, May 2023, p. 101981. [CrossRef]
- Pérez-Torres, L., al. (2021). “Short- and Long-Term Effects of Temporary Early Cow–Calf Separation or Restricted Suckling on Well-Being and Performance in Zebu Cattle.” Animal, vol. 15, no. 2, Feb. 2021, p. 100132. [CrossRef]
- Radford, A., et al. (2022). “Robust Speech Recognition via Large-Scale Weak Supervision.” arXiv, 2022, https://arxiv.org/abs/2212.04356.
- Ramos, Angel, E., al., e. (2023). “Antillean Manatee Calves in Captive Rehabilitation Change Vocal Behavior in Anticipation of Feeding.” Zoo Biology, vol. 42, no. 6, Nov. 2023, pp. 723–29. [CrossRef]
- Riaboff, L., al. (2022). “Predicting Livestock Behaviour Using Accelerometers: A Systematic Review of Processing Techniques for Ruminant Behaviour Predic-tion from Raw Accelerometer Data.” Computers and Electronics in Agriculture, vol. 192, Jan. 2022, p. 106610. [CrossRef]
- Robinson, D., et al. (2024). “NatureLM-audio: An Audio-Language Foundation Model for Bioacoustics.” arXiv, 2024. [CrossRef]
- Rohan, Ali, al., e. (2024). “Application of Deep Learning for Livestock Behaviour Recognition: A Systematic Literature Review.” Computers and Electronics in Agriculture, vol. 224, Sept. 2024, p. 109115. [CrossRef]
- Rubenstein, P. K., et al. (2023). “AudioPaLM: A Large Language Model That Can Speak and Listen.” arXiv, 2023. [CrossRef]
- Russel, Shebiah, N., , Selvaraj., A. (2024). “Decoding Cow Behavior Patterns from Accelerometer Data Using Deep Learning.” Journal of Veteri-nary Behavior, vol. 74, July 2024, pp. 68–78. [CrossRef]
- Röttgen, V. (2020). “Automatic Recording of Individual Oestrus Vocalisation in Group-Housed Dairy Cattle: Development of a Cattle Call Monitor.” Animal, vol. 14, no. 1, 2020, pp. 198–205. [CrossRef]
- Sattar, Farook. (2022). “A Context-Aware Method-Based Cattle Vocal Classification for Livestock Monitoring in Smart Farm.” The 1st International Online Con-ference on Agriculture—Advances in Agricultural Science and Technology, MDPI, 2022, p. 89. [CrossRef]
- Schillings, J., al. (2024). “Managing End-User Participation for the Adoption of Digital Livestock Technologies: Expectations, Performance, Relationships, and Support.” The Journal of Agricultural Education and Extension, vol. 30, no. 2, Mar. 2024, pp. 277–95. DOI.org (Crossref). [CrossRef]
- Schnaider, Alice, M., al., e. (2022). “Vocalization and Other Behaviors Indicating Pain in Beef Calves during the Ear Tagging Procedure.” Journal of Veterinary Behavior, vol. 47, Jan. 2022, pp. 93–98. [CrossRef]
- Schnaider, Ma, al., e. (2022). “Vocalization and Other Behaviors as Indicators of Emotional Valence: The Case of Cow-Calf Separation and Reunion in Beef Cattle.” Journal of Veterinary Behavior, vol. 49, Mar. 2022, pp. 28–35. [CrossRef]
- Percie du Sert, N., Hurst, V., Ahluwalia, A., et al. (2020). “The ARRIVE Guidelines 2.0: Updated Guidelines for Reporting Animal Research.” BMC Veterinary Research, vol. 16, 2020, p. 242. [CrossRef]
- Seyfarth, R. M., & Cheney, D. L. (2003). “Signalers and Receivers in Animal Communication.” Annual Review of Psychology, vol. 54, no. 1, Feb. 2003, pp. 145–73. DOI.org. [CrossRef]
- Sharma, S., & Kadyan, V. (2023). “Detection of Estrus through Automated Classification Approaches Using Vocalization Pattern in Murrah Buffaloes.” 2023 3rd International Conference on Artificial Intelligence and Signal Processing (AISP), IEEE, 2023, pp. 1–6. [CrossRef]
- Shi, Zhonghao, al., e. (2024). “Classifying and Understanding of Dairy Cattle Health Using Wearable Inertial Sensors With Random Forest and Explainable Artifi-cial Intelligence.” IEEE Sensors Letters, vol. 8, no. 3, Mar. 2024, pp. 1–4. [CrossRef]
- Shorten, P. R. (2023). “Acoustic Sensors for Detecting Cow Behaviour.” Smart Agricultural Technology, vol. 3, Feb. 2023, p. 100071. [CrossRef]
- Shorten, P. R., & Hunter, L. B. (2023). “Acoustic Sensors for Automated Detection of Cow Vocalization Duration and Type.” Computers and Electronics in Agriculture, vol. 208, May 2023, p. 107760. [CrossRef]
- Shorten, P. R., & Hunter, L. B. (2024). “Acoustic Sensors to Detect the Rate of Cow Vocalization in a Complex Farm Environment.” Applied Animal Behaviour Science, vol. 278, Sept. 2024, p. 106377. [CrossRef]
- Silva, D. , Martins, M., al., e. (2024). “Acoustic-Based Models to Assess Herd-Level Calves’ Emotional State: A Machine Learning Approach.” Smart Agricultural Technology, vol. 9, Dec. 2024, p. 100682. [CrossRef]
- Slob, Naftali, al., e. (2021). “Application of Machine Learning to Improve Dairy Farm Management: A Systematic Literature Review.” Preventive Veterinary Medi-cine, vol. 187, Feb. 2021, p. 105237. [CrossRef]
- Stachowicz, Joanna, al., e. (2022). “Can We Detect Patterns in Behavioral Time Series of Cows Using Cluster Analysis?” Journal of Dairy Science, vol. 105, no. 12, Dec. 2022, pp. 9971–81. [CrossRef]
- Stowell, Dan. (2022). “Computational Bioacoustics with Deep Learning: A Review and Roadmap.” PeerJ, vol. 10, Mar. 2022, p. e13152. [CrossRef]
- Stygar, H., A., al., e. (2022). “How Far Are We From Data-Driven and Animal-Based Welfare Assessment? A Critical Analysis of European Quality Schemes.” Frontiers in Animal Science, vol. 3, May 2022, p. 874260. [CrossRef]
- Sun, Yifei, al., e. (2023). “Free-Ranging Livestock Changes the Acoustic Properties of Summer Soundscapes in a Northeast Asian Temperate Forest.” Biological Conservation, vol. 283, July 2023, p. 110123. [CrossRef]
- Takefuji, Yoshiyasu. (2024). “Unveiling Livestock Trade Trends: A Beginner’s Guide to Generative AI-Powered Visualization.” Research in Veterinary Science, vol. 180, Nov. 2024, p. 105435. [CrossRef]
- Torre, P. D. L. , Mónica, al., e. (2015). “Acoustic Analysis of Cattle (Bos Taurus) Mother–Offspring Contact Calls from a Source–Filter Theory Perspective.” Applied Animal Behaviour Science, vol. 163, Feb. 2015, pp. 58–68. DOI.org. [CrossRef]
- Tuyttens, F. A. M., Molento, C. F., & Benaissa, S. (2022). “Twelve Threats of Precision Livestock Farming (PLF) for Animal Welfare.” Frontiers in Veterinary Science, vol. 9, 2022, p. 889623. [CrossRef]
- Vandermeulen, J., et al. (2016). “Early Recognition of Bovine Respiratory Disease in Calves Using Automated Continuous Monitoring of Cough Sounds.” Computers and Electronics in Agriculture, vol. 129, Nov. 2016, pp. 15–26. DOI.org (Crossref). [CrossRef]
- Vidal, Gema, al., e. (2023). “Comparative Performance Analysis of Three Machine Learning Algorithms Applied to Sensor Data Registered by a Leg-Attached Accelerometer to Predict Metritis Events in Dairy Cattle.” Frontiers in Animal Science, vol. 4, Apr. 2023, p. 1157090. [CrossRef]
- Vidaña-Vila, E., Malé, J., Freixes, M., Solís-Cifré, M., Jiménez, M., Larrondo, C., Guevara, R., Miranda, J., Duboc, L., Mainau, E., Llonch, P., & Alsina-Pagès, R. M. (2023). Automatic Detection of Cow Vocalizations Using Convolutional Neural Networks. In Proceedings of the 8th Detection and Classification of Acoustic Scenes and Events Workshop (DCASE2023) (pp. 206–210). Tampere, Finland.
- Vogt, Anina, al., e. (2025). “Dairy Cows’ Responses to 2 Separation Methods after 3 Months of Cow-Calf Contact.” Journal of Dairy Science, vol. 108, no. 2, Feb. 2025, pp. 1940–63. [CrossRef]
- Volkmann, N., al., e. (2021). “On-Farm Detection of Claw Lesions in Dairy Cows Based on Acoustic Analyses and Machine Learning.” Journal of Dairy Science, vol. 104, no. 5, May 2021, pp. 5921–31. [CrossRef]
- Vranken, E., Mounir, M., & Norton, T. (2023). Sound-Based Monitoring of Livestock. In: Zhang, Q. (ed.), Encyclopedia of Digital Agricultural Technologies. Springer, Cham. [CrossRef]
- Vu, H., Prabhune, O., Raskar, U., Panditharatne, D., Chung, H., Choi, Y. C., & Kim, Y. (2024). MmCows: A Multimodal Dataset for Dairy Cattle Monitoring. In Advances in Neural Information Processing Systems, 37, 59451–59467.
- Wang, B., et al. (2024). “AudioBench: A Universal Benchmark for Audio Large Language Models.” arXiv, 2024. [CrossRef]
- Wang, Jun, Chen, H., al., e. (2023). “Identification of Oestrus Cows Based on Vocalisation Characteristics and Machine Learning Technique Using a Du-al-Channel-Equipped Acoustic Tag.” Animal, vol. 17, no. 6, June 2023, p. 100811. [CrossRef]
- Wang, Jun, Si, Y., al., e. (2023). “Discrimination Strategy Using Machine Learning Technique for Oestrus Detection in Dairy Cows by a Dual-Channel-Based Acoustic Tag.” Computers and Electronics in Agriculture, vol. 210, July 2023, p. 107949. [CrossRef]
- Wang, Ziwei, al., e. (2022). “Multi-Modal Sensing for Behaviour Recognition.” Proceedings of the 28th Annual International Conference on Mobile Computing And Networking, ACM, 2022, pp. 900–02. [CrossRef]
- Watts, M., J. ,, Stookey., J. M. (2000). “Vocal Behaviour in Cattle: The Animal’s Commentary on Its Biological Processes and Welfare.” Applied Animal Behaviour Science, vol. 67, no. 1–2, Mar. 2000, pp. 15–33. DOI.org. [CrossRef]
- Welk, Allison, al., e. (2024). “Invited Review: The Effect of Weaning Practices on Dairy Calf Performance, Behavior, and Health—A Systematic Review.” Journal of Dairy Science, vol. 107, no. 8, Aug. 2024, pp. 5237–58. [CrossRef]
- Wu, H., et al. (2024). “Towards Audio Language Modeling – An Overview.” arXiv, 2024. [CrossRef]
- Wu, Yiqi, Underst, e. a. G. V. P. P. o. M. L. L. M. i. P. A., arXiv, i., 2024, J., (2024). [CrossRef]
- Yang, Q., et al. (2024). “AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension.” arXiv, 2024. [CrossRef]
- Yoshihara, Yu,, Oya., K. (2021). “Characterization and Assessment of Vocalization Responses of Cows to Different Physiological States.” Journal of Applied Animal Research, vol. 49, no. 1, Jan. 2021, pp. 347–51. [CrossRef]
- Baig, T. Z., & Shastry, C. (2022). “Parturition and Estrus Detection in Cows and Heifers with WSN and IoT.” 2022 2nd International Conference on Technological Advancements in Computational Sciences (ICTACS), IEEE, 2022, pp. 201–08. [CrossRef]






| Vocalization Type | Dominant Frequency (Hz) | Typical Duration (s) | Typical Mouth/Posture | Principal Behavioural Context | Practical Welfare Interpretation |
|---|---|---|---|---|---|
| Maternal contact (lowing/closed-mouth call) (Green et al., 2021) | F0 ~ 120–280 Hz (mean ~180 Hz) | ~ 0.8–2.5 s | Closed or partially open, head lowered toward calf | Cow-calf proximity, gentle bonding, reassurance | Indicates calm social contact and maternal bonding, normally a positive welfare cue |
| Calf isolation distress call (Mac et al., 2023) | F0 ~ 450–780 Hz | ~ 1–4 s (modal ~ 2 s) | Open-mouth, elevated head, often repeated bouts | Calf separated from dam/herd | Signals acute distress, should trigger rapid reunion or comfort |
| Adult distress / pain call (Martinez-Rau et al., 2025) | F0 ~ 600–1200 Hz | > 2 s (mean ~ 3.1 s) | Fully open mouth, tense neck | Pain (e.g., lameness, injury) or extreme fear | High-urgency alert, immediate welfare check required |
| Hunger / feed-anticipation call (Sattar and Farook, 2022) | F0 ~ 220–380 Hz | ~ 0.5–2.0 s | Open-mouth, pacing near feed-gate | Imminent feeding, empty trough | Indicates motivational state (feed expectation) |
| Estrus (heat) call (Sharma et al., 2023) | F0 ~160–320 Hz (rich harmonic stack) | ~ 0.8–3 s | Extended vocal tract, head raised | Reproductive behaviour, seeking mates | Reliable cue for breeding/AI scheduling, positive management indicator |
| Social affiliative call (Schnaider et al., 2022) | F0 ~110-260 Hz | ~ 0.4–1.2 s | Closed-mouth, nasal | Group re-joining, mild excitement | Normal herd cohesion signal, neutral/positive welfare |
| Alarm / novel object call (Miron et al., 2025) | F0 ~ 650-1100 Hz | - | Sudden, sharp, head-up stance | Perceived predator, startling event | Short-term fear, monitor environment and animal safety |
| Cough / respiratory (Sattar and Farook, 2022) | Broadband burst 200–1 200 Hz | ~ 0.12–0.35 s | Forced exhalation, closed glottis | Respiratory irritation or disease onset | Early health-risk indicator (e.g., BRD), triggers clinical exam |
| Pain-related moan (low-frequency) (Volkmann et al., 2021) | F0 ~ 90–190 Hz | ~ 1.5–5 s | Mouth partially open, minimal movement | Chronic discomfort (lameness, parturition) | Persistent occurrence warrants veterinary assessment |
| Play/excitement call (Vogt et al., 2025) | F0 ~ 260–450 Hz | ~ 0.3–0.9 s | Short bursts during running/bucking | Calf play, social excitement | Positive affect indicates good welfare environment |
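For reference, the two acoustic measurements that anchor the table above (fundamental frequency and call duration) can be estimated from a recorded clip roughly as follows; the file name, pitch search range, and silence threshold are illustrative assumptions.

```python
# Minimal sketch: estimate median F0 and voiced duration of a single call clip.
import numpy as np
import librosa

y, sr = librosa.load("call_clip.wav", sr=16000, mono=True)

# Fundamental-frequency track over the voiced portion of the call.
f0, voiced_flag, _ = librosa.pyin(y, fmin=80, fmax=1200, sr=sr)
median_f0 = np.nanmedian(f0[voiced_flag]) if np.any(voiced_flag) else np.nan

# Call duration estimated from the non-silent region of the clip.
intervals = librosa.effects.split(y, top_db=30)      # (start, end) sample indices
duration_s = sum((end - start) for start, end in intervals) / sr

print(f"median F0 ≈ {median_f0:.0f} Hz, voiced duration ≈ {duration_s:.2f} s")
```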
| # | Sensor (mount) | Signal + Edge Load* | Audio-Synergy Example (welfare alert) | Field Limitation |
|---|---|---|---|---|
| 1 | Tri-axial ACC (collar / ear) (Martinez-Rau et al., 2023; Peng et al., 2024) | 100 Hz; Low (3 kbps) | Chew rate high + high-F₀ “feed call” → early feeding cue | Battery life; collar fit |
| 2 | UWB / RFID (tag grid) (Wang et al., 2022) | Distance events; Low | >10 m isolation + distress bawl → weaning-stress alert | Antenna cost; metal interference |
| 3 | Thermal cam (fixed) (Slob et al., 2021) | 5–15 fps; Med (0.5 Mbps) | Eye-temp high + panting sound → heat-stress risk | Night IR; occlusion |
| 4 | RGB cam (overhead) (Röttgen et al., 2020) | 25 fps; High unless pruned | Limp posture + low-F moan → lameness warning | Bandwidth; privacy; dirt |
| 5 | 4 mic array (ceiling beamformer) (Röttgen et al., 2020) | 48 kHz; Med | Source-located call + ACC ID → pinpoint distressed cow | Cabling; calibration drift |
| 6 | NH₃ / CO₂ gas (wall) (Pérez-Granados et al., 2023) | 1 Hz; Low | Gas spike + drop in calling → respiratory risk | Sensor drift |
| 7 | Water trough pressure mat (Shi et al., 2024) | Sip events; Low | Few sips + thirst call → blocked drinker alert | Hardware wear |
| # | Reference | Context | Recording Setup | Algorithm Applied | Data Volume (calls/ hours) | Performance Metric(s) | Major Insight / Key Finding |
|---|---|---|---|---|---|---|---|
| 1 | Mac et al., 2023 | Calf distress at weaning | 3 chest-high mics, 44.1 kHz, indoor pen | k-NN on MFCC mean ± SD | 600 calls | 94 % accuracy | High-pitched, long calls reliably indicated distress |
| 2 | Sharma et al., 2023 | Dairy estrus detection | Neck collar mic, 16 kHz | SVM, RF comparison | 2000 calls | SVM 95 % accuracy | Estrus vocalization has signature harmonic pattern |
| 3 | Vidana-Vila et al., 2023 | Continuous barn monitoring | 12 ceiling mics, 8 kHz | MobileNet CNN detector | 25 h audio | AUROC 0.93 | Real-time detection feasible on edge device |
| 4 | Patil et al., 2024 | Hunger vs. cough vs. estrus | Hand-held recorder, 48 kHz | 7-layer CNN | 5200 clips | 0.97 accuracy | Deep CNN discriminates four intent categories |
| 5 | Ferrero et al., 2023 | 6-class health dataset | Static barn mic array | CNN-LSTM hybrid | 7800 segments | 0.80 macro-F1 | Temporal context boosts recall on rare classes |
| 6 | Röttgen et al., 2020 | Individual ID in group | 4-mic beamformer array | Source-localization + DNN | 1350 events | 87 % correct cow ID | Multi-mic geometry enables caller identification |
| 7 | Hagiwara, 2022 | Self-supervised AVES | Mixed-species archive, cow subset | Transformer encoder | 160 h unlabelled + 800 labelled | +7 pp F1 vs. CNN | SSL cuts annotation cost, improves few-shot |
| 8 | Martinez-Rau et al., 2023 | Chew detection collar | Collar mic + accel | RF on chewing spectra | 4 h per cow × 20 | 92 % chew vs. rumination | Detects feeding bouts for intake estimation |
| 9 | Gavojdian et al., 2024 | Stress isolation study | Lav-mics, 22 kHz | Bi-LSTM | 3000 sequences | 0.91 F1 | Sequence model spots stress more reliably |
| 10 | Sattar and Farook, 2022 | Multi-intent cough/food/estrus | 6 mics, 48 kHz | Spectrogram CNN | 4400 clips | 0.82 macro-F1 | Combined dataset demonstrates multi-class viability |
| 11 | Peng et al., 2024 | Behaviour fusion EdgeNeXt | Audio + ACC | EdgeNeXt + fusion | 220 h | 95 % behaviour acc. | Multimodal fusion > single modality |
| # | AI Approach / Architecture | Typical Training Data Volume | Key Input Representation | Reported Best Accuracy / F1 | Strengths in Reviewed Studies | Main Limitations / Failure Modes | Representative Use-Case(s) |
|---|---|---|---|---|---|---|---|
| 1 | Random Forest (RF) | ≈ 500–3 000 labelled calls | Hand-crafted MFCC + temporal stats | 88 – 93 % F1 (distress vs. non-distress) | Robust to noise, interpretable feature importance | Needs manual feature engineering, weak on temporal context | Estrus-call detection (Sharma et al., 2023) |
| 2 | Support Vector Machine (SVM) | 200–2 000 calls | MFCC mean ± SD, fundamental F0 | 86–95 % accuracy (estrus vs. baseline) | Performs well on small datasets, strong margins | Sensitive to parameter tuning, scales poorly with >10 k samples | Early estrus detection wearables (Peng et al., 2023) |
| 3 | k-Nearest Neighbour (k-NN) | 600 calls | Spectral centroid, duration, energy | 94 % accuracy for open- vs. closed-mouth calls | Simple, no training time | Storage heavy, cannot model sequence | Call-type classifier in Japanese Black cattle (Peng et al., 2023) |
| 4 | CNN (2-D spectrogram) | ≥ 5000 call segments | Mel-spectrogram images (128 bins) | 97 % accuracy, 0.96 F1 (multi-class-4) | Learns spectral patterns, no manual features | Needs GPU & large data, poor temporal memory alone | Multi-intent classifier (hunger, cough, estrus, normal) (Patil et al., 2024) |
| 5 | Lightweight CNN (MobileNet) | 25 h continuous barn audio | 64-bin log-mel | AUROC 0.93 at 1 s stride | Fast edge inference (<20 ms), low power | Precision drops in heavy machinery noise | Real-time call detection collar (Vidana-Vila et al., 2023) |
| 6 | LSTM / Bi-LSTM | 3000 labelled sequences | Per-frame MFCC + delta MFCC (time series) | 91 % F1 (calf isolation vs. contact) | Captures temporal dynamics, good on sequences | Over-fitting on short clips, GPU-heavy | Isolation stress monitor (Martinez-Rau et al., 2025) |
| 7 | Hybrid CNN + LSTM | 7 800 segments (6 classes) | CNN spectrograms embedding -> LSTM | 80 % overall F1, +6 pp over CNN-only on rare classes | Combines spectrum + sequence info | Needs >10 k samples to beat pure CNN | Multi-class health event detector (Ferrero et al., 2023) |
| 8 | Transformer Audio Encoder (AVES) | 160 h unlabelled pretrain + 800 labels finetune | Raw 16 kHz waveform | 3–7 pp increase in F1 over baseline CNN | Self-supervised, strong few-shot performance; domain-adaptable | Needs GPU for pretraining, complex | Few-shot call classification after self-supervised pre-training (Hagiwara, 2022) |
| 9 | EdgeNeXt multi-Sensor Fusion | 220 cow-hours (ACC, audio) | Spectrogram + 6-DoF inertial images | 95 % accuracy behaviour classification | Multimodal, noise-robust | Needs synchronized sensors, heavy preprocessing | Social licking vs. ruminating (Peng et al., 2024) |
| 10 | Explainable AutoML DT/Rule set | 1200 calls | 24 acoustic stats features | 90 % accuracy, full rule trace | Human-readable decision paths | 3-4 pp lower F1 vs. deep nets | White-box distress detection |
| # | Technique / Model | Up-Stream Pre-training Base | Fine-Tuning Data (Bovine) | Key Output / Capability | Demonstrated Advantage | Current Limitations |
|---|---|---|---|---|---|---|
| 1 | Wav2Vec 2.0 (SSL) | 960 h Librispeech human speech | 2 h labelled cow calls | 768-dim latent embeddings -> downstream classifier | Cuts labelled data need by ≈ 70 % (Hagiwara, 2022) | Requires long GPU pre-train, bovine prosody differs, latent units not explainable |
| 2 | HuBERT-style Audio LM | 60 k h Youtube-Audio8M | 5 h cow distress calls | Discrete token stream for LLM conditioning | Self-supervised tokens improve LLM prompt ability | - |
| 3 | Whisper (large-v2) | 680 k h multilingual speech | Zero-shot (no cow data) | “Transcript” string + log-prob | Noise-robust segmentation, auto-timestamp | Tokenizer trained on words -> outputs nonsense on raw moos; needs post-filter |
| 4 | AudioGPT Controller | GPT-4 (text) + plug-in ASR / encoders | 50 labelled prompts (few-shot) | Multi-step reasoning over acoustic embeddings | Flexible zero-shot Q&A about herd sounds | Heavy compute, pipeline latency, still prototype |
| 5 | CNN Encoder + GPT-2 Decoder | ImageNet CNN weights | 7 k spectrograms w/ text tags | Generates sentence caption (e.g., “hungry calf call”) | Early end-to-end audio-caption success | Needs dataset of paired call + explanation, currently small |
| 6 | Prompt-Tuned GPT-J | 6 B-param code GPT | 400 synthetic “call→meaning” pairs | Rapid adaptation to cow vocabulary (<1 epoch) | Works with minimal GPU | Synthetic pairs risk bias; real validation pending |
| 7 | Spec-BERT | 100 h farm audio (masked) | 800 labelled segments | Predicts masked time-frequency patches, improves downstream F1 +4 pp | Learns robust representations under barn noise | Mask strategy sensitivity; limited to short clips |
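Row 1 of the table above can be sketched with the public Librispeech-pretrained Wav2Vec 2.0 checkpoint used as a frozen feature extractor; the clip below is a stand-in array, and mean-pooling the frame embeddings into one 768-dimensional vector per call is an illustrative design choice rather than the reviewed studies’ exact recipe.

```python
# Minimal sketch: speech-pretrained Wav2Vec 2.0 as a frozen encoder for cow calls.
import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")
encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base-960h").eval()

clip = np.random.randn(16000 * 2).astype(np.float32)   # stand-in 2-s, 16 kHz call
inputs = extractor(clip, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state        # shape (1, frames, 768)

embedding = hidden.mean(dim=1)                           # one 768-d vector per clip
print(embedding.shape)
# `embedding` then feeds a small classifier trained on the few labelled cow calls.
```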
| # | Dataset / Benchmark | Scope & Modality Snapshot | Best-fit Models & Intended Task | Strengths for Model Development | Main Limitation |
|---|---|---|---|---|---|
| 1 | CowVox-2023 Mini (Sharma et al., 2023) | 8 h audio, 10k labelled calls, 2 Holstein farms | SVM / RF for estrus-call detection | Clean labels, free download (CC-BY) | Narrow breed & low noise |
| 2 | DeepSound26 Archive (Ferrero et al., 2023) | 120 h audio + 5 h collar IMU, 4 farms / 3 breeds | CNN-LSTM fusion for multimodal health events | Synchronised streams; individual IDs | Non-standard file names; requires resync |
| 3 | BEANS bovine subset (Hagiwara et al., 2022) | 6 h cow audio inside 35 h multi-species corpus | Wav2Vec 2.0 or AVES SSL encoder for zero-/few-shot stress detection | Noise-rich clips; ready for SSL pre-train | Sparse bovine labels; class imbalance |
| 4 | Agri-LLM Pilot Set (Chen et al., 2024) | 200 paired “call → English tag” clips, Jersey herd | AudioGPT / GPT-J prompt-tuning for captioning | Paired acoustic–semantic examples | Tiny; heavy text bias |
| 5 | SmartFarm Open-Noise (Martinez-Rau et al., 2025) | 40 h barn ambience (negative class), 5 barn layouts | Spec-BERT masking or NRFAR denoiser pre-train | Diverse negative class for contrastive learning | No positive calls; must be combined with other sets |
| # | Challenge / Pain-Point | Underlying Cause(s) & Typical Manifestation | Impact on Research / Farm Adoption | Proposed Technical / Operational Solutions |
|---|---|---|---|---|
| 1 | Data scarcity & class imbalance | Costly, time-consuming manual labelling of calls; rare yet critical events (e.g., pain bawls, calving distress) under-represented; farm privacy limits data sharing | Over-fitting, poor generalisation; models ignore rare but critical classes | Large open acoustic repositories (multi-farm, multi-breed); self-supervised pre-training (Wav2Vec, AVES) to cut labels by ~70 %; synthetic data via generative models (GAN vocoders) to upsample rare calls; transfer learning to share model weights, not raw audio |
| 2 | Cross-farm variability & domain shift | Differences in barn acoustics, microphone type, breed dialects, management routines | Performance drop when models deployed outside training site; farmer distrust | Domain-adversarial training, feature-space alignment; calibration period & incremental fine-tuning on each new farm; capture meta-data (mic height, barn SNR) for conditional normalisation |
| 3 | Background noise & multi-speaker overlap | Machinery, wind, multiple cows calling simultaneously | High false positives/negatives, missed welfare events | Beam-forming or multi-mic arrays for source separation; bi-spectral denoising + mask-based enhancement; event-wise confidence scoring & noise-aware thresholds |
| 4 | Limited interpretability (AI) | Deep nets learn latent features not visible to users | Farmers hesitant to trust alerts; regulators demand transparency | SHAP/LIME heatmaps on spectrograms; rule-extraction or surrogate decision trees; dashboard displays “Top 3 acoustic drivers” behind each alert |
| 5 | Sparse contextual labelling (why a call occurred) | Audio often logged without behavioural or physiological context | Misclassification of benign calls as distress (or vice versa) | Multimodal fusion: sync audio with accelerometer and video; mobile annotation apps for on-farm event tagging |
| 6 | Real-time processing on resource-constrained edge devices | GPU-heavy models vs. limited power / connectivity in barns | Latency or dropout; costly cloud fees | Lightweight architectures (MobileNet, DistilBERT-audio); on-device quantisation & pruning |
| 7 | Ethical risk of anthropomorphism & over-interpretation | AI may project human emotion labels inaccurately; farmers may act on unverified alerts | Questionable welfare interventions, misleading claims | Cross-validation against physiological stress markers; expert-in-the-loop verification before deploying new labels |
| 8 | Farmer adoption & usability barriers | Alert fatigue, complex interfaces, unclear ROI | System ignored despite accuracy; missed welfare benefit | Tiered alerting (red/high vs. yellow/medium); ROI calculators (savings on vet costs, improved conception); hands-on training and local-language interfaces |
| 9 | Data privacy & ownership concerns | Audio streams may reveal proprietary operations | Reluctance to share data; slows collaborative progress | Federated or encrypted model updates; clear data-use agreements with the farmer retaining raw-data ownership; on-premises processing options |
| 10 | Regulatory alignment & standardisation gaps | No harmonised acoustic welfare metrics yet | Hard to benchmark systems; variable certification hurdles | Develop ISO-style standards for recording & annotation; open benchmarking datasets and leaderboards; engage policymakers early to shape guidelines |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).