Submitted: 13 December 2023
Posted: 14 December 2023
Abstract
Keywords:
1. Introduction
- With long-term historical data available on cattle prices, agricultural commodity markets, economic indices, and international trade features, we structure price prediction as a machine learning problem, which can be more accurate, consistent, and efficient than traditional statistical time series methods.
- Three multivariate machine learning algorithms (support vector regression, random forest regression, and AdaBoost regression) and three univariate time series algorithms (ARIMA, SARIMA, and SARIMAX) were applied to long-term historical fed cattle price datasets (2005 to 2023) in Alberta. We assess the performance of these algorithms against observed fed cattle prices and identify the best-performing machine learning algorithm for fed cattle price prediction.
- The multivariate machine learning approach offers a feasible alternative to structural multivariate autoregressive modeling and can efficiently combine fed beef price and exchange rate dynamics, minimizing modeling errors and data requirements.
2. Background Literature
| Author(s) | Year | Study Domain | Considered Parameter(s) | Machine Learning Technique |
|---|---|---|---|---|
| Jeong et al. | 2022 | South Korea | Rice yield | five different structures of deep learning [15] |
| Sharma et al. | 2021 | Review paper | Precision agriculture | a comprehensive review [16] |
| Liu et al. | 2021 | Review paper | Precision agriculture | a systematic literature review [17] |
| Maroli et al. | 2021 | Review paper | sustainability in agricultural sector | a comprehensive review [18] |
| Meshram et al. | 2021 | Review paper | Pre-harvesting, harvesting and post-harvesting parameters | a comprehensive review on all ML techniques [19] |
| Chen et al. | 2021 | Malaysia | Agriculture commodity price | ARIMA, support vector regression (SVR), Prophet, extreme gradient boosting (XGBoost), long short-term memory (LSTM) [20] |
| Tian et al. | 2021 | Shaanxi, China | Wheat yield | long short-term memory (LSTM), back propagation neural network (BPNN), support vector machine (SVM) [21] |
| Divisekara et al. | 2020 | Saskatchewan, Canada | Forecasting the red lentils commodity market price | SARIMA models [22] |
| Sharma et al. | 2020 | Review paper | Sustainable agriculture supply chain performance | a systematic literature review [23] |
| Kamir et al. | 2020 | Australia | Wheat yield | random forest (RF), cubist (CU), XGBoost (XGB), multi-layer perceptron (MLP), support vector regression linear (SVMl), support vector regression radial (SVMr), Gaussian process regression (GPR), k-nearest neighbor (kNN), multivariate adaptive regression (MARS) [24] |
| Yamaç & Todorovic | 2020 | Bari, Southern Italy | Daily potato crop evapotranspiration | k-nearest neighbour (kNN), artificial neural networks (ANN), adaptive boosting (AdaBoost) [25] |
| Van Klompenburg et al. | 2020 | Review paper | Crop yield prediction | a systematic literature review on artificial neural network (ANN) methods [26] |
| Vidyarthi et al. | 2020 | Kettleman, California | Size and mass of pistachio kernels | random forest (RF) [27] |
| Cai et al. | 2019 | Australia | Wheat yield | LASSO, support vector machine (SVM), random forest (RF), neural network (NN) [28] |
| Kouadio et al. | 2018 | Southern Vietnam | Robusta coffee yield | extreme learning machine (ELM), multiple linear regression (MLR), random forests (RF) [29] |
| Prajapati & Kathiriya | 2016 | Vadodara in western India | Soil health card | K-nearest neighbor (kNN) classification using nine different similarity measures [30] |
2.1. Agricultural Commodity Price Information and Sustainability
3. Materials and Methods
3.1. Study Domain

3.2. Data Acquisition & Description & Exploration
3.3. Data Preprocessing & Partitioning (Train-Test) & Tuning
3.4. An Introduction to Machine Learning Algorithms and Description
3.4.1. Multivariate Analysis
3.4.2. Univariate Analysis
3.5. Validation Methods
4. Results and Discussions
4.1. Feature Selection & Hyperparameter Tuning
4.2. Validation of Multivariate and Univariate Machine Learning Models
4.3. Applying Probabilistic Modeling Approach to the Selected Machine Learning Model
5. Conclusions
- Multivariate modeling, incorporating additional key variables as predictors, demonstrated a notable advantage over univariate approaches in our investigation.
- Probabilistic modeling has an advantage compared to deterministic modeling. By employing probabilistic modeling, we consider uncertainties and incorporate them into deterministic RF predictions, providing a more realistic context for the predicted values with the selected RF model. This process should be more routinely applied to other machine learning modeling studies.
- Lastly, among the ML algorithms compared, the AdaBoost and Random Forest models showed similar and robust validation performance on our variables.
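
The probabilistic step mentioned in the conclusions can be illustrated with a minimal sketch. The procedure the authors used is not given in this text, so the approach below is only an assumption: it wraps a deterministic point forecast in a Gaussian predictive distribution whose spread is estimated from validation residuals. The function name `gaussian_interval`, the `coverage` parameter, and all numbers are hypothetical.

```python
from statistics import NormalDist, stdev

def gaussian_interval(point_forecast, validation_residuals, coverage=0.90):
    """Turn a deterministic forecast into a central prediction interval.

    Assumes (hypothetically) that forecast errors are zero-mean and roughly
    Gaussian, with a spread equal to the sample standard deviation of the
    residuals observed on a validation set.
    """
    sigma = stdev(validation_residuals)
    dist = NormalDist(mu=point_forecast, sigma=sigma)
    tail = (1.0 - coverage) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)

# Illustrative numbers only; these are not results from the study.
residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
lower, upper = gaussian_interval(150.0, residuals, coverage=0.90)
print(f"90% interval: [{lower:.2f}, {upper:.2f}]")
```

A distribution-free alternative would be to take empirical quantiles of the residuals, or of the individual tree predictions in the forest, instead of assuming normality.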
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
| 1 | |
| 2 | Research (canfax.ca) |
| 3 | |
| 4 | |
| 5 | |
| 6 | |
References
- Liakos, K.G.; Busato, P.; Moshou, D.; Pearson, S.; Bochtis, D. Machine learning in agriculture: A review. Sensors 2018, 18, 2674.
- Ouyang, H.; Wei, X.; Wu, Q. Agricultural commodity futures prices prediction via long- and short-term time series network. Journal of Applied Economics 2019, 22, 468–483.
- Chen, Z.; Li, C.; Sun, W. Bitcoin price prediction using machine learning: An approach to sample dimension engineering. Journal of Computational and Applied Mathematics 2020, 365.
- Rahmani, E.; Friederichs, P.; Keller, J.; Hense, A. Development of an effective and potentially scalable weather generator for temperature and growing degree days. Theoretical and Applied Climatology 2016, 124, 1167–1186.
- Colino, E.V.; Irwin, S.H. Outlook vs. futures: Three decades of evidence in hog and cattle markets. American Journal of Agricultural Economics 2010, 92, 1–15.
- Marsh, J.M. US feeder cattle prices: Effects of finance and risk, cow-calf and feedlot technologies, and Mexican feeder imports. Journal of Agricultural and Resource Economics 2001, 463–477.
- Tomek, W.G. Commodity prices revisited. Agricultural and Resource Economics Review 2000, 29, 125–137.
- Zhao, H. Futures price prediction of agricultural products based on machine learning. Neural Computing and Applications 2021, 33, 837–850.
- Blank, S.C.; Saitone, T.L.; Sexton, R.J. Calf and yearling prices in the western United States: Spatial, quality, and temporal factors in satellite video auctions. Journal of Agricultural and Resource Economics 2016, 458–480.
- Sanders, D.R.; Manfredo, M.R. USDA livestock price forecasts: A comprehensive evaluation. Journal of Agricultural and Resource Economics 2003, 316–334.
- Oliveira, R.A.; O'Connor, C.W.; Smith, G.W. Short-run forecasting models of beef prices. Western Journal of Agricultural Economics 1979, 45–55.
- Zapata, H.O.; Garcia, P. Price forecasting with time-series methods and nonstationary data: An application to monthly US cattle prices. Western Journal of Agricultural Economics 1990, 123–132.
- Linnell, P.B. Forecasting Fed Cattle Prices: Errors and Performance During Periods of High Volatility. Colorado State University, 2017.
- Kohzadi, N.; Boyd, M.S.; Kermanshahi, B.; Kaastra, I. A comparison of artificial neural network and time series models for forecasting commodity prices. Neurocomputing 1996, 10, 169–181.
- Jeong, S.; Ko, J.; Yeom, J.M. Predicting rice yield at pixel scale through synthetic use of crop and deep learning models with satellite data in South and North Korea. Science of the Total Environment 2022, 802, 149726.
- Sharma, A.; Jain, A.; Gupta, P.; Chowdary, V. Machine Learning Applications for Precision Agriculture: A Comprehensive Review. IEEE Access 2021, 9, 4843–4873.
- Liu, W.; Shao, X.-F.; Wu, C.-H.; Qiao, P. A systematic literature review on applications of information and communication technologies and blockchain technologies for precision agriculture development. Journal of Cleaner Production 2021, 298.
- Maroli, A.; Narwane, V.S.; Gardas, B.B. Applications of IoT for achieving sustainability in agricultural sector: A comprehensive review. Journal of Environmental Management 2021, 298, 113488.
- Meshram, V.; Patil, K.; Meshram, V.; Hanchate, D.; Ramkteke, S.D. Machine learning in agriculture domain: A state-of-art survey. Artificial Intelligence in the Life Sciences 2021, 1.
- Chen, Z.; Goh, H.S.; Sin, K.L.; Lim, K.; Chung, N.K.H.; Liew, X.Y. Automated Agriculture Commodity Price Prediction System with Machine Learning Techniques. Advances in Science, Technology and Engineering Systems Journal 2021, 6, 376–384.
- Tian, H.; Wang, P.; Tansey, K.; Zhang, J.; Zhang, S.; Li, H. An LSTM neural network for improving wheat yield estimates by integrating remote sensing data and meteorological data in the Guanzhong Plain, PR China. Agricultural and Forest Meteorology 2021, 310.
- Divisekara, R.W.; Jayasinghe, G.; Kumari, K. Forecasting the red lentils commodity market price using SARIMA models. SN Business & Economics 2021, 1, 1–13.
- Sharma, R.; Kamble, S.S.; Gunasekaran, A.; Kumar, V.; Kumar, A. A systematic literature review on machine learning applications for sustainable agriculture supply chain performance. Computers & Operations Research 2020, 119, 104926.
- Kamir, E.; Waldner, F.; Hochman, Z. Estimating wheat yields in Australia using climate records, satellite image time series and machine learning methods. ISPRS Journal of Photogrammetry and Remote Sensing 2020, 160, 124–135.
- Yamaç, S.S.; Todorovic, M. Estimation of daily potato crop evapotranspiration using three different machine learning algorithms and four scenarios of available meteorological data. Agricultural Water Management 2020, 228.
- van Klompenburg, T.; Kassahun, A.; Catal, C. Crop yield prediction using machine learning: A systematic literature review. Computers and Electronics in Agriculture 2020, 177.
- Vidyarthi, S.K.; Tiwari, R.; Singh, S.K.; Xiao, H.W. Prediction of size and mass of pistachio kernels using random Forest machine learning. Journal of Food Process Engineering 2020, 43.
- Cai, Y.; Guan, K.; Lobell, D.; Potgieter, A.B.; Wang, S.; Peng, J.; Xu, T.; Asseng, S.; Zhang, Y.; You, L.; et al. Integrating satellite and climate data to predict wheat yield in Australia using machine learning approaches. Agricultural and Forest Meteorology 2019, 274, 144–159.
- Kouadio, L.; Deo, R.C.; Byrareddy, V.; Adamowski, J.F.; Mushtaq, S.; Phuong Nguyen, V. Artificial intelligence approach for the prediction of Robusta coffee yield using soil fertility properties. Computers and Electronics in Agriculture 2018, 155, 324–338.
- Prajapati, B.P.; Kathiriya, D.R. Towards the new Similarity Measures in Application of Machine Learning Techniques on Agriculture Dataset. International Journal of Computer Applications 2016, 156.
- Shepherd, A. Market information services: Theory and practice; Food & Agriculture Org.: 1997.
- Dorward, A. Agricultural labour productivity, food prices and sustainable development impacts and indicators. Food Policy 2013, 39, 40–50.
- CanadaBeef. Canada’s Beef Industry FAST FACTS; CANADA BEEF, 2021.
- Jumin, E.; Basaruddin, F.B.; Yusoff, Y.B.M.; Latif, S.D.; Ahmed, A.N. Solar radiation prediction using boosted decision tree regression model: A case study in Malaysia. Environmental Science and Pollution Research 2021, 28, 26571–26583.
- Maulud, D.; Abdulazeez, A.M. A Review on Linear Regression Comprehensive in Machine Learning. Journal of Applied Science and Technology Trends 2020, 1, 140–147.
- Athey, S.; Imbens, G.W. Machine learning methods that economists should know about. Annual Review of Economics 2019, 11, 685–725.
- Cortes, C.; Vapnik, V. Support-vector networks. Machine Learning 1995, 20, 273–297.
- Sharifzadeh, M.; Sikinioti-Lock, A.; Shah, N. Machine-learning methods for integrated renewable power generation: A comparative study of artificial neural networks, support vector regression, and Gaussian Process Regression. Renewable and Sustainable Energy Reviews 2019, 108, 513–538.
- Wu, X.; Kumar, V.; Ross Quinlan, J.; Ghosh, J.; Yang, Q.; Motoda, H.; McLachlan, G.J.; Ng, A.; Liu, B.; Yu, P.S.; et al. Top 10 algorithms in data mining. Knowledge and Information Systems 2007, 14, 1–37.
- Breiman, L. Random forests. Machine Learning 2001, 45, 5–32.
- Biau, G.; Scornet, E. A random forest guided tour. Test 2016, 25, 197–227.
- Freund, Y.; Schapire, R.; Abe, N. A short introduction to boosting. Journal of Japanese Society for Artificial Intelligence 1999, 14, 1612.
- Drucker, H. Improving regressors using boosting techniques. In Proceedings of the ICML; 1997; pp. 107–115.
- Kummer, N.; Najjaran, H. Adaboost.MRT: Boosting regression for multivariate estimation. Artificial Intelligence Research 2014, 3, 64–76.
- Lu, J.; Hu, H.; Bai, Y. Generalized radial basis function neural network based on an improved dynamic particle swarm optimization and AdaBoost algorithm. Neurocomputing 2015, 152, 305–315.
- Sevinç, E. An empowered AdaBoost algorithm implementation: A COVID-19 dataset study. Computers & Industrial Engineering 2022, 165, 107912.
- Awad, M.; Khanna, R. Efficient learning machines: theories, concepts, and applications for engineers and system designers; Springer Nature: 2015.
- Greener, J.G.; Kandathil, S.M.; Moffat, L.; Jones, D.T. A guide to machine learning for biologists. Nature Reviews Molecular Cell Biology 2022, 23, 40–55.
- Kumari, S.N.; Tan, A. Modeling and forecasting volatility series: with reference to Gold price. Thailand Statistician 2018, 16, 77-63.
- Mahmoudzadeh, H.; Matinfar, H.R.; Taghizadeh-Mehrjardi, R.; Kerry, R. Spatial prediction of soil organic carbon using machine learning techniques in western Iran. Geoderma Regional 2020, 21.
- Rahmani, E. The effect of climate variability on wheat in Iran. 2015.
- Vijh, M.; Chandola, D.; Tikkiwal, V.A.; Kumar, A. Stock closing price prediction using machine learning techniques. Procedia Computer Science 2020, 167, 599–606.
- Dubey, A.K.; Kumar, A.; García-Díaz, V.; Sharma, A.K.; Kanhaiya, K. Study and analysis of SARIMA and LSTM in forecasting time series data. Sustainable Energy Technologies and Assessments 2021, 47, 101474.
- Gneiting, T.; Raftery, A.E.; Westveld III, A.H.; Goldman, T. Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Monthly Weather Review 2005, 133, 1098–1118.
- Nokeri, T.C. Forecasting Using ARIMA, SARIMA, and the Additive Model. In Implementing Machine Learning for Finance; Springer: 2021; pp. 21–50.






| Year | Fed Steer Price ($/cwt) | Natural Gas Price ($/GJ) | Barley Price ($/tonne) | Exchange Rate (CAD/USD) | Canadian Consumer Price Index |
|---|---|---|---|---|---|
| 2005 | 85.60 | 7.87 | 91.79 | 1.21 | 85.44 |
| 2006 | 86.90 | 6.22 | 101.81 | 1.13 | 86.93 |
| 2007 | 88.51 | 6.05 | 156.99 | 1.07 | 88.30 |
| 2008 | 90.07 | 7.47 | 199.31 | 1.07 | 90.15 |
| 2009 | 85.72 | 3.65 | 152.77 | 1.14 | 90.85 |
| 2010 | 89.08 | 3.57 | 138.07 | 1.03 | 91.93 |
| 2011 | 106.47 | 3.28 | 174.58 | 0.99 | 94.51 |
| 2012 | 112.32 | 2.14 | 214.24 | 1.00 | 96.39 |
| 2013 | 119.26 | 2.83 | 239.08 | 1.03 | 97.17 |
| 2014 | 156.51 | 4.00 | 173.54 | 1.10 | 98.65 |
| 2015 | 184.16 | 2.42 | 218.80 | 1.28 | 100.00 |
| 2016 | 153.75 | 1.83 | 215.96 | 1.33 | 100.81 |
| 2017 | 154.93 | 2.02 | 193.91 | 1.30 | 101.96 |
| 2018 | 153.68 | 1.29 | 213.36 | 1.30 | 103.68 |
| 2019 | 149.87 | 1.40 | 227.00 | 1.33 | 106.03 |
| 2020 | 138.94 | 1.90 | 215.26 | 1.34 | 107.04 |
| 2021 | 155.88 | 3.10 | 279.61 | 1.25 | 111.04 |
| 2022 | 173.08 | 4.78 | 380.14 | 1.30 | 118.44 |
| 2023* | 222.36 | 2.57 | 386.30 | 1.35 | 124.01 |
| Modeling | Acronym | Algorithm | Description |
|---|---|---|---|
| Multivariate Approaches | RF | Random Forest Regression | RF regression is an ensemble learning method and a general-purpose, effective approach to classification and regression. It combines numerous randomized decision trees and averages their predictions. |
| Multivariate Approaches | AB | AdaBoost Regressor | AdaBoost is a widely used boosting algorithm for classification that has been extended to regression. During training, the weight of a sample in the distribution is increased when its error is high and decreased when its error is low, and subsequent learners are trained on the reweighted samples. The objective is to yield robust results by reducing the errors of successive models, ultimately achieving higher accuracy [45,46]. |
| Multivariate Approaches | SVM | Support Vector Machines | SVM is a supervised learning method that uses a symmetrical loss function, penalizing high and low misestimates equally, and it has been shown to be effective for estimating real-valued functions [47]. It can perform both linear and non-linear classification and regression. However, training on large datasets can be challenging [48]. |
| Univariate Approaches | ARIMA | Auto Regressive Integrated Moving Average | ARIMA is based on the idea that past values of a time series can be used to predict its future values, while also accounting for autocorrelation in the error terms and for non-stationarity through differencing. |
| Univariate Approaches | SARIMA | Seasonal Auto Regressive Integrated Moving Average | SARIMA is the seasonal ARIMA model, formed by adding seasonal lag and moving average terms to an ARIMA model. |
| Univariate Approaches | SARIMAX | Seasonal Auto Regressive Integrated Moving Average with exogenous factors | SARIMAX extends the SARIMA model with an external predictor, also known as an exogenous variable (e.g., a seasonal index). |
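
The sample reweighting described for AdaBoost in the table above can be sketched for the regression case. The snippet below follows the AdaBoost.R2 update with a linear loss (Drucker, 1997), which is the variant behind scikit-learn's `AdaBoostRegressor`. The text does not confirm that this exact variant was used, so treat it as an assumption. The function name and the numbers are hypothetical.

```python
def adaboost_r2_reweight(weights, observed, predicted):
    """One AdaBoost.R2 reweighting step using the linear loss."""
    abs_err = [abs(o - p) for o, p in zip(observed, predicted)]
    max_err = max(abs_err)
    if max_err == 0:
        return list(weights)  # perfect fit: nothing to reweight

    total = sum(weights)
    probs = [w / total for w in weights]
    losses = [e / max_err for e in abs_err]  # each loss lies in [0, 1]
    avg_loss = sum(l * p for l, p in zip(losses, probs))
    if avg_loss >= 0.5:
        raise ValueError("weak learner too poor (average loss >= 0.5)")

    # A small beta means a confident learner; well-predicted samples
    # (loss near 0) are scaled down the most.
    beta = avg_loss / (1.0 - avg_loss)
    updated = [w * beta ** (1.0 - l) for w, l in zip(weights, losses)]
    norm = sum(updated)
    return [w / norm for w in updated]

# Illustrative values only: the last sample is predicted worst.
new_weights = adaboost_r2_reweight(
    weights=[0.25, 0.25, 0.25, 0.25],
    observed=[10.0, 12.0, 14.0, 16.0],
    predicted=[10.0, 12.0, 14.5, 18.0],
)
```

After the update, the poorly predicted samples carry more weight, so the next learner in the sequence concentrates on them.
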
| RF hyperparameter | RF best params | SVR hyperparameter | SVR best params | AB hyperparameter | AB best params |
|---|---|---|---|---|---|
| Bootstrap | True | C | 1 | Base estimator max depth | 10 |
| Ccp alpha | 0.0 | Epsilon | 0.1 | Learning rate | 1 |
| Criterion | mse | Kernel | rbf | N estimators | 300 |
| Min impurity decrease | 0.0 | - | - | - | - |
| Min samples leaf | 1 | - | - | - | - |
| Min samples split | 2 | - | - | - | - |
| Min weight fraction leaf | 0.0 | - | - | - | - |
| N estimators | 300 | - | - | - | - |
| N jobs | 1 | - | - | - | - |
| Max features | auto | - | - | - | - |
| Max samples | None | - | - | - | - |
| Max leaf nodes | None | - | - | - | - |
| Max depth | None | - | - | - | - |
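
For readers who want to reproduce a comparable setup, the tuned values in the table above can be written as keyword arguments. This is a sketch that assumes scikit-learn (version 1.2 or later), which the parameter names suggest but the text does not state. The table's `mse` criterion and `auto` feature setting are older scikit-learn spellings; their current equivalents for a regression forest are `squared_error` and `1.0` (all features). The function `build_models` is a hypothetical helper.

```python
# Tuned values copied from the table; only non-default or notable ones are listed.
RF_PARAMS = dict(
    n_estimators=300, bootstrap=True,
    criterion="squared_error",  # reported as "mse" (older spelling)
    max_features=1.0,           # reported as "auto" (all features for regression)
    max_depth=None, min_samples_split=2, min_samples_leaf=1, n_jobs=1,
)
SVR_PARAMS = dict(C=1.0, epsilon=0.1, kernel="rbf")
AB_PARAMS = dict(n_estimators=300, learning_rate=1.0)
AB_BASE_MAX_DEPTH = 10  # depth of the decision tree used as base estimator

def build_models():
    """Instantiate the three multivariate regressors (requires scikit-learn >= 1.2)."""
    # Imported lazily so the settings above can be inspected without the library.
    from sklearn.ensemble import AdaBoostRegressor, RandomForestRegressor
    from sklearn.svm import SVR
    from sklearn.tree import DecisionTreeRegressor

    return {
        "RF": RandomForestRegressor(**RF_PARAMS),
        "SVR": SVR(**SVR_PARAMS),
        "AB": AdaBoostRegressor(
            estimator=DecisionTreeRegressor(max_depth=AB_BASE_MAX_DEPTH),
            **AB_PARAMS,
        ),
    }
```

Calling `build_models()` returns unfitted estimators; each would then be fitted on the training partition with its usual `fit(X, y)` method.
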
| Hyperparameter | ARIMA | SARIMA | SARIMAX |
|---|---|---|---|
| start_p | 1 | 1 | 1 |
| start_q | 1 | 1 | 1 |
| max_p | 3 | 3 | 3 |
| max_q | 3 | 3 | 3 |
| m | 1 | 12 | 12 |
| test | adf | adf | adf |
| seasonal | False | False | True |
| trace | True | True | True |
| start_P | 0 | 0 | 0 |
| D | 0 | 1 | 1 |
| stepwise | True | True | True |
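
The search settings in the table above match the argument names of `auto_arima` in the pmdarima package, so a plausible way to express them is as keyword dictionaries. Whether the authors used pmdarima is an assumption, and `fit_univariate` is a hypothetical helper. The values are copied from the table as printed, including `seasonal=False` in the SARIMA column.

```python
# Settings shared by all three univariate models (from the table).
COMMON = dict(start_p=1, start_q=1, max_p=3, max_q=3, test="adf",
              trace=True, start_P=0, stepwise=True)

# Model-specific settings, one dictionary per column of the table.
SEARCH_SETTINGS = {
    "ARIMA":   {**COMMON, "m": 1,  "seasonal": False, "D": 0},
    "SARIMA":  {**COMMON, "m": 12, "seasonal": False, "D": 1},
    "SARIMAX": {**COMMON, "m": 12, "seasonal": True,  "D": 1},
}

def fit_univariate(prices, model_name, exogenous=None):
    """Run the stepwise order search for one model (requires pmdarima)."""
    # Imported lazily so the settings can be inspected without the package.
    import pmdarima as pm

    return pm.auto_arima(prices, X=exogenous, **SEARCH_SETTINGS[model_name])
```

With `m=12` the seasonal period is twelve observations, which corresponds to a yearly cycle if the price series is monthly.
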
| Statistic | ARIMA | SARIMA | SARIMAX |
|---|---|---|---|
| Best Model | ARIMA(3,1,1)(0,0,0)[0] | ARIMA(0,1,3)(0,1,1)[12] | ARIMA(2,0,0)(1,1,2)[12] |
| AIC | 960.02 | 878.99 | 1000.73 |
| BIC | 980.46 | 895.75 | 1027.63 |
| Prob (Q) | 0 | 0.09 | 0 |
| Prob (JB) | 0 | 0 | 0.56 |
| Heteroskedasticity (H) | 2.51 | 3.07 | 2.39 |
| Ljung-Box (Q) | 70.43 | 52.62 | 183.15 |
| Skew | -0.62 | -0.31 | -0.02 |
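
The AIC and BIC rows in the table above follow from a fitted model's maximized log-likelihood under the standard definitions, sketched below. The input values are hypothetical and are not taken from the fitted models. Lower values indicate a better trade-off between fit and complexity.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical inputs for illustration only.
example_aic = aic(log_likelihood=-470.0, n_params=5)
example_bic = bic(log_likelihood=-470.0, n_params=5, n_obs=200)
```

Because BIC multiplies the parameter count by ln(n) instead of 2, it penalizes additional parameters more heavily than AIC whenever the series has more than about seven observations.
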
| Modeling | Algorithm | MAE | RMSE | MSE |
|---|---|---|---|---|
| Multivariate | RF | 0.19 | 0.28 | 0.08 |
| Multivariate | AB | 0.19 | 0.28 | 0.08 |
| Multivariate | SVR | 0.33 | 0.43 | 0.19 |
| Univariate | ARIMA | 3.24 | 4.35 | 18.89 |
| Univariate | SARIMA | 1.76 | 2.18 | 4.76 |
| Univariate | SARIMAX | 2.50 | 3.05 | 9.31 |
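
The three validation metrics reported in the table above can be computed as in the following sketch. The function name and the sample values are hypothetical and serve only to show the definitions.

```python
import math

def regression_errors(observed, predicted):
    """Return MAE, MSE and RMSE for two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(r) for r in residuals) / len(residuals)
    mse = sum(r * r for r in residuals) / len(residuals)
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}

# Hypothetical values for illustration only (not data from the study).
observed = [100.0, 102.0, 105.0, 103.0]
predicted = [101.0, 101.0, 107.0, 103.0]
errors = regression_errors(observed, predicted)
```

RMSE is the square root of MSE, so the two columns carry the same ranking information; MAE is less sensitive to occasional large errors.
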
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).