Preprint (Article). This version is not peer-reviewed.

Forecasting and Statistical Quality Assessment of Philippine Crop Production Data: Evidence from Rice and Corn

Submitted: 19 April 2026. Posted: 20 April 2026.
Abstract
Reliable crop statistics are foundational to food-security planning, yet the literature often treats forecasting accuracy and data-quality assessment as separate tasks. This paper develops an integrated evidence synthesis around three Philippine studies that together illuminate both problems for rice and corn. The first compared Seasonal Autoregressive Integrated Moving Average and Holt-Winters models for quarterly rice and corn production and found that Holt-Winters with additive seasonality yielded lower forecast errors. The second extended the forecasting problem to machine-learning models and reported that Random Forest produced the strongest predictive performance among the tested algorithms, while performance varied across other nonlinear approaches. The third applied the Newcomb-Benford law to official crop production statistics and identified deviations in rice and corn digit patterns that warrant further validation. Drawing on official Philippine Statistics Authority documentation and broader methodological literature on forecast evaluation, survey reliability, and crop-yield prediction, the paper argues that forecastability and statistical integrity should be studied together rather than in isolation. A series can be forecastable yet still contain reporting irregularities, while a numerically plausible series can remain difficult to forecast because of structural breaks, weather shocks, or shifting production conditions. For agricultural planning, the strongest evidence base comes from combining temporal modeling with routine statistical-quality screening, transparent revision practices, and follow-up diagnostics when anomalies appear. The paper concludes by proposing a practical framework for Philippine agricultural analytics in which data integrity checks precede and accompany forecasting, thereby improving the credibility of crop outlooks used for procurement, import planning, early warning, and resource allocation.

1. Introduction

Rice and corn remain central to Philippine food-security planning because they sit at the intersection of staple consumption, farm livelihoods, input use, stocks management, and import decisions. Official crop estimates are not merely descriptive records. They are operational signals used in policy and market coordination. The Philippine Statistics Authority generates quarterly estimates and forward-looking outlooks through the Palay Production Survey and the Corn Production Survey, and the resulting releases are explicitly intended to inform planners and policy makers on matters concerning the rice and corn sectors (Philippine Statistics Authority, 2021a, 2021b). In that setting, the analytical value of crop statistics depends on two linked questions. First, how forecastable are the series? Second, how trustworthy are the reported numerical patterns that feed the forecasts?
These questions are often handled in separate literatures. Forecasting studies usually emphasize model selection, error metrics, and horizon performance. Data-quality studies, by contrast, focus on measurement error, coherence, statistical production processes, and anomaly screening. Yet in practice the two issues are inseparable. Forecasts derived from weak or irregular input data can produce misleading confidence, while data-quality diagnostics that never confront the predictive behavior of the series can remain disconnected from planning use. Quality in official statistics is commonly understood in terms of fitness for use and is evaluated through dimensions such as integrity, accuracy and reliability, timeliness, coherence, and interpretability (Brackstone, 1999; International Monetary Fund, 2003). For crop statistics that support food planning, fitness for use includes predictive usefulness as well as numerical credibility.
This paper addresses that gap through a focused evidence synthesis built around three connected Philippine studies. The first examined quarterly rice and corn production forecasting using Seasonal Autoregressive Integrated Moving Average and Holt-Winters models and reported better performance for Holt-Winters additive seasonality (Parreño, 2023). The second expanded the modeling space to Random Forest, Echo State Networks, Neural Network Autoregressive models, and an Autoregressive Support Vector Machine, showing that predictive performance differed substantially across machine-learning approaches and that Random Forest achieved the lowest error levels among the tested models (Parreño & Anter, 2024). The third shifted the lens from forecasting to statistical integrity by applying the Newcomb-Benford law to official crop production statistics and identifying notable deviations in rice and corn, among other crops, that justify closer validation rather than automatic acceptance of the published figures (Parreño, 2024).
The contribution of the present article is conceptual and methodological. It does not introduce a new estimation exercise. Instead, it shows why forecastability and data integrity should be treated as complementary properties of official crop series. By bringing together Philippine forecasting evidence, Benford-based screening, official survey documentation, and broader literature on survey reliability and forecast evaluation, the paper develops an integrated framework for agricultural analytics that is more useful for planning than either approach on its own.

2. Review Method

The article uses a focused integrative review design. The three core studies were selected because each addresses a distinct but connected component of the same policy problem: forecasting rice and corn production, extending the forecast problem to nonlinear machine-learning models, and screening official crop statistics for irregular numerical patterns. These studies were then situated within supporting literature on official-statistics quality frameworks, Benford-based reliability assessment, crop-yield prediction, and forecast evaluation. Official documentation from the Philippine Statistics Authority was used to anchor the discussion in the actual institutional production of rice and corn statistics.
A meta-analysis was neither feasible nor appropriate because the core studies differ in immediate objective, model class, and unit of evaluation. One is a classical time-series comparison, one is a machine-learning model comparison, and one is a statistical-quality screening exercise. The synthesis therefore proceeded interpretively. First, the core findings of each study were restated in comparable terms. Second, the studies were examined for what they imply about the predictive usefulness and reliability of official crop data. Third, a conceptual framework was developed to show how the two dimensions can be combined in applied agricultural planning. The purpose is not to collapse heterogeneous evidence into a single summary coefficient, but to clarify how the evidence fits together in a usable way.

3. Findings from the Focused Evidence Synthesis

3.1. Official crop statistics as planning infrastructure

Philippine rice and corn statistics are produced within a survey system that already blends measurement and outlook generation. The Palay Production Survey is a quarterly survey that produces estimates of production, area, and yield, and each round also generates forecasts for the next two quarters (Philippine Statistics Authority, 2021a). The Corn Production Survey follows the same quarterly cycle, and its documentation shows that the data system includes current-quarter estimates, succeeding-quarter forecasts, and related information on harvested area, production, crop type, prices, stocks, and weather indicators (Philippine Statistics Authority, 2021b). This institutional design matters for the present review because it means that crop statistics are already used prospectively. Forecasting is not an optional academic add-on. It is built into the information environment of crop policy.

3.2. What the forecasting studies contribute

Within that environment, the 2023 rice and corn forecasting study provides the baseline evidence on classical time-series performance. Using quarterly Philippine production data from 1987 to the first quarter of 2023, the study compared SARIMA and Holt-Winters models and found that Holt-Winters with additive seasonality produced lower RMSE and MAPE values for both rice and corn (Parreño, 2023). The result is important for two reasons. First, it suggests that stable seasonal structure remains strong enough in the official series to support interpretable univariate forecasting. Second, it shows that relatively transparent methods can still perform well in an applied policy setting, which matters when agencies need methods that are understandable to nontechnical users.
The 2024 machine-learning study widened the analytical horizon by testing Random Forest, Echo State Networks, Neural Network Autoregressive models, and an Autoregressive Support Vector Machine for the same agricultural problem (Parreño & Anter, 2024). The study reports that Random Forest achieved the lowest error levels, while ARSVM yielded the highest errors, with the remaining models showing intermediate performance. Taken together with the earlier SARIMA and Holt-Winters comparison, the evidence suggests that rice and corn production forecasting in the Philippines is sensitive to model class, and that nonlinear methods can extract additional predictive value from the series when their temporal structure is not fully captured by classical seasonal models alone. This finding aligns with the broader crop-yield literature, which shows that machine learning can improve agricultural prediction when the underlying features and evaluation procedures are appropriate, but that performance gains are highly dependent on data characteristics and validation design (van Klompenburg et al., 2020).
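How a tree ensemble is applied to a univariate series is worth making concrete. The sketch below, assuming scikit-learn and synthetic data rather than the PSA series, turns the forecasting problem into supervised learning by predicting each quarter from its previous four lags and fitting a Random Forest; the lag depth and hyperparameters are illustrative choices, not those of the cited study.

```python
# Illustrative one-step-ahead forecasting with a Random Forest on lagged values.
# Synthetic data; lag depth and hyperparameters are arbitrary choices.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n = 148
t = np.arange(n)
y = 4000 + 10 * t + 300 * np.sin(2 * np.pi * t / 4) + rng.normal(0, 60, n)

# Supervised matrix: predict y[t] from the previous 4 quarters
lags = 4
X = np.column_stack([y[i:n - lags + i] for i in range(lags)])
target = y[lags:]

X_train, X_test = X[:-8], X[-8:]
y_train, y_test = target[:-8], target[-8:]

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
pred = rf.predict(X_test)
rmse = float(np.sqrt(np.mean((y_test - pred) ** 2)))
print(f"test RMSE = {rmse:.1f}")
```

One design caveat worth noting: tree ensembles cannot predict values outside the range seen in training, so strongly trending series are usually detrended or differenced before a Random Forest is applied.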
The forecasting studies therefore establish a clear case for continued predictive modeling of official crop statistics. They show that the rice and corn series contain usable temporal signal, and they identify a methodological progression from seasonal statistical models to more flexible machine-learning approaches. At the same time, the forecasting evidence alone cannot answer whether the underlying measurements are fully credible. A series can contain seasonality, persistence, and nonlinear structure while still reflecting reporting problems, revisions, or local irregularities. Forecast accuracy and data validity are related, but they are not identical.

3.3. What the statistical quality study contributes

That is where the 2024 Newcomb-Benford study becomes analytically important. Benford's law describes the logarithmic distribution expected of leading digits in many naturally occurring numerical datasets that span multiple orders of magnitude (Benford, 1938; Hill, 1995). In applied settings it is often used as a screening device for unusual digit patterns rather than as a final verdict on data falsification. The Philippine crop study applied first-digit and first-two-digit tests to major crop production statistics and found notable deviations for rice and corn, among other crops, thereby highlighting areas where published values merit closer scrutiny (Parreño, 2024).
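A first-digit screen of this kind is inexpensive to implement. The sketch below compares observed leading-digit frequencies against the Benford expectation log10(1 + 1/d) using a chi-square statistic; the input values are synthetic and log-uniform (and so conform to Benford by construction), standing in for a column of production figures.

```python
# Illustrative first-digit Benford screen with a chi-square statistic.
# Synthetic log-uniform values stand in for reported production figures.
import math
import random

def first_digit(x):
    # Leading digit of a positive number
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

random.seed(42)
values = [10 ** random.uniform(1, 5) for _ in range(5000)]

observed = [0] * 9
for v in values:
    observed[first_digit(v) - 1] += 1

n = len(values)
chi2 = 0.0
for d in range(1, 10):
    expected = n * math.log10(1 + 1 / d)   # Benford expectation for digit d
    chi2 += (observed[d - 1] - expected) ** 2 / expected

# Chi-square critical value with 8 degrees of freedom at the 5% level: 15.51
print(f"chi-square = {chi2:.2f} (flag for review at 5% if > 15.51)")
```

The first-two-digit test follows the same pattern over the 90 digit pairs 10 through 99, with expectation log10(1 + 1/dd) for each pair.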
This contribution should be interpreted carefully. Benford-based nonconformity does not prove fabrication, and conformity does not guarantee correctness. The method is best understood as an economical diagnostic that flags numerical patterns for follow-up checking. That interpretation is consistent with the broader literature. Judge and Schechter (2009) show how Benford analysis can detect abnormalities in survey data, while Kaiser (2019) argues that deviations can serve as indicators of reliability problems in survey-based datasets. In agricultural statistics specifically, Hanci (2022) demonstrates that Benford tests can be meaningfully applied to crop production series, reinforcing the idea that digit-distribution analysis can function as part of a wider quality-assurance toolbox.
What the Benford study adds to the Philippine rice and corn forecasting literature is therefore not a competing model, but a different evidentiary question. Forecasting asks whether a series has enough stable temporal structure to support useful prediction. Benford screening asks whether the numerical composition of the reported values is sufficiently plausible to justify analytical trust. The two are distinct. A forecastable series may still deserve an audit. A numerically plausible series may still be hard to forecast because of climatic shocks, policy discontinuities, or structural transformation.
Table 1. Summary of the three core studies and their contribution to the integrated framework.
Study | Data focus | Methods | Main finding | Analytical contribution
Parreño (2023) | Quarterly Philippine rice and corn production, 1987 to Q1 2023 | SARIMA and Holt-Winters | Holt-Winters additive seasonality outperformed SARIMA for both crops on RMSE and MAPE | Establishes baseline evidence that official crop series contain forecastable seasonal structure
Parreño and Anter (2024) | Philippine rice and corn production series | Random Forest, ESN, NNAR, and ARSVM | Random Forest produced the lowest error rates, while ARSVM showed the weakest performance | Shows that nonlinear model choice materially affects forecasting performance
Parreño (2024) | Official Philippine crop production statistics, including rice and corn | First-digit and first-two-digit Newcomb-Benford tests | Rice and corn exhibited deviations that merit follow-up validation | Adds an integrity-screening layer before strong planning claims are made from the data

4. An Integrated Framework for Forecasting and Quality Assessment

Studying forecastability and statistical integrity together produces a fuller account of agricultural information quality. The forecasting literature provides evidence on temporal signal, model suitability, error magnitude, and horizon behavior. The integrity literature provides evidence on numerical plausibility, reporting anomalies, and the need for validation. Neither dimension is sufficient on its own for planning. If forecasting alone is emphasized, analysts may optimize models on data whose irregularities have not been screened. If statistical forensics alone is emphasized, data producers may identify suspicious digit patterns without knowing whether the cleaned or audited series actually supports useful decision-making.
The need for this combined view is reinforced by best-practice work on forecast evaluation. Forecast assessment is highly sensitive to how data are partitioned, which error measures are chosen, and whether the characteristics of the series are respected during validation (Hewamalage et al., 2023). If the input series itself contains quality issues, then model comparison can overstate or understate genuine predictive capacity. In the language of official-statistics quality management, predictive performance should be read as one practical expression of fitness for use, but only alongside integrity, accuracy, and coherence in the production process (Brackstone, 1999; International Monetary Fund, 2003).
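One validation design the forecast-evaluation literature recommends for time series is rolling-origin (expanding-window) evaluation, which respects temporal order instead of shuffling observations. The sketch below is illustrative: a seasonal-naive forecaster and synthetic data stand in for a real model and the official series.

```python
# Illustrative rolling-origin (expanding-window) evaluation.
# Seasonal-naive forecaster and synthetic data are stand-ins only.
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(60)
y = 4000 + 250 * np.sin(2 * np.pi * t / 4) + rng.normal(0, 40, 60)

def seasonal_naive(history):
    # Forecast the next quarter as the value observed 4 quarters earlier
    return history[-4]

errors = []
for origin in range(40, 60):        # slide the forecast origin forward
    train = y[:origin]              # only data available up to the origin
    pred = seasonal_naive(train)
    errors.append(y[origin] - pred)

rmse = float(np.sqrt(np.mean(np.square(errors))))
print(f"rolling-origin RMSE = {rmse:.1f}")
```

Averaging errors across many origins, rather than a single train-test split, reduces the chance that one lucky or unlucky cut of the data drives the model comparison.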
For rice and corn planning, this logic yields a simple but consequential principle. Forecasts should be treated as strongest when they are generated from series that both perform well under transparent forecasting evaluation and pass basic statistical-quality screening, or at least show anomalies that have been investigated and documented. Conversely, when Benford screening raises concerns, the appropriate response is not to discard forecasting altogether. It is to introduce diagnostic follow-up such as metadata review, revision checks, regional decomposition, outlier analysis, and comparison across reporting rounds before strong planning claims are made.
The three core Philippine studies support a staged analytical workflow. First, official rice and corn data are acquired from the established PSA system. Second, the series undergo routine integrity checks, including first-digit and first-two-digit screening where appropriate. Third, forecasting models are compared using transparent train-test logic and multiple error measures. Fourth, forecast interpretation is conditioned on the quality findings. Under this workflow, predictive results become not just more accurate in a technical sense, but more credible as public planning inputs.
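The staged workflow can be expressed as a thin orchestration layer. In the sketch below, `screen_digits` and `fit_forecast` are hypothetical placeholders for the real Benford screen and forecasting model; the point is the control flow, in which forecast interpretation is conditioned on the integrity result.

```python
# Sketch of the four-stage workflow; screen_digits() and fit_forecast()
# are hypothetical placeholders, not real PSA or library interfaces.

def screen_digits(series):
    """Stage 2 placeholder: return True if digit screening flags the series.
    A real implementation would run the Benford chi-square tests here."""
    return False

def fit_forecast(series, horizon=2):
    """Stage 3 placeholder: naive seasonal repeat of the last 4 quarters."""
    season = series[-4:]
    return [season[i % 4] for i in range(horizon)]

def assess(series, horizon=2):
    # Stage 1 (acquisition) is assumed done; series comes from the PSA system.
    flagged = screen_digits(series)            # Stage 2: integrity check
    forecast = fit_forecast(series, horizon)   # Stage 3: forecasting
    # Stage 4: interpretation conditioned on the quality finding
    caveat = ("digit anomalies detected; audit before planning use"
              if flagged else "no integrity flags raised")
    return {"forecast": forecast, "flagged": flagged, "caveat": caveat}

result = assess([4200.0, 3900.0, 3700.0, 4400.0] * 10)
print(result["caveat"])
```

The essential design choice is that the integrity flag travels with the forecast rather than being discarded after screening, so downstream users see the caveat alongside the numbers.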
Table 2. Planning matrix linking forecastability and data integrity.
Forecastability | Integrity signal | Implication for planning use
High forecastability | Acceptable integrity | Strongest basis for planning. Use forecasts directly with routine monitoring and periodic re-estimation.
High forecastability | Questionable integrity | Useful signal, but forecasts should be interpreted only after audit, revision checks, and sensitivity analysis.
Low forecastability | Acceptable integrity | Data may be credible but difficult to predict. Expand the model space, add covariates, or shorten the decision horizon.
Low forecastability | Questionable integrity | Weakest basis for planning. Prioritize data review, metadata inspection, and redesign of the analytical pipeline before relying on forecasts.

5. Implications for Agricultural Planning in the Philippines

Several implications follow for food-security and agricultural planning in the Philippines. First, statistical agencies and research units should avoid treating data validation and forecasting as separate reporting silos. Quarterly crop releases already contain both current estimates and near-term outlooks. A routine integrity screen attached to those releases would strengthen user confidence and help distinguish ordinary seasonal movement from values that require closer verification.
Second, model development should remain plural rather than doctrinaire. The reviewed evidence shows that Holt-Winters performed better than SARIMA in one Philippine comparison, while Random Forest performed best among the tested machine-learning approaches in a later comparison. The correct lesson is not that one family of models should permanently dominate. It is that model choice should remain empirical, data-dependent, and regularly re-evaluated as new quarters are added and structural conditions change.
Third, regional disaggregation is a logical next step. National aggregates can conceal subnational reporting variation and localized production shocks. Because the PSA system already disseminates regional tables on production, area, and yield, the integrated framework proposed here could be extended to regional or provincial panels in order to identify where forecastability is high, where digit irregularities cluster, and where targeted verification would most improve planning value (Philippine Statistics Authority, 2021b).
Fourth, anomaly detection should be communicated with restraint. Benford deviations should trigger inquiry, not accusation. For official crop statistics, legitimate causes of unusual digit patterns may include aggregation procedures, small-number effects in disaggregated tables, revisions, crop-specific reporting conventions, or structural breaks associated with weather and policy shocks. A quality-assessment protocol should therefore pair Benford screening with contextual review rather than present it as stand-alone proof of error.
Finally, the integrated approach is useful beyond immediate forecasting accuracy. It strengthens the evidentiary basis for procurement planning, import timing, stock management, and early warning because it joins two questions that decision makers actually care about. Can the next few quarters be predicted with usable accuracy? And should the historical figures behind those predictions be trusted enough to guide action?

6. Conclusion

This paper synthesized three connected Philippine studies to show that agricultural forecasting and statistical quality assessment are best treated as complementary, not competing, approaches. The forecasting studies demonstrate that official rice and corn production series contain exploitable temporal structure and that model performance differs meaningfully across classical and machine-learning approaches. The Newcomb-Benford study demonstrates that official crop statistics should not be assumed to be analytically unproblematic simply because they are widely used. Together, these findings support a stronger methodological position for agricultural planning. Predictive usefulness must be interpreted alongside statistical integrity.
For Philippine food-security analytics, the practical implication is straightforward. Crop outlooks are most defensible when they emerge from a workflow that begins with data-quality screening, proceeds through transparent forecasting evaluation, and reports both the predictive results and any quality caveats that shape interpretation. Such a workflow does not weaken forecasting. It makes forecasting more credible as an input to public decision-making.
Future work can extend this framework by using regional crop series, linking forecasting with revision histories, incorporating weather and price covariates, and testing whether quality-screened series systematically produce more stable out-of-sample performance. Even without those extensions, the current synthesis shows that studying forecastability and data integrity together offers a more useful basis for rice and corn planning than either perspective in isolation.

References

  1. Benford, F. The law of anomalous numbers. Proceedings of the American Philosophical Society 1938, 78(4), 551–572.
  2. Brackstone, G.J. Managing data quality in a statistical agency. Survey Methodology 1999, 25(2), 139–149.
  3. Hanci, F. Application of Benford's law in agricultural production statistics. Journal of the National Science Foundation of Sri Lanka 2022, 50(2), 387–393.
  4. Hewamalage, H.; Ackermann, K.; Bergmeir, C. Forecast evaluation for data scientists: Common pitfalls and best practices. Data Mining and Knowledge Discovery 2023, 37(2), 788–832.
  5. Hill, T.P. A statistical derivation of the significant-digit law. Statistical Science 1995, 10(4), 354–363.
  6. International Monetary Fund. Data quality assessment framework and data quality program. 2003. Available online: https://www.imf.org/external/np/sta/dsbb/2003/eng/dqaf.htm.
  7. Judge, G.; Schechter, L. Detecting problems in survey data using Benford's law. Journal of Human Resources 2009, 44(1), 1–24.
  8. Kaiser, M. Benford's law as an indicator of survey reliability: Can we trust our data? Journal of Economic Surveys 2019, 33(5), 1602–1618.
  9. Parreño, S.J.E. Forecasting quarterly rice and corn production in the Philippines: A comparative study of Seasonal ARIMA and Holt-Winters models. ICTACT Journal on Soft Computing 2023, 14(2), 3224–3231.
  10. Parreño, S.J.E. Analyzing crop production statistics of the Philippines using the Newcomb-Benford law. Multidisciplinary Science Journal 2024, 6(6), e2024079.
  11. Parreño, S.J.E.; Anter, M.C.J. New approach for forecasting rice and corn production in the Philippines through machine learning models. Multidisciplinary Science Journal 2024, 6(9), e2024168.
  12. Philippine Statistics Authority. Palay Production Survey 2016. 2021a. Available online: https://psada.psa.gov.ph/catalog/113?vcode=iHfW.
  13. Philippine Statistics Authority. Corn Production Survey 2016. 2021b. Available online: https://psada.psa.gov.ph/catalog/140/related-materials?vcode=gXS4.
  14. van Klompenburg, T.; Kassahun, A.; Catal, C. Crop yield prediction using machine learning: A systematic literature review. Computers and Electronics in Agriculture 2020, 177, 105709.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.