3.1. Bibliometric Results
The review identifies 5,658 documents (96.4% scientific articles), with continuous growth since 2009 and a peak of 721 publications in 2024. In accordance with Lotka’s law [31], most author entries have up to two publications, whereas a minority accounts for the bulk of the scientific output.
To support the selection of the meteorological databases adopted in the present study, and to corroborate their relevance in the international literature, an additional screening of the full bibliometric corpus identifies the documents that explicitly refer to the seven databases evaluated here.
NREL (58 documents), ERA-5 (53 documents), and MERRA-2 (36 documents) are the most frequently identified databases in the international literature, reflecting their widespread use in solar resource assessment and PV generation studies. Representative examples include "Multicriteria GIS Modeling of Wind and Solar Farms in Colorado" for NREL [32], "Potential Assessment of Photovoltaic Power Generation in China" for ERA-5 [33], and "Long-Term Patterns of European PV Output Using 30 Years of Validated Hourly Reanalysis and Satellite Data" for MERRA-2 [34]. Among these, MERRA-2 is associated with the most highly cited representative paper in this subset, with 988 citations [34].
NASA POWER appears in 16 documents, all published between 2019 and 2025, which indicates more recent adoption, particularly in tropical and data-scarce regions; a representative example is "Spatial Forecasting of Solar Radiation Using ARIMA Model" [35]. NOAA appears in 26 documents spanning a wide temporal range; however, these mentions include references to NOAA products and instruments, and not exclusively to irradiance datasets for PV design, as illustrated by "Artificial Neural Network Based Daily Local Forecasting for Global Solar Radiation" [36]. INMET and CMA remain underrepresented in the global corpus, with only 2 and 4 identified documents, respectively, which is consistent with their more regional scope. Representative examples are "Analysis of Seasonal Aspects of Nebulosity on the Project of Fixed Photovoltaic Installations at the City of Belém, Brazil" for INMET [37] and "Constructing a Gridded Direct Normal Irradiance Dataset in China During 1981–2014" for CMA [38]. These results further support their inclusion in the present comparative analysis as geographically relevant alternatives to global reanalysis products.
3.2. Comparative Analysis by Database
The annual percentage errors in the estimated photovoltaic generation are summarized in the heat map shown in Figure 1, computed according to Equation (9). Before the annual comparison, all meteorological series are temporally harmonized to an hourly resolution following the temporal standardization procedure proposed in [6]. In addition, each photovoltaic system is simulated using the technical specifications of its respective reference installation, including the rated module power, module thermal coefficient, number of modules, inverter Maximum Power Point Tracking (MPPT) efficiency, and module Nominal Operating Cell Temperature (NOCT). This procedure ensures that the inter-database comparison reflects differences in the meteorological inputs rather than inconsistencies in the photovoltaic system parameterization.
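As a minimal sketch of the annual comparison, the snippet below computes a signed percentage error for each database. It assumes that Equation (9) has the usual form, (estimated − measured) / measured × 100, which is not reproduced in this section; the function name, the database labels, and all numeric values are hypothetical.

```python
def annual_percentage_error(estimated_kwh: float, measured_kwh: float) -> float:
    """Signed percentage error of an annual estimate relative to the measurement.

    Assumed form of the error metric: positive values mean overestimation.
    """
    if measured_kwh <= 0:
        raise ValueError("measured generation must be positive")
    return 100.0 * (estimated_kwh - measured_kwh) / measured_kwh


# Hypothetical annual totals in kWh, for illustration only.
measured_kwh = 8000.0
estimates_kwh = {"database_a": 8400.0, "database_b": 7600.0}

errors_pct = {
    name: annual_percentage_error(value, measured_kwh)
    for name, value in estimates_kwh.items()
}

# The database with the smallest absolute error is the best annual match.
best_database = min(errors_pct, key=lambda name: abs(errors_pct[name]))
```

Keeping the sign of the error separates overestimating databases from underestimating ones, while the absolute value is what determines which database is closest at a given site.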
Figure 1 shows that the estimation performance remains strongly site-dependent across Brazil. NASA POWER provides the lowest absolute annual error in Teófilo Otoni (MG), João Pessoa (PB), Marabá (PA), Parnamirim (RN), and Maratá (RS), with deviations ranging from
to
. NREL performs best in Olinda (PE), with an error of only
, while MERRA-2 yields the smallest deviation in Rio de Janeiro (RJ), with
. In contrast, INMET produces the lowest relative deviation in Florianópolis (SC) and Primavera do Leste (MT), although in both cases the remaining errors are still substantial, especially in Primavera do Leste, where even the best available database retains a deviation of
.
The Northeastern coastal sites reveal distinct behaviors among the products. In João Pessoa, NASA POWER is the most accurate database, with an error of , whereas NOAA overestimates generation by and both ERA-5 and MERRA-2 underestimate by nearly . In Parnamirim, NASA POWER again provides the closest annual estimate (), followed by CMA () and NREL (). In Olinda, however, NREL, CMA, and NOAA all show competitive behavior, with deviations between and , while ERA-5 and MERRA-2 exhibit markedly negative biases of approximately . These results indicate that no single database dominates the entire Northeastern region, even under relatively similar tropical coastal conditions.
The largest discrepancies are observed in Primavera do Leste (MT), which represents the most critical case in the present study. All databases show large positive deviations except INMET, with errors ranging from
for NASA POWER to
for NOAA. This systematic overestimation suggests that local atmospheric and land-use factors, including seasonal aerosol loading and biomass-burning influences, are not adequately represented by the evaluated large-scale products [9, 39]. In addition, the measured series was affected by operational interruptions of the PV system, with months of extremely low generation that may reflect not only shutdown periods but also the combined influence of adverse climatic conditions and partial system unavailability. Marília (SP) also shows large positive deviations for most gridded databases, particularly CMA (
), NOAA (
), MERRA-2 (
), and ERA-5 (
), while INMET strongly underestimates the annual generation (
). These two locations reinforce that annual agreement is highly sensitive to local representativeness and cannot be generalized across different climatic regions.
The INMET database exhibits the strongest negative bias overall, especially in Rio de Janeiro (RJ), where the error reaches , and in Primavera do Leste (MT) and Marília (SP), with deviations of and , respectively. These large underestimations may reflect structural limitations of isolated ground stations in regions with complex terrain, strong cloud variability, or limited spatial representativeness. Nevertheless, INMET performs competitively in Florianópolis (SC), where its error of is still the lowest among the available databases, and in Olinda (PE), where its negative deviation is lower than that observed for ERA-5 and MERRA-2.
CMA shows a marked tendency toward overestimation in most locations, including Marília (SP), Florianópolis (SC), Maratá (RS), Marabá (PA), and especially Primavera do Leste (MT), where the deviation reaches . The only site where CMA slightly underestimates generation is Olinda (PE), with . NOAA likewise exhibits a predominantly positive bias and produces the largest positive error in several cities, including Parnamirim (RN), Marabá (PA), and Rio de Janeiro (RJ). ERA-5 and MERRA-2 display very similar spatial behavior, alternating between moderate positive deviations in the South, Southeast, and Midwest and negative deviations in parts of the Northeast, which is consistent with their shared reanalysis-based character.
Good agreement in annual totals does not necessarily imply adequate month-by-month adherence. Positive and negative monthly deviations may compensate over the year, producing a small annual error even when the seasonal representation remains weak. For this reason, the annual heat map should be interpreted jointly with the financial analysis and the multicriteria ranking.
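The compensation effect can be illustrated with a short sketch. All monthly values below are hypothetical and chosen only to contrast two error patterns: one series whose monthly deviations cancel over the year, and one with a small but constant positive bias. The signed annual error again assumes the form (estimated − measured) / measured × 100.

```python
def annual_error_pct(estimated, measured):
    """Signed percentage error of the annual total (assumed metric form)."""
    return 100.0 * (sum(estimated) - sum(measured)) / sum(measured)


def monthly_mae(estimated, measured):
    """Mean absolute monthly deviation, in the same unit as the inputs (kWh)."""
    return sum(abs(e - m) for e, m in zip(estimated, measured)) / len(measured)


# Hypothetical measured monthly generation in kWh.
measured = [700, 720, 740, 700, 660, 620, 640, 680, 720, 760, 740, 720]

# Series A: large monthly deviations that cancel out over the year.
compensating_dev = [80, 70, 60, 40, -30, -80, -90, -70, -30, 10, 20, 20]
# Series B: small deviations, but always positive.
biased_dev = [15] * 12

compensating = [m + d for m, d in zip(measured, compensating_dev)]
biased = [m + d for m, d in zip(measured, biased_dev)]

for label, series in (("compensating", compensating), ("biased", biased)):
    print(label, round(annual_error_pct(series, measured), 2),
          round(monthly_mae(series, measured), 2))
```

In this constructed example, the compensating series has a zero annual error but the larger monthly deviation, whereas the biased series tracks each month more closely yet overestimates the annual total. Neither indicator alone is sufficient to rank the two series.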
Table 4 summarizes the distribution of annual percentage errors by database. Among the databases with complete coverage of all ten locations, NASA POWER presents the median closest to zero (median = 8.68%) and a comparatively moderate dispersion (IQR = 33.18), indicating the most balanced overall performance across the full sample. NOAA exhibits the lowest IQR among the databases with complete coverage (29.06), but also a high positive median (18.30%), revealing a consistent overestimation bias. NREL shows the lowest dispersion overall (IQR = 13.89), but only for six locations, which limits its spatial representativeness. CMA records the highest maximum error (86.71%) and one of the largest dispersions, while ERA-5 and MERRA-2 present very similar statistical behavior, both with moderate positive medians and large interquartile ranges. INMET stands out for the most consistent negative bias (median =
), confirming its structural tendency to underestimate generation in several of the evaluated sites.
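The median and interquartile range used in this kind of summary can be obtained as sketched below. The error values are hypothetical, and the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, since the convention behind Table 4 is not stated in this section; other conventions give slightly different IQR values for small samples.

```python
import statistics


def median_and_iqr(errors_pct):
    """Return (median, IQR) of a list of annual percentage errors."""
    q1, _q2, q3 = statistics.quantiles(errors_pct, n=4, method="inclusive")
    return statistics.median(errors_pct), q3 - q1


# Hypothetical annual percentage errors for one database at ten locations.
errors_pct = [-20.0, -5.0, 0.0, 5.0, 8.0, 10.0, 15.0, 25.0, 40.0, 60.0]

median_pct, iqr_pct = median_and_iqr(errors_pct)
print(f"median = {median_pct:.2f}%, IQR = {iqr_pct:.2f} percentage points")
```

The median indicates the direction of the typical bias, while the IQR measures how much the error varies between sites, so a database can combine a low IQR with a large systematic bias.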
3.3. AHP Multicriteria Analysis
Table 5 presents the AHP scores by city and database under the adopted weighting structure of 37.33% for MAPE, 19.71% for RMSE, 29.69% for
, 7.56% for NPV, and 5.71% for payback. Under these assumptions, NASA POWER ranks first in seven of the ten locations, with an average score of 87.2 and a minimum of 69.8, indicating the strongest overall multicriteria performance among the databases with complete spatial coverage. NREL attains the second-highest average score (73.5), although it is available for only six locations, whereas MERRA-2 and ERA-5 show intermediate average performance, with 58.8 and 58.7, respectively. INMET exhibits the largest spatial variability, ranging from 8.4 to 86.7, which reflects its strongly site-dependent behavior.
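The aggregation step can be sketched as a weighted sum of per-criterion scores. The weights below are those stated above; everything else is an assumption: the per-criterion scores are hypothetical values already normalized to a 0 to 100 scale (higher is better), the normalization itself is not shown, and the key `third_criterion` is a placeholder for the criterion whose name is not legible in this section.

```python
# Baseline weights from the text, expressed as fractions.
WEIGHTS = {
    "mape": 0.3733,
    "rmse": 0.1971,
    "third_criterion": 0.2969,  # placeholder name (assumption)
    "npv": 0.0756,
    "payback": 0.0571,
}


def ahp_score(criterion_scores: dict) -> float:
    """Weighted sum of normalized criterion scores (0 to 100, higher is better)."""
    return sum(WEIGHTS[name] * criterion_scores[name] for name in WEIGHTS)


# Hypothetical normalized scores for two databases at one location.
scores = {
    "database_a": {"mape": 90.0, "rmse": 85.0, "third_criterion": 88.0,
                   "npv": 60.0, "payback": 55.0},
    "database_b": {"mape": 70.0, "rmse": 72.0, "third_criterion": 68.0,
                   "npv": 95.0, "payback": 90.0},
}

# Rank the databases from highest to lowest aggregated score.
ranking = sorted(scores, key=lambda name: ahp_score(scores[name]), reverse=True)
```

Because the three statistical criteria carry about 87% of the total weight, a database with strong statistical agreement ranks first in this sketch even when its financial indicators are weaker.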
Figure 2 consolidates the results by counting the number of locations in which each database ranks first in each of the three evaluated criteria, whereas Figure 3 details the geographic distribution of those leads by location and criterion.
Under the adopted weighting structure, NASA POWER clearly dominates the consolidated ranking with 19 first-place results: 5 in generation accuracy, 7 in financial adherence, and 7 in AHP score. It leads in generation accuracy in Teófilo Otoni/MG, João Pessoa/PB, Marabá/PA, Parnamirim/RN, and Maratá/RS, and in both financial adherence and AHP score in the same seven locations: Teófilo Otoni/MG, Marília/SP, João Pessoa/PB, Marabá/PA, Parnamirim/RN, Primavera do Leste/MT, and Maratá/RS. This result indicates that NASA POWER provides the most consistent joint performance across the adopted statistical, financial, and multicriteria summaries.
INMET ranks second overall with 5 first-place results, concentrated in generation accuracy in Marília/SP, Florianópolis/SC, and Primavera do Leste/MT, as well as in financial adherence and AHP score in Florianópolis/SC. NREL and MERRA-2 follow with 3 first-place results each, but in much more localized patterns. NREL concentrates all three of its leads in Olinda/PE, where it ranks first simultaneously in generation accuracy, financial adherence, and AHP score. MERRA-2 shows the same threefold leadership in Rio de Janeiro/RJ. By contrast, CMA, ERA-5, and NOAA do not rank first in any of the three consolidated criteria, despite showing competitive performance in specific cities. To verify whether this multicriteria ranking remains stable under alternative weighting assumptions, the sensitivity analysis described in Section 2.4 is applied to the baseline AHP model.
Figure 4 compares the average AHP score profiles obtained under the four alternative weighting scenarios against the baseline ranking, whereas Table 6 summarizes the one-at-a-time perturbation analysis around the baseline weight vector.
Figure 4 shows that the weighting structure affects the relative distances among databases and may also modify the top-ranked alternative under sufficiently different priority schemes. NASA POWER remains the highest-ranked database in the baseline, technically oriented, balanced, and statistically oriented scenarios, with average AHP scores of 87.19, 87.06, 75.99, and 88.32, respectively. However, under the economically oriented scenario, NREL becomes the top-ranked database, with an average score of 73.02, while NOAA and CMA also gain relative competitiveness. These results suggest that NASA POWER exhibits the most robust overall multicriteria performance across the tested scenarios, although its leadership is not strictly preserved when the weighting structure strongly favors the economic criteria.
The one-at-a-time sensitivity analysis summarized in Table 6 further confirms this robustness. When each baseline weight is independently varied from
to
, NASA POWER remains the top-ranked database in terms of average AHP score in all perturbations. This result indicates that the main multicriteria conclusion of the study is stable not only across predefined alternative scenarios, but also under continuous local variations around the baseline AHP weight vector.
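A one-at-a-time perturbation of this kind can be sketched as follows. The perturbation factors (0.8 and 1.2), the renormalization of the weight vector after each change, the per-criterion scores, and the key `third_criterion` are all assumptions made for illustration; the procedure of Section 2.4 may differ in its details.

```python
BASELINE_WEIGHTS = {
    "mape": 0.3733, "rmse": 0.1971, "third_criterion": 0.2969,
    "npv": 0.0756, "payback": 0.0571,
}

# Hypothetical average normalized scores (0 to 100, higher is better).
SCORES = {
    "database_a": {"mape": 90.0, "rmse": 85.0, "third_criterion": 88.0,
                   "npv": 60.0, "payback": 55.0},
    "database_b": {"mape": 70.0, "rmse": 72.0, "third_criterion": 68.0,
                   "npv": 95.0, "payback": 90.0},
}


def perturb(weights, criterion, factor):
    """Scale one weight by `factor`, then renormalize so the weights sum to 1."""
    changed = dict(weights)
    changed[criterion] = weights[criterion] * factor
    total = sum(changed.values())
    return {name: value / total for name, value in changed.items()}


def top_ranked(weights, scores):
    """Return the database with the highest weighted score."""
    totals = {
        db: sum(weights[c] * crit[c] for c in weights)
        for db, crit in scores.items()
    }
    return max(totals, key=totals.get)


# Vary each weight independently and record the resulting leader.
leaders = {
    (criterion, factor): top_ranked(
        perturb(BASELINE_WEIGHTS, criterion, factor), SCORES)
    for criterion in BASELINE_WEIGHTS
    for factor in (0.8, 1.2)  # assumed perturbation range
}
ranking_is_stable = len(set(leaders.values())) == 1
```

If the same database leads in every perturbed case, the ranking is locally stable around the baseline weights; a change of leader would identify the criterion to which the conclusion is most sensitive.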
The financial impact of meteorological database choice is illustrated in Figure 5. Primavera do Leste/MT and Rio de Janeiro/RJ provide complementary examples of how the multicriteria ranking should be interpreted. In Primavera do Leste, NASA POWER ranks first in the AHP analysis, but it estimates an annual generation of 22,548.69 kWh compared with the measured 13,533.37 kWh, corresponding to an error of
, a projected payback of 3.04 years, and a 25-year NPV of BRL 115,733.96, compared with the observed values of 5.06 years and BRL 46,176.50; by contrast, INMET yields 8,847.74 kWh, presenting the smallest annual deviation at this location (
), but still produces a markedly conservative estimate, with a payback of 7.74 years and an NPV of BRL 10,024.60. This wide range suggests that the divergence in Primavera do Leste is related not only to meteorological representation, but also to the low generation observed in some months, possibly associated with a combination of adverse climatic conditions and partial operational unavailability of the system, as shown in Figure 6.
In Rio de Janeiro/RJ, by contrast, the closest agreement with the measured system is achieved by MERRA-2, which estimates 8,082.51 kWh against the measured 8,175.53 kWh (
), with a payback of 2.54 years versus the observed 2.51 years and an NPV of BRL 44,916.46 compared with the recorded BRL 45,634.17, as shown in Figure 6. Taken together, these cases show that NASA POWER is the most robust database overall; however, leadership in an aggregate multicriteria evaluation does not necessarily translate into the most accurate financial projection for every location [9, 10, 11].
3.4. Case Study: System Installed in Northeastern Brazil
The 5.50 kWp photovoltaic system installed in João Pessoa, in Northeastern Brazil, was originally designed and installed in a commercial context. The original sizing adopted the simplified method described in Equation (1), based on the solar irradiance data provided by CRESESB. For the analyzed system, the commercial proposal estimated an average monthly generation of 762 kWh, equivalent to 9,144 kWh/year, annual savings of BRL 7,680.96 (approximately USD 1,524.94 at the exchange rate of May 22, 2026), and a simple payback of 3.13 years for an investment of BRL 24,062.10 (approximately USD 4,777.17 at the same exchange rate). The analysis compares three scenarios for the years 2023, 2024, and 2025, adopted as the temporal horizon because they correspond to a period close to the preparation of the proposal in August 2022: (i) the simplified method; (ii) higher-temporal-granularity modeling; and (iii) generation measured by the inverter through the manufacturer’s monitoring platform.
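The financial indicators of the commercial proposal can be checked with a short sketch. Dividing the investment by the annual savings reproduces the 3.13-year simple payback quoted above. The NPV function is a generic constant-cash-flow formulation; the 8% discount rate is a hypothetical value, and the sketch ignores module degradation and tariff adjustments, so its result is not expected to match the NPV figures reported in this study.

```python
def simple_payback_years(investment: float, annual_savings: float) -> float:
    """Simple (undiscounted) payback: investment divided by annual savings."""
    if annual_savings <= 0:
        raise ValueError("annual savings must be positive")
    return investment / annual_savings


def net_present_value(investment: float, annual_savings: float,
                      discount_rate: float, years: int) -> float:
    """NPV of a constant annual saving, discounted at the end of each year."""
    discounted = sum(
        annual_savings / (1.0 + discount_rate) ** t
        for t in range(1, years + 1)
    )
    return discounted - investment


# Values taken from the commercial proposal described in the text (BRL).
INVESTMENT_BRL = 24062.10
ANNUAL_SAVINGS_BRL = 7680.96

payback = simple_payback_years(INVESTMENT_BRL, ANNUAL_SAVINGS_BRL)

# Hypothetical discount rate (assumption); 25-year horizon as used for NPV.
ASSUMED_DISCOUNT_RATE = 0.08
HORIZON_YEARS = 25
npv = net_present_value(INVESTMENT_BRL, ANNUAL_SAVINGS_BRL,
                        ASSUMED_DISCOUNT_RATE, HORIZON_YEARS)

print(f"simple payback = {payback:.2f} years")
```

Because both indicators scale with the annual savings, any error in the estimated generation propagates directly into the payback and NPV estimates, which is why the annual generation error matters for the investment decision.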
The location of the case-study system is shown in Figure 7, highlighting its position in the tropical coastal zone of Northeastern Brazil. Monthly generation patterns are first compared in Figure 8, Figure 9 and Figure 10, whereas the resulting annual and financial indicators are summarized in Table 7 and Figure 11. This comparative structure is also consistent with previous applications of the same methodological approach, in which simplified market-based sizing was contrasted with higher-temporal-granularity technical modeling and real-world measured system performance [40].
This geographic setting is relevant because the coastal tropical environment provides additional context for interpreting the behavior of the evaluated meteorological databases in the case study.
Figure 8, Figure 9 and Figure 10 help explain why annual agreement and month-by-month adherence do not necessarily coincide. In 2023, NASA POWER provides the best overall agreement among the available databases, combining the smallest annual deviation (
) with the lowest mean absolute monthly deviation (MAE = 45.63 kWh). By contrast, NOAA shows a systematic positive bias throughout the year, resulting in an annual overestimation of
, while NREL presents a pronounced monthly inconsistency that substantially degrades its month-by-month adherence.
The contrast becomes clearer in 2024. In this year, CMA remains closer to the measured series in several individual months and yields a mean absolute monthly deviation (45.86 kWh) slightly lower than that of NASA POWER (49.33 kWh). However, CMA exhibits predominantly positive deviations over the year and therefore overestimates the annual total by . NASA POWER, despite larger deviations in some individual months, alternates positive and negative errors more effectively and achieves a much closer annual result, with a deviation of only . This case directly illustrates the temporal compensation effect discussed in this study.
The 2025 results provide the complementary situation. NASA POWER again shows the best month-by-month adherence among the complete datasets, with MAE = 67.24 kWh, whereas MERRA-2 and ERA-5 achieve annual totals that are slightly closer to the measured system, with deviations of and , respectively, against for NASA POWER. Thus, a closer annual aggregate does not necessarily imply the best monthly tracking. In addition, the INMET series is only partially available in 2025 and therefore should be interpreted with caution, while NREL is omitted because no valid monthly series is available for that year.
Taken together, these three yearly comparisons reinforce the central interpretation of the case study: the most appropriate database cannot be selected solely from annual totals or solely from month-by-month adherence. NASA POWER stands out for its more balanced and robust behavior across the three years, particularly when annual consistency and temporal representation are interpreted jointly, which supports its adoption as the comparative basis for the higher-temporal-granularity modeling conducted in this work.
As shown in Table 7, the simplified CRESESB-based method produces a fixed annual generation estimate of 9,144.00 kWh regardless of the year under analysis. This structural characteristic reflects its static nature: it relies on a single irradiance reference value and therefore cannot capture interannual climatic variability. In 2023, this results in a generation overestimation of
, a payback underestimated by
, and an NPV overestimated by BRL 2,540.94 (approximately USD 506.77 at the exchange rate of May 22, 2026), corresponding to
. In 2024, the corresponding deviations are
,
, and BRL 2,372.43 (approximately USD 473.16), equivalent to
. In 2025, when the measured system records its lowest annual generation across the three-year period (8,118.20 kWh), the limitations of the simplified method become even more evident: the generation deviation reaches
, the payback is underestimated by
, and the NPV is overestimated by BRL 5,759.33 (approximately USD 1,148.65), corresponding to
. This is the largest financial distortion observed in the case study.
NASA POWER, used as the climatic database in the higher-temporal-granularity model, yields year-specific estimates of 8,365.21, 8,420.05, and 8,460.41 kWh for 2023, 2024, and 2025, respectively, thereby following the interannual variation of the measured system more closely. Its generation deviations remain within and , with the 2024 result being particularly noteworthy, as the annual deviation is only . The corresponding NPV deviations range from to , equivalent to BRL 1,335.01 (approximately USD 266.26) in 2023, BRL 1,085.38 (approximately USD 216.47) in 2024, and BRL 2,609.28 (approximately USD 520.40) in 2025, consistently smaller than those obtained with the simplified method.
This contrast illustrates one of the central contributions of the present study: whereas the simplified method propagates the same annual generation assumption to all years and, consequently, to all financial indicators, the higher-temporal-granularity model captures year-to-year changes in generation potential by accounting for hourly irradiance variability, ambient and module temperature effects, and module-specific technical parameters. As a result, it reduces the uncertainty associated with payback and NPV estimates and improves the reliability of investment decisions based on pre-installation energy assessments.