5. Hierarchical Clustering Results and Financial Structure Segmentation
The normalized indicators reveal a clear trade-off between compactness, separation, and global structure across the algorithms considered. Density-Based clustering attains the top normalized score (1.000) for maximum diameter, minimum separation, and the Dunn index, which points to compact and well-separated clusters, but the lowest score (0.000) for Pearson’s gamma, entropy, and the Calinski-Harabasz index, which points to weak global structure and unbalanced partitions. Hierarchical clustering attains the best Pearson’s gamma (1.000) together with the second-highest Dunn index (0.392) and a moderate entropy (0.616), which indicates good global ordering and acceptable internal validity, although it does not lead on separation or compactness. k-Means attains the best Calinski-Harabasz index (1.000) and a high Pearson’s gamma (0.710), which indicates good separation in terms of global variance, but it scores poorly on the Dunn index (0.088) and minimum separation (0.036), which indicates weak separation between clusters in the data space. Model-based and Random Forest clustering show intermediate profiles: both are close to the best on entropy (0.984) and moderate on maximum diameter (0.500 and 0.360), but low on minimum separation and the Dunn index (below 0.07 in each case). Fuzzy C-Means scores poorly on every separation and compactness indicator (0.135 for maximum diameter, 0.000 for minimum separation and the Dunn index) despite the best entropy (1.000), which makes its overall quality difficult to assess. Taken together, hierarchical clustering offers the best balance: it combines the highest Pearson’s gamma with acceptable scores on the remaining indicators and has the highest worst-case score of any method (0.099), which makes it the best compromise in terms of clustering validity.
| Indicator | Density Based | Fuzzy C-Means | Hierarchical | Model Based | k-Means | Random Forest |
| --- | --- | --- | --- | --- | --- | --- |
| Maximum diameter | 1.000 | 0.135 | 0.099 | 0.500 | 0.000 | 0.360 |
| Minimum separation | 1.000 | 0.000 | 0.192 | 0.051 | 0.036 | 0.049 |
| Pearson’s γ | 0.000 | 0.383 | 1.000 | 0.512 | 0.710 | 0.235 |
| Dunn index | 1.000 | 0.000 | 0.392 | 0.059 | 0.088 | 0.065 |
| Entropy | 0.000 | 1.000 | 0.616 | 0.984 | 0.911 | 0.984 |
| Calinski–Harabasz | 0.000 | 0.505 | 0.479 | 0.578 | 1.000 | 0.421 |
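One way to make the "best compromise" reading concrete is to summarize each method by its weakest and its average normalized score. The sketch below does this for the values in the table above. It assumes that every normalized indicator is oriented so that higher is better, which is how the text reads them, and the maximin rule it applies is an illustrative assumption, not a criterion stated in the paper.

```python
# Normalized validity indicators copied from the table above
# (list positions follow the table's order of methods).
METHODS = ["Density Based", "Fuzzy C-Means", "Hierarchical",
           "Model Based", "k-Means", "Random Forest"]
SCORES = {
    "Maximum diameter":   [1.000, 0.135, 0.099, 0.500, 0.000, 0.360],
    "Minimum separation": [1.000, 0.000, 0.192, 0.051, 0.036, 0.049],
    "Pearson's gamma":    [0.000, 0.383, 1.000, 0.512, 0.710, 0.235],
    "Dunn index":         [1.000, 0.000, 0.392, 0.059, 0.088, 0.065],
    "Entropy":            [0.000, 1.000, 0.616, 0.984, 0.911, 0.984],
    "Calinski-Harabasz":  [0.000, 0.505, 0.479, 0.578, 1.000, 0.421],
}


def summarize(scores, methods):
    """Return {method: (worst score, mean score)} across all indicators."""
    summary = {}
    for col, method in enumerate(methods):
        values = [row[col] for row in scores.values()]
        summary[method] = (min(values), sum(values) / len(values))
    return summary


summary = summarize(SCORES, METHODS)

# Maximin rule (an assumption for illustration): prefer the method
# whose weakest indicator is the highest.
best_by_worst_case = max(summary, key=lambda m: summary[m][0])

for method, (worst, mean) in summary.items():
    print(f"{method:15s} worst={worst:.3f} mean={mean:.3f}")
print("Best worst-case score:", best_by_worst_case)
```

Under this rule the ranking depends on the weakest indicator rather than on the average, so a method that excels on three indicators but scores zero on the others is penalized. A plain average would order the methods differently, which is why the choice of aggregation rule should be stated explicitly.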
From the hierarchical clustering results, it is evident that there is a significant unevenness in the structure of the clusters, with one major cluster being much larger than the rest. Specifically, cluster 1, with 186 observations, represents the majority of the dataset, while the rest of the clusters are significantly smaller, with some of them having only one observation each, i.e., clusters 5 and 10. This suggests that, in the multi-dimensional space defined by VTX, DBS, REM, MCX, and IPU, most of the units share similar financial and market characteristics, while a small number of units have unique combinations of these variables, requiring smaller clusters. This interpretation is supported by the results in terms of the explained within-cluster heterogeneity. Specifically, cluster 1 captures approximately 69.5% of the total within-cluster heterogeneity, indicating that most of the total heterogeneity of the dataset is contained within this major group. Clusters 2, 3, and 4 have smaller proportions, ranging approximately from 7% to 13%, while the rest have negligible proportions. This implies that most of the structure of the dataset is driven by a small number of broad partitions, with cluster 1 being the largest, reflecting what can be defined as “typical” in terms of combinations of stock market breadth (VTX), banking sector size vis-à-vis central bank size (DBS), remittances (REM), market concentration (MCX), and international public debt (IPU). Further insights can be gained with reference to the within-cluster sum of squares. Cluster 1, which consists of the highest number of observations, has the highest value of the within-cluster sum of squares, which is naturally related to the cluster’s size, but it is also an indication of the cluster’s internal dispersion. 
This implies that, although the observations in this cluster are closer to one another than to those in other clusters, they still span a wide range of financial structures and levels of market development. By contrast, the smaller clusters have much lower within-cluster sums of squares, in some cases close to zero, either because their observations are tightly homogeneous or, for the single-observation clusters, because internal variation is absent by definition. The smaller clusters therefore capture highly specific financial structures, possibly those of outliers in terms of market concentration, remittance dependence, and exposure to international public debt. The silhouette score offers a complementary view of cluster quality and separation. Cluster 1 has a relatively low silhouette value of 0.271, which implies that a large proportion of its observations lie close to the boundaries with neighboring clusters. This is consistent with a large and relatively heterogeneous cluster that shades gradually into the others. The silhouette values for clusters 2, 3, and 4 are somewhat higher (0.407, 0.310, and 0.339), reflecting slightly better cohesion and separation, although not at a high level. The values for the smaller clusters 6, 7, and 8 are higher still (0.646, 0.607, and 0.530), and cluster 9 reaches 0.905, reflecting strong separation from the rest of the observations and high internal cohesion. The single-observation clusters 5 and 10 are reported with a silhouette of 0.000, since cohesion cannot be assessed for a singleton.
On the whole, hierarchical clustering indicates that the data has a core-periphery structure. The majority of the observations fall into a large, somewhat diffuse core cluster that is characterized by broadly similar combinations of VTX, DBS, REM, MCX, and IPU, while a few units fall into sharply distinct groups with very specific profiles. From the point of view of economic and financial analysis, this means that while a large number of countries or units display broadly similar patterns of financial development or market structures, there are nevertheless significant exceptions that display markedly different characteristics in terms of market concentration, dependence on remittance flows, the structure of the banking system, or the reliance on international public debt, etc., that are sufficiently different to justify their isolation into distinct clusters.
| Cluster | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Size | 186 | 28 | 42 | 24 | 1 | 6 | 9 | 5 | 5 | 1 |
| Explained proportion within-cluster heterogeneity | 0.695 | 0.071 | 0.134 | 0.072 | 0.000 | 0.008 | 0.009 | 0.010 | 6.498×10⁻⁴ | 0.000 |
| Within sum of squares | 331.713 | 33.816 | 64.125 | 34.549 | 0.000 | 3.664 | 4.416 | 4.755 | 0.310 | 0.000 |
| Silhouette score | 0.271 | 0.407 | 0.310 | 0.339 | 0.000 | 0.646 | 0.607 | 0.530 | 0.905 | 0.000 |
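The quantities discussed above can be cross-checked directly from the table. The following sketch assumes that the "explained proportion" row is each cluster's share of the total within-cluster sum of squares, an assumption that the reported figures appear to satisfy. It also computes the within sum of squares per observation, to separate the effect of cluster size from internal dispersion, and a size-weighted average silhouette, which equals the overall average only if each cluster score is the mean over that cluster's members.

```python
# Values copied from the table above, clusters 1..10 in order.
SIZES = [186, 28, 42, 24, 1, 6, 9, 5, 5, 1]
WSS = [331.713, 33.816, 64.125, 34.549, 0.000,
       3.664, 4.416, 4.755, 0.310, 0.000]
SILHOUETTE = [0.271, 0.407, 0.310, 0.339, 0.000,
              0.646, 0.607, 0.530, 0.905, 0.000]

n_total = sum(SIZES)
wss_total = sum(WSS)

# Share of observations and share of total within-cluster sum of squares.
size_share = [n / n_total for n in SIZES]
wss_share = [w / wss_total for w in WSS]

# Dispersion per observation removes the mechanical effect of cluster size.
wss_per_obs = [w / n for w, n in zip(WSS, SIZES)]

# Size-weighted average of the cluster-level silhouette scores.
weighted_silhouette = sum(n * s for n, s in zip(SIZES, SILHOUETTE)) / n_total

for k in range(len(SIZES)):
    print(f"cluster {k + 1:2d}: size share={size_share[k]:.3f} "
          f"WSS share={wss_share[k]:.3f} WSS/obs={wss_per_obs[k]:.3f}")
print(f"size-weighted silhouette: {weighted_silhouette:.3f}")
```

On these figures cluster 1 remains the most dispersed cluster even after dividing by its size, so its large within sum of squares is not only a consequence of containing most of the observations.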
The cluster centroids derived from the hierarchical clustering process provide a rich representation of how differentiated groups of observations can be defined based on their characteristics along dimensions of stock market structure, banking system composition, external financial flows, market concentration, and international public debt reliance. The standardized nature of the variables means that positive and negative values indicate deviations from the sample mean, facilitating cross-cluster comparison of relative positions. For Cluster 1, which also represents the largest proportion of the sample, there is a moderately positive value for VTX and a slightly positive value for REM, along with slightly negative values for DBS and MCX, with IPU close to zero. This represents countries with somewhat broader and more diversified trading activity within the stock market beyond the dominant firms, along with a somewhat smaller role played by deposit-taking banks compared with the central bank. There is a slightly lower level of market concentration and a level of reliance on international public debt close to the mean. This can be interpreted as a mainstream financial structure, with balanced characteristics and no extreme positions along the dimensions considered. Cluster 2, on the other hand, shows a very different profile, with a very positive value for DBS and negative values for REM and IPU, with VTX close to zero and MCX slightly positive. This represents financial systems with dominant deposit-taking banks compared with the role of the central bank within the financial sector, with lower levels of remittance inflows, and a lower reliance on international public debt markets. The slightly positive value for MCX also suggests somewhat higher levels of diversification within the equity market compared with the mean. Cluster 3 is characterized by negative values of VTX and DBS, and positive values of REM and IPU, as well as negative values of MCX. 
The characteristics of this cluster indicate that the stock exchange is narrow and concentrated and that banking is relatively less developed, while remittances and international public debt markets play an important role in the overall economy. Cluster 4 is differentiated by its high value of MCX (2.251) and positive values of VTX and DBS, together with negative values of REM and IPU. These characteristics indicate that the stock exchange is highly diversified and extends beyond the largest firms and that banking is relatively more developed, while remittances and international public debt markets play a relatively less important role. Clusters 5, 8, and 10, by contrast, represent more extreme and specialized systems. In Cluster 5, MCX and IPU are extremely high (3.353 and 2.360), DBS is positive, and REM is strongly negative. This suggests a financial structure with highly diversified equity markets, a strong banking system, high reliance on international public debt, and minimal significance of remittances. In Cluster 8, MCX is extremely high (3.791), REM is strongly negative, DBS is positive, and VTX is moderately positive, which again points to a highly market-oriented system with minimal dependence on remittances. In Cluster 10, DBS and REM are exceptionally high (3.412 and 2.168) while IPU is negative, so the banking sector is dominant, remittances are highly significant, and reliance on international public debt is relatively modest. Clusters 6 and 9 contain the most extreme negative values of VTX (-2.357 and -4.729), but they differ in most other respects.
Cluster 9 combines its extremely narrow stock market with below-average DBS and high REM and IPU (1.938 and 1.751), which describes economies with less dominant banking systems and a strong dependence on external financial flows from both private and public sources. Cluster 6, in contrast, has strongly negative REM and IPU (-1.964 and -1.604) and DBS close to the mean, so its narrow stock market coincides with low reliance on remittances and international public debt. In conclusion, the cluster means indicate that the countries fall into distinct categories of financial development and external financial integration, ranging from market-based diversified systems to more externally dependent and structurally concentrated systems.
| Cluster | VTX | DBS | REM | MCX | IPU |
| --- | --- | --- | --- | --- | --- |
| Cluster 1 | 0.405 | -0.360 | 0.085 | -0.315 | 0.107 |
| Cluster 2 | 0.003 | 2.275 | -0.479 | 0.249 | -0.777 |
| Cluster 3 | -1.328 | -0.729 | 0.977 | -0.631 | 0.835 |
| Cluster 4 | 0.490 | 0.545 | -0.473 | 2.251 | -0.379 |
| Cluster 5 | 0.128 | 0.957 | -1.201 | 3.353 | 2.360 |
| Cluster 6 | -2.357 | 0.145 | -1.964 | -0.310 | -1.604 |
| Cluster 7 | 0.387 | 1.469 | -2.187 | 0.713 | -1.723 |
| Cluster 8 | 0.469 | 1.375 | -2.244 | 3.791 | -1.749 |
| Cluster 9 | -4.729 | -0.889 | 1.938 | -0.655 | 1.751 |
| Cluster 10 | 0.495 | 3.412 | 2.168 | 0.412 | -1.358 |
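Because the variables are standardized, the size-weighted average of the cluster means should be close to zero for every variable, which gives a quick consistency check on the table. The sketch below runs that check using the cluster sizes reported earlier in this section, and then lists, for each cluster, the variables whose mean lies at least one standard deviation from the sample mean. The one-standard-deviation threshold and the helper names are illustrative assumptions, not part of the original analysis.

```python
VARIABLES = ["VTX", "DBS", "REM", "MCX", "IPU"]
# Cluster sizes as reported for the ten hierarchical clusters.
SIZES = [186, 28, 42, 24, 1, 6, 9, 5, 5, 1]
# Standardized cluster means copied from the table above (clusters 1..10).
CENTROIDS = [
    [0.405, -0.360, 0.085, -0.315, 0.107],
    [0.003, 2.275, -0.479, 0.249, -0.777],
    [-1.328, -0.729, 0.977, -0.631, 0.835],
    [0.490, 0.545, -0.473, 2.251, -0.379],
    [0.128, 0.957, -1.201, 3.353, 2.360],
    [-2.357, 0.145, -1.964, -0.310, -1.604],
    [0.387, 1.469, -2.187, 0.713, -1.723],
    [0.469, 1.375, -2.244, 3.791, -1.749],
    [-4.729, -0.889, 1.938, -0.655, 1.751],
    [0.495, 3.412, 2.168, 0.412, -1.358],
]


def weighted_means(centroids, sizes):
    """Size-weighted average of the cluster means, one value per variable."""
    total = sum(sizes)
    n_vars = len(centroids[0])
    return [sum(n * row[j] for row, n in zip(centroids, sizes)) / total
            for j in range(n_vars)]


def salient(row, names, threshold=1.0):
    """Variables whose cluster mean is at least `threshold` SDs from zero."""
    return {name: z for name, z in zip(names, row) if abs(z) >= threshold}


grand_means = weighted_means(CENTROIDS, SIZES)
profiles = {k + 1: salient(row, VARIABLES) for k, row in enumerate(CENTROIDS)}

print("weighted means:", [round(m, 4) for m in grand_means])
for cluster, profile in profiles.items():
    print(f"cluster {cluster:2d}: {profile if profile else 'no extreme values'}")
```

Listing only the salient variables makes the contrast between clusters easier to read: cluster 1 has no variable beyond the threshold, while clusters 6 and 9 share a strongly negative VTX but have opposite signs on REM and IPU.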
The figure provides a visual summary of the hierarchical clustering results for the model that includes VTX, DBS, REM, MCX, and IPU, bringing together cluster selection, cluster structure, and economic interpretation. Panel A shows how the information criteria and the within-cluster sum of squares evolve with the number of clusters in the partition. The downward trend reflects the improvement in goodness of fit as clusters are added, while the highlighted minimum marks the optimal number of clusters, close to ten, at which the model balances goodness of fit against parsimony. This result supports the selection of a rich cluster structure that can capture the heterogeneity in the data set. Panel B plots the clustered observations in a reduced-dimensional space, with colors distinguishing the clusters. The clear differentiation between some of the clusters supports the view that the hierarchical algorithm does not partition the sample mechanically but recognizes patterns in the data. Some clusters are compact and well differentiated, while others overlap, consistent with a large, heterogeneous core group alongside smaller, more differentiated groups. Panel C presents the dendrogram, which shows how observations and groups are merged step by step: broad branches correspond to more general patterns and smaller branches to more idiosyncratic ones. The height at which two branches merge represents the distance between the groups, and the presence of several long vertical jumps suggests that the clusters differ genuinely in their underlying financial and market characteristics.
Panel D displays the standardized cluster means, which underpin the economic interpretation of the clusters. The clusters differ visibly in stock market breadth (VTX), the relative importance of deposit money banks (DBS), remittance inflows (REM), market concentration (MCX), and the relative importance of international public debt (IPU). Some clusters are characterized by diversified structures, with high values of MCX and DBS and low values of REM, while others show the opposite pattern: narrow structures, with low values of MCX and DBS and high values of REM and IPU. The extreme positive or negative values in some clusters indicate very specific financial structures, consistent with the small, well-separated groups in the other panels. Overall, the figure suggests that hierarchical clustering detects a rich segmentation in the data, in which a large group of similar observations coexists with a few smaller, more differentiated clusters that differ in financial development, market structure, and external financial integration, as captured by the five underlying variables.
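The dendrogram logic described for Panel C, namely merge heights and the cut that yields a given number of clusters, can be illustrated with a small self-contained sketch. It runs on synthetic two-dimensional points, not on the paper's data, and it assumes Euclidean distance with average linkage; the text does not state which distance or linkage was used, so both are placeholders. The implementation is deliberately naive and suited only to small samples.

```python
import random
from itertools import combinations


def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def average_linkage(points):
    """Naive agglomerative clustering with average linkage.

    Returns (heights, partitions): heights[i] is the distance at which the
    i-th merge happens, and partitions[k] is the partition into k clusters.
    """
    clusters = [[i] for i in range(len(points))]
    heights = []
    partitions = {len(clusters): [c[:] for c in clusters]}
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            dist = sum(euclidean(points[i], points[j])
                       for i, j in pairs) / len(pairs)
            if best is None or dist < best[0]:
                best = (dist, a, b)
        dist, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters)
                    if k not in (a, b)] + [merged]
        heights.append(dist)
        partitions[len(clusters)] = [c[:] for c in clusters]
    return heights, partitions


# Synthetic data (NOT the paper's data): three tight groups and one outlier.
rng = random.Random(0)
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in centers for _ in range(10)]
points.append((30.0, 30.0))

heights, partitions = average_linkage(points)
# Differences between consecutive merge heights: large values correspond
# to the long vertical jumps in a dendrogram.
jumps = [later - earlier for earlier, later in zip(heights, heights[1:])]
sizes_at_4 = sorted(len(c) for c in partitions[4])

print("cluster sizes when cut at k=4:", sizes_at_4)
print("largest jump in merge height:", round(max(jumps), 3))
```

In this toy example the isolated point stays in its own cluster until the final merge, which mirrors how single-observation clusters such as clusters 5 and 10 can arise when a unit lies far from every other group.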

The resulting scatter plot matrix provides a detailed view of the relationships between these five standardized financial variables, VTX, DBS, REM, MCX, and IPU, with the data points colored according to their clusters determined using hierarchical clustering. This form of visualization is useful not only for evaluating the internal consistency of these clusters but also for understanding the ways in which distinct financial dimensions interact with each other. Several distinct financial relationships can be determined. For instance, there is a strong positive relationship between MCX and VTX, indicating that an expansion of stock exchange trading activities beyond the largest firms is related to a less concentrated and more diversified financial system. This is true across all clusters, with some clusters residing at more extreme levels. In terms of DBS, there is a more nuanced relationship. Clusters with high levels of DBS, indicating a stronger financial system dominated by deposit money banks relative to the central bank, are typically found at moderate to high levels of MCX and low levels of REM. This indicates a stronger domestic capital market and less reliance upon remittance flows. By contrast, clusters with low DBS levels, indicating a less important role played by domestic banking, are more frequently associated with high levels of REM and, at times, high levels of IPU. From the REM panels, we can see that there is an obvious distinction between clusters that have high remittance dependence and those that have relatively lower remittances. Clusters that have high REM values tend to be located in areas that have lower values of VTX and MCX, indicating that they have relatively narrower and more concentrated market activity and tend to have higher IPU values, indicating more dependence on international public debt. 
Again, this suggests that remittance-dependent countries tend to have less developed domestic financial markets and more developed links with external sources of finance. The IPU relationships also support this view of segmentation into remittance-dependent and less remittance-dependent countries. Clusters with high IPU values tend to group together with high REM values and lower values of VTX and MCX, while clusters with low IPU values tend more often to group together with stronger values of domestic market activity and DBS. The colored point clouds of each of the variables also suggest that the hierarchical clustering has successfully captured significant multivariate structure and not simply arbitrary groupings of countries. Each of the clusters is located in relatively distinct areas of the variable space, with considerable overlap in the central or more “average” areas of each distribution. In summary, the scatterplot matrix of the data set visually confirms that the hierarchical clustering has successfully captured significant and coherent financial and market structure, distinguishing between market-oriented systems that have well-developed banking systems and relatively well-developed equity markets and systems that tend to be more externally driven and have relatively higher remittances, relatively higher values of international public debt, and relatively lower values of domestic market activity.
