An Auto-Weighting Aggregative Fuzzy Collaborative Intelligence Approach for DRAM Yield Forecasting

Abstract: In a collaborative forecasting task, experts may have unequal authority levels. However, this has rarely been handled in a reasonable way in the existing fuzzy collaborative forecasting methods. In addition, experts may not be willing to discriminate their authority levels. To address these issues, an auto-weighting fuzzy weighted intersection (FWI) fuzzy collaborative intelligence approach is proposed in this study. In the proposed approach, experts’ authority levels are automatically assigned based on their past forecasting performances. Subsequently, the auto-weighting FWI mechanism is established to aggregate experts’ fuzzy forecasts. The theoretical properties of the auto-weighting FWI mechanism are discussed and compared with those of the existing fuzzy aggregation operators. The auto-weighting FWI fuzzy collaborative intelligence approach was applied to a case of forecasting the yield of a DRAM product from the literature, and the results illustrated its advantages over several existing methods.


Introduction
The (die) yield of a dynamic random access memory (DRAM) product is the average percentage of good dies on each wafer for fabricating the product [1]. The yield of a DRAM product improves with time, which is usually described as a yield learning process [2]. However, there is a lot of uncertainty inherent in a yield learning process, mainly caused by error-prone human operations [3]. As a result, it is very difficult to accurately forecast the future yield of a DRAM product. To tackle this difficulty, the range of yield is also estimated [4]. Many studies have adopted a fuzzy number to forecast the future yield of a DRAM product [5][6][7][8][9], so as to show the most possible value and range of yield simultaneously. To generate a fuzzy yield forecast, a fuzzy yield learning curve is fitted instead [4].
In fitting a fuzzy yield learning curve, two objectives are optimized [6]:
(1) Accuracy: The most likely value of a fuzzy yield forecast should be as close to the actual value as possible.
(2) Precision: The range of a fuzzy yield forecast should be as narrow as possible.
Fuzzy collaborative intelligence methods have shown great potential in simultaneously optimizing these two objectives [5][6][8][9]. Some relevant literature is reviewed as follows. Unlike some fuzzy collaborative forecasting methods that are joint applications of multiple types of fuzzy forecasting methods [10][11][12][13][14], fuzzy collaborative intelligence methods rely on the collaboration among experts [15][16]. Chen and Lin [7] proposed a fuzzy collaborative forecasting method, in which experts solved two nonlinear programming (NLP) problems to generate fuzzy yield forecasts. Fuzzy intersection (FI) was applied to aggregate all experts' fuzzy yield forecasts, so as to improve the forecasting precision. Then, a back propagation network (BPN) was constructed to defuzzify the aggregation result, so as to optimize the forecasting accuracy. Chen [5] proposed a heterogeneous fuzzy collaborative forecasting method, in which experts solved NLP problems or trained artificial neural networks to generate fuzzy yield forecasts. Chen and Wang [6] replaced experts with software agents to facilitate the collaboration process. Chen and Wang [8] approximated NLP problems with quadratic programming (QP) problems that were easier to solve. However, sometimes experts lack an overall consensus, and the aggregation result is an empty set. To solve this problem, partial-consensus fuzzy intersection (PCFI) [17] is applied to seek the partial consensus among some experts instead. Chen et al. [18] proposed a fuzzy collaborative forecasting method, in which the number of experts that reach a partial consensus varies with time.
In practice, experts commonly have unequal authority levels [7,19], but this has rarely been considered in this field. Chen and Lin [7] discriminated experts' authority levels: fuzzy yield forecasts by experts with higher authority levels were learned more times in training the BPN defuzzifier. Chen et al. [19] proposed the fuzzy weighted intersection (FWI) operator to aggregate experts' fuzzy yield forecasts, so that experts' authority levels affected the membership function, rather than the value, of the aggregation result. A fuzzy collaborative forecasting method can be more flexible and effective if experts are allowed to have unequal authority levels. For this reason, an auto-weighting FWI fuzzy collaborative intelligence approach is proposed in this study.
In the proposed auto-weighting FWI fuzzy collaborative intelligence approach, a group of experts forecast the yield of a DRAM product collaboratively. At first, each expert fits the fuzzy yield learning curve of the DRAM product by solving either of two NLP problems. The fitted fuzzy yield learning curves are applied to forecast the yield of the product, for which the forecasting accuracy and precision are evaluated. Based on experts' forecasting performances, their authority levels are automatically assigned. Subsequently, the FWI operator is applied to aggregate experts' fuzzy yield forecasts. After defuzzifying the aggregation result using the center-of-gravity (COG) method with the aid of α-cut operations, the overall forecasting performance is evaluated.
The differences between the proposed methodology and some existing methods are summarized in Table 1. Compared with existing fuzzy collaborative forecasting methods, the proposed methodology is more flexible because experts can have unequal authority levels. In addition, in the proposed methodology, the aggregation result can be defuzzified using the COG method, which is easier than the prevalent BPN defuzzifier. The remainder of this paper is organized as follows. Section 2 is dedicated to the literature review. Section 3 details the proposed auto-weighting FWI fuzzy collaborative intelligence approach. Section 4 presents the results of applying the proposed methodology to a case from the literature. Section 5 concludes this paper and puts forth some topics for future investigation.
The original definition of fuzzy collaborative intelligence [20] considers the following situation: experts are distributed and can only access disjoint parts of the data. As a result, their individual analysis results are not enough to reflect the overall situation and need to be aggregated. This situation becomes more complicated if experts are not able to share the parts of the data they have accessed. In such fuzzy collaborative intelligence methods, experts usually apply the same analytical methods. Another definition of fuzzy collaborative intelligence [21] emphasizes the diversity of viewpoints adopted by experts who apply different methods to analyze the same data. Some recent literature on fuzzy collaborative intelligence is reviewed as follows.
Fuzzy collaborative clustering is one of the main streams in fuzzy collaborative intelligence [15][16][20][22][23][24]. Pedrycz [25] defined the concept of granular fuzzy modelling as the application of fuzzy collaborative clustering methods to information granules.
Ayadi et al. [26] proposed a fuzzy collaborative evaluation method for supplier assessment. They established a procedure for identifying linguistic hierarchical levels and converting unbalanced linguistic variable sets. Finally, evaluators' assessment results were aggregated using weighted average.
Liu et al. [22] proposed a fuzzy collaborative clustering method for ranking factor granules. Each factor granule had three parts: patterns, factors, and factor-induced information. Factor granules that were closer to each other were ranked higher.
Yadav and Tyagi [23] proposed a fuzzy collaborative clustering method for recommending items to customers, in which fuzzy c-means (FCM) was applied to generate item and customer clusters. A weighting scheme was then used to aggregate the two types of clusters.
Ngo et al. [24] proposed a new FCM algorithm in which fuzzy parameters and variables were given in interval-valued fuzzy numbers, and applied it to fuzzy collaborative clustering.
Zhong et al. [27] solved a fuzzy collaborative optimization problem that was composed of two fuzzy multi-objective optimization problems, in which the optimal solution of one fuzzy multi-objective optimization problem became an input to the other.

The Proposed Methodology
Without loss of generality, all fuzzy parameters and variables in the proposed methodology are given in or approximated with triangular fuzzy numbers (TFNs) [35].

An Aggregative Nonlinear Programming Model for Generating Fuzzy Yield Forecasts
The improvement in the yield of a DRAM product is usually described as a yield learning process [7], in which the yield approaches an asymptotic level that can be achieved eventually. b is the learning constant (speed); b ≥ 0. (×) denotes fuzzy multiplication [36]. All terms on both sides of Equation (1) are converted into their logarithmic values, which gives a fuzzy linear regression (FLR) equation that can be fitted in various ways [7][9][37][38][39][40]. In Chen and Lin [7], each expert fits this equation by solving either of two NLP problems.
In Problem NLP I, y_t is the actual yield at period t. The objective function minimizes the high-order sum of the ranges of fuzzy yield forecasts. o(k) ∈ Z⁺; 0 < o(k) ≤ 4 [6]. Constraints (4) and (5) ensure that the membership of an actual value in the corresponding fuzzy yield forecast is higher than s(k), a threshold specified by Expert #k. Equations (6) to (8) are the expansion of Equation (2). Constraints (9) and (10) define the sequence of corners for the corresponding fuzzy variable. Problem NLP I can be converted into or approximated with a QP problem [8][9] that can be solved using various methods, including the interior point method, the active set method, the augmented Lagrangian method, the conjugate gradient method, the gradient projection method, and an extension of the simplex algorithm [41].
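As a reading aid, the yield learning model and its logarithmic form can be sketched as follows. This is the form commonly used in the fuzzy yield forecasting literature and is an assumption here, not a transcription of Equations (1), (2), and (6) to (8); the symbols $\tilde{Y}_t$ (fuzzy yield at period $t$), $\tilde{Y}_0$ (asymptotic yield), $\tilde{b}$ (learning constant), and $(-)$ (fuzzy subtraction) are likewise assumed notation, with every fuzzy quantity taken as a TFN $(\cdot_1, \cdot_2, \cdot_3)$.

```latex
\begin{align}
  \tilde{Y}_t &= \tilde{Y}_0 \,(\times)\, e^{-\tilde{b}/t}, \qquad \tilde{b} \ge 0, \\
  \ln \tilde{Y}_t &\cong \ln \tilde{Y}_0 \,(-)\, \frac{\tilde{b}}{t}, \\
  \ln Y_{t1} &= \ln Y_{01} - \frac{b_3}{t}, \qquad
  \ln Y_{t2} = \ln Y_{02} - \frac{b_2}{t}, \qquad
  \ln Y_{t3} = \ln Y_{03} - \frac{b_1}{t}.
\end{align}
```

The last line follows from TFN subtraction: the lower corner of a difference pairs the lower corner of the first operand with the upper corner of the second, and vice versa.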

(Problem NLP II)
The objective function maximizes the high-order sum of satisfaction levels. w(k) ∈ Z⁺; 0 < w(k) ≤ 4 [6]. Constraint (12) ensures that the average range of fuzzy yield forecasts is narrower than d(k), a threshold specified by Expert #k. The satisfaction level at each period is measured with Constraints (13) and (14). Problem NLP II can also be converted into or approximated with a QP problem. The values of o(k), s(k), d(k), and w(k) specified by different experts are not the same, leading to different fuzzy yield forecasts that need to be aggregated. To eliminate the need to choose between the two NLP problems, an integrative NLP model is proposed in this study by merging the two NLP problems, in which Expert #k assigns a relative weight to the first objective function. When

Auto-Weighting FWI Operator for Aggregating Fuzzy Yield Forecasts
If all experts' fuzzy yield forecasts overlap, there is an overall consensus among experts. The overlapping part represents values that are considered possible by all experts, and can be modelled in terms of the FI (i.e., the minimum T-norm) of all experts' fuzzy yield forecasts [28][29][30][31].
Proof. Expressing the fuzzy variables in Equation (21) with their left and right α cuts, and then substituting Equations (34) and (35) into Equation (33), proves Theorem 1.
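The minimum T-norm described above can be sketched in a few lines. The sketch assumes each expert's forecast is a TFN given as a `(lower, most_likely, upper)` tuple; the function names and the numbers in the usage note are hypothetical and are not taken from the case study.

```python
from typing import Sequence, Tuple

TFN = Tuple[float, float, float]  # (lower, most likely, upper) corners


def tfn_membership(x: float, tfn: TFN) -> float:
    """Membership of x in a triangular fuzzy number."""
    a, b, c = tfn
    if x == b:
        return 1.0  # peak of the triangle (also covers degenerate sides)
    if x <= a or x >= c:
        return 0.0  # outside the support
    if x < b:
        return (x - a) / (b - a)  # rising left side
    return (c - x) / (c - b)  # falling right side


def fuzzy_intersection(x: float, forecasts: Sequence[TFN]) -> float:
    """FI membership at x: the minimum T-norm over all experts' forecasts."""
    return min(tfn_membership(x, f) for f in forecasts)
```

For two hypothetical forecasts `(0.60, 0.70, 0.80)` and `(0.65, 0.72, 0.85)`, the FI membership at 0.70 is limited by the second expert, and any value outside either support receives a membership of zero.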
If the consensus among all experts is lacking, the FI result will be an empty set. In this situation, the partial consensus among some experts, modelled by the FI result of these experts' fuzzy yield forecasts, is sought instead. To generate a set containing values that are considered possible by all experts, the union of the partial-consensus FI results can be formed using the PCFI operator [42], in which H is the number of experts that reach a (partial) consensus; 2 ≤ H ≤ K.
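The union-of-intersections idea can be sketched as follows, assuming TFN forecasts given as `(lower, most_likely, upper)` tuples, the maximum as the union, and the minimum as the intersection. The function names and sample values are hypothetical.

```python
from itertools import combinations
from typing import Sequence, Tuple

TFN = Tuple[float, float, float]  # (lower, most likely, upper) corners


def tfn_membership(x: float, tfn: TFN) -> float:
    """Membership of x in a triangular fuzzy number."""
    a, b, c = tfn
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def pcfi_membership(x: float, forecasts: Sequence[TFN], h: int) -> float:
    """PCFI membership at x: union (max) over every group of h experts
    of that group's intersection (min)."""
    if not 2 <= h <= len(forecasts):
        raise ValueError("h must satisfy 2 <= h <= number of experts")
    memberships = [tfn_membership(x, f) for f in forecasts]
    return max(min(group) for group in combinations(memberships, h))
```

Because the maximum of the group minima is simply the h-th largest membership, the same value can be obtained by sorting the memberships in descending order and taking the element at index `h - 1`. With `h` equal to the number of experts, the result reduces to the FI.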
A problem with the PCFI operator is how to determine the value of H. A larger value of H means that more experts have reached a (partial) consensus, which is a favorable property in practice but is also more difficult to achieve. In addition, the PCFI result of more experts' fuzzy yield forecasts is a smaller set that is less likely to include actual values for future (or unlearned) data. The EPCFI operator [33] determines the value of H by optimizing an objective function under which, if seeking the consent of another expert will greatly narrow the PCFI result, then the consent should not be sought.
The existing aggregation methods do not discriminate experts' authority levels. If experts have unequal authority levels, aggregating their fuzzy yield forecasts using the FWI operator [42] can generate more acceptable results, where each weight is the authority level of the corresponding expert. The FWI result is not a concave fuzzy number [43], and cannot be simply expressed with its left and right α cuts. Some theoretical properties of the FWI operator are discussed in [19]. An example is provided in Figure 2. Values that are considered highly possible by all experts or just the most authoritative expert will have high memberships in the FWI result. This outcome is more in line with the expectations of all experts and is more acceptable to everyone.
In the auto-weighting mechanism, experts' authority levels are assigned from their past forecasting performances, where C is a positive constant and RMSE(k) is the forecasting accuracy, in terms of root mean squared error (RMSE), achieved by Expert #k in the past.
Subsequently, a BPN with the following configuration is constructed to defuzzify the aggregation result (see Figure 3):
(1) Input: Inputs to the BPN include the value and membership of each corner of the aggregation result. Deriving the representative value based on these corners is meaningful, because they are considered highly possible by all experts or just the most authoritative expert. Consider the example shown in Figure 4. The aggregation result has six corners, and hence there are twelve inputs to the BPN. However, the number of corners may differ from period to period. Therefore, the number of inputs is determined by the maximum number of corners in all examples.
(2) Hidden layer: Many studies have shown that a BPN with a single hidden layer is sufficient to fit a complex nonlinear relationship [7,11,17]. The number of nodes in the hidden layer is twice the number of inputs [7,17].
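The two steps of the auto-weighting FWI mechanism can be illustrated with a minimal sketch. Both formulas are assumptions, since the corresponding equations are not reproduced here: authority levels are taken as C / RMSE(k), with C chosen so that they sum to one, and the weighted intersection adds a weight-dependent correction to the FI. This form reduces to the FI under equal weights and to a single expert's forecast when that expert holds all the weight, but it may differ from the operator defined in [19]. All names and numbers are hypothetical.

```python
from typing import List, Sequence, Tuple

TFN = Tuple[float, float, float]  # (lower, most likely, upper) corners


def tfn_membership(x: float, tfn: TFN) -> float:
    """Membership of x in a triangular fuzzy number."""
    a, b, c = tfn
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def auto_weights(rmses: Sequence[float]) -> List[float]:
    """ASSUMED rule: authority level = C / RMSE(k), with C normalizing the sum to 1."""
    if any(r <= 0 for r in rmses):
        raise ValueError("RMSE values must be positive")
    inverse = [1.0 / r for r in rmses]
    c = 1.0 / sum(inverse)  # the positive constant C under this assumption
    return [c * v for v in inverse]


def fwi_membership(x: float, forecasts: Sequence[TFN], weights: Sequence[float]) -> float:
    """ASSUMED weighted intersection: FI plus a correction that grows with
    each expert's excess weight and excess membership."""
    mus = [tfn_membership(x, f) for f in forecasts]
    mu_min, w_min = min(mus), min(weights)
    return mu_min + sum((w - w_min) * (mu - mu_min) for w, mu in zip(weights, mus))
```

Under this sketch, an expert whose past RMSE is half that of the others receives twice their authority level, and values favored only by that expert keep a higher membership than they would under the plain FI.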

A Case Study
The proposed auto-weighting FWI fuzzy collaborative intelligence approach has been applied to forecast the yield of a DRAM product that has been studied in [5]. In this application, four experts forecasted the yield of the DRAM product collaboratively. First, these experts fitted the yield learning model of the DRAM product by solving four NLP problems that are summarized in Table 2. In the collected yield data, the first six periods were used to build the models, while the remaining periods were left for testing. All NLP problems were solved using Lingo on a PC with an i7-7700 3.6 GHz CPU and 8 GB of RAM. The average execution time was less than 3 seconds. The fitted fuzzy yield learning models were applied to forecast the yield of the DRAM product for the remaining periods. Then, each fuzzy yield forecast was defuzzified using the COG method. The forecasting accuracy was evaluated in terms of mean absolute error (MAE), mean absolute percentage error (MAPE), and RMSE, while the forecasting precision was evaluated in terms of the average range and hit rate. The evaluation results are summarized in Table 3.
Based on the RMSEs, by setting C to 0.033, experts' authority levels were determined as 0.17, 0.41, 0.19, and 0.22, respectively. Subsequently, the auto-weighting FWI mechanism was applied to aggregate experts' fuzzy yield forecasts. Taking period 7 as an example, the aggregation result is shown in Figure 5. In this figure, 0.71 was considered the most possible value, which was in line with all experts' fuzzy yield forecasts. This is a novel property of the proposed methodology. The aggregation result was then defuzzified using a BPN. Inputs to the BPN defuzzifier were the corners of the aggregation result [45], as summarized in Table 4. There were at most nine corners in each period. Therefore, the BPN defuzzifier had 18 inputs and 36 hidden-layer nodes.
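The per-expert evaluation step can be sketched as follows. MAE, MAPE, and RMSE follow their standard definitions, and the COG of a TFN is the mean of its three corners. The average range (mean of upper minus lower corner) and the hit rate (share of actual values that fall inside the forecast's support) are assumed definitions, and the function names and sample data are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

TFN = Tuple[float, float, float]  # (lower, most likely, upper) corners


def cog(tfn: TFN) -> float:
    """Center of gravity of a triangular fuzzy number."""
    return sum(tfn) / 3.0


def evaluate(forecasts: Sequence[TFN], actuals: Sequence[float]) -> Dict[str, float]:
    """Accuracy (MAE, MAPE, RMSE) and precision (average range, hit rate)."""
    if len(forecasts) != len(actuals) or not actuals:
        raise ValueError("forecasts and actuals must be non-empty and equal in length")
    n = len(actuals)
    errors = [cog(f) - y for f, y in zip(forecasts, actuals)]
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "mape_percent": 100.0 * sum(abs(e) / y for e, y in zip(errors, actuals)) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "average_range": sum(f[2] - f[0] for f in forecasts) / n,
        # ASSUMED: a hit means the actual value lies within [lower, upper]
        "hit_rate": sum(f[0] <= y <= f[2] for f, y in zip(forecasts, actuals)) / n,
    }
```

The `rmse` entry returned for each expert is the quantity that the auto-weighting step would consume when assigning authority levels.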
The training data were used to train the BPN using MATLAB R2017a on the same PC. The maximum number of epochs was 10000. The forecasting results are summarized in Figure 6, and the overall forecasting performance using the proposed methodology was then evaluated.
Figure 6. The forecasting results using the proposed methodology.
For comparison, three existing methods have also been applied to this case. The first was the overall-consensus fuzzy collaborative forecasting method [7], which employed FI to aggregate experts' fuzzy yield forecasts and a BPN to defuzzify the aggregation result. The second was the logistic regression method, which fitted the collected yield data with a logistic regression model. The lower and upper bounds of yield were then established by subtracting three times the standard deviation from the yield forecast and adding three times the standard deviation to it, respectively. The third was the evolving partial-consensus fuzzy collaborative forecasting method [18], in which the EPCFI operator was applied to aggregate experts' fuzzy yield forecasts. In this way, the number of experts who reached a partial consensus varied with time [46].
The forecasting performances using various methods are compared in Table 4. According to the experimental results, the following observations are made:
(1) The proposed methodology outperformed the existing methods in optimizing the forecasting accuracy measured in terms of MAE and MAPE. By contrast, the evolving partial-consensus fuzzy collaborative forecasting method achieved the best performance in reducing RMSE. Because RMSE penalizes large deviations more heavily than MAE does, a lower MAE together with a higher RMSE meant that the forecasting deviations using the proposed methodology were unevenly spread, with most being small and a few being large.
(2) Period 9 was an exceptional case in which the yield of the DRAM product unexpectedly fell. If this exceptional case was removed, then the proposed methodology reduced MAPE further to 0.44%.
(3) To ascertain whether the advantage of the proposed methodology over the existing methods was significant, the paired t test [47] was conducted on the following hypotheses:
H0: The forecasting accuracy, in terms of the absolute deviation, using the proposed methodology is the same as that using the compared method.
H1: The forecasting accuracy, in terms of the absolute deviation, using the proposed methodology is better than that using the compared method.
The testing results are summarized in Table 5. The advantage of the proposed methodology over the overall-consensus fuzzy collaborative forecasting method was significant at an α level of 0.01.
(4) On the other hand, between the two methods that achieved the highest hit rates, the proposed methodology was better able to narrow the average range of fuzzy yield forecasts.
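The test statistic behind the paired t test in item (3) can be sketched with the standard library alone. The function name and sample values are hypothetical; the inputs are the per-period absolute deviations of the two methods being compared.

```python
import math
import statistics
from typing import Sequence


def paired_t_statistic(proposed_abs_dev: Sequence[float],
                       compared_abs_dev: Sequence[float]) -> float:
    """t statistic of the paired differences (compared minus proposed).

    A large positive value supports H1, i.e. smaller absolute deviations
    for the proposed method.
    """
    if len(proposed_abs_dev) != len(compared_abs_dev) or len(proposed_abs_dev) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    diffs = [c - p for p, c in zip(proposed_abs_dev, compared_abs_dev)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
```

The resulting value is compared with the one-sided critical value of the t distribution with n - 1 degrees of freedom at the chosen α level, where n is the number of test periods.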

Conclusions
Experts usually have unequal authority levels in a collaborative forecasting or decision-making task. However, this has rarely been considered in the existing fuzzy collaborative intelligence methods. In addition, sometimes experts may not be willing to discriminate their authority levels. To address these issues, an auto-weighting FWI fuzzy collaborative intelligence approach is proposed in this study. In the proposed auto-weighting FWI fuzzy collaborative intelligence approach, experts' authority levels are automatically assigned based on their past forecasting performances, which is considered a reasonable treatment. Subsequently, the auto-weighting FWI mechanism can be applied to aggregate experts' fuzzy forecasts. Many theoretical properties of the auto-weighting FWI mechanism have been discussed and compared with those of the existing FI, PCFI, and FWI operators.
The auto-weighting FWI fuzzy collaborative intelligence approach has been applied to a case of forecasting the yield of a DRAM product from the literature. According to the experimental results, the following conclusions were drawn:
(1) The proposed methodology effectively improved the forecasting accuracy in terms of MAE and MAPE. In addition, after removing an exceptional case, the proposed methodology was able to reduce MAPE to only 0.44%.
(2) The proposed methodology optimized the forecasting precision in terms of hit rate, while keeping fuzzy yield forecasts relatively narrow.
(3) The most possible value in the aggregation result was in line with all experts' fuzzy yield forecasts, which is a distinct property of the proposed methodology.
Other forms of the FWI aggregator can be proposed in the future. In addition, in this study, an expert's authority level is based on the RMSE of his/her past forecasts. Other forecasting accuracy indexes can be adopted as well, and forecasting precision indexes can also be incorporated. Further, the