3.1. Application of the fuzzy set theory in student assessment
The standard evaluation of online practice tests at FMI of the University of Plovdiv is conducted using the following grading system:
An Excellent 6 grade is awarded for test scores from … to …;
A Very good 5 grade – from … to …;
A Good 4 grade – from … to …;
A Satisfactory 3 grade – from … to …;
A Poor 2 grade – for scores of … and below.
However, the fairness of marginal, “borderline” scores that determine whether a learner is assigned the higher or the lower grade can be viewed as questionable. It can be considered unjust for one learner to pass a test with 50 points out of 100 while another fails the same exam with only a single point less. In an attempt to make students’ borderline grades more equitable, we have employed a fuzzy-set technique to change the grades of the borderline cases in two examination tests. Similar efforts have been described in [12] and [13], where fuzzy logic and fuzzy functions were applied to evaluate learners’ tests with a view to allocating them a more objective grade.
The first exam considered is a final test taken by 78 students at the end of the language course. It consists of 60 closed questions and one open question, with a maximum score of 80 points, 20 of which are awarded for the open question. When test questions measure similar capabilities or expertise, they yield a high internal consistency reliability. If a test comprises various types of questions evaluating different kinds of capabilities and knowledge, Cronbach’s coefficient tends to be smaller as a consequence of the dissimilarity of the questions in layout and content. For this reason, we first need to decide what weight to assign to the open question to guarantee the reliability of the test; an erroneous choice can prove discriminatory. Cronbach’s alpha coefficient can easily be estimated by means of the formula:

α = (k / (k − 1)) · (1 − ∑ᵢ σᵢ² / σ_X²),   (1)

where k is the number of test items, σᵢ² is the variance of the i-th item, and σ_X² is the variance of the total test scores.
The test contains k = 61 items. Inserting the sum of the item variances and the test variance into (1), we obtain a value of α indicating that the reliability is merely acceptable. The largest variance occurs in the open question; consequently, we search for a coefficient by which to multiply the scores of the open-ended question so that α becomes larger. Choosing the coefficient 0.2, we obtain a new sum of variances and a new test variance; inserting these numbers into (1), we obtain a value of α showing that the reliability is excellent.
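As a quick sanity check, Cronbach’s alpha from (1) can be computed directly from an item-score matrix; the sketch below uses made-up scores for illustration, not the actual exam data.

```python
def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score columns.

    items: list of k lists, items[i][j] = score of student j on item i.
    Implements alpha = k/(k-1) * (1 - sum(item variances) / total variance).
    """
    k = len(items)
    n = len(items[0])

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)  # sample variance

    item_var_sum = sum(variance(col) for col in items)
    totals = [sum(items[i][j] for i in range(k)) for j in range(n)]
    return k / (k - 1) * (1 - item_var_sum / variance(totals))

# Hypothetical scores of 5 students on 3 items (not the exam data).
items = [[2, 4, 4, 1, 5],
         [3, 4, 5, 1, 4],
         [2, 5, 4, 2, 5]]
alpha = cronbach_alpha(items)
```

Scaling one item’s scores by a coefficient, as done above for the open question, changes both the item variances and the total variance, which is how the coefficient can be tuned until α reaches an acceptable level.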
Thus, the maximum score that a learner can receive is 64 (60 for the closed questions + 4 for the open one); i.e. the maximum points for the closed questions are 60, and the maximum score for the open one is 4. Using the standard grading of online practice tests at FMI, the results should be interpreted as follows: an Excellent 6 grade is from 56 to 64 points, a Very good 5 grade is from 48 to 55, a Good 4 grade is from 40 to 47, a Satisfactory 3 grade is from 32 to 39, and a Poor 2 grade is 31 points and below.
The second test is a midterm test, taken approximately in the middle of the language course. It was administered to 36 students altogether. The test consists of 70 closed questions with 1 point awarded for each correct answer, and 3 open questions worth a maximum of 6 points each. Thus, the overall maximum test score is 88 points, and the highest possible score for the open items is 18. The value of Cronbach’s alpha indicates that the reliability of the test is good. There were no great differences between the variances of the different types of questions, which is why we do not search for a scaling coefficient to increase Cronbach’s alpha as we did for the first test. At first glance, the results of the second test appear worse than those of the first one, because not a single test-taker obtained the maximum points. Solely to simplify the calculations and the notation, so that one and the same functions can be used in the fuzzy-set technique, we scaled the results of the second test. We scaled the points so that the maximum score obtained by a student would represent the maximum points of the test: the highest number of points received by a student was 80, therefore we scaled the results of all students by the factor 88/80 = 1.1. Since the scores from the open questions are used for fuzzifying some of the test results, we also scaled the 18 points of the open questions in the second test by a factor of 20/18, so that the same functions could be used in the calculations.
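The rescaling described above amounts to two multiplications; the short sketch below illustrates it with the factors derived in the text (88/80 for the total scores, 20/18 for the open-question scores).

```python
# Rescale second-test results so the same fuzzy membership functions
# apply to both tests (factors as derived in the text).
TEST_MAX, BEST_SCORE = 88, 80      # maximum possible vs. best achieved total
OPEN_MAX_1, OPEN_MAX_2 = 20, 18    # open-question maxima of the two tests

def scale_totals(scores):
    """Stretch achieved totals so the best score maps to the test maximum."""
    return [s * TEST_MAX / BEST_SCORE for s in scores]

def scale_open(scores):
    """Bring open-question scores to the first test's 0..20 range."""
    return [s * OPEN_MAX_1 / OPEN_MAX_2 for s in scores]

scaled = scale_totals([80, 63])  # the best total (80) becomes 88.0
```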
Fuzzy logic, fuzzy sets, and fuzzy functions have been widely used since their introduction by Zadeh [13]. We would like to highlight several works connected with our investigation in e-learning and e-testing: [14,15,16,17].
A classical technique for reevaluating test results by fuzzy sets is to consider some borderline grades that need to be reassessed [14,15]. However, we present a different approach: we search for the maximum number of borderline grades that can be fuzzified without changing the statistical distribution of the overall grades. We have considered five functions that serve as the fuzzy membership functions of the sets of marks. We would like to mention that in recent years there has been a large increase in the use of fuzzy logic in the evaluation of students’ performance [18,19,20,21,22,23,24,25,26,27,28].
Starting from two auxiliary functions, we specify five bell-shaped fuzzy membership functions – one for each of the sets of marks Poor, Satisfactory, Good, Very good, and Excellent.
3.3. Illustration of the fuzzy logic usage in recalculating students’ marks
We illustrate the functions defined above for a particular choice of the parameters (Figure 1). For example, a student with 41 overall points belongs to the set of Satisfactory grades with a certain degree, to the set of Good grades with a certain degree, and to the remaining sets – Poor, Very good, and Excellent grades – with a degree 0. A student with an overall score of 44 points belongs to the set of Good grades with a degree 1 and to all other sets – Poor, Satisfactory, Very good, and Excellent – with a degree 0.
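Purely to make the idea concrete, the sketch below evaluates generalized bell functions centred on the grade intervals of the final test; the centres, width, and steepness are illustrative assumptions of this sketch, not the paper’s actual membership functions.

```python
# Illustrative bell-shaped membership functions for the five sets of marks.
# Centres are midpoints of the final-test grade intervals; the width (A)
# and steepness (B) are assumptions made for this sketch.
CENTERS = {"Poor": 15.5, "Satisfactory": 35.5, "Good": 43.5,
           "Very good": 51.5, "Excellent": 60.0}
A, B = 4.0, 2  # width and steepness of the generalized bell

def membership(score, grade):
    """Generalized bell function 1 / (1 + ((x - c) / A)^(2B))."""
    c = CENTERS[grade]
    return 1.0 / (1.0 + abs((score - c) / A) ** (2 * B))

# A score of 44 sits near the centre of "Good" (degree close to 1), while
# a borderline 41 has noticeable degrees in two neighbouring sets.
degrees_41 = {g: round(membership(41, g), 3) for g in CENTERS}
degrees_44 = {g: round(membership(44, g), 3) for g in CENTERS}
```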
In case the points a learner has received in a test do not belong definitely to a given set, we need a different criterion, dependent on the learner’s result, in order to decide which grade to assign him/her, and that will be the learner’s result on the open items. Accordingly, we once again divided the learners’ marks into 5 groups – Poor, Satisfactory, Good, Very good, and Excellent – this time according to the score on the open items, and we defined their membership functions analogously.
We illustrate these functions for a particular choice of the parameters (Figure 2).
The norms for executing the set operations of union (OR), intersection (AND), and complement (NOT) that concern us the most are given below.
For union, we look at the degree of membership in each set and take the larger of the two, that is, μ_{A∪B}(x) = max(μ_A(x), μ_B(x)) (Figure 3).
For intersection, we look at the degree of membership in each set and take the smaller of the two, that is, μ_{A∩B}(x) = min(μ_A(x), μ_B(x)) (Figure 4).
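In code, the standard max–min norms are one-liners; this small sketch simply restates the rules above.

```python
def fuzzy_union(deg_a, deg_b):
    """OR: the larger of the two membership degrees."""
    return max(deg_a, deg_b)

def fuzzy_intersection(deg_a, deg_b):
    """AND: the smaller of the two membership degrees."""
    return min(deg_a, deg_b)

def fuzzy_complement(deg_a):
    """NOT: one minus the membership degree."""
    return 1.0 - deg_a
```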
The fuzzy associative matrix (Table 1) provides a convenient way to integrate the input relations directly in order to obtain the fuzzified output results [14,29]. The input values for the scores of the open-ended items are placed along the top of the matrix, and the input values for the total results of the test are placed down the left side. We have used the conventional Bulgarian grading scale.
Let us consider a learner with a total result of 49 points and a mark of 19 points on the open-ended item. He or she belongs to the set of Very good marks with a certain degree and to the set of Good grades with a certain degree. Normally, he/she would be assessed with Very good (5). Nevertheless, the intersection of the two marks – the total points together with the points of the open question – in the matrix of Table 1 yields membership degrees in four sets, and the cell with the highest degree corresponds to Excellent. Consequently, we can assign him or her Excellent (6).
In accordance with [14] and [29], we recalculate the mark of each learner whose test score does not belong definitely to a given set. For this purpose, we consider the table in which the function F returns the minimum of the two membership degrees – that of the total score and that of the open-question score. The highest membership degree obtained from the table determines the corrected mark from the matrix (Table 2).
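The lookup can be sketched as follows. The rule table used here is a hypothetical stand-in for Table 1 (each cell simply takes the higher of the row and column grades); the actual matrix entries are those of the paper, and the membership degrees below are invented for illustration.

```python
# Sketch of defuzzification via a fuzzy associative matrix.
# For every (total-score set, open-question set) pair, F = min of the two
# membership degrees; the output grade of the cell with the largest F wins.
GRADES = ["Poor", "Satisfactory", "Good", "Very good", "Excellent"]

def rule(total_grade, open_grade):
    """Hypothetical Table 1 entry: the higher of the two grades."""
    return max(total_grade, open_grade, key=GRADES.index)

def defuzzify(total_degrees, open_degrees):
    """total_degrees/open_degrees: dicts mapping grade -> membership degree."""
    best_grade, best_f = None, -1.0
    for tg, td in total_degrees.items():
        for og, od in open_degrees.items():
            f = min(td, od)  # F returns the minimum of the two degrees
            if f > best_f:
                best_f, best_grade = f, rule(tg, og)
    return best_grade

# A borderline learner: total score between Good and Very good,
# open-question score clearly Excellent (degrees are made up).
result = defuzzify({"Very good": 0.6, "Good": 0.4}, {"Excellent": 0.9})
```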
Now, by way of illustration, let a student have a total test score of 57 points and an open question result of 18 points. As shown in Table 3, after the fuzzification the student will be marked with Excellent (6), which coincides with the traditional evaluation.
If we analyze another learner, who has received 42 points (corresponding to Good (4) in the traditional scoring system) and 17 points for the open-ended item, it is seen from Table 4 that after the correction that particular learner should get a higher mark, namely Very good (5).
To recalculate the test results, we only need to input the test scores, and Maple 2016.0 automatically chooses the scores to be fuzzified and calculates the fuzzified grades.
We define two parameters that determine the range of borderline scores to be fuzzified, and we specify the values of all constants used in the calculations.
As a result, Maple returns that, for the first choice of the parameters, the two distributions do not differ statistically and the number of fuzzified marks is the largest possible; in this case, we have changed 46 marks. When we fuzzify the grades with the larger parameter values, the two distributions differ statistically, and 56 marks are changed. By increasing the set of fuzzified marks we only add new students whose marks will be reevaluated. That is why we have listed in Table 5 the 56 fuzzified marks (the information is organized in this order: [classical test grade, classical open question grade], student’s number in the list, test score, open question score, fuzzified grade, classical grade). We have used a bold font for the 10 new students that were added by increasing the parameters.
At every stage of the calculations, Maple performs a standard paired-samples t-test. For the first choice of the parameters, we should accept the hypothesis that the two distributions have equal means; for the second choice, Maple returns that we should reject it.
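The same check can be reproduced outside Maple. The sketch below computes the paired-samples t statistic from scratch; the grade lists are invented for illustration, and 2.262 is the two-sided 5% critical value for 9 degrees of freedom.

```python
import math

def paired_t_statistic(before, after):
    """t = mean(d) / (sd(d) / sqrt(n)) for paired differences d."""
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)

# Hypothetical classical vs. fuzzified grades of 10 students.
classical = [4, 4, 3, 5, 5, 3, 4, 6, 2, 3]
fuzzified = [4, 5, 3, 5, 4, 3, 4, 6, 3, 3]
t = paired_t_statistic(classical, fuzzified)
# |t| below the critical value 2.262 (df = 9, alpha = 0.05): accept the
# hypothesis of equal means, i.e. the two distributions do not differ.
equal_means = abs(t) < 2.262
```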
We will analyze the fuzzified grades in the Discussion below in order to justify that we have obtained fairer marks in the first case and less equitable marks in the second one.
In the second exam the students have received the following scores:
Overall points [78, 76, 80, 63, 67, 61, 68, 75, 13, 72, 74, 76, 72, 79, 70, 58, 57, 66, 66, 59, 74, 67, 36, 35, 75, 38, 36, 52, 47, 55, 25, 47, 73, 35, 31, 70] and points on the open questions [15, 14, 15, 7, 6, 8, 9, 13, 0, 13, 13, 14, 11, 15, 15, 3, 9, 11, 8, 9, 13, 12, 0, 3, 14, 0, 0, 6, 7, 8, 0, 3, 13, 0, 0, 7]. Compared with the results from the first test – overall points [57, 74, 58, 58, 22, 65, 19, 70, 21, 63, 42, 42, 62, 42, 63, 23, 47, 62, 66, 74, 59, 56, 69, 59, 67, 55, 48, 41, 52, 46, 42, 37, 53, 61, 58, 30, 35, 69, 63, 66, 54, 41, 48, 47, 50, 63, 56, 41, 50, 56, 45, 36, 60, 43, 69, 55, 70, 77, 71, 74, 24, 66, 54, 43, 49, 59, 53, 42, 54, 60, 62, 56, 35, 59, 71, 53, 61, 52] and points on the open question [18, 20, 19, 19, 7, 18, 0, 20, 0, 17, 14, 16, 14, 16, 18, 10, 8, 20, 16, 19, 16, 16, 20, 14, 20, 18, 18, 0, 14, 7, 9, 0, 8, 13, 20, 9, 0, 12, 16, 18, 18, 13, 19, 15, 17, 19, 8, 15, 20, 19, 16, 0, 16, 15, 20, 20, 19, 20, 20, 20, 10, 19, 19, 0, 0, 0, 16, 10, 14, 14, 18, 18, 7, 19, 17, 16, 16, 6] – it can be seen that the open-question results from the second test are much lower than those from the first one.
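The gap is easy to quantify from the listed open-question scores: relative to the respective maxima (20 and 18 points), the second test’s open questions were answered markedly worse. The sketch below uses the scores exactly as listed above.

```python
# Open-question scores as listed in the text (first test: max 20 points,
# second test: max 18 points).
open_test1 = [18, 20, 19, 19, 7, 18, 0, 20, 0, 17, 14, 16, 14, 16, 18, 10,
              8, 20, 16, 19, 16, 16, 20, 14, 20, 18, 18, 0, 14, 7, 9, 0, 8,
              13, 20, 9, 0, 12, 16, 18, 18, 13, 19, 15, 17, 19, 8, 15, 20,
              19, 16, 0, 16, 15, 20, 20, 19, 20, 20, 20, 10, 19, 19, 0, 0,
              0, 16, 10, 14, 14, 18, 18, 7, 19, 17, 16, 16, 6]
open_test2 = [15, 14, 15, 7, 6, 8, 9, 13, 0, 13, 13, 14, 11, 15, 15, 3, 9,
              11, 8, 9, 13, 12, 0, 3, 14, 0, 0, 6, 7, 8, 0, 3, 13, 0, 0, 7]

def relative_mean(scores, maximum):
    """Average score as a fraction of the maximum possible points."""
    return sum(scores) / (len(scores) * maximum)

r1 = relative_mean(open_test1, 20)  # first test, about 0.70
r2 = relative_mean(open_test2, 18)  # second test, about 0.45
```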
When we fuzzify the grades of the second test with the chosen parameters, the two distributions do not differ statistically, and we get 18 grades to be modified without changing the distribution of the overall marks before and after the fuzzification.
3.4. CCA modeling of the assessment process in a cyber-physical educational environment
Ambient-oriented modeling (AOM) is a type of computational process in which interactions between objects from the physical and the virtual worlds play a major role. The Calculus of Context-aware Ambients (CCA) formalism models a system’s ability to respond to changes in the surrounding space [30]. A CCA environment is an entity used to describe an object or a component – a process, device, location, etc. Each environment has a name and boundaries; it can contain other environments within itself, and can itself be included in another environment. There are three possible relationships between any two environments – parent, child, and relative (sibling). Each environment can communicate with the environments around it, and environments exchange messages with each other through a handshaking process. In the CCA notation, “::” is the symbol for relative environments; “↑” and “↓” are the parent and child symbols; “send” means sending, and “recv” means receiving a message. An environment can be mobile, i.e. it can move within its surroundings. CCA provides two movement capabilities, in and out, which allow environments to move from one location to another. In CCA, four syntactic categories can be distinguished:
processes P
capabilities M
locations
context expressions k.
As we have already pointed out, the concept of ambients is an abstraction of the limited space where some computation is performed. Ambients are mobile and can build ambient hierarchies. Through these hierarchies, any entity in a cyber-physical system can be modeled, regardless of its nature (physical, logical, mobile, or static), as well as the environment (or context) of that entity. In addition, an ambient contains a process representing its capabilities, i.e. the actions that this ambient is allowed to perform, as well as mobility capabilities, contextual capabilities, and communication capabilities.
Due to its dynamic and hybrid nature, the process of assessing student knowledge in the context described in the previous section can be modeled using the mathematical notation of CCA. The cyber-physical educational environment is, by its nature, a multi-agent system that implements processes and services through interaction between various intelligent agents. Each component of the environment is served by one or more specialist assistants, and users are represented in the platform by their personal assistants. Each such intelligent environment component can be represented by a separate ambient. Let us consider the following ambients:
PA_T – a personal assistant to the teacher;
PA_Si – a personal assistant of the i-th student;
SA_TS – a specialist assistant serving the Test System in the Education space;
SA_DM – a specialist assistant providing services related to the use of data from the Data Module;
SA_SB – a specialist assistant supporting interaction with the Student Books component;
AA – an analytical assistant that provides services related to information analysis by using the described fuzzy set approach.
We will model the processes of these ambients according to the hybrid approach described above.
The instructor, through their personal assistant, sends a message to the assistant of the test system requesting that the test be opened for all students. After a student completes the test, their score is recorded in the Data Module, and the teacher receives information about it. The instructor’s personal assistant then communicates with the AA ambient with a request to analyze the results of that student according to the considered approach and, in consequence, receives a proposal for an assessment, which it sends to the student’s virtual student book. The process of this ambient is represented by (2).
After receiving a request to open the test from the teacher’s personal assistant PA_T, the SA_TS ambient sends information to the students’ personal assistants. This communication with the i-th student is modeled in (
3).
As soon as the student finishes working on the test, his/her personal assistant sends a message to the specialist assistant of the data module SA_DM with a request to record the results obtained. The ambient process is represented by (
4).
The specialist assistant of the data module SA_DM records the results of the students and sends information to the teacher. When it receives a request from the AA ambient, it selects the requested data and sends it for analysis. The process of this ambient is represented in (5).
The AA ambient analyzes the results of the conducted test after a request from the teacher’s personal assistant. To access a particular set of data, it sends a request to the SA_DM ambient. The process is presented in (6).
The closing stage of the implementation of the process is the recording of the final assessment of the students in the administrative system of the virtual student book (SA_SB).
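Since the CCA processes themselves are given in (2)–(6), the end-to-end message flow can also be viewed as a plain event trace. The sketch below is an illustrative simulation of the handshakes described above; the ambient names match those listed earlier, but the message payloads are invented.

```python
# Illustrative trace of the assessment workflow between the ambients.
# The sequencing follows the prose above; payloads are invented.
log = []

def send(src, dst, msg):
    """Record one handshake: src sends msg to dst."""
    log.append((src, dst, msg))

def run_assessment(student_id, score, open_score):
    send("PA_T", "SA_TS", "open_test")                     # teacher opens the test
    send("SA_TS", f"PA_S{student_id}", "test_available")   # student is notified
    send(f"PA_S{student_id}", "SA_DM", ("record", score))  # result is recorded
    send("SA_DM", "PA_T", ("recorded", student_id))        # teacher is informed
    send("PA_T", "AA", ("analyze", student_id))            # request fuzzy analysis
    send("AA", "SA_DM", ("get_data", student_id))          # AA fetches the data
    send("SA_DM", "AA", ("data", score, open_score))
    send("AA", "PA_T", "proposed_grade")                   # proposal to the teacher
    send("PA_T", "SA_SB", ("final_grade", student_id))     # written to student book
    return log

trace = run_assessment(1, 49, 19)
```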
The ccaPL programming language is a computer-readable version of the CCA syntax. The interpreter of this language enables testing and verification of the modeled scenario (Figure 5).