Submitted:
31 August 2025
Posted:
02 September 2025
Abstract
Keywords:
1. Introduction

2. Importance and Role of AI and Machine Learning Techniques in Biomedical Research
3. Entropy: Motivation, Three Definitions, and Applications
3.1. Brief History of Entropy
- (a)
- Experimental physics, thermodynamics, heat machines and heat engines (Clausius): The development of heat machines such as steam engines, initiated by Carnot, required an experimental and theoretical understanding of the conversion of heat into mechanical work. Unexpectedly, it turned out that not all heat energy can be converted into mechanical work. The quantity introduced to describe the part of the energy that cannot be converted into mechanical work was named entropy (Clausius [80,81,82]); for details, see Section 3.2.
- (b)
- Theoretical physics, statistical physics, kinetic theory of gases (Boltzmann): The need for a theoretical description of entropy led to the development of the entropy equation (Boltzmann [69,70,83]). The idea of quantization of the momentum of gas molecules was used. This leads to the well-known Boltzmann equation and velocity distribution; for details, see Section 3.4.
- (c)
- Theory of Communication and Information (Shannon): It turned out that the information content of messages can be described by an entropy defined specifically for this purpose. This set of mathematical approaches was developed within communication theory. Later it was improved as an encoding/decoding tool and used by the military during WWII. Even later it was used in computer and Internet communication (Shannon [45,46]); for details, see Section 3.5.
3.2. Entropy Definition in Thermodynamics
3.3. Concept of Entropy and Its Mathematical Foundations
3.4. Entropy Definition in Statistical Physics
3.5. Entropy Definition in Theory of Information

- $H$ should be continuous in the $p_i$.
- When all $p_i$ are equally probable, so $p_i = 1/N$, then $H$ is an increasing function of $N$.
- $H$ should be additive.
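The requirements listed above can be checked numerically on the standard Shannon entropy, $H = -\sum_i p_i \log_2 p_i$. The following is a minimal sketch, not code from the study; the function names are illustrative, and additivity is shown here only for two independent sources, which is one common reading of that requirement.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits; outcomes with zero probability contribute nothing."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0.0)

def uniform(n):
    """N equally probable outcomes, p_i = 1/N."""
    return [1.0 / n] * n

# Equally probable outcomes: H = log2(N), which grows with N.
h_uniform = [shannon_entropy(uniform(n)) for n in (2, 4, 8)]

# Additivity (assumed reading): for independent sources X and Y,
# the entropy of the joint distribution equals H(X) + H(Y).
px = [0.5, 0.5]
py = [0.25, 0.75]
joint = [a * b for a in px for b in py]
h_joint = shannon_entropy(joint)
h_sum = shannon_entropy(px) + shannon_entropy(py)
```

Here `h_uniform` holds the entropies for N = 2, 4, and 8, and `h_joint` agrees with `h_sum` up to rounding.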
3.6. Applications of Information Entropy in Biology and Medicine
4. Results–Part 1: Comparison of ’Normal’ with ’Non-TdP + TdP’ Rabbits
4.1. Description of Major Arrhythmia Types as Seen on ECG Recordings of Humans and Their Curves: TdP, VT, VF, and PVC in Contrast with Normal Ones






4.2. Algorithm and Graphic Depiction Describing Evaluation of Permutation Entropy: Allowing Easy Orientation in Following Results
| Algorithm 1 Pseudo-code [92] of the algorithm that evaluates permutation entropy using triplets of points separated by a fixed distance. N is the length of the input string; the input is processed as triplets of points; one variable defines the distance between the points of a triplet, and another variable represents the number of bins within the distribution. |
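Since the pseudo-code itself is not reproduced here, the following is a minimal sketch of the standard permutation entropy of order three with a configurable distance between the points of each triplet. It is an assumption that this matches Algorithm 1 [92]; the names `permutation_entropy`, `lag`, and `normalize` are illustrative. For triplets there are 3! = 6 possible ordinal patterns, which play the role of the bins of the distribution.

```python
import math
from collections import Counter

def permutation_entropy(signal, lag=1, normalize=True):
    """Permutation entropy of order 3; `lag` is the distance between triplet points."""
    order = 3
    n_triplets = len(signal) - (order - 1) * lag
    if n_triplets <= 0:
        raise ValueError("signal is too short for this lag")
    counts = Counter()
    for i in range(n_triplets):
        triplet = (signal[i], signal[i + lag], signal[i + 2 * lag])
        # Ordinal pattern: positions of the triplet sorted by value
        # (equal values keep their time order because sorted() is stable).
        pattern = tuple(sorted(range(order), key=lambda k: triplet[k]))
        counts[pattern] += 1
    probs = [c / n_triplets for c in counts.values()]
    entropy = sum(p * math.log2(1.0 / p) for p in probs)
    if normalize:
        entropy /= math.log2(math.factorial(order))  # scale to the range [0, 1]
    return entropy

# A monotone ramp contains a single ordinal pattern, so its entropy is zero.
pe_ramp = permutation_entropy(list(range(50)), lag=5)

# A zigzag alternates between two equally frequent patterns: exactly 1 bit.
zigzag = [i % 2 for i in range(50)]
pe_zigzag_bits = permutation_entropy(zigzag, lag=1, normalize=False)
```

Evaluating the same recording for several values of `lag` gives one entropy value per lag; applying the function to a sliding window would give a curve over time, which is again an assumption about how the curves in this study were produced.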
4.3. Permutation Entropy (PeE) of Rabbits’ ECG Recordings
4.4. Preprocessing of ECG Recordings: Detailed Inspection of ECG Recordings, and Defining Exact Moments of Drug Applications and Their Increasing Doses
4.5. Statistical Features of Subintervals: Demonstrated on Selected Example of 20
4.6. Systematic Definitions of All Features Used During Preprocessing and Evaluation of Curves: This Section Serves as Reference to All Experiments Conducted in This Study and for Easy Orientation
4.6.1. Definition of Data, Features, and Used Operators Abbreviations: Systematic Overview
4.6.2. Assessing Curves and Designing Data Structures for ML-Experiments
- (i)
- The minimum number of drug infusions common to all rabbits was selected in order to compare the rabbits correctly. It yielded two intervals: the interval ’0’, called the comparison/control interval (before the methoxamine infusion), and the interval ’1’, called the methoxamine interval (after the methoxamine infusion). No other intervals can be used because some rabbits developed arrhythmia and died during the second interval (after the application of the first infusion of dofetilide; the interval called ’2’). Some other rabbits died in the interval ’3’, after an increase of the dose of dofetilide. In contrast, all non-arrhythmogenic rabbits survived all the harsh drug insults applied.
- (ii)
- For each rabbit, the selected control interval had a length of 505 seconds (the minimal common value) and ended just before the application of the first infusion (methoxamine). The actual moment of methoxamine application varies between rabbits.
- (iii)
- The time interval between the start of the methoxamine infusion and the first dofetilide infusion differs for each rabbit. First, the length of this interval was retrieved for each rabbit separately. Second, the minimal common length of this interval across all rabbits was determined, which gave 465 seconds. It is assumed that the moment of methoxamine application belongs to the methoxamine interval and not to the control interval: the drug disruption of physiology starts there, whereas the disruption by anesthesia is already present in the control interval.
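The selection of the two common intervals described above can be sketched as follows. This is an illustration under stated assumptions, not the code of the study: the recordings are toy data with one sample per second, the keys `samples`, `t_methoxamine`, and `t_dofetilide` are hypothetical, and the control length is assumed to be the shortest stretch available before the methoxamine infusion.

```python
def common_windows(recordings):
    """Cut equally long control and methoxamine windows for every rabbit.

    `recordings` maps a rabbit id to a dict with (hypothetical) keys:
      'samples'        one value per second,
      't_methoxamine'  start of the methoxamine infusion in seconds,
      't_dofetilide'   start of the first dofetilide infusion in seconds.
    """
    # Minimal common lengths across all rabbits.
    control_len = min(r["t_methoxamine"] for r in recordings.values())
    methox_len = min(r["t_dofetilide"] - r["t_methoxamine"]
                     for r in recordings.values())
    windows = {}
    for rabbit, r in recordings.items():
        t_m = r["t_methoxamine"]
        windows[rabbit] = {
            # The control window ends just before the infusion starts.
            "control": r["samples"][t_m - control_len:t_m],
            # The moment of application itself belongs to the methoxamine window.
            "methoxamine": r["samples"][t_m:t_m + methox_len],
        }
    return control_len, methox_len, windows

# Toy recordings (invented times): the sample value equals its time in seconds.
recordings = {
    "rabbit_1": {"samples": list(range(1300)), "t_methoxamine": 600, "t_dofetilide": 1100},
    "rabbit_2": {"samples": list(range(1300)), "t_methoxamine": 505, "t_dofetilide": 1000},
    "rabbit_3": {"samples": list(range(1300)), "t_methoxamine": 700, "t_dofetilide": 1165},
}
control_len, methox_len, windows = common_windows(recordings)
```

With these invented times the common lengths come out as 505 and 465 seconds, chosen only to mirror the values quoted in the text.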
4.6.3. Preprocessing of Curves: Design and Creation of Subintervals That Were Subsequently Tested by Whole Range of ML Methods
4.7. List of All Tested Combinations of Features Used to Find the Best Statistics and ML Methods: This Serves as Thorough Navigation Tool to Design Similar Future Approaches
4.7.1. Simple Statistical Features
4.7.2. Advanced Statistical Features
4.7.3. All Tested Features and Their Combinations
4.8. List of All Performed Machine Learning Experiments: The Core Part of the Conducted Research Together with Permutation Entropy Evaluation
- (A)
-
Important time moments: each of the PeE sub-intervals was too long for ML methods (approximately 100 values), and thus various subsampling strategies were tested; only Random Forest succeeded. The pitfall of this approach is that the classification outputs were introduced into the experiment already during the subsampling by Random Forest. After this preselection, the identified important time moments (subsamples of the sub-intervals) served as the input for the subsequently used ML methods. It is not certain whether this approach is admissible when dealing with such an extremely low number of ECG recordings.
To confirm the results of this preselection, the importance of the identified time moments was verified manually using box-and-whisker plots (box plots). They revealed that arrhythmogenic and non-arrhythmogenic rabbits have different values (distributions) at these time moments, which validated the identified features. It follows that the same time moments could be identified manually by comparing the values (distributions) of arrhythmogenic and non-arrhythmogenic rabbits at the appropriate time moments. Random Forest may thus be interpreted simply as a selection technique that automates and speeds up the selection of the important time moments.
This approach may still fail on larger data. The important time moments were identified on a few PeE curves and may not exist in a larger data set of ECGs; see Figure 17 for a counter-example, in which a case similar to ours is represented by five sinus curves and the conclusion is subsequently negated by the use of ten curves. The use of this type of feature is therefore extremely unsafe, and the achieved results must be considered with extreme caution!
The identified important time moments of the sub-intervals were tested by:
- (i)
- single machine learning algorithm and
- (ii)
-
ensembles of them
- (1)
- Bagging
- (2)
- AdaBoost
- (3)
- Combination of different algorithms for each value of L parameter
- (4)
- Combination of classifiers (see Section 6.5) for all values of L parameter at once
- (B)
-
Simple statistics: the simple statistical features (mean, standard deviation, variance, min, max; 25th, 50th and 75th percentiles) were evaluated and then tested using the following ML methods & features
- (i)
-
Simple statistics of the original PeE curves
- (1)
- (dead end) by a single machine learning algorithm, and
- (2)
-
(dead end) by ensembles of them
- (a)
- (dead end) Combination of classifiers for all values of L parameter at once.
- (ii)
-
Simple stats of the sub-intervals
- (1)
- (dead end) by a single machine learning algorithm, and
- (2)
-
(dead end) by ensembles of them
- (a)
- (dead end) Combination of classifiers for all values of L parameter at once.
Wherever the words ’dead end’ are used, it means that the given ML method(s) did not reach any significant result.
- (C)
-
Advanced statistics: advanced statistical features (integral, skewness, kurtosis, slope, [max-min]/length of the curve, energy of the curve, sum of the values of the curve, trend of the curve [uptrending, downtrending, or without trend]) were evaluated and subsequently tested together with simple statistical features using the following ML methods
- (i)
-
Simple + advanced stats of the original PeE curves evaluated by
- (1)
- (dead end) by a single machine learning algorithm and
- (2)
-
by ensembles of them
- (a)
- (dead end) Bagging,
- (b)
- (dead end) AdaBoost,
- (c)
- (dead end) Combination of different algorithms for each value of L parameter,
- (d)
- Combination of classifiers for all values of L parameter at once.
- (ii)
-
Simple + advanced stats of the sub-intervals evaluated by
- (1)
- (dead end) by a single machine learning algorithm and
- (2)
-
(dead end–all following cases) by ensembles of them
- (a)
- Bagging,
- (b)
- AdaBoost,
- (c)
- Combination of different algorithms for each value of L parameter,
- (d)
- Combination of classifiers for all values of L parameter at once.
- (D)
-
Important statistics: important statistical features from the simple + advanced stats were evaluated and subsequently tested using the following ML methods
- (i)
-
For the original PeE curves
- (1)
- (dead end) by a single machine learning algorithm and
- (2)
-
by ensembles of them
- (a)
- (dead end) Bagging,
- (b)
- (dead end) AdaBoost,
- (c)
- (dead end) Combination of different algorithms for each value of L parameter,
- (d)
- Combination of classifiers for all values of L parameter at once.
- (ii)
-
For the sub-intervals
- (1)
- (dead end) by a single machine learning algorithm and
- (2)
-
by ensembles of them
- (a)
- (dead end) Bagging,
- (b)
- (dead end) AdaBoost,
- (c)
- (dead end) Combination of different algorithms for each value of L parameter,
- (d)
- Combination of classifiers for all values of L parameter at once.
- (E)
-
Statistic & Time Feature Combinations: Combination of statistical features (simple + advanced or only important from simple + advanced) and important time moments were tested using the following ML methods:
- (i)
- A single machine learning algorithm and
- (ii)
-
Ensembles of ML Algorithms
- (1)
- Bagging,
- (2)
- AdaBoost,
- (3)
- Combination of different algorithms for each value of L parameter,
- (4)
- Combination of classifiers for all values of L parameter at once.
- (F)
-
DFT coefficients: Important DFT coefficients of the sub-intervals, selected from the first 15 real and first 15 imaginary DFT coefficients, were tested using the following ML methods:
- (i)
-
A single machine learning algorithm (k-NN); the best result, found for 40, gives: Se = 0.845, Sp = 0.835, AUC = 0.840, Acc = 0.839, and
- (ii)
-
Ensembles of ML algorithms
- (i)
- Combination of classifiers for all values of L parameter at once.
- (G)
-
Important time moments based on DFT: important time moments of the sub-intervals reconstructed by the first ten DFT coefficients (real and imaginary parts) were tested by:
- (i)
- (dead end) by a single machine learning algorithm
- (H)
-
DWT coefficients: The ten DWT coefficients of the sub-intervals with the largest absolute values were taken. These coefficients were sorted in descending order and subsequently classified using machine learning algorithms by
- (i)
- (dead end) A single machine learning algorithm,
- (ii)
-
Ensembles of them
- (1)
- Combination of classifiers for all values of L parameter at once.
- (I)
-
Important time moments based on DWT: important time moments of the sub-intervals reconstructed from the ten DWT coefficients with the largest absolute values were tested by
- (i)
- A single machine learning algorithm.
- (J)
- Useful L: best values of the L parameter based on all above-mentioned experiments. This item contains the summary; see Table 8 for the values of the L parameter with ARARS scores greater than 80%. For values of the L parameter with ARARS scores greater than 90%, the highest ARARS scores are explicitly stated. As can be seen from the table, the most useful values of the L parameter are 10, 90, and 500.
- (K)
- The usefulness of ML methods and their best results has been evaluated and documented; see Table 7.
- (L)
- Outlier detection was not performed due to an insufficient number of experimental ECG recordings.
- (M)
- Real-time prediction on the one-minute sub-intervals was tested. It revealed that the first three minutes after application of methoxamine are the most important.
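The simple and advanced statistical features named in items (B) and (C) above can be computed per curve or per sub-interval roughly as follows. This is a hedged sketch using only the Python standard library; the exact definitions used in the study are not given here, so the population variance, the linear-interpolation percentiles, the trapezoidal integral, the least-squares slope, and the slope-based trend label are all assumptions, as are the function and key names.

```python
import statistics

def percentile(sorted_values, q):
    """Percentile with linear interpolation between neighbouring values."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * (pos - lower)

def simple_features(x):
    s = sorted(x)
    return {
        "mean": statistics.fmean(x),
        "std": statistics.pstdev(x),
        "var": statistics.pvariance(x),
        "min": s[0],
        "max": s[-1],
        "p25": percentile(s, 25),
        "p50": percentile(s, 50),
        "p75": percentile(s, 75),
    }

def advanced_features(x, dt=1.0, flat_tol=1e-9):
    """Needs at least two samples; `dt` is the assumed spacing between samples."""
    n = len(x)
    mean = statistics.fmean(x)
    std = statistics.pstdev(x)
    integral = sum(0.5 * (a + b) * dt for a, b in zip(x, x[1:]))  # trapezoidal rule
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    t_mean = (n - 1) * dt / 2.0
    # Least-squares slope of the values against time.
    slope = (sum((i * dt - t_mean) * (v - mean) for i, v in enumerate(x))
             / sum((i * dt - t_mean) ** 2 for i in range(n)))
    if slope > flat_tol:
        trend = "up"
    elif slope < -flat_tol:
        trend = "down"
    else:
        trend = "none"
    return {
        "integral": integral,
        "skewness": m3 / std ** 3 if std > 0 else 0.0,
        "kurtosis": m4 / std ** 4 - 3.0 if std > 0 else 0.0,  # excess kurtosis
        "slope": slope,
        "range_per_length": (max(x) - min(x)) / n,
        "energy": sum(v * v for v in x),
        "sum": sum(x),
        "trend": trend,
    }

curve = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy values, not measured data
features = {**simple_features(curve), **advanced_features(curve)}
```

Merging both dictionaries gives one feature vector per curve, which could then be standardized and passed to a classifier.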
4.9. List of Best Achieved Machine Learning Results for Normal Group Against non-TdP and TdP Acquiring Group of Rabbits
4.9.1. Lags L of PeE That Are Giving Best Results for Listed Combinations of Features and Algorithms
4.9.2. Predictions Employing Majority Voting above Simultaneous Combinations of Classifiers for All Values of Lags L
| Features | Process map ID | Used algorithms | Se | Sp | AUC | Acc |
|---|---|---|---|---|---|---|
| ISI_Top5-TM | aii4 | SVM | 1.0 | 1.0 | 1.0 | 1.0 |
| | | RF | 0.99 | 0.88 | 0.93 | 0.95 |
| | | k-NN | 0.99 | 1.0 | 0.99 | 0.99 |
| | | LR | 0.96 | 0.87 | 0.91 | 0.93 |
| OC_Top5-ASF | di2d | SVM | 0.83 | 0.97 | 0.9 | 0.88 |
| | | SVM, RF, k-NN, LR | 0.8 | 0.99 | 0.9 | 0.87 |
| OC_ASF | ci2d | SVM | 0.83 | 0.86 | 0.84 | 0.84 |
| ISI_Top5-ASF | dii2d | SVM | 0.93 | 0.93 | 0.93 | 0.93 |
| ISI_Top5-TM & ISI_Top5-ASF | eii4 | SVM, RF, k-NN, LR | 1.0 | 1.0 | 1.0 | 1.0 |
| ISI-DFT_Top5-C | fii | SVM | 0.99 | 0.99 | 0.99 | 0.99 |
| ISI-DWT_Top10-C | hii | SVM | 0.95 | 0.85 | 0.9 | 0.9 |
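Majority voting over classifiers trained for different lags L, and the sensitivity (Se), specificity (Sp), and accuracy (Acc) reported in the table above, can be sketched as follows. The labels and per-lag predictions are invented toy values, the function names are illustrative, and the tie rule (a tie counts as class 0) is an assumption.

```python
def vote_across_lags(pred_by_lag):
    """Combine binary predictions (one list per lag) by strict majority vote."""
    lags = sorted(pred_by_lag)
    n_cases = len(pred_by_lag[lags[0]])
    combined = []
    for i in range(n_cases):
        votes = [pred_by_lag[lag][i] for lag in lags]
        # Class 1 needs more than half of the votes; a tie falls back to class 0.
        combined.append(1 if 2 * sum(votes) > len(votes) else 0)
    return combined

def se_sp_acc(y_true, y_pred):
    """Sensitivity, specificity, and accuracy; both classes must be present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(y_true)

# Toy example: 1 = arrhythmogenic, 0 = non-arrhythmogenic (invented labels).
y_true = [1, 1, 1, 0, 0, 0]
pred_by_lag = {
    10:  [1, 1, 0, 0, 0, 1],
    90:  [1, 0, 1, 0, 0, 0],
    500: [1, 1, 1, 0, 1, 0],
}
combined = vote_across_lags(pred_by_lag)
se, sp, acc = se_sp_acc(y_true, combined)
```

In this toy case every single-lag classifier makes at least one error, while the vote recovers all labels. With so few cases such an outcome can easily arise by chance, which is in line with the caution expressed elsewhere in the text.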
4.9.3. List of Best ML-results for All Values of Lags L: Guide to Easy Orientation
| L | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 | 200 | 300 | 400 | 500 | |
| RF | x | x | x | |||||||||||||
| SVM | x | x | x | x | x | x | x | 0.92 | x | 0.92 | x | 0.93 | ||||
| k-NN | x | x | x | x | x | x | 0.92 | x | x | x | 0.91 | |||||
| LR | 0.92 | 0.92 | x | x | x | x | ||||||||||
| Ensemble | 0.95 | x | x | x | 0.92 | x | x | 0.93 | ||||||||
5. Discussion
5.1. The Role of Hypothesis Creation and Testing in Science, Statistics, and Machine Learning
- (i)
- A scientific hypothesis is a preliminary idea/guess fitting the evidence that must be further elucidated [95]. A good scientific hypothesis is testable, and testing shows it to be either true or false. When it is shown to be true, it becomes a law or theory. In the future, any law or theory can be disproved in the light of new evidence.
- (ii)
-
A statistical hypothesis deals with the relationship between observations. A statistical hypothesis test computes a value (called p) that says how probable an observation at least as extreme as the actual one would be if it arose by mere chance [95]. A lower p means stronger evidence that the observation is not a chance effect of the chosen data. A high p means that the relationship may well be observed by chance. The value p = 0.05 means that an observation this extreme would arise by mere chance in five percent of cases; with decreasing p, such chance decreases. We are never one hundred percent sure about the outcome of statistical hypothesis testing, even for very small p-values. That is why we want to reach a p as small as possible.
Two types of hypotheses are used:
- (0)
- Null hypothesis–H0 states that there is no difference between the observed events, that is, no effect is present; p is computed under this assumption.
- (1)
- Alternative hypothesis–H1 states that some effect is present. The alternative hypothesis is accepted when the null hypothesis is rejected.
- (iii)
-
The machine learning hypothesis is a model that approximates the relationship between input and output in the best way among all possible hypotheses that can be made using the given method(s) [48,78]. Learning in ML is de facto a search through the space of all available hypotheses for a given set of ML methods.
There are two types of hypotheses recognized in ML:
- (1)
- h (hypothesis): a single hypothesis, that is, a specific model mapping inputs to outputs, which can be evaluated and used to make predictions on unknown data.
- (2)
- H (hypothesis set): a space of hypotheses, that is, the space of all possible hypotheses that can map the given inputs to the known outputs, which is searched for the best candidate.
By choosing the model(s) and their parameters, we define a hypothesis space H, which is then searched for the single hypothesis h that best approximates the target function between inputs and outputs. Choosing more models and parameters enlarges H and raises the chance that it contains a good approximation, but it also lengthens the search; choosing fewer speeds the search up. Making this choice well is very difficult, as we do not know the target function in advance.
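To make the meaning of p from item (ii) concrete, the sketch below runs an exact permutation test for the difference between two group means. It is an illustration, not a test used in this study; the group values are invented and the function name is hypothetical. The null hypothesis H0 is that the group labels are exchangeable, and p is the fraction of all relabelings that give a difference at least as large as the observed one.

```python
import itertools
import statistics

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test for the difference of group means."""
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    n_splits = 0
    n_extreme = 0
    # Enumerate every way of relabeling n_a of the pooled values as group A.
    for idx in itertools.combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        n_splits += 1
        if diff >= observed - 1e-12:  # small tolerance for floating-point rounding
            n_extreme += 1
    return n_extreme / n_splits

# Invented feature values for two groups of four animals each.
group_a = [0.91, 0.93, 0.95, 0.97]
group_b = [0.80, 0.82, 0.84, 0.86]
p = permutation_p_value(group_a, group_b)
```

With two fully separated groups of four, only 2 of the 70 possible relabelings are as extreme as the observed one, so p = 2/70. This also shows that very small samples put a hard floor on the smallest p that can be reached.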
5.2. Input Data: What We Must Be Aware of Prior to Evaluation of ECG Data by Entropy Measures and ML Methods
- (i)
- Availability of ECG recordings as open-access data has a very high priority in research. It is the fundamental condition for developing reliable and replicable AI, ML, and DL methods in medicine. Everyone must be able to reevaluate the research (the data are not open-source in this study [52,88,89]). Additionally, it opens the way to future, even more sensitive algorithms and to their easy comparison with their predecessors. Each change of the ECG database (usually an update) must be followed by a reevaluation of the current and all older algorithms applied to it. This can be automated.
- (ii)
- Annotation of ECG recordings by cardiologists. This provides very strong input data for all supervised algorithms/methods that are going to be applied to those data, as it helps to ensure that the methods will provide correct outputs. Any errors in ECG annotation can be fatal to all subsequent evaluation steps. Those errors must be eliminated prior to any further evaluation of the data; a double-check using two different approaches is desirable (not done for the ECGs in this study).
- (iii)
- Visual inspection of ECG recordings by a mathematician prior to their CS processing. It decreases the possibility of spoiling the training data with incorrect data. This is hard, as mathematicians are usually not trained to classify heart arrhythmias, tachycardias, blocks, and other heart diseases, but failure at this stage can have serious, unwanted consequences for data processing. Completely nonsensical data can be ruled out in this way. Exclusion of data must be carried out with great care, as we might rule out features that we do not currently understand but that may prove useful in the future.
- (iv)
- Entropy measures evaluation of ECG recordings. A visual inspection of the entropy curves is necessary. An inspection of HRV is desirable too, as it can reveal hidden, non-obvious physiological insults that can influence the development of classification/prediction methods. In the ECG recordings used in this study, abrupt changes of HRV were detected that were not explained by the authors of the recordings. A detailed inspection of the entropy measures presented in combined graphs reveals a lot of hidden information; it helps to design the ML phase of the research and to improve feature selection.
- (v)
- Statistical evaluation of entropy curves serves as a preparatory step for the application of ML methods. Statistics alone cannot discern and classify the observed phenomena with sufficient precision and reliability.
- (vi)
- Feature selection. It is important to pre-process data, as they are usually too large to be evaluated by ML without any kind of feature selection. Feature selection narrows the amount of processed data in the ML stage.
- (vii)
- Machine learning preliminary tests. This stage helps to quickly scan the huge space of all tested hypotheses that is defined by the input data and their statistical processing, which produces the features used in ML methods.
- (viii)
- Machine learning production runs using various methods. This stage zeroes in on the final, most efficient ML methods for the given input data. A very important part of this stage is the application of different ML methods to the same data, which can, in the ideal case, be preprocessed differently. Identical outputs from different ML methods are strong evidence that the methodology and the effect itself are correct.
5.3. Complex Systems View: Wider Background and Methodology

5.4. Machine Learning View: Grouped by Different Approaches
5.4.1. Classification by Statistical Features of PeE Alone
5.4.2. Single ML Methods
5.4.3. Ensembles of ML Methods: Statistical Features, Selected Times, and Other Features
5.5. Advantages of Used ML Methods
5.6. Disadvantages of Used ML Methods
5.7. Limitations of Achieved Results: General and Specific
- Different species, such as rabbits, dogs, pigs, and humans, will very probably produce different features. Hence, ML methods will produce different results!
- The same species can have intra-species variability. It means that the same output, an arrhythmia, can have two distinct sets of features that detect it.
- In laboratory animals, a lack of intra-species variability and signs of inbreeding are observed more often. This shifts experimental animals far from the standard population, which has high numbers of gene alleles and different epigenetic setups!
- It is even possible that specially bred laboratory animals could produce different results in different laboratories for reasons that cannot be determined, because some part of the protocol can be slightly altered, the diet can differ, the treatment of animals by staff can differ, or the operation procedure can vary. One example is the light regimen, which strongly affects the hormonal setup of animals that are otherwise identical.
- All of the above influences can alter the underlying physiology of the tested animals in ways unknown to us. This has a substantial impact on the heart physiology of the animals and, hence, can alter the entire experiment and arrhythmia prediction.
- The robustness of achieved results in this study must be tested by exploration of larger numbers of animals, different species, and finally on humans.
- A single AI/ML method is less reliable than several independent methods reaching the same conclusion. It is worth stressing that this research is consistent with the results of deep learning research [103], in which the authors found a neural network capable of predicting arrhythmias in ICU patients from ECG recordings one hour before their onset.
5.8. Biomedical Point of View
5.9. Future Directions
6. Materials and Methods
6.1. Database of ECG Recordings Measured on Rabbits
6.2. Permutation Entropy
6.3. Simple and Advanced Statistics Used to Classify ECGs
6.4. Preclinical and Clinical Chained Research Method
6.5. Machine Learning Techniques: Brief Description
6.5.1. Training and Test Data Sets
6.5.2. Logistic Regression (LR)
6.5.3. k-Nearest Neighbors Algorithm (k-NN)
6.5.4. Support Vector Machine (SVM)
6.5.5. Decision Tree (DT)
6.5.6. Ensemble Learning (EL)

6.5.7. Random Forest (RF)
- 1.
- Randomly chosen subsets are drawn from the original data with replacement (bootstrapping).
- 2.
- During its construction, every decision tree selects the best split at each node using only a random subset of all features (this ensures the diversity of the trees).
- 3.
- Each constructed decision tree is evaluated for each subset.
- 4.
- The final decision is made by the majority vote over the predictions of all trees. In regression, the final decision is instead obtained by averaging the predicted values.
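The four steps above can be illustrated by a deliberately reduced sketch in which every tree is cut down to a single split (a decision stump) to keep the example short. This is an assumption-laden toy, not the Random Forest implementation used in the study: the data are invented, the split criterion is the weighted Gini impurity, and all function names are illustrative.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def best_stump(X, y, feature_ids):
    """Best single split (feature, threshold) by weighted Gini impurity."""
    best = None
    for f in feature_ids:
        values = sorted({row[f] for row in X})
        # Candidate thresholds lie midway between neighbouring distinct values.
        for thr in ((a + b) / 2.0 for a, b in zip(values, values[1:])):
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, thr, majority(left), majority(right))
    if best is None:  # no split possible: always predict the majority class
        return (None, None, majority(y), majority(y))
    return best[1:]

def fit_forest(X, y, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]           # step 1: bootstrap sample
        feats = rng.sample(range(len(X[0])), n_features)   # step 2: random feature subset
        forest.append(best_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict(forest, row):
    votes = []
    for f, thr, left_label, right_label in forest:         # step 3: evaluate every tree
        votes.append(left_label if f is None or row[f] <= thr else right_label)
    return majority(votes)                                  # step 4: majority vote

# Invented, well separated two-class toy data with two features.
X = [[0.10, 0.20], [0.20, 0.30], [0.25, 0.10], [0.30, 0.25],
     [0.70, 0.80], [0.80, 0.75], [0.90, 0.90], [0.75, 0.70]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(X, y)
predictions = [predict(forest, row) for row in X]
```

A full random forest would grow deeper trees and re-draw the feature subset at every node; the stump version keeps only the bootstrap, the feature randomization, and the vote.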
6.6. Definition of ARARS Score: More Balanced Performance Measure
6.7. Standardization of Features: Rescaling of Variables
6.8. Estimates of Feature Importance: Navigation Tools Towards Viable Hypotheses
- (i)
- All data of one rabbit are stored in one table, which provides simultaneous information about all features/times and all lags L in one place (two-dimensional case, 2-D); see Figure 25. This is how data are stored in the program.
- (ii)
- The vertical viewpoint displays varying features (alternatively varying time) for a fixed, preselected lag L for each rabbit separately (3-D case); one horizontal plane represents one rabbit; see Figure 26.
- (iii)
- The horizontal viewpoint displays varying lag L against a fixed, preselected feature for each rabbit separately (3-D case); one horizontal plane represents one rabbit; see Figure 27.
- (iv)
- The PeE time-slicing method takes values from the curves of different rabbits (N in total) for a given fixed lag L at the preselected time t and creates a time slice (2-D case) from them. This time slice is subsequently displayed in N-dimensional space (N-D case); a 3-D example with the same #L for three different rabbits is shown in Figure 28.
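The viewpoints listed above can be sketched with a nested dictionary that stores, for every rabbit, one curve per lag L. The layout, the names `pee`, `time_slice`, and `lag_profile`, and all numbers are illustrative assumptions, not the data structure of the actual program.

```python
def time_slice(pee, lag, t):
    """Values of all rabbits at time index t for one fixed lag (item iv)."""
    return [pee[rabbit][lag][t] for rabbit in sorted(pee)]

def lag_profile(pee, rabbit, t):
    """Values of one rabbit at time index t across all lags (horizontal viewpoint)."""
    return {lag: curve[t] for lag, curve in sorted(pee[rabbit].items())}

# Invented values: rabbit -> lag L -> curve sampled at three time moments.
pee = {
    "rabbit_1": {10: [0.61, 0.64, 0.70], 90: [0.80, 0.82, 0.85]},
    "rabbit_2": {10: [0.58, 0.60, 0.59], 90: [0.77, 0.79, 0.78]},
    "rabbit_3": {10: [0.66, 0.71, 0.75], 90: [0.83, 0.88, 0.90]},
}

# With three rabbits, one time slice is a single point in 3-D space,
# one coordinate per rabbit.
point = time_slice(pee, lag=10, t=1)
profile = lag_profile(pee, "rabbit_2", t=2)
```

Collecting `time_slice` over all time indices traces a trajectory in N-dimensional space, where N is the number of rabbits.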
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| ECG | Electrocardiogram |
| PVC | Premature ventricular contraction |
| TdP | Torsades de Pointes |
| VT | Ventricular tachycardia |
| VF | Ventricular fibrillation |
| HRV | Heart rate variability |
| CS | Complex system |
| PeE | Permutation Entropy |
| STD | Standard deviation |
| AI | Artificial Intelligence |
| ML | Machine Learning |
| SVM | Support vector machine |
| RF | Random forest |
| LR | Logistic regression |
| EL | Ensemble learning |
| DM | Data mining |
Appendix A. Used ML Methods Strategies and Specifications
Appendix A.1. Figure Depicting Strategy and Structure of All Performed ML Experiments

Appendix A.2. Principles of Ensemble Learning: Explained on SVM


Appendix A.3. Gini Impurity
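| Example | Class A [%] | Class B [%] | Total # samples [100 %] | Filling formula | Final |
|---|---|---|---|---|---|
| A | 100 | 0 | 100 | $1 - (1.00^2 + 0.00^2)$ | 0 |
| B | 75 | 25 | 100 | $1 - (0.75^2 + 0.25^2)$ | 0.375 |
| C | 50 | 50 | 100 | $1 - (0.50^2 + 0.50^2)$ | 0.5 |
| D | 25 | 75 | 100 | $1 - (0.25^2 + 0.75^2)$ | 0.375 |
| E | 0 | 100 | 100 | $1 - (0.00^2 + 1.00^2)$ | 0 |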
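The five examples in the table can be evaluated with the standard two-class Gini impurity, $G = 1 - (p_A^2 + p_B^2)$, where $p_A$ and $p_B$ are the class fractions. The short sketch below assumes this standard definition; the function name is illustrative.

```python
def gini_impurity(class_counts):
    """Gini impurity 1 - sum(p_k^2) computed from per-class sample counts."""
    total = sum(class_counts)
    if total == 0:
        raise ValueError("at least one sample is required")
    return 1.0 - sum((c / total) ** 2 for c in class_counts)

# (class A, class B) counts out of 100 samples, as in examples A to E.
examples = {"A": (100, 0), "B": (75, 25), "C": (50, 50), "D": (25, 75), "E": (0, 100)}
impurity = {name: gini_impurity(counts) for name, counts in examples.items()}
```

The impurity is zero for a pure node (examples A and E) and reaches its two-class maximum of 0.5 for an even mix (example C).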
Appendix B. Reliability, Reproducibility, and Safety of ML/AI Solution–Critical Questions
Appendix B.1. Point 1.: "How is ML/AI Model Embedded in Feedback Loops to Facilitate a Learning Health System?"
Appendix B.2. Point 2.: "What Is the Health Question Relating to Patient Benefit?"
Appendix B.3. Point 3.: "When and How Should Patients Be Involved in Data Collection, Analysis, Deployment, and Use?"
Appendix B.4. Point 4.: "Is There Organizational Transparency about the Flow of Data?"
Appendix B.5. Point 5.: "Is the data suitable to answer the clinical question, i.e., does it capture the relevant real-world heterogeneity, and is it of sufficient detail and quality?"
Appendix B.6. Point 6.: "Does the validation methodology reflect the real world constraints and operational procedures associated with data collection and storage?"
Appendix B.7. Point 7.: "On what basis are data accessible to other researchers?"
Appendix B.8. Point 8.: "What computational software resources are available, and are they sufficient to tackle this problem?"
Appendix B.9. Point 9.: "Are the reported performance metrics relevant for the clinical context in which the model will be used?"
Appendix B.10. Point 10.: "Is the reported gain in statistical performance with the AI/ML algorithm clinically justified in the context of any trade-offs?"
Appendix B.11. Point 11.: "Is the ML/AI algorithm compared to the current best technology, and against other appropriate baselines?"
Appendix B.12. Point 12.: "Are different parts of the prediction modeling pipeline available to others to allow for method reproducibility, including: statistical code for ’preprocessing’, and the modeling work-flow (including the methods, parameters, random seeds, etc. utilized)?"
Appendix B.13. Point 13.: "Are the results reproducible in settings beyond where the system was developed (i.e., external validity)?"
Appendix B.14. Point 14.: "What evidence is there that the model does not create or exacerbate inequities in health-care by age, sex, ethnicity or other protected characteristics?"
Appendix B.15. Point 15.: "What evidence is there that clinicians and patients find the model and its output (reasonably) interpretable?"
Appendix B.16. Point 16.: "What evidence is there of real world model effectiveness in the proposed clinical setting and how are unintended consequences prevented?"
Appendix B.17. Point 17.: "How is the model being regularly re-assessed, and updated as data quality and clinical praxes changes (i.e., post-license monitoring)?"
Appendix B.18. Point 18.: "Is the ML/AI model cost-effective to build, implement, and maintain?"
Appendix B.19. Point 19.: "How will potential financial benefits be distributed if the ML/AI model is commercialized?"
Appendix B.20. Point 20.: "How have the regulatory requirements for accreditation/approval been addressed?"
Appendix C. Results–Part 2: An Example of Falsified Machine Learning Results for Normal and Non-TdP Against TdP Acquiring Groups of Rabbits
Appendix C.1. Normal and Non-TdP Against TdP Acquiring Rabbits: Lags L of PeE That Are Giving Best Results for Listed Combinations of Features and Algorithms
| Features | Used algorithms | Used L (Control) | Used L (Methoxamine) |
|---|---|---|---|
| ISI_Top5-TM | SVM | 1, 10, 100, 20, 30, 300, 400, 50, 60, 80 | 1, 10, 20, 200, 30, 300, 5, 50, 500, 60, 70, 80, 90 |
| | RF | 10, 30, 60, 70 | 10, 30, 5, 70 |
| | k-NN | 10, 20, 200, 30, 300, 400, 5, 50, 70, 80 | 20, 200, 30, 5, 90 |
| | LR | 10, 50, 60 | 70, 80, 90 |
| OC_Top5-ASF | SVM | 1, 10, 100, 20, 200, 30, 300, 40, 400, 5, 50, 500, 60, 70, 80, 90 | |
| | SVM, RF, k-NN, LR | RF: 1, 10, 100, 200, 30, 300, 40, 400, 5, 50, 500, 60, 70, 80, 90; LR: 10, 100, 20, 30, 40, 5, 50, 60, 80, 90; k-NN: 1, 10, 100, 20, 200, 30, 300, 40, 400, 5, 50, 500, 60, 70, 80, 90; SVM: 1, 10, 100, 20, 200, 30, 300, 40, 400, 5, 50, 500, 60, 70, 80, 90 | |
| OC_ASF | SVM | 10, 100, 20, 30, 300, 40, 5, 50, 70, 80, 90 | |
| ISI_Top5-ASF | SVM | 20, 30, 300, 60, 70 | 10, 100, 40, 5, 500 |
| ISI_Top5-TM & ISI_Top5-ASF | SVM, RF, k-NN, LR | RF: 10, 20, 300, 60; LR: 10, 300, 400, 50, 60, 80; k-NN: 20, 300, 5, 60; SVM: 10, 100, 20, 30, 300, 400, 5, 50, 60, 80 | RF: 10, 5, 70; LR: 100, 300, 50, 500, 70, 80, 90; k-NN: 10, 100, 200, 5, 500, 70, 90; SVM: 10, 100, 200, 30, 300, 5, 50, 500, 60, 70, 80 |
| ISI-DFT_Top5-C | SVM | 10, 20, 30, 300, 40, 50, 60, 80, 90 | 100, 20, 40, 5, 70, 80, 90 |
| ISI-DWT_Top10-C | SVM | 1, 20, 60 | 1, 20, 30, 500 |
Appendix C.2. Normal and non-TdP Against TdP Acquiring Rabbits: Predictions Employing Majority Voting above Simultaneous Combinations of Classifiers for All Values of Lags L
| Features | Process map ID | Used algorithms | Se | Sp | AUC | Acc |
|---|---|---|---|---|---|---|
| ISI_Top5-TM | aii4 | SVM | 1.0 | 1.0 | 1.0 | 1.0 |
| | | RF | 0.65 ↓ | 1.0 ↑ | 0.82 ↓ | 0.9 ↓ |
| | | k-NN | 0.77 ↓ | 1.0 | 0.88 ↓ | 0.93 ↓ |
| | | LR | 0.71 ↓ | 1.0 ↑ | 0.85 ↓ | 0.92 ↓ |
| OC_Top5-ASF | di2d | SVM | 0.77 ↓ | 0.95 ↓ | 0.86 ↓ | 0.9 ↑ |
| | | SVM, RF, k-NN, LR | 0.69 ↓ | 0.99 | 0.84 ↓ | 0.91 ↑ |
| OC_ASF | ci2d | SVM | 0.72 ↓ | 0.96 ↑ | 0.84 | 0.89 ↑ |
| ISI_Top5-ASF | dii2d | SVM | 0.88 ↓ | 0.98 ↑ | 0.93 | 0.95 ↑ |
| ISI_Top5-TM & ISI_Top5-ASF | eii4 | SVM, RF, k-NN, LR | 0.99 ↓ | 1.0 | 0.99 ↓ | 0.99 ↓ |
| ISI-DFT_Top5-C | fii | SVM | 0.99 | 1.0 ↑ | 0.99 | 0.99 |
| ISI-DWT_Top10-C | hii | SVM | 0.98 ↑ | 1.0 ↑ | 0.99 ↑ | 0.99 ↑ |
Appendix C.3. Normal and non-TdP Against TdP Acquiring Rabbits: List of Best ML-results for All Values of Lags L—Guide to Easy Orientation
| Algorithm / L | 1 | 5 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 | 200 | 300 | 400 | 500 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RF | x | x | x | x | x | x | x | x | x | x | x | x | x | x | | |
| SVM | x | x | x | x | x | x | 0.93 | x | x | 0.91 | x | x | x | 0.91 | x | 0.92 |
| k-NN | x | x | 0.91 | x | x | x | x | x | x | 0.91 | x | 0.91 | x | x | x | x |
| LR | x | x | x | x | x | x | x | x | x | x | x | x | x | x | | |
| Ensemble | x | x | x | x | x | x | x | 0.94 | 0.91 | x | x | x | x | x | x | x |
References
- Van Noord, C.; Eijgelsheim, M.; Stricker, B.H.Ch. Drug- and non-drug-associated QT interval prolongation. British Journal of Clinical Pharmacology 2010, 70, 16–23. [CrossRef]
- Jayasinghe, R.; Kovoor, P. Drugs and the QTc interval. Australian Prescriber 2002, 25, 63–65. [Google Scholar] [CrossRef]
- Chorin, E.; Dai, M.; Shulman, E.; Wadhwani, L.; Bar Cohen, R.; Barbhaiya, C.; Aizer, A.; Holmes, D.; Bernstein, S.; Spinelli, M.; et al. The QT Interval in Patients with SARS-CoV-2 Infection Treated with Hydroxychloroquine/Azithromycin. medRxiv 2020. [Google Scholar] [CrossRef]
- Li, M.; Ramos, L.G. Drug-Induced QT Prolongation And Torsades de Pointes. Pharmacy and Therapeutics: a peer-reviewed journal for formulary management 2017, 42, 473–477. [Google Scholar]
- Yap, Y.G.; Camm, A.J. Drug induced QT prolongation and torsades de pointes. Heart 2003, 89, 1363–1372. [Google Scholar] [CrossRef]
- Trinkley, K.E.; Page II, R.L.; Lien, H.; Yamanouye, K.; Tisdale, J.E. QT interval prolongation and the risk of torsades de pointes: essentials for clinicians. Current Medical Research and Opinion 2013, 29, 1719–1726. [Google Scholar] [CrossRef]
- Katz, A.M. Physiology of the heart, 5 ed.; Lippincott Williams & Wilkins, 2011; p. 576.
- Topol, E. Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again, 1st ed.; Basic Books, Inc.: USA, 2019. [Google Scholar]
- Cleophas, T.J.; Zwinderman, A.H. Machine Learning in Medicine - a Complete Overview; Springer, 2015; p. 516. [CrossRef]
- Kreibig, S.D. Autonomic nervous system activity in emotion: A review. Biological Psychology 2010, 84, 394–421, The biopsychology of emotion: Current theoretical and empirical perspectives. [Google Scholar] [CrossRef]
- Mauss, I.B.; Robinson, M.D. Measures of emotion: A review. Cognition and Emotion 2009, 23, 209–237. [Google Scholar] [CrossRef] [PubMed]
- Kléber, A.G.; Rudy, Y. Basic Mechanisms of Cardiac Impulse Propagation and Associated Arrhythmias. Physiological Reviews 2004, 84, 431–488. [Google Scholar] [CrossRef] [PubMed]
- Richman, J.S.; Moorman, J.R. Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology: Heart and Circulatory Physiology 2000, 278, H2039–H2049. [Google Scholar] [CrossRef] [PubMed]
- Costa, M.; Peng, C.K.; Goldberger, A.L.; Hausdorff, J.M. Multiscale entropy analysis of human gait dynamics. Physica A 2003, 330, 53–60. [Google Scholar] [CrossRef] [PubMed]
- Humeau-Heurtier, A. The Multiscale Entropy Algorithm and Its Variants: A Review. Entropy 2015, 17, 3110–3123. [Google Scholar] [CrossRef]
- Liang, Z.; Wang, Y.; Sun, X.; Li, D.; Voss, L.J.; Sleigh, J.W.; Hagihira, S.; Li, X. EEG entropy measures in anesthesia. Frontiers in Computational Neuroscience 2015, 9, 16. [Google Scholar] [CrossRef] [PubMed]
- Olofsen, E.; Sleigh, J.W.; Dahan, A. Permutation entropy of the electroencephalogram: a measure of anaesthetic drug effect. British Journal of Anaesthesia 2008, 101, 810–821. [Google Scholar] [CrossRef]
- Shivaram, S.; Muthyala, A.; Meghji, Z.Z.; Karki, S.; Arunachalam, S.P. Multiscale Entropy Technique Discriminates Single Lead ECG’s With Normal Sinus Rhythm and Sleep Apnea. In Proceedings of the Frontiers in Biomedical Devices, 04 2018, Vol. DMD2018-6948, p. V001T01A016. [CrossRef]
- Lee, C.H.; Sun, T.L.; Jiang, B.C.; Choi, V.H. Using Wearable Accelerometers in a Community Service Context to Categorize Falling Behavior. Entropy 2016, 18. [Google Scholar] [CrossRef]
- Vargas, B.; Cuesta-Frau, D.; Ruiz-Esteban, R.; Cirugeda-Roldan, E.; Varela, M. What Can Biosignal Entropy Tell Us About Health and Disease? Applications in Some Clinical Fields. Nonlinear Dynamics Psychology and Life Sciences 2015, 19, 419–436. [Google Scholar]
- Lynch, K. Apple Watch 4. Smartwatch, Apple Inc., Cupertino, CA, USA, 2019.
- Schulz, S.; Adochiei, F.C.; Edu, I.R.; Schroeder, R.; Costin, H.; Bär, K.J.; Voss, A. Cardiovascular and cardiorespiratory coupling analyses: a review. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 2013, 371, 20120191. [Google Scholar] [CrossRef]
- Frank, B.; Pompe, B.; Schneider, U.; Hoyer, D. Permutation entropy improves fetal behavioural state classification based on heart rate analysis from biomagnetic recordings in near term fetuses. Medical & Biological Engineering & Computing 2006, 44, 179–87. [Google Scholar] [CrossRef]
- Kroc, J. Emergent Information Processing: Observations, Experiments, and Future Directions. Software 2024, 3, 81–106. [Google Scholar] [CrossRef]
- Kroc, J. Difference Between AI and Biological Intelligence Observed by Lenses of Emergent Information Processing. In Complex Systems With Artificial Intelligence; López-Ruiz, R., Ed.; IntechOpen: Rijeka, 2024; chapter 4. [CrossRef]
- Kroc, J. Exploring Emergence: Video-Database of Emergents Found in Advanced Cellular Automaton ’Game of Life’ Using GoL-N24 Software. https://www.researchgate.net/publication/373806519, 2023. Accessed on 09-10-2023.
- Kroc, J. Robust massive parallel information processing environments in biology and medicine: case study. Journal of Problems of Information Society 2022, 13, 12–22. [Google Scholar] [CrossRef]
- Dehmer, M.; Mowshowitz, A. A history of graph entropy measures. Information Sciences 2011, 181, 57–78. [Google Scholar] [CrossRef]
- Borda, M. Fundamentals in Information Theory and Coding, 1 ed.; Springer, Berlin, Heidelberg, 2011. [CrossRef]
- Arndt, C. Information Measures: Information and its Description in Science and Engineering; Springer, Berlin, Heidelberg, 2001. [CrossRef]
- Ben-Naim, A. Entropy Demystified: The Second Law Reduced To Plain Common Sense, revised ed.; World Scientific, 2008. [CrossRef]
- Ben-Naim, A. A Farewell to Entropy: Statistical Thermodynamics Based on Information; World Scientific, 2008. [CrossRef]
- Perkiömäki, J.S.; Mäkikallio, T.H.; Huikuri, H.V. Fractal and Complexity Measures of Heart Rate Variability. Clinical and Experimental Hypertension 2005, 27, 149–158. [Google Scholar] [CrossRef] [PubMed]
- Pincus, S.M.; Gladstone, I.M.; Ehrenkranz, R.A. A regularity statistic for medical data analysis. Journal of Clinical Monitoring 1991, 7, 335–45. [Google Scholar] [CrossRef]
- Rolnick, D.; Donti, P.L.; Kaack, L.H.; Kochanski, K.; Lacoste, A.; Sankaran, K.; Ross, A.S.; Milojevic-Dupont, N.; Jaques, N.; Waldman-Brown, A.; et al. Tackling Climate Change with Machine Learning. CoRR 2019, abs/1906.05433, [1906.05433].
- Ben-Naim, A. Information, Entropy, Life and the Universe: What We Know and What We Do Not Know; World Scientific, 2015; [https://www.worldscientific.com/doi/pdf/10.1142/9479]. [CrossRef]
- Borowska, M. Entropy-Based Algorithms in the Analysis of Biomedical Signals. Studies in Logic, Grammar and Rhetoric 2015, 43, 21–32. [Google Scholar] [CrossRef]
- Pincus, S.M. Approximate Entropy as a Measure of System Complexity. Proceedings of the National Academy of Sciences of the United States of America 1991, 88, 2297–2301. [Google Scholar] [CrossRef]
- Pincus, S.M. Approximate entropy (ApEn) as a complexity measure. Chaos: An Interdisciplinary Journal of Nonlinear Science 1995, 5, 110–117. [Google Scholar] [CrossRef]
- Riedl, M.; Müller, A.; Wessel, N. Practical considerations of permutation entropy. The European Physical Journal Special Topics 2013, 222, 249–262. [Google Scholar] [CrossRef]
- Costa, M.; Goldberger, A.L.; Peng, C.K. Multiscale entropy of biological signals. Physical review. E, Statistical, nonlinear, and soft matter physics 2005, 71, 021906. [Google Scholar] [CrossRef] [PubMed]
- Costa, M.; Goldberger, A.L.; Peng, C.K. Multiscale Entropy Analysis of Complex Physiologic Time Series. Physical Review Letters 2002, 89, 068102. [Google Scholar] [CrossRef] [PubMed]
- Gao, J.; Hu, J.; Tung, W.W. Entropy measures for biological signal analyses. Nonlinear Dynamics 2012, 68, 431–444. [Google Scholar] [CrossRef]
- Bari, V.; Girardengo, G.; Marchi, A.; De Maria, B.; Brink, P.A.; Crotti, L.; Schwartz, P.J.; Porta, A. A Refined Multiscale Self-Entropy Approach for the Assessment of Cardiac Control Complexity: Application to Long QT Syndrome Type 1 Patients. Entropy 2015, 17, 7768–7785. [Google Scholar] [CrossRef]
- Shannon, C.E.; Weaver, W. The Mathematical Theory of Communication; Urbana: University of Illinois Press, 1998. [Google Scholar]
- Shannon, C.E. A mathematical theory of communication. The Bell System Technical Journal 1948, 27, 379–423. [Google Scholar] [CrossRef]
- Guido, S.; Müller, A.C. Introduction to Machine Learning with Python; O’Reilly Media, 2016.
- Mitchell, T.M. Machine Learning; McGraw-Hill Series in Computer Science, McGraw-Hill Education, 1997.
- Hastie, T.; Tibshirani, R.; Friedman, J. The Elements Of Statistical Learning: Data Mining, Inference, and Prediction, 2 ed.; Springer Series in Statistics, Springer, New York, 2009.
- Shalev-Shwartz, S.; Ben-David, S. Understanding Machine Learning: From Theory to Algorithms; Cambridge University Press, 2014. [CrossRef]
- Hastie, T.; Tibshirani, R.; Friedman, J.; Franklin, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. The Mathematical Intelligencer 2004, 27, 83–85. [Google Scholar] [CrossRef]
- Jarkovská, D.; Nalos, L.; Štengl, M. ECG recordings of methoxamine and dofetilide induced Torsades de Pointes arrhythmias on rabbit model having fat-rich diet. Technical report, Charles University, Faculty of Medicine in Pilsen, Department of Physiology, 2013-2014.
- Yuan, Q.; Zhou, W.; Li, S.; Cai, D. Epileptic EEG classification based on extreme learning machine and nonlinear features. Epilepsy Research 2011, 96, 29–38. [Google Scholar] [CrossRef] [PubMed]
- Carricarte Naranjo, C.; Sanchez-Rodriguez, L.M.; Brown, M.M.; Estévez, B.M.; Machado, G.A. Permutation entropy analysis of heart rate variability for the assessment of cardiovascular autonomic neuropathy in type 1 diabetes mellitus. Computers in Biology and Medicine 2017, 86, 90–97. [Google Scholar] [CrossRef] [PubMed]
- Ravelo-Garcia, A.G.; Navarro-Mesa, J.L.; Casanova-Blancas, U.; Martin-González, S.; Quintana-Morales, P.; Guerra-Moreno, I.; Canino-Rodríguez, J.M.; Hernández-Pérez, E. Application of the Permutation Entropy over the Heart Rate Variability for the Improvement of Electrocardiogram-based Sleep Breathing Pause Detection. Entropy 2015, 17, 914–927. [Google Scholar] [CrossRef]
- Bandt, C.; Pompe, B. Permutation Entropy: A Natural Complexity Measure for Time Series. Physical Review Letters 2002, 88, 174102. [Google Scholar] [CrossRef]
- Zanin, M.; Zunino, L.; Rosso, O.A.; Papo, D. Permutation Entropy and Its Main Biomedical and Econophysics Applications: A Review. Entropy 2012, 14, 1553–1577. [Google Scholar] [CrossRef]
- Bian, C.; Qin, C.; Ma, Q.D.Y.; Shen, Q. Modified permutation-entropy analysis of heartbeat dynamics. Physical review. E, Statistical, nonlinear, and soft matter physics 2012, 85, 021906. [Google Scholar] [CrossRef]
- Malik, M.; Camm, A.J. Heart rate variability. Clinical Cardiology 1990, 13, 570–576, [https://onlinelibrary.wiley.com/doi/pdf/10.1002/clc.4960130811]. [Google Scholar] [CrossRef]
- Camm, A.J.; Malik, M.; Bigger, J.T.; Breithardt, G.; Cerutti, S.; Cohen, R.J.; Coumel, P.; Fallen, E.L.; Kennedy, H.L.; Kleiger, R.E.; et al. Heart Rate Variability: Standards of Measurement, Physiological Interpretation, and Clinical Use. Task Force of the European Society of Cardiology the North American Society of Pacing Electrophysiology. Circulation 1996, 93, 1043–1065. [Google Scholar] [CrossRef]
- Acharya, U.R.; Joseph, K.P.; Kannathal, N.; Min C., C.; Suri, J.S. Heart Rate Variability. In Advances in Cardiac Signal Processing; Acharya, U.R.; Suri, J.S.; Spaan, J.; S.M., K., Eds.; Springer, Berlin, Heidelberg, 2007; pp. 121–165. [CrossRef]
- Bravi, A.; Longtin, A.; Seely, A.J. Review and classification of variability analysis techniques with clinical applications. Biomedical Engineering OnLine 2011, 10, article. [Google Scholar] [CrossRef]
- Kroc, J.; Balihar, K.; Matějovič, M. Complex Systems and Their Use in Medicine: Concepts, Methods, and Bio-Medical applications. preprint 2019, pp. 1–21. [CrossRef]
- Zheng, R.; Yamabe, S.; Nakano, K.; Suda, Y. Biosignal Analysis to Assess Mental Stress in Automatic Driving of Trucks: Palmar Perspiration and Masseter Electromyography. Sensors (Basel) 2015, 15, 5136–5150. [Google Scholar] [CrossRef]
- Singh, R.R.; Conjeti, S.; Banerjee, R. Bio-signal based on-road stress monitoring for automotive drivers. In Proceedings of the 2012 National Conference on Communications, NCC 2012, 02 2012, pp. 1–5. [CrossRef]
- Singh, M.; Bin Queyam, A. Stress Detection in Automobile Drivers using Physiological Parameters: A Review. International Journal of Electronics Engineering 2013, 5, 1–5. [Google Scholar]
- Studer, L.; Paglino, V.; Gandini, P.; Stelitano, A.; Triboli, U.; Gallo, F.; Andreoni, G. Analysis of the Relationship between Road Accidents and Psychophysical State of Drivers through Wearable Devices. Applied Sciences 2018, 8, 1230. [Google Scholar] [CrossRef]
- Kroc, J. COMPLEX SYSTEMS AND THEIR USE IN BIOMEDICAL RESEARCH: MATHEMATICAL CONCEPTS OF SELF-ORGANIZATION AND EMERGENCE. International Congress of Cell Biology, Poster - P097, 2016. [CrossRef]
- Boltzmann, L. Vorlesungen über Gastheorie, vol. I., 1 ed.; J.A. Barth, Leipzig, 1896.
- Boltzmann, L. Vorlesungen über Gastheorie, vol. II., 1 ed.; J.A. Barth, Leipzig, 1898.
- Jaynes, E.T. Gibbs vs Boltzmann Entropies. American Journal of Physics 1965, 33, 391–398. [Google Scholar] [CrossRef]
- Poincaré, H. The Three-Body Problem and the Equations of Dynamics: Poincaré’s Foundational Work on Dynamical Systems Theory; Vol. 443, Astrophysics and Space Science Library, Springer, 2017; p. 248. [CrossRef]
- Arnold, V.I. Geometrical methods in the Theory of Ordinary Differential Equations, 2 ed.; Grundlehren Der Mathematischen Wissenschaften, Springer, New York, 2012; p. 351.
- Layek, G.C. An Introduction to Dynamical Systems and Chaos; Springer India, 2015; p. 622. [CrossRef]
- Mandelbrot, B.B. The fractal geometry of nature; W. H. Freeman & Co.: San Francisco, CA, 1982; p. 468. [Google Scholar]
- Urdan, T.C. Statistics in Plain English, 4 ed.; Routledge, 2015; p. 266.
- Kroc, J. Biological Applications of Emergent Information Processing: Fast, Long-Range Synchronization. preprint (2025), https://www.researchgate.net/profile/Jiri-Kroc/388328720.
- Russell, S.J.; Norvig, P. Artificial Intelligence: A Modern Approach; Prentice Hall series in artificial intelligence, Prentice Hall, 2010; p. 1132.
- Styer, D. Entropy as Disorder: History of a Misconception. The Physics Teacher 2019, 57, 454–458. [Google Scholar] [CrossRef]
- Clausius, R. Ueber verschiedene für die Anwendung bequeme Formen der Hauptgleichungen der mechanischen Wärmetheorie. Annalen der Physik 1865, 125, 353–400. [Google Scholar] [CrossRef]
- Clausius, R. The Mechanical Theory of Heat – with its Applications to the Steam Engine and to Physical Properties of Bodies. London: John van Voorst 1867. Retrieved 19 June 2012.
- Clausius, R. On a Mechanical Theorem Applicable to Heat. Philosophical Magazine 1870, 40, 122–27. [Google Scholar] [CrossRef]
- Boltzmann, L. Über die Mechanische Bedeutung des Zweiten Hauptsatzes der Wärmetheorie (vorgelegt in der Sitzung am 8 February 1866). k. k. Hof- und Staatsdruckerei 1866, pp. 1–26.
- Clayton. CO2. JAChS 1932.
- Bashkirov, A.G. Renyi entropy as a statistical entropy for complex systems. Theor. Math. Phys. 2006, 149, 1559–1573. [Google Scholar] [CrossRef]
- Haken, H. Information and Self-Organization: A Macroscopic Approach to Complex Systems (Springer Series in Synergetics); Springer-Verlag: Berlin, Heidelberg, 2006. [Google Scholar]
- Nicolis, J.S. Dynamics of Hierarchical Systems: An Evolutionary Approach; Springer-Verlag: Berlin, Heidelberg, 1986. [Google Scholar]
- Štengl, M. Private communications, Charles University, Faculty of Medicine in Pilsen, Department of Physiology, 2016-2017.
- Jarkovská, D. Private communications, Charles University, Faculty of Medicine in Pilsen, Department of Physiology, 2016-2017.
- Bobir, D. Identification of Pre-Arrhythmogenic Features in ECG and Other Data Using Machine Learning. Master’s thesis, Czech Technical University in Prague, Faculty of Information Technology, 2017.
- ECG Learning Center. ecg.utah.edu.
- Kroc, J. Open-source software predicting Torsade de Pointes arrhythmias, ventricular tachycardias, and ventricular fibrillations from ECG recordings: Part I: Permutation Entropy (Python and BASH). Software, GPLv3, Independent Research, Pilsen, The Czech Republic, 2025. [available at www.researchgate.net/publication/392937277].
- Kroc, J. Visualization of ECG recordings using Python program ECG-pyview. Software, GPLv3, Complex Systems Research, Pilsen, The Czech Republic, 2020. [available at www.researchgate.net/publication/350621866].
- Kroc, J. Mini-visualization program of ECG recordings: import text file and export segments of ECG. Software, GPLv3, Complex Systems Research, Pilsen, The Czech Republic, 2025. [available at www.researchgate.net/publication/394454735].
- Kumar, A. HYPOTHESIS TESTING IN MEDICAL RESEARCH: A KEY STATISTICAL APPLICATION. Journal of Universal College of Medical Sciences 2016, 3, 53–56. [Google Scholar] [CrossRef]
- Amrhein, V.; Greenland, S.; McShane, B. Scientists rise up against statistical significance. Nature 2019, 567, 305–307. [Google Scholar] [CrossRef]
- Ioannidis, J.P.A. Why Most Published Research Findings Are False. PLoS Medicine 2005, 2, e124. [Google Scholar] [CrossRef] [PubMed]
- Vollmer, S.; Mateen, B.A.; Bohner, G.; Király, F.J.; Ghani, R.; Jonsson, P.; Cumbers, S.; Jonas, A.; McAllister, K.S.L.; Myles, P.; et al. Machine learning and AI research for Patient Benefit: 20 Critical Questions on Transparency, Replicability, Ethics and Effectiveness. BMJ: British Medical Journal 2020, 368, l6927. [Google Scholar] [CrossRef]
- Chalmers, A.F. What Is This Thing Called Science?, 4 ed.; Hackett Publishing, 2013; p. 304.
- Barabási, A.L.; Oltvai, Z.N. Network biology: understanding the cell’s functional organization. Nature Reviews Genetics 2004, 5, 101–113. [Google Scholar] [CrossRef]
- Barabási, A.L.; Gulbahce, N.; Loscalzo, J. Network medicine: a network-based approach to human disease. Nature Reviews Genetics 2011, 12, 56–68. [Google Scholar] [CrossRef]
- Loscalzo, J.; Barabási, A.L.; Silverman, E.K. Network medicine: Complex Systems in Human Disease and Therapeutics; Harvard University Press: London, 2017; p. 448. [Google Scholar] [CrossRef]
- Lee, H.; Shin, S.Y.; Seo, M.; Nam, G.B.; Joo, S. Prediction of Ventricular Tachycardia One Hour before Occurrence Using Artificial Neural Networks. Scientific Reports 2016, 6, 32390. [Google Scholar] [CrossRef]
- Bobir, D. Software predicting Torsade de Pointes arrhythmias, ventricular tachycardias, and ventricular fibrillations from permutation entropy; Part II: Machine Learning (based on Python). Software, GPLv3, Czech Technical University, Faculty of Information Technologies, Prague, The Czech Republic, 2017. [available at www.researchgate.net/profile/Dmitryi_Bobir/394027065].
- Perez, M.V.; Mahaffey, K.W.; Hedlin, H.; Rumsfeld, J.S.; et al.; for the Apple Heart Study Investigators. Large-Scale Assessment of a Smartwatch to Identify Atrial Fibrillation. The New England Journal of Medicine 2019, 381, 1909–1917. [CrossRef]
- Lee, U.; Blain-Moraes, S.; Mashour, G.A. Assessing levels of consciousness with symbolic analysis. Philosophical Transactions of Royal Society A, Mathematical, physical, and engineering sciences 2015, 373, 20140117. [Google Scholar] [CrossRef]
- Singh, S.; Bansal, S.; Kumar, G.; Gupta, I.; Thakur, J.R. Entropy as an Indicator to Measure Depth of Anaesthesia for Laryngeal Mask Airway (LMA) Insertion during Sevoflurane and Propofol Anaesthesia. Journal of Clinical and Diagnostic Research 2017, 11, UC01–UC03. [Google Scholar] [CrossRef]
- Acharya, U.R.; Fujita, H.; Sudarshan, V.K.; Bhat, S.; Koh, J.E.W. Application of entropies for automated diagnosis of epilepsy using EEG signals: A review. Knowledge-Based Systems 2015, 88, 85–96. [Google Scholar] [CrossRef]
- Studholme, C.; Hill, D.L.G.; Hawkes, D.J. An overlap invariant entropy measure of 3D medical image alignment. Pattern Recognition 1999, 32, 71–86. [Google Scholar] [CrossRef]
- Zhou, X.; Ding, H.; Wu, W.; Zhang, Y. A Real-Time Atrial Fibrillation Detection Algorithm Based on the Instantaneous State of Heart Rate. PLoS ONE 2015, 10, e0136544. [Google Scholar] [CrossRef]
- Breiman, L. Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author). Statist. Sci. 2001, 16, 199–231. [Google Scholar] [CrossRef]
- Panik, M. Regression Modeling: Methods, Theory, and Computation with SAS; CRC Press, 2009.
- Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees; Wadsworth International Group: Belmont, CA, 1984. [Google Scholar]
- Breiman, L. Random Forests. Machine Learning 2001, 45, 5–32. [Google Scholar] [CrossRef]
- Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Annals of Statistics 2000, 29, 1189–1232. [Google Scholar] [CrossRef]
- Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning Representations by Back-propagating Errors. Nature 1986, 323, 533–536. [Google Scholar] [CrossRef]
- Cortes, C.; Vapnik, V. Support-Vector Networks. Machine Learning 1995, 20, 273–297. [Google Scholar] [CrossRef]
- Friedman, N.; Geiger, D.; Goldszmidt, M. Bayesian Network Classifiers. Machine Learning 1997, 29, 131–163. [Google Scholar] [CrossRef]
- Seeger, M. GAUSSIAN PROCESSES FOR MACHINE LEARNING. International Journal of Neural Systems 2004, 14, 69–106. [Google Scholar] [CrossRef] [PubMed]
- Hartigan, J.; Wong, M. Algorithm AS 136: A K-means clustering algorithm. Applied Statistics 1979, pp. 100–108. [CrossRef]
- Verhulst, P. Recherches mathématiques sur la loi d’accroissement de la population; Académie Royale, 1845.
- Cover, T.M.; Hart, P.E. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27. [Google Scholar] [CrossRef]
- Dietterich, T.G. Ensemble Methods in Machine Learning. In Proceedings of the Multiple Classifier Systems: First International Workshop, MCS 2000, Lecture Notes in Computer Science. Springer, Berlin, Heidelberg, 01 2000, Vol. 1857, pp. 1–15. [CrossRef]
- Sagi, O.; Rokach, L. Ensemble learning: A survey. WIREs Data Mining and Knowledge Discovery 2018, 8, e1249. [Google Scholar] [CrossRef]
- Ho, T. Random decision forests. In Proceedings of the 3rd International Conference on Document Analysis and Recognition, Los Alamitos, CA, USA, August 1995; Vol. 1, p. 278. [CrossRef]
- Polar developers. Polar H 10 Heart Rate Sensor. ECG recording, Polar Electro Oy, Kempele, Finland, 2018.
- Samsung developers. Samsung Galaxy Watch S4. Smartwatch, Samsung Electronics Co., Ltd., Seoul, South Korea, 2019.
- Periyaswamy, T.; Balasubramanian, M. Ambulatory cardiac bio-signals: From mirage to clinical reality through a decade of progress. International Journal of Medical Informatics 2019, 130, 103928. [Google Scholar] [CrossRef] [PubMed]
- Léger, L.; Gojanovic, B.; Sekarski, N.; Meijboom, E.J.; Mivelaz, Y. The Impending Dilemma of Electrocardiogram Screening in Athletic Children. Pediatric Cardiology 2016, 37, 1–13. [Google Scholar] [CrossRef] [PubMed]
- Giebel, G.D.; Gissel, C. Accuracy of mHealth Devices for Atrial Fibrillation Screening: Systematic Review. JMIR mHealth and uHealth 2019, 7, e13641. [Google Scholar] [CrossRef]

| Macrostate (Sum) | Dice AB Configurations | #Microstates |
|---|---|---|
| 2 | 11 | 1 |
| 3 | 12, 21 | 2 |
| 4 | 13, 22, 31 | 3 |
| 5 | 14, 23, 32, 41 | 4 |
| 6 | 15, 24, 33, 42, 51 | 5 |
| 7 | 16, 25, 34, 43, 52, 61 | 6 |
| 8 | 26, 35, 44, 53, 62 | 5 |
| 9 | 36, 45, 54, 63 | 4 |
| 10 | 46, 55, 64 | 3 |
| 11 | 56, 65 | 2 |
| 12 | 66 | 1 |
| Dice Sum | Dice ABC Configurations | #States |
|---|---|---|
| 3 | 111 | 1 |
| 4 | 112, 121, 211 | 3 |
| 5 | 113, 131, 311, 122, 212, 221 | 6 |
| 6 | 114, 141, 411, 123, 132, 213, 231, 312, 321, 222 | 10 |
| 7 | 115, 151, 511, 124, 142, 214, 241, 412, 421, 133, 313, 331, 223, 232, 322 | 15 |
| 8 | 116, 161, 611, 125, 152, 215, 251, 512, 521, 134, 143, 314, 341, 413, 431, 224, 242, 422, 233, 323, 332 | 21 |
| 9 | 126, 162, 216, 261, 612, 621, 135, 153, 315, 351, 513, 531, 144, 414, 441, 225, 252, 522, 234, 243, 324, 342, 423, 432, 333 | 25 |
| 10 | 136, 163, 316, 361, 613, 631, 145, 154, 415, 451, 514, 541, 226, 262, 622, 235, 253, 325, 352, 523, 532, 244, 424, 442, 334, 343, 433 | 27 |
| 11 | 146, 164, 416, 461, 614, 641, 155, 515, 551, 236, 263, 326, 362, 623, 632, 245, 254, 425, 452, 524, 542, 335, 353, 533, 344, 434, 443 | 27 |
| 12 | 156, 165, 516, 561, 615, 651, 246, 264, 426, 462, 624, 642, 255, 525, 552, 336, 363, 633, 345, 354, 435, 453, 534, 543, 444 | 25 |
| 13 | 166, 616, 661, 256, 265, 526, 562, 625, 652, 346, 364, 436, 463, 634, 643, 355, 535, 553, 445, 454, 544 | 21 |
| 14 | 266, 626, 662, 356, 365, 536, 563, 635, 653, 446, 464, 644, 455, 545, 554 | 15 |
| 15 | 366, 636, 663, 456, 465, 546, 564, 645, 654, 555 | 10 |
| 16 | 466, 646, 664, 556, 565, 655 | 6 |
| 17 | 566, 656, 665 | 3 |
| 18 | 666 | 1 |
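
The microstate counts in the two dice tables above can be checked by brute-force enumeration. The following Python sketch is only an illustration and is not part of the authors' software; the function name `microstate_counts` is hypothetical.

```python
from collections import Counter
from itertools import product


def microstate_counts(n_dice, faces=6):
    """Map each macrostate (sum of pips) to its number of microstates."""
    rolls = product(range(1, faces + 1), repeat=n_dice)
    return Counter(sum(roll) for roll in rolls)


# Print "number of dice, macrostate, number of microstates" for two and three dice.
for n_dice in (2, 3):
    counts = microstate_counts(n_dice)
    for total in sorted(counts):
        print(n_dice, total, counts[total])
```

Every ordered roll is one microstate, so the counts for n dice sum to 6^n (36 for two dice, 216 for three).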
| Used method | Reasons and Achieved Results (Done by) | Success |
|---|---|---|
| Study of complex-systems applications in medicine | This phase was intensive and extensive [63] (not reported). (J.K.) | . |
| Analysis of the usefulness of various entropies | Extensive study of existing entropy measures and their applications. The foundations of this research rely on a deep understanding of entropy measures; see [63] and Section 1. (J.K.) | . |
| Permutation entropy | This measure was identified as the most suitable for short runs of ECG recordings; see Section 6.2. It serves as the input to classification. (J.K.) | Yes |
| Heart rate variability | Studied in depth and found not useful for predicting arrhythmias (not reported). (J.K.) | No |
| Simple Statistics | Found insufficient for predicting arrhythmias; see Section 4. (J.K.) & (D.B.) | No |
| Advanced Statistics | Found insufficient for predicting arrhythmias; see Section 4. (D.B.) | No |
| Machine Learning Methods | Selected methods displayed relatively high sensitivity in the prediction of arrhythmias; see Section 4 and Appendix C. (D.B.) | Yes |
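
Because permutation entropy is the input to classification, a compact sketch of the ordinal-pattern definition of Bandt and Pompe may help the reader. The order of the ordinal patterns (`order=3`) and the normalization by log2(order!) are assumptions chosen for illustration. The `lag` argument is meant to play the role of the lag L in the appendix tables, which is also an assumption about the authors' implementation; the function name `permutation_entropy` is hypothetical.

```python
import math
from collections import Counter


def permutation_entropy(signal, order=3, lag=1, normalize=True):
    """Shannon entropy (base 2) of the ordinal patterns found in `signal`."""
    n_windows = len(signal) - (order - 1) * lag
    if n_windows <= 0:
        raise ValueError("signal too short for the given order and lag")
    patterns = Counter()
    for start in range(n_windows):
        window = [signal[start + j * lag] for j in range(order)]
        # Ordinal pattern: the positions that sort the window
        # (ties are broken by position because sorted() is stable).
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1
    entropy = -sum(
        (count / n_windows) * math.log2(count / n_windows)
        for count in patterns.values()
    )
    if normalize:
        entropy /= math.log2(math.factorial(order))
    return entropy


# A monotonic ramp contains a single ordinal pattern, so its entropy is zero.
print(permutation_entropy(list(range(50)), order=3, lag=5))
```

A strictly alternating sequence evaluated with `order=2` contains both ordinal patterns in equal proportion, so its normalized entropy reaches the maximum value of 1.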
| Type of Arrhythmia | Description |
|---|---|
| Torsades de Pointes (TdP) | A TdP arrhythmia is typically induced by drugs, both medical and recreational. Both types affect the ion channels that cardiomyocytes use to propagate the action potential. As a result, the propagation velocity of the action potential varies across the thickness of the heart wall, which triggers an arrhythmia with a meandering focal point; see Figure 7. |
| Ventricular Tachycardia (VT) | A VT typically occurs in structurally damaged hearts after cardiomyopathy, myocardial infarction, heart inflammation with remodeling of cardiomyocytes, or due to a genetic disease. All of these causes lead to permanent structural changes of the cardiomyocytes and/or the conduction system, which trigger arrhythmic events with a fixed focal point. The structural changes lead to a dispersion of action potential propagation speeds; see Figure 8. |
| Ventricular Fibrillation (VF) | VFs are similar to VTs, except that several focal points operate simultaneously and act as surrogate pacemakers; see Figure 9. |
| Premature Ventricular Contraction (PVC) | PVCs are heart contractions initiated randomly in the ventricles. When they occur in pairs, and even more so in triplets, they can trigger a run of TdP, VT, or VF; see Figure 10. |
| Features | Description |
|---|---|
| ISI_Top5-TM*** | top five important time moments discovered by Random Forest for each sub-interval |
| ISI-RM_Top5-TM** | top five important time moments discovered by Random Forest for each sub-interval with rolling mean |
| ISI | all values of each sub-interval |
| ISI-RM* | all values of each sub-interval with rolling mean |
| OC_SSF* | simple statistical features of each original curve |
| OC_ASF** | simple + advanced statistical features of each original curve |
| ISI_SSF* | simple statistical features of each sub-interval |
| ISI_ASF* | simple + advanced statistical features of each sub-interval |
| OC_Top5-ASF** | top five important statistical features (simple + advanced) discovered by Random Forest for the original curves |
| ISI_Top5-ASF*** | top five important statistical features (simple + advanced) discovered by Random Forest for each sub-interval |
| Top5-L-ISI | all values of each sub-interval for the top five important values of the L parameter discovered by Random Forest for that sub-interval |
| Top5-L-ISI_Top5-TM** | top five important time moments for the top five important values of the L parameter discovered by Random Forest for each sub-interval |
| Top5-L-OC_Top5-ASF** | top five important statistical features (simple + advanced) of the original curves for the selected values of the L parameter. These values are selected as the union of the top five important values of the L parameter discovered by Random Forest for each sub-interval. |
| MSI | merge of all values of the control and methoxamine intervals together for each curve |
| MSI_Top5-TM*** | merge of the top five important time moments of the control and methoxamine intervals together for each curve |
| MSI_Top5-TM & OC_ASF*** | top five important time moments of the control and methoxamine intervals merged together for each curve, then merged with all statistical features (simple + advanced) computed for the corresponding original (not dissected) curve (i.e., control_imp_time + methoxamine_imp_time + ostats) |
| ISI_Top5-TM & OC_ASF*** | top five important time moments of each sub-interval merged with all statistical features (simple + advanced) computed for the corresponding original curve (i.e., control_imp_time + ostats, methoxamine_imp_time + ostats) |
| MSI_Top5-TM & MSI_Top5-TM-ASF*** | top five important time moments of the control and methoxamine intervals merged together for each curve, then merged with all statistical features (simple + advanced) computed for these merged important time moments (i.e., control_imp_time + methoxamine_imp_time + cstats) |
| ISI_Top5-TM & ISI_ASF*** | top five important time moments of each sub-interval (control or methoxamine) merged with all statistical features computed for the corresponding sub-interval (i.e., control_imp_time + control_stats, methoxamine_imp_time + methoxamine_stats) |
| ISI_Top5-TM & ISI_Top5-ASF*** | top five important time moments of each sub-interval (control or methoxamine) merged with the important statistical features computed for the corresponding sub-interval (i.e., control_imp_time + control_imp_stats, methoxamine_imp_time + methoxamine_imp_stats) |
| ISI-DFT_Top5-C*** | The top five important DFT coefficients were evaluated for each sub-interval (15 real and 15 imaginary values are merged together and the top five of them are selected). Important coefficients were identified by Random Forest. |
| ISI-DFT_Top5-TM** | Top five important time moments (identified by Random Forest) of each sub-interval, where the curve given by the values in the corresponding sub-interval was reconstructed from the first ten DFT coefficients. |
| ISI-DWT_Top10-C*** | The ten most significant (in absolute value) DWT coefficients of each sub-interval. The coefficients were sorted in descending order. |
| ISI-DWT_Top5-TM*** | Top five important time moments (identified by Random Forest) of each sub-interval, where the curve given by the values in the corresponding sub-interval was reconstructed from the ten most significant (in absolute value) DWT coefficients. |
| MSI_Top5-TM-ASF | All statistical features (simple + advanced) computed for the merged top five important time moments of sub-intervals (tested only with MLP). |
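
The ISI-DWT_Top10-C feature described above (the ten most significant DWT coefficients by absolute value, sorted in descending order) can be sketched in plain Python. The wavelet family is not stated in this table, so the Haar wavelet used here is an assumption, as is the requirement that the input length be a power of two. The names `haar_dwt` and `top_k_by_magnitude` are hypothetical and do not come from the authors' software.

```python
import math


def haar_dwt(values):
    """Full multi-level orthonormal Haar DWT (assumed wavelet, see lead-in).

    Returns [approximation] + detail coefficients, coarsest level first.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(v) for v in values]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        details = detail + details  # keep coarser levels in front
    return approx + details


def top_k_by_magnitude(coeffs, k=10):
    """Return the k coefficients with the largest absolute value, largest first."""
    return sorted(coeffs, key=abs, reverse=True)[:k]


# Toy sub-interval, invented for illustration only (not data from this study).
sub_interval = [4, 2, 5, 5, 1, 0, 3, 7, 2, 2, 6, 1, 0, 4, 3, 3]
print(top_k_by_magnitude(haar_dwt(sub_interval), k=10))
```

Because the orthonormal Haar transform preserves the sum of squares of the input, keeping the largest coefficients retains most of the energy of the sub-interval in a fixed-length feature vector.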
| Features | Used algorithms | Used L (control) | Used L (methoxamine) |
|---|---|---|---|
| ISI_Top5-TM | SVM | 1, 100, 30, 300, 40, 500, 60 | 10, 100, 20, 200, 30, 300, 40, 5, 50, 500, 60, 70, 90 |
| | RF | 300, 400, 500, 90 | 100, 5, 500 |
| | k-NN | 100, 30, 500, 60 | 10, 100, 20, 300, 40, 400, 5, 50, 500, 90 |
| | LR | 100 | 10, 200, 300, 5, 500, 90 |
| OC_Top5-ASF | SVM | 10, 20, 300, 40, 5, 50, 500, 90 | |
| | SVM, RF, k-NN, LR | RF: 40; LR: 10, 40; k-NN: 10, 300, 5, 50, 90; SVM: 10, 20, 300, 40, 5, 50, 500, 90 | |
| OC_ASF | SVM | 10, 40, 50 | |
| ISI_Top5-ASF | SVM | 20, 300 | 40, 60 |
| ISI_Top5-TM & ISI_Top5-ASF | SVM, RF, k-NN, LR | RF: 300; LR: 300, 60, 80; k-NN: 1, 300; SVM: 20, 30, 300, 500, 60, 80 | RF: 100; LR: 200, 300, 400, 500; k-NN: 10, 20, 500, 60, 80; SVM: 10, 200, 30, 300, 400, 500, 60, 70, 80, 90 |
| ISI-DFT_Top5-C | SVM | 40, 400, 5, 90 | 1, 20, 200, 30, 80 |
| ISI-DWT_Top10-C | SVM | - | 10, 40, 60 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
