Submitted: 23 October 2024
Posted: 24 October 2024
Abstract
Keywords:
MSC: 60A05, 60G55
1. Introduction
1.1. Organization of the Paper
1.2. Terminology
2. Probability Theory Since Kolmogorov
2.1. Kolmogorov’s Contribution
2.2. Probabilities or Expectations?
2.3. Probability Theory and Category Theory
- For any fixed the mapping is measurable.
- For every fixed , the mapping is a measure on .
2.4. Preliminaries on Point Processes
- For all the measure is locally finite.
- For all bounded sets the random variable is a count variable.
2.5. Poisson Distributions and Poisson Point Processes
- For all the random variable is Poisson distributed with mean value .
- If are disjoint, then the random variables and are independent.
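The two defining properties above can be checked numerically. The following is a minimal sketch, assuming a homogeneous Poisson point process on the unit interval with a hypothetical intensity of 6; the helper names (`sample_poisson`, `sample_poisson_process`, `count_in`) are illustrative and not taken from the paper. It counts points in two disjoint halves and estimates the mean, variance, and covariance of the counts.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw one Poisson(mean) variate with Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def sample_poisson_process(intensity, rng):
    """Sample a homogeneous Poisson point process on [0, 1).

    The total number of points is Poisson(intensity); given that number,
    the points are placed independently and uniformly.
    """
    n = sample_poisson(intensity, rng)
    return [rng.random() for _ in range(n)]

def count_in(points, lo, hi):
    """Number of points that fall in the half-open interval [lo, hi)."""
    return sum(1 for x in points if lo <= x < hi)

rng = random.Random(0)   # fixed seed so the run is reproducible
runs = 20000
intensity = 6.0          # assumed value, chosen only for illustration

counts_a, counts_b = [], []
for _ in range(runs):
    pts = sample_poisson_process(intensity, rng)
    counts_a.append(count_in(pts, 0.0, 0.5))   # set A = [0, 0.5)
    counts_b.append(count_in(pts, 0.5, 1.0))   # set B = [0.5, 1), disjoint from A

mean_a = sum(counts_a) / runs
mean_b = sum(counts_b) / runs
var_a = sum((a - mean_a) ** 2 for a in counts_a) / runs
cov_ab = sum(a * b for a, b in zip(counts_a, counts_b)) / runs - mean_a * mean_b

print(f"mean on A: {mean_a:.3f}, variance on A: {var_a:.3f}")
print(f"covariance between counts on A and B: {cov_ab:.3f}")
```

For a Poisson count the mean and the variance agree, so both estimates on A should be close to 3, and the empirical covariance between the two disjoint sets should be close to 0. Zero covariance is only a necessary consequence of independence, so this is a sanity check rather than a proof.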
2.6. Valuations
- Strictness
- .
- Monotonicity
- For all subsets , implies .
- Modularity
- For all subsets ,
- Continuity
- for any directed net .
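As a concrete illustration of the axioms listed above, the sketch below checks strictness, monotonicity, and modularity for a weighted counting valuation on a four-element set with the discrete topology. The weights are made up, and the modularity identity is assumed to take its standard form v(U) + v(V) = v(U ∪ V) + v(U ∩ V), since the formula itself is not legible in this section. Continuity is automatic on a finite space and is not tested.

```python
from itertools import combinations

# Hypothetical point weights; any non-negative numbers would do.
weights = {"a": 2, "b": 0, "c": 5, "d": 1}

def valuation(subset):
    """Value of a subset: the sum of the weights of its points."""
    return sum(weights[x] for x in subset)

def all_subsets(elements):
    """All subsets of a finite set (the open sets of the discrete topology)."""
    elements = list(elements)
    return [frozenset(c)
            for r in range(len(elements) + 1)
            for c in combinations(elements, r)]

subsets = all_subsets(weights)

# Strictness: the empty set has value zero.
strict = valuation(frozenset()) == 0

# Monotonicity: U contained in V implies v(U) <= v(V).
monotone = all(valuation(u) <= valuation(v)
               for u in subsets for v in subsets if u <= v)

# Modularity (assumed standard form): v(U) + v(V) = v(U | V) + v(U & V).
modular = all(valuation(u) + valuation(v) == valuation(u | v) + valuation(u & v)
              for u in subsets for v in subsets)

print(strict, monotone, modular)
```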
3. Observations
3.1. Observations as Multiset Classifications
| Key | Animal |
| 1 | cow |
| 2 | horse |
| 3 | bee |
| 4 | sheep |
| 5 | fly |
| Key | Animal |
| 1 | mammal |
| 2 | mammal |
| 3 | insect |
| 4 | mammal |
| 5 | insect |
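The two tables above can be read as a keyed data set and a coarser classification of the same keys. A minimal sketch of that reading follows, assuming the second table is obtained by composing the first with an animal-to-class map; the dictionary names are illustrative.

```python
# First table: each key is classified by animal.
animal_by_key = {1: "cow", 2: "horse", 3: "bee", 4: "sheep", 5: "fly"}

# Assumed coarsening map, read off by comparing the two tables row by row.
class_by_animal = {
    "cow": "mammal",
    "horse": "mammal",
    "bee": "insect",
    "sheep": "mammal",
    "fly": "insect",
}

# Second table: the composition key -> animal -> class.
class_by_key = {key: class_by_animal[animal]
                for key, animal in animal_by_key.items()}

for key in sorted(class_by_key):
    print(key, class_by_key[key])
```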
3.2. Observations as Empirical Measures
| Classification | Frequency |
| insect | 2 |
| mammal | 3 |
- Addition.
- Restriction.
- Inducing.
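The frequency table and the three operations named above can be sketched with a multiset of counts. The descriptions of the operations are not legible in this section, so the readings used here are assumptions: addition is pointwise addition of counts, restriction keeps only the mass on a chosen subset, and inducing pushes the measure forward along a map. The function names are hypothetical.

```python
from collections import Counter

# Classifications of the five observations from the keyed table.
observations = ["mammal", "mammal", "insect", "mammal", "insect"]

# Empirical measure: the number of observations in each class.
empirical = Counter(observations)

def add(mu, nu):
    """Addition (assumed reading): pointwise sum of two count measures."""
    return mu + nu

def restrict(mu, subset):
    """Restriction (assumed reading): keep only the mass inside `subset`."""
    return Counter({k: v for k, v in mu.items() if k in subset})

def induce(mu, f):
    """Inducing (assumed reading): push the measure forward along the map f."""
    image = Counter()
    for k, v in mu.items():
        image[f(k)] += v
    return image

print(dict(empirical))
print(dict(add(empirical, Counter({"insect": 1}))))
print(dict(restrict(empirical, {"insect"})))
print(dict(induce(empirical, lambda c: "animal")))
```

Inducing along the constant map collapses all classes into one and leaves only the total count, which is 5 here.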
3.3. Categorical Properties of the Empirical Measures and Some Generalizations
3.4. Lossless Compression of Data
3.5. Lossy Compression of Data
4. Expectations
4.1. Simple Expectation Measures
- There are many ways of writing t as a product where and n is an integer.
- There are many different sampling schemes that will lead to a multiplication by .
- There are many ways of generating the randomness that is needed to perform the sampling.
4.2. Categorical Properties of the Expectation Measures and Some Generalizations
4.3. The Poisson Interpretation
- is Poisson distributed for any open set B.
- For any open sets the random variable is independent of the random variable given the random variable if and only if .
4.4. Normalization, Conditioning and Other Operations on Expectation Measures
4.5. Independence
4.6. Information Divergence for Expectation Measures
- with equality when
- is minimal when
- for all
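The formulas in the list above did not survive in this section, so the following sketch rests on an assumption: it uses the standard extension of information divergence to finite measures on a discrete set, D(μ‖ν) = Σ [μ(x) ln(μ(x)/ν(x)) − μ(x) + ν(x)], which reduces to KL-divergence when both measures have total mass 1. The function name and the example measures are hypothetical.

```python
import math

def information_divergence(mu, nu):
    """Information divergence between two finite discrete measures.

    Assumed form: sum over x of mu(x) * ln(mu(x) / nu(x)) - mu(x) + nu(x),
    with the convention 0 * ln(0) = 0. Measures are dicts mapping
    points to non-negative masses.
    """
    total = 0.0
    for x in set(mu) | set(nu):
        m = mu.get(x, 0.0)
        n = nu.get(x, 0.0)
        if m > 0 and n == 0:
            return math.inf          # mu is not absolutely continuous w.r.t. nu
        if m > 0:
            total += m * math.log(m / n)
        total += n - m
    return total

# Two measures that are not normalized (total masses 5 and 6).
mu = {"insect": 2.0, "mammal": 3.0}
nu = {"insect": 1.0, "mammal": 5.0}
print(information_divergence(mu, nu))   # strictly positive
print(information_divergence(mu, mu))   # zero when the measures coincide

# For probability vectors the extra terms cancel and KL-divergence remains.
p = {"a": 0.5, "b": 0.5}
q = {"a": 0.25, "b": 0.75}
kl = 0.5 * math.log(0.5 / 0.25) + 0.5 * math.log(0.5 / 0.75)
print(abs(information_divergence(p, q) - kl) < 1e-12)
```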
5. Applications
5.1. Goodness-of-Fit Tests
5.2. Improper Prior Distributions
5.3. Markov Chains
5.4. Inequalities for Information Projections
6. Discussion and Conclusions
| Probability theory | Expectation theory |
| Probability | Expected value |
| Outcome | Instance |
| Outcome space | Multiset monad |
| P-value | E-value |
| Probability measure | Expectation measure |
| Binomial distribution | Poisson distribution |
| Density | Intensity |
| Bernoulli random variable | Count variable |
| Empirical distribution | Empirical measure |
| KL-divergence | Information divergence |
| Uniform distribution | Poisson point process |
| State space | State cone |
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
| bin | Binomial distribution |
| DCC | Descending chain condition |
| E-statistic | Evidence statistic |
| E-value | Observed value of an E-statistic |
| hyp | Hypergeometric distribution |
| IID | Independent identically distributed |
| KL-divergence | Information divergence restricted to probability measures |
| MDL | Minimum description length |
| Mset | Multiset |
| N | Gaussian distribution |
| PM | Probability measure |
| Po | Poisson distribution |
| mset | Multiset |
| Poset | Partially ordered set |
| Pr | Probability |
References
- Kolmogorov, A.N. Grundbegriffe der Wahrscheinlichkeitsrechnung; Springer: Berlin, 1933.
- Lardy, T.; Grünwald, P.; Harremoës, P. Reverse Information Projections and Optimal E-statistics. IEEE Transactions on Information Theory 2024.
- Perrone, P. Categorical Probability and Stochastic Dominance in Metric Spaces. PhD thesis, Max Planck Institute for Mathematics in the Sciences, 2018.
- nLab authors. monads of probability, measures, and valuations. Revision 45. 2024. Available online: https://ncatlab.org/nlab/show/monads+of+probability%2C+measures%2C+and+valuations.
- Shiryaev, A.N. Probability; Springer: New York, 1996.
- Whittle, P. Probability via Expectation, 3rd ed.; Springer Texts in Statistics; Springer Verlag: New York, 1992.
- Kallenberg, O. Random Measures; Springer: Switzerland, 2017.
- Lawvere. The Category of Probabilistic Mappings. Lecture notes.
- Ścibior, A.; Ghahramani, Z.; Gordon, A.D. Practical probabilistic programming with monads. In Proceedings of the 2015 ACM SIGPLAN Symposium on Haskell; Association for Computing Machinery: New York, NY, USA, 2015; Haskell ’15; pp. 165–176.
- Giry, M. A categorical approach to probability theory. In Categorical Aspects of Topology and Analysis; Banaschewski, B., Ed.; Springer Berlin Heidelberg: Berlin, Heidelberg, 1982; pp. 68–85.
- Lieshout, M.V. Spatial Point Process Theory. In Handbook of Spatial Statistics; Handbooks of Modern Statistical Methods; Chapman and Hall, 2010; chapter 16.
- Dash, S.; Staton, S. A Monad for Probabilistic Point Processes; ACT, 2021.
- Jacobs, B. From Multisets over Distributions to Distributions over Multisets. In Proceedings of the 36th Annual ACM/IEEE Symposium on Logic in Computer Science; Association for Computing Machinery: New York, NY, USA, 2021.
- Rényi, A. A characterization of Poisson processes. Magyar Tud. Akad. Mat. Kutaló Int. Közl. 1956, 1, 519–527.
- Kallenberg, O. Limits of Compound and Thinned Point Processes. Journal of Applied Probability 12, 269–278.
- nLab authors. valuation (measure theory). 2024. Available online: https://ncatlab.org/nlab/show/valuation+%28measure+theory%29.
- Heckmann, R. Spaces of valuations. In Papers on General Topology and Applications; New York Academy of Sciences, 1996.
- Blizard, W.D. The development of multiset theory. Modern Logic 1991, 1, 319–352.
- Monro, G.P. The Concept of Multiset. Mathematical Logic Quarterly 1987, 33, 171–178.
- Isah, A.; Teella, Y. The Concept of Multiset Category. British Journal of Mathematics and Computer Science 2015, 9, 427–437.
- Grätzer, G. Lattice Theory; Dover, 1971.
- Wille, R. Formal Concept Analysis as Mathematical Theory. In Formal Concept Analysis; Ganter, B., Stumme, G., Wille, R., Eds.; Number 3626 in Lecture Notes in Artificial Intelligence; Springer, 2005; pp. 1–33.
- Topsøe, F. Compactness in Space of Measures. Studia Mathematica 1970, 36, 195–212.
- Alvarez-Manilla, M. Extension of valuations on locally compact sober spaces. Topology and its Applications 2002, 124, 397–433.
- Harremoës, P. Extendable MDL. In Proceedings ISIT 2013; IEEE Information Theory Society, 2013; pp. 1516–1520.
- Cover, T.M.; Thomas, J.A. Elements of Information Theory; Wiley, 1991.
- Csiszár, I. The Method of Types. IEEE Trans. Inform. Theory 1998, 44, 2505–2523.
- Harremoës, P. Testing Goodness-of-Fit via Rate Distortion. In Information Theory Workshop, Volos, Greece, 2009; IEEE Information Theory Society, 2009; pp. 17–21.
- Harremoës, P. The Rate Distortion Test of Normality. In Proceedings ISIT 2019; IEEE Information Theory Society, 2019; pp. 241–245.
- Harremoës, P. Rate Distortion Theory for Descriptive Statistics. Entropy 25, 456.
- Rényi, A. On an Extremal Property of the Poisson Process. Annals Inst. Stat. Math. 1964, 16, 129–133.
- McFadden, J.A. The Entropy of a Point Process. J. Soc. Indust. Appl. Math. 1965, 13, 988–994.
- Harremoës, P. Binomial and Poisson Distributions as Maximum Entropy Distributions. IEEE Trans. Inform. Theory 2001, 47, 2039–2041.
- Harremoës, P.; Johnson, O.; Kontoyiannis, I. Thinning and Information Projection. In 2008 IEEE International Symposium on Information Theory; IEEE, 2008; pp. 2644–2648.
- Harremoës, P.; Johnson, O.; Kontoyiannis, I. Thinning, Entropy and the Law of Thin Numbers. IEEE Trans. Inform. Theory 2010, 56, 4228–4244.
- Hillion, E.; Johnson, O. A proof of the Shepp-Olkin entropy concavity conjecture. Bernoulli 2017, 23, 3638–3649. arXiv:1503.01570.
- Dawid, A.P. Separoids: A mathematical framework for conditional independence and irrelevance. Ann. Math. Artif. Intell. 2001, 32, 335–372.
- Harremoës, P. Entropy inequalities for Lattices. Entropy 2018, 20, 748.
- Harremoës, P. Divergence and Sufficiency for Convex Optimization. Entropy 2017, 19, 206. arXiv:1701.01010.
- Csiszár, I. I-Divergence Geometry of Probability Distributions and Minimization Problems. Ann. Probab. 1975, 3, 146–158.
- Pfaffelhuber, E. Minimax Information Gain and Minimum Discrimination Principle. In Topics in Information Theory; Csiszár, I., Elias, P., Eds.; Colloquia Mathematica Societatis János Bolyai, Vol. 16; János Bolyai Mathematical Society and North-Holland, 1977; pp. 493–519.
- Topsøe, F. Information Theoretical Optimization Techniques. Kybernetika 1979, 15, 8–27.
- Csiszár, I.; Tusnády, G. Information geometry and alternating minimization procedures. Statistics and Decisions 1984, (Supplementary Issue 1), 205–237.
- Li, J.Q. Estimation of Mixture Models. Ph.D. dissertation, Department of Statistics, Yale University, 1999.
- Li, J.Q.; Barron, A.R. Mixture Density Estimation. In Proceedings Conference on Neural Information Processing Systems: Natural and Synthetic, 1999.
- Harremoës, P. Bounds on tail probabilities for negative binomial distributions. Kybernetika 2016, 52, 943–966.
- Harremoës, P.; Tusnády, G. Information Divergence is more χ2-distributed than the χ2-statistic. In 2012 IEEE International Symposium on Information Theory; IEEE, 2012; pp. 538–543.
- Györfi, L.; Harremoës, P.; Tusnády, T. Some Refinements of Large Deviation Tail Probabilities. arXiv:1205.1005.
- Kass, R.E.; Wasserman, L.A. The Selection of Prior Distributions by Formal Rules. Journal of the American Statistical Association 1996, 91, 1343–1370.
- Grünwald, P. The Minimum Description Length principle; MIT Press, 2007.
- Harremoës, P. Entropy on Spin Factors. In Information Geometry and Its Applications; Ay, N., Gibilisco, P., Matúš, F., Eds.; Springer Proceedings in Mathematics & Statistics, Vol. 252; Springer, 2018; pp. 247–278. arXiv:1707.03222.
- Harremoës, P.; Matúš, F. Bounds on the Information Divergence for Hypergeometric Distributions. Kybernetika 2020, 56, 1111–1132.
- Harremoës, P.; Johnson, O.; Kontoyiannis, I. Thinning and Information Projections. arXiv:1601.04255.
- Harremoës, P. Convergence to the Poisson Distribution in Information Divergence. Preprint 2, Mathematical department, University of Copenhagen, 2003.
- Harremoës, P.; Ruzankin, P. Rate of Convergence to Poisson Law in Terms of Information Divergence. IEEE Trans. Inform. Theory 2004, 50, 2145–2149.
- Kontoyiannis, I.; Harremoës, P.; Johnson, O. Entropy and the Law of Small Numbers. IEEE Trans. Inform. Theory 2005, 51, 466–472.
- Harremoës, P. Lower Bounds for Divergence in the Central Limit Theorem. In General Theory of Information Transfer and Combinatorics; Springer Berlin Heidelberg: Berlin, Heidelberg, 2006; pp. 578–594.
- Harremoës, P. Lower bound on rate of convergence in information theoretic Central Limit Theorem. In Book of Abstracts for the Seventh International Symposium on Orthogonal Polynomials, Special Functions and Applications, 2003; pp. 53–54.
- Harremoës, P. Lower Bounds on Divergence in Central Limit Theorem. Electronic Notes in Discrete Mathematics 2005, 21, 309–313.
- Harremoës, P. Maximum Entropy on Compact groups. Entropy 2009, 11, 222–237.






Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).