Preprint
Article

This version is not peer-reviewed.

A Decimal–Hexadecimal Block Encoding for Prime Storage with Modular Compression via the Ternary Constraint

Submitted:

09 April 2026

Posted:

09 April 2026


Abstract
We describe a decimal–hexadecimal block encoding for primality over a finite stored range. Since every prime greater than 5 lies in one of the residue classes 1, 3, 7, 9 (mod 10), each decimal block of size ten can be encoded by a 4-bit word indicating which of the candidates 10k + 1, 10k + 3, 10k + 7, and 10k + 9 are prime. This yields a nibble-based storage scheme supporting exact primality queries and exact recovery of the prime-counting function π(x) by cumulative popcount. We then establish a structural theorem arising from the congruence 10 ≡ 1 (mod 3): for k ≡ 0 (mod 3) the candidates 10k + 3 and 10k + 9 are always composite, and for k ≡ 2 (mod 3) the candidates 10k + 1 and 10k + 7 are always composite. This partitions the nibble positions into three classes with effective alphabets of sizes 4, 16, and 4. A fixed 2/4/2-bit code then gives a lossless compression of 33.3% over the original encoding with O(1) decoding, and the measured entropy of about 2.42 bits per nibble (against 4 stored bits) bounds the gain of this model at 39.4%. We present data structures, Python routines, and experimental validation up to 300,000.

1. Introduction

This note proposes a compact storage representation for primality based on decimal blocks. The starting point is elementary: every prime $p > 5$ satisfies
$$p \equiv 1, 3, 7, 9 \pmod{10}.$$
Therefore, for each decimal block
$$D_k = \{10k,\ 10k+1,\ \ldots,\ 10k+9\}, \qquad k \ge 1,$$
only the four candidates $10k+1$, $10k+3$, $10k+7$, $10k+9$ need to be stored.
The goal is not to introduce a new primality test or a sieve superior to classical methods. Rather, the purpose is to describe a compact block encoding of primality together with direct query algorithms. The resulting structure supports exact recovery, within the stored range, of the primality indicator, the prime-counting function $\pi(x)$, and the index of a stored prime $p$ via $\pi(p)$.
In Section 2 we define the encoding. Section 3 describes the database structure. Section 4 gives the core Python algorithms. Section 5 establishes the ternary constraint theorem and derives the compressed encoding. Section 6 reports experimental validation. Section 7 compares with traditional representations.

2. Decimal–Hexadecimal Encoding

For each decade $D_k$, define four indicator bits
$$b_1(k),\ b_3(k),\ b_7(k),\ b_9(k) \in \{0, 1\},$$
where
$$b_r(k) = \begin{cases} 1, & \text{if } 10k + r \text{ is prime}, \\ 0, & \text{otherwise}, \end{cases} \qquad r \in \{1, 3, 7, 9\}.$$
We encode the block by the nibble
$$H(k) = 8\,b_1(k) + 4\,b_3(k) + 2\,b_7(k) + b_9(k),$$
so the bit order is $[1, 3, 7, 9] \to [8, 4, 2, 1]$. Thus $H(k) \in \{0, 1, \ldots, 15\}$ and can be represented by a single hexadecimal digit.

Examples

10–19: 11, 13, 17, 19 all prime $\to 1111_2 = \mathrm{F}$.
20–29: 23, 29 prime $\to 0101_2 = 5$.
30–39: 31, 37 prime $\to 1010_2 = \mathrm{A}$.
40–49: 41, 43, 47 prime $\to 1110_2 = \mathrm{E}$.
Hence the sequence begins $\mathrm{F}, 5, \mathrm{A}, \mathrm{E}, 5, \mathrm{A}, \mathrm{D}, 5, 2, \mathrm{F}, \ldots$
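The definition of $H(k)$ can be checked with a few lines of Python. This is a minimal sketch rather than the paper's own listing: the helper names `is_prime` and `nibble` are illustrative assumptions, and trial division is used only because the inputs are tiny.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for small illustrative inputs only."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def nibble(k: int) -> int:
    """H(k) = 8*b1 + 4*b3 + 2*b7 + b9 for the decade 10k .. 10k+9."""
    h = 0
    for r, weight in ((1, 8), (3, 4), (7, 2), (9, 1)):
        if is_prime(10 * k + r):
            h |= weight
    return h


# Hexadecimal digits of H(1) .. H(10), i.e. the decades 10-19 up to 100-109.
first_ten = "".join(format(nibble(k), "X") for k in range(1, 11))
print(first_ten)  # F5AE5AD52F
```

The printed string reproduces the ten nibbles listed in the examples above.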

3. Database Structure

The stored data consists of two components: (1) the base primes $2, 3, 5, 7$, stored explicitly; (2) the sequence of nibbles $H(1), H(2), \ldots, H(K)$ for the desired range.
For compact storage, two consecutive nibbles are packed into one byte: high nibble = odd-indexed decade, low nibble = even-indexed decade. Hence two decades occupy one byte.
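The packing rule can be sketched as follows. The function names `pack_nibbles` and `get_nibble` are hypothetical, and padding an odd-length sequence with a zero low nibble is an assumption, since the text does not say how a trailing half-byte is handled.

```python
def pack_nibbles(nibbles):
    """Pack H(1), H(2), ... two per byte; odd-indexed decades go in the high nibble."""
    out = bytearray()
    for i in range(0, len(nibbles), 2):
        high = nibbles[i]
        # Assumption: pad with 0 when the number of decades is odd.
        low = nibbles[i + 1] if i + 1 < len(nibbles) else 0
        out.append((high << 4) | low)
    return bytes(out)


def get_nibble(packed, k):
    """Return H(k) for k >= 1 from the packed byte string."""
    byte = packed[(k - 1) // 2]
    return byte >> 4 if k % 2 == 1 else byte & 0x0F


packed = pack_nibbles([0xF, 0x5, 0xA, 0xE, 0x5])  # H(1) .. H(5)
print(packed.hex())  # f5ae50
```

Reading decade $k$ touches exactly one byte, at index $\lfloor (k-1)/2 \rfloor$, so random access stays constant-time.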
Proposition 1.
Given the explicitly stored primes $2, 3, 5, 7$ and the sequence of nibbles $H(k)$, $1 \le k \le K$, the primality indicator is exactly recoverable for every integer up to $10K + 9$.
Proof. 
Every integer $n \ge 10$ in the stored range belongs to a unique decade $D_k$ with $k = \lfloor n/10 \rfloor \ge 1$. If $n$ lies outside the residue classes $1, 3, 7, 9 \pmod{10}$, it is divisible by 2 or 5 and hence composite. Otherwise it occupies exactly one of the four candidate positions, and the corresponding bit of $H(k)$ records precisely whether it is prime. For $n < 10$, the explicit list $2, 3, 5, 7$ decides primality.    □

4. Core Algorithms in Python

4.1. Construction of the Database

[Code listing: database construction routine; not recoverable from the source images.]
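Since the original listing is not available here, the following is only a sketch of how such a database could be built, assuming a plain sieve of Eratosthenes and the packing of Section 3. The name `build_database` and the zero padding of a trailing half-byte are assumptions, not the author's code.

```python
def build_database(limit: int) -> bytes:
    """Return packed nibbles H(1) .. H(K) with K = limit // 10 (hypothetical helper)."""
    K = limit // 10
    top = 10 * K + 9                      # last candidate of the last decade
    sieve = bytearray([1]) * (top + 1)    # sieve[n] == 1 means "n is prime"
    sieve[0:2] = b"\x00\x00"
    p = 2
    while p * p <= top:
        if sieve[p]:
            # Clear all multiples of p starting at p*p.
            sieve[p * p :: p] = bytes(len(range(p * p, top + 1, p)))
        p += 1
    nibbles = []
    for k in range(1, K + 1):
        base = 10 * k
        nibbles.append(8 * sieve[base + 1] + 4 * sieve[base + 3]
                       + 2 * sieve[base + 7] + sieve[base + 9])
    if len(nibbles) % 2:                  # assumption: pad an odd count with 0
        nibbles.append(0)
    return bytes((nibbles[i] << 4) | nibbles[i + 1]
                 for i in range(0, len(nibbles), 2))


print(build_database(100).hex())  # f5ae5ad52f
```

With `limit = 300_000` this produces 30,000 nibbles in 15,000 bytes, the 4-bits-per-decade figure used in the size comparison.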

4.2. Direct Primality Query

[Code listing: direct primality query; not recoverable from the source images.]
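A direct query can be sketched as below, under the same packing assumptions as Section 3. The function `is_prime_db` and the constants are illustrative names, and the sample database holds only the decades 1 to 10.

```python
BASE_PRIMES = (2, 3, 5, 7)
BIT = {1: 8, 3: 4, 7: 2, 9: 1}   # residue mod 10 -> weight inside the nibble


def is_prime_db(n: int, db: bytes) -> bool:
    """Primality of n from packed nibbles (high nibble = odd-indexed decade)."""
    if n < 10:
        return n in BASE_PRIMES
    r = n % 10
    if r not in BIT:              # even, or a multiple of 5
        return False
    k = n // 10
    index = (k - 1) // 2
    if index >= len(db):
        raise ValueError("n lies outside the stored range")
    byte = db[index]
    nib = byte >> 4 if k % 2 == 1 else byte & 0x0F
    return bool(nib & BIT[r])


db = bytes.fromhex("f5ae5ad52f")  # H(1) .. H(10), covering 10 .. 109
print(is_prime_db(97, db), is_prime_db(91, db))  # True False
```

Each query performs one byte read and one mask test, regardless of the size of $n$.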

4.3. Recovery of the Prime-Counting Function

[Code listing: recovery of the prime-counting function; not recoverable from the source images.]
Proposition 2.
Within the stored range, $\pi(x)$ is exactly recoverable from the database by cumulative popcount together with the partial contribution of the block containing $x$.
Proof. 
Each encoded decade contributes exactly $\mathrm{pc}(H(k))$ primes among the four admissible candidates, where $\mathrm{pc}$ denotes the number of set bits. Summing these popcounts over all complete blocks before the block containing $x$, and adding the explicit contribution of $2, 3, 5, 7$ together with the partial count inside the current block, yields $\pi(x)$ exactly.    □
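The counting argument of Proposition 2 can be sketched as follows. The names `nibble_at` and `prime_pi` are hypothetical. This version rescans all preceding blocks on every call; a table of cumulative popcounts (not shown) would be the natural way to avoid that.

```python
BASE_PRIMES = (2, 3, 5, 7)
WEIGHTS = ((1, 8), (3, 4), (7, 2), (9, 1))   # (residue, bit weight)


def nibble_at(db: bytes, k: int) -> int:
    """H(k) from packed storage (high nibble = odd-indexed decade)."""
    byte = db[(k - 1) // 2]
    return byte >> 4 if k % 2 == 1 else byte & 0x0F


def prime_pi(x: int, db: bytes) -> int:
    """pi(x) by popcount over complete decades plus a partial last decade."""
    if x < 10:
        return sum(1 for p in BASE_PRIMES if p <= x)
    k_x, r_x = divmod(x, 10)
    count = len(BASE_PRIMES)
    for k in range(1, k_x):                      # complete decades before x
        count += bin(nibble_at(db, k)).count("1")
    h = nibble_at(db, k_x)                       # decade containing x
    count += sum(1 for r, w in WEIGHTS if r <= r_x and h & w)
    return count


db = bytes.fromhex("f5ae5ad52f")  # H(1) .. H(10)
print(prime_pi(100, db))  # 25
```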

5. The Ternary Constraint and Compressed Encoding

5.1. The Structural Theorem

The congruence $10 \equiv 1 \pmod{3}$ implies
$$10k + r \equiv k + r \pmod{3}.$$
Therefore $3 \mid (10k + r)$ if and only if $k \equiv -r \pmod{3}$. Applying this to the four candidates:
Theorem 1
(Ternary Constraint). Let $k \ge 1$ and $r \in \{1, 3, 7, 9\}$. Then:
  1. If $k \equiv 0 \pmod{3}$, then $3 \mid (10k + 3)$ and $3 \mid (10k + 9)$, so $b_3(k) = b_9(k) = 0$.
  2. If $k \equiv 1 \pmod{3}$, no candidate is divisible by 3.
  3. If $k \equiv 2 \pmod{3}$, then $3 \mid (10k + 1)$ and $3 \mid (10k + 7)$, so $b_1(k) = b_7(k) = 0$.
Proof. 
Follows directly from $10k + r \equiv k + r \pmod{3}$:
  • $k \equiv 0$: $k + 3 \equiv 0$ and $k + 9 \equiv 0 \pmod{3}$.
  • $k \equiv 1$: $k + 1 \equiv 2$, $k + 3 \equiv 1$, $k + 7 \equiv 2$, $k + 9 \equiv 1 \pmod{3}$; none is zero.
  • $k \equiv 2$: $k + 1 \equiv 0$ and $k + 7 \equiv 0 \pmod{3}$.
Since $10k + r > 3$ for all $k \ge 1$, divisibility by 3 implies compositeness.    □

5.2. Partition of the Nibble Alphabet

Theorem 1 assigns to each class of $k$ modulo 3 an effective sub-alphabet of the 16-symbol nibble alphabet:

| Class | Forced zero bits | Effective alphabet |
|---|---|---|
| $k \equiv 0 \pmod{3}$ | $b_3, b_9$ | $\{0, 2, 8, \mathrm{A}\}$ (size 4) |
| $k \equiv 1 \pmod{3}$ | none | $\{0, \ldots, \mathrm{F}\}$ (size 16) |
| $k \equiv 2 \pmod{3}$ | $b_1, b_7$ | $\{0, 1, 4, 5\}$ (size 4) |

Corollary 1.
For $k \equiv 0$ or $k \equiv 2 \pmod{3}$, the nibble $H(k)$ is determined by exactly 2 bits. For $k \equiv 1 \pmod{3}$, all 4 bits are unconstrained by this argument.
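The effective alphabets can be derived mechanically from the rule that residue $r$ is forced to zero exactly when $k + r \equiv 0 \pmod{3}$. The sketch below uses the hypothetical helpers `free_mask` and `effective_alphabet` to illustrate the derivation.

```python
WEIGHTS = {1: 8, 3: 4, 7: 2, 9: 1}   # residue mod 10 -> bit weight in H(k)


def free_mask(k_mod_3: int) -> int:
    """Bits of H(k) not forced to zero: r is free unless k + r ≡ 0 (mod 3)."""
    return sum(w for r, w in WEIGHTS.items() if (k_mod_3 + r) % 3 != 0)


def effective_alphabet(k_mod_3: int):
    """All nibbles whose set bits lie inside the free mask."""
    mask = free_mask(k_mod_3)
    return [h for h in range(16) if h & ~mask == 0]


for c in range(3):
    print(c, [format(h, "X") for h in effective_alphabet(c)])
```

For the classes 0 and 2 the loop prints the four-symbol alphabets $\{0, 2, 8, \mathrm{A}\}$ and $\{0, 1, 4, 5\}$ of the table above; for class 1 it prints all sixteen nibbles.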

5.3. Shannon Entropy Reduction

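Let $p_r$ denote the empirical density of primes among the candidates of residue $r$ in each class. By Dirichlet's theorem on primes in arithmetic progressions, the four residue classes $\{1, 3, 7, 9\}$ are asymptotically equidistributed, so within the $k \equiv 1$ class the four bits have approximately the same density $\hat{p}$. Over the tested range, the counts of Section 6.1 give $\hat{p} \approx 13{,}003 / 40{,}000 \approx 0.33$. Treating the bits as independent (a heuristic, not a consequence of Dirichlet's theorem) predicts $4\,h(\hat{p}) \approx 3.64$ bits, where $h$ is the binary entropy function.
For the restricted classes ($k \equiv 0$ and $k \equiv 2$), only 2 bits are free, giving a maximum entropy of 2 bits; the observed Shannon entropy is approximately 1.82 bits because the symbol 0 (no primes among the active candidates) is more likely than the others.
Writing $H_c$ for the entropy of the class $k \equiv c \pmod{3}$ (not to be confused with the nibble $H(k)$), the mean entropy per nibble is therefore
$$\bar{H} = \tfrac{1}{3} H_0 + \tfrac{1}{3} H_1 + \tfrac{1}{3} H_2 \approx \tfrac{1}{3}(1.82) + \tfrac{1}{3}(3.64) + \tfrac{1}{3}(1.82) \approx 2.43 \text{ bits},$$
compared to the 4 bits stored per nibble. With the measured mean of 2.4241 bits, this is a reduction of $39.4\%$.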
Experimentally, over $30{,}000$ decades up to $300{,}000$, the measured entropy values are:

| Class | Entropy (bits) | Alphabet size |
|---|---|---|
| $k \equiv 0 \pmod{3}$ | 1.8174 | 4 |
| $k \equiv 1 \pmod{3}$ | 3.6352 | 16 |
| $k \equiv 2 \pmod{3}$ | 1.8198 | 4 |
| Weighted mean | 2.4241 | |
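These entropies can be recomputed from the nibble histogram reported in Section 6.1. The sketch below copies those counts verbatim; the dictionary layout and the helper name `entropy` are illustrative choices.

```python
from math import log2

# Nibble counts per class of k mod 3, copied from the table in Section 6.1.
COUNTS = {
    0: {0x0: 4535, 0x2: 2235, 0x8: 2214, 0xA: 1016},
    1: {0x0: 1914, 0x1: 1036, 0x2: 1023, 0x3: 511, 0x4: 1061, 0x5: 472,
        0x6: 522, 0x7: 213, 0x8: 1029, 0x9: 502, 0xA: 506, 0xB: 218,
        0xC: 513, 0xD: 206, 0xE: 205, 0xF: 69},
    2: {0x0: 4485, 0x1: 2266, 0x4: 2255, 0x5: 994},
}


def entropy(counts) -> float:
    """Empirical Shannon entropy in bits of a histogram."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values() if c)


per_class = {c: entropy(hist) for c, hist in COUNTS.items()}
mean_entropy = sum(per_class.values()) / 3   # the three classes are equally frequent

for c, value in per_class.items():
    print(f"k = {c} (mod 3): {value:.4f} bits")
print(f"mean: {mean_entropy:.4f} bits, saving {1 - mean_entropy / 4:.1%}")
```

The unweighted mean is appropriate here because each class contains exactly 10,000 of the 30,000 decades.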

5.4. Compressed Encoding Scheme

Since the class of $k$ modulo 3 is known implicitly from the position index, no overhead is required to signal which alphabet is in use. The practical encoding assigns:
  • $k \equiv 0 \pmod{3}$: a 2-bit index into $\{0, 2, 8, \mathrm{A}\}$.
  • $k \equiv 1 \pmod{3}$: the original 4-bit nibble (or a Huffman code for further gain).
  • $k \equiv 2 \pmod{3}$: a 2-bit index into $\{0, 1, 4, 5\}$.
This yields an average of
$$\tfrac{1}{3}(2) + \tfrac{1}{3}(4) + \tfrac{1}{3}(2) = \tfrac{8}{3} \approx 2.667 \text{ bits per decade},$$
a lossless compression of $33.3\%$ in a simple fixed-code scheme. Entropy coding the $k \equiv 1$ class alone can lower the average to about $(2 + 3.64 + 2)/3 \approx 2.55$ bits per decade ($36.4\%$). Approaching the $39.4\%$ entropy bound requires entropy coding all three classes, and any variable-length code gives up the fixed bit offsets that make random access $O(1)$.
Remark 1.
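The strength of this compression is a direct consequence of $10 \equiv 1 \pmod{3}$. Since $10 - 1 = 9 = 3^2$, the prime 3 is the only prime $p$ with $10 \equiv 1 \pmod{p}$; it removes two of the four candidates in two of every three decades, and the class is known from $k \bmod 3$ at zero overhead. The prime 5 is already accounted for by the restriction to $r \in \{1, 3, 7, 9\}$. For a prime $p \ge 7$, the condition $p \mid (10k + r)$ is still periodic in $k$ with period $p$, but it forces at most one bit to zero in any given decade, only in four of every $p$ residue classes of $k$, and only when $10k + r > p$. The resulting partition is less uniform and the gain is smaller.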
[Code listing: compressed encoding routines; not recoverable from the source images.]
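The fixed 2/4/2-bit scheme of Section 5.4 can be sketched as follows. This is an illustrative layout rather than the author's implementation: it assumes decades start at $k = 1$, so that each group $(k, k+1, k+2)$ with $k \equiv 1 \pmod{3}$ uses $4 + 2 + 2 = 8$ bits and fills exactly one byte. The names `encode` and `decode` are hypothetical.

```python
ALPHABETS = {0: (0x0, 0x2, 0x8, 0xA), 2: (0x0, 0x1, 0x4, 0x5)}
START = (0, 4, 6)   # bit offset inside the byte for (k - 1) mod 3 = 0, 1, 2
WIDTH = (4, 2, 2)   # code width for the same three positions


def encode(nibbles) -> bytes:
    """nibbles[i] = H(i + 1); three consecutive decades share one byte."""
    out = bytearray((len(nibbles) + 2) // 3)
    for i, h in enumerate(nibbles):
        q, m = divmod(i, 3)
        cls = (i + 1) % 3
        # .index raises ValueError if a nibble violates the ternary constraint.
        code = h if cls == 1 else ALPHABETS[cls].index(h)
        out[q] |= code << (8 - START[m] - WIDTH[m])
    return bytes(out)


def decode(data: bytes, k: int) -> int:
    """Recover H(k), k >= 1, with one byte read and one table lookup."""
    q, m = divmod(k - 1, 3)
    code = (data[q] >> (8 - START[m] - WIDTH[m])) & ((1 << WIDTH[m]) - 1)
    cls = k % 3
    return code if cls == 1 else ALPHABETS[cls][code]


nibbles = [0xF, 0x5, 0xA, 0xE, 0x5, 0xA, 0xD, 0x5, 0x2, 0xF]  # H(1) .. H(10)
data = encode(nibbles)
print(data.hex())  # ffefddf0
```

Ten decades occupy four bytes here instead of five, and `decode` needs no scan of earlier positions because every byte boundary coincides with a group boundary.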

6. Experimental Validation

6.1. Verification of the Ternary Constraint

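Over all $30{,}000$ decades up to $300{,}000$, no nibble outside the prescribed alphabet was observed for $k \equiv 0$ or $k \equiv 2 \pmod{3}$:

| Nibble | $k \equiv 0$ | $k \equiv 1$ | $k \equiv 2$ | Violations |
|---|---|---|---|---|
| 0 | 4535 | 1914 | 4485 | 0 |
| 1 | 0 | 1036 | 2266 | 0 |
| 2 | 2235 | 1023 | 0 | 0 |
| 3 | 0 | 511 | 0 | 0 |
| 4 | 0 | 1061 | 2255 | 0 |
| 5 | 0 | 472 | 994 | 0 |
| 6 | 0 | 522 | 0 | 0 |
| 7 | 0 | 213 | 0 | 0 |
| 8 | 2214 | 1029 | 0 | 0 |
| 9 | 0 | 502 | 0 | 0 |
| A | 1016 | 506 | 0 | 0 |
| B | 0 | 218 | 0 | 0 |
| C | 0 | 513 | 0 | 0 |
| D | 0 | 206 | 0 | 0 |
| E | 0 | 205 | 0 | 0 |
| F | 0 | 69 | 0 | 0 |

Zero violations were observed, in agreement with Theorem 1.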
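A check of this kind can be reproduced with a short sieve. The sketch below assumes the decades $k = 1, \ldots, 30{,}000$; the exact index range used for the table is not stated in the text, and `count_violations` is a hypothetical helper.

```python
def count_violations(num_decades: int) -> int:
    """Count primes found at positions that Theorem 1 forces to be composite."""
    top = 10 * num_decades + 9
    sieve = bytearray([1]) * (top + 1)    # sieve[n] == 1 means "n is prime"
    sieve[0:2] = b"\x00\x00"
    p = 2
    while p * p <= top:
        if sieve[p]:
            sieve[p * p :: p] = bytes(len(range(p * p, top + 1, p)))
        p += 1
    # Residues r whose candidate 10k + r is forced composite, per k mod 3.
    forced = {0: (3, 9), 1: (), 2: (1, 7)}
    violations = 0
    for k in range(1, num_decades + 1):
        for r in forced[k % 3]:
            violations += sieve[10 * k + r]
    return violations


print(count_violations(30_000))  # 0
```

Because Theorem 1 is proved for every $k \ge 1$, a nonzero result would indicate a bug in the sieve or the indexing, not a counterexample.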

6.2. Size Comparison

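| Method | Size (bytes) | Bits/decade |
|---|---|---|
| Explicit prime list (32-bit; 25,997 primes × 4 bytes) | 103,988 | 27.730 |
| Full bitmap on $0, \ldots, 300{,}000$ | 37,500 | 10.000 |
| Odd-only bitmap | 18,750 | 5.000 |
| Caraccioli original | 15,000 | 4.000 |
| Caraccioli + ternary (fixed 2/4/2 bits) | 10,000 | 2.667 |
| Caraccioli + ternary (2 bits fixed, $k \equiv 1$ class at its entropy) | 9,544 | 2.545 |
| Caraccioli + ternary (Shannon limit) | 9,090 | 2.424 |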

7. Comparison with Traditional Representations

A full bitmap stores one bit per integer but wastes storage on even numbers and on decimal residue classes that cannot contain primes. An odd-only bitmap is already a reasonable baseline.
The original Caraccioli encoding reduces storage to 4 bits per decade by exploiting the residue structure mod 10. The ternary extension reduces this further to $8/3 \approx 2.67$ bits per decade (fixed scheme, with $O(1)$ encode and decode per decade) or about 2.42 bits per decade (entropy limit, which requires a variable-length code) by additionally exploiting the residue structure mod 3, with no loss of information.
The resulting size ratios are:
  • $1.50\times$ over the Caraccioli original for the fixed scheme ($4 \div 8/3$).
  • $1.65\times$ over the Caraccioli original at the Shannon limit ($4 \div 2.424$), a $39.4\%$ reduction.
  • $1.88\times$ (fixed scheme) to $2.06\times$ (Shannon limit) over an odd-only bitmap, and $4.12\times$ over a full bitmap at the Shannon limit.

8. Discussion

The ternary constraint is not a heuristic observation: it is a theorem that follows from $10 \equiv 1 \pmod{3}$ alone, and it holds for every $k \ge 1$ without exception. Its consequence, that two thirds of all nibble positions are confined to a 4-symbol alphabet, is a direct structural fact about the distribution of primes modulo 30, since the admissible residues mod 30 are exactly those coprime to 30.
The fixed-scheme compression gain is practically implementable. Decoding requires only $k \bmod 3$, which is available from the position index, and no auxiliary table beyond the two 4-element alphabets.
A natural further extension is the wheel modulo $210 = 2 \cdot 3 \cdot 5 \cdot 7$, which has $\varphi(210) = 48$ admissible residues per block of 210 integers (the present scheme corresponds to the wheel modulo 30, with $\varphi(30) = 8$ admissible residues per three decades). Since $10 \not\equiv 1 \pmod{7}$, the constraint from 7 does not align with the decade structure as simply as the ternary one: it has period 7 in $k$ and affects a single bit at a time, so it requires a more involved analysis.

9. Conclusions

We presented a decimal–hexadecimal block encoding for primality (Caraccioli encoding) and established a structural theorem, the Ternary Constraint, that assigns to each nibble position an effective alphabet of size 4, 16, or 4 according to $k \bmod 3$.
This partition yields:
  • a provable lossless compression of $33.3\%$ (fixed scheme) and an empirical entropy bound of $39.4\%$ (Shannon limit) over the original encoding,
  • $O(1)$ encode and decode complexity per decade in the fixed scheme,
  • a combinatorial explanation for the period-3 structure of the nibble sequence,
  • the observation that $p = 3$ is the only prime with $10 \equiv 1 \pmod{p}$, which is what makes this constraint so strong and uniform.

References

  1. Hardy, G. H.; Wright, E. M. An Introduction to the Theory of Numbers, 6th ed.; Oxford University Press, 2008.
  2. Crandall, R.; Pomerance, C. Prime Numbers: A Computational Perspective, 2nd ed.; Springer, 2005.
  3. Knuth, D. E. The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, 3rd ed.; Addison-Wesley, 1997.
  4. Cover, T. M.; Thomas, J. A. Elements of Information Theory, 2nd ed.; Wiley, 2006.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.