Preprint
Article

This version is not peer-reviewed.

Statistically Validating Homeric Clusters of Translatable Oral Characteristics

Submitted: 26 February 2026

Posted: 27 February 2026


Abstract
Translatable oral characteristics can be translated between poetic languages that are linguistically or metrically very different. This paper begins by describing an effort to cluster such characteristics manually and to list the passages in which they co-occur. Through this process, five non-trivial clusters were discovered in the Iliad. The clusters are non-trivial in the sense that they have many semantically diverse characteristics, are often detailed, are few in number, and are widely applicable. As such, they contrast with trivial clusters of semantically close characteristics, like king, wealth, and gold. The discovery of three of these non-trivial clusters is supported by chi-square tests of independence. They are conjectured to have origins in separate Mycenaean, Aeolian, and Ionian oral traditions. Nevertheless, the statistical investigation is about the existence of these three non-trivial clusters, not their geographical origins. A list of 21 statistically significant results is obtained, of which 13 are extremely significant, while only 4 are not significant. Given that no trivial concept, type-scene, or story explains the clusters, their existence, meaning, contours, and origins should be of primary concern for Homerists, as they probably have much historical and literary value.
Subject: Arts and Humanities - Classics

1. Introduction

This paper uses statistical methods to defend the claim that three separate non-trivial clusters of translatable oral characteristics can be distinguished in the Iliad.1 The many often detailed and semantically diverse characteristics,2 the many passages in which the clusters are used, and the small number of such clusters suggest that they may reflect oral traditions or remnants of oral traditions that have been translated or transmitted to the Ionian oral tradition of the Iliad and the Odyssey. The content of some prominent characteristics suggests that two of the transmitted clusters have a Mycenaean and Aeolian origin, while the third appears to be the Ionian tradition in a pure state.3 This discovery probably implies that certain passages in a centuries-old Iliad tradition were typically performed using one of the clusters. In any case, setting aside speculation about how our Iliad (the Iliad that has been handed down to us) came to be, this paper demonstrates that the existence of three non-trivial clusters can be supported by statistics. The characteristics and passages of the conjectured clusters, and a sheet with the statistical analysis, form the Supplementary Dataset for Cluster Analysis, deposited in Zenodo (Blondé 2026).4
Supporting a hypothesis via statistics often amounts to making a careful and unbiased case against a null hypothesis. In this analysis, the null hypothesis for a given translatable oral characteristic is the claim that the characteristic occurs independently of the passages of its cluster. The alternative hypothesis is that the occurrences of the characteristic have a tendency to cluster together in these passages. In order to make the analysis unbiased and reproducible, digital search queries determine the occurrences in the text for most analyzed characteristics.
The next section provides a brief background for the Homeric Question and the Homeric artificial language. In Section 3, the methodology of identifying and statistically supporting non-trivial clusters is explained and defended. The three non-trivial clusters are presented in Section 4. The statistical investigation is explained in Section 5, a short discussion follows in Section 6, and Section 7 concludes.

2. The Homeric Question

The Homeric Question is the question of how, by whom, where, and when the Iliad and the Odyssey were formed into their (more or less)5 fixed shapes, as well as questions related to the works’ historical backgrounds. Investigations into the Homeric Question date back to the Hellenistic period, when certain Alexandrian critics, the chorizontes (separatists), claimed that the Iliad and the Odyssey did not have the same author.6 Not much progress was made until Wolf published his Prolegomena ad Homerum in 1795, in which he proposed that the artistic unity of the Homeric epics is in fact the collective work of many editors who adapted the original versions to more modern tastes.7
In the 19th century, Homeric scholarship became an arena in which Analysts opposed Unitarians.8 The former claimed the Homeric works to be the result of many authors and editors, while the latter argued that the works are a whole that is best explained by a single genius: Homer. In particular, many Analysts believed that our Iliad consists of an Ur-Iliad to which many inferior interpolations had subsequently been added. However, in spite of the many detailed arguments presented by the Analysts, they could not find agreement about how the Iliad and the Odyssey had evolved from an original to a final state. Moreover, many of their arguments failed to take the oral nature of the epics into account. For instance, repetitions were often considered to be borrowed from an original rather than the outcome of an oral tradition in which they could serve as building blocks during an improvised performance.
The Oralists, in the wake of Milman Parry, largely ended the use of the Analytic methodology.9 During his fieldwork with Yugoslavian bards in the 1930s, Parry realized that many aspects of the Homeric texts could be explained by their oral nature: repetitions, type-scenes, epithets, digressions, etc. In fact, Neoanalysis is to some extent a continuation of the philological project of the Analysts that acknowledges Oralism.10 Neoanalysts investigate whether the Homeric works have borrowed earlier poetic material, such as from the Epic Cycle, or whether other poets have reused Homeric material. In doing so, Neoanalysts also acknowledge the inevitable fact that the Homeric works and other ancient poems received a fixed form at some point in their history. The relationship between the introduction of writing in Greece and the written fixation of the Iliad and the Odyssey has also been investigated.11
A widely accepted non-Greek influence on the Iliad and the Odyssey is that of the Near East.12 Eastern characteristics include epic quests, a journey to the underworld, catalogues, and the scales of Zeus. The presence of these characteristics in the Homeric works is undeniable; they can be identified through direct comparison with extant epics, such as that of Gilgamesh. However, such characteristics are less pervasive in the Iliad than are those that are discussed in this paper. For example, many of them occur only once and they do not appear in clusters. By contrast, each characteristic in the Supplementary Dataset for Cluster Analysis occurs at least three times in the passages of its cluster.
More must be said about the hexameter-based artificial poetic language of the Iliad and the Odyssey and about how it relates to the characteristics of the clusters.13 This artificial language largely makes use of the Ionic dialect, but it also has Aeolic ingredients, for example, for words or phrases that do not fit in the hexameter when translated to the Ionic dialect. Rarely, there are Arcadocypriot variants and archaisms that may even stem from Mycenaean Greece. Attic variants might be explained as intrusions that materialized after the Homeric works had been fixed. Dialectal variations are therefore not necessarily characteristics of any cluster – they are demanded by the hexameter system rather than by the contextual presence of a translatable, meaning-based cluster. In other words, linguistic features are most often context independent, as opposed to translatable characteristics, which are most often context dependent.
Acknowledging the different behavior of dialect-specific and translatable characteristics is a cornerstone of the clustering methodology utilized in this paper. For example, while the linguistic characteristics of modern Santa Claus rhymes may be some decades behind ordinary language, they betray little about the age-old origins of the translatable stories to which the rhymes allude.14 The analysis is admittedly more complex for an orally transmitted artificial language based on hexameters, in which linguistic characteristics can be much older, as shown by the Arcadocypriot archaisms. Nevertheless, if even translatable oral characteristics are sometimes context independent, then dialect-specific characteristics are, with our current knowledge about oral theory, unpromising candidates for characterizing meaningful clusters.15 After all, if a translatable oral characteristic does not fit in the hexameter, it is not altered but substituted with a synonym. If, on the other hand, a dialectal characteristic does not fit, it will only be retained in exceptional cases.
An approach to the Homeric Question that is somewhat similar to the characteristic-based clustering approach utilized in this paper is computational (or quantitative) authorship analysis, which clusters stylometrics.16 However, most stylometrics, which are clearly discernible syntax-based characteristics that do not fully exploit the semantic layer, are not translatable to other languages. Hopefully, large language models will soon be able to achieve more.

3. Methodology

3.1. Manual Clustering

The methodology used in this study to discover non-trivial clusters in the Iliad centered on manually clustering translatable characteristics. The following steps, which have a loose temporal order, can be iterated in parallel to repeat this methodology:
  • Become familiar with a certain copy of the Iliad through reading and lookup.
  • Identify characteristics (simple concepts, patterns, or relations) that reoccur.
  • Memorize the context, book, and layout locations of these characteristics.17
  • List clusters of characteristics that often co-occur in the same passages.
  • Reduce advanced clusters, relations, and patterns to a single characteristic.
  • Start with a small cluster and expand the list later.
  • Make lists of passages that belong to the same cluster.
This process results in meaningful clusters, their lists of characteristics, and their lists of passages. Two terms need to be distinguished clearly: characteristics (the concepts themselves) and occurrences (the places where the concepts occur in the Iliad). The characteristics form clusters by being proximate in occurrence space (or contextual space), not necessarily in conceptual space. Clusters of conceptually proximate characteristics are trivial clusters that can often be grouped into a single characteristic.
One of the difficulties during this endeavor is that the clusters overlap, both with respect to the characteristics (conceptual overlap) and the passages (contextual overlap). It often takes time to realize that a characteristic, or two closely related versions of it, belong to two different clusters; likewise, there is a tendency to assign a passage to a single cluster rather than to multiple. Due to these challenges, it took many years to establish stability in the assignments of characteristics and passages to clusters. This muddling highlights the need for statistics to prove that the identified distinctions are meaningful. In any case, some degree of overlap is what we can expect for clusters that have geographically closely related origins.
The addition of each characteristic to a final list in the characteristics sheet of the Supplementary Dataset for Cluster Analysis has been carefully deliberated, especially with respect to each one’s level of generality. Generic characteristics capture more occurrences, but they are less precise than specific characteristics. For this reason, the included characteristics have many conceptual dependencies (such as gold and precious metal) and partial dependencies (such as gold and silver). Also, the length of the passages chosen for inclusion was a balanced choice. Shorter passages were sometimes selected for characteristic purity (the absence of characteristics of other clusters), while passages must remain long enough to counter the suspicion that they are selected for the sake of a small number of occurrences. Because the average number of lines of the passages is 22 (Mycenaean), 40 (Aeolian), and 83 (Ionian), the average number of occurrences per passage is in the tens or more than a hundred, which invalidates this suspicion.
A challenge in providing the translatable characteristics with descriptions is the Homeric artificial language’s use of many synonyms that have exactly the same meaning. Examples are Achaioi, Argeioi, and Danaoi, all of which refer to the Greeks. However, it is the meaning that can be translated to other and younger dialects and languages that is best preserved through the ages. The many synonyms served to help the bards more easily fill in the demanding hexameters of the Homeric artificial language, whereas the meaning shared by the synonyms reaches far beyond that language in space and time. This makes the usage of Homeric Greek descriptions both inefficient and misleading. For this reason, the characteristics’ short descriptions are in English, not in Homeric Greek.

3.2. Statistically Investigating a Cluster

No statistical analysis was made before or during the process of publishing the results of the manual clustering in book form. The statistical test that is used in this paper is an independence test18 of two properties that a line of the Iliad might have for a given characteristic C of a cluster CC: a) whether C occurs in the line, and b) whether the line has been classified as belonging to a passage of CC. The outcome of the test is a p-value. Small p-values indicate high statistical significance and a small chance that the characteristic belongs to the cluster by coincidence. This test is especially useful for characteristics that can be found via digital search queries.
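The cross-tabulation behind this test can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the paper: it assumes that every line has already been given two boolean flags, and the function name `contingency_table` and the six toy flags are hypothetical.

```python
from typing import Sequence, Tuple

Table = Tuple[Tuple[int, int], Tuple[int, int]]


def contingency_table(has_c: Sequence[bool], in_cc: Sequence[bool]) -> Table:
    """Cross-tabulate two per-line flags into a 2x2 table.

    Rows: characteristic C present / absent in the line.
    Columns: line inside / outside a passage of cluster CC.
    """
    if len(has_c) != len(in_cc):
        raise ValueError("both flag lists must cover the same lines")
    counts = [[0, 0], [0, 0]]
    for c_present, cc_line in zip(has_c, in_cc):
        row = 0 if c_present else 1
        col = 0 if cc_line else 1
        counts[row][col] += 1
    return (counts[0][0], counts[0][1]), (counts[1][0], counts[1][1])


# Toy input for six lines (hypothetical flags, for illustration only).
toy_has_c = [True, False, True, False, False, True]
toy_in_cc = [True, True, False, False, False, True]
toy_table = contingency_table(toy_has_c, toy_in_cc)
```

The four cells of the resulting table are the only input that the independence test needs; the text of the lines themselves plays no further role.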
The short descriptions of the characteristics that have been published are reconstructive descriptions. This means that they aim to reconstruct how the characteristic functioned within the conjectured oral tradition from which it stems. In contrast, source-specific (here, Iliad-specific) descriptions created for the purposes of this paper describe how the characteristic can best be distinguished in our Iliad. For example, the descriptions ‘old man’ and ‘newly introduced old man’ are reconstructive and source specific, respectively, because most occurrences of ‘old man’ refer to Nestor or Priam in our Iliad. To determine statistical significance, source-specific descriptions must be used.

4. The Three Clusters

The results of the manual clustering that fall within the scope of this paper are three non-trivial clusters, called Mycenaean, Aeolian, and Ionian, and characterized by 58, 54, and 104 translatable characteristics, respectively. These characteristics are ordered by an importance score that is based on estimates of (from important to less important) how characteristic, historically informative, semantically proximate to other characteristics, and frequently occurring they are.

4.1. The Mycenaean Cluster

The fifteen most important characteristics of the Mycenaean cluster are the following: digressions (M1), age-old, well-known myths and stories (M2), kings (M3), seven-gated Thebes (M4), the change of power (M5), bloody feuds within the family (M6), revenge upon return (M7), (divine) genealogies (M8), wars against cities or between peoples (M9), the cycle of misery (M10), failed marriages (M11), the brave hero (M12), the many places and personal names (M13), the Peloponnese (and Central Greece) (M14), and the riches of the soil, typical of a place or city (M15). The conclusion that the origin of this cluster is the (early) Mycenaean Age is supported by many of these characteristics, especially those related to the struggle for the power over the Mycenaean palaces. Additional examples that show the semantic diversity of this cluster are superlatives, as in ‘the bravest of all mortals’ (M36), gods who cruelly punish those who insult them (M40), the old age of a man (M41), and male, godlike beauty (M43). Some characteristics that do not frequently occur are grouped with other characteristics that are similar – for example, Heracles, Tydeus, Neleus, Peleus, and Nestor (M20) – or that often occur together in a small subcluster or type-scene, such as the Furies (Erinyes), wrathful goddesses, and Hades (M28).
There are 70 passages in the Iliad that have been marked as Mycenaean, totaling 1560 lines of the 15691 lines in Lattimore’s Iliad (average passage length is 22.29 lines, and 9.94% of the lines are Mycenaean). The longest Mycenaean passage is the Catalogue of Ships, with 277 lines (Il. 2.494–770).19 A clear example of a Mycenaean passage is the digression about Phoinix’ youth (Il. 9.437–484). Here follows an annotation of Il. 9.478–484:
φεῦγον (1, 19) ἔπειτ᾽ ἀπάνευθε (24) δι᾽ Ἑλλάδος (13) εὐρυχόροιο (15),20
Φθίην (13) δ᾽ ἐξικόμην (24, 44) ἐριβώλακα (15) μητέρα (15) μήλων (31)
ἐς Πηλῆα (13, 20) ἄναχθ᾽: ὃ δέ με πρόφρων (30) ὑπέδεκτο (34),
καί μ᾽ ἐφίλησ᾽ (30) ὡς εἴ τε πατὴρ (30) ὃν παῖδα (30) φιλήσῃ (30)
μοῦνον τηλύγετον (5, 30) πολλοῖσιν ἐπὶ κτεάτεσσι (23, 30, 34),
καί μ᾽ ἀφνειὸν (23) ἔθηκε (5, 26), πολὺν δέ μοι ὤπασε (5, 26) λαόν (3):
ναῖον (44) δ᾽ ἐσχατιὴν (44) Φθίης (13) Δολόπεσσιν (13) ἀνάσσων (3).
Then (M1) I fled far away (M19, M24) through the wide (M15) spaces (M15) of Hellas (M13) and came as far (M24, M44) as generous (M15) Phthia (M13), mother (M15) of sheepflocks (M31), and to lord (M3) Peleus (M13, M20), who accepted (M34) me with a good will (M30) and gave me his love (M30), even as a father (M30) loves (M30) his own son (M30) who is a single child (M5, M30) brought up (M30) among many possessions (M23, M30, M34). He made (M5) me a rich (M23) man (M26), and granted (M5) me many people (M3, M26), and I lived (M44), lord (M3) over the Dolopes (M13), in remotest (M44) Phthia (M13).

4.2. The Aeolian Cluster

The twelve most important characteristics of the Aeolian cluster are the environment of Troy (A1), the close relationship with the Mycenaean cluster (A2), the mixture with war passages (A3), proper names specific to the Aeolian cluster (A4), precious, special horses (A5), defensive walls with a history (A6), eponyms (A7), seafaring, storms at sea, and islands (A8), Aeneas (A9), Heracles (A10), rivers (A11), and the fall of Troy (A12). The proper names specific to the Aeolian cluster are names of people, cities, regions, rivers, gods, horses, or ancestors.21 Most of these are related to Troy or its environment.
In total, 68 passages were marked as Aeolian, totaling 2717 lines (average length is 39.96 lines and 17.32% of the lines are Aeolian). The longest passage is 401 lines in length (Achilles fighting along and against the river Xanthus and the gods fighting in Il. 21.120–520). A clear example is Apollo’s care for Sarpedon’s body (Il. 16.666–683). An annotation of Il. 16.666–670 follows here:
καὶ τότ᾽ Ἀπόλλωνα (41) προσέφη νεφεληγερέτα Ζεύς (33, 42):22
‘εἰ δ᾽ ἄγε νῦν φίλε Φοῖβε (41), κελαινεφὲς αἷμα (38) κάθηρον (28)
ἐλθὼν ἐκ βελέων Σαρπηδόνα (4, 18), καί μιν ἔπειτα
πολλὸν ἀπὸ πρὸ φέρων (28) λοῦσον (28) ποταμοῖο (11, 25, 28) ῥοῇσι
χρῖσόν (28) τ᾽ ἀμβροσίῃ, περὶ δ᾽ ἄμβροτα εἵματα ἕσσον (28):
And now Zeus (A33, A42) who gathers the clouds spoke a word to Apollo (A41): “Go if you will, beloved Phoibos (A41), and rescue (A28) Sarpedon (A4, A18) from under the weapons, wash (A28) the dark suffusion of blood from him (A38), then carry (A28) him far away and wash (A28) him in a running river (A11, A25, A28), anoint (A28) him in ambrosia, and put ambrosial clothing upon him (A28).”
Compared to the other clusters, the Aeolian characteristics are semantically more diverse and thus apparently less dependent on each other. However, these characteristics are tightly connected by a myriad of stories outside of the Iliad. For example, the story that Thetis submerged Achilles in the river Styx by holding him by his heel, leading to his limited invulnerability and death by Paris’ arrow to that heel, has the following Aeolian characteristics: rivers (A11), Achilles (A15), sea gods and sea monsters (A21), nymphs and gods as one’s mother or father (A22), mighty mothers, women, and goddesses (A24), immersing a body in a river or sea (A25), a bow and arrow (A26), medicine, magic, and mysteries (A32), Paris and Pandarus (A35), injuries (A36), and fate and the wishes of the gods (A44).

4.3. The Ionian Cluster

The fifteen most important characteristics of the Ionian cluster are extraneous clusters as subspecialty (I1), materialism (I2), the guest friendship (xenia) (I3), the houses of nobles, servants, shepherds, heralds, and bards (oikos) (I4), the system of epithets specific to the Ionian cluster (I5), Homeric similes (I6), verbosity (I7), the gods in their home on Olympus (I8), type-scenes that repeat almost verbatim (I9), travel and travel matters (I10), jurisdiction (I11), bards (I12), shipping merchants, pirates, and slavers (I13), Sidon and Phoenicia (I14), and ships and shipping (I15). Additional examples include emotional, lovely, and poetic scenes (I17), double epithets (I22), sex and entertaining the audience (I52), and gods who swear by the Styx (I77). The Odyssey is a useful source for learning more about the Ionian cluster.
There are 50 passages that are marked as Ionian, totaling 4139 lines (average length is 82.78 lines, and 26.38% of the Iliad is marked as Ionian). These passages are those in which the Ionian cluster is present in a sufficiently pure state – there are no passages in the Iliad from which the Ionian cluster is entirely absent. In particular, the system of epithets specific to the Ionian cluster (I5) is both dependent on Ionian passages and pervasive in the Iliad. The longest Ionian passage is about the funeral games for Patroclus (639 lines, Il. 23.259–897). Clear examples of Ionian passages include the gods enjoying themselves on Olympus (Il. 1.601–611), Achilles and Patroclus going to sleep (Il. 9.656–671), Hephaestus receiving Thetis (Il. 18.369–427), and Zeus planning to save Hector’s corpse (Il. 24.20–111). Here follows an annotation of Il. 1.601–611:
ὣς τότε μὲν πρόπαν ἦμαρ (98) ἐς ἠέλιον καταδύντα (98)23
δαίνυντ᾽ (37, 38), οὐδέ τι θυμὸς ἐδεύετο δαιτὸς (38) ἐΐσης (9),
οὐ μὲν φόρμιγγος (12, 28, 32) περικαλλέος (2, 5) ἣν ἔχ᾽ Ἀπόλλων (12, 32),
Μουσάων (32) θ᾽ αἳ ἄειδον (28) ἀμειβόμεναι (28) ὀπὶ καλῇ (17, 37).
αὐτὰρ ἐπεὶ κατέδυ (98) λαμπρὸν (46) φάος (46) ἠελίοιο,
οἳ μὲν κακκείοντες ἔβαν οἶκον (4, 25, 64) δὲ ἕκαστος,
ἧχι ἑκάστῳ δῶμα (4) περικλυτὸς (22) ἀμφιγυήεις (22, 39)
Ἥφαιστος (84) ποίησεν (2) ἰδυίῃσι (46) πραπίδεσσι (46):
Ζεὺς δὲ πρὸς ὃν λέχος (9, 24, 25) ἤϊ᾽ Ὀλύμπιος (7, 22) ἀστεροπητής (7, 22),
ἔνθα πάρος κοιμᾶθ᾽ (98) ὅτε μιν γλυκὺς (5, 17) ὕπνος ἱκάνοι (7, 9, 39):
ἔνθα καθεῦδ᾽ ἀναβάς (9, 67), παρὰ (64) δὲ χρυσόθρονος (24, 41) Ἥρη (9, 67).
Thus, the whole day (I98) until the sun went down (I98) they feasted (I37, I38). Neither did their hearts lack an equal portion (I38) of the banquet (I9, I38) nor the exceedingly (I2, I5) beautiful (I2, I5, I17) lyre (I12, I28, I32) of Apollo (I12, I32) and the Muses (I32) singing (I28) beautifully (I17), taking turns (I28, I37). But when the radiant (I46) light (I46) of the sun went down (I98) they went each to their own home (I25, I64) to sleep (I98) where (I39) for each the famous (I22), lame-footed (I22) Hephaestus (I84) had built a house (I4, I25) with (I39) skillful (I2, I46) craftsmanship (I2, I46). Zeus, the Olympian (I7, I22) lightning master (I7, I22), went to his bed (I9, I24, I25). Having gone up there (I64) he slept (I9, I67, I98), beside (I64) Hera (I9, I67) of the golden (I41) throne (I24).24
Several lines of evidence support the claim that the discovered ‘Ionian’ cluster reflects the pure state of the oral tradition of the Homeric artificial language. First, its historical core centers on the xenia and the oikos, which fits in the Homeric Age and the centuries that preceded it.25 Second, the large number (104) of characteristics suggests that they have not eroded during translation to another language or dialect. Several of them are also barely translatable; they are often demanded by the hexameter system, although they do occur more often in Ionian passages. Examples include the system of epithets specific to the Ionian cluster (I5), type-scenes that repeat almost verbatim (I9), double epithets (I22), descriptive clauses (I39), and a line with multiple addresses (I45).
Third, the system of epithets (I5) is strongly related to other characteristics in the Ionian cluster. Three kinds of subjects are provided with epithets in a natural manner in the Ionian passages: material objects (I2), women (I26 and I31), and animals (I81 and I83). Most often, they are described by how they look (all three) or how they move (primarily animals). Extraneous subjects from other clusters often adopt such epithets, such as perikallea diphron (the very beautiful chariot) and podas okys Achilleus (the swift-footed Achilles).
Just as with the Ionian epithets, the Homeric similes (I6) are rather context independent (and therefore probably Ionian), and they fit well with the Ionian cluster with respect to their content: emotional, lovely, and poetic scenes (I17), crafts and professions (I27), peacefulness (I37), actions and objects described in detail (I40), the animal world (I81), and hunting and farming (I83).

5. Statistical Analysis

This section describes the testing conducted to determine if a given characteristic occurs independently of the manually identified passages of its cluster (the null hypothesis H0) or if it occurs significantly more (or less)26 often in these passages (the alternative hypothesis H1).27 The independence test was executed by creating a 2x2 contingency table that contained the categories in which a line can be classified: 1) whether or not the line contains the given characteristic and 2) whether or not the line is part of a passage that is classified as belonging to the cluster of that characteristic. If assessed for a sufficiently large number of lines, this test can determine whether the two categories are significantly associated. Table 1 can serve as an example.
A line was classified as containing ‘Apollo’ (a subcharacteristic of ‘Apollo, Poseidon, and sometimes Artemis’ [A41]) if it was found via a search query on ‘Apollo’, ‘Phoibos’, or (son of) ‘Leto’. Whether or not a line is an Aeolian line is documented in the Supplementary Dataset for Cluster Analysis.
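A search-based classification of this kind can be sketched as follows. The sketch assumes a plain, case-sensitive substring search over an English translation; the function name and the way the sample text is split into lines are illustrative only. The three sample strings are taken, without their annotations, from the translation of Il. 16.666–670 quoted in Section 4.2.

```python
import re
from typing import List, Sequence

# Search terms named in the text for the subcharacteristic 'Apollo'.
APOLLO_TERMS = ("Apollo", "Phoibos", "Leto")


def lines_matching(lines: Sequence[str], terms: Sequence[str] = APOLLO_TERMS) -> List[bool]:
    """Flag each line that contains at least one of the search terms."""
    pattern = re.compile("|".join(re.escape(term) for term in terms))
    return [bool(pattern.search(line)) for line in lines]


# Sample text; the split into three "lines" is hypothetical.
sample_lines = [
    "And now Zeus who gathers the clouds spoke a word to Apollo",
    "Go if you will, beloved Phoibos",
    "wash him in a running river",
]
sample_flags = lines_matching(sample_lines)
```

Because the search is a substring match, a term such as ‘Leto’ would also match longer words that contain it, so the hits of a real query would still need a quick check.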
This methodology may seem flawed, because, as explained, the Homeric artificial language uses many synonyms to indicate the same concept, depending on the hexametric position in which the concept should fit. In the case of Apollo, the following synonyms exist: Σμινθεύς (Smintheus, Il. 1.39), ἑκάεργος (archer, e.g. Il. 1.147), ἀργυρότοξος (the silver-bow god, e.g. Il. 5.517), ἕκατος (the far striker, e.g. Il. 20.71), and Διὸς υἱός (Zeus' son, e.g. Il. 22.302). Including them in the search would result in extra work that is more error prone, less reproducible, and, most importantly, more likely to be biased. One can easily imagine a situation in which a non-Aeolian occurrence of ‘archer’ is overlooked as a synonym for Apollo, while more attention is paid to including all the Aeolian occurrences. For this reason, the analysis is deliberately limited to digital searches on a few prominent terms for concepts that occur frequently (resulting in 141 occurrences for Apollo). After all, this exploits the strength of (extreme) statistical significance: it excludes the null hypothesis with high confidence in spite of not using all the evidence.
The scripting language R was used to calculate the p-values (the probabilities of getting the observed or even more extreme values if H0 is true) via both a Fisher’s exact test and, if the smallest expected value was greater than 5, a Pearson’s χ² test with Yates’ continuity correction.28 The least significant (that is, the largest) p-value was then selected and rounded up to an even less significant value. So, there were several mechanisms in place to avoid an overestimation of significance.
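The selection procedure can be approximated with the Python standard library. This is a sketch under stated assumptions, not the R script used for the paper: Fisher’s test is taken to be two-sided (summing the probabilities of all tables that are no more probable than the observed one), the function names are hypothetical, and the final rounding step is not reproduced. The example table for Apollo is reconstructed from figures quoted in this section (141 occurrences, 57 of them in the 2717 Aeolian lines, 15691 lines in total), assuming at most one occurrence per line.

```python
from math import comb, erfc, sqrt


def fisher_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)

    def weight(k: int) -> int:
        # Unnormalised hypergeometric probability, kept as an exact integer.
        return comb(col1, k) * comb(n - col1, row1 - k)

    observed = weight(a)
    tail = sum(w for w in map(weight, range(lo, hi + 1)) if w <= observed)
    return tail / comb(n, row1)


def min_expected(a: int, b: int, c: int, d: int) -> float:
    """Smallest expected cell count under independence."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    return min(r * k / n for r in rows for k in cols)


def chi2_yates(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square test with Yates' continuity correction (1 df)."""
    n = a + b + c + d
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    stat = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with one degree of freedom.
    return erfc(sqrt(stat / 2))


def conservative_p(a: int, b: int, c: int, d: int) -> float:
    """Largest (least significant) p-value among the applicable tests."""
    p_values = [fisher_two_sided(a, b, c, d)]
    if min_expected(a, b, c, d) > 5:
        p_values.append(chi2_yates(a, b, c, d))
    return max(p_values)


# Reconstructed table: Apollo present/absent by Aeolian/non-Aeolian line.
apollo_p = conservative_p(57, 84, 2660, 12890)
```

Taking the maximum of the two p-values is what makes the procedure conservative: a characteristic is only reported as significant if both tests agree.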
In the case of Apollo in the Aeolian cluster, the expected value (141*2717)/15691 was found, which is about 24.4. This is the number of times we could expect to find Apollo in the Aeolian passages if H0 is true. The calculated p-value was rounded to 0.00000000008 (8E-11), which is (larger than) the probability of finding Apollo 57 or more times in these passages if H0 is true. The association between Apollo and the Aeolian passages was therefore found to be extremely significant. Note that this is the case even though the majority of the occurrences of Apollo are non-Aeolian. After all, the majority of the lines in the Iliad are non-Aeolian. These statistical calculations were necessary to validate the probabilistic intuitions acquired during the manual clustering phase. Table 2 shows all the (source-specific) (sub)characteristics that were analyzed statistically,29 ordered according to how frequently they occur. Characteristics with a p-value > 0.05 are not significant. Those with 0.05 ≥ p-value > 0.01 are significant, those with 0.01 ≥ p-value > 0.001 are highly significant, and those with 0.001 ≥ p-value are extremely significant.
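The expected count and the four significance tiers described above can be written out as a short Python sketch. The function names are hypothetical; the numbers are those given in the text for Apollo.

```python
def expected_count(occurrences: int, cluster_lines: int, total_lines: int) -> float:
    """Expected number of occurrences inside the cluster passages if H0 is true."""
    return occurrences * cluster_lines / total_lines


def significance_label(p: float) -> str:
    """Map a p-value to the four tiers used in the text."""
    if p > 0.05:
        return "not significant"
    if p > 0.01:
        return "significant"
    if p > 0.001:
        return "highly significant"
    return "extremely significant"


# Apollo in the Aeolian cluster: 141 occurrences, 2717 Aeolian lines, 15691 lines.
apollo_expected = expected_count(141, 2717, 15691)  # about 24.4
apollo_label = significance_label(8e-11)            # rounded p-value from the text
```

The observed count of 57 is more than twice the expected count, which is why the association is extremely significant even though most occurrences of Apollo fall outside the Aeolian passages.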
Several attempts were made to find a frequently occurring characteristic that does not have significance by analyzing characteristics about which there was doubt: the old age of a man, the number nine, Heracles, a god using mist, three times the same action, a whip, and wine. The number nine, which occurs frequently in both the Iliad and the Odyssey, was not strongly associated with any cluster. There was doubt about ‘whip’ belonging to the peace-loving Ionian cluster, as whips are also used by the charioteers in the war passages. In spite of this, the intuition that I65 (the whip and unreluctantly trotting horses) is dependent on the Ionian passages was correct.
Most of the analyzed (sub)characteristics were chosen because they are digitally findable. The superlatives were found by reading the first 12 books of the Iliad, which was enough to achieve extreme significance.30 Furniture was identified by searching for several terms: lechos (bed), trapez- (table), thronos (chair), and threnys (footstool). The age of an untamed head of cattle is more easily found in Lattimore’s translation by searching for ‘year’ or ‘year-old’ and ‘yearling’ rather than for henis (one year old), pentaeteros (five years old), or hexetes (six years old). The analysis of the subcharacteristic ‘Pergamos’ exemplifies that a lack of occurrences results in a lack of significance.
One important finding of this analysis is the difference between reconstructive characteristics and source-specific characteristics (see Table 2 versus Table 3). For example, the old age of a man (M41) is much less significantly associated with the Mycenaean cluster than is the old age of a newly introduced man (0.06 versus 0.00005); outside of the Mycenaean context, many mentions of old age refer to Nestor and Priam. Even more striking is the difference between ‘Lycian’ and ‘Lycia’ (0.7 versus 0.000000000003) in the Aeolian cluster, as the Lycians are more often mentioned throughout the war passages. Lycia, in contrast, is almost invariably mentioned alongside many other Aeolian characteristics to form an Aeolian passage. This indicates that the manual clustering methodology is neither mechanical nor syntactical. Instead, its use has resulted in intuitions about meaningful concepts and how they occur in the passages of their cluster, without always realizing how semantically close, source-specific variants of them occurred outside of these passages. The fact that such distinctions could be made during the statistical analysis shows the success, rather than the failure, of the manual clustering methodology and its potential to reconstruct clusters with meaningful origins.

6. Discussion: Underlying Hypothesis

A hypothesis about oral theory that is probably needed to explain the observed mixture of the conjectured clusters is that certain passages in an Iliad-like song or cycle, orally transmitted over centuries, were traditionally connected with certain clusters/traditions. Such connections are in turn explained by the presence of certain characteristics in these passages that are important within these traditions. For example, the Aeolian cluster appears more prominently in Iliad 5 (the aristeia of Diomedes), Iliad 20 (the aristeia of Achilles), and Iliad 21 (Achilles in the river Xanthus). This is probably because the prominent presence of Diomedes (A16), Achilles (A15), a river (A11), the name Xanthus (A17), immersing a body in a river or the sea (A25), and the mixture with war passages (A3) are important characteristics of the Aeolian tradition. Another example is that the passages that describe the gods in their home on Olympus (I8) are more strongly colored by the Ionian tradition because they are connected with the oikos: the houses of nobles, servants, shepherds, heralds, and bards (I4). Finally, age-old, well-known myths and stories (M2) most easily fit into a story via digression, which is why the other characteristics of the Mycenaean tradition are often also found in digressions. However, it is important to note that the clusters cannot be reduced to the trivial sets of characteristics that keep these clusters together in this hypothesized Iliad-tradition. The clusters as a whole are much more diverse, detailed, and widely applicable.31

7. Conclusions

The manual clustering methodology used in this study consists of listing translatable oral characteristics that Ionian bards may have used during the improvisation of certain subspecialties, such as extraneous oral traditions. Characteristics that often occur closely together in the text together constitute a cluster. Advanced clusters, patterns, and relations can in turn be reduced to a single characteristic. This clustering methodology identified three non-trivial clusters in the Homeric works. A comparison of these clusters’ characteristics with historical and geographical facts suggests that they are probably of Mycenaean, Aeolian, and Ionian origin, respectively.
The non-trivial clusters are difficult to identify because all but the Ionian cluster have been translated from a different or more ancient artificial language to the Homeric artificial language of the Ionian tradition via one or more steps. Due to this translation, the Homeric works have context-independent linguistic features; nonetheless, these works can be divided into many passages with different origins thanks to the manual clustering methodology explained in this paper, which is based on translatable oral characteristics that do not alter substantially during a translation step. Another reason for their difficult identification is that the discovered clusters overlap to some degree, with regard to both their characteristics and their passages.
Statistical analysis revealed that a large majority of the intuitions acquired during the manual clustering phase are in fact correct. Moreover, what initially appeared to be a failure for some characteristics has resulted in a distinction between reconstructive and source-specific characteristics. This distinction shows the potential of the manual clustering methodology to reconstruct clusters with meaningful origins.
Scholars may be tempted to reduce a cluster to a subset of its characteristics (or, worse, to its hypothesized geographical origin) and to explain the other characteristics as dependent on that subset. This reduction fails – the full sets of characteristics are large, often detailed, semantically diverse, and widely applicable, making them non-trivial. Admittedly, mastering even one of the clusters by acquiring the same intuitions, whether in a translation or not, takes time and effort. Therefore, for those who only want a validation, the proposed statistical approach can be used.

Supplementary Materials

The Supplementary Dataset for Cluster Analysis (ClusterAppStat.xlsx – Blondé 2026) contains a sheet with the oral characteristics of the three clusters, a sheet with the passages of these clusters, and a sheet with statistical calculations (https://doi.org/10.5281/zenodo.18730472).

Funding

This research received no external funding.

Data Availability Statement

The full Homeric Traditions Apparatus (Version v1), from which the Supplementary Materials are derived, is available in Zenodo (https://doi.org/10.5281/zenodo.17838991). The books relevant to this study are also available in Zenodo and cited in the References.

Acknowledgments

ChatGPT was used for help with the statistical analysis, R programming, and translations from Homeric Greek.

Conflicts of Interest

The author declares no conflict of interest.

Notes

1
The concept of (translatable) oral characteristic is not typically used in Homeric research, as the focus is usually on characteristics that require thorough engagement with the Homeric Greek original. The clustering effort described in this paper thoroughly engaged with translations of the Iliad, without neglecting the original in cases of doubt.
2
Not all characteristics are detailed. However, when several generic characteristics occur together, they can be considered to make up a detailed characteristic as a group.
3
The three clusters are described in Blondé 2018, 2020, and 2022. A fourth and a fifth cluster, probably with a non-Greek European origin, are more controversial and seem to reflect two alternating roles: a war and a narrative role (Blondé 2019 and 2021). As such, they are beyond the scope of this paper. Given that the clusters and their reconstructive characteristics are published, and in order to avoid bias, they have been left unaltered during analysis and presentation in this paper.
4
This dataset is derived from https://doi.org/10.5281/zenodo.17838991: the larger Homeric Traditions Apparatus (Version v1) that is in turn derived from Blondé (2018-2022).
5
There are many variants of the Iliad and the Odyssey. Many of the differing details are probably the result of errors that crept in during the transmission history of these works. Although such differences are small, Nagy (1996) 29 proposes that they are the result of an evolutionary text-fixation of the Homeric works, in which a fluid phase, with improvisation, was succeeded by a more static phase that involved written texts and memorization.
6
Griffin (2004) 4.
7
Wolf (1795), republished in Wolf (1985) 70.
8
Prominent Analysts include Hermann (1832), Lachmann (1847), von Wilamowitz-Moellendorff (1916), and Merkelbach (1951). Although West (2011) believed a single poet (“P”) was responsible for the Iliad, he used many of the ideas of the Analysts. Prominent Unitarians include Nitzsch (1830), Schadewaldt (1938), and Bowra (1958).
9
Prominent Oralists include Parry (1971), Lord (1960), and Ready (2019).
10
Kakridis (1949); Finkelberg (2011) 197 argued that the Iliad and the Odyssey had a unique position among ancient Greek epic texts, although not necessarily in a written form.
11
Powell (1996).
12
Burkert (1992); West (1997).
13
Adrados (1981) 13–17; Patzer (1996) 23–86.
14
Ghesquiere (1989).
15
In spite of this, many in the history of Homeric scholarship have argued that dialectal features are not (or not entirely) context independent. Analysts and later scholars claim to see traces of an earlier Aeolic phase of the epic (Janko 1982, West 2011), and Bozzone (2024) points out that dialectal characteristics may be adopted and/or retained because of their socio-linguistic value.
16
Päpcke et al. (2023); Pavlopoulos and Konstantinidou (2023); Sandell (2023).
17
Fast lookup is indispensable during this endeavor, as simply writing down references ultimately results in an expanded copy of the entire text. This is why the text itself has to be the device where lookup happens fastest.
18
Kateri (2014).
19
All references, English translations, and statistics are from Lattimore (2011), who translated the Greek edition of Monro and Allen (1920), because Lattimore translated all 15691 lines to which he gave a number. There are two differences of one line number between these two editions: between 11.543 and the end of Book 11 (11.847) and between 18.604 and the end of Book 18 (18.616). A comparison with West’s (1998–2000) more modern edition of the Iliad was also conducted; however, many of the lines that West rejected appear to fit well in the cluster of their context, which could mean that West rejected too many lines.
20
Digressions (M1), kings (M3), the change of power (M5), the many places and personal names (M13), riches of the soil, typical of a place or city (M15), the flight after a crime (M19), Herakles, Tydeus, Neleus, Peleus, or Nestor (M20), being rich (and honorable) (M23), long wanderings (M24), the reward of the king (M26), the special education (M30), large herds of cattle, horses, or sheep (M31), the loving education or adoption in a palace (M34), and the move to a distant place (M44).
21
The proper names themselves are part of the long description of A4 (Blondé 2020).
22
Proper names specific to the Aeolian cluster (A4), rivers (A11), the Lykians (A18), immersing a body in a river or the sea (A25), taking care of the dead and wounded (A28), the gods who interfere, divided over two camps (A33), corpses that are often mutilated (A38), Apollo, Poseidon, and sometimes Artemis (A41), and the supreme command of Zeus (A42).
23
The materialism (I2), the house of nobles, servants, shepherds, heralds, and bards (I4), the system of epithets specific to the Ionian Tradition (I5), verbosity, or using many words to say the same thing (I7), type-scenes that repeat almost literally (I9), bards (I12), emotional, lovely, and poetic scenes (I17), double epithets (I22), footstools, seats, and ornate furniture (I24), the facilities of Olympus (I25), singing, dance, and the lyre (I28), Muses and Apollo with the lyre (I32), peacefulness (I37), feasts and the preparation of meals (I38), descriptive clauses (I39), precious metals (I41), the duo of related terms (I46), the interior design and the positions of furniture and people (I64), spreading beds and sleeping in the back next to a woman (I67), the lame Hephaestus, god of blacksmithing (I84), and the alternation of day and night (I98).
24
Own translation.
25
Finley (2002).
26
All the characteristics of the three clusters are positive in the sense that they occur more (rather than less) often in the passages of their cluster. This means that one-tailed (right-tailed) tests could be used, which would result in greater significance. Nevertheless, in order to avoid any bias, the standard two-tailed tests were used for calculating the p-values.
27
Goodness-of-fit tests (D’Agostino 2017) were also conducted for passages annotated with characteristics, and these tests suggested good significance. However, they required too much explanation with respect to how partially dependent occurrences should be treated and how to objectively acquire triangulated estimates for them.
28
Franke and Christie (2012); Connelly (2016). The R script used is in the analysis sheet of the Supplementary Dataset for Cluster Analysis.
29
The corresponding reconstructive (super)characteristics in the list of 216 are proper names specific to the Aeolian cluster (A4), Heracles (A10), the name Xanthus (A17), the Lycians (A18), Paris and Pandarus (A34), Apollo, Poseidon, and sometimes Artemis (A41), a god who envelops a person in a cloud (A45), three times the same action (A46), materialism (I2), footstools, seats, and ornate furniture (I24), singing, dance, and the lyre (I28), olive trees and olive oil (I33), precious metals (I41), the whip and unreluctantly trotting horses (I65), the age of an untamed head of cattle (I70), the numbers nine and twelve (and ten and eleven) (I79), wine (for libation or drinking) (I101), seven-gated Thebes (M4), being rich (and honorable) (M23), superlatives, as in ‘the bravest of all mortals’ (M36), the old age of a man (M41), and the numbers nine and twelve (M56).
30
The first 12 books contain 7588 lines, of which 1070 are Mycenaean.
31
A somewhat more speculative hypothesis for the observed mixture is that the Iliad has always been an epic that was performed through alternating improvisation by two or more bards with different roles, namely by adding extra roles to the European war and narrative roles (Blondé 2022).

References

  1. Adrados, F. R. (1981), ‘Towards a new stratigraphy of the Homeric dialect’, Glotta 59, 13–27.
  2. Blondé, W. (2018). The Mykenaian Alpha Tradition: On the origin of Greek stories (p. 180). Amazon Independent Publishing Platform.
  3. Blondé, W. (2019). The European Beta Tradition: On the origin of the Iliad (p. 155). Amazon Independent Publishing Platform.
  4. Blondé, W. (2020). The Aeolian Gamma Tradition: On the origin of Roman stories (p. 192). Amazon Independent Publishing Platform.
  5. Blondé, W. (2021). The Narrative Delta Tradition: Iliadic fairy tales (p. 227). Amazon Independent Publishing Platform.
  6. Blondé, W. (2022). The Ionian Epsilon Tradition: Homer's finishing touch (p. 198). Amazon Independent Publishing Platform.
  7. Blondé, W. (2026). Supplementary Dataset for Cluster Analysis of: Statistically Validating Homeric Clusters of Translatable Oral Characteristics. Zenodo.
  8. Bowra, C. M. (1958), Tradition and Design in the Iliad, Clarendon Press, Oxford.
  9. Bozzone, C. (2024). Homer's Living Language: Formularity, Dialect, and Creativity in Oral-Traditional Poetry. Cambridge University Press.
  10. Burkert, W. (1992), The Orientalizing Revolution: Near Eastern Influence on Greek Culture in the Early Archaic Age, Harvard University Press, Cambridge.
  11. Connelly, L. M. (2016), ‘Fisher’s exact test’, MEDSURG Nursing 25.1, 58–60.
  12. D’Agostino, R. B. (2017), Goodness-of-fit-Techniques, Routledge, Oxfordshire.
  13. Finkelberg, M. (2011), ‘Homer and his peers. Neoanalysis, oral theory, and the status of Homer’, Trends in Classics 3.2, 197–208.
  14. Finley, M. I. (2002), The World of Odysseus, New York Review Books, New York.
  15. Franke, T. M., T. Ho, and C. A. Christie (2012), ‘The chi-square test: Often used and more often misinterpreted’, American Journal of Evaluation 33.3, 448–458.
  16. Ghesquiere, R. (1989), Van Nicolaas van Myra tot Sinterklaas: de kracht van een verhaal, Acco, Leuven.
  17. Griffin, J. (2004), Homer. The Odyssey, Cambridge University Press, Cambridge.
  18. Hermann, J. G. J. (1832), De Interpolationibus Homeri Dissertatio, Typis Staritzii, Leipzig.
  19. Janko, R. (1982). Homer, Hesiod and the Hymns: Diachronic Development in Epic Diction. Cambridge University Press.
  20. Kakridis, J. T. (1949), Homeric Researches, Gleerup, Lund.
  21. Kateri, M. (2014), Contingency Table Analysis: Methods and Implementation Using R (Statistics for Industry and Technology), Birkhäuser, Basel.
  22. Lachmann, K. (1847), Betrachtungen über Homers Ilias, De Gruyter, Berlin.
  23. Lattimore, R., and R. Martin (2011), The Iliad of Homer (New Introduction and Notes by Richard Martin; First Published 1951), The University of Chicago Press, Chicago.
  24. Lord, A. B. (1960), The Singer of Tales, Harvard University Press, Cambridge.
  25. Merkelbach, R. (1951), Untersuchungen zur Odyssee, Zetemata II, Beck, München.
  26. Monro, D. B., and T. W. Allen (Eds.) (1920), Homer. Homeri Opera (Vols. 1–5). Clarendon Press, Oxford.
  27. Nagy, G. (1996), Homeric Questions, University of Texas Press, Austin.
  28. Nitzsch, G. W. (1830), De Historia Homeri Maximeque de Scriptorum Carminum Aetate Meletemata, University of California Libraries, California.
  29. Päpcke, S., T. Weitin, K. Herget, A. Glawion, and U. Brandes (2023), ‘Stylometric similarity in literary corpora: Non-authorship clustering and Deutscher Novellenschatz’, Digital Scholarship in the Humanities 38, 277–295.
  30. Parry, A. (Ed.) (1971), The Making of Homeric Verse. The Collected Papers of Milman Parry, Oxford University Press, Oxford.
  31. Patzer, H. (1996), Die Formgesetze des homerischen Epos, Franz Steiner, Stuttgart.
  32. Pavlopoulos, J., and M. Konstantinidou (2023), ‘Computational authorship analysis of the Homeric poems’, International Journal of Digital Humanities 5.1, 45–64.
  33. Powell, B. B. (1996), Homer and the Origin of the Greek Alphabet, Cambridge University Press, Cambridge.
  34. Ready, J. L. (2019), Orality, Textuality, and the Homeric Epics: An Interdisciplinary Study of Oral Texts, Dictated Texts, and Wild Texts, Oxford University Press, Oxford.
  35. Sandell, C. B. R. (2023), ‘One or many Homers? Using quantitative authorship analysis to study the Homeric Question’, Proceedings of the 32nd Annual UCLA Indo-European Conference: November 5th, 6th, and 7th, 2021. Vol. 21, 21–48, Helmut Buske Verlag, Hamburg.
  36. Schadewaldt, W. (1938), Iliasstudien, Hirzel, Leipzig.
  37. von Wilamowitz-Moellendorff, U. (1916), Die Ilias und Homer, Weidmannsche Buchhandlung, Bonn.
  38. West, M. L. (1997), The East Face of Helicon: West Asiatic Elements in Greek Poetry and Myth, Oxford University Press, Oxford.
  39. West, M. L. (Ed.) (1998–2000), Homerus Ilias, Bibliotheca Teubneriana (2 vols.), Berlin.
  40. West, M. L. (2011), The Making of the Iliad: Disquisition and Analytical Commentary, Oxford University Press, Oxford.
  41. Wolf, F. A. (1795), Prolegomena to Homer, (1985) transl. by A. Grafton, G.W. Most, and J. Zetzel, Princeton University Press, Princeton.
Table 1. A 2x2 contingency table that investigates the dependence of ‘Apollo lines’ on the Aeolian lines.

| | Apollo | No Apollo | Row total |
| --- | --- | --- | --- |
| Aeolian | 57 | 2660 | 2717 |
| Not Aeolian | 84 | 12890 | 12974 |
| Column total | 141 | 15550 | 15691 |
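
As a worked illustration of how a table like Table 1 is tested, the sketch below computes the expected counts under independence, the Pearson chi-square statistic, and the corresponding p-value for one degree of freedom, using only the Python standard library. It is a minimal sketch, not the R script referred to in note 28. No continuity correction is applied, and because note 28 cites both the chi-square test and Fisher's exact test, the resulting p-value need not coincide with the value listed for Apollo in Table 2. The function name is illustrative.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    Returns (expected counts, chi-square statistic, p-value).
    No continuity (Yates) correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected count of a cell = row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals]
                for r in row_totals]

    statistic = sum((table[i][j] - expected[i][j]) ** 2 / expected[i][j]
                    for i in range(2) for j in range(2))

    # With one degree of freedom the upper tail of the chi-square
    # distribution equals erfc(sqrt(statistic / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return expected, statistic, p_value


# Rows: Aeolian, Not Aeolian; columns: Apollo, No Apollo (Table 1).
apollo_table = [[57, 2660],
                [84, 12890]]

expected, statistic, p_value = chi_square_2x2(apollo_table)
print(f"expected Apollo lines in Aeolian passages: {expected[0][0]:.1f}")
print(f"chi-square = {statistic:.2f}, p = {p_value:.1e}")
```

Under independence, about 24 of the 141 Apollo lines would be expected in the Aeolian passages, against the 57 observed; this gap is what drives the large statistic.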
Table 2. An overview of all the translatable oral characteristics that were analyzed. ‘In’ is the number of times the (sub)characteristic occurs in a passage of its corresponding cluster. ‘Out’ is the number of times it occurs outside of these passages.

| ID | Source-specific (sub)characteristic | Cluster | In | Out | p-value |
| --- | --- | --- | --- | --- | --- |
| I2 | Beautiful (kalos) | Ionian | 67 | 79 | 0.0000004 |
| A41 | Apollo (Apollon, Phoibos) | Aeolian | 57 | 84 | 0.00000000008 |
| I41 | Gold (chrys-) | Ionian | 69 | 41 | 0.000000000000002 |
| M36 | Superlatives | Mycenaean | 25 | 29 | 0.00000002 |
| A46 | Three (treis, tris) | Aeolian | 19 | 34 | 0.002 |
| I24 | Furniture (lechos, thronos, …) | Ionian | 29 | 20 | 0.000002 |
| I101 | Wine (oinos) | Ionian | 19 | 22 | 0.007 |
| I65 | Whip (mastix, mastizo) | Ionian | 14 | 13 | 0.007 |
| A45 | God using mist (aer) | Aeolian | 11 | 15 | 0.001 |
| M41 | Newly introduced old man | Mycenaean | 10 | 14 | 0.00005 |
| A18 | Lycia (Lykia) | Aeolian | 19 | 3 | 0.000000000003 |
| A17 | Xanthus (Xanthos) | Aeolian | 14 | 8 | 0.000002 |
| I79 | Nine (ennea) | Ionian | 11 | 10 | 0.02 |
| M56 | Nine (ennea) | Mycenaean | 5 | 16 | 0.06 |
| A10 | Heracles (Herakl-, Dios huios) | Aeolian | 9 | 8 | 0.0009 |
| A34 | Pandarus (Pandar-, Lykaon) | Aeolian | 9 | 7 | 0.0005 |
| M23 | Rich (aphneios) | Mycenaean | 10 | 4 | 0.00000007 |
| I33 | Olive (elai-) | Ionian | 7 | 3 | 0.005 |
| M4 | Thebes (Thebe, Thebai) | Mycenaean | 8 | 1 | 0.00000008 |
| I28 | Lyre (phorminx, kitharis) | Ionian | 6 | 2 | 0.006 |
| I70 | Age cattle (henis, hexetes, …) | Ionian | 6 | 2 | 0.006 |
| A4 | Pergamos (Pergamos) | Aeolian | 3 | 3 | 0.07 |
Table 3. An overview of all the reconstructive oral characteristics that were much less significant than their source-specific improvements.

| ID | Reconstructive characteristic | Cluster | In | Out | p-value |
| --- | --- | --- | --- | --- | --- |
| M41 | Old man (ger-) | Mycenaean | 23 | 131 | 0.06 |
| A18 | Lycian (Lykios) | Aeolian | 10 | 38 | 0.7 |
| A46 | Three times an action | Aeolian | 9 | 15 | 0.03 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.