A Unified Framework to Homologize Appendage Segments across Arthropoda

How to align leg segments between the four groups of arthropods (insects, crustaceans, myriapods, and chelicerates) has tantalized researchers for over a century. By comparing the loss-of-function phenotypes of leg patterning genes in diverged arthropod taxa, including a crustacean, insects, and spiders, we show that all arthropod legs can be aligned in a one-to-one fashion. We propose a model wherein insects incorporated two proximal leg segments into the body wall, which moved the ancestral leg lobe (exite) up onto the back to later form wings. For myriapods and chelicerates with seven leg segments, it appears that one proximal leg segment was incorporated into the body wall. According to this model, the chelicerate exopod and the crustacean exopod emerge from different leg segments, and are therefore proposed to have arisen independently. A framework for how to align arthropod appendages now opens up a powerful system for studying the origins of novel structures, the plasticity of developmental fields, and convergent evolution.


Introduction
Arthropods are the most successful animals on the planet, in part due to the diversity of their appendages. The vast diversity of arthropod appendage form is reflected in their diversity of function: arthropod legs have been modified for walking, swimming, flying, chewing, and breathing. Legs can be split ("biramous"), meaning that two distal leg branches emerge from the same proximal leg base. In this case, the lateral leg branch is called the exopod, and the medial leg branch is called the endopod (Fig. 1A, D). While exopods and exites both emerge laterally, exopods are a continuation of the leg, so they have muscle insertions and are often segmented (Fig. 1D). In contrast, exites are lobe-like outgrowths on the leg that lack muscle insertions and segmentation (Fig. 1B). This difference is also reflected in their development: exopods and endopods arise when the distal end of the developing limb bud splits in two, while exites emerge later by budding off of the existing proximal leg (3-5).
Arthropod legs are divided into segments (Fig. 1E-H). Leg segments are bounded to either side by joints where muscles insert (Fig. 1B) (6,7). Leg segments sometimes have subdivisions within them where no muscle inserts, i.e. muscles pass through the subdivision without inserting. These serve as points of flexion, such as the subdivisions in the tarsus of insects, arachnids, and myriapods (Fig. 1F), but do not represent true segments.
The leg segments of chelicerates, myriapods, crustaceans, and insects have different numbers, shapes, and names. Chelicerates have either 7 or 8 leg segments, myriapods have either 6 or 7, insects have 6, and the crustacean ground plan has 7 or 8 leg segments (Fig. 1E-H) (2,6,8-10). For over a century, researchers have proposed many different theories to account for this variation (2,6,11,12). Using morphology, authors have proposed leg segment deletions, duplications, and fusions to account for the different numbers of leg segments between arthropod taxa. Other authors concluded that arthropod legs cannot be homologized and aligned at all (13).
More recently, the expression of several leg patterning genes was compared in a chelicerate, a crustacean, and an insect, but this too provided no clear way of aligning and homologizing leg segments, likely due to dynamic changes in gene expression (14).

Alignment of insect and crustacean legs reveals the origin of insect wings
The origin of insect wings has been a contentious problem for over 130 years. Two competing theories have developed to explain their emergence. Given that insects evolved from crustaceans (15), one theory is that insect wings evolved from crustacean exites (outgrowths such as plates or gills on the proximal leg) (16). The second theory proposes that crustacean exite genes were co-opted and expressed by an unrelated tissue, the dorsal body wall, in order to form insect wings on the back, meaning wings are a novel structure not present in crustaceans. To test these two hypotheses, Bruce & Patel 2020 used CRISPR-Cas9 gene editing to compare the function of five leg patterning genes, Distalless (Dll), dachshund (dac), Sp6-9, extradenticle (exd), and homothorax (hth), in the amphipod crustacean Parhyale hawaiensis. By comparing the leg segment deletion phenotypes in Parhyale to previously published results in insects, they found that the six distal leg segments of Parhyale and insects (leg segments 1-6, counting from the distal claw) could be aligned in a one-to-one fashion (Fig. 2) (17). To understand the proximal leg segments, they compared the expression of pannier (pnr), the Iroquois complex gene araucan (ara), and wing genes in Parhyale and insects (17,18). They found that, in both Parhyale and insects, the expression of ara distinguishes two proximal leg segments (leg segments 7 and 8; Fig. 2), each of which bears an exite that expresses wing genes; and that pnr expression marks the true body wall. These data suggested that insects had incorporated two ancestral proximal leg segments, 7 and 8, into the body wall, which carried their exites dorsally. An alignment of the function of five leg patterning genes in conjunction with the expression of ara, pnr (17), and wing genes (18) places the insect wing in register with the Parhyale tergal plate: both are exites on the proximal-most (8th) leg segment. Therefore insect wings are not novel, but instead evolved from a structure that already existed in the crustacean ancestor.
This work demonstrated that crustacean and insect legs could be homologized in a straightforward, one-to-one relationship. No deletions, duplications or rearrangements were necessary to make sense of leg segment homologies: insects and crustaceans each have 8 leg segments. Notably, this is the same as the chelicerate ground plan (6) as well as the ancestral arthropod ground plan, because the ancestor of all living arthropods also had 8 leg segments (19). If insect and crustacean legs can be homologized, this model may extend to myriapods and chelicerates as well, in a grand unified theory of appendages across all four groups of arthropods.

Homologizing chelicerate and pancrustacean legs
To align Parhyale and chelicerate legs, we compared our leg segment deletion phenotypes in Parhyale to previously published results in chelicerates, as we had done for insects. Functional experiments in chelicerates have been performed for hth, Dll, Sp6-9, and dac.
While functional data is not available for exd in a chelicerate, given that exd and hth are cofactors, we presume that the leg segment deletion phenotypes of exd and hth are similar to each other in chelicerates, as they are in other arthropods (20-23). Based on the leg segment deletion phenotypes of hth, Dll, Sp6-9, and dac, the six distal leg segments of Parhyale, insects, and chelicerates (leg segments 1-6, counting from the distal claw) can be aligned in a one-to-one fashion.
If the six distal leg segments of Parhyale, chelicerates, and insects are in alignment, this suggests a model of how to homologize all of their leg segments (Fig. 5). Chelicerates with eight leg segments, such as sea spiders, align one-to-one with the eight leg segments of Parhyale and insects. However, in chelicerates with seven leg segments, one of the two proximal leg segments is missing, and must be accounted for. One hypothesis is that one of the two proximal leg segments was simply deleted. Another possibility is that the proximal-most leg segment was incorporated into the body wall, similar to how insects incorporated the proximal leg segments into their body wall. We discuss observations from morphology, phylogeny, paleontology, embryology, and molecular studies that argue that the proximal-most leg segment was incorporated into the body wall.
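The incorporation hypothesis can be written down as a small bookkeeping sketch. It assumes the eight-segment ground plan described above, numbers segments from the distal claw (1) to the proximal-most segment (8), and treats every segment beyond a taxon's free segments as absorbed into the body wall rather than deleted. The names `ANCESTRAL_SEGMENTS`, `align_leg`, and `incorporated_segments` are hypothetical and introduced only for illustration; this is not an analysis performed in the study.

```python
from __future__ import annotations

ANCESTRAL_SEGMENTS = 8  # assumed ground plan; 1 = distal claw, 8 = proximal-most


def align_leg(free_segments: int) -> dict[int, str]:
    """Label each ancestral segment number as 'free' or 'body wall'.

    Assumption (the incorporation hypothesis): any segments missing from the
    free leg are the proximal-most ones and now lie in the body wall.
    """
    if not 1 <= free_segments <= ANCESTRAL_SEGMENTS:
        raise ValueError("free_segments must be between 1 and 8")
    return {
        n: "free" if n <= free_segments else "body wall"
        for n in range(1, ANCESTRAL_SEGMENTS + 1)
    }


def incorporated_segments(free_segments: int) -> list[int]:
    """Return the segment numbers predicted to lie in the body wall."""
    alignment = align_leg(free_segments)
    return [n for n, state in alignment.items() if state == "body wall"]


# Free-segment counts are taken from the text; the labels are illustrative.
for taxon, free in [
    ("insect", 6),
    ("seven-segmented chelicerate", 7),
    ("sea spider", 8),
]:
    print(taxon, incorporated_segments(free))
```

Under these assumptions, the sketch reproduces the model's predictions: segments 7 and 8 in the insect body wall, segment 8 in the body wall of seven-segmented chelicerates, and none in sea spiders.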

Discussion
Early-branching chelicerates, including sea spiders, trombidiform mites, hooded tick-spiders, and camel spiders, often have eight leg segments like the euarthropod ancestor, while later-branching chelicerates, such as whip scorpions, vinegaroons, pseudoscorpions, scorpions, and spiders, have seven leg segments (6,42). An interesting exception is the horseshoe crab, Limulus (6). Limulus is an early-branching chelicerate (42) (although see (43)) but has seven leg segments. However, it also has two proximal structures, a free endite and a pleurite (Fig. 1E), which may represent the remnant of the missing 8th leg segment. Proximal to the coxa of each walking leg is a small, spiny endite ("free endite", or "epicoxite") that does not belong to any obvious leg segment, but is nevertheless moveable by muscles (44). Dorsal to the coxa of each walking leg is a conspicuous Y-shaped pleurite (body wall exoskeleton plate), which articulates with the coxa. As moveable endites and pleurites are characteristic of leg segments, these may represent the remnant of a proximal 8th leg segment in Limulus (45,46). Support for this hypothesis can be found by examining the fossil xiphosurids Offacolus (47) and Dibasterium (48). In stark contrast to Limulus, these fossil xiphosurids have a large, segmented exopod on the most proximal leg segment. In Offacolus, the limb base is dorsoventrally elongated such that it occupies the entire lateral region of the body. The proximal leg region is not well preserved in these fossils, and thus the authors are uncertain how many proximal leg segments were present. However, a comparison to Limulus suggests that the limb base of Offacolus may actually be two leg segments (Fig. 6A): the elongate coxa like that seen in Limulus (8,44), and a smaller proximal leg segment from which the fossil exopod emerged.
If this smaller proximal leg segment degenerated into the body wall of living horseshoe crabs like Limulus, this would explain how extant horseshoe crabs lost both the ancestral eighth leg segment and the ancestral exopod.
Embryological evidence of this proximal 8th leg segment can be observed in modern chelicerates with seven leg segments. In embryos of the spider Acanthoscurria, which is in the outgroup to other spiders, there is an additional leg segment-like structure proximal to the coxa in all walking legs (Fig. 6B; (29)). Furthermore, even though there is no apparent exopod or outgrowth on this proximal fused leg segment, genes associated with outgrowing appendages, wingless (wg) and Distalless (Dll), are expressed here. In spiders, a dot of wg is expressed above each walking leg (Fig. 6C; (30)), and in Limulus, a dot of Dll is expressed above each walking leg (49). This is reminiscent of insects, which have incorporated two leg segments into the body wall (Bruce 2020): wg is expressed in two dots above the leg, one dot on each incorporated leg segment (Fig. 6D; (50)). In insects, these wg dots pattern exites, but in chelicerates, the wg dot may be patterning the remnant of the exopod.
Thus it appears that chelicerate, crustacean, and insect legs can be aligned in a one-to-one fashion. We note that no functional data is available for myriapods, but given that they share a common ancestor with chelicerates and pancrustaceans, and that the legs of the other three clades of arthropods align, we believe it is reasonable to assume that this leg model applies to myriapods as well.

A grand unified theory of dorsoventral patterning of arthropod appendages
This grand unified theory of arthropod legs allows for a reinterpretation of previous molecular work into a simple and coherent model of arthropod leg development. Wg is known to pattern body segments first, and then later, appendages. Previous studies in all four arthropod clades found that wg indeed patterns the ventral side of appendages. However, there seemed to be no obvious correspondence between the various lines and dots of wg expression in each clade.
Yet if insect and spider lateral body wall are interpreted as incorporated proximal leg segments, wg expression across all four arthropod clades agrees quite well.
The crustacean Triops demonstrates this interpretation. In crustaceans (51), wg is initially expressed in a solid stripe in each body segment, just as it is in insects (52-54), myriapods (55,56), and spiders (30) (Fig. 7). The Triops leg grows out like a shelf that wraps around dorsoventrally, rather than a cylinder. Endites emerge near the midline, exites emerge laterally, and endopod and exopod emerge between these. As these outgrowths develop, the line of wg expression is broken up and becomes restricted to the ventral region of each of these outgrowths.
If insect wing and lateral body wall are interpreted as proximal leg, wg expression in insects mirrors that of crustaceans. In Tribolium (50,54) and cricket (53) embryos, wg is expressed in the initial body segment stripe, and as the legs grow out, this line of wg expression is broken up such that wg is expressed in a ventral stripe in each leg. In addition to the ventral leg stripe, there are two regions of wg expression dorsal to the leg (Fig. 6D). A similar sequence is observed in spiders. According to the model presented here, arachnids with seven leg segments should have a cryptic ancestral proximal leg segment incorporated into the body wall, from which the ancestral chelicerate exopod used to emerge. In spider embryos, the initial body segment-patterning stripe of wg resolves into two domains, a ventral stripe on each appendage, and a dot dorsal to each coxa (Fig. 3C, (30, 57)). Thus, arachnids have one wg domain in the lateral body wall corresponding to one fused leg segment where the exopod once emerged (Fig. 6C), and insects have two wg domains in the lateral body wall corresponding to two fused leg segments, each with an exite (Fig. 6A). The expression domains of dpp are also consistent with this reinterpretation.

Independent origins of exopods
On the proximal-most leg segment of many fossil arthropods including trilobites, Leanchoilia, and Offacolus, there is a structure that many authors believe is an exopod (although see (8,12)). The exopods of these fossil arthropods are often thought to be homologous to the exopod of crustaceans (Boxshall 2004; Wallosek 1997; Schram 1986), having a single origin inherited by the common ancestor of all arthropods. This is a reasonable hypothesis when only morphology is considered. However, when the molecular evidence is considered together with morphology, exopod homology becomes less plausible: the crustacean exopod emerges from leg segment 6, while the chelicerate and early arthropod exopod emerges from leg segment 8 (Fig. 8A). Thus, unexpectedly, the crustacean exopod appears to have a separate origin from the exopod of early-branching arthropods and chelicerates.
As discussed above, arthropod legs almost universally have a maximum of 8 leg segments. Exopods are essentially a splitting of the leg axis (Hejnol 200; Wolff 2008), such that two distal leg branches continue from the same proximal leg segment. Therefore, in a biramous leg, one would expect the maximum number of leg segments, counting down either axis, to always be 8. Thus, if the chelicerate leg splits at the proximal-most leg segment, there should be 7 remaining leg segments along each of the endopod and exopod, mirroring each other, for a total of 8 segments along either axis (Fig. 8). This 1 + 7 configuration indeed appears to be the case for fossil euarthropods like trilobites and fossil chelicerates (19,47). In contrast, crustacean legs have a 3 + 5 configuration, where the leg splits at the third most proximal leg segment, and therefore the endopod and exopod each have up to 5 segments (9), mirroring each other, for a total of 8 segments along either axis. In fact, this is not the first time that this has been noted. Stormer 1944 also noted that the crustacean exopod arises from leg segment 6, and concluded it was not homologous to the trilobite lateral appendage (12).
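The 1 + 7 and 3 + 5 configurations follow from simple arithmetic on the segment that bears the exopod. The sketch below assumes a maximum of 8 segments along either axis and numbers segments from the distal claw, so that the exopod-bearing segment is 8 in the chelicerate-type leg and 6 in the crustacean-type leg. The function name `biramous_configuration` and its parameters are hypothetical, chosen only for illustration.

```python
MAX_SEGMENTS = 8  # assumed maximum along either leg axis


def biramous_configuration(exopod_segment: int, total: int = MAX_SEGMENTS) -> tuple[int, int]:
    """Return (shared proximal segments, maximum segments per distal branch).

    exopod_segment is the segment from which the exopod emerges, counted from
    the distal claw (1) to the proximal-most segment (total). The branching
    segment itself is counted as part of the shared proximal base.
    """
    if not 2 <= exopod_segment <= total:
        raise ValueError("exopod_segment must be between 2 and total")
    shared = total - exopod_segment + 1   # segments from the base down to the split
    per_branch = exopod_segment - 1       # segments left for each distal branch
    return shared, per_branch


print(biramous_configuration(8))  # chelicerate-type leg
print(biramous_configuration(6))  # crustacean-type leg
```

In both cases the two numbers sum to 8 along either axis, which is the constraint the argument relies on.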
From a molecular standpoint, the alignment of the function of chelicerate and crustacean leg gap genes supports independent origins of their exopods. If chelicerate and crustacean exopods are homologous, then the exopod must have moved from the 8th to the 6th leg segment position. However, such a shift would require four deletions and two additions of leg segments in order to keep the number of leg segments constant between chelicerates and crustaceans (Fig. 8B), as well as the accompanying rearrangements of leg gap gene expression patterns. Given that leg segment additions are rare (Boxshall 2004), and that leg genes have the same configuration in both chelicerates and crustaceans, this is not a parsimonious hypothesis. Therefore, two independent origins for chelicerate and crustacean exopods are more plausible.
Several interesting and potentially useful implications emerge if chelicerate and crustacean exopods are not homologous. First, given that chelicerate and crustacean legs split at different points along the axis, and these two regions express different patterning genes, it is likely that different genetic mechanisms led to the generation of the exopod in these two groups.
Thus, chelicerate and crustacean exopods likely represent independent evolutionary gains of a bifurcated leg axis, and could be used to compare mechanisms of convergent evolution.
Second, the position of the exopod, on either the 6th or 8th leg segment, could be a powerful morphological character for determining the phylogenetic position of otherwise ambiguous arthropod fossils. This in turn might reconfigure existing arthropod phylogenies and necessitate a reinterpretation of the ground states of different arthropod taxa. For example, the problematic fossil arthropod Agnostus would be more closely allied with chelicerates, rather than being a stem crustacean. For many fossil arthropods, the number of segments in the exopod will not be informative because it is fewer than 5, which is equally consistent with either a chelicerate or a crustacean. However, the maximum number of segments in the endopod should be seven for chelicerates and five for crustaceans.
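As a rough illustration of how this character might be applied, the following sketch checks which of the two configurations an observed distal branch is compatible with. The limits of seven and five segments come from the text; the labels, the dictionary `MAX_BRANCH_SEGMENTS`, and the function `compatible_configurations` are hypothetical. Because fossil preservation is often incomplete, an observed count is best read as a lower bound, so this check can exclude a configuration but cannot by itself place a fossil.

```python
from __future__ import annotations

# Maximum number of segments per distal branch under each configuration
# in the model (labels are illustrative, not established terminology).
MAX_BRANCH_SEGMENTS = {
    "chelicerate-type (exopod on segment 8)": 7,
    "crustacean-type (exopod on segment 6)": 5,
}


def compatible_configurations(observed_segments: int) -> list[str]:
    """List the configurations that allow a branch with this many segments."""
    if observed_segments < 1:
        raise ValueError("observed_segments must be at least 1")
    return [
        name
        for name, limit in MAX_BRANCH_SEGMENTS.items()
        if observed_segments <= limit
    ]


print(compatible_configurations(4))  # uninformative: both configurations fit
print(compatible_configurations(6))  # only the chelicerate-type leg fits
```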
If chelicerate and crustacean exopods are not homologous, when did the crustacean exopod evolve? If it evolved in Mandibulata, then we might expect the as-yet-unknown stem myriapod to have an exopod on the 6th leg segment. However, if the stem myriapod retained the chelicerate exopod, it would be on the 8th leg segment. Alternatively, perhaps the stem myriapod had already lost the chelicerate exopod, but did not evolve its own exopod as crustaceans did, in which case we would expect an animal without an exopod.
A third interesting outcome of this model is that the Limulus flabellum cannot be the remnant of the chelicerate exopod. The flabellum is an unsegmented lobe without muscle insertions, has a sensory function (58), and develops by budding off of the proximal leg (45).
These are also features of crustacean exites, which has led several authors to interpret the flabellum as an exite (8,45). Other authors have interpreted the flabellum as the remnant of the chelicerate exopod (2), because it emerges from what appears to be the most proximal leg segment, and because the flabellum expresses Dll (59). However, while Dll indeed patterns leg segments, it is also expressed in exites (17), where it patterns sensory structures (60).
Furthermore, the data we present here suggest that the proximal-most leg segment, which carried the exopod, was reduced and incorporated into the body wall in Limulus. Thus, in this model, the leg segment that carries the flabellum in Limulus is not the same leg segment that carries the exopod in fossil chelicerates. We therefore support the interpretation of the flabellum as an exite. This is unexpected, given that exites are believed to have evolved in crustaceans (2,61). However, this belief is based on morphology and fossils. To determine more definitively whether the flabellum is an exopod or an exite, the function of Dll should be examined in Limulus. In Parhyale, Dll knockout deletes the entire exopod and endopod, but leaves the exites unaffected (Fig. 9) (17). If the Limulus flabellum is an exite, then Dll knockout will truncate the distal leg, but leave the flabellum unaffected, except for subtle sensory structures on the flabellum. If the Limulus flabellum is indeed an exite, this means that exites evolved well before crustaceans, and may be part of the ground pattern of euarthropods. This may allow reinterpretations of the lateral lobes in fossil arthropods, which may be exites (62).
Based on the function of exd, hth, Dll, Sp6-9, and dac, the six distal leg segments of crustaceans and insects (leg segment 1 through leg segment 6) correspond with each other in a one-to-one fashion. Expression (*) of pnr and ara, as well as expression and function of wing genes, suggests that insects retain two additional proximal leg segments (7 and 8), each with an exite. In this model, the exites of pink leg segment 8 are homologous: the ancestral crustacean precoxa exite (pink, e), Parhyale tergal plate (Tp), and insect wing; and the exites of red leg segment 7 are homologous: the ancestral coxa exite (red, e), Parhyale coxal plate (Cp) and gill (G), and insect supracoxal lobes. (c) Leg segment morphologies in Parhyale and insect.
In spiders, harvestman, and Parhyale, a weak dac2 phenotype causes green leg segment 4 to be truncated and fused onto cyan leg segment 3. In harvestman, Parhyale, and Drosophila, a strong dac2 phenotype affects leg segments 3-5.
Snodgrass 1952) and Offacolus (after Sutton 2002) were scaled to the same size, then red leg segment 7 in Limulus was superimposed on Offacolus to draw an approximation of this leg segment. If red leg segment 7 is the same size and shape in Offacolus and Limulus, then the exopod of Offacolus would emerge from a proximal 8th leg segment, here in pink. B. Embryo of bird spider Acanthoscurria with leg segments colored in. The embryonic spider coxa is readily identified by the conspicuous endite (arrows). An additional leg segment-like structure (pink) can be observed proximal to the spider coxa on all leg segments. C. wg is expressed in a ventral stripe on each leg, as expected, but also in a dot on the dorsal-most region of the leg (arrow). D. wg is expressed in two regions (closed arrow and open arrow) above the insect coxa in Tribolium
Fig. 7. Wg expression across all arthropods makes sense from the standpoint of our model. A, B. Triops crustacean, from Nulsen and Nagy 1999. C. In all arthropods, wg is initially expressed in a solid stripe in each body segment. The crustacean leg grows out like a shelf that wraps around dorsoventrally. As endites, endopod, exopod, and exites develop, the line of wg expression is broken up and becomes restricted to the ventral region of each. If insects incorporated two leg segments into the body wall, each with an exite (wing and lobe), and spiders incorporated one segment into the body wall (perhaps patterning the exopod remnant?), then wg expression in insects and spiders mirrors wg expression in crustaceans.