Nature Controls Nurture in the Development of Sexual Orientation, and Voice is Nature’s Agent

A testable theoretical model is presented, proposing which brain parts and mechanisms are responsible for the nature and the nurture components of all human sexual orientations. The model integrates observations from humans and a wide range of animals. If validated, the model would provide a proximate explanation of the biological substrates of all sexual orientations. The basic assumptions of the model are: (1) Children learn cues for recognizing sexual mates automatically and subconsciously, through non-sexual conditioning experiences. That skill emerges at puberty. (2) Adults in the child’s surroundings act as innocuous, unaware role models that provide the learned cues for recognizing mates. (3) Voices of men and women serve as the innate, primary unconditioned stimuli (US) in that learning process. (4) The hypothalamus is the main area that elicits the signals of the unconditioned responses (UR). Those signals trigger the learning of the associated conditioned stimuli (CS) broadcast by the role models. (5) The amygdala, the bed nucleus of the stria terminalis (bnST) and the hypothalamus play roles in humans similar to those they play in other species. (6) The human medial geniculate nucleus (MGN) plays the roles played by the olfactory bulbs in rodents. (7) Detectors of innate primary US and activators of the unconditioned sexual responses (UR) are located in the MGN-amygdala-bnST-hypothalamus axis (MASHA). The learned conditioned stimuli (CS) are recorded in the MASHA and in cortical areas. (8) The innate US-UR connections vary across three groups of children. In the first group, only men’s voices trigger the UR. In the second group, only women’s voices trigger the UR, and in the third group either voice can trigger the UR. That determines the learned cues. The first group will be attracted at puberty only to men, the second only to women, and the third group to both.


The open questions
The first step of any sexual interaction is recognizing a mate of a certain sex. That determines whether the interaction is heterosexual or homosexual. Although it is widely accepted that sexual orientation is innate and has both nature and nurture components, it is still not clear which parts of the brain are responsible for it, or how they do it. Many brain areas and mechanisms are involved in the behavior. A large part of our understanding of the underlying biology of sexual behavior is derived from observations of other species, especially rodents, which rely on pheromones for recognizing mates. However, humans do not rely on pheromones, and it has not yet been determined which human sensory modality or modalities take over the innate roles that pheromones play in sexual recognition. The theoretical model presented here introduces a number of biologically feasible hypotheses that, if further validated, could provide a coherent explanation of the phenomena. The main goal of the model is to explore new feasible explanations and to define intermediate goals that could lead to answers to those open questions.
At puberty, youngsters start to feel sexual attraction to people of a certain sex. The model proposes which brain structures and which learning mechanisms that operate in children determine the sexual orientations that emerge at puberty. The model integrates into a unifying comprehensive theory bits and pieces of empirical findings that have been collected from observations of humans and animals. Those findings corroborate all the model's assumptions.
In the following, the model's assumptions are introduced first. Then, empirical observations in humans and in other species that corroborate the model are described and discussed. Also, additional experiments that could further support the model or refute it are outlined.

Background
Eating and reproduction are crucial for the survival of the species. In both activities, the individual has to select appropriate objects and interact with them, while avoiding inappropriate objects. In eating, the appropriate objects are foods; in reproduction, the appropriate objects are sexual mates. In both activities, stimuli broadcast by the objects are processed by the brain of the individual and elicit responses. Foods elicit appetitive arousal, and mates elicit sexual arousal. Both kinds of arousal are expressed by typical motor activities and by secretions of glands. They trigger sequences of activities that sustain the species.
Any species can consume only certain kinds of food. The species' innate systems are set to distinguish by tasting between the foods and the non-foods. Many species learn to make that distinction based on learned cues of various modalities that foods and non-foods broadcast. Animals rely on that learned skill for finding foods in their specific environment. That skill is learned by conditioning.
Classical conditioning was discovered by Ivan Pavlov (1). Since then, the general concept of conditioning has been expanded, and it now includes a number of learning paradigms (reviewed in (2)). What they all have in common is that the organism adopts new behavior patterns due to their direct or indirect association with an innate behavior pattern. Conditioning enables animals to learn to identify food without having to taste it. Non-gustatory cues that are associated directly or indirectly with food are learned and become future cues of food.
In analogy to food, the model proposes that children learn by conditioning to identify sexual mates of a certain sex and to be sexually aroused by them (3,4). This learned skill, which may be called "the sexual orientation reflex", defines their sexual orientation that emerges at puberty. It is expressed by typical motor activities and by endocrine secretions that are elicited mainly by the hypothalamus (5,6). Sexual orientation is first expressed at puberty, and it sets in motion various sequences of sexual activities.
The model proposes that the primary unconditioned stimulus (US) that drives the conditioning of sexual orientation is the voices of men and women. Human voice distinguishes very effectively between humans and non-humans and between the sexes. It is available to children and it is easily detected by them. Cues of various senses that are associated with men's voice in non-sexual circumstances become by conditioning cues that identify male partners in sexual circumstances. Cues of various senses that are associated with women's voice in non-sexual circumstances become by conditioning cues that identify female partners in sexual circumstances.
The sexual attractivity of a person is determined by his or her sex and by other personal characteristics. People broadcast cues that describe both their sex and their non-sexual characteristics. Children learn by conditioning to identify those characteristics and later they use them consciously and subconsciously for recognizing mates and for being aroused by them. Those non-sexual cues are commonly referred to as emotional or social. Thus, the innate unconditioned stimulus of the learning process has two components: an auditory component (USAUD) and an emotional component (USEM). The combination of those cues guides the child's learning to identify sexual partners of a certain sex and of certain non-sexual features.
The signals of the unconditioned response (UR) that participate in the development of sexual orientation are provided mainly by the hypothalamus. After puberty, the hypothalamus elicits sexual behavior. During childhood, those signals do not trigger the physical expressions of sexual arousal, because the sex organs are not yet mature. However, during childhood, those signals trigger the learning by conditioning. At puberty, when the software and hardware of the sexual system as a whole are mature and integrated, the learned sexual orientation is expressed (3,4).

Information types
The model presented here proposes that sexual orientation is learned by primary and by secondary conditioning processes (reviewed in (2)). The learning is automatic and subconscious, and it is based on passive non-sexual childhood experiences that involve innocuous role models. The main learning process is conditioning. It is based on innate patterns of US-UR relationships, which are expanded by learning. Three types of information are involved in expressing the innate, unconditioned behavior: auditory (AUD), emotional (EM) and sex control (SC). Innate connections between neurons in the child's brain control the flow of that information, which activates the primary unconditioned stimuli (US) and the unconditioned responses (UR) of the learning processes. When paired with actual experiences of the child, neutral stimuli (the conditioned stimuli, CS) are conditioned and become triggers of the unconditioned response. That is how the innate behavior is expanded and becomes the sexual orientation that emerges at puberty.
The innate unconditioned stimuli (US) have two components: an auditory component (USAUD) and an emotional one (USEM). The auditory component is relayed by sound detectors that distinguish between men's and women's voices. The main contribution to the emotional component is provided by the amygdala. It concerns emotions that are induced by non-sexual features of the partner. When the auditory and the emotional components act together, they trigger the sex control center (SCC), which provides the unconditioned responses of the learning process. The main brain structure of the SCC that provides the UR is the hypothalamus. During childhood, the SCC cannot trigger the actual adult sexual responses, because the sex organs are not fully operational. However, the SCC can participate in the learning process. It provides the signals that trigger the learning of the CS-UR associations, without triggering an actual reward sensation. At puberty, the SCC can trigger the sex organs, and the learned outcome is expressed as sexual orientation. This may be considered an extension of the organizational/activational schema of brain development (8,9), according to which early hormonal effects organize the brain such that adult hormonal effects are constrained by that prior exposure. The proposed extension holds that certain sexual behaviors are established by conditioning mechanisms during childhood and are expressed only after puberty.
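The gating described above can be made concrete with a toy simulation. The following is a minimal sketch, not part of the model itself: the function names, the binary treatment of the auditory and emotional components, and the update rule (each pairing moves a cue's strength a fixed fraction `rate` toward 1) are all illustrative assumptions. It shows only the qualitative point that cues are learned in an episode if, and only if, the voice passes the child's innate AUD-SCC connection and the emotional component is present.

```python
def unconditioned_response(voice_sex, emotional_signal, passing_voices):
    """Return True when the hypothesized UR signal is produced.

    Assumption: the UR requires both the auditory component (a voice whose
    sex is in `passing_voices`, the innate AUD-SCC connection) and the
    emotional component, each treated here as a simple on/off value.
    """
    return voice_sex in passing_voices and emotional_signal


def condition(experiences, passing_voices, rate=0.2):
    """Accumulate CS-UR association strengths over childhood episodes.

    experiences: iterable of (voice_sex, emotional_signal, cues) tuples.
    The fixed-fraction update is an illustrative choice, not the model's.
    """
    strength = {}
    for voice_sex, emotional_signal, cues in experiences:
        if not unconditioned_response(voice_sex, emotional_signal, passing_voices):
            continue  # no UR signal, so nothing is learned from this episode
        for cue in cues:
            old = strength.get(cue, 0.0)
            strength[cue] = old + rate * (1.0 - old)
    return strength


# Hypothetical episodes for a child whose connections pass only men's voices.
episodes = [
    ("man", True, ["beard"]),    # learned: voice passes, emotion present
    ("woman", True, ["scarf"]),  # not learned: voice does not pass
    ("man", False, ["hat"]),     # not learned: emotional component absent
]
learned = condition(episodes, passing_voices={"man"})
```

In this sketch only the cue from the first episode acquires any strength; repeating that episode would move its strength closer to 1.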
Sexual interactions have three consecutive stages: recognizing a mate, appetitive behavior, and consummatory behavior. In the recognizing stage, the pursuer recognizes a mate of a certain sex who possesses certain desirable features. The mate is the source of the information processed in this stage. In the appetitive stage, information about the recognized mate is merged with information about the mental state of the pursuer and the environment. In the consummatory stage, the information about the mate, the pursuer and the environment is merged with physical information about the bodies of the pursuer and the mate. Many brain centers are involved in processing all that information. The medial preoptic area of the hypothalamus (MPOA) is the main hub that regulates sexual behavior. It triggers genital responses, which are the internal outcomes of the recognition stage, and defines the external target of the appetitive and consummatory stages (7). The sexual orientation of a person is determined in the recognizing stage, and that is the focus of the model presented here.

The auditory component
The model proposes that innate connections between auditory centers (AUD) and sex control centers (SCC) vary across children. In one group of children, only auditory signals that are initiated by men's voices can reach the SCC. In a second group of children, only auditory signals initiated by women's voices can reach the SCC. In a third group, signals initiated by voices of both men and women can reach the SCC. Those variations cause the corresponding sexual orientations: A child whose detectors of men's voices are innately connected, directly or indirectly, to the sex activation centers (such as hypothalamic nuclei) will be sexually attracted at puberty to men. Such a boy will become gay; such a girl will become straight.
A child whose detectors of women's voices are innately connected, directly or indirectly, to the sex activation centers (such as hypothalamic nuclei) will be sexually attracted at puberty to women. Such a boy will become straight; such a girl will become lesbian.
Along the same lines, a child whose detectors of both men's and women's voices are innately connected, directly or indirectly, to the sex activation centers (such as hypothalamic nuclei) will be sexually attracted at puberty to men and to women. Such a child will become bisexual.
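The three cases above amount to a mapping from a hypothesized innate wiring to a predicted attraction at puberty. A minimal sketch of that mapping follows; the `Wiring` type and the function `predicted_attraction` are hypothetical names introduced only for illustration, and the case in which neither voice reaches the SCC is not addressed by the model, so the sketch simply returns an empty set for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wiring:
    """Hypothetical encoding of a child's innate AUD-SCC connections."""
    mens_voice_reaches_scc: bool
    womens_voice_reaches_scc: bool


def predicted_attraction(wiring):
    """Return the set of sexes the model predicts attraction to at puberty."""
    attracted = set()
    if wiring.mens_voice_reaches_scc:
        attracted.add("men")
    if wiring.womens_voice_reaches_scc:
        attracted.add("women")
    # Empty set: a wiring the model does not discuss (assumption of this sketch).
    return frozenset(attracted)


group_1 = predicted_attraction(Wiring(True, False))   # first group
group_2 = predicted_attraction(Wiring(False, True))   # second group
group_3 = predicted_attraction(Wiring(True, True))    # third group
```

Note that the mapping does not take the child's own sex as an input, which reflects the model's claim that the same wiring yields attraction to the same sex of partner in boys and in girls.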

The emotional component
Sexual attraction and arousal depend not only on the sex of the partner. There are features of the partner, other than his or her sex, that are necessary for sexual attraction and arousal to happen. Just before the interaction starts, the participants expect to feel safe during the interaction and to enjoy it, but there are some risks; the participants are about to infringe on the personal domains of each other. That combination of opposite emotions creates excitement (3,4). In very broad terms, those emotions may be referred to as feeling safe and feeling threatened in the context of sexual interaction. The feeling of safety is not only the absence of fear; it is an affirmative sensation of feeling safe. The intensities of those emotions vary from person to person, and they may change with time and with the circumstances. Those emotions are felt by both partners, especially by naïve youngsters at puberty. The model does not analyze those emotions; it treats them as a black box that contributes to the attraction and to the arousal. Figure 1 illustrates the information types involved in recognizing sex mates.

Types of information that drive mate recognition
Auditory-Sex Control connections vary across children. Some connections convey only signals initiated by men's voices, some only those initiated by women's voices, and some both. That determines which associated features of the role models will be learned by conditioning. Those learned features become cues for post-pubertal activities. A representation of a new concept ('sexy person') is thus formed in the child's brain. At puberty, sexually attractive people will activate that representation, which in turn will activate the physical expressions of sexual arousal. The figure illustrates a child (a boy or a girl) whose Auditory-Sex Control connections convey only signals initiated by men's voices (TOP). Consequently, that child will learn by conditioning only cues that are broadcast by men. At puberty (BOTTOM), that youngster will be sexually attracted to and aroused by men.

The subcortical axis
The main tenet of the model is that innate connections between auditory and sex control centers vary across children, and that those connections determine which voice signals can serve as unconditioned stimuli. Those stimuli trigger the UR, which in turn triggers the learning of the cues for sexual attraction and arousal. Thus, the various AUD-SCC connections determine the individual's sexual orientation that emerges at puberty. Various brain areas could play such stimulus-response roles. Some general considerations and preliminary empirical observations provide hints about the more specific brain areas that are responsible for learning that behavior.
In general, intricate behaviors evolve from innate cores that are expanded by experience. The innate core of a behavior resides in specific brain areas and the expansions are carried out by processes that involve larger brain areas that are shared by other behaviors. The cores of many crucial processes reside in subcortical areas, and their learned parts reside in cortical areas. The cortical and subcortical areas interact with each other during the learning process and during the expression of the behavior.
Auditory signals propagate from the cochlea via the ascending auditory tract (AAT) to the medial geniculate nucleus (MGN), where they split. One part continues to cortical areas, and the other part continues to subcortical areas. The first subcortical structure that they encounter is the amygdala. From there, the signals continue by direct and indirect connections to the bed nucleus of the stria terminalis (bnST) and to the hypothalamus.
The model proposes that the core of human sexual orientation, which is responsible for the primary unconditioned stimuli and unconditioned responses that participate in the conditioning, resides in the subcortical MGN-amygdala-bnST-hypothalamus axis (MASHA). The expansions of the core, which are carried out by associative learning mechanisms, reside mainly in cortical areas. A variety of pathways convey information between elements of the MASHA and between the MASHA and cortical areas. Those assumptions of the model are testable, and if verified, they would provide a proximate biological explanation of the development of sexual orientations during childhood.

Human observations
At puberty, youngsters discover new feelings and sensations that they have not felt before; people of a certain sex attract and arouse them. As children, they have learned cues that identify the sex of other people and cues that predict the emotions that result from interactions with them. Some of the cues for recognizing mates, such as clothes and hair styles, are arbitrary and vary with time and location. The process that the brain uses for learning those cues determines the sexual orientation that emerges at puberty. The process starts as early as the fetal stage. Later, children acquire more cues automatically and subconsciously by being immersed in their societies. Other cues are acquired by being told directly or indirectly.
Evidence suggests that between 33 and 41 weeks of gestation, neural networks sensitive to properties of the mother's voice are being formed. Fetuses respond differently to the mother's voice, the father's voice, and the voices of male and female strangers (10,11). The fetus can memorize not only its mother's voice but also more complex external sounds, with considerable powers of discrimination (12).
Cortical areas of infants between the ages of 4 and 7 months show sensitivity to voices and to emotional prosody, suggesting that as voice-sensitive cortical brain systems emerge between 4 and 7 months of age, representations of those features are formed in them (reviewed in (13)). Studies have been conducted to find out which kinds of sound draw the attention of infants and make them focus on the ongoing event (14): "At birth, infants prefer listening to vocalizations of human and nonhuman primates; within 3 mo, this initially broad listening preference is tuned specifically to human vocalizations. Moreover, even at this early developmental point, human vocalizations evoke more than listening preferences alone: they engender in infants a heightened focus on the objects in their visual environment and promote the formation of object categories, a fundamental cognitive capacity. This initially broad listening preference is tuned specifically to human vocalizations." There is ample evidence suggesting that children already know how to distinguish between the genders at an early age and use that ability to develop various related behavior patterns (15): "Infants as young as three to four months of age distinguish between categories of female and male faces, as demonstrated in habituation and preferential looking paradigms" (16). By about six months, infants can discriminate faces and voices by sex, habituate to faces of both sexes, and make intermodal associations between faces and voices (e.g., 17,18,19). By 10 months, infants are able to form stereotypic associations between faces of women and men and gender-typed objects (e.g., a scarf, a hammer), suggesting that they have the capacity to form primitive stereotypes (20). Twenty-four- and thirty-month-old children knew the gender groups to which they and others belonged (21). Similarly, most 24- and 28-month-old children select the correct picture in response to gender labels provided by an experimenter (20,22). "Taken together, these studies suggest that most children develop the ability to label gender groups and to use gender labels in their speech between 18 and 24 months."
Those studies suggest that infants develop their concepts of 'human', 'men', and 'women' gradually. They keep building the concepts according to the information that becomes available to them as time goes on, and depending on which brain structures have matured. The process starts with cues that their innate systems are set to detect and process. Then, new cues that co-occur with previously adopted ones are added to the evolving concepts. Hebbian learning (23) is probably the basic neural mechanism employed in the process. The first cues that distinguish between the sexes and that are available to fetuses and infants are auditory. Those cues are salient and detectable by fetuses, neonates and infants. They are effective in distinguishing between humans and non-humans and between the sexes. They are robust and universal; they are not masked by other stimuli, and they have been available in all societal environments. Consequently, neonates already have rudimentary perceptions of 'human', 'men' and 'women' at birth. At this stage, 'humans' are those objects that produce sounds in a certain frequency range and with a certain timbre, prosody and some other typical temporal features. Men are those humans that produce low-pitched voices, and women are those humans that produce high-pitched voices. With time, those concepts are expanded to include cues from other modalities, including visual, tactile and olfactory. Some of those cues are sex-specific and some are common to both sexes. Some depend on the sex, and others hint at non-sexual characteristics of the other person.
Observing the behavior of infants and toddlers suggests that voice is the stimulus that guides their brains in learning to distinguish between humans and other objects, and between men and women. During that time, children learn on their own to distinguish between the sexes based on visual cues that are arbitrary (such as clothes), that vary across societies, and that have changed throughout history. Yet, at puberty, those cues sexually arouse them reflexively. That may suggest that those cues were learned by conditioning, and that the unconditioned stimuli that guided that learning were men's and women's voices. It is not known whether voice continues to guide the brain in its sexual development after early childhood, or whether different modalities take over. However, the facts that the voice of boys changes at puberty, that this change is genetic, and that adult voice serves as the ubiquitous cue for distinguishing between the sexes, including during sexual activities, may suggest that voice continues to guide the development of sexual orientation throughout childhood, a role that it has been playing since the fetal stage.
Based on all those observations, the model presented here proposes that the typical features of the voices of men and women are the primary auditory cues based on which the brain learns to recognize a partner of a specific sex. Thus, the centers that process those cues determine the sexual orientation of the person.

Animal observations
During sexual interactions, the partners exchange information that enables them to sequence their own activities and to synchronize their activities with their partner's. A variety of modalities and codes are used by different species for exchanging that information.
Rodents exchange olfactory signals during their reproductive activities, and pheromones serve as cues in the exchanged information. Their olfactory system has two subsystems that operate in parallel: The accessory olfactory system (AOS), which handles pheromones, and the main olfactory system (MOS), which handles mainly other airborne stimulants. Rodents use their Vomeronasal Organ (VNO) for picking up pheromonal cues and for assessing their significance. Those pheromones guide the animal in recognizing appropriate mates, in becoming aroused, and in the subsequent stages of their reproductive activity. The VNO relays its output to the Accessory Olfactory Bulb (AOB), which relays its output mainly to the amygdala (AM) and to the bed nucleus of the Stria Terminalis (bnST), which relay their outputs mainly to the hypothalamus (HY). The HY mediates the secretions of the pituitary gland and innervates various motor systems that express sexual activities (8,9). In addition to their inter-connections, the AM, bnST and the HY interact with other brain areas and integrate information that they exchange (reviewed in (24)). The MOS projects to orbitofrontal cortex, the amygdala and the hippocampus. Those areas project back, directly and indirectly via other brain areas to the AM, the bnST and the HY, and thus they too are involved in the sexual activities.
A properly functioning VNO is critical for various stages of sexual activities. For instance, it was found (25,26) that in both male and female sexually naïve hamsters, removal of the VNO or prevention of its normal functioning has severe impacts on sexual activities. However, if the animal had prior experience of sexual interplay with a partner, the effects of the removal of the VNO are less dramatic. That suggests that the VNO is critical for associative learning of secondary cues that can substitute for the innate primary pheromonal cues. The secondary cues could be provided by the MOS and/or by other modalities. In intact animals, VNO signals eventually reach the HY (24).
Numerous brain regions and processes are involved in the various stages of reproductive activities. Coria-Avila and colleagues (27) have devised a three-circle Venn diagram that illustrates the involvement of brain areas in parental and sexual attachments in mice. The circles group brain areas according to their contribution to: (1) social recognition, (2) reward and motivation, and (3) inhibition of fear/anxiety. The authors conclude that the data suggest that the medial amygdala (MeA), bnST and MPOA are brain areas where olfactory social recognition occurs. Both excitatory and inhibitory signals reach the MPOA and modulate its outputs, which regulate many steps of sexual activities (reviewed in 28). For instance, Osakada and colleagues (29) describe excitatory and inhibitory pathways that regulate lordosis, a sexually receptive posture, which is one step of the sexual process. In mice, signals of ESP1, a pheromone that identifies a male target, are detected by the receptor V2Rp5 in the female VNO and relayed by the AOB to the MeA. From there they reach the ventrolateral area of the ventromedial hypothalamus (VMHvl), whose activation enhances lordosis. That enables the continuation of the copulating process. However, the tears of male juveniles, which drip onto their faces, contain another pheromone, ESP22. When it is sensed by the female, her VNO receptor V2Rp4 relays signals via a specialized pathway that passes through the MeA and the bnST before it reaches the dorsal area of the ventromedial hypothalamus (VMHd). Signals of that pathway suppress lordosis and terminate the copulation process. The authors propose that the MeA acts as a hub to route vomeronasal inputs of varying functions to their appropriate downstream neural circuits.
It was found that the MeA is dimorphic (30). MeA neurons in male mice respond preferentially to female urine, and MeA neurons in female mice respond preferentially to male urine. The VNO and the AOB do not show that dimorphism. Extrapolated to humans, those results may suggest that the MeA, the bnST and the hypothalamus are where the social recognition stage of sexual activities takes place. In that stage, a mate of a certain sex and with other desirable features is recognized. However, humans do not have a functional VNO (31), and the human analogs of the pheromonal cues have yet to be identified.
Like humans, birds do not have a VNO. The bulk of research on the effects of social cues on gonadal development in birds has been conducted on female songbirds listening to male song (reviewed in (32)). Such research confirms that song alone, presented to photo-stimulated females via audio playback over a period of weeks, is sufficient to enhance LH secretion, follicle growth, egg laying, or nest-building behavior in females of many species. Correlations between active auditory brain centers and endocrine releasing centers have been documented in birds. For instance, recordings of conspecific male song were played to laboratory-housed female white-throated sparrows (32). Hearing song for only 42 min induced LH release and expression of the immediate early gene Egr-1 in the mediobasal hypothalamus (MBH). Song-induced Egr-1 expression in the MBH was correlated with its expression in midbrain and forebrain auditory centers. That expression was found at two levels of the auditory system: at a midbrain processing center homologous to the inferior colliculus and at a forebrain center analogous to the auditory cortex.
Fast influence of auditory signals on hypothalamic nuclei was also found in female ring doves (33). Single-cell recordings were made from neurons in the preoptic area (POA), anterior hypothalamus, and posterior hypothalamus of female ring doves that were exposed to various auditory stimuli: male nest coo, female nest coo, reversed female and male nest coos, and white noise. Concurrently with those recordings, plasma concentrations of LH in pituitary blood veins were measured. Female-nest-coo-specific neurons were found exclusively in the POA-AMH areas. They are characterized by a two-burst pattern with approximately the same temporal contours as the auditory coo. Those activated neurons were labeled by the author as "female-nest-coo units". Those observations suggest that feature-detecting neurons, such as the female-nest-coo-specific units, are involved in gonadotropin-releasing hormone output. The operation of such units demonstrates that a variety of relevant auditory cues that reach the hypothalamus may still retain their original temporal auditory features. That may be due to direct connections between auditory centers and hypothalamic neurons, or to the preservation of those features as the auditory signals are processed on their way from the auditory areas to the hypothalamus.
Extrapolated to humans, that may suggest that auditory signals that carry sexual information eventually reach the hypothalamus. Some of those signals still carry with them auditory features that identify the sex of the mate. In the case of birds, that was the temporal pattern of the original auditory stimulus of a male conspecific. In the case of humans, that may suggest that some signals that affect the hypothalamus can be traced back to their originator, by auditory features that they carry, such as the frequency and timbre of the voice of their source.
Males of many songbird species have a repertoire of songs that they use. Males have to learn to sing those songs, and females may have to learn to recognize them and to respond accordingly (reviewed in (34)). In the wild, they learn the songs by being immersed in their flocks. Males that are reared in isolation develop abnormal songs. However, if songs are played to them, they develop normal songs (35). Young birds that have never before heard songs of conspecifics increase their heart rate and their begging when they hear playback of songs of conspecifics (36). These are indications that birds are innately predisposed to the songs of their species. Some species have time windows during which they can imprint new songs, and some do not have such limitations (37).
Extrapolated to humans, that may suggest that human fetuses and neonates are predisposed to various features of human voice, and that human voice may affect the development of their neural networks that recognize mates, thus eventually affecting the development of human sexual orientation.
This reliance on voice is not limited to birds. For example, male frogs use courtship calls to attract females. Anatomical investigations have demonstrated the existence of pathways from thalamic and midbrain auditory structures to areas involved in responding, including the preoptic area (POA) and ventral hypothalamus (VHY) (38). Acoustic signals that convey social cues reach other frogs' auditory regions and the hypothalamus. Responses of three different hypothalamic regions depended on the social content of the sounds, whereas other areas responded regardless of the content (39). The responses of POA and VHY neurons varied in accordance with the type of stimulus (conspecific, heterospecific, white noise) and the season. Research with anurans has demonstrated that acoustic communication, on both the emitter and the receiver ends, is modulated by reproductive hormones, including gonadal steroids and peptide neuromodulators (40). Those results are similar to the results in birds.

Learning in the subcortical axis
Although there are many similarities between human and rodent brains, there is a significant difference. Rodents rely on innate pheromonal cues for recognizing mates, whereas humans rely on learned cues from several modalities. In order to generalize from rodents to humans, it is necessary to find out how humans learn their cues. A main learning mechanism at the neural level is the modification of the strengths of synaptic connections. Hebb's rule (23), "Cells that fire together bond together", is considered the mechanism underlying Long Term Potentiation (LTP) (reviewed in (41)). Hebb's rule may underlie learning in various other situations, such as recording associations between benign events based on their saliency or their frequency, even when the events have no strong emotional value. Examples are the associations between scarves and women or between hammers and men. Such associations can be learned even by infants (20).
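The weight-strengthening idea behind Hebb's rule can be illustrated with a minimal sketch. The update form (weight change proportional to the product of pre- and post-synaptic activity), the learning rate, and the binary activity traces are illustrative assumptions, not quantities taken from the cited studies.

```python
def hebbian_update(weight, pre, post, learning_rate=0.1):
    """Plain Hebbian step: the weight grows in proportion to pre * post activity."""
    return weight + learning_rate * pre * post


def train(pairs, learning_rate=0.1):
    """Apply the Hebbian step over a sequence of (pre, post) activity pairs."""
    weight = 0.0
    for pre, post in pairs:
        weight = hebbian_update(weight, pre, post, learning_rate)
    return weight


# Hypothetical activity traces: 1 = cell active on a trial, 0 = silent.
co_active = [(1, 1)] * 10             # the two cells fire together on every trial
uncorrelated = [(1, 0), (0, 1)] * 5   # the two cells never fire together

w_co = train(co_active)      # connection strengthens
w_un = train(uncorrelated)   # connection stays at its initial value
print(w_co, w_un)
```

In this toy setting only the co-active pair ends with a strengthened connection, which is the sense in which frequent co-occurrence alone, without emotional value, could record an association.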
A second group of learning situations involves conditioning, where the association has an emotional value, such as the association between the bell ring and the food. Due to the significance of their consequences, conditioned associations are re-evaluated and updated frequently. The first stage of the acquisition process in conditioning is recording the associations between paired events. This stage is followed by a variety of evaluation and adjustment processes (reviewed in (2)) that extend beyond the brain area where the associations were first recorded. These follow-up processes include extinction, which occurs when the CS is repeatedly presented alone (without the US) and leads to a decline in the elicitation of the CR (42); latent inhibition, which refers to the delay in learning when either the CS or the US has been presented prior to the CS-US conditioning trials (43); and devaluation, which occurs after conditioning, when the significance of the US has changed (44). Those follow-up processes check the worthiness of the recorded associations and balance their expressions in actual behaviors. The processes involve several brain areas, including: cortical and subcortical areas where the learned associations are recorded; cortical areas such as the PFC, where evaluations are made; memory controllers such as the hippocampus that coordinate the transfer of memory records between brain areas; sensory-motor areas that engage the processed stimuli with motor units; emotion centers that handle information about the emotional value of events; and more. Details of connections between parts of the MASHA and cortical areas that participate in conditioning have been reported (45). The model does not deal with the follow-up processes; it deals only with the basic association processes of conditioning that underlie sexual orientation, which are handled by the MASHA.
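The acquisition stage and the extinction process described above can be sketched numerically. The sketch uses the Rescorla-Wagner rule, a standard textbook model of conditioning that is swapped in here for illustration; the source does not commit to any particular learning equation, and the rate and trial counts are arbitrary assumptions.

```python
def rescorla_wagner(strength, us_present, rate=0.3, asymptote=1.0):
    """One trial: move the CS associative strength toward the asymptote
    when the US is present, and toward zero when the CS is shown alone."""
    target = asymptote if us_present else 0.0
    return strength + rate * (target - strength)


def run(trials, strength=0.0, rate=0.3):
    """Return the associative strength after each trial in the sequence."""
    history = []
    for us_present in trials:
        strength = rescorla_wagner(strength, us_present, rate)
        history.append(strength)
    return history


# Hypothetical schedule: 20 paired CS-US trials, then 20 CS-alone trials.
acquisition = run([True] * 20)
extinction = run([False] * 20, strength=acquisition[-1])

print(round(acquisition[-1], 3), round(extinction[-1], 3))
```

The associative strength rises during pairing and decays when the CS is presented alone, mirroring the decline of the CR during extinction. Latent inhibition and devaluation would need additional terms and are not modeled here.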

The analogs of pheromones
The model deals with three aspects of human analogs of pheromonal activities: 1. The voices of men and women are the analogs of the pheromones that identify a mate of a specific sex. Voice provides the innate primary cue (USAUD) that is used in learning how to recognize a mate of a specific sex. Human voice is also one of the cues used during actual sexual activities for recognizing a mate. In other species, dedicated pheromones play those roles. That auditory cue is expanded by associative learning mechanisms, and as a result, at puberty the youngster can recognize mates of a certain sex based on cues of various modalities.
2. Humans learn by conditioning cues that encode non-sexual features of potential mates. Those learned cues are analogs of pheromonal cues. The details of the innate human USEM, based on which those cues are learned, are not clear yet. For example, there are pheromones that elicit avoidance and pheromones that elicit sexual aggression (reviewed in (24)). Such pheromones apparently have no one-to-one analogs among human auditory cues. Those cues have to be learned in humans, and they are handled by a range of sensory, association and executive brain areas.
3. Human homologs of the neural pathways that are activated in other species by pheromones are activated in humans by the analogs of those pheromones. (For example, ESP1 identifies a male in mice, and activates certain MeA-bnST-VMH pathways. The human analog of ESP1 will activate the human homologs of those mouse pathways.) There is a consensus that the amygdala, bnST and hypothalamus constitute the fundamental core that carries out the recognition and the ensuing appetitive and consummatory stages of sexual behavior. Extrapolating to humans, the model argues that human voice and stimuli that are conditioned by voice play the same roles that pheromones play in those other species (Figure 2). Centers of the MASHA process voice-elicited information in analogy to the processing of that information when it is elicited by pheromones.

The roles of the MGN
Auditory information is relayed tonotopically and diffusely from the inferior colliculus to the MGN (46), and from there it splits to the cortex and the amygdala. About 10% of MGN projections bifurcate and go to both cortex and amygdala, and other projections go separately to their targets (47). Using functional Magnetic Resonance Imaging (fMRI), cortical areas that respond selectively to men's or women's voices have been identified in adults' brains (48,49,50,51,52,53). Audio-visual association areas have been identified in the superior temporal sulcus (STS) (reviewed in (54)). Some of the cortical audio-visual representations are organized tonotopically. That could happen as auditory stimuli that arrive at those areas tonotopically recruit local neurons to represent their associations with the arriving visual stimuli. The result is a cortical tonotopic representation of audio-visual associations. It is possible that some audio-visual stimuli in the amygdala and the MASHA are also represented tonotopically.
Based on that, the model assumes that the human medial geniculate nucleus (MGN) is a homolog of the rodents' olfactory system. Like the AOB, the MGN relays to the amygdala the primary cues that encode the sex of a potential mate. Like the MOB, it relays stimuli to cortical areas, where they create representations that combine innate and associated cues about the sex of a potential mate. When activated, those representations, which are expansions of the innate representations of a mate, are fed to the MeA and propagate in the subcortical pathway to the hypothalamus.
However, the human MGN does not relay analogs of non-sexual cues of a mate, which in rodents are encoded by pheromones, because the MGN does not receive such information. In humans, those cues have to be learned based on a variety of cues from visual and other modalities. The model assumes that they enter the system via other entry points of the MASHA (Figure 2). Learning by conditioning depends on primary US that the system can detect and its primary UR's. The model assumes that in mate recognizing events the primary US have two components, auditory (USAUD) and emotional (USEM). The model assumes that the networks that detect them and trigger the learning process reside in the MASHA. The learning process involves also other brain areas and mechanisms, but those are general purpose venues that are used also for learning other behaviors. The learned associations can reside in the MASHA and in cortical areas.
Every center of the MASHA has input and output connections to other MASHA and brain centers. A MASHA center may play the role of a US or a UR in a conditioning process. The second role in the process may be played by another MASHA or brain center. An activated MASHA center that plays the role of a UR can elicit a conditioning process, in which the CS-UR association is recorded or updated. After puberty, the MASHA center can elicit physical expressions of the process.
The innate connections between auditory and MASHA centers that control the flow of USAUD signals (thick arrow) vary across three groups of children. That determines the sexual orientation that emerges at puberty.

The roles of the amygdala
The amygdala can learn by conditioning and store the learned information (55,56). Two nuclei of the amygdala appear to play different roles in appetitive and in aversive conditioning. Based on analysis of experimental observations of appetitive conditioning, a model was developed (44) postulating that the BLA enables the CS (e.g., light) to access the affective value of its particular US (e.g., food) and that this information is used to control the CeN and its projections to produce a behavioral response. However, the CeN can directly encode stimulus-response Pavlovian associations and influence the conditioned responses through its projections to the midbrain, hypothalamus and brain stem. Another model (57) that deals with aversive conditioning, postulates that the association between CS and US is coded in the lateral and basolateral nuclei of the amygdala, but not in the CeN. This information is then sent to the CeN, which promotes the execution of the URs. In both models, direct and indirect CS-UR associations are expressed in the amygdala and are relayed to downstream centers from there.
Those observations were made mainly in fear conditioning, where voice was the standard CS. However, in sexual behavior, the MeA acts similarly to the CeN. Therefore, experiments will have to find out the details of the interactions of voice with the MeA in humans.

The MGN-Amygdala area
In rodents, the AOB is the main transition station of olfactory information on its way from the sensory system to processing by the brain. From the AOB, information is relayed to the MeA. When the information leaves the MeA, it already shows dimorphic filtering (30). It affects different brain targets according to both the sex of the source of the stimuli and the sex of the receiver.
Extrapolated to humans, the model implies that the amygdala merges the incoming auditory information with other information and parses it according to the sex of the source of the stimuli and the sex of the receiver. The parsed information that is relevant to sexual activities is relayed to its MASHA and cortical destinations. The parsed information that is relevant to other activities, such as gender-related non-sexual activities, is relayed by the amygdala to other destinations via other pathways.

The role of the mPFC
The amygdala and the medial prefrontal cortex (mPFC) are highly connected areas involved in associative learning and decision making. Both innervate the lateral hypothalamus (LHA). It was found (45) that separate pathways connect the BLA and the CeA to the LHA, and other pathways connect the BLA to neurons in the mPFC that innervate the LHA. That is in line with the model's argument that a core of MASHA pathways handles innate information, and cortical pathways that are connected to them handle information that was learned by associative processes. Amygdala-hypothalamus pathways may be conveying the auditory information about the sex of the source, and amygdala-mPFC-hypothalamus pathways may be handling information about the relevant learned associations.

The loci of sexual orientation
In vitro studies of the bnST and the hypothalamus have shown anatomical variabilities, such as differences in cell densities, that correlate with sexual orientation (58-65). It is not known how those variabilities affect the information flow through those locations. Those anatomical variabilities may be an indication that at the level of the bnST, auditory elicited flows convey information about the sex of their source, and only information pertaining to a specific sex can reach its targeted structures and affect them. That assumption could be tested experimentally. If verified, it would bolster the in vitro anatomical findings and together they would enhance our understanding of the phenomena.
On the input end of the MASHA, in the cochlea and the ascending auditory tract (AAT), variabilities in otoacoustic emissions (OAE) and evoked potentials (EP) that correlate with sexual orientation have also been observed (66,67). The exact roles that those variabilities play in sexual orientation are not clear. They may be related to the tuning to human voice that is displayed starting at infancy (14). If further experiments find that those variabilities are correlated with the processing of human voice, then they may be contributing to the diversity of sexual orientations.
Although humans do not have an operational VNO (31), it was found that inhaled putative pheromones affect hypothalamic and cortical areas in a way that correlates with the sexual orientation of the subject (68,69). The putative pheromones are found in exocrine secretions that become active at puberty. On the other hand, boys' voices also change at puberty and become a universal sex cue. It would be interesting to find out if human voice affects the same structures that are affected by the putative pheromones. If so, that may be an indication that at some point in human evolution, voice has taken over the role of pheromones in sexual activities.

Types of involved connections
A basic assumption of the model is that innate variabilities across children in the connections between neurons that respond to the voices of men and women cause the diversity of sexual orientations that emerge at puberty. Those variabilities control which signals reach certain critical neurons that express the URs that guide the learning of the behavior. Two of the general mechanisms by which center A, which handles two kinds of information (e.g., men's and women's voices), can relay it to center B are as follows. First, by excitatory connections between A and B, where only one option can flow in the connecting channel (only men's, only women's, or both). Second, both kinds can flow in the channel from A to B, but the flow is modulated by inhibitory signals that originate from a third party, C. The actual inhibition can be on neurons in A or in B. A third mechanism is a signal that inhibits the inhibition of an excitatory signal. All those mechanisms have been observed in the MASHA (28,29), and the model accommodates all of them. The connections where that variability in the flow of auditory information occurs are the sites that determine the sexual orientation that emerges at puberty. The variability may occur at one or at several points along the MASHA.
The model predicts that the spectrum of the flow of auditory information changes as it traverses the MASHA. At the entry area, the MGN, the flows elicited by high-frequency and low-frequency voices are comparable, because that flow contains information that is not sex specific. At the exit area, the hypothalamus, the flows are the most lopsided in favor of the frequencies of the targeted sex (more high frequencies for targeted women, more low frequencies for targeted men). In between the two ends, the spectrum changes. The most dramatic change occurs as the flow leaves the amygdala: the amygdala still receives information that is mostly sex independent, whereas the information that approaches the hypothalamus becomes more sex dependent.
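The three relay mechanisms between centers A and B described above can be contrasted in a toy sketch. The function names, the unit signal values, and the rectified-subtraction form of inhibition are illustrative assumptions and are not taken from the cited MASHA studies.

```python
def rectify(x):
    """Firing rates cannot be negative."""
    return max(0.0, x)


def selective_excitation(signal, weights):
    """Mechanism 1: the A-to-B channel carries only the kinds of
    information that have a nonzero excitatory weight."""
    return {kind: weights.get(kind, 0.0) * value for kind, value in signal.items()}


def inhibitory_gate(signal, inhibition):
    """Mechanism 2: both kinds flow from A to B, but a third party C
    subtracts inhibition from selected kinds."""
    return {kind: rectify(value - inhibition.get(kind, 0.0))
            for kind, value in signal.items()}


def disinhibition(signal, inhibition, release):
    """Mechanism 3: a further signal inhibits the inhibition itself,
    letting the released kind pass."""
    effective = {kind: rectify(level - release.get(kind, 0.0))
                 for kind, level in inhibition.items()}
    return inhibitory_gate(signal, effective)


# Hypothetical input at center A: both voices are equally represented.
voices = {"men": 1.0, "women": 1.0}

out1 = selective_excitation(voices, {"men": 1.0})
out2 = inhibitory_gate(voices, {"women": 1.0})
out3 = disinhibition(voices, {"men": 1.0, "women": 1.0}, {"men": 1.0})
print(out1, out2, out3)
```

With these settings all three mechanisms deliver the same output at B (only the men's-voice signal passes), which illustrates why the model can accommodate any of them: the observable outcome depends on which kind of information reaches B, not on how the gating is implemented.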

TESTING THE HYPOTHESES
The model assumes that innate connections between neurons that are located in certain brain areas are tuned to respond selectively to certain voices, and that this property determines the sexual orientations that emerge at puberty. Those assumptions could be tested by direct observations. A variety of non-invasive imaging technologies, including task-based fMRI, MEG, EEG, and diffusion MRI (70), are now being used for tracing the connectivity of neural networks in the brain. Those methods could be applied to study how men's and women's voices are handled by MASHA structures, and whether there are correlations between the responses of those structures to certain voice frequencies and the sexual orientation of the subjects. In addition, due to the tonotopic representation of auditory information in some brain areas, it might be possible to identify the frequency of the sound that a certain bundle conveys based on its anatomical location in the tract, or on its connection with a bundle in a tonotopic area whose frequency is known. That could identify groups of neurons that are selectively affected by men's or women's voices. As more information about the human connectome becomes available, more details about the connectivity of MASHA structures that is relevant to the development of sexual orientation may be uncovered.
A major part of the proposed model is based on observations in rodents. Since the reproductive system of rodents is innately tuned to respond to olfactory cues, the vast majority of the research of this system has been concentrating on olfactory stimuli. However, voice too plays a role in the reproductive activities of rodents (reviewed in (71,72)). Part of that voice is ultrasonic (USV), which cannot be detected by the human ear. Nonetheless, it is auditory. Auditory stimuli have been shown to be involved in a range of reproductive activities, but more studies are needed to map the details of the connections between the auditory and the reproductive brain circuitry of rodents. Once identified, such mapping could provide insights about their human analogs.

SUMMARY
Every person who engages in sexual interactions, real or imaginary, has a sexual orientation. That behavior has implications for the person's wellbeing and for the attitudes of society. Therefore, improved understanding of the underlying biology of the behavior would benefit both the individual and society. Due to practical constraints, a large part of our knowledge about the biology of sexual orientation is derived from observing animals. It is important to generalize from animals to humans very cautiously (73,74). Theoretical models and hypotheses, like the model presented here, are crucial stages of the Scientific Method, and are usually employed for optimizing such generalizations and for the exploration of new ideas.
Compared to other modalities, voice has been the least studied modality in the context of human sexual behavior. The model proposed here illustrates how voice may play central roles in the developmental stages of behaviors, and then, when mature, those behaviors rely on other modalities. More interdisciplinary studies are needed to uncover the roles that voice plays in the development of sexual orientation and in other sex and gender related behaviors.
The basic assumption of the model that human voice plays a major role in the development of sexual orientation is corroborated by many experimental observations. The model also proposes which learning mechanisms and brain structures participate in that development. The model's assumptions and predictions are empirically testable. The model proposes that the behavior is learned by conditioning, and the neurons that represent the innate primary US's and UR's reside in the Medial geniculate nucleus, Amygdala, base nuclei of the Stria Terminalis, and Hypothalamus Axis (MASHA). The US's have two components: auditory and emotional. The auditory component guides the learning of the sex of the desired partners, and the emotional component guides the learning of all their other non-sexual features. The model proposes that the auditory component of the US varies across three groups of children, and that variability determines the three groups of emerging sexual orientation. In one group, men's voice is the auditory US. In another group, it is women's voice, and in the third group both voices serve as auditory US's. The emotional US's, which include all the desired features of the partner except his or her sex, are common to all sexual orientations. An auditory US and an emotional US are required for triggering the UR. The UR triggers the learning of the associated cues (the CS). After puberty, the UR elicits the physical expressions of sexual attraction and arousal.
A variety of sub-networks of MASHA neurons can contribute to the learning of sexual orientation. The model proposes that each such sub-network has an auditory US component that varies across three groups of children, an emotional US component, and a UR. That is how the innate connections of those sub-networks, which are distributed in the MASHA, determine the sexual orientation that emerges at puberty.
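The logic of a single such sub-network, as the model states it, can be written as a short sketch: the UR fires only when an auditory US accepted by the child's innate group coincides with an emotional US, and the UR then triggers the recording of the co-occurring cues. The group labels, cue names, and counting scheme are hypothetical placeholders used only to make the stated rule concrete; they are not data.

```python
# Which voices act as the auditory US in each of the model's three groups.
AUDITORY_US = {
    "group_1": {"man"},
    "group_2": {"woman"},
    "group_3": {"man", "woman"},
}


def unconditioned_response(group, voice, emotional_us):
    """The UR requires both an emotional US and an auditory US
    that the group's innate connections let through."""
    return emotional_us and voice in AUDITORY_US[group]


def learn_cues(group, episodes):
    """Count how often each cue was recorded as a CS, i.e. how often it
    co-occurred with a triggered UR."""
    learned = {}
    for voice, emotional_us, cues in episodes:
        if unconditioned_response(group, voice, emotional_us):
            for cue in cues:
                learned[cue] = learned.get(cue, 0) + 1
    return learned


# Hypothetical episodes: (voice heard, emotional US present, co-occurring cues).
episodes = [
    ("man", True, ["cue_m1", "cue_m2"]),
    ("woman", True, ["cue_w1"]),
    ("man", False, ["cue_m3"]),   # no emotional US, so nothing is recorded
]

for group in AUDITORY_US:
    print(group, learn_cues(group, episodes))
```

The same episodes produce different learned cue sets in the three groups, which is the sense in which the model says the innate connections determine the outcome of the learning process.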
The model is a sufficient-conditions model. It describes the development in the most common situations. It is not a necessary-conditions model. Sexual orientation may develop based on other primary cues, for instance in hearing-challenged children. The model does not deal with post-pubertal development of sexual orientation, where the intense sensual rewards that accompany sexual interactions affect the learning of additional arousal cues and responses.
The model does not exclude the possibility that neurons in the AAT send projections that bypass the MGN and innervate directly other MASHA structures. If that happens, the MASHA could be expanded to include such neurons.
The model proposes that the outcome of the learning process that determines one's sexual orientation is already determined at birth. In the development of sexual orientation, nature controls nurture, and voice is nature's agent.