A large part of our understanding of the biological substrates of sex-recognition and mate-recognition is derived from studying animal models. In performing those tasks, rodents rely mostly on pheromones and other olfactory cues, whereas humans rely mostly on visual cues. This difference may hinder the translation of rodent biology to human biology, especially at the neural-network level, where those cues traverse different networks in human and rodent brains. This difficulty may be called the “pheromonal-visual gap”. The theoretical model presented here addresses that gap. The model merges observations from humans and model animals, as reported in specific scientific reports, with general biological principles that are accepted by the scientific community. The model suggests that the voices of men and women are the innate cues on the basis of which humans learn to use visual cues in sex-recognition and mate-recognition. Children learn the two tasks through associative learning mechanisms, by being immersed in their community and observing adult role-models in innocuous, non-sexual scenarios. The model proposes that the human medial-geniculate-nucleus (MGN) is the analog of the rodent accessory-olfactory-bulb (AOB) and main-olfactory-bulb (MOB), and that the human MASH pathway (MGN, amygdala, bnST, hypothalamus) is the analog of the rodent VNOP (vomeronasal-organ-pathway). Taking the differences between these pathways into account should facilitate translation from rodent brain nuclei and tracts to their human counterparts. The model also hypothesizes that innate direct and indirect connections between auditory centers, e.g., the MGN, and sex-control centers, e.g., the hypothalamus, vary across three groups of children, and that those variations determine the individual’s mate-recognition that emerges at puberty.