Preprint
Review

This version is not peer-reviewed.

The Origin of Human Theory-of-Mind

Submitted: 06 November 2024

Posted: 07 November 2024

Abstract
Is there a qualitative difference between apes’ and humans’ ‘ability to estimate others’ mental states’, a.k.a. ‘Theory-of-Mind’? After opting for the idea that expectations are empty profiles that recognize a particular content when it arrives, I apply the same description to ‘vicarious expectations’ –very probably present in apes. Thus, (empty) vicarious expectations and one’s own (full) contents are distinguished without need of meta-representation. Then, I propose: First, vicarious expectations are enough to support apes’ Theory-of-Mind (including ‘spontaneous altruism’). Second, since vicarious expectations require a profile previously built in the subject that activates them, this subject cannot activate any vicarious expectation of mental states that are intrinsically impossible for him. Third, your mental states that think of me as a distal individual are intrinsically impossible states for me, and therefore, to estimate them, I must estimate your mental contents. This ability (the original nucleus of human Theory-of-Mind) is essential in the human lifestyle. It is involved in unpleasant and pleasant self-conscious emotions, which respectively contribute to ‘social order’ and to cultural innovations. More basically, it makes human (prelinguistic or linguistic) communication possible, since it originally made possible the understanding of others’ mental states as states that are addressed to me, and that are therefore impossible for me.

1. Introduction

This article will propose that apes’ Theory-of-Mind (ToM) is supported by vicarious expectations, and that these, like any other expectation, are –let’s put it this metaphorical way– empty profiles that will recognize a particular content when it arrives. Thus, vicarious expectations, since they are empty profiles, can be automatically separated from the subject’s own (full) mental contents. By contrast, in the human ToM, the subject estimates foreign (i.e., others’) contents, which need some meta-representational resource that separates them from the subject’s own contents. After having described in this way the contrast between apes’ (i.e., ‘primitive’) and uniquely human (i.e., ‘advanced’) ToM, I must try to answer the following question: For what function was the estimation of foreign contents –that is, the costly duality of one’s own (full) content and foreign (full) content– originally advantageous?
If it is accepted that vicarious expectations require a previous empty profile in the subject that activates them, then it must also be accepted that such expectations cannot correspond to states which are intrinsically impossible for the subject. Thus, I propose that the ability to estimate foreign contents originally arose when mental states intrinsically impossible for the subject needed to be thought in the human lifestyle. But here it is necessary to pause very briefly to deal with this lifestyle.
The new –human– lifestyle, which is the key in the co-evolution genes/culture, can be characterized by two features. i) A ‘cultural’ feature. ii) A ‘social’ one.
i) An increasing technology: This would have needed some degree of teaching (Gärdenfors 2022; Laland 2017; Tatone & Csibra 2015), or, at least, of parental approval / disapproval (Castro & Toro 2004), and, therefore, some increase in communication. But the technological increase also needs self-control, not only to acquire technological skills, but also and above all, to surpass previous cultural products and, later, to support creative innovations (which are the essential factor to achieve the cultural advances¹).
ii) A high degree and wide span of collaboration and ‘partner choice’: This would have required increasing communication (Mussavifard & Csibra 2023), and also (since there is “competition to be chosen as a partner in cooperative ventures”: Baumard et al. 2013) self-control that “refrains from blatantly selfish actions” (ibidem).
Returning to our thread, we must ask ourselves why actions intrinsically impossible for the subject needed to be thought in the new, human lifestyle. Self-conscious emotions (if we opt for the idea that they are originally based on an interpersonal relationship, not on an innate moral core) are advantageous because they provide the self-control necessary to care for one’s own reputation. In addition, the subject who experiences those emotions ‘thinks what others think of him’ (of him as a distal, foreign individual), and, therefore, he thinks a foreign mental state which, being impossible for him in any circumstances, is not graspable through vicarious expectations.
But, to get to the origin of the matter, we must focus on a basic question –how does the human subject originally come to think what others think of him? Thus, we will study the new communicative reception (not production, at the very beginning) that distinguishes human –even prelinguistic– communication from that of chimpanzees. The human addressee must think a foreign mental state as a state addressed to him. By contrast, apes –I propose– can think a foreign mental state –not content, but vicarious expectation– only if this mental state is not addressed to them, and can understand that a message is addressed to them, only if there is no need of estimating foreign (i.e., the producer’s) mental states.
After proposing the double identification ‘apes’ Theory-of-Mind / vicarious expectations’, and ‘human Theory-of-Mind / foreign mental contents’, I will add two clarifications. Firstly, the strict condition for the very origin of ‘foreign mental contents’ (that is, the strict requirement that the mental states that must be thought are impossible for the subject in any circumstances) is not necessary for the subsequent development of human Theory-of-Mind. Secondly, it is convenient to focus in a more detailed way on the two receptions –by apes and by humans– of pointing gestures.
Section 2 briefly sets out the old descriptions (around the year 2003) of the primitive and the advanced Theory-of-Mind, and then presents the recent changes. Next, it focuses on three articles –Tomasello 2018; Southgate 2020; Lurz et al. 2022– which attempt to accommodate the new data regarding the abilities of Theory-of-Mind in infants or apes without having to dismiss the qualitative separation of the two modes of the Theory-of-Mind. I share those authors’ goal, but I do not agree with their proposals. Section 3, after highlighting the lack of consensus regarding the format of radically non-linguistic ‘expectations’, and after confronting mentalese (in 3.1.1), chooses to call them ‘well-defined, empty profiles’. Such emptiness, which achieves in any animal the automatic separation between goals and perceptions, can also be applied –I propose– to a special type of expectations, the ‘vicarious expectations’. These special expectations, very probably present in humans and apes, are processed as ‘belonging to the other’ through the simpler of the two ways proposed by Ereira et al. 2018, i.e., “through an encoding of agent identity intrinsic” to them. The nuclear Section 4 –or rather 4.2– proposes that the estimation of foreign contents originally arose when mental actions intrinsically impossible for the subject needed to be thought. Section 5 focuses on self-conscious emotions, which are essential in ‘the new –human– lifestyle’ (5.1) and require the ability of estimating foreign mental contents (5.2). Section 6 specifies that the (above mentioned) strict conditions for the evolutionary emergence of ‘foreign mental contents’ are not necessary for the subsequent (ontogenetic and historic) functions of the ‘second line’ of mental contents.
Section 7 suggests that the really effective (I will call it ‘unified’) reception of pointing gestures requires the estimation of the producer’s mental content, and in such sense is similar to the reception of gestures and gazes that cause self-conscious emotions in the addressee, and similar to the dialogic nucleus of any linguistic reception. Section 8 provides a general outline of the article. I summarize the article in the sense of listing all its hypotheses and suggestions without distinguishing between the main one and those that are subordinate to it. In fact, the order followed there aspires to be the evolutionary order in which the different capacities would have arisen. The outline does not, of course, cite any bibliography, nor does it accompany each proposal with the qualifications ‘according to my proposal’ or ‘I propose’, which were obligatory in the other sections. Finally, Section 9 deals with the testability of the proposals.

2. The Theory-of-Mind from 2003 Until Now

2.1. A Very Brief Summary

For the authors who accepted Theory-of-Mind around 2003, its primitive mode was the ability to know what the other sees (/ does not see) –or has (/ has not) seen immediately before. This ability is possessed, not only by children much younger than 4 years, but also (as Tomasello et al. 2003 showed) by chimpanzees. These results, soon extended to goats or ravens (see Bugnyar & Heinrich 2005, and Bugnyar et al. 2016), were explained by a very simple mechanism, namely, that the subject both tracks a line from the (visually or acoustically) perceived location of a conspecific to the relevant object and is aware of the (possible) opaque barriers obstructing that straight line.
The advanced Theory-of-Mind was linked to the ability of attributing ‘false beliefs’ to others. The early tests of ‘false belief’ show a video in which a child (Maxi) puts his marble inside a vase and then leaves; afterwards, his mother puts the marble inside his toy box and leaves. Right then, Maxi comes back, and the experimenter asks the children who have seen the video, ‘Where will Maxi look for his marble?’ The answers coming from children under 4 do not show the false belief which Maxi is bound to have, but their own knowledge. Within this general framework, the implicit knowledge of somebody else’s false beliefs in some 3-year-olds who nevertheless gave the wrong explicit answer (Clements & Perner 1994) did not seem to disturb the mentioned descriptions of the two modes of Theory-of-Mind.
But nowadays, there are new data. Let’s begin by attending to Karg et al. 2015, which shows that apes’ ability to estimate what the other sees (or does not see) goes well beyond its old description. These new experiments investigated whether chimpanzees could use self-experience to infer what another sees. Subjects first gained self-experience with the visual properties of an object (either opaque or see-through). In a subsequent test phase, a human agent interacted with the object and the authors tested whether chimpanzees understood that the experimenter experienced the object as opaque or as see-through. Crucially, in the test phase, the object seemed opaque to the subjects in all cases (while the experimenter could see through the one that they had experienced as see-through before). Therefore, the chimpanzees had to use their previous self-experience with the object to correctly infer whether the experimenter could or could not see when looking at the object. Chimpanzees in a competitive context (that is, when they were sufficiently motivated) successfully used their self-experience to infer what the competitor sees.
This experimental design is an ‘ecological’ one. Let’s think of an ape that must estimate if his peer sees the object (/ the immobile object) that he, the subject-ape, sees (/ “has previously seen”, Karg et al. 2016). In the wild, it is probable that the ape-subject must estimate if the foliage prevents the peer from seeing the object, but note that, since apes can often find themselves at different heights from each other, ‘the possible foliage that might –or not– prevent the peer from seeing the object’ is often hidden from the ape-subject’s eyes. To make such an estimation, he certainly could move. However, apes (heavy and lacking wings) would take too long to reach a location which would allow them to see their peer’s visual field. This problem was –I suggest– solved in apes’ evolution by vicarious expectations.
There is also news regarding false-belief tasks. More concretely, since Onishi & Baillargeon 2005, numerous results in non-verbal tests have been offered in favor of the estimation of false belief by infants. That type of test was later applied to great apes, who achieved not very different results –Krupenye et al. 2016 and Kano et al. 2017. But, since the rate of success in prelinguistic children, and even more, in apes, is smaller and more variable than the rate obtained in verbal tests, we must ask: Are those successes in non-verbal tests based on the same resource which supports traditional tests?

2.2. Discussing Some Proposals About the Difference Between the Primitive and Advanced Theory-of-Mind

Before moving on to my proposal, let’s see that some articles try to disconnect the new data from what would be achieved by the advanced Theory-of-Mind. We will focus on Tomasello 2018 and Southgate 2020, and also on Lurz et al. 2022, which goes in a different direction. I am close to their goal, but not to their proposals.
According to Tomasello 2018, the infant grasps others’ beliefs because he “disregards his own (diverging) knowledge”. In my view, this reason is not convincing, since disregarding the knowledge of the situation in which we find ourselves is at any age an inadvisable inattention. But it is also true that, as Tomasello argues, if one’s own mental content, instead of being disregarded, is simultaneously carried with somebody else’s content in one’s own mind, then the two contents must be distinguished and compared by the subject, and thus, we would be identifying the primitive mode with the advanced one –an identification which I am opposed to.
Let us look at Southgate 2020, which, while relatively similar to Tomasello 2018, is more recent and elaborate. Southgate (who, unlike Tomasello, doesn’t mention the experiments about ‘foreign false beliefs’ in apes) proposes that “human infants have an altercentric bias, which results from a combination of the value that human cognition places on others, and an absence of a competing self-perspective”, and that such bias causes the events that are not co-witnessed with the protagonist of the play to be encoded with less strength. (About the altercentric bias, Southgate cites Bräten 2004, and we could add Gallese 2018.) This is what explains, according to Southgate, infants’ successes in non-verbal tests of false belief.
I will start by saying that I very much like the idea that for infants, ‘altercentrism’ is beneficial, since it helps them to know what is relevant to others. However, I reject the alleged “weakness of self-perspective” for the same reason I rejected Tomasello’s proposal that the infant “disregards his own (diverging) knowledge”. Note that typical perceptions are evolutionarily much older than altercentrism and are used at any age much more frequently. Thus, it is unlikely that the degree of conservatism that evolution necessarily includes fails there. Certainly, while infants typically pay a lot of attention to what others look at, they sometimes do not pay attention to the change of location of an object. However, according to my proposal, such lack of attention would only appear if the object is not salient enough for the subject, and, therefore, it would not be a consequence of ‘altercentrism in the strong sense’ (i.e., ‘weakness of self-perspective’).
Let’s also focus on Lurz et al. 2022. This article –quite different from Tomasello’s or Southgate’s– proposes that apes’ success can be explained in “a simple way: Apes don’t use meta-representations, but they merely simulate (/imagine) to believe what the other agent believes”. But note that this simulated (/imagined) belief or “low-level simulation” (as Lurz et al. say) requires dealing with two contents about the same thing and distinguishing each from the other. Thus, this task, as implicit as it may be, is not “a really simpler model”, as these authors defend, but is still a meta-representation.²
Despite rejecting Tomasello’s and Southgate’s idea that infants and apes “disregard their own diverging knowledge”, I accept that the union of “inattention to one’s own mental states” and “attention to somebody else’s mental states” characterizes the primitive Theory-of-Mind. But I will propose that such inattention and such attention take place, not at the content-level, but at the expectation-level.

3. Expectations and Vicarious Expectations

After having criticized those three articles, can we keep the idea of a qualitative difference between apes’ and humans’ Theory-of-Mind? Let’s focus on Karg et al. 2015, which, as said above, shows that chimpanzees can use self-experience to infer what another sees. Probably, on the one hand, they activate their own expectations about what they would see if they were in the same location and circumstances as their peer, but, on the other hand, they process such expectations as belonging to the observed peer. These would be expectations of a special –vicarious– kind. But what exactly does a vicarious expectation consist of?

3.1. Expectations in General

Let us begin by attending to expectations in general. These, mainly since Bar 2007, are more often called ‘predictions’ (Latin prae-dictio: said or evoked in advance), a term that I don’t like to use for non-human animals because of the view presented in the next lines. I borrow ‘innate or learned expectations’ from Lorenz 1966.³
General expectations, mainly the goals, are a vital resource to guide behavior and –as ‘teaching mechanisms’– also learning in any animal. The matter is how expectations act in radically prelinguistic minds (and possibly also in our most spontaneous mental processes), while expected things are absent. Probably, instead of proposing that the animal agent has a simulation (or evocation, or off-line copy) of expected ‘things/events’, it could be helpful to understand such ‘presence of absent elements’ in a less demanding, non-evocational mode.⁴ Therefore, we could describe them as well-defined but empty profiles hierarchically arranged according to their lesser or greater degree of dependency on learning. These empty profiles can recognize the appropriate content when it arrives.
Okasha 2022, like other researchers, claims that ‘the mental representations of goal’ in avian and mammal species are objective facts, and he justifies such claim “on grounds of (their) evolutionary continuity and neuro-physiological similarity (with humans)”. But I, doubting that those “grounds” are enough of a guarantee, suggest the following alternative. It was the very beginning of the new lifestyle –that is, the initial strong increase in cooperation and communication– that made the full representation of goals more and more advantageous: Individuals needed to communicate their displaced goals to their group, so that the group could cooperate towards reaching that goal.⁵ The need for producing and understanding such communications could progressively make evocation of the goal easier. In short, ‘well-defined, empty profiles’, which had been sufficient in the old lifestyle, no longer were.
But this whole view is opposed to Fodor 2007, i.e., to ‘language-of-thought’ or ‘innate mentalese’. Therefore, now I must focus on the contrast between this and my underlining of the crucial role of communication. In this way, I am close to the goal pursued, for example, by Fedorenko et al. 2024, but not to their way of dealing with the relationship between language and thought. (The next sub-subsection offers some proposals about language, its origin and cognitive consequences. However, it is only later that I will propose what I call ‘the new –even prelinguistic– communicative reception’, which is the point where apes’ Theory-of-Mind had to be transformed.)

3.1.1. Does the ‘Language of Thought’ Exist?

Fodor 1975 postulated that an innate ‘language of thought’ (discrete symbols, and syntax) supports perceptions (even if these, unlike discursive representations, have no canonical decomposition, as Fodor 2007 adds) and makes evocations possible. Certainly, this idea constituted a root of artificial intelligence, and this fact explains its current revival. However, I lean towards rejecting the existence of the innate ‘language of thought’. More concretely, in some works, I have opposed not only innate syntax, but also innate semantics, since our semantics is indelibly shaped by syntax: See Bejarano 2008, Bejarano 2010, Bejarano and Bejarano 2011 (chapters 10-16). Without syntax, there aren’t nouns / verbs / adjectives, etc.: There are not even nouns –my proposal insists against a deep-rooted idea that influences Hurford 2007, for example.⁶
Fodor’s theory, although not focusing on the origin of human cognition and language, closely channels the hypotheses about such origin. Therefore, facing that theory requires facing its potential for derivations. And hence this section is going to be longer than what one might think at first glance is appropriate.
Phillips 2024, trying to explain language-of-thought, proposes that “(perceptual) data are projected onto a base (conceptual) space in one direction, and in the opposite direction, these data are referenced by that space”. I agree, of course, that the elements (objects, qualities, relations) of a perception are recognized by the individual who perceives. However, in my view, no independent element is used in genuinely prelinguistic perceptions. That is, while in linguistic understanding, each of the meanings receives independent attention before they are integrated into the total meaning, in those perceptions, on the contrary, such attentional, non-subpersonal independence of each relevant element would consume time, and therefore, far from solving any problem, would be a detrimental feature. (Certainly, we, using other human abilities, can attend slowly and long to any perception –let alone an artistic painting. However, perception evolved for survival in a world where rapid response is crucial.⁷) In addition, I think that in preverbal human infants (Cesana-Arlotti et al. 2018) and in animals, the so-called ‘logical reasoning’ or ‘intuitive logical reasoning’ requires neither decomposition nor compositional explicitation. Or if (as Durdevic & Call 2022 propose) “deductive reasoning, rather than relational or belief reasoning, is so far the best candidate for a human-unique derived cognitive ability”, it is because deductive reasoning requires syntax and syntactic semantics.
But if I reject, not only the innate language-of-thought, but also its innate ‘semantics’, then I must try to give an alternative account of the emergence of language. Before that emergence, there were –I propose– pre-syntactic (that is, holophrastic) ‘requests for a certain material’ or ‘calls to a certain individual’, which would use pre-words, i.e., meanings always linked to conative function and conative intonation. These holophrases could sometimes (when the individual was absent or the material was gone) reveal the speaker’s false or outdated beliefs to the listener, and, therefore, provoke the theme/rheme composition, which corrects, completes or updates those beliefs (and is ‘meta-communicative’ in the minimal meaning of Dingemanse & Enfield 2023).
The emergence of this pre-grammatical syntax would have been helped by a new and broader intonatory pattern that girds into a single unit the theme and the rheme. (This suggestion fits well with the link between intonation and semantics: “Regarding prosodic cues that correlate with distinct communicative function, the brain responds very rapidly, but not in communicative situations without semantic content”, Tomasello (Rosario) et al. 2022.⁸) Such intonatory help –a case of physical, pre-symbolic embodiment in human communication– probably facilitated the victory of voice over gesture –Bejarano 2014– (an evident victory, even if gestures continue to accompany and complement vocal communication).⁹ Thus, a complex management of the two different levels of packaging (the word-level and the intonatory level) became necessary. It is worth focusing on that.
Linguistic structure, including hierarchical structure, is “a special case of structured action” (e.g., Planer 2023). In addition, Gallardo et al. 2023 propose: “In Broca’s area, an action-related region evolved into a bipartite system, with a posterior portion supporting action and an anterior portion supporting syntactic processes”. Could we then suggest that the structured action immediately and directly linked to syntactic structure is the action of managing the two different levels of packaging? Let’s note that Osiurak et al. 2021 assign the “technical” dimension (not the “motor” one) of actions to Broca’s area. This hypothesis is also consistent with the theory that recursive skills (Corballis 2011) go beyond language and explain the evolution of several human processes.
The new and broader intonation that girds into a single unit the theme and the rheme resulted in a duality of different sounds for the same sign (with the conative intonation in holophrases, and with the non-conative one in the genuine word used in pre-grammatical syntax). With this perhaps the problem arose of how to identify the same meaning in two different vocal patterns. The final solution could be the learning of articulatory-phonetic sequences, which are able to be produced with one intonation or another depending on the circumstances.¹⁰ In this way the ‘super-high fidelity copying’ could perhaps arise.¹¹ Obviously, “this type of imitation makes sense in intransitive or object-free actions” (Heyes 2021b). It is a “mimicking” (Tomasello 1999) of ‘conventional’ motor sequences. Regarding the ‘high-fidelity copying’, I agree that it was not necessary in the earliest technologies (see Andersson & Tennie 2023, Osiurak et al. 2022, Sterelny 2023 and Tennie et al. 2016). However, if, as I have just suggested, the strictest motor imitation (i.e., the super-high fidelity in articulatory-phonetic sequential imitation) was really essential for the deployment of syntactic language (and, therefore, also of ‘collaborative computation’, Dor 2023), then such imitation is a very important cause of the human cultural advances.
Regarding a deeper link between language and those advances, I suggest that the predicative, really compositional language –beyond making communication easier– is likely to strengthen complex innovations, since these may be supported by the same cognitive resources (of decomposition and recomposition) used in syntactic language. Note that the primary cause of cultural advancement is not the ability for know-how copying (see van Leeuwen et al. 2024¹²), but the ability to produce innovative solutions, mainly through creative problem-solving (although more serendipitous processes, e.g., of drift from copying error, can sometimes lead to improvement of previous results).
I have no intention of turning into a proposal the previous suggestion that ‘syntactic language, or, more precisely, its cognitive resources of decomposition and recomposition, help to support creative –even in non-linguistic areas– innovations’. Anyway, I’ll bring some quotes. “Members of modern Homo sapiens can mentally combine and recombine symbols, according to rules, not only to consciously describe the world as it is, but to generate new visions of it as it might be” (Tattersall 2023) (who, in my view, excellently describes the consequences of syntactic language). Vyshedskiy 2022 highlights the same issue, namely, ‘the voluntary imagination component of language’. This imagination must –I would add– be used even in simple receptions of theme / rheme, since, for the typical, i.e., non-informed addressee (versus the atypical, perfectly informed one), the content provided by the theme doesn’t include yet the rheme, and thus, the addressee will have to imagine a new situation that he/she has not perceived. A clear example: the reception of “The blanket turned to ashes”. In addition, note that, for this communication, the real blanket (or, more exactly, ‘the blanket for the speaker’) is decomposed in two elements –firstly, ‘the addressee’s false belief about the blanket’, that is, an inadequate means to reach the producer’s communicative goal, and secondly, ‘the adequate correction or updating’. Thus, it is communicatively recomposed. This is an ability to transform others’ mental contents. Returning to the previous suggestion: Could that ability later –and more creatively– be exercised on one’s own mental contents and support difficult problem-solving? Thus, in addition to connecting –in my nuclear proposal– human Theory-of-Mind with human communication, I also suggest connecting it with creative problem-solving. (This latter ability is more general than “human causal cognition”, which Gärdenfors & Lombard 2020 persuasively connect with technology.)
But let us return to our thread. What have I achieved in this section? Having said above that the innate mentalese is incompatible with my proposals, I unfortunately have not offered any strong argument against it. However, I have tried to show that an alternative hypothesis may also have potential for derivations. So, if it is accepted that the innate language-of-thought is not necessary, then expectations can be more easily described as empty profiles. (If, then: Needless to say, this article uses the hypothetico-deductive method and is based on data from several disciplines.)

3.2.1. Can the Metaphorical Description Also Apply to Vicarious Expectations?

So far, we have dealt with expectation in general, which is inseparable from any animal life, and involves extremely basic competences (for instance, the physical understanding of the effects of gravity, or the daily exposure to the principles of causality). But what is interesting for us –what can, in my view, connect with apes’ Theory-of-Mind– is only the vicarious expectation. Thus, we must focus on the following question: Can the metaphorical description (‘well-defined but empty profile’) also be applied to vicarious expectations? Such application seems plausible. Note, for example, that such emptiness can explain why ‘level II perspective-taking’ is absent in the primitive Theory-of-Mind.¹³ Rakoczy 2022 underlines in his general review this absence. See also Woo et al. 2024.

3.2.2. An Argument That Favors That Application: Primates’ Mirror-Neurons

To favor that affirmative answer, I will try to show that vicarious expectations derive quite directly from a particular non-vicarious expectation. Let us focus on macaques’ ‘mirroring’, and, more concretely, on its origin according to Keysers & Perrett 2004. Hands are (together with the forearm) perfectly visible to their possessor, who must look at them very attentively during the actions of grasping. Thus, the proprioceptive and tactile feedback of any grasping will end up being connected with the visual perception of that movement.¹⁴ There are growing indications that this proposal is correct. Thus, Heyes & Catmur 2022 conclude that the abilities of mirror-neurons are learned “through correlated experience of seeing and doing the same actions in the context of self-observation”.¹⁵ In addition, the alleged neonatal motor imitation, which tried to enhance the innatism of mirroring, is increasingly challenged –Oostenbroek 2023.
If Keysers & Perrett are correct, then we could underline that, while visual / proprioceptive connection is forming in a macaque, it is still a non-vicarious expectation: It is the macaque’s immediate grasping that activates in him the (general, non-vicarious) expectation of the two versions –visual and proprioceptive– of the adequate ‘feedback’. But, when the visual version is given without the corresponding inner sensations –that is, when it is someone else’s hand–, the subject must disengage from himself the hand that is in sight. (Could this fit with the neuroscientific findings of Pomper et al. 2023?) Such disengagement is confirmed by the results of all the rubber-hand experiments: See e.g., Pfister et al. 2021: “A single tactile stimulus applied to the rubber hand –but not to the real hand– triggers substantial and immediate disembodiment”. But this ‘disembodiment’ (or exclusion from one’s own body) does not only concern the hand in sight, but also the proprioceptive and tactile expectations which the observation would have activated in the subject, and which this subject now needs to process as ‘belonging to other’. This is when vicarious expectations and the very beginning of the primitive Theory-of-Mind would arise. In short, while it is typically emphasized that “mirror-neurons map other-related information onto self-related brain structures” (Bonini et al. 2023), I underline the later, inverse mapping (my own failed proprioceptive expectations become vicarious expectations automatically processed as belonging to another).
Certainly, the vicarious expectations that I propose to attribute to apes concern the entire body, not only the hand. However, this could be an irrelevant difference. Piaget 1954 showed that it is from hands and (since hands bring food to mouth) also from mouth that the child builds correspondences between his own body and other bodies. In addition, Errante et al. 2023 have found (in human participants) that “actions-observation activates specific cortical and subcortical sectors not only during hand actions observation but also during the observation of mouth and foot actions”. In this way, if the hypotheses seen in this sub-subsection turn out to be correct, then we could deduce the desired conclusion –i.e., vicarious expectations are directly derived from non-vicarious expectations, and, therefore, if it is accepted that this latter type is an empty profile, then the same has to be accepted with respect to vicarious expectations.
What do I finally get from all this? If vicarious expectations –instead of requiring imagined (/simulated /evoked /off-line) contents– are ‘well-defined but empty states’, then no meta-representational separation between contents and vicarious expectations is necessary, and also then, the contrast between vicarious expectations and foreign contents can support the contrast between apes’ and humans’ Theory-of-Mind. But here we must add some clarifications.

3.2.3. Some Clarifications on Vicarious Expectations

In 2.2, I proposed that it is ‘the subject’s own expectation that is absent’ when the subject activates vicarious expectations and encodes them as ‘belonging to other’. Why do I use “absent” (instead of “disregarded”, the term applied by Tomasello to ‘the subject’s own knowledge’)? Let’s remember that behavioral activity necessarily activates expectations of goals and sub-goals. Therefore, “inattention to one’s own expectations” can only take place when the subject, being behaviorally inactive, has no general expectation activated.16 Thus, confusion is impossible, not only between (empty) vicarious expectations and one’s own (full) mental contents, but also between the two types of the subject’s expectations –(absent) general expectations and (present) vicarious expectations.
I add another clarification. The so-called ‘attribution of ignorance’ in the primitive Theory-of-Mind does not require any resource different from vicarious expectation. The mere ‘absence of vicarious expectations’ –when, for example, the other chimpanzee has not seen the food– can explain why in that case the (subordinate) subject goes (as Tomasello et al. 2003 showed) to food. This view is close to Barone et al. 2022, who studied early implicit measures of false belief understanding: “The results from a new ‘Ignorance’ control condition in which children largely behave like in the ‘False-Belief’ condition, suggest that the epistemic state ascription does not amount to full-fledged belief attribution. Rather, children probably merely track knowledge versus ignorance”. In addition, basic and implicit ToM capacities seem not to be the same ones as those tapped in standard explicit false-belief tasks, since –as Poulin-Dubois et al. 2023 found– there is no stability in Theory-of-Mind skills from infancy to early childhood.
From Neuroscience, Schüler et al. 2024 say: “While the primitive Theory-of-Mind is supported by the salience network, it is the default network that supports foreign false beliefs and, more in general, the processing of internal, perceptually decoupled representations”. This is compatible with my hypothesis. Note that vicarious expectations, which are perceived in the body and movements of another agent, are really salient perceptions for the behaviorally inactive subject.17
The following clarification is also very important: If vicarious expectations are accepted, then we must accept that the self-other distinction is automatic in the primitive Theory-of-Mind. Let’s see Ereira et al. 2018, who worked with human adult subjects: “When another agent’s mental state is inferred, it can be identified as ‘belonging to other’ in two different ways.” One way is that “a learning signal (prediction-error or belief) is encoded in an agent-independent pattern. In this case, the learning signal and the identity of the agent to whom the signal is attributed would need to be encoded in 2 separate activity patterns.” This first way, with its meta-representational separation, would be linked to the advanced Theory-of-Mind (in Ereira et al.’s words, “to standard false belief task”). But these authors claim that, to identify mental states as ‘belonging to other’, there is another way, which operates “through an encoding of agent identity intrinsic to fundamental learning signals (my emphasis)”. This second way of self-other distinction (which in human adults is limited to the most spontaneous processes) would be, in my view, based on vicarious expectations.18
In a similar line to Ereira’s (but focusing on ‘altercentrism’), Tebbe et al. 2024 report: “A highly specific neural signature of visual object processing was also present when their view was blocked and only another observer saw the object”. This, which was found in infants and adults, could perhaps indicate that the visual vicarious expectation shared the empty profile that served the subject to search for and recognize the particular object.19 The core of the experimental design of Tebbe et al. 2024 should be applied to apes. (See above the paragraphs about Karg et al.’s articles.)
Vicarious expectations can include what Michael & Székely 2019 call ‘goal slippage’. In summary, the ‘slippage’ into the circumstances of the other, or the ‘disembodiment’ of expectations (i.e., the ‘exclusion from one’s own body’, Pfister et al. 2021) which the subject performs when the observed hand is a foreign one, or, as in Ereira et al. 2018, ‘the encoding of agent identity intrinsic’ to the mental state –all these terms– describe ‘vicarious expectations’. I would now add that such easy slippage is abruptly interrupted (both in humans and in apes) as soon as the other agent turns around and looks at the subject. We must not forget that, in a subject, vicarious expectations are incompatible, not only with behavior, but also with a high probability of immediate behavioral activation.
That rupture of the easy slippage is similar to what happens when, after having imitated (copying all his turns left or right) someone who walks ahead of me, I realize that he turns around and faces me. Certainly, humans can continue such bilaterally accurate motor imitation, but only if they start doing something different from what they were doing before (i.e., different from the mere ‘slippage’ into another location). More precisely, if I want to continue the imitation, I will have to imagine myself in a situation which is as intrinsically impossible for me as being in a different spatial relationship with myself.20 (Let’s think of the gesture of two individuals of shaking the other person’s right hand: Could this gesture originally involve –or try to provoke– the grasping of foreign mental contents?) All in all, the similarity of the collapse of the flowing “slippage” in the two mentioned cases –with imitation and without imitation– is clear.

4. Primitive and Advanced Theory-of-Mind

4.1. Working-Memory and Non-Verbal Tests of False Belief

Indeed, in the (so-called) ‘non-verbal tests of false belief’ there are successes (significantly above chance), but they are quantitatively limited. Regarding non-human primates, see, for example, Berke et al. preprint. In addition, ‘replication findings’ are mixed. This is why “a large-scale multi-lab collaboration will examine whether 18-27-month-olds and adults’ anticipatory looks distinguish between knowledgeable and ignorant agents” (Schuwerk et al., work in progress). About those difficulties in replicating successes, Rakoczy 2022 proposes: “There might be two classes of implicit tasks”.
How could we interpret all this? Certainly, regarding this matter, we must wait for new data. In addition, evolutionary emergence and ontogenetic development can never be identified (and even less so in our case, since “infants’ experience is already enlanguaged”, Dreon 2023). However, when we are asking a question so difficult to answer –when we are wondering about the evolutionary origin of human Theory-of-Mind– we should not rule out anything that can shed even a little light. Thus, I will add some brief comments.
In my view, those non-verbal tests require –regarding just Theory-of-Mind– vicarious expectations, and do not need meta-representation of foreign contents. In other words, –regarding just Theory-of-Mind, I repeat– such tests mainly depend on the primitive, easier one (even though sometimes, of course, adult humans apply to them the advanced Theory-of-Mind). However, they require other abilities beyond Theory-of-Mind. Thus, the two scenes (original location, changed location) and the consequent demand on attention and working-memory can often provoke a great difficulty in less motivated subjects.21 Such difficulty mainly appears if these are at the same time prelinguistic subjects. Note, please, that developmentally –and, very plausibly, also evolutionarily– the reception of multiple-word messages causes a great expansion of working-memory.
Leaving all this aside, let’s focus on the only nuclear proposal in this article. Certainly, all the other proposals or suggestions offered in the article can and should be evaluated in themselves. However, if I have included them here, it has been to support the main one.

4.2. What Made the Estimation of Foreign Mental Contents Originally Advantageous?

My proposal up to this point has been that the contrast ‘primitive versus advanced Theory-of-Mind’ equals the contrast ‘(empty, easier) vicarious expectations versus (full, more difficult) foreign contents’. Therefore, the following question arises: What made the estimation of foreign mental contents originally advantageous? As the reader can see, I believe that only if we explain the difference in function –and not only in features– between vicarious expectations and foreign mental contents can we move forward.
I propose the following three points. First, to support the adaptive advantages provided by apes’ Theory-of-Mind, vicarious expectations are sufficient resources. Second, since vicarious and non-vicarious expectations require previous, well-defined profiles in the subject that activates them, this subject cannot activate any vicarious expectation of mental states that are impossible for him in any circumstances.22 Third, your mental state of thinking of me as a foreign, distal individual, since it is a mental state that is impossible for me, cannot be a vicarious expectation of mine, and therefore, I will only access that state if I am able to estimate foreign contents. Above I proposed that ‘the situation of being in a different spatial relationship with myself’ is impossible for me. Now, a new example –your state of interacting with me as with a foreign, distal individual– would be equally impossible for me, but much more relevant for human needs.
Thus, we can reformulate our previous question in the following way. For what function was ‘the ability of estimating the foreign mental states that involve me as a distal, foreign individual’ originally advantageous? Or, more concretely: In the new lifestyle, were there problems that such ability could solve?

5. Self-Conscious Emotions

I am now going to separate (as other researchers have done) Theory-of-Mind from ‘false belief’ a bit, and to focus more on those emotions and other issues. “A developmental approach that focuses on a plurality of domains makes us able to generate useful insights that may not be obvious when focusing on a single domain”, Ruba et al. 2022. Or, in other words, a puzzle is more difficult if some pieces are missing. But let’s go back to work step by step.
“The thinking what others think of us” (Darwin 1872, about blush; my emphasis) necessarily requires, according to my proposal, the estimation of foreign mental contents, and therefore, the beginning of the advanced Theory-of-Mind. That phrase can describe self-conscious (or ‘self-other-conscious’: Reddy 2010) emotions, which are “embarrassment, shame, guilt, pride” (e.g., Lewis 2000).
Is there a common neurophysiologic signature for these four emotions? We can see Piretti et al. 2023. However, these authors unfortunately did not include pride –the only pleasant self-conscious emotion– in their study.
I am opting –it is already evident– for the idea that such emotions are originally based, not on an innate moral core, but on an interpersonal relationship. Thus, in self-conscious emotions (unlike in basic emotions23), the subject “thinks what others think of him”. Beyond Darwin’s phrase, Frith & Frith 2007 is essential: “The appropriate reception of deliberate social signals depends on the ability to take another person’s point of view. This ability is critical to reputation management, as this depends on monitoring how our own actions are perceived by others”.24 Indeed, when we experience self-conscious emotions, the contents of the foreign mind become more real, more relevant for us than any other reality in our surroundings. Cf. Peeters et al. 2023: “[O]bserver-memories are often associated with events where the memorizer experienced a high degree of self-awareness, such as during public speaking. This could be explained by appealing to the context of encoding, where the relatively intense emotions guide encoding towards an observer perspective.”
In this Section, firstly, I will argue that self-conscious emotions relate to the new, human lifestyle. They are “survival circuits” (as LeDoux 2012 and LeDoux 2023 describe the function of any emotion), but survival circuits of a very special type that evolved linked to the human lifestyle. Secondly, I will propose that self-conscious emotions require the estimation of somebody else’s mental contents.

5.1. Self-Conscious Emotions Are Useful in the Human Lifestyle

The new, human lifestyle is based on special cooperation and communication. Consequently, the care of one’s own reputation, and therefore, also an enhancement of self-control became crucial: Leary 2004 and Sznycer 2019. (All this did not replace “the old dynamics of social dominance, which are based on aggressive and submissive interactions” –Royo et al. 2024–, but was added to them. Hence, prestige is associated with evolutionarily new nonverbal displays: Witkower et al. 2020.) In this way, Baumard et al. 2013, who focus on “competition to be chosen as a partner in cooperative ventures”, practically identify the care of reputation with the habit of refraining from “blatantly selfish actions”.25 This refraining is certainly essential in the care of reputation. However, even in “cooperative ventures” other aspects are important –e.g., the reputation concerning good communicative abilities. In addition, beyond cooperative ventures, there are –see Crespi et al. 2022– other “arenas of runaway social selection” where reputation is equally crucial. We must also consider that when narrative language and thereafter the (negative or positive) gossip arose, the care of reputation became more intense.26
But let us pay attention to a different usefulness of self-conscious emotions. ‘The new, human lifestyle’ requires also the “deliberate practice” –Ericsson 2002, Rossano 2003– that is necessary to achieve any kind of cultural expertise. Here, a self-conscious emotion –pride– intervenes. Experts arouse admiration. (About the two types of admiration –for skill and for moral virtue–, see Algoe & Haidt 2009. About admiration –versus envy– for experts: Onu et al. 2016.) Therefore, experts experience the only pleasant self-conscious emotion –pride. See Sznycer & Cohen 2021, and Sznycer et al. 2017.27 The search for those attractive rewards can support, at least in some of the admirers, prolonged, effortful acquisitions, not only of the admired level of expertise, but also of a better one. This role of pride could become even stronger in “collaborative computation, which is the foundation of our cumulative cultures” (Dor 2023).
In addition to providing motivation in that way, pride could influence in an indirect, but still effective way. Progress towards a goal > Higher ‘self-efficacy’ > More difficult goals are perceived as possible. And a goal that is perceived as both difficult and possible (that is, a goal in Vygotsky’s zone of proximal development –Vygotsky & Cole 1978) can improve the subject’s level. In other words, “there may be a positive relationship between difficulty and progress when self-efficacy is high”, as Thorme et al. preprint will try to confirm.
Certainly, children at first are concerned with learning by observing their parents. However, from about age 8, they switch to copying the local expert instead.28 This tendency is probably universal (Henrich & Broesch 2011). Expertise, despite not influencing automatic imitation (Nevejans & Cracco 2022), can cause a desire to acquire such expertise, and in that causality, “admiration is more decisive than prestige bias” (Chellappoo 2021). In addition, let us look at experimental results by Brinums et al. 2023: “Children that were asked to imagine succeeding in the test and to focus on what they will be feeling (Emotional Condition) practiced longer than those in the Non-emotional Condition.” More in general, Shimoni et al. 2022 report that a strong link between delay of gratification and pride has been found among preschool-aged children, an age at which self-regulation abilities are still developing.
Thus, pride can, I propose, support the cultural advances. Pride is a reward that subjects get when they see the admiration with which they are looked at by the group –a reward that the subject, of course, will seek to obtain again. Certainly, there are other rewards for an outstanding skill. E.g., André & Baumard 2023 –who do not underline the causal role of pride, or, more concretely, of its pleasant nature– focus on “reputational and material benefits to the recognized artists”. However, the pleasure that pride provides, being less deferred and more easily evocable than those benefits, could originally be the best resource to support the prolonged effort that an outstanding skill requires. “Regarding, for example, the learning of post-Acheulean shaped stone tools, we should be concerned to explain the hours of effort with little or no short-run return” (Spurrett 2024). About this, Castro & Toro 2004 and Castro et al. 2024 talk about a reward –parental positive evaluation–, and also Sterelny & Hiscock 2024 (in their reply to Spurrett) focus on children. However, such focus is useful only to support the basic acquisition of skills, not to sustain the attempt to surpass the previous level of the group.
Therefore, pride –I propose– could be an important cause of the innovations that gave rise to our cultural advances. (Mere serendipity, in my view, would have had only a moderate influence.) Thus, the two features of the new, human lifestyle described above (in the Introduction) would be supported by self-conscious emotions. In other words, not only are the negative self-conscious emotions partially responsible for its ‘social’ feature (as is generally admitted), but the only pleasant self-conscious emotion also has a strong influence on its ‘cultural’ feature.29
In short, self-conscious emotions support self-control, which is necessary in different aspects of the new lifestyle.30 Certainly, self-control will be bolstered later by ‘speech directed to oneself’ or, even later, by ‘inner speech’ (see Bejarano 2022, in its section 4), and can be put at the service of any type of goal (even the goal of exercising what I call –see previous note 18– the most demanding moral capacity). Probably, those very special types of speech originally arose when the gossip (which “gives gossipers an evolutionary advantage”, Pan et al. 2024) spread more and more. However, before ‘speech directed to oneself’, self-conscious emotions were crucial for the growth of self-control in humans.

5.2. Self-Conscious Emotions and the Estimation of Foreign Contents: The Two Connections Between Both Traits

Now, let us move on to the link between self-conscious emotions and the ability of estimating foreign contents. I propose that if the human being can experience self-conscious emotions, it is because he is capable of imagining himself in a situation as impossible as seeing himself as a distal, foreign element. (An earlier, more embodied version of that imagining was offered in 3.2.3.) Thinking what others think of oneself requires the ability to estimate other people’s mental contents: Vicarious expectations would have been useless there. This is the first of the two connections mentioned in the title of this subsection.
Let’s move on to the second connection. Having opted for the idea that originally such emotions were based on an interpersonal relationship, I suggest that, very likely, such interpersonal relationship originally occurred as a prelinguistic intentional communication, that is, as expressive ‘gestures or vocalizations’ accompanied by gazes. (I agree with, for example, Bohn et al. 2022 that the main link between the kinds of signals our human ancestors used and human language “is the interaction engine”. More in general, I start from Tomasello’s claims that human uniqueness is previous to language.31) Such prelinguistic intentional communications –for example, ‘gesture of disgust (/ surprise) + eye-contact with the addressee’– could have caused unpleasant (/ pleasant) self-conscious emotions in the addressee. (I’m not really proposing these examples, but just putting them here to facilitate the exposition.)
Such productions are “simultaneous multilevel communications”: Lipschits & Geva 2024 (who also underline the decisive role of the adult-receiver). More concretely, in such communications the intentional level would control and use the behavioral one and even the autonomic one, i.e., those movements or expressions that originally were not intentionally communicative. This transformation of the old levels makes ‘the dissociation between expression and intentional communication’ “murky” (Warren et al. 2023), or, more concretely, there is no such dissociation at all in the intentionally communicative production of the great apes.32 The proposal that “in such communications the intentional level would control and use the behavioral one” (Lipschits & Geva 2024) is similar to ‘the recruitment view’ about the origin of great ape gestures –“Great ape gestures recruit features of their existing behavioural repertoire for communicative purposes”, Graham et al. 2024.
Certainly, the prelinguistic intentionally communicative messages that caused self-conscious emotions in the addressee stand out due to their special importance (focused on in 5.1) for the development of the new, human lifestyle. However, as communicative productions, they are examples just like any other within apes’ and infants’ ability. Despite this, we need to underline such messages: Note that, while the above cited phrase of Darwin perfectly serves, with its “of us”, to distinguish what vicarious expectations cannot do, it nevertheless ignores a basic question –how the human subject originally comes to think what others think of him– and therefore, it can’t get us to the human communicative reception, which is an (or the?) essential root of human uniqueness. So, henceforth this subsection will focus on that root, and in this way will give a second argument in favor of the link between self-conscious emotions and the estimation of foreign contents.
As seen just above, in non-human primates the intentional control of the behavioral and autonomic levels can occur in production. Thus, in the very beginning, human communicative uniqueness only happens at the reception, I would add. In other words, according to my proposal, it is the recipient who originally needs to strive –and to estimate foreign mental contents.
This proposal (it is the recipient who originally needs to strive) may seem like a way to escape from the controversy between, on the one hand, Scott-Phillips & Heintz 2023, who agree with Grice that “the communicative producer typically intends that the recipients recognize his/her communicative intention” and, on the other hand, Moore 2015 or Geurts 2019, who reply that it is only to hide his/her communicative intention that the producer must strive. However, in my view, that ‘second Gricean requisite’ is not the best terrain on which to focus on the very origin: Note that, while Grice starts from a clear contrast between natural and non-natural signs, I propose that it is the transformation of a ‘natural’ (or rather, returning to Lipschits & Geva 2024, merely “behavioral” or even “autonomic”) sign into a ‘non-natural’ one –precisely such transformation– that must be recognized by the addressee.33 This transformation is just the “behavior of marking entities (e.g., objects and actions) as communicative” (Mussavifard, preprint). However, at the very origin, the recognition of such ‘marking’ (i.e., its understanding by the addressee) required an evolutionary transition: This is my point.
But let’s get back to our thread. What I really propose is (as in subsection 4.2) that, if an addressee identifies through vicarious expectations the outcome that is intended by the producer, then this addressee will not be able to perceive the producer’s behavior as a communicative behavior towards him/her –i.e., towards the addressee. Therefore, the eye-contact that typically accompanies chimpanzees’ intentional communications to an addressee will be, of course, understood by the ape-addressee as a communicative resource, but it will not be applied to the behavior that activates vicarious expectations. This non-unified reception is certainly more hazardous and less effective than human, unified reception. However, if, as I propose, the non-unified one exists, then it sometimes must produce the result wanted by the producer.
But my proposal can only be defended if we find which is the condition which allows some intentional communications of that type to be successful –i.e., allows that they get the addressee to satisfy the producer’s desire. The proposal makes the following prediction: In such successes, the behavior with which the ape-producer tries to manipulate the addressee’s attention toward evidence of the intended outcome –that behavior or resource– may be well understood even if it is not perceived as communicative. If that were so, then we could hypothesize that failures do not derive mainly from a deficient ability for pragmatic interpretation (even if interpretation, “in a novel situation, requires the integration and assimilation of multiple pieces of information to guess at outcomes”, Warren & Call 2022), but above all from the limitations of non-unified reception.
Melis & Rossano 2022 –as others had done before– claim that monkeys’ and apes’ communicative production is better than reception. These primates can intentionally produce messages to ask, or even show (“A female adult baboon tries to draw the attention of her offspring toward the piece of fruit that she waves between her fingers”, Meguerditchian 2022).34 However, when the non-human primate receives the message that is addressed to him, he cannot –I propose– grasp that ‘such action of trying to draw attention’ is simultaneously ‘mental’ and ‘addressed to him, i.e., to the recipient’.35
Returning to the purpose of finding the condition which allows some intentional communications of that type to be successful, I will start by recognizing that such a task is a difficult one. Leavens et al. 2005, studying their captive but untrained chimpanzees, have found that ape-producers no longer use pointing gestures as soon as the recipients leave, and confirm, therefore, that those communications are intentionally targeted at the addressee, but nothing is said about reception, because the addressee is human. By contrast, in Hobaiter et al. 2014, the addressee of the pointing gesture is the chimpanzee-producer’s mother, but in the case observed, the mother did not satisfy the desire. (She probably did not as it would have been risky –we can suppose–, but, anyway, this case cannot be used as an example of successful communication.) Loud scratch, despite its great relevance, doesn’t seem to help us enough either, since it has typically been regarded as ritualized. (However, in this case, we can remember ‘the recruitment view’ (Graham et al. 2024), and also a question that was raised above, in 3.2.2 –Is it the case that only primates possess vicarious expectations? If it is so, then loud scratch could activate vicarious expectations instead of the general expectations that are activated by the overwhelming majority of animal ritual signals.)
Does all of this constitute an insurmountable obstacle? I believe it does not. We –again– must consider that, if those gestures or behaviors could never be understood by apes, then they would not be produced by wild (Hobaiter et al. 2014), and captive but non-trained (Leavens et al. 2005) chimpanzees either.
We can see that those productions occur when a very conspicuous obstacle (the cage in Leavens, or the dominating individual in Hobaiter) prevents the producer from satisfying his/her goal. Therefore, ‘the behaviors that try to signal the purpose of the producer’ can be understood by the ape-addressee as behavior which merely responds to the producer’s goal (although, due to the obstacles, he, the producer, was unable to achieve such goal). Or, describing it according to my proposal: Those behaviors easily raised vicarious expectations in the chimpanzee-addressee, and did not need to be understood as communicative by that addressee.
The non-unified reception may seem surprisingly inappropriate. However, it was –I suggest– kept in apes for two interrelated causes. Firstly, in apes’ lifestyle, the non-unified reception, despite being suboptimal, is a sufficiently useful resource. Secondly, the change to unified reception requires a new ability and probably also brain modifications that allow the duality of contents.
A clarification may be useful here about the unified, human reception of that type of prelinguistic communications (i.e., the unification between ‘gaze towards the addressee’ and ‘behavior that tries to signal the outcome that is intended by the producer’). While such reception must already be supported by the estimation of foreign mental contents (or, more concretely, of a foreign thought that interacts with the addressee subject), it is still different from predicative language. Note that, on the one hand, only predicative communications are primarily used to correct (or complete or update) the addressee’s beliefs. On the other hand, the role that the gaze towards the addressee fulfills in those human prelinguistic communications is dispensable in linguistic communication: The non-natural feature of linguistic signs is sufficient to reveal that they have an intentionally communicative function. (See above, in this same subsection, the debate about the second Gricean requisite.)
Therefore, the predicative language (the only communicative function that absolutely requires syntax and syntactic semantics) could mark a new stage, which would be characterized by more working-memory (see above, 4.1, and, first of all, Coolidge 2023), and –I suggest– also by constituting an interpersonal, easy precedent for creative problem-solving (see above, the end of 3.1.1). Certainly, the role I have proposed above for pride would have begun before creative problem-solving, and continued afterwards. However, creative problem-solving, which transforms the subject’s own mental contents so that they become adequate for solving the problem, could correlate with the emergence of more decisive innovations.
In humans, the non-unified communicative reception is practically absent. The human addressee who possesses the advanced Theory-of-Mind can not only activate vicarious expectations, but also estimate foreign mental contents. Let’s apply this –if only to close the argument– to self-conscious emotions. I have accepted that, for communication to cause self-conscious emotions, the recipient must estimate the interiority of the producer –i.e., a foreign interiority which is communicating with him, the recipient, or thinking of him.36 But if the recipient’s ability to estimate foreign interiority is reduced to the activation of vicarious expectations, then that ability –I propose– will not be able to apply to a foreign interiority which is at that very moment communicating with –or thinking of– the recipient.
In conclusion, self-conscious emotions 1) support the ‘cultural’ and ‘social’ features of the new, human lifestyle, and 2) are linked to its most basic and crucial feature, which is the new, advanced type of communicative reception. In the Introduction (when I focused on the question, ‘What is “the new, human lifestyle”?’), it was stressed that this lifestyle needed increasing communication. In addition, I mentioned this need in relation to the issue of “collaborative computation” (Dor 2023). But now we can say that prior to that quantitative increase, the new lifestyle needed a deep change of communicative reception.

6. The Advanced Theory-of-Mind Beyond Its Origin

‘Thinking of foreign mental states that involve us as their distal addressees’ is, in my view, a requirement only for the origin of the human Theory-of-Mind. In fact, I propose that, once the ability to think ‘two lines’ of contents becomes strong, the advanced Theory-of-Mind can carry out complex functions which do not fulfill that requirement. Such complex functions are varied.
Sometimes they use foreign but non-interactive contents, as in verbal false-belief tests, which involve “a non-dialogic capacity of mind-reading” (Dor 2016). Note that in those verbal tests, the communicative interaction, instead of being between the subject who attributes the mental content and the ‘attributee’, is reduced to that which is established between child and experimenter. Regarding this feature of verbal tests of false belief, Gallagher 2015 states that “given the specific attraction of the second-person interaction (versus third-person perspective), the saliency of the interaction with the experimenter takes precedence over the third-person task”. Elaborating that contrast, Barone & Gomila 2019 conclude that second-person attributions of false belief (unlike third-person attributions –for example ‘The Ancients believed that p’) “are transparent, extensional, nonpropositional and implicit”.
By way of a parenthetical digression, I will comment on first-person beliefs. Regarding current first-person beliefs, if it is required that they possess the meaning of ‘believe’ that is habitually activated in second- or third-person attributions (‘He –mistakenly– believes that p’ vs. ‘he knows that p’), then we must say that originally, such first-person beliefs did not exist. In the beginning, for human subjects, their non-outdated beliefs are just the reality (and –in the beginning, again– their outdated beliefs are immediately replaced in an automatic way by the new perceptions, and so, the origin of the predicative negation was probably not intrapersonal but interpersonal). In short, the ‘believer’ cannot have first-person beliefs in the above-described sense, but only ‘knowledge’: On this point I agree with Phillips et al. 2020 (at least, for a primitive, prelinguistic sense of ‘knowledge’ –as Rakoczy & Proft 2022 specify). The concept of belief (and of some traits of character: remember what Ross 1977 called ‘fundamental attribution error’) emerged –I suggest– in an interpersonal way. In my view, the so-called ‘animal meta-cognition in great apes’ (summarized in Tomasello 2022; see also Tomonaga et al. 2023) is not a judgment on one’s own contents, but a mere hesitation about one’s own general expectations, or (as Edwards-Lowe et al., preprint say) “subpersonal uncertainty estimates”.37
Once the digression is over, let us return to “second-person (versus third-person) attributions”. According to my proposal, this type of attributions is included within ‘the advanced (or uniquely human) Theory-of-Mind’. However, I fully accept its great simplicity. As said above in 3.1.1, even pre-syntactic ‘requests for a certain object’ or ‘calls to a certain individual’ could reveal the speaker’s false beliefs to the listener: Therefore, those easy, second-person attributions of mental contents could provoke the origin of syntax.
Other times, non-original functions of the human Theory-of-Mind are not only non-dialogical. Indeed, they can use even non-foreign contents. These contents are either the subject’s beliefs which he no longer holds, or ‘possible’ contents, in any of the senses of ‘possible’.
However, according to my proposal, the uniquely human Theory-of-Mind originally arose from a directly relational, interpersonal process, which requires neither language nor experience with narratives. In my view, the linguistic modeling of Theory-of-Mind –(Heyes & Frith 2014; Moore 2020)– is a much later step, which required new discoveries. Among these, it is worth highlighting above all the irreducibly hypotactic ‘referred speech’, and the verbs ‘say’, ‘believe’, or ‘imagine’. (See Bejarano 2011, chapter 21.)38 Thus, the original ‘estimation of foreign mental contents’ is what cognitive archeologists recommend looking for, namely, a “component attribute” (versus ‘compound concept’): See Foley & Mirazón 2020.39 Likewise, my proposal on the origin of human Theory-of-Mind fits with the suggestion that “a priority for future research is to identify the genetic ‘start-up kit’ for the cultural inheritance of mind-reading” (Uta Frith, cited with approval by Heyes & Frith 2014; my emphasis).

7. The Advanced Reception of Pointing

7.1. Pointing Gestures in the Evolution of Language

In children’s acquisition of language, pointing gestures are important (Southgate et al. 2007 and Kishimoto et al. 2007). Since the child’s pointing gestures may often provoke linguistic comments from the adult about the signaled object, it is evident that those gestures create the ideal context for learning words. Note that, although the words that appear in the adult’s comments may be unknown to the child, the child can rely on the trick of knowing which object such comments refer to. But in the evolutionary origin of language –I propose– pointing gestures may have been even more important.
This Section, although it adds new arguments and data, repeats the hypothesis applied above. More concretely, in 5.2, I applied it to the reception of the communications that cause self-conscious emotions, and now I apply it to the reception of pointing gestures. I have considered it appropriate to delay dealing with pointing gestures since, while self-conscious emotions are almost unanimously considered uniquely human, things are very different regarding pointing gestures.
In addition, at the end of this section I return to ‘the cooperative eye hypothesis’ (Tomasello et al. 2007, built on Kobayashi & Kohshima 2001). Certainly, my proposal will put the evolutionary transition (i.e., my proposed transition to the human, unified reception of pointing) precisely in the process that unifies the two gazes –or, in other words, extends the communicative function of ‘the gaze towards the addressee’ to ‘the gaze towards the object’: Therefore, it fits well with the fact that human eyes make the horizontal travelling of the iris conspicuous. Likewise, such conspicuity is certainly an embodied resource, like that of the broad intonational pattern that was proposed above regarding the origin of syntax. However, despite all that, I’m not totally convinced that the human type of eye emerged in synchronization (or relative synchronization) with the beginning of the human Theory-of-Mind. In other words, while I am fully convinced that human eyes are very effective facilitators of the advanced, or ‘unified’, reception of pointing gestures, I have only a faint hope about that synchrony. Anyway, since the problem of when the transition occurred is so difficult, I strongly recommend that researchers in Paleogenomics try to answer the question of when the human-type eye appeared in evolution. As said above, we should not rule out anything that involves any possibility of giving us light.40

7.1.1. Responding to a Possible Objection: Pointing in Apes

On the one hand, I have proposed that the advanced Theory-of-Mind is uniquely human. On the other hand, we know that many chimpanzees raised by humans have been taught to produce pointing gestures and to understand them (even the declarative type of pointing: Lyn et al. 2011). What answer can I give to all this?
I will begin by admitting two indisputable facts. One, “human children display this ability to use communicative cues only after many months of intensive exposure to cultural environments characterized by frequent referential signaling, both verbally and nonverbally” (Clark et al. 2019). Two, the absence of pointing is not at all harmful in “apes’ lifestyle”.
From those statements, some authors conclude that in non-human primates that ability would be present, although scarcely exercised or developed. See Vasilieva 2019: “Not only the presence / absence of a trait, but whether it manifests in animals to the same degree as in humans, is equally important for our understanding of trait evolution”. The following example is offered by Heintz & Scott-Phillips 2022: “Human bodies are not especially well-suited to swing from trees. However, there is no absolute barrier.” Along the same lines, Berio & Moore 2023 recommend resuming great ape enculturation studies.
But, according to my proposal, it is only the effective, ‘unified’ reception of pointing gestures that is uniquely human. Certainly, in this way, I place as a vital criterion a process which is still unobservable, which may seem like a withdrawal towards “untestability with scientific methods” (Leavens 2021). However, as can be seen, the proposal relates to some facts and to several potential experiments and lines of research.

7.1.2. Authors Who, When Dealing with Pointing in Apes, Have Focused on Reception

The focus on reception is not new. Moore 2013 focuses on the receptive failure of apes and proposes that “since pointing gestures provide poor evidence for a speaker’s message, they exceed the pragmatic capacity of apes”. Likewise, Morrison 2020 emphasizes the ambiguity and necessary disambiguation of pointing gestures. I agree with these claims. But, in my view, those ‘poor evidence for the message’ and ‘poor pragmatic ability’ are insufficient to explain the frequency of receptive failures in apes.
Lyn & Christopher 2018 list three conditions in which the experimenter may point and in which reception by apes is successful to different degrees: “i) Proximal-Proximal: The choice items are close together and the point is close to the correct item. ii) Proximal-Distal: The choice items are close together, but the point is further away. iii) Distal-Distal: The choice items are further apart, and the point is therefore necessarily further away.”
According to that work, in Proximal-Proximal and in Distal-Distal, point-following can be achieved by simple mechanisms. However, “in Proximal-Distal, the best predictor of success is ontogenetically previous human social contact”. I would underline the fact that it is precisely in Proximal-Distal that the direction of the producer’s head (that is, the cue that chimpanzees use to estimate what others can see: Tomasello et al. 2007) is unable to signal the object.

7.1.3. Unlearned Production in Apes

Before focusing on the contrast between the two receptions, it is convenient to go again and in a more detailed way over unlearned production in apes. “Unlearned (i.e., with no explicit training whatsoever) captive chimpanzees frequently point to unreachable foods. These are communicative signals because apes will not reach towards obviously unreachable food if there is nobody around to see them do it” (Leavens 2005). In addition, in those chimpanzees a repeated gaze-alternation between the food and the experimenter was significantly associated with their pointing gestures.
Since then, Leavens and other authors have asked whether conditions like those considered decisive in the mentioned observations (cage and benevolent recipient) also appear in wild chimpanzees. Hobaiter et al. 2014 offer the following proposal: “Wild chimpanzees experience few physical barriers, but the presence of a dominant, unrelated chimpanzee monopolizing a particular resource may be a greater barrier to a young chimpanzee’s access than bars on a cage. To overcome this challenge, a juvenile’s only resource is another chimpanzee, mainly its mother.” Thus, they found a case in the jungle which they classified as “possibly deictic”. A possible conclusion: Wild chimpanzees that use this type of production with their conspecifics can thus achieve (at least sometimes) their goals.
Nevertheless, for such production to be a useful resource in the wild, it is necessary for recipients to deliver (at least sometimes) the desired object. Is it possible? Animal altruism is a controversial matter: see, e.g., Rendall et al. 2012 versus De Waal 2010. But I do not rule it out, as long as it does not cross the limits of the (always narrow) ‘spontaneous altruism’.41

7.1.4. Reception of Pointing Gestures in Chimpanzees and in Humans

Regarding the reception of pointing gestures in chimpanzees, I begin by highlighting that they understand the communicative value of gazes towards the addressee. Indeed “the sensitivity to being watched is both innate and shared by most vertebrates” (Klein et al. 2009). Thus, in the species that are able to perform ‘recipient-directed’ communication, recipients of that gaze understand that they are the addressees of this innate communicative resource. (But, while in gorillas, eye-contact communicates mild threat, in chimpanzees, by contrast, it is a friendly communicative resource.)
However, in the chimpanzee-recipient such communicative value is not applied –this was proposed above in 5.2– to the other element produced by Leavens’ or Hobaiter’s untrained chimpanzees, that is, to the gaze towards the object and hand/arm movements. The ‘gaze towards the object and hand/arm movements’ is, for an ape-addressee, a non-communicative behavior that can sometimes activate vicarious expectations in him (in the addressee). It is fair to acknowledge how implausible this description of non-human reception of pointing gestures seems to human intuition. The producer, both before and after making movements in a certain direction with his arm and head, communicates with the recipient by means of eye-contact. Why would the recipient not understand that the producer’s movements are communicative, or, in other words, that the communicative value of eye-contact is applied to those movements and gives them a communicative function? For humans, that unification of the two consecutive instants is obligatory and unstoppable, I acknowledge it. But is such unification present in chimpanzees?
As said above, the cage (Leavens et al. 2005) or the dominating individual (Hobaiter et al. 2014) make the chimpanzee’s gesture non-absurd for conspecifics even if it is not interpreted as communicative. On the contrary, our human reception of pointing gestures can be considered closer to that of communicative pantomimes.42 Tomasello 2008 stresses how strange any pantomime can be for a recipient if the gestures involved are not interpreted as being communicative (“the recipient will see my iconic gestures as some kind of strangely misplaced instrumental action”43), but he does not say it about our pointing. However, according to my proposal, in both cases the same problem arises for apes. As said above in 4.2, vicarious expectations –the only resource that, in my view, apes have to estimate the interiority of others– cannot involve any action that is impossible for the subject in which they are activated. Therefore, vicarious expectations cannot be understood by the subject –that is, by the ape-addressee– as involving communicative actions directed by the producer to him.
Now, let’s pay attention to the alternation between gaze to the object and gaze to the addressee. This alternation appears in apes’ and humans’ production of pointing gestures. In Leavens et al. 2005 we already read that in those captive but untrained chimpanzees the repeated gaze-alternation between the food and the experimenter was significantly associated with their pointing gestures. Even more important –of the utmost importance really: Paulus & Fikkert 2013 show that the necessary and sufficient element for human babies to first understand pointing gestures is not the hand-movement (or its situational / cultural variations –see Cooperrider & Slotta 2018), but the alternation between the two gazes. Thus, we must focus on it.
On the one hand, the ‘gaze towards the object’ causes the recipient to estimate what the producer sees. On the other hand, the ‘gaze towards the recipient’ (a.k.a. ‘eye-contact’) informs the recipient that he is being the addressee.44 In addition, inter-brain consequences of eye-contact in humans are increasingly studied. Pan et al. 2020 mainly focus on teaching. Di Bernardi Luft et al. 2022 stress that “inter-brain synchronization mainly flows from leader to follower”, and thus, from the producer of pointing gestures to the addressee. In general, second-person approaches underline eye-contact: Cañigueral et al. 2022.
But what must be highlighted is that in our human communicative reception, those two instants (‘gaze towards the addressee’ and ‘gaze towards the object’) cannot in any way remain separate, but they must be unified. The addressee has 1) to estimate what the producer from his place and in his circumstances is looking at, and 2) to understand that what the producer is looking for by looking at the object is to point at the object for him, for the addressee. According to my proposal, it is –as the reader already knows– in that unification where the problem arises for the ape-recipient. Let’s return one more time to the nuclear subsection (i.e., to 4.2). Certainly, vicarious expectations are automatically processed by the subject as belonging to the observed individual. However, since there can be no vicarious expectation of the results of an action intrinsically impossible for the subject, the recipient-subject will be unable to apply to such expectations an interpersonal communicative function towards himself.
Therefore, the unified, fully effective reception of pointing gestures will only be possible by the estimation of mental contents of the producer. Thus, there would be a common capacity to that reception and to that of prelinguistic messages that cause self-conscious emotions (and, in general, to linguistic reception45). That ability can be colloquially described as the one of ‘remaining in your shoes when you look at me’ (a description that highlights the similarity to a more embodied version –see above the end of 3.2.3– of the ability).
A preliminary test of these proposals could investigate in humans whether there is some relevant neurophysiologic similarity between the interpersonal activation of all (negative and positive) self-conscious emotions and the unified communicative reception of pointing gestures. If such similarity is found in the future, then the plausibility of the general proposal would increase. But it is convenient to specify that the proposed explanations of self-conscious emotions and of the effective, unified reception of pointing might each be evaluated differently by future discoveries.
In other words, in addition to total success and total failure, there are two other possibilities, namely partial results. Thus, it might be discovered that, while the proposal about the advanced reception of pointing can be maintained, the explanation of self-conscious emotions must be transformed –for example, rejecting their interpersonal origin and deriving their ontogenetic and evolutionary emergence from ‘an innate core’ of moral norms. Or, conversely, the result might be that, while the proposal about self-conscious emotions can be maintained, the effective, non-hazardous reception of pointing does not require any process of unification between ‘gaze towards the addressee’ and ‘gaze towards the object’ –because, for example, their mere succession might be enough for full effectiveness to be achieved through “the human pragmatic competence, which is greater than that of apes” (Moore 2013) or, alternatively, because human beings are much more inclined to gaze-following (an inclination that might derive from the salience of human eyes or connect with a supposedly prior, not subsequent, type of ‘Natural Pedagogy’: Csibra & György 2006). Anyway, for now, I bet on my proposal in the most ambitious way (or rather the most self-reinforcing one: to give a recent example, see ‘causal-association inferences’ in Currie et al. 2024), that is, applying it to both abilities.
Of course, in the beginning of ‘the new lifestyle’, several behaviors (not very different from the ones carried out by Leavens’ and Hobaiter’s untrained or wild chimpanzees) could achieve some degree of reception and could be useful for both producer and recipient. Let’s consider, for instance, the action of pushing a conspecific until we place him so that he can see a relevant object. These types of communicative production would have been multiplied in the beginning of the ‘new, cooperative lifestyle’, without the recipient grasping the simultaneously mental and communicative nature of the behavior yet. But this problem finally became accessible to co-evolution genes/culture. And so, the effective, unified reception of pointing gestures appeared, together with the estimation of foreign contents.46 Now, I will propose that the unified reception of pointing gestures is strongly facilitated by a little anatomical feature.

7.2. The Human Eye and the Unified Reception of Pointing Gestures

Tomasello et al. 2007 focused on the universally human white sclera, or, more precisely, on both its horizontal enlargement and its depigmentation, and proposed that these human peculiarities enhance “the visibility of eye-gaze orientation”. But gaze-following –a phylogenetically old ability– is carried out without the help of the white-of-eye. Indeed, Tomasello et al. 2007 showed in apes the reliance on head (versus eyes) in gaze-following. Likewise, Moore, Chris 2008 concluded from his experiments that when infants first start to follow gaze (at that age –note, please– they are still unable to receive pointing gestures), “they do so on the basis of head direction, not eye direction”.
Despite all those possible objections, Tomasello et al. 2007, putting ‘the enhancement of the visibility of eye-orientation’ in the evolutionary context of human special cooperativeness, hypothesized that humans evolved such unique eye morphology to facilitate joint attentional and communicative interactions among conspecifics. See also Wolf et al., and Yáñez & Gomila 2018, who, after underlining ‘the interactional importance of gazes’, add: “especially when oneself is the focus of that attention”, i.e., during eye-contact. I will specify this emphasis on cooperation and interaction to connect it with my proposal of the ‘unified’, effective reception of pointing gestures. Let’s start by redescribing “the enhanced visibility”.
Mayhew & Gómez 2015, Perea-García et al. 2019 (but see Mearing & Koops 2021) and Caspar et al. 2021 have proposed that the chromatic contrast in the human eye is not unique among ape species. But let’s focus on horizontal elongation. This feature may have evolved to allow non-arboreal primates to scan their environment widely. Nevertheless, such elongation together with the universal “totally/bilaterally white sclera” makes the location of the iris conspicuous not only in averted but also in direct gaze. In addition, “the eye-outline is easier to see in humans (than in apes) irrespective of skin color” (Kano et al. 2022) and this makes the location of the iris even more conspicuous. See also Prein et al., preprint, who conclude that human ‘gaze understanding’ is “based on the pupil location within the eye”. Thus, human eyes –this is my point– make the successive locations (that is, the horizontal travelling) of the iris conspicuous.
In this way, the continuity of the two gazes in pointing (or, in other words, the crucial –remember Paulus & Fikkert 2013– alternation between gazes) is really enhanced. It might be said that, when the producer moves his iris from the ‘gaze towards the object’ to the ‘gaze towards the recipient’, that movement is perceived by human recipients as if it were injecting the ‘gaze towards the object’ –and, consequently, also the vicarious expectations activated by recipients– into the ‘gaze towards the addressee’, that is, into the communication. So, the human eye would lead the human recipient of pointing gestures to unify the two instants –and, therefore, to estimate the producer’s mental states that, involving himself, i.e., the recipient, as their distal addressee, are intrinsically impossible for this addressee– and, therefore again, to estimate ‘foreign mental contents’.
In short, in my view, the human sclera is an anatomical, universal ‘facilitator resource’ of a mental process –the unified communicative reception of pointing gestures, of course– in the addressee. It is also a strong ‘facilitator resource’. These qualifications might raise the suspicion 1) that the ‘unified’ communicative reception of pointing gestures was the evolutionarily first function of the ability to estimate foreign mental contents, and 2) that this estimation –and the consequent ‘duality of mental contents’– was originally difficult and demanding. However, such deductions (let us not forget!) would require us to choose the option of the synchronic or quasi-synchronic emergence of human eyes and human Theory-of-Mind.
The depigmented sclera could become universal in an evolutionarily very short time, and therefore (if there was such synchronic emergence) the human sclera could arise in the same species in which the effective, unified reception of pointing gestures was beginning to emerge. But did it happen in Sapiens? And if so, did it happen at the very beginning of our species? Or later?47 Or did it emerge in Neanderthals / Denisovans? This can be a crucial question. I hope that Paleogenomics and Genomics specialists will answer it soon. Certainly, the depigmented sclera is a quite simple feature. However, its universality makes, of course, their task difficult.
If we follow the option –the faint hope, as I said in 7.1– that the peculiarity of human eyes emerged in relative synchrony with human Theory-of-Mind, then we could propose that this facilitator is an essential basis for any human communicative reception (i.e., our ability to understand messages as foreign mental states and, simultaneously, as addressed to ourselves). But such proposal can accept either that such basis –such estimation of foreign contents– emerged in Sapiens, or that, on the contrary, in Sapiens, only its derivations emerged (see above, in subsection 5.2, the separation between the human reception of prelinguistic messages, on the one hand, and predicative language, on the other hand, and see also Section 6), while the estimation itself had emerged in Neanderthals. In short, that option, in addition to being based on a ‘faint hope’, could predict only that the human type of eye will not be found in earlier hominins. Therefore, regarding Neanderthals, it does not possess a strict falsifiability. This is an extremely unfortunate fact, since it is just the Neanderthal genome that is being studied. Anyway (and returning again to 7.1, but now to the recommendation that “we should not rule out anything that can provide us even a little bit of light”), the question of whether Neanderthals –or even, as suggested in note 47, our species in its beginning– possessed eyes like ours should be answered. If such answer is negative, then it could give us a useful supply of light. But now all this is just a very faint hope.
I do not want this last paragraph, with its lack of confidence and pessimistic tone, to shape readers’ final impression of my hopes. Please remember that such a tone has not been the norm for this article. Indeed, as said above, I’m much more convinced of my general proposal than of the synchrony between the emergences.

8. General Outline

I) Animals do not evoke their goals. Expectation, which is an empty profile, is enough to guide their behavior.
II) The primate hand (which its owner can see, and needs, during his grasping action, to see) gives rise to a first novelty. When a movement is to be executed with the hand, there is not only kinesthetic and proprioceptive expectation but also a visual expectation. Thus, the sight of a foreign hand can activate the expectation of the normally concomitant kinesthetic and proprioceptive sensation. When at the very next moment, this error is corrected, those kinesthetic and proprioceptive expectations are automatically processed as belonging to the individual whose manual movement was observed by the subject. Vicarious expectation has appeared.
III) Vicarious expectation can, perhaps only in great apes, extend beyond the hand. In this more complex vicarious expectation the subject, having established a correspondence between his own body (felt but not seen) and the other’s body (seen but not felt), gains the highly adaptive ability to activate vicarious expectations about what the other sees from his position and orientation, even though at that moment the subject does not have access to such a visual field. (Of course, such evolved vicarious expectations will only be possible if the subject knows the area very well and has often been in the place where the observed individual is now.)
IV) All vicarious expectations, both original and other, remain what all expectations in general are, that is, empty profiles.
V) But with the human way of life, new skills become necessary, which vicarious expectations are unable to sustain. It is now necessary that communicative messages, though still prelinguistic, be understood by the recipient simultaneously as mental states of the producer and as addressed to him, to the recipient. It is communicative reception, not production, that originally required a great change. (In the communicative production of the great apes, a merely communicative use had already been given to behavior and movements that were not originally communicative.) Apes can estimate the mental states of another individual, since, as already said above, a particular type among all the expectations they activate in themselves –that is, vicarious expectations– is automatically processed as belonging to the other individual. But such an incipient Theory-of-Mind is not sufficient now. In human communication, the recipient has to grasp a thought that could never be his own under any circumstances, and could never, consequently, be an expectation of his: Note that the thought of the communicative producer necessarily includes the feature of being addressed to him, the recipient, as to a distal individual. Human estimation of the mind of others must therefore capture (full) contents, and not mere (empty) expectations.
VI) Part of this human communicative reception was applied to understanding messages that would trigger self-conscious emotions in the recipient. These very particular emotions emerged in large groups, that is, among individuals who were not permanently together, and where the behavior of one could surprise another. (Hominids who lived in small groups probably evolved only in another direction, that is, in the direction of greater social cohesion and spontaneous altruism.) The role of social control played by the three self-conscious emotions that are unpleasant is well known, and my proposal does not add any qualifications to it. But pride could lead to improvements in the group culture by an individual.
VII) With the emergence of this uniquely human type of communicative reception, communication becomes much more useful, and many more meanings are created. The first meanings in human communication had nothing to do with our semantics, since this is intrinsically shaped by syntax. The first meanings were only calls to someone in particular or requests for something specific, and they could not have any other intonation than that of a request or call. The message was made up of only one of these pre-words.
VIII) But these primitive messages, despite their limitations, were capable of designating concrete realities – an individual, an object. And this, together with the already acquired ability to capture other people’s mental contents, soon gave rise to syntax. Note that syntax is only needed in language with a predicative function, and this communicative function seeks –except in lies, of course– to correct, complete or update the mental content of the listener, which the speaker judges to be erroneous. (Of course, this syntax was pregrammatical –that is, ‘theme, rheme’– and remained like that for a long time probably. Complex grammatical devices – subordination, deictics converted into anaphoras – originated only with ‘reported speech’ or with long interventions by a single speaker.)
IX) But let us stop focusing only on the evolution of language. The great transition, the cerebral change that the new communicative reception entailed, had effects beyond communication. If humans can simultaneously think about their own mental content and the mental content of others, they will also be able to evoke, as (full) mental contents, their past perceptions, or possible future perceptions. One key to the difficulty is in all cases the same. The brain has to prevent any content other than ‘its own at the moment’ from directing its behavior. This difficulty had no precedent, since during dreams, first, the dream situation is the only one that the subject pays attention to, and, second, there is a motor paralysis (except in sleepwalkers).
X) Creative problem-solving also had to do with the great transition. To try to connect the two, we have to go back to language. Creative problem-solving consists in the transformation of our mental contents, which initially seem inadequate to achieve the resolution, into ones that do solve the problem. This is, of course, much more difficult and, both in evolution and development, much later than predicative communication. But in predicative communication, in the mere ‘theme, rheme’, there is also a transformation of an inadequate element (the false belief of the listener, which is, of course, the only thing the listener can grasp in the theme) into one that is adequate to communicate to the listener what the speaker judges to be the reality of the matter. The difference lies in the fact that the operation in creative problem-solving is intrapersonal, not interpersonal. But that difference, that enormous distance, could be bridged during human genetic-cultural coevolution. Thus one should distinguish between cultural innovations that occurred only through pride and other, later and more crucial ones, that were based not only on pride but also on creative problem-solving. (The connections accumulated by any fully linguistic individual throughout not only his years of language acquisition but throughout his entire life would facilitate the search for a way to transform initially inadequate content and achieve problem resolution. And one might suspect that such connections are stored not only in language but among the resources an individual learns in music, painting, and other areas.) But in the linguistic area we might perhaps find a slightly less distant precedent for creative problem-solving than predicative syntax. Note that in partial interrogations the speaker has to communicate what he does not know.

9. Summarizing, and Looking Towards the Future

This article has hypothesized that the contrast ‘vicarious expectations versus foreign mental contents’ is a genomic, brain novelty that appeared in co-evolution genes/culture. Thus, I have proposed that such novelty was required by ‘the new, human lifestyle’, which was increasingly technological (humans are ‘obligatory’ users and producers of tools) and cooperative (with a way of cooperating that is based on a particular type of communication). More concretely, in the origin of this lifestyle, two extremely important abilities (self-conscious emotions and, more basically, the new communicative reception of even prelinguistic messages) required, according to my proposal, the ability to estimate foreign contents. The key to my argument has been that only in human communication does the addressee have to think of foreign (i.e., others’) movements as mental states addressed to him. As the reader already knows, my hypothesis is dialogic, embodied and deeply embedded in evolution.
I have proposed that the advanced –uniquely human– Theory-of-Mind and human (even prelinguistic) communication are inextricably linked. Or more precisely: On the one hand, the set of those two abilities and, on the other hand (and more initially), the new lifestyle, feed off each other in a growing spiral. Therefore, while there is absolutely no suggestion on my part that all uniquely human capacities evolutionarily arose at the same time, I maintain that one of them –namely, the estimation of foreign contents, and not only of vicarious expectations– underlies the rest.48
The contrast ‘extinct species of Homo versus us’, if it becomes finally an area of Comparative Neuroscience, might fulfill in a special way the promise to help us to ‘know ourselves’, as classical philosophy wanted.49 Such a result could perhaps be achieved with the help of Genomics / Paleogenomics, as said above. And also with “the use of evolution to identify meaningful categories of mental activity”: Cisek 2019 and Cisek 2021, which apply this resource to animals in general. But the use of co-evolution genes / culture is also necessary to identify categories of human mental activity. In other words, the nuclear categories of human mental activity will be more easily found the more we seek their link with the emergence of the human lifestyle.
Returning to the nuclear proposal, this article has not offered any new empirical result. However, the main proposal and each sub-proposal raise questions. Let us mention some of those questions. My view of expectations? Apes’ vicarious expectations? The anti-intuitive ‘non-unified reception of pointing’ in chimpanzees? Interpersonal origin of syntax and syntactic semantics? Is there genuine metacognition in great apes, or, on the contrary, only ‘subpersonal uncertainty estimates’? These questions can lead to different experiments and to research in Neuroscience or Genetics, whose results will have an impact on my proposal, in one way or another. But I have already dealt with this above.
Therefore, I will add only a more personal comment. I am really looking forward to those results that can make my hypothesis testable. Even if those results discarded my proposals, I would feel that my effort has been useful. Obviously, hypotheses are most useful when they point out a correct path, but if an apparent road really leads nowhere, then the task of promoting its testability is also a service to the community. In short, I ardently wish that these tests be conducted. However, since such empirical research is out of my reach, I can only request them. This is what this article wants to do now and in the medium-term future.

Funding

“This research received no external funding”.

Institutional Review Board Statement

“Not applicable”

Conflicts of Interest

“The authors declare no conflicts of interest.”

Notes

1
Even if population size and connectivity have also been drivers of cultural advances and also –mainly in the African Middle Stone Age– of cultural losses (Scerri & Will 2023; Shipton 2024).
2
However, I agree that apes’ ability in those tests is related to “affective empathy” (Lurz et al. 2022). Or, in my words (Bejarano 2022), ‘vicarious expectations’ are related to ‘spontaneous altruism’.
3
So, the methodological, more particular matter of the violation-of-expectation paradigm (see the general review by Margoni et al. 2023) will not be discussed here.
4
Nowadays it is known, at least, that unexpected events can only be connected to superficial layers (of visual primary area) while expected events are also connected to the deeper levels of that area and, thus, it is possible that expectations are coded in the brain in a very different format than perceptions. (Thomas et al. 2024 showed this in human brains as well. That does not conflict at all with my proposal. Humans, although we can evoke absent things, also have the empty expectations of animals.)
5
Such communications would already use non-innate resources (based not only on iconicity, but, probably even more, on ‘past conditioned associations known by the group’: Cartmill et al. preprint). However, it is very probable that these cultural gestures or calls still lacked ‘super-high fidelity’ transmission (which supports articulatory-phonetic imitation). In addition, let’s note that in the reception of these messages, the principle “Teleology, first” in Theory-of-Mind (Perner et al. 2018) was, of course, obeyed. We could even suppose that this type of individual message attempted, firstly, to become more and more choral in order to, finally, influence group behavior: In other words, it would not be ‘dialogic’. All these features would place this type of communication far from even prelinguistic human communication. Despite this, such messages would go beyond empty expectations of goals.
6
“The first words ever spoken is a key issue for the research in evolution of language” (Gasparri 2023). I agree with the importance of such issue.
7
Planer 2019, a defender of languages-of-thought, understands perfectly that “if the brains of many animals instantiate languages of thought, then we face a serious explanatory challenge. That challenge is to explain how languages-of-thought might have evolved.” But I am not persuaded by his explanation.
8
Or, more precisely, without a semantic content either produced simultaneously with the prosodic cue, or immediately previous in a dialogue –I add. This second type can be produced with a minimal articulation originally empty of meaning (e.g., the ‘huh?’ of Dingemanse et al. 2013).
9
“In human infants, shoulder movements, controlled by ipsilateral motor pathways from the right hemisphere, precede the left-hemisphere control of the right hand” (Rönnqvist 2003) and also of culturally learned motor sequences. Nowadays it is also known that in humans, certain muscles that are mainly associated with shoulder movement –and, therefore, also with the expressive gestures that involve arm-movement– are likely to interact with the voice (Pow et al. 2023). Thus, the superiority of arm-gestures over vocal resources that is observed in intentionally addressed communications of non-human primates, that indisputable (even if relative, Lameira et al. 2024) superiority, could perhaps be conserved in multimodal communication of human infants as the anteriority of arm-gestures –less complex than hand-movements– over cultural vocal learning. If that were so, then we could suspect that such anteriority, interacting with the voice, caused the new, broader intonatory unit, and, in this way, paradoxically ended up giving rise to the mentioned ‘victory of voice on gestural communication’. We must take into account that “in apes, communicative gestures, unlike manipulative movements, are controlled by areas that in human brain are responsible for human language”: Becker et al. 2021, Becker et al. 2022 and Meguerditchian et al. 2011. In short, I wonder if the following similarity has a basis in the ontogenesis and phylogenesis of our brain: Culturally learned movements of the right hand (controlled, of course, by left hemisphere) are embedded in a previous, simpler arm-movement (right hemisphere), and, similarly, culturally learned vocal signifiers (left hemisphere) are embedded in an intonational pattern (perhaps right hemisphere: Gainotti 2024 again vindicates the recently challenged “graded, right-hemisphere dominance for emotions”).
10
The learning of articulatory-phonetic sequences, even if it does not have to face the problem of perceptual-motor correspondence –one hears oneself–, is a difficult type of imitation. Certainly, as Heyes 2021a says, “I could copy a sound you make by simple trial-and-error, varying my vocal output until it matches my memory of the sounds you made”. This perfectly describes the babbling. However, note that unitary articulatory-phonetic sequences of several different steps cannot be reproduced simultaneously with their hearing, nor can they be easily reproduced –at least not in a precise way– except after hearing them repeatedly.
11
So, I am wondering about the possibility that the early language did not depend on the ‘super-high fidelity copying’. (Planer et al. preprint focus on an apparently similar puzzle –“an early language previous to know-how copying”, although these authors perhaps do not sufficiently emphasize the difference between the know-how copying that is used in technology and the super-high fidelity copying, and thus they solve it in a different way than I do, i.e. they adopt a merely gestural-iconic origin of early language.) Note, please, that the delay in the appearance of articulatory-phonetic sequences is a reliable fact in the first manifestations of writing. Could the same thing have happened in oral language? This suggestion, already put forward by Hockett 1960, has been defended by Fleming 2017, but in the context of studying the ‘clicks’ of South African languages.
12
That article shows that chimpanzees used ‘know-how social learning’ (from a chimpanzee that experimenters had taught) to acquire a skill they fail to innovate. Thus, we can think that if wild chimpanzees use such type of learning only very infrequently, it is because they don’t produce complex innovations.
13
Certainly, recent research –Steven et al. 2022– points to perspective-taking as a flexible and context-specific suite of abilities. However, here we can continue with Flavell’s dichotomy.
14
If this (in my view, very attractive) hypothesis turns out correct, then we could deduce that the so-called ‘audio-motor mirror-neurons of birds’ cannot be mirror-neurons. Note that, while learning the song-dialect, the bird does not sing yet. Therefore, the externally perceived dialect (that is, the dialectal enrichment of the innate template) is stored without any connection with proprioceptive expectations. Thus, if the proposal of Keysers & Perrett is accepted, the research about ‘the mirroring’ would have to refocus on primates, without it meaning undervaluing any type of ‘analogous similarities’ (underlined, for instance, by De Waal & Ferrari 2010).
15
However, Heyes 2021b and Heyes & Catmur 2022 rather emphasise that cultural practices –“childrearing practices that encourage adults to imitate infants and children, or the use of optical mirrors”– solve the problem of visuo-motor correspondence. I accept, of course, that these factors have a powerful influence on development (Essler et al. 2023). However, as regards the (both phylogenetic and ontogenetic) origin of the solution, the key is, in my view, the –perfect and at the same time indispensable– vision of one's own hand.
16
So the activation of spontaneous altruism towards a partner who is unable to communicate in a sufficiently salient manner (see Schüler et al. 2024) depends solely on the subject’s previous state —or, more concretely, on his /her /its “spectatorial, non-active attitude”— and not on the state of the other individual. This is, of course, a limitation of spontaneous altruism.
17
But, beyond that compatibility, the contrast shown by Schüler et al. 2024 puts a very interesting need at the centre of the scene. The human Theory-of-Mind (which will be fully deployed in Section 6) must prevent all those internal, perceptually decoupled representations from influencing our behavior. Obviously, such prevention –I add– is a much more difficult task than the one required in nightmares, for example. While in this latter case, there is only one line of mental contents –nightmare situations–, in the human Theory-of-Mind, however, there are ‘two lines’ of contents, and, therefore, in the default network (in this peculiar, human ‘resting-state’) the prevention must be much more subtle and complex than mere muscle paralysis.
18
Those two ways might be relevant to solve a repeatedly alleged conundrum (“the empathy-sharing conundrum, which mainly refers to the self-other differentiation that empathy entails”, Vincini 2023). In my view, the type of self-other distinction that is based on vicarious expectations does not involve any clash between self and other. This is the type that, when it is linked to ‘empathy’, intervenes in spontaneous altruism. On the contrary, the other type, when it is linked to ‘empathy’, appears, for example, when the subject receives a request that he/she feels as an obstacle to –or, in other words, as a clash with– his/her own activated goals. (Bejarano 2022 focuses on the second type –‘the most demanding moral capacity’– and proposes that, while the estimation –or, ultimately, perception– of foreign mental contents is an adaptively very advantageous resource in human lifestyle, it however caused that the two typical features of perceptions —one, that of informing about the surroundings, i.e., of being true, and the other, that of being useful to the subject’s interests— became, for the first time in evolution, dissociated from each other.)
19
Thornton & Tamir 2024 (who use the term ‘affordances’) too can perhaps support that vicarious expectations (and primitive Theory-of-Mind) are also activated in adult humans.
20
Corballis 2000 and Corballis claimed that we interpret the ‘images in the mirror’ as the left-right reversal of the original objects, and that, while a reflection’s reversal is a product of optics, “such interpretation comes from neuroscience”. This link with neuroscience could be lengthened: The sudden acknowledgement of standing before a mirror –and not before a peer– inhibits the mentioned high-level resource.
21
Lewis & Krupenye 2022, for example, underline apes’ competitive motivation. About infants’ motivation, see an interesting proposal in Woo & Spelke 2022, who apply to this question (infants’ estimation of others’ false belief) an idea relatively similar to the link between “look for cheaters” and reasoning (Cheng & Holyoak 1985, or Cosmides 1989). In short, Woo et al. 2022 underline that, since in some contexts “the estimation of others’ false beliefs may facilitate the ability to morally evaluate others’ actions”, such estimation is an adaptive task even in toddlers.
22
Obviously, any mammal or bird has expectations about the behavior of animals that are vastly different from him. But those are general, non-vicarious expectations.
23
Thus, it is not surprising that, for example, pride, when it is compared to joy, involves what Bornstein et al. 2023 call “a relatively more distant perspective”.
24
We could also remember Baader’s anti-Cartesian formulation (“Cogitor, ergo sum”), even if Baader (1765-1841) interpreted it “more theologically than interpersonally” (Geldhof 2005). I would reformulate it in the following way: ‘If I grasp foreign (i.e., others’) thoughts that involve me, I am human’.
25
Baumard et al. 2013 really propose: “The best care of reputation (the most adaptively advantageous one, since the error of mistakenly assuming that no one is paying attention to a blatantly selfish action may compromise an agent’s reputation) is the genuinely moral habit”. This, of course, is also proposed by many other authors, for example, Boileau (“Pour paraître honnête homme, il faut l'être”). I shall not comment on this proposal here, but see Bejarano 2022.
26
This could relate to what, on a higher, later level, Di Francesco et al. 2021 say: “People’s self-defining life stories have an intrinsically defensive nature; the description-narration of one’s own inner life is organized on the basis of the fundamental need to construct and defend a self-image endowed with an at least minimal solidity.”
27
According to my option, pride originally arose interpersonally: The “hubristic, narcissist pride” that is mentioned by Tracy et al. 2024 would have been a late (“evolved”) intrapersonal derivation.
28
As said above, while none of the earliest technological abilities implied high-fidelity transmission, this type of transmission not only supported later technologies, but also what I called (in 3.1.1) the set of all ‘super-high fidelity copying’ –the articulatory-phonetic copying, and the learning of songs or dances. (Obviously, in these skillful tasks the conscious activity of memorizing and copying the model gives way, after multiple repetitions, to sub-consciously memorized actions, and this allows attention to be focused on a higher level.)
29
The underlining of pride is also useful to prevent the concept of self-control from being incorrectly narrowed. See Bermúdez et al. 2024: “Apathy is a normally overlooked kind of self-control problem. However, compared to negative self-control (i.e., self-control against temptations), which relies more on situational strategies, positive self-control requires more intrapsychic work to get motivation.”
30
‘Self-control’ (Shilton et al. 2020)? Or ‘self-domestication’ (Benítez-Burraco & Nikolsky 2023, to choose a recent example)? I can only say that the connotations of the term ‘self-domestication’ (even if this is very different from ‘submission’ –the evolutionary precedent of shame, according to Maibom 2010) are less suitable for a capacity that, “even when it takes us to meekness, means the strength and power to use one’s energy” for one’s previously chosen purposes: Roszak 2022. (This author, instead of “self-control”, uses the traditionally moral term “fortitude”. But I cannot adopt such a use, since in my view –Bejarano 2022–, self-control is not necessarily moral.)
31
Could Bryant et al. 2024 –“Our findings support a two-step evolutionary process, in which changes in prefrontal cortex organization emerge prior to changes in temporal areas”– reinforce that claim?
32
Remember that, much later in development, also our current narrative speech uses ‘theatricalization’ in gestures and affective prosody. Likewise, ‘symbolic play’ –or ‘pretense’– might train this ‘intentional control and use’ of behavioral and even ‘autonomic’ levels.
33
Such recognition is so adaptive that ‘the possibility of false positives’ (i.e., the currently much-discussed ‘overextension of Theory-of-Mind’; see, e.g., Bering 2011) doesn’t matter.
34
Likewise, human infants produce ‘ostensive gestures with an object’ months before making pointing gestures: Rodríguez et al. 2015 and Guevara et al. 2024.
35
Regarding other animals (including birds and non-primate mammals, in my view) that probably lack any Theory-of-Mind, it is known that they can accumulate evidence through ‘many pairs of eyes’ in an easy, simple communication. For example, “cues and signals from other individuals (e.g. fleeing movements and alarm calls) reduce uncertainty about predator risk” (Hahn et al., preprint).
36
Ontogenetically that estimation is a difficult process, even in its pre-requisite: So, caregivers may naturally express their emotions in ways that maximize learning possibilities –e.g., “emotionese”: see Benders 2013, or Ruba & Repacholi 2020.
37
Thus, according to my proposal, the intrapersonal meta-cognition or intrapersonal ‘cognitive humility’ (i.e., a cognitive humility not primarily understood as “moral interpersonal virtue” à la Priest 2017, or “as reputation management” à la Karabegovic & Mercier 2023) would be a very late human ability. I agree with Li 2023 that it is both interpersonally originated (since the subject during a dialogue sometimes grasps that the knowledge of the other is more complete than his) and very necessary. In addition, I suggest (see the end of 3.1.1 again) that this cognitive humility is required by the transformation that any creative problem-solving involves, i.e., by the process of transforming our initially inadequate resource (i.e., our incomplete or incorrect mental content of a reality) into one capable of achieving the solution. That type of humility –that, so to speak, ‘culmination / intrapersonalisation’ of Theory-of-Mind– is maybe enhanced by the least social –and ontogenetically the latest– type of laughter, namely the laughter caused (e.g., after a punchline) by one’s own pleasant interpretive failure.
38
‘Say’ and its intensifiers ‘promise’ or ‘swear’ were even later used in ‘first person + present + affirmative’, an apparently tautological use which came to fulfil a new function, but still related, in my view, to ‘referred speech’. With them the speaker communicates that he is aware of how his speech looks –and could be referred– from the outside. (In this case, I prefer this ‘communicatively interpersonal’ interpretation to the ‘performative’ one, which is more institutionally based.)
39
This basic attribute could be very variably manifested, particularly in its very beginning (the Middle Stone Age, which “exhibits a predominantly asynchronous presence and duration of many innovations across different regions of Africa” –see Scerri & Will 2023, and the previous note 1).
40
In words of Uomini & Ruck 2019 (who exemplify this attitude in their study of the emergence of human handedness): “The paucity of data is an obstacle in studying cognitive evolution, but this has not stopped researchers from trying”. I really love that “but”.
41
About ‘spontaneous altruism’: See Tomasello 2012, Rand et al. 2012, and, especially, “self-other merging” (Miyazono & Inarimori 2021) and “goal slippage” (Michael & Székely 2019). (I also wonder: What about the unquestionable footprints of caring for the ill or the wounded that have been found in Neanderthals? At least we cannot doubt “the selective advantages of reducing the risk of mortality of other group members in groups whose members are highly interdependent” (Spikins et al. 2019).) Spontaneous altruism is ontogenetically earlier than the motivation to improve one’s own reputation by helping: See Hepach et al. 2022. About the (probably, very primitive) type of spontaneous altruism that, “connected to reactive, non-cognitive fear circuits, helps others under threat” (for instance, in social hunters): See Vieira et al. 2020 and Vieira & Olsson, preprint. About the limits of ‘spontaneous altruism’, see previous notes 16 and 18.
42
According to Tomasello & Call 2019, “attention-getters, since they manipulate attention of addressees, evolutionarily precede pointing gestures, while intention-movements, since they manipulate the imagination, precede pantomimes”. I agree with such difference, but my interest is now in the similarity of both receptions.
43
See also Bohn et al. 2020, who report that apes do not learn from iconic gestures.
44
When infants first understand pointing in a unified way, do they understand it only when the producer addresses it to them? Clark 1996 claimed: “The basic arena for social interaction is the dyad”. Certainly some findings might seem to challenge that claim. (Thiele et al. 2023 report that “observed joint attention” already modulates 9-month-old infants’ object encoding. Likewise, according to Goupil et al. preprint, both humans and macaques show spontaneous preference to look at two bodies facing towards each other.) However, these findings don’t seem to me to refute it. People's movements are always salient stimuli, of course, but, in my view, the ‘ability to capture other people's mental contents’ is not required in those experimental situations. Thus, according to my proposal, “the dyad” can be maintained for the very origin of human reception of pointing gestures.
45
Bejarano 2011, chapter 6: My argumentation started by focusing on the reception (also studied by Fernandez-Rubio Paula 2021) of the most egocentric deictics (i.e., the words that do not allow echolalia), but it extended to any linguistic reception, since this always includes where the message comes from.
46
What about dogs? Eye-contact –i.e., the communicator making eye-contact with the dog– is the major cue that dogs use to determine when a human pointing is intended for them. (See Kaminski & Nitzschner 2013; Téglás et al. 2012.) However, Lyn et al., preprint may have slightly lowered the initial triumphalism: Since dogs have more difficulty in following contralateral pointing, these authors suggest that ipsilateral points are learned through associative mechanisms. More in general, the Project MANYDOGS will try to replicate previous findings. But it is worth remembering Zuberbühler 2008: “Social carnivores must decide on one particular prey individual prior to group hunting”. Thus, if the dominant wolf remains for a few moments looking at –or making some movement towards– a particular prey, this could be an innately communicative signal, which would pre-activate in the members of the pack a plan of attack in the signaled direction. So, when, shortly after, the wolf-recipient feels that he is being looked at by the dominant individual, he starts his previously pre-activated attack plan. In this way, dogs would just enrich their innate expectation of the first signal –i.e., they would learn to associate their innate expectation with some other features (hand or finger).
47
This possibility is not at all an absurd suggestion. Firstly, within the lineage of Sapiens and even in dates totally within the (formerly so-called) ‘anatomically modern humans’, there is a marked evolution in the shape of the cranium: See Neubauer et al. 2018 (although, at least since 160,000 BP, these differences with living humans would mainly affect, according to Zollikofer et al. 2022, the face and cranial base). See also Freidline et al. 2024: “The unique facial growth pattern of Homo sapiens post-dated the Middle Stone Age”. Secondly, regarding our progressive absence of prominent brow ridges –which were very prominent in Neanderthals–, Godinho et al. 2018 reject the old hypotheses on such absence and suggest “its potential role in social communication”. (See Siposova et al. 2018, who underline the role of raised and highly mobile eyebrows in “the reception of communicative looks”. Likewise, Gast 2023 focuses on the link between linguistic prosody and eyebrow movement.) I also ask: Could the chin, whose absence in Neanderthals has been so studied (cf. Meneganzin et al. 2024), strengthen the gestural, emotional expressivity of the mouth? (Remember 5.2 above.)
48
Regarding such later rest, I would underline: 1) creative (technical, artistic, or scientific) problem-solving, that is, the ability to transform one’s own insufficient mental contents into sufficient ones to solve the problem, and 2) what I called in previous note 18 ‘the most demanding moral capacity’.
49
Currie et al. 2024: “Philosophical methodology can benefit greatly from interaction with cognitive paleoanthropology. […] Coherent evolutionary narratives is a means of readmitting synthesis to the philosophical toolkit”. Or, more imprecisely, Bejarano 2022: “The current focus on hominids and Neanderthals opens a new door for us which was undreamt of for previous philosophers and scholars”.

References

  1. Algoe, Sara & Haidt, Jonathan (2009). Witnessing excellence in action: The ‘other-praising’ emotions of elevation, gratitude, and admiration. The Journal of Positive Psychology, 4(2), 105–127.
  2. Andersson, Claes & Tennie, Claudio (2023). Zooming out the microscope on cumulative cultural evolution: ‘Trajectory B’ from animal to human culture. Humanities and Social Sciences Communications, 10.
  3. André, Jean-Baptiste & Baumard, Nicolas & Boyer, Pascal. (2023). Cultural Evolution from the Producers’ Standpoint. Evolutionary Human Sciences.
  4. Bar, Moshe (2007). The proactive brain: using analogies and associations to generate predictions. Trends in Cognitive Sciences 11 (7), 280–289.
  5. Barone, Pamela & Gomila, Antoni (2019). Infants’ performance in the indirect false belief tasks: A second-person interpretation. Cognitive science. [CrossRef]
  6. Barone, Pamela & Wenzel, Lisa & Proft, Marina & Rakoczy, Hannes (2022). Do young children track other’s beliefs, or merely their perceptual access? An interactive, anticipatory measure of early theory of mind. Royal Society Open Science.
  7. Baumard, Nicolas & André, Jean-Baptiste & Sperber, Dan (2013). A Mutualistic Approach to Morality. The Evolution of Fairness by Partner Choice. Behavioral and Brain Sciences, 36, 59–78.
  8. Becker, Yannick et al. (2021). Early Left-Planum Temporale Asymmetry in Newborn Monkeys (Papio anubis): A Longitudinal Structural MRI Study at Two Stages of Development. NeuroImage, 227. [CrossRef]
  9. Becker, Yannick et al. (2022). Broca’s cerebral asymmetry reflects gestural communication’s lateralisation in monkeys (Papio anubis). Elife.
  10. Bejarano, Teresa (2008). Pragmatics and Theory-of-mind: A problem exportable to the origins of language. Proceedings of Conference ‘Evolang 7’ (but I could not go to the Conference). Available online: https://www.worldscientific.com/doi/abs/10.1142/9789812776129_0003.
  11. Bejarano, Teresa (2010). Review of Hurford, James, 2007, The Origins of Meaning. Teorema, 29, 157–164. Available online: http://www.lel.ed.ac.uk/~jim/origins.revu.bejarano.html.
  12. Bejarano, Teresa (2011). Becoming Human: From pointing gestures to syntax. Benjamins. Available online: https://benjamins.com/catalog/aicr.81.
  13. Bejarano, Teresa (2014). From Holophrase to Syntax: Intonation and the Victory of Voice over Gesture. Humana.Mente. Journal of Philosophical Studies, 27, 21-37. Available online: https://www.humanamente.eu/index.php/HM/article/view/95.
  14. Bejarano, Teresa (2022). The Most Demanding Moral Capacity: Could Evolution Provide Any Base? Isidorianum, 31(2), 91-126. Available online: https://www.sanisidoro.net/publicaciones/index.php/isidorianum/article/view/Bejarano.
  15. Benders, Titia (2013). Mommy is only happy! Dutch mothers’ realisation of speech sounds in infant-directed speech expresses emotion, not didactic intent. Infant Behavior and Development, 36(4), 847–862. [CrossRef]
  16. Benítez-Burraco, Antonio & Nikolsky, Aleksey (2023). The (Co)Evolution of Language and Music Under Human Self-Domestication. Human Nature. [CrossRef]
  17. Bering, Jesse (2011). The belief instinct: the psychology of souls, destiny, and the meaning of life. New York: W.W. Norton.
  18. Berio, Leda & Moore, Richard (2023). Great ape enculturation studies: a neglected resource in cognitive development research. Biology & Philosophy, 38. [CrossRef]
  19. Berke, Marlene; Horschler, Daniel; Jara-Ettinger, Julian; Santos, Laurie (preprint). Differences Between Human and Non-Human Primate Theory of Mind: Evidence from Computational Modeling. [CrossRef]
  20. Bermúdez, Juan Pablo; Berthelette, Samantha; Fernández-Miranda, Gabriela; Anaya, Alfonso &, Diego (preprint). Temptation and Apathy. In Oxford Studies in Agency and Responsibility, volume 8.
  21. Bohn, Manuel & Kordt, Clara & Braun, Maren & Call, Josep & Tomasello, Michael (2020). Learning novel skills from iconic gestures: A developmental and evolutionary perspective. Psychological Science, 31(7), 873–880. [CrossRef]
  22. Bohn, Manuel & Liebal, Katja & Oña, Linda & Tessler, Michael (2022). Great ape communication as contextual social inference: a computational modelling perspective. Philosophical Transactions of the Royal Society B: Biological Science, 377. [CrossRef]
  23. Bonini, Cristina; Rotunno, Cristina; Arcuri, Edoardo & Gallese, Vittorio (2023). The mirror mechanism: linking perception and social interaction. Trends in Cognitive Sciences. [CrossRef]
  24. Bornstein, Oren; Moran, Tal; Simchon, Almog & Eyal, Tal. (2023). The Effect of Psychological Distance on the Experience of Joy Versus Pride. Social Cognition. [CrossRef]
  25. Bråten, Stein (2004). Hominin Infant Decentration Hypothesis: Mirror neurons system adapted to subserve mother-centered participation. Behavioral and Brain Sciences, 27, 508–509. [CrossRef]
  26. Brinums, Melissa; Franco, Camila; Kang, Jemima; Suddendorf, Thomas; Imuta, Kana (2023). Driven by emotion: Anticipated feelings motivate children’s deliberate practice. Cognitive Development, 66. [CrossRef]
  27. Bryant, Katherine & Camilleri, Julia & Warrington, Shaun & Blazquez Freches, Guilherme & Sotiropoulos, Stamatios & Jbabdi, Saad & Eickhoff, Simon & Mars, Rogier. (2024). Connectivity profile and function of uniquely human cortical areas. [CrossRef]
  28. Bugnyar, Thomas & Heinrich, Bernd (2005). Ravens differentiate between knowledgeable and ignorant competitors. Proceedings of the Royal Society B, 272, 1641–1646. [CrossRef]
  29. Bugnyar, Thomas; Reber, Stephan & Buckner, Cameron (2016). Ravens attribute visual access to unseen competitors. Nature Communications. Available online: https://www.nature.com/articles/ncomms10506.
  30. Cañigueral, Roser & Krishnan-Barman, Sujatha & Hamilton, Antonia (2022). Social signalling as a framework for second-person neuroscience. Psychonomic Bulletin & Review, 29. [CrossRef]
  31. Cartmill, Erica & Cartmill, Matt & Brown, Kaye & Foster, Jacob (preprint). Which Came First—iconicity or Symbolism? Evolang XV.
  32. Caspar, Kai; Biggemann, Marco; Geissmann, Thomas & Begall, Sabine (2021). Ocular pigmentation in humans, great apes, and gibbons is not suggestive of communicative functions. Scientific Reports. [CrossRef]
  33. Castro, Laureano & Castro-Nogueira, Miguel & Toro, Miguel. (2024). Teaching and the origin of the normativity. Biology & Philosophy, 39. [CrossRef]
  34. Castro, Laureano & Toro, Miguel (2004). The evolution of culture: from primate social learning to human culture. Proceedings of the National Academy of Sciences, 101. [CrossRef]
  35. Cesana-Arlotti, Nicolò; Martín, Ana; Téglás, Ernő; Vorobyova, Liza; Cetnarski, Ryszard; Bonatti, Luca (2018). Precursors of logical reasoning in preverbal human infants. Science. [CrossRef]
  36. Chellappoo, Azita (2021). Rethinking Prestige Bias. Synthese, 198, 8191–8212. [CrossRef]
  37. Cheng, Patricia & Holyoak, Keith (1985). Pragmatic reasoning schemas. Cognitive Psychology, 17, 391–416. [CrossRef]
  38. Cisek, Paul (2019). Resynthesizing behavior through phylogenetic refinement. Attention, Perception & Psychophysics. [CrossRef]
  39. Cisek, Paul (2021). Evolution of behavioural control from chordates to primates. Philosophical Transactions of the Royal Society B: Biological Sciences, 377. [CrossRef]
  40. Clark, Hannah; Elsherif, Mahmoud & Leavens, David. (2019). Ontogeny versus phylogeny in primate/canid comparisons: a metaanalysis of the object choice task. Neuroscience and biobehavioral reviews, 105. [CrossRef]
  41. Clark, Herbert (1996). Using language. Cambridge U. P.
  42. Clements, Wendy, & Perner, Joseph (1994). Implicit understanding of belief. Cognitive Development, 9, 377–395. [CrossRef]
  43. Coolidge, Frederick. (2023). Parietal lobe expansion, its consequences for working memory, and the evolution of modern thinking. 10.1016/B978-0-323-99193-3.00002-7.
  44. Cooperrider, Kensy & Slotta, James (2018). The Preference for Pointing With the Hand Is Not Universal. Cognitive Science 42(1). [CrossRef]
  45. Corballis, Michael (2000). Much ado about mirrors. Psychonomic Bulletin & Review, 7.
  46. Corballis, Michael (2001). Why Mirrors Reverse Left and Right. Psycoloquy, 12(032).
  47. Corballis, Michael (2011). The recursive mind: The origins of human language, thought, and civilization. Princeton University Press.
  48. Cosmides, Leda (1989). The Logic of Social Exchange: Has Natural Selection Shaped How Humans Reason? Cognition 31(3), 187–276. [CrossRef]
  49. Crespi, Bernard; Flinn, Mark; Summers, Kyle. (2022). Runaway Social Selection in Human Evolution. Frontiers in Ecology and Evolution. [CrossRef]
  50. Csibra, Gergely & Gergely, György (2006). Social learning and social cognition: The case for Pedagogy. In Yuko Munakata & Mark H. Johnson (eds.), Processes of Change in Brain and Cognitive Development (pp. 249–274). [CrossRef]
  51. Currie, Adrian & Killin, Anton & Lequin, Mathilde & Meneganzin, Andra & Pain, Ross. (2024). Past materials, past minds: The philosophy of cognitive paleoanthropology. Philosophy Compass. [CrossRef]
  52. Darwin, Charles (1872). The Expression of the Emotions in Man and Animals. London, John Murray.
  53. De Waal, Frans & Ferrari, Pier. (2010). Toward a bottom-up perspective on animal and human cognition. Trends in cognitive sciences, 14. [CrossRef]
  54. De Waal, Frans (2010). The Age of Empathy, Three Rivers Press.
  55. Di Bernardi Luft, Caroline et al. (2022). Social synchronization of brain activity increases during eye-contact. Communications Biology 5(1). [CrossRef]
  56. Di Francesco, Michele & Marraffa, Massimo & Paternoster, Alfredo. (2021). A self properly embodied. [CrossRef]
  57. Dingemanse, Mark & Torreira, Francisco & Enfield, Nick (2013). Is “Huh?” a Universal Word? Conversational Infrastructure and the Convergent Evolution of Linguistic Items. PLoS ONE, 8(11). [CrossRef]
  58. Dingemanse, Mark & Enfield, Nick. (2023). Interactive repair and the foundations of language. Trends in Cognitive Sciences. [CrossRef]
  59. Dor, Daniel (2016). From experience to imagination: Language and its evolution as a social communication technology, Journal of Neurolinguistics. [CrossRef]
  60. Dor, Daniel (2023). Communication for collaborative computation: two major transitions in human evolution. Philosophical transactions of the Royal Society of London. Series B, Biological sciences. [CrossRef]
  61. Dreon, Roberta (2024). Enlanguaged experience. Pragmatist contributions to the continuity between experience and language. Phenomenology and the Cognitive Sciences. [CrossRef]
  62. Durdevic, Kresimir & Call, Josep (2022). On the Origins of Mind: A Comparative Perspective. Annual Review of Developmental Psychology, 4, 63–87. [CrossRef]
  63. Edwards-Lowe, Georgina & La Chiusa, Elisa & Olawole-Scott, Helen & Yon, Daniel (2024, preprint). Information seeking without metacognition. [CrossRef]
  64. Ereira, Sam; Dolan, Raymond & Kurth-Nelson, Zeb (2018). Agent-specific learning signals for self – other distinction during mentalising. PLoS Biol 16(4). [CrossRef]
  65. Ericsson, K. Anders (2002). Attaining excellence through deliberate practice: insights from the study of expert performance. In M. Ferrari (Ed.), The pursuit of excellence through education, Erlbaum, Mahwah, NJ (pp. 21-56).
  66. Errante, Antonino; Gerbella, Marzio; Mingolla, Gloria & Fogassi, Leonardo (2023). Activation of Cerebellum, Basal Ganglia and Thalamus During Observation and Execution of Mouth, hand, and foot Actions. Brain Topography. [CrossRef]
  67. Essler, Samuel & Becher, Tamara & Pletti, Carolina & Gniewosz, Burkhard & Paulus, Markus (2023). Longitudinal evidence that infants develop their imitation abilities by being imitated. Current Biology. [CrossRef]
  68. Fedorenko, Evelina & Piantadosi, Steven & Gibson, Edward (2024). Language is primarily a tool for communication rather than thought. Nature, 630, 575–586. [CrossRef]
  69. Fleming, Luke (2017). Phoneme inventory size and the transition from monoplanar to dually patterned speech. Journal of Language Evolution, 2 (1), 52–6. [CrossRef]
  70. Fodor, Jerry (1975). The Language of Thought. Cambridge, MA: Harvard University Press.
  71. Fodor, Jerry (2007). The revenge of the given. In B. McLaughlin & J. Cohen (eds.), Contemporary debates in philosophy of mind (pp. 105–116). Blackwell.
  72. Foley, Robert & Mirazón Lahr, Marta (2020). Variable Cognition in the Evolution of Homo: Biology and Behaviour in the African Middle Stone Age. In book: Landscapes of Human Evolution (pp. 125-141).
  73. Freidline, Sarah & Gunz, Philipp & Alichane, Hajar & Aicha, Oujaa & Ben-Ncer, Abdelouahed & Hajraoui, Mohamed & Hublin, Jean-Jacques. (2024). The Undescribed Juvenile Maxilla from Contrebandiers Cave, Morocco—A Study on Middle Stone Age Facial Growth. Journal of Paleolithic Archaeology, 7. [CrossRef]
  74. Frith, Chris & Frith, Uta (2007). Social Cognition in Humans. Current Biology. [CrossRef]
  75. Gainotti, Guido (2024). Emotions related to threatening events are mainly linked to the right hemisphere. Journal of psychiatry & neuroscience. [CrossRef]
  76. Gallagher, Shaun (2015). The Problem with 3-Year-Olds. Journal of Consciousness Studies: controversies in science and the humanities, 22, 160–182.
  77. Gallardo, Guillermo & Eichner, Cornelius & Sherwood, Chet & Hopkins, William & Anwander, Alfred & Friederici, Angela. (2023). Morphological evolution of language-relevant brain areas. PLoS biology, 21. [CrossRef]
  78. Gallese, Vittorio (2018). The Problem of Images: A view from the brain-body. Phenomenology and Mind, 14, 70–79. [CrossRef]
  79. Gärdenfors, Peter & Lombard, Marlize (2020). Technology led to more abstract causal reasoning. Biology & Philosophy, 35. [CrossRef]
  80. Gärdenfors, Peter (2022). Teaching as evolutionary precursor to language. Frontiers in Communication 7. [CrossRef]
  81. Gasparri, Luca (2023). The first words ever spoken. Synthese 201, 174. [CrossRef]
  82. Gast, Volker (2023). The Temporal Alignment of Speech-Accompanying Eyebrow Movement and Voice Pitch. Behavioral Sciences. [CrossRef]
  83. Geldhof, Joris (2005). ‘Cogitor ergo sum’: on the meaning and relevance of Baader’s theological critique of Descartes. Modern Theology. [CrossRef]
  84. Geurts, Bart (2019). What’s wrong with Gricean pragmatics. [CrossRef]
  85. Godinho, Ricardo; Spikins, Penny & O’Higgins, Paul (2018). Supraorbital morphology and social dynamics in human evolution. Nature Ecology & Evolution. [CrossRef]
  86. Goupil, Nicolas & Rayson, Holly & Serraille, Emilie & Massera, Alice & Ferrari, Pier & Hochmann, Jean-Rémy & Papeo, Liuba (preprint). Visual preference for socially relevant spatial relations in humans and monkeys. [CrossRef]
  87. Graham, Kirsty & Rossano, Federico & Moore, Richard. (2024). The origin of great ape gestural forms. Biological reviews of the Cambridge Philosophical Society. [CrossRef]
  88. Guevara, Irene & Rodríguez, Cintia & Núñez, Maria. (2024). Developing gestures in the infant classroom: from showing and giving to pointing. European Journal of Psychology of Education. [CrossRef]
  89. Hahn, Luca & Sergiou, Andoni & Arbon, Josh & Fuertbauer, Ines & King, Andrew & Thornton, Alex. (preprint). The co-evolution of cognition and sociality. [CrossRef]
  90. Heintz, Christophe & Scott-Phillips, Thom. (2022). Expression unleashed: The evolutionary & cognitive foundations of human communication. Behavioral and Brain Sciences. 1-46.
  91. Henrich, Joseph & Broesch, James (2011). On the nature of cultural transmission networks: evidence from Fijian villages for adaptive learning biases. Philosophical Transactions of the Royal Society B: Biological Sciences, 366, 139–48.
  92. Hepach, Robert & Engelmann, Jan & Herrmann, Esther & Gerdemann, Stella & Tomasello, Michael. (2022). Evidence for a developmental shift in the motivation underlying helping in early childhood. Developmental Science, 26. [CrossRef]
  93. Heyes, Cecilia (2021a). Imitation. Current Biology, 31 (5). [CrossRef]
  94. Heyes, Cecilia (2021b). Imitation and culture: What gives? Mind and Language. [CrossRef]
  95. Heyes, Cecilia, & Catmur, Caroline (2022). What Happened to Mirror Neurons? Perspectives on Psychological Science, 17(1), 153-68. [CrossRef]
  96. Heyes, Cecilia; Frith, Chris (2014). The cultural evolution of mind reading. Science 344. [CrossRef]
  97. Hobaiter, Catherine & Leavens, David & Byrne, Richard (2014). Deictic gesturing in wild chimpanzees? Journal of Comparative Psychology, 128, 82–87. [CrossRef]
  98. Hockett, Charles (1960). The Origin of Speech.
  99. Hurford, James (2007). The Origins of Meaning. Oxford University Press.
  100. Kaminski, Juliane & Nitzschner, Marie (2013). Do dogs get the point? A review of dog–human communication ability. Learning and Motivation. [CrossRef]
  101. Kano, Fumihiro et al. (2022). What is unique about the human eye? Comparative image analysis on the external eye morphology of human and nonhuman great apes. [CrossRef]
  102. Kano, Fumihiro; Krupenye, Christopher; Hirata, Satoshi; Call, Josep & Tomasello, Michael (2017). Submentalizing Cannot Explain Belief-Based Action Anticipation in Apes. Trends in cognitive sciences, 21(9), 633–634. [CrossRef]
  103. Karabegović, Mia & Mercier, Hugo (2023). The Reputational Benefits of Intellectual Humility. Review of Philosophy and Psychology. [CrossRef]
  104. Karg, Katia; Schmelz, Martin; Call, Josep; Tomasello, Michael (2015). The goggles experiment: can chimpanzees use self-experience to infer what a competitor can see? Animal Behaviour, 105. [CrossRef]
  105. Karg, Katia; Schmelz, Martin; Call, Josep; Tomasello, Michael (2016). Differing views: Can chimpanzees do Level 2 perspective-taking? Animal Cognition, 19. [CrossRef]
  106. Keysers, Christian & Perrett, David (2004). Demystifying social cognition: a Hebbian perspective. Trends in cognitive sciences, 8, 501–507. [CrossRef]
  107. Kishimoto, Takeshi; Shizawa, Yasuhiro; Yasuda, Jun; Hinobayashi, Toshihiko & Minami, Tetsuhiro. (2007). Do pointing gestures by infants provoke comments from adults? Infant Behavior and Development, 30. [CrossRef]
  108. Klein, Jeffrey; Shepherd, Stephen; Platt, Michael. (2009). Social attention and the brain. Current Biology, 19. [CrossRef]
  109. Kobayashi, Hiromi & Kohshima, Shiro (2001). Unique morphology of the human eye and its adaptive meaning. Journal of Human Evolution, 40, 419–435. [CrossRef]
  110. Krupenye, Christopher; Kano, Fumihiro; Hirata, Satoshi; Call, Josep & Tomasello, Michael (2016). Great apes anticipate that other individuals will act according to false beliefs. Science.
  111. Laland, Kevin (2017). The origins of language in teaching. Psychonomic Bulletin & Review. [CrossRef]
  112. Lameira, Adriano & Hardus, Madeleine & Ravignani, Andrea & Raimondi, Teresa & Gamba, Marco. (2024). Recursive self-embedded vocal motifs in wild orangutans. eLife. [CrossRef]
  113. Learning in and about opaque worlds. Behavioral and Brain Sciences. [CrossRef]
  114. Leary, Mark (2004). The sociometer. In R. Baumeister & K. Vohs (eds.), Handbook of self-regulation (pp. 373–391). Guilford.
  115. Leavens, David (2021). The Referential Problem Space revisited: An ecological hypothesis of the evolutionary and developmental origins of pointing. Cognitive science. [CrossRef]
  116. Leavens, David; Hopkins, William & Bard, Kim (2005). Understanding the point of chimpanzee pointing: Epigenesis and ecological validity. Current Directions in Psychological Science, 14(4). [CrossRef]
  117. LeDoux, Joseph (2012). Rethinking the emotional brain. Neuron 73, 653–676. [CrossRef]
  118. LeDoux, Joseph (2023). The Deep History of Ourselves: The Four-Billion-Year Story of How We Got Conscious Brains. Philosophical Psychology. [CrossRef]
  119. Lewis, Laura & Krupenye, Christopher (2022). Theory of Mind in Nonhuman Primates. In book Primate Cognitive Studies. [CrossRef]
  120. Lewis, Michael (2000). The emergence of human emotions. In M. Lewis & J. Haviland-Jones (eds.), Handbook of emotions (pp. 265–280). Guilford.
  121. Li, Leon (2023). The other side of false belief: Constructing the objectivity of reality. Infant and Child Development, 32. [CrossRef]
  122. Lipschits, Or & Geva, Ronny. (2024). An integrative model of parent-infant communication development. Child Development Perspectives. [CrossRef]
  123. Lorenz, Konrad (1966). Evolution and modification of Behaviour.
  124. Lurz, Robert & Krachun, Carla & Mareno, Mary Catherine & Hopkins, William. (2022). Do Chimpanzees Predict Others’ Behavior by Simulating Their Beliefs? Animal Behavior and Cognition 9, 153–175. [CrossRef]
  125. Lyn, Heidi & Christopher, Jennie (2018). A point is not a point is not a point: Reinterpreting three basic kinds of pointing comprehension. Proceedings of Evolang 2018 (pp. 260–263). Available online: https://pure.mpg.de/rest/items/item_3190925_17/component/file_3260022/content
       Rendall, Drew; Owren, Michael & Ryan, Michael (2009). What do animal signals mean? Animal Behaviour, 78, 233–240. [CrossRef]
  126. Lyn, Heidi & Greenfield, Patricia & Savage-Rumbaugh, Sue & Gillespie-Lynch, Kristen & Hopkins, William. (2011). Nonhuman Primates do Declare! A Comparison of Declarative Symbol and Gesture Use in Children, Bonobos, and Chimpanzees. Language & Communication. [CrossRef]
  127. Lyn, Heidi & West, Katie & Villegas, Joclyn & Bass, Christopher & Baker, Steven. (preprint). Pointing on the Other Side: Do Dogs Follow Contralateral Points? [CrossRef]
  128. Maibom, Heidi (2010). The Descent of Shame. Philosophy and Phenomenological Research. [CrossRef]
  129. Margoni, Francesco & Surian, Luca & Baillargeon, Renée (2023). The Violation-of-Expectation Paradigm: A Conceptual Overview. Psychological Review, 131.
  130. Mayhew, Jessica & Gómez, Juan Carlos (2015). Gorillas with White Sclera. American Journal of Primatology, 77. [CrossRef]
  131. Mearing, Alex & Koops, Kathelijne (2021). Quantifying gaze conspicuousness: Are humans distinct from chimpanzees and bonobos? Journal of Human Evolution. [CrossRef]
  132. Meguerditchian, Adrien & Molesti, Sandra & Vauclair, Jacques (2011). Right-handedness predominance in 162 baboons for gestural communication: Consistency across time and groups. Behavioral Neuroscience. [CrossRef]
  133. Meguerditchian, Adrien (2022). On the gestural origins of language: what baboons’ gestures and brain have told us after 15 years of research. Ethology Ecology & Evolution, 34.
  134. Melis, Alicia & Rossano, Federico (2022). When and how do non-human great apes communicate to support cooperation? Philosophical Transactions of the Royal Society B: Biological Sciences, 377.
  135. Meneganzin, Andra & Ramsey, Grant & DiFrisco, James. (2024). What is a trait? Lessons from the human chin. Journal of experimental zoology. Part B, Molecular and developmental evolution, 342. [CrossRef]
  136. Michael, John & Székely, Marcell (2019). Goal Slippage: A Mechanism for Spontaneous Instrumental Helping in Infancy? Topoi, 38. [CrossRef]
  137. Miyazono, Kengo & Inarimori, Kiichi (2021). Empathy, Altruism, and Group Identification. Frontiers in Psychology, 12. [CrossRef]
  138. Moore, Chris (2008). The Development of Gaze Following. Child Development Perspectives, 66-70. [CrossRef]
  139. Moore, Richard (2013). Evidence and Interpretation in Great Ape Gestural Communication. HUMANA.MENTE Journal of Philosophical Studies, 6, 27-51. Available online: https://www.humanamente.eu/index.php/HM/article/.
  140. Moore, Richard (2015). A Common Intentional Framework for Ape and Human Communication. Current Anthropology 56(1), 56–80.
  141. Moore, Richard (2020). The cultural evolution of mind-modelling. Synthese, 199(1), 1751–1776. [CrossRef]
  142. Morrison, Donald (2020). Disambiguated Indexical Pointing as a Tipping Point for the Explosive Emergence of Language Among Human Ancestors. Biological Theory, 15, 196–211. [CrossRef]
  143. Mussavifard, Nima & Csibra, Gergely (2023). The co-evolution of cooperation and communication: Alternative accounts. Behavioral and Brain Sciences, 46. [CrossRef]
  144. Mussavifard, Nima (preprint). Ostensive Marking as a Distinctive Feature of Human Communication. [CrossRef]
  145. Neubauer, Simon; Hublin, Jean-Jacques & Gunz, Philipp (2018). The evolution of modern human brain shape. Science Advances, 4. [CrossRef]
  146. Nevejans, Maura & Cracco, Emiel (2022). Model expertise does not influence automatic imitation. Experimental Brain Research, 240(4), 1267–1277. [CrossRef]
  147. Okasha, Samir (2022). Goal Attributions in Biology: Objective Fact, Anthropomorphic Bias, or Valuable Heuristic?
  148. Onishi, Kristine & Baillargeon, Renée (2005). Do 15-Month-Old Infants Understand False Beliefs? Science, 308, 255–258.
  149. Onu, Diana & Kessler, Thomas & Smith, Joanne (2016). Admiration: A conceptual review of the knowns and unknowns. Emotion Review, 8. [CrossRef]
  150. Oostenbroek, Janine & Suddendorf, Thomas & Nielsen, Mark & Redshaw, Jonathan; … Slaughter, Virginia (2016). Comprehensive longitudinal study challenges the existence of neonatal imitation in humans. Current Biology, 26, 1334–1338. [CrossRef]
  151. Osiurak, François & Claidière, Nicolas & Federico, Giovanni (2022). Bringing cumulative technological culture beyond copying versus reasoning. Trends in Cognitive Sciences, 27. [CrossRef]
  152. Osiurak, François & Cretel, Caroline & Uomini, Natalie & Bryche, Chloé & Lesourd, Mathieu & Reynaud, Emanuelle (2021). On the Neurocognitive Co-Evolution of Tool Behavior and Language: Insights from the Massive Redeployment Framework. Topics in cognitive science.
  153. Pan, Xinyue & Hsiao, Vincent & Nau, Dana & Gelfand, Michele (2024). Explaining the evolution of gossip. Proceedings of the National Academy of Sciences, 121. [CrossRef]
  154. Pan, Yeng et al. (2020). Instructor-learner brain coupling discriminates between instructional approaches and predicts learning. NeuroImage, 211. [CrossRef]
  155. Paulus, Markus & Fikkert, Paula (2013). Conflicting Social Cues: Infants’ Reliance on Gaze and Pointing Cues in Word Learning. Journal of Cognition and Development, 15. [CrossRef]
  156. Peeters, Anco & Cosentino, Erica & Werning, Markus (2023). Constructing a Wider View on Memory: Beyond the Dichotomy of Field and Observer Perspectives. In Anja Berninger & Íngrid Vendrell Ferran (eds.), Philosophical Perspectives on Memory and Imagination (pp. 165–190). Routledge.
  157. Perea-García, Juan & Kret, Mariska & Monteiro, Antonia & Hobaiter, Catherine (2019). Scleral pigmentation leads to conspicuous, not cryptic, eye morphology in chimpanzees. PNAS, 116. [CrossRef]
  158. Perner, Josef & Priewasser, Beate & Roessler, Johannes (2018). The practical other: Teleology and its development. Interdisciplinary Science Reviews, 43. [CrossRef]
  159. Pfister, Roland & Klaffehn, Annika & Kalckert, Andreas & Kunde, Wilfried & Dignath, David (2021). How to lose a hand: Sensory updating drives disembodiment. Psychonomic Bulletin & Review, 28, 827–833.
  160. Phillips, Jonathan & Buckwalter, Wesley & Cushman, Fiery & Friedman, Ori & Martin, Alia & Turri, John & Santos, Laurie & Knobe, Joshua (2020). Knowledge before belief. Behavioral and Brain Sciences. [CrossRef]
  161. Phillips, Steve (2024). A category theory perspective on the Language of Thought. Frontiers in Psychology. [CrossRef]
  162. Piaget, Jean (1954). La formation du symbole chez l’enfant.
  163. Piretti, Luca & Pappaianni, Edoardo & Garbin, Claudia & Rumiati, Raffaella & Job, Remo & Grecucci, Alessandro (2023). The Neural Signatures of Shame, Embarrassment, and Guilt: A Voxel-Based Meta-Analysis on Functional Neuroimaging Studies. Brain Sciences, 13. [CrossRef]
  164. Planer, Ronald & Bandini, Elisa & Tennie, Claudio (2024 preprint). Hominin Tool Evolution and Its (Surprising) Relation to Language Origins. [CrossRef]
  165. Planer, Ronald (2019). The evolution of languages of thought. Biology and Philosophy, 34. [CrossRef]
  166. Planer, Ronald (2023). The evolution of hierarchically structured communication. Frontiers in Psychology, 14.
  167. Pomper, Jörn & Shams, Mohammad & Wen, Shengjun & Bunjes, Friedemann & Thier, Peter (2023). Non-shared coding of observed and executed actions prevails in macaque ventral premotor mirror neurons. [CrossRef]
  168. Poulin-Dubois, Diane & Goldman, Elizabeth & Meltzer, Alexandra & Psaradellis, Elaine (2023). Discontinuity from implicit to explicit theory of mind from infancy to preschool age. Cognitive Development, 65. [CrossRef]
  169. Pouw, Wim & Werner, Raphael & Burchardt, Lara & Selen, Luc (preprint, 2023). The human voice aligns with whole-body kinetics. [CrossRef]
  170. Pragmatic markers: The missing link between language and Theory of Mind. Synthese, 199. [CrossRef]
  171. Prein, Julia & Maurits, Luke & Werwach, Annika & Haun, Daniel & Bohn, Manuel (preprint). Variation in Gaze Understanding Across the Life Span: A Process-level Perspective. [CrossRef]
  172. Priest, Maura (2017). Intellectual Humility: An Interpersonal Theory. Ergo, 4, 463–480. [CrossRef]
  173. Rakoczy, Hannes & Proft, Marina (2022). Knowledge before belief ascription? Yes and no (depending on the type of “knowledge” under consideration). Frontiers in Psychology, 13.
  174. Rakoczy, Hannes (2022). Foundations of theory of mind and its development in early childhood. Nature Reviews Psychology.
  175. Rand, David; Greene, Joshua & Nowak, Martin (2012). Spontaneous Giving and Calculated Greed. Nature, 489, 427–430. [CrossRef]
  176. Reddy, Vasudevi (2010). How Infants Know Minds.
  177. Rodríguez, Cintia & Moreno-Núñez, Ana & Basilio, Marisol & Sosa, Noelia. (2015). Ostensive gestures come first: Their role in the beginning of shared reference. Cognitive Development, 36. [CrossRef]
  178. Rönnqvist, Louise (2003). Developmentally, the arm preference precedes handedness. Behavioral and Brain Sciences, 26. [CrossRef]
  179. Ross, Lee (1977). The intuitive psychologist and his shortcomings: Distortions in the attribution process. In L. Berkowitz (ed.), Advances in experimental social psychology (vol. 10). New York: Academic Press.
  180. Rossano, Matt (2003). Expertise and the evolution of consciousness. Cognition, 89. [CrossRef]
  181. Roszak, Piotr (2022). Not Only Coping: Resilience and Its Sources from a Thomistic Perspective. Journal of Religion and Health. [CrossRef]
  182. Royo, Julie & Orset, Thomas & Catani, Marco & Pouget, Pierre & Thiebaut de Schotten, Michel. (2024). Evidence for an evolutionary continuity in social dominance: Insights from non-human primates tractography. [CrossRef]
  183. Ruba, Ashley & Repacholi, Betty (2020). Beyond language in infant emotion concept development. Emotion Review, 12(4), 255–258. [CrossRef]
  184. Ruba, Ashley; Pollak, Seth; Saffran, Jenny (2022). Acquiring Complex Communicative Systems: Statistical Learning of Language and Emotion. Topics in Cognitive Science. [CrossRef]
  185. Scerri, Eleanor & Will, Manuel. (2023). The revolution that still isn’t: The origins of behavioral complexity in Homo sapiens. Journal of Human Evolution. [CrossRef]
  186. Schüler, Clara & Berger, Philipp & Grosse Wiesmann, Charlotte (2024). A dorsal versus ventral network for understanding others in the developing brain. [CrossRef]
  187. Schuwerk et al. (preprint; work in progress). MANYBABIES. Action anticipation based on an agent’s epistemic state in toddlers and adults. [CrossRef]
  188. Scott-Phillips, Thom & Heintz, Christophe (2023). Great ape interaction: Ladyginian but not Gricean. Proceedings of the National Academy of Sciences, 120. [CrossRef]
  189. Shilton, Dor; Breski, Mati; Dor, Daniel & Jablonka, Eva. (2020). Human Social Evolution: Self-Domestication or Self-Control? Frontiers in Psychology, 11. [CrossRef]
  190. Shimoni, Einav; Berger, Andrea & Eyal, Tal. (2022). Your pride is my goal: How the exposure to others’ positive emotional experience influences preschoolers’ delay of gratification. Journal of Experimental Child Psychology. [CrossRef]
  191. Shipton, Ceri. (2024). Was culture cumulative in the Palaeolithic? Phenomenology and the Cognitive Sciences.
  192. Siposova, Barbora; Tomasello, Michael & Carpenter, Malinda (2018). Communicative eye contact signals a commitment to cooperate for young children. Cognition, 179. [CrossRef]
  193. Southgate, Victoria (2020). Are infants altercentric? The other and the self in early social cognition. Psychological Review, 127(4), 505–523. [CrossRef]
  194. Southgate, Victoria; van Maanen, Catharine & Csibra, Gergely (2007). Infant Pointing: Communication to Cooperate or Communication to Learn? Child Development 78(3), 735-40. Available online: https://srcd.onlinelibrary.wiley.com/doi/10.1111/j.1467-8624.2007.01028.x.
  195. Spikins, Penny & Needham, Andy & Wright, Barry & Dytham, Calvin & Gatta, Maurizio & Hitchens, Gail (2019). Living to Fight Another Day: The Ecological and Evolutionary Significance of Neanderthal Healthcare. Quaternary Science Reviews, 217, 98-118. [CrossRef]
  196. Spurrett, David (2024). Motivation and Cumulative Culture. Commentary on Sterelny and Hiscock, Cumulative Culture, Archaeology, and the Zone of Latent Solutions. Current Anthropology, 65(1).
  197. Sterelny, Kim & Hiscock, Peter. (2024). Cumulative Culture, Archaeology, and the Zone of Latent Solutions. Current Anthropology, 65(1). [CrossRef]
  198. Sterelny, Kim (2023). Niche Construction, Cumulative Culture and The Social Transmission of Expertise. PaleoAnthropology.
  199. Steven, Samuel; Cole, Geoff; Eacott, Madeline (2022). It’s Not You, It’s Me: A Review of Individual Differences in Visuospatial Perspective Taking. Perspectives on Psychological Science. [CrossRef]
  200. Sznycer, Daniel & Cohen, Adam (2021). How pride works. Evolutionary Human Sciences, 3. [CrossRef]
  201. Sznycer, Daniel (2019). Forms and Functions of the Self-Conscious Emotions. Trends in Cognitive Sciences 23(2). [CrossRef]
  202. Sznycer, Daniel et al. (2017). Cross-cultural regularities in the cognitive architecture of pride. Proceedings of the National Academy of Sciences. 114. [CrossRef]
  203. Tattersall, Ian (2023). Let Sleeping Syntheses Lie. PaleoAnthropology. Special Issue: Niche Construction, Plasticity, and Inclusive Inheritance: Rethinking Human Origins with the Extended Evolutionary Synthesis, Part 1.
  204. Tebbe, Anna Lena & Rothmaler, Katrin & Koester, Moritz & Wiesmann, Charlotte. (2024). Infants and adults neurally represent the perspective of others like their own perception. [CrossRef]
  205. Téglás, Erno & Gergely, Anna & Kupán, Krisztina & Miklósi, Ádám & Topál, József. (2012). Current Biology. [CrossRef]
  206. Tennie, Claudio; Braun, David; Premo, Luke & Mcpherron, Shannon (2016). The Island Test for Cumulative Culture in the Paleolithic, in The Nature of Culture, edited by M. Haidle, N. Conard, and M. Bolus (pp. 121-133). Springer Press, Berlin. [CrossRef]
  207. Thiele, Maleen & Kalinke, Steven & Michel, Christine & Haun, Daniel (2023). Direct and Observed Joint Attention Modulate 9-Month-Old Infants’ Object Encoding. Open Mind, 7. 917–946. [CrossRef]
  208. Thomas, Emily & Haarsma, Joost & Nicholson, Jessica & Yon, Daniel & Kok, Peter & Press, Clare. (2024). Predictions and errors are distinctly represented across V1 layers. Current Biology, 34. [CrossRef]
  209. Thorne, Tyler; Milyavskaya, Marina; Werner, Kaitlyn; Leduc-Cummings, Isabelle; Saunders, Blair & Inzlicht, Michael (preprint). The Personal Goal Difficulty - Progress Paradox: Unraveling the Role of Self-Efficacy on Perceptions of Goal Difficulty. [CrossRef]
  210. Thornton, Mark & Tamir, Diana (2024). Neural representations of situations and mental states are composed of sums of representations of the actions they afford. Nature Communications. [CrossRef]
  211. Tomasello, Michael & Call, Josep. (2019). Thirty years of great ape gestures. Animal Cognition.
  212. Tomasello, Michael (2008). Origins of human communication.
  213. Tomasello, Michael (2012). Why be nice? Better not think about it. Trends in Cognitive Sciences 16. [CrossRef]
  214. Tomasello, Michael (2018). How Children Come to Understand False Beliefs: A Shared Intentionality Account. Proceedings of the National Academy of Sciences 115.
  215. Tomasello, Michael (2022). Social cognition and metacognition in great apes: a theory. Animal Cognition. [CrossRef]
  216. Tomasello, Michael; Call, Josep & Hare, Brian (2003). Chimpanzees understand psychological states –the question is which ones and to what extent. Trends in Cognitive Sciences 7. [CrossRef]
  217. Tomasello, Michael; Hare, Brian; Lehmann, Hagen & Call, Josep (2007). Reliance on head versus eyes in the gaze following of great apes and human infants: the cooperative eye hypothesis. Journal of Human Evolution.
  218. Tomasello, Michael (1999). The Human Adaptation for Culture. Annual Review of Anthropology, 28, 509–529. Available online: http://www.jstor.org/stable/223404.
  219. Tomasello, Rosario & Grisoni, Luigi & Boux, Isabella & Sammler, Daniela & Pulvermüller, Friedemann. (2022). Instantaneous Neural Processing of Communicative Functions Conveyed by Speech Prosody. Cerebral Cortex, 32. [CrossRef]
  220. Tomonaga, Masaki; Kurosawa, Yoshiki; Kawaguchi, Yuri & Takiyama, Hiroya (2023). Don’t look back on failure: spontaneous uncertainty monitoring in chimpanzees. Learning & Behavior. [CrossRef]
  221. Tracy, Jessica & Mercadante, Eric & Witkower, Zachary. (2024). The Evolved Nature of Pride. In The Oxford Handbook of Evolution and the Emotions. (pp. 203-218). [CrossRef]
  222. Uomini, Natalie & Ruck, Lana. (2019). Testing Models of Handedness in Stone Tools. [CrossRef]
  223. van Leeuwen, Edwin & Detroy, Sarah & Haun, Daniel & Call, Josep (2024). Chimpanzees use social information to acquire a skill they fail to innovate. Nature Human Behaviour. [CrossRef]
  224. Vasilieva, Olga (2019). Beyond “Uniqueness”: Habitual Traits in the Context of Cognitive-communicative Continuity. Theoria et Historia Scientiarum 16. [CrossRef]
  225. Vieira, Joana & Olsson, Andreas (preprint). Help or flight: Neural defensive circuits promote helping under threat in humans. [CrossRef]
  226. Vieira, Joana; Schellhaas, Sabine; Enström, Erik & Olsson, Andreas. (2020). Help or flight? Increased threat imminence promotes defensive helping in humans. Proceedings of the Royal Society B: Biological Sciences, 287. [CrossRef]
  227. Vincini, Stefano (2023). Can interactionist approaches solve the empathy-sharing conundrum? [CrossRef]
  228. Vygotsky, Lev & Cole, Michael (1978). Mind in society: Development of higher psychological processes.
  229. Vyshedskiy, Andrey (2022). Language evolution is not limited to speech acquisition: a large study of language development in children with language deficits highlights the importance of the voluntary imagination component of language. Research Ideas and Outcomes. 8. [CrossRef]
  230. Warren, Elizabeth & Call, Josep & György, Gergely. (2023). On the murky dissociation between expression and communication. Behavioral and Brain Sciences, 46. [CrossRef]
  231. Warren, Elizabeth & Call, Josep. (2022). Inferential Communication: Bridging the Gap Between Intentional and Ostensive Communication in Non-human Primates. Frontiers in Psychology, 12. [CrossRef]
  232. Witkower, Zachary & Tracy, Jessica & Cheng, Joey & Henrich, Joseph. (2020). Two Signals of Social Rank: Prestige and Dominance are Associated with Distinct Nonverbal Displays. Journal of Personality and Social Psychology, 118, 89-120. [CrossRef]
  233. Wolf, Wouter; Thielhelm, Julia; Tomasello, Michael (2023). Five-year-old children show cooperative preferences for faces with white sclera. Journal of Experimental Child Psychology. [CrossRef]
  234. Woo, Brandon & Spelke, Elizabeth (2022). Toddlers’ social evaluations of agents who act on false beliefs. Developmental Science, 26 (2). [CrossRef]
  235. Woo, Brandon; Chisholm, Gabriel & Spelke, Elizabeth. (2024). Do toddlers reason about other people’s experiences of objects? A limit to early mental state reasoning. Cognition, 246. [CrossRef]
  236. Woo, Brandon; Tan, Enda; Yuen, Francis & Hamlin, J. Kiley (2022). Socially evaluative contexts facilitate mentalizing. Trends in Cognitive Sciences. [CrossRef]
  237. Yáñez, Bernardo & Gomila, Antoni (2018). Evolución de la esclerótica del ojo humano: Una hipótesis social. Ludus vitalis, 26.
  238. Zollikofer; et al. (2022). Endocranial ontogeny and evolution in early Homo sapiens: The evidence from Herto, Ethiopia. Proceedings of the National Academy of Sciences 119(32). [CrossRef]
  239. Zuberbühler, Klaus (2008). Gaze following. Current Biology. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.