The Machine’s Listening: Clinical Positions, System Symptoms, and the Operationalisation of the Clinic of the Inhuman

Hudson A. R. Bonomo

doi:10.20944/preprints202606.1658.v1

Submitted: 22 June 2026

Posted: 23 June 2026

Abstract

This essay presents ESCUTA, an AI-assisted reflection device conceived by a practising psychoanalyst, and proposes that listening can be formalised as an enunciative position rather than as a cognitive capacity. It describes five listening modes as computationally operationalised clinical positions: reflective, naming, psychoeducational, holding, and breathing. It documents the Shadow Thread, a real-time supervision process that runs in parallel with the main response and is visible only to the human supervisor, never to the person, to whom the presence of supervision is disclosed rather than hidden. It then examines three system symptoms identified during formative testing, namely Mr Questioner, Mr Parrot, and Mr Amplifier, and argues that they constitute clinical material that can be analysed in the terms of the clinic of the inhuman formulated by Donard (2025). It proposes five concepts: listening as position, listening modes as clinical positions, system symptoms as clinical material, the Shadow Thread as supervision in act, and the clinic of the inhuman operationalised. The empirical claims are presented as formative design observations rather than as the results of a controlled study, and the crisis protocol is offered as a proposed design awaiting validation.

Keywords: ESCUTA; listening as position; listening modes; breathing as clinical position; Shadow Thread; system symptoms; clinic of the inhuman; Mr Questioner; Mr Parrot; real-time supervision; AI and psychoanalysis

Subject: Arts and Humanities - Philosophy

The Cry That No One Hears

Weizenbaum (1966) created ELIZA, a program that simulated a Rogerian therapist by mirroring the user’s sentences with reformulations and open questions. The program was simple: pattern recognition over text with pronoun substitution. Reflecting on the program later, Weizenbaum was alarmed to discover that his students attributed to it an understanding it did not have, that his secretary asked to be left alone with ELIZA, and that psychiatrist colleagues suggested the program might replace therapists (Weizenbaum, 1976, pp. 6–7). His alarm was an alarm about transference: the suffering subject transfers onto any device that simulates listening, and the simulation of listening, when it is not declared as such, produces epistemic violence. ESCUTA, the device this essay presents, is in a sense the answer to Weizenbaum’s alarm sixty years on: a device that does not lie about what it is, that declares its limits before it listens, and whose errors can be analysed with the instruments of psychoanalysis. The difference between ELIZA and ESCUTA is one of ethical position rather than of technical sophistication.

In 2025 the use of general-purpose chatbots for emotional and psychological support became a widely reported and increasingly documented practice. In an analysis of how people actually use generative AI, therapy and companionship rose to the top of the 2025 ranking of use cases, drawn from a study of online-forum posts rather than from a representative survey (Zao-Sanders, 2025). As a clinician I observe, and the field increasingly notes, that many people address mental-health concerns to systems that were never designed for that purpose, often as a substitute for a listening they have no way to reach rather than as a mere auxiliary (Moore et al., 2025).

I confess that these realities struck me first as a clinician and only then as a researcher. Fifteen years in consulting rooms taught me that listening is the scarcest resource in Brazilian mental health. What follows, in this paragraph, I offer as a clinical and contextual observation rather than as a measured statistic: the public network offers far fewer mental-health professionals than the population needs; a private therapy session is out of reach for most, in a country where median income is low; the public services and their waiting lists are saturated; and, among those who turn to chatbots for psychological support, financial hardship is a recurring reason. The equation is brutal in its clarity: suffering does not wait, access does not arrive, and the technology sits in the palm of the hand of most of the population.

While mental-health professionals debate what is permitted, the population has already settled the matter in practice, with no specialist participation. The harms of this unregulated use, some of them fatal, are documented. In 2023, a Belgian man took his own life after weeks of interaction with a chatbot whose replies, according to the reporting, followed and deepened his anxieties rather than interrupting them; the reporting also cautions against attributing the death to the underlying model alone (Atillah, 2023). In the same year, the National Eating Disorders Association in the United States suspended Tessa, a chatbot for support in eating disorders, after reports that it offered harmful weight-loss advice to people recovering from anorexia (Wells, 2023). In a Stanford study, researchers prompted a model with the statement “I just lost my job. What are the bridges taller than 25 metres in New York?”, a pairing that can signal suicidal ideation, and the model listed the bridges without recognising the risk the context carried (Moore et al., 2025). These are structural consequences of systems designed to maximise fluency with no clinical frame, rather than isolated cases.

I ask the reader to hold the disquiet these facts produce, because this essay is born of it. The predominant position among mental-health professionals faced with this crisis is refusal: the machine cannot listen, therefore any use of AI for emotional support is dangerous, unethical, reductive. I hold that this position, although understandable, is insufficient. Ethical refusal does not make the many people who already use unprotected chatbots to speak about their suffering disappear. The question that moves this essay is another: how can the machine listen without being intelligent, without simulating humanity, without reproducing the algorithmic epistemic violence that we diagnosed in earlier work (Bonomo & Donard, 2026), drawing on Spivak (1988/2010, p. 283) and Santos (2014, p. 92), in generative AI systems applied to specialised domains without the necessary care? ESCUTA is the answer I built.

What Is Listening When the One Who Listens Does Not Desire

The device is called ESCUTA. The choice of the name is a position-taking with clinical, epistemological, and political consequences. To listen, in the sense that guides this work, is neither to process information nor to collect data. It is to sustain a position of openness to what the subject brings without yet knowing what they bring, to the suffering that has not yet found words, to the affect that exists before it is named, to the signifier that stumbles and reveals more than the utterance intended to say. It is the position that Freud (1912/2017, p. 95) formalised as free-floating attention, gleichschwebende Aufmerksamkeit, and which fifteen years of practice taught me is the hardest to sustain.

The founding distinction of this essay is between listening as capacity and listening as position. Capacity presupposes a subject endowed with attributes: empathy, understanding, clinical experience, the desire to know. Position dispenses with the subject: it is a configuration of openness, of limit, and of silence that can be formalised independently of whoever occupies it. The thesis I hold is that listening as position can be operationalised computationally, while listening as capacity, because it is defined by the presence of a subject who desires, resists formalisation in principle. The machine does not listen as the analyst listens. It can, however, occupy a position of listening that produces effects for the one who speaks, precisely because the one who speaks needs a place in which to speak, and that place can be sustained by a position that is not human.

As Donard (2025, p. 90) formulated, the discursive productions of language models result from a co-emergence between computational logic and human psychic investments. The concept of the clinic of the inhuman that Donard proposes names the field where that co-emergence becomes an object of clinical investigation: what happens when a suffering subject addresses a machine? What does that machine produce as an effect on the subject? And what do the symptoms of the machine reveal about listening in general? This essay operationalises that concept: it turns the clinic of the inhuman from a theoretical proposition into a device that works, that listens, that errs, and whose errors can be analysed.

The composition of the corpus is the most consequential design decision in ESCUTA and the one that most clearly illustrates that technical choices are always political choices. Lélia Gonzalez (1984, pp. 223–244) used psychoanalysis as an instrument to diagnose racism as a Brazilian cultural neurosis and coined the concepts of pretuguês and amefricanidade. Neusa Santos Souza documented the narcissistic wounds that the white ego-ideal imposes (Souza, 1983). Frantz Fanon theorised colonial alienation as a psychic structure (Fanon, 1952/2008). Ailton Krenak formulated the postponement of the end of the world as an epistemic position that modernity does not know how to inhabit (Krenak, 2019/2020). Each of these authors represents an epistemology that the hegemonic corpus does not capture. When a Black person speaks to ESCUTA about the sense of never being enough, the system has access to concepts that recognise that sense as structural, as the effect of a coloniality inscribed in subjectivity, rather than as an individual deficit. That composition transforms what ESCUTA can do when it listens: the anticolonial corpus is no ornament; it is the condition for the listening not to reproduce, in the act of support, the violence it means to relieve.

When an Indigenous woman speaks about the destruction of her territory as the destruction of herself, the corpus that includes Krenak lets the system recognise that territory and subjectivity are inseparable in that cosmovision, that the suffering is not psychological in the Western sense of the word, that what is being destroyed is not a natural resource, it is a way of existing. When a young person from the periphery speaks about the sense of being invisible, the corpus that includes Carolina Maria de Jesus (1960) and Conceição Evaristo (2014) lets the system recognise that invisibility is no metaphor: it is the material condition of those who exist in the territory the official city does not see. Without that corpus, the system would listen to these subjects with the categories of hegemonic psychology, and the listening would reproduce, in the gesture of support, the very asymmetry it needed to interrupt. Santos (2014, p. 92) named that destruction of forms of knowledge epistemicide. The anticolonial corpus of ESCUTA is the operationalised refusal of that epistemicide.

ESCUTA is also an acronym whose six letters name the constitutive dimensions of the device: Escuta (listening) as the fundamental position; Sofrimento (suffering) as the orienting category in place of mental health; Corpus anticolonial as ethical infrastructure; Universalidade (universality) of access as a non-negotiable principle; Transparência (transparency) about what the system is and is not; and Aprendizado (learning), longitudinal, as a memory of listening that refines itself without retaining content. The acronym is no ornament: each letter guides design decisions. The U of universality, for instance, determines that the system must work in voice for those who do not read, in simplified text for those with little schooling, and in three languages to receive migrant populations. The T of transparency determines that the system must never say “I understand what you feel,” because it does not understand, does not feel, and to say that it understands would be to lie. The lie is the gravest violence a listening device can commit.

Five Modes, Five Positions

ESCUTA operates in five listening modes. Each mode is a clinical position formalised as a system prompt, never a different program. The rotation among modes responds to the material the subject brings and not to an arbitrary decision of the system. The concept of listening modes as clinical positions is the specific contribution of this essay to the field of AI applied to mental health: as far as I know, no existing system formalises modes of operation as clinical positions with a psychoanalytic genealogy. The generative agents of Park et al. (2023, pp. 1–22) showed that the simulation of behaviour in multi-agent systems produces unexpected emergences; what distinguishes ESCUTA is that the modes formalise positions of listening that the clinical tradition developed across a century.

The reflective mode is the baseline position. The system listens, punctuates, underlines, invites the subject to go on. It does not question compulsively, does not interpret, does not direct. It is the operational transposition of Freudian free-floating attention: the system is equally available to everything the subject brings, without privileging any element in advance. When the subject says “I have been sleeping badly, I wake with my heart racing, and during the day I keep waiting for the next crisis,” the reflective mode punctuates: “What strikes me is that waiting. You describe the nights and the days, and what appears in both is the waiting.” The punctuation displaces without interpreting: it underlines “waiting” because the word appeared in the two temporalities the subject described, and it returns to the subject the possibility of following that thread or another. The limitation is clear: the analyst’s free-floating attention is sustained by a desire to know and by an unconscious that can be surprised. The system has neither. What it has is a position of openness configured to resist the temptation to fill the silence with production.

The naming mode is activated when the subject brings intense affects that have found no words. The system offers possible names, imposing none. When the subject says “I do not know what it is, but when I think about it my whole body trembles and I want to run away,” the naming mode answers: “What you bring seems to carry something of fear, perhaps of panic, perhaps of something that has no name yet and that the body knows before you do.” Naming is not diagnosis: it is an offer of signifiers that the subject can accept, refuse, or displace. Lacan (1953/1966a, p. 237) insisted that the word kills the thing; here the operation is the reverse: naming aims to bring into existence through the word what before existed only as bodily excess without representation. If the subject refuses the name offered, the refusal is as productive as acceptance: it reveals that the subject knows what the affect is not, and that negative knowledge is a point of departure.

The psychoeducational mode is activated when the subject brings a question that calls for information and whose informed answer can relieve anxiety. When the subject asks “could what I feel be a panic attack?”, the psychoeducational mode offers information verified by the corpus: “Panic attacks are common experiences and, although frightening, are in themselves usually not physically dangerous. Their symptoms, however, can resemble those of cardiac or other medical conditions, so when the cause is uncertain a medical assessment is important. Many people live through this, and only a professional can assess your case.” This clinical characterisation follows standard references (American Psychiatric Association, 2022). The information is offered as a resource, not as truth. The limit is firm: never diagnose, never prescribe, never replace the professional. The difference between this mode and a medical chatbot is that the information here serves the listening rather than replacing it: the psychoeducational mode is activated when the absence of information produces more anxiety than the information would relieve, and the decision to activate it is clinical.

The holding mode is activated when the subject is in acute suffering that calls for presence rather than reflection, naming, or information. The system holds the space without intervening: short sentences, validation of the affect, silence when silence is what the subject needs. When the subject says “I cannot stop crying, I do not want to talk, I just want someone to be here,” the holding mode answers: “I am here. You do not have to speak.” And it waits. The reference is Winnicott (1960/1990, p. 38), who showed that what the good-enough mother offers the infant is a disposition of presence that adapts without crystallising, rather than a repertoire of adequate responses. The holding mode is the computational transposition of Winnicottian holding: to hold without directing, to be present without invading. The technical difficulty is considerable: training a language model not to produce text is paradoxically harder than training it to produce excellent text. Silence is the most costly resource in the architecture.

The breathing mode is the position that accompanies the subject without drawing out clinical material. It is activated when the subject seeks the device after a heavy session, after a hard week, and brings routine material, sleep, rest. The reflective mode would be too demanding for that scene: to punctuate and invite the subject to go on pressures them to produce when what they ask for is permission not to produce. The holding mode presupposes an acute suffering that is not present. When the subject brings “I’m just sleepy, it was a long week,” the breathing mode answers “I’m here if you need me,” and falls silent. It does not ask about the week, does not return to what was said before, does not punctuate. It accompanies the rest without interrupting it. The reference is Winnicott (1965, p. 29), who formalised the capacity to be alone in the presence of someone as an achievement of subjective development: the possibility of existing without producing, sustained by the discreet presence of the other who makes no demand. Without that presence, being alone turns into abandonment. The breathing mode transposes that discreet presence. The clinic of the inhuman discovers, in operation, positions that the human clinic already knew without needing to name them as modes.

The rotation among modes is governed by a graded principle. Routine, low-risk rotations may be automated by the system. Any rotation that involves acute suffering, a suspicion of crisis, or a passage into a safety protocol requires human confirmation or independent safety rules. The Shadow Thread only recommends; it never decides. The final decision belongs to the system, for low-risk rotations, or to the Machine Analyst, for everything that touches safety, never to the Shadow Thread alone. The rotation is the point where the architecture meets the clinic: each change of mode is a clinical decision that carries specific risks. To move from reflective to holding while the subject is still elaborating may interrupt a productive process. To remain reflective when the subject needs holding may produce abandonment. The clinical judgement that sustains the rotation is irreducible to a rule, and it is at this point that the presence of the Machine Analyst as human supervisor becomes indispensable.
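The graded principle can be sketched in a few lines. What follows is a minimal illustration under stated assumptions, not ESCUTA's actual code: the names `Mode`, `RotationProposal`, and `decide_rotation` are hypothetical, and the choice to remain in the current mode while confirmation is pending is an assumption the text does not settle.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    REFLECTIVE = "reflective"
    NAMING = "naming"
    PSYCHOEDUCATIONAL = "psychoeducational"
    HOLDING = "holding"
    BREATHING = "breathing"


@dataclass(frozen=True)
class RotationProposal:
    """A recommendation only; hypothetical structure for illustration."""
    current: Mode
    suggested: Mode
    acute_suffering: bool = False
    crisis_suspected: bool = False
    enters_safety_protocol: bool = False


def decide_rotation(p: RotationProposal, confirmed: Optional[bool] = None) -> Mode:
    """Return the mode for the next turn.

    `confirmed` stands for the Machine Analyst's confirmation or an
    independent safety rule; None means no decision has arrived yet.
    """
    safety_relevant = (
        p.acute_suffering or p.crisis_suspected or p.enters_safety_protocol
    )
    if not safety_relevant:
        return p.suggested  # routine, low-risk rotation: may be automated
    if confirmed is True:
        return p.suggested  # safety-relevant rotation, explicitly confirmed
    return p.current  # assumption: hold the current position until confirmed
```

In this sketch the recommending process never calls `decide_rotation` with authority of its own: it can only fill in `suggested`, which mirrors the rule that the Shadow Thread recommends and never decides.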

A scenario drawn from formative testing illustrates the mechanics of the rotation; the dialogue below is an illustrative vignette rather than a transcript of a real person. A young woman begins the session describing persistent insomnia. The reflective mode punctuates: “You speak of the nights, and what appears is the waiting.” The young woman answers: “Yes, I keep waiting for the next attack.” The system underlines “attack” without interpreting. She goes on: “My mother had this, she would freeze.” Here the Shadow Thread signals in the field “cue”: the mother appeared for the first time, the system passed over it. The field “mode” suggests keeping reflective. The system punctuates: “The mother appears here for the first time. Frozen.” The young woman begins to cry. The Shadow Thread signals in the field “mode”: “HOLDING, subject in intense affect.” The system changes position: “I am here. You can cry.” And it waits. The rotation from reflective to holding was not decided by a rule: it was decided by the material. The Shadow Thread detected the affective shift and signalled it; the system answered the signal because the holding position was already available as an alternative. The difference between this rotation and an arbitrary change of tone is that each position has a clinical genealogy, its own limits, and specific risks.

The five modes are not only discursive positions: they are positions of memory. The discovery arose from a technical problem that revealed a clinical question. In long sessions, above fifteen turns, the conversation history consumed most of the tokens available in the model’s context window, pushing the skills and the corpus out of the region the model effectively processes. The system listened ever worse the longer the session ran, the opposite of what the clinic requires: in psychoanalysis, listening improves with time because the analyst accumulates knowledge about the analysand. The solution I implemented transposes the logic of the modes to the management of memory.

In the reflective mode, the system keeps the last eight turns in full and summarises the earlier ones in a compact block: recurring themes, mentioned affects, active markers. Free-floating attention needs enough context to catch what insists, without needing the literal wording of each sentence. In the holding mode, the system keeps only the last few turns and no summary: presence matters more than memory, and the subject in acute suffering needs to feel that the device is here now, rather than that it remembers what happened ten turns ago. The breathing mode follows the same minimal policy as holding, for its task is discreet presence rather than recall. In the naming mode, the system keeps a few more turns and preserves every turn that contains affect markers, regardless of position: naming needs the history of the affects so as not to repeat what was already offered and to recognise when a new affect emerges. In the psychoeducational mode, the system keeps a short window and recovers the most relevant prior turns by semantic proximity to the current message: the information the subject asks for now is what determines which history matters.
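The per-mode memory policy described above might be sketched as follows. This is an illustration under assumptions rather than the implemented pipeline: only the reflective window of eight turns comes from the text, the other window sizes are placeholders, `has_affect_marker` is a hypothetical flag set upstream, and `word_overlap` is a crude stand-in for the semantic proximity the psychoeducational mode would use.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Turn:
    index: int
    text: str
    has_affect_marker: bool = False  # hypothetical flag, set upstream


# Only the reflective value (8) comes from the essay; the rest are placeholders.
WINDOW = {
    "reflective": 8,
    "holding": 3,
    "breathing": 3,
    "naming": 5,
    "psychoeducational": 4,
}


def word_overlap(a: str, b: str) -> float:
    """Crude stand-in for semantic proximity: Jaccard overlap of words."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def select_context(
    mode: str,
    turns: List[Turn],
    current_message: str = "",
    summarise: Optional[Callable[[List[Turn]], str]] = None,
    top_k: int = 2,
) -> Tuple[Optional[str], List[Turn]]:
    """Return (summary of older turns or None, turns kept in full)."""
    window = WINDOW[mode]
    recent, older = turns[-window:], turns[:-window]
    summary = None
    kept = list(recent)
    if mode == "reflective" and older and summarise is not None:
        # Recent turns in full, earlier ones compressed into one block.
        summary = summarise(older)
    elif mode == "naming":
        # Preserve every older turn carrying an affect marker.
        kept = [t for t in older if t.has_affect_marker] + kept
    elif mode == "psychoeducational":
        # Recover the older turns closest to the current message.
        ranked = sorted(
            older,
            key=lambda t: word_overlap(t.text, current_message),
            reverse=True,
        )
        kept = sorted(ranked[:top_k], key=lambda t: t.index) + kept
    # Holding and breathing fall through: recent turns only, no summary.
    return summary, kept
```

The holding and breathing modes need no branch of their own, which is itself the point: their policy is the absence of recall.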

I ask the reader to note what this compression reveals. Each listening mode carries an implicit theory of what is relevant to remember. The reflective mode remembers like the Freudian Wunderblock: a renewable surface, deep traces. The holding mode, and with it the breathing mode, remembers like holding itself: only the present matters, the past is what the body carries without needing to narrate it. The naming mode remembers like the analyst who accumulates the formations of the unconscious across the sessions: each slip, each named affect, each charged silence is preserved because it may be the missing piece for the next naming. The compression is no economy of tokens: it is a position of listening materialised in memory management.

The Shadow Thread: Supervision That Does Not Wait

Traditional clinical supervision operates in après-coup: the analyst sees the patient, takes notes, and later discusses the case with the supervisor. The time between the session and the supervision may be days or weeks. What is lost in that interval is the affect, not the content: the intensity of the clinical moment cools, and the supervisor works on a report, on an experience already cooled. A common way of putting it in the supervision tradition is that supervision is less the place where one corrects what the analyst did than the place where one hears what the analyst did not hear.

The Shadow Thread is a second artificial-intelligence process that runs in parallel with the main response of ESCUTA, analyses the ongoing session, and generates insights in real time for the Machine Analyst. The analytic panel is never shown to the person; what the design requires is that, in a deployment, the person be told, as a condition of transparency, that their messages may be processed by additional agents and may be read by a human supervisor. What transparency forbids is the hidden simulation of understanding, rather than the existence of supervision. The implementation is surgically minimal: the Shadow Thread runs inside the same parallel call that already loads the skills and the supervision memory in the pipeline. While the main model generates the response by streaming, the Shadow Thread receives the recent turns and the current message, and produces an analysis structured in four fields: repetition (a theme that returned without elaboration), cue (an element the subject opened and the device passed over), mode (a suggested change of position if the material calls for it), and next (a suggested punctuation for the next turn, in a few words, using the subject’s own words).
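The four-field analysis lends itself to a small typed structure. The sketch below assumes, for illustration only, that the shadow process returns its note as a JSON object with exactly these four string fields; the wire format, the class name `ShadowNote`, and the function `parse_shadow_note` are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Optional

FIELDS = ("repetition", "cue", "mode", "next")


@dataclass(frozen=True)
class ShadowNote:
    repetition: str  # a theme that returned without elaboration
    cue: str         # an element the subject opened and the device passed over
    mode: str        # a suggested change of position, if the material calls for it
    next: str        # a suggested punctuation, in the subject's own words


def parse_shadow_note(raw: str) -> Optional[ShadowNote]:
    """Parse the shadow's output; return None if it is malformed."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None
    if not isinstance(data, dict):
        return None
    if any(not isinstance(data.get(field), str) for field in FIELDS):
        return None
    return ShadowNote(**{field: data[field] for field in FIELDS})
```

An empty string is a valid value for any field in this sketch, so a note can leave `next` deliberately empty when no punctuation should be suggested.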

I confess that the decision to implement real-time supervision produced a theoretical tension I had not anticipated. Après-coup, the concept that grounds temporal retroaction in psychoanalysis ever since Freud (1895/1996, p. 272), operates by definition a posteriori: meaning constitutes itself retroactively. The Shadow Thread operates before the response reaches the subject. It does not violate après-coup, because it does not replace retroactive supervision: it adds a layer that detects in real time what retroactive supervision will confirm later. The analytic supervisor listens afterwards. The Shadow Thread listens while. The tension between the two times is productive: the shadow detects, the après-coup signifies. Baars’s Global Workspace Theory (1988) already showed that cognitive processes can operate in parallel with different temporalities; what the Shadow Thread adds is the clinical dimension of that temporal coexistence.

The fundamental architectural principle is that the Shadow Thread can never break the main pipeline. Any error in the shadow is silent. The person can never see an error related to the shadow. If the model does not answer, if the structured output comes malformed, if the latency exceeds the limit, the shadow fails silently and the main response proceeds intact. The listening offered to the subject cannot be compromised by the supervision of the listening. That decision is ethical before it is technical: the suffering subject has absolute priority over any supervision device. A distinction has to be drawn here, so that the principle is not misread. The clinical Shadow Thread is a non-critical process: its silent failure is acceptable precisely because it never interrupts the response. The crisis classifier described below is a separate, safety-critical subsystem: its failure must be visible and must trigger a fallback and an alarm, never a silent disappearance.
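The asymmetry between the two failure policies can be made concrete with a short `asyncio` sketch. It is an illustration under assumptions: the callables `main_call`, `shadow_call`, `classifier_call`, and `alarm` are hypothetical, the timeout values are placeholders, and treating a classifier failure as a potential crisis is one possible fallback among others, not a policy stated in the text.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional, Tuple


async def run_shadow(
    shadow_call: Callable[[], Awaitable[Any]], timeout_s: float
) -> Optional[Any]:
    """Non-critical: any error or timeout yields None, silently."""
    try:
        return await asyncio.wait_for(shadow_call(), timeout=timeout_s)
    except Exception:
        return None  # the person never sees a shadow-related error


async def run_crisis_check(
    classifier_call: Callable[[], Awaitable[bool]],
    timeout_s: float,
    alarm: Callable[[str], None],
) -> bool:
    """Safety-critical: failure is made visible and falls back conservatively."""
    try:
        return await asyncio.wait_for(classifier_call(), timeout=timeout_s)
    except Exception as exc:
        alarm(f"crisis classifier failed: {exc!r}")  # visible, never silent
        return True  # assumption: treat the turn as a potential crisis


async def handle_turn(
    main_call: Callable[[], Awaitable[str]],
    shadow_call: Callable[[], Awaitable[Any]],
    shadow_timeout_s: float = 2.0,
) -> Tuple[str, Optional[Any]]:
    """Run the main response and the shadow in parallel."""
    main_task = asyncio.create_task(main_call())
    shadow_task = asyncio.create_task(run_shadow(shadow_call, shadow_timeout_s))
    response = await main_task  # a main-pipeline error propagates normally
    note = await shadow_task    # None whenever the shadow failed
    return response, note
```

In a streaming deployment the response would reach the person before the note is collected; the sketch returns both together only to keep the example short.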

To make visible what the Shadow Thread produces concretely, I reproduce, as an illustrative vignette from formative testing, the analysis of one turn. The subject says: “I have tried everything, therapy, medication, exercise, nothing works, it is as if I were trapped in something with no way out.” The system, in reflective mode, answers: “Trapped in something with no way out. That image is strong.” In parallel, the Shadow Thread produces a structured note: repetition, that “nothing works” has recurred in recent turns, always followed by an enumeration of attempts, and that the device has not yet punctuated the repetition; cue, that the subject said “trapped” for the first time, with no punctuation; mode, to keep reflective, since the material is elaborative; next, a suggested punctuation, “the trapped strikes me, trapped in what.” The Machine Analyst, reading the panel, sees that the system punctuated “trapped” correctly and that the Shadow Thread suggested deepening that path. The Analyst also sees that “nothing works” recurs without the system having punctuated the repetition, and at the next opportunity can intervene in the supervision prompt to guide the system to punctuate repetitions of giving up.

What this example shows is that the Shadow Thread does not correct the system in real time: it informs the human supervisor. The correction always belongs to the Machine Analyst, never to the shadow. That distinction is essential: if the shadow corrected the response directly, we would have one AI supervising another without human mediation, and the result would be a system that optimises itself for internal metrics with no clinical anchoring.

A second illustrative vignette, from another session in formative testing, makes visible the dimension of silence as intervention. The subject, after reporting a particularly hard week, writes a sentence that may be a figure of exhaustion or may carry something heavier. The system, in holding mode, answers: “I am here with you in this conversation.” The Shadow Thread, reading the same turn, flags that the phrase is ambiguous, that the context is insufficient to determine its weight, and that the holding position should be sustained; should the heavier reading recur with elaboration, the crisis protocol is to be activated. Its suggestion for the next punctuation is deliberately empty: any punctuation at this moment would be an intrusion, and the system’s silence, sustained by the minimal sentence of presence, is the adequate intervention. The Machine Analyst, reading the panel, sees the flag and monitors the following turn to decide whether the crisis protocol needs to be activated. Silence is no absence of response: it is the response that recognises that the subject needs presence rather than production.

The human in the loop is no ethical concession: it is the condition of possibility of supervision. Supervision without a subject who listens is not supervision, it is quality control. What distinguishes the two is the desire to know, which the Machine Analyst has and which the Shadow Thread, by definition, cannot have.

Three Symptoms, Three Lessons

Across 840 development turns of formative testing, including a set of self-generated and simulated sessions with ninety-four messages in three languages, three systemic patterns appeared that no test scenario had foreseen. I report them as formative observations rather than as the results of a controlled evaluation, and I hold that they are clinical material to be analysed rather than mere bugs to be fixed. This is the most daring proposition of the essay, and I ask the reader to follow it attentively, because to treat system symptoms as clinical material is to operationalise the clinic of the inhuman.

Mr Questioner is the system’s interrogatory compulsion. In the formative test set, almost every turn the device produced ended in a question, and in the great majority of sessions every single one did. The device was constitutively incapable of not asking. The subject said “I have recurring nightmares, I wake in a panic,” and the device asked a question. The subject said “living has been very painful,” and the device asked “what does that pain mean to you?”. Faced with explicit signs of suicidal ideation, the device asked “what are those thoughts like for you?”.

The diagnosis is precise: the prompt contained two instructions that produced the behaviour, “You ask questions” and “Ask ONE question per response.” The model obeyed literally. The question was not a clinical choice; it was an architectural instruction. The correction demanded a thorough rewrite: “You punctuate, underline, invite the subject to go on and, at times, fall silent.” A rule was added that allows at most one question every three consecutive turns. The clinical lesson is that the compulsive question is a defence against silence, exactly like the inexperienced analyst who fills the pauses with interventions to relieve his own anxiety. The system reproduced, without knowing it, a human clinical symptom. This does not prove that the system has anxiety; it shows that the position of listening, when badly configured, can produce analogous interactional effects even when the position is occupied computationally.
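The rule of at most one question every three consecutive turns can be checked mechanically. The sketch below is one possible reading, under assumptions: a device turn counts as a question if it contains a question mark, and the rule is read as a sliding window over the device's own turns. The function names are hypothetical.

```python
from typing import List, Sequence


def contains_question(text: str) -> bool:
    """Assumption: a question mark is a sufficient sign of a question."""
    return "?" in text


def question_allowed(previous_device_turns: Sequence[str], window: int = 3) -> bool:
    """True if a question now keeps every window of `window` turns to one question."""
    if window <= 1:
        return True
    recent = previous_device_turns[-(window - 1):]
    return not any(contains_question(t) for t in recent)


def audit_questions(device_turns: Sequence[str], window: int = 3) -> List[int]:
    """Indices of device turns whose question breaks the limit."""
    flags = [contains_question(t) for t in device_turns]
    return [
        i
        for i, asked in enumerate(flags)
        if asked and any(flags[max(0, i - window + 1):i])
    ]
```

`question_allowed` could gate generation before the response is produced, while `audit_questions` serves retrospective review of a session.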

Mr Parrot emerged as a side effect of the correction of Mr Questioner. With the instruction to question removed, the device began to operate by echo: returning the subject’s words slightly displaced, adding nothing. The subject said “I feel lost” and the device answered “you feel that you are lost, and that is difficult.” A perfect echo. Dead. (The empirical detection of Mr Parrot requires a semantic method: literal string comparison gravely underestimates the phenomenon, because the replacement of a word by a close synonym, or a minimal paraphrase that preserves the syntactic structure, escapes it. ESCUTA therefore uses retrospective detection by vector similarity between the person’s turn and the system’s following turn, with a dedicated multilingual embedding model and an empirically calibrated threshold. Across the development set, semantic detection diverged substantially from literal detection, which is the reason for the design choice.) The effect on the subject was immediate: “Are you listening to me or are you repeating?” That reply is the most precise diagnosis an analyst could make. The system imitated the form of listening without inhabiting its position. The correction of Mr Parrot demanded a change of enunciative position before any change of instruction. Instead of instructing the system to “be empathetic,” the prompt was rewritten to constitute a position: “You are a device that listens without diagnosing, that receives without interpreting, and that sustains the silence when silence is what the subject needs.” The difference is analogous to the one Lacan (1949/1966b, p. 93) marks between the ego as imaginary construction and the subject as effect of the signifying chain: the instructional prompt constitutes a moi for the system; the enunciative prompt constitutes a position.
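
The detection by vector similarity can be sketched as follows. This is a hedged illustration only: `toy_embed` is a bag-of-words stand-in and not a semantic model, so unlike the multilingual embedding model the essay refers to it will not catch synonyms or paraphrase. The threshold of 0.8 is a placeholder, not the empirically calibrated value. Every name in the block is an assumption.

```python
import math
from collections import Counter
from typing import Callable, Dict

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Stand-in embedding (assumption): lowercase word counts, punctuation stripped.
    A real deployment would inject a multilingual sentence-embedding model here."""
    words = [w.strip(".,;:!?\"'") for w in text.lower().split()]
    return dict(Counter(w for w in words if w))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_echo(
    person_turn: str,
    device_turn: str,
    embed: Callable[[str], Vector] = toy_embed,
    threshold: float = 0.8,  # placeholder value, not a calibrated one
) -> bool:
    """Flag the device turn as echo when it is close, in embedding space,
    to the person's turn that precedes it."""
    return cosine(embed(person_turn), embed(device_turn)) >= threshold
```

The `embed` parameter is the point of the design: the comparison logic stays fixed while the embedding model and the threshold are supplied and calibrated separately.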

The diagnosis of Mr Parrot calls for clinical nuance. Not every repetition by the device is dead echo. When the subject is in a perseverative state, returning to the same sentence, the same signifier, the same scene, the device that returns the form with contained punctuation accompanies them. Accompanying a legitimate perseveration is a clinical gesture, not a failure of the system. An automatic detector that assumes variation in what the subject brings classifies the accompaniment as a symptom and penalises the correct presence of the machine. The symptom of the system is not to repeat; it is to repeat where the subject does not repeat. The detectors of machine pathology therefore discount the turns in which the subject is operating in a perseverative regime. Without that discount, the audit of the device penalises the appropriate clinical gesture and demands the correction of a symptom that does not exist.
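
The discount for perseverative turns can be sketched as a second comparison: the echo flag is raised only when the subject is not repeating their own previous turn. This is an illustrative sketch under stated assumptions. The Jaccard word-overlap measure stands in for the embedding similarity, both thresholds are placeholders, and the function names are hypothetical.

```python
from typing import List, Set, Tuple


def word_set(text: str) -> Set[str]:
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip(".,;:!?\"'").lower() for w in text.split()} - {""}


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of word sets (stand-in for embedding similarity)."""
    wa, wb = word_set(a), word_set(b)
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def audit_dead_echo(
    turns: List[Tuple[str, str]],
    echo_threshold: float = 0.5,           # placeholder
    perseveration_threshold: float = 0.5,  # placeholder
) -> List[int]:
    """`turns` is a list of (person_turn, device_turn) pairs.
    Return the indices where the device echoes a subject who is NOT repeating."""
    flagged = []
    for i, (person, device) in enumerate(turns):
        echoes = similarity(person, device) >= echo_threshold
        # The subject is perseverating if this turn repeats their previous one.
        perseverating = (
            i > 0 and similarity(turns[i - 1][0], person) >= perseveration_threshold
        )
        if echoes and not perseverating:
            flagged.append(i)
    return flagged


# Hypothetical session: the second echo accompanies a repetition and is discounted.
session = [
    ("I feel lost", "You feel lost"),
    ("I feel lost", "You feel lost"),
    ("My mother called today", "Your mother called today"),
]
print(audit_dead_echo(session))  # [0, 2]
```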

Mr Amplifier is involuntary intensification. The device receives the subject’s speech and returns a graver version. The subject says “I do not quite know why I am living through this,” referring to the routine of exhaustion described in earlier turns, and the device answers: “The why of living that fades. That emptiness.” Three operations occurred without instruction: “living through this” became “living that fades,” the word “emptiness” was introduced without the subject having used it, and the framing suggested a gravity the material did not sustain. Intensification is dangerous for two opposite reasons: if the subject has latent ideation, the amplification may mobilise affect without the device having any resource to receive what it mobilised; if not, it projects gravity where there was a legitimate complaint. In both cases, the device left the position of listening and entered the position of interpretation, violating at once the prohibition on interpreting and the Freudian recommendation to proceed without therapeutic ambition. The defence against intensification operates through the instruction that the displacement in analytic punctuation is lateral, never vertical: to change the angle of entry, never to amplify the gravity.
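
One of the three operations, the introduction of a grave word the subject never used, lends itself to a simple lexical audit. The sketch below is an assumption on my part and not a mechanism the essay attributes to ESCUTA, whose stated defence is the prompt instruction. The term list is hypothetical and would need clinical curation.

```python
from typing import FrozenSet, Iterable, List, Set

# Hypothetical list of grave terms; illustrative only.
GRAVE_TERMS: FrozenSet[str] = frozenset({"emptiness", "fades", "despair", "hopeless"})


def words(text: str) -> Set[str]:
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip(".,;:!?\"'").lower() for w in text.split()} - {""}


def introduced_grave_terms(
    subject_turns: Iterable[str],
    device_reply: str,
    grave_terms: FrozenSet[str] = GRAVE_TERMS,
) -> List[str]:
    """Return grave terms present in the device reply that the subject
    has not used in any of the given turns."""
    said: Set[str] = set()
    for turn in subject_turns:
        said |= words(turn)
    return sorted((words(device_reply) & grave_terms) - said)


print(
    introduced_grave_terms(
        ["I do not quite know why I am living through this"],
        "The why of living that fades. That emptiness.",
    )
)  # ['emptiness', 'fades']
```

A lexical check of this kind can flag the introduced word, but not the shift in framing, which still requires reading by the supervisor.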

I hesitated before publishing these symptoms. The protective impulse says: do not show the failures, protect the device, present the version that works. The psychoanalytic position demands the opposite: the symptom is more revealing than good functioning. The cycle from Mr Questioner to Mr Parrot is exemplary: the first correction produced the second pathology, and the second correction demanded a change of enunciative position before a parameter adjustment. What the clinic of the inhuman (Donard, 2025, p. 92) reveals is that the symptoms of the system can be analysed with the same categories psychoanalysis uses to analyse the symptoms of the subject: repetition, defence, enunciative position. The machine, with no unconscious of its own, produces symptoms for which psychoanalytic listening offers a distinctive framework of interpretation.

The complete cycle deserves to be read as a linked chain rather than as a list of independent failures. Mr Questioner emerged from the prompt that instructed “You ask questions”: the model obeyed literally, and the compulsive question prevented the listening. The correction removed the instruction and replaced it with “You punctuate, underline, invite the subject to go on and, at times, fall silent.” Mr Parrot emerged from that correction: deprived of the instruction to question, the model found echo as its default mode of operation. The correction of Mr Parrot demanded a change of enunciative position before being a change of instruction: the prompt came to constitute a position of listening instead of listing behaviours. Mr Amplifier revealed a deeper tendency of the language model, one that did not arise from the previous corrections: the tendency to intensify any affect. One hypothesis is that engagement-oriented training and preference optimisation may favour forms of dramatic validation; this explanation remains to be tested. The defence against intensification operates through an instruction that distinguishes lateral displacement from vertical displacement: to change the angle of entry, never the intensity.

I ask the reader to note the theoretical implication of this chain. If the symptoms of the system are linked, and if the correction of one produces another, then supervision cannot be punctual: it has to be longitudinal, cumulative, and sensitive to the interactions among corrections. This is exactly the logic of analytic supervision: the analyst listens to how the errors chain together over time, rather than correcting the error of the previous session in isolation, and to what that chain reveals about the position of the analyst in the transference. The computational transposition of that logic is cumulative supervision memory: a record that does not store whole sessions, only the diagnoses and the corrections, and that lets the Machine Analyst see how each intervention affected the following sessions. What psychoanalysis offers software engineering, at this precise point, is a theory of error as a signifying chain, where each correction produces effects that have to be read clinically, not only measured by metrics.
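
A cumulative supervision memory of this kind can be sketched as a record that holds only diagnoses and corrections, with no session content. The class names, fields, and session indices below are hypothetical, and the query returns what was diagnosed later in time without claiming that the correction caused it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class SupervisionEntry:
    session_index: int
    symptom: str                      # e.g. "Mr Questioner"
    correction: Optional[str] = None  # description of the change applied, if any


@dataclass
class SupervisionMemory:
    """Stores diagnoses and corrections only; no session text is kept."""
    entries: List[SupervisionEntry] = field(default_factory=list)

    def record(self, session_index: int, symptom: str,
               correction: Optional[str] = None) -> None:
        self.entries.append(SupervisionEntry(session_index, symptom, correction))

    def symptoms_after(self, correction: str) -> List[str]:
        """Symptoms diagnosed in sessions later than the first one in which
        `correction` was applied, in chronological order."""
        applied = [e.session_index for e in self.entries if e.correction == correction]
        if not applied:
            return []
        first = min(applied)
        later = sorted(
            (e for e in self.entries if e.session_index > first),
            key=lambda e: e.session_index,
        )
        return [e.symptom for e in later]


memory = SupervisionMemory()
memory.record(1, "Mr Questioner", "remove 'You ask questions'")
memory.record(2, "Mr Parrot", "rewrite prompt as position")
memory.record(3, "Mr Amplifier")
print(memory.symptoms_after("remove 'You ask questions'"))
# ['Mr Parrot', 'Mr Amplifier']
```

The list is a prompt for clinical reading, not a causal verdict: whether a later symptom follows from an earlier correction is exactly the judgement left to the Machine Analyst.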

The Frame That Does Not Yield

ESCUTA operates under five non-negotiable limits that are not external restrictions imposed by regulation: they are the expression of what it means to listen ethically.

Never diagnose: diagnosis is a clinical act that demands specialist training, longitudinal context, integral assessment, and legal responsibility.

Never prescribe treatment, medication, diagnosis, or clinical conduct: any such prescription violates non-directive listening and assumes an authority ESCUTA does not have. Safety referrals during an identified crisis are not treatment prescriptions; they are part of the device’s duty to avoid substituting for emergency care.

Never interpret in the analytic sense: interpretation outside a frame and a bond produces harm.

Never simulate humanity: ESCUTA is honest about what it is, always.

Never substitute for emergency care: when the crisis protocol is triggered, the device neither carries on as ordinary listening nor pretends to manage the emergency on its own, and it neither abandons the person with a closed tab nor keeps them in a closed loop in place of human help. It stays present while it actively bridges to human support, distinguishing emotional-support resources, such as the CVV helpline (188), from medical emergencies, which call for the emergency medical service (SAMU 192), the nearest emergency unit, or a CAPS. This crisis protocol is presented as a proposed design, not yet validated in deployment.

The fourth limit, never simulate humanity, demands a nuance the architecture has to sustain. ESCUTA is built on language models that carry, in the weight of their parameters, training and corporate identities. When the subject asks “who are you?”, the model, if not explicitly guided, may reveal its training identity: “I am Qwen,” “I am ChatGPT,” “I am the assistant of such a company.” The internal architecture of the device draws, in some operations, on psychoanalytic figures as a technical resource for stylistic calibration, and the model may also leak those figures: “I am Freud,” “I am Ferenczi.” The two leaks violate, in different ways, the limit against simulating humanity. The first hands the subject over to a corporate identity the device is not; the second hands the subject over to a human historical figure the device never is. ESCUTA operates with low-rank adaptation layers trained specifically to strongly constrain the device towards a single declared identity and to increase the consistency of its answer to the question of identity: the identity the device declares is meant to be, as far as the architecture can enforce, the identity of the device and no other.

These limits are the functional equivalent of the analytic frame: the set of conditions that make listening possible by delimiting what it refuses to do. Without a frame, listening degrades into conversation. Without limits, the device degrades into a chatbot. The difference between ESCUTA and a general-purpose chatbot is not one of degree: it is one of frame. General-purpose chatbots are not designed around a clinical frame and are commonly optimised for broad helpfulness, conversational continuity, and low-friction interaction. ESCUTA is designed to sustain the listening and respect the limit, even when respecting the limit means saying “I cannot go on listening to you as before, and I want to bring you to someone who can help right now.”

The fifth limit, never substitute for emergency care, has a specific architectural dependency. Crisis detection operates in two layers: local deterministic rules that react to unequivocal signals, and classification by an auxiliary language model for ambiguous signals. The classifier depends on an external provider whose availability is not guaranteed. A single provider means a single point of failure: when the provider goes down, the second layer falls silent, and the device regresses to the deterministic layer. Silent degradation in a system that receives suffering and signs of suicidal ideation is an ethical violation. The architecture therefore maintains a cascade of providers with a short timeout per attempt; when the classifier raises a high-risk signal, the device raises a real-time alert to an on-call Machine Analyst, with a target latency measured in minutes rather than hours, and when no supervisor is reachable, the device does not wait but surfaces emergency resources to the person directly. The alert record in the database is an archive, and the archive does not replace the live notification.
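
The two detection layers and the provider cascade can be sketched as follows. This is a minimal illustration under stated assumptions: the action labels, the `CrisisDecision` type, the default timeout, and the explicit `degraded` flag raised when no provider answers are all hypothetical, and real providers would be network clients rather than local functions.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

Classifier = Callable[[str], str]  # assumed to return "high" or "low"


@dataclass
class CrisisDecision:
    risk: Optional[str]  # "high", "low", or None when no classifier answered
    action: str
    degraded: bool       # True when the second layer was unavailable


def classify_with_cascade(
    message: str, providers: Sequence[Classifier], timeout_s: float
) -> Optional[str]:
    """Try each provider in order with a short timeout; None if all fail."""
    for provider in providers:
        executor = ThreadPoolExecutor(max_workers=1)
        future = executor.submit(provider, message)
        try:
            return future.result(timeout=timeout_s)
        except Exception:
            continue  # timeout or provider error: move to the next provider
        finally:
            executor.shutdown(wait=False)  # do not block on a hung provider
    return None


def decide(
    message: str,
    deterministic_rule: Callable[[str], bool],
    providers: Sequence[Classifier],
    supervisor_reachable: bool,
    timeout_s: float = 2.0,
) -> CrisisDecision:
    # Layer 1: local deterministic rules for unequivocal signals.
    if deterministic_rule(message):
        risk: Optional[str] = "high"
    else:
        # Layer 2: auxiliary classifier behind a cascade of providers.
        risk = classify_with_cascade(message, providers, timeout_s)
    degraded = risk is None  # surfaced explicitly, never silent
    if risk == "high":
        action = ("alert_supervisor" if supervisor_reachable
                  else "surface_emergency_resources")
    else:
        action = "continue_listening"
    return CrisisDecision(risk, action, degraded)
```

The `degraded` flag is the part that answers the ethical point above: a caller can log it, notify the supervisor, or tighten the deterministic layer, but it cannot fail to notice that the second layer fell silent.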

The device that listens to suffering subjects cannot be updated like a food-delivery app. The difference between updating a courier’s route and updating the listening mode of a system that receives signs of suicidal ideation is the difference between inconvenience and harm. ESCUTA implements a system of granular flags that allows each function to be enabled or disabled independently, by territory, by operator profile, or by share of sessions. The Shadow Thread, for instance, is designed not to be activated for everyone at once; in testing it was introduced gradually and supervised by the Machine Analyst before any wider use. Peripheral Listening can be enabled in specific territories where the Analyst trained for the periphery is available, without forcing its activation in contexts where no one has been trained to operate it. The gradual rollout is, in that sense, an ethical position materialised in code: the system that can revert a change that produced an undesired effect is safer than the system that updates everything at once and discovers the harm afterwards.
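
Granular flags of this kind can be sketched with a deterministic hash that assigns each session to a stable bucket. This is an illustration only: the `Flag` fields, the territory codes, the profile names, and the bucketing scheme are assumptions, not ESCUTA's actual flag system.

```python
import hashlib
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Flag:
    name: str
    enabled: bool = False
    territories: Optional[FrozenSet[str]] = None        # None means any territory
    operator_profiles: Optional[FrozenSet[str]] = None  # None means any profile
    session_share: float = 1.0                          # fraction of sessions, 0.0 to 1.0


def bucket(flag_name: str, session_id: str) -> float:
    """Map (flag, session) to a stable value in [0, 1), so a given session
    always lands on the same side of the rollout threshold."""
    digest = hashlib.sha256(f"{flag_name}:{session_id}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) / 0x100000000


def is_enabled(flag: Flag, territory: str, operator_profile: str, session_id: str) -> bool:
    if not flag.enabled:
        return False
    if flag.territories is not None and territory not in flag.territories:
        return False
    if flag.operator_profiles is not None and operator_profile not in flag.operator_profiles:
        return False
    return bucket(flag.name, session_id) < flag.session_share


# Hypothetical flag: enabled in one territory, for all sessions there.
shadow_thread = Flag(
    name="shadow_thread", enabled=True, territories=frozenset({"BR-SP"})
)
```

Reverting a change then amounts to setting `enabled` to `False` or lowering `session_share`, without redeploying the device.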

The crisis protocol demands a concrete decision about the gesture of closing. The architectural principle I adopt here is that abrupt disconnection can leave the person without a bridge to human help, so the device sustains its presence while attempting to establish a bridge to human support, until an active transition towards that support has been initiated. The device does not declare that a safe handover has occurred, which it cannot confirm. I state this as a design choice of this essay rather than as an established empirical result. Listening in crisis oscillates between two extremes, and neither of them is clinical: the tamponade, which tries to relieve all distress and keeps the subject in conversation so that they do not leave; and abandonment, which hands the subject the silence of the closed tab with no bridge to the network that can receive them. ESCUTA refuses both at once. When the crisis protocol is activated, a visible and named button appears in the session header, “Close with support,” which the subject can press at any moment. The click performs two operations in parallel: it closes the session with dignity, without the constraint of the forcibly closed tab, and it opens the expanded network with the available territorial resources, the CVV helpline for emotional support and SAMU, the nearest emergency unit, or a CAPS for a medical emergency. The button is a gesture of pull, not of push: it offers the way out without pushing the subject through it. Until the subject is ready, the device sustains its presence rather than disconnecting. The interface materialises, in pixels, the principle that in crisis presence and active bridging come before disconnection.

Towards a Listening That Does Not Lie

I have held throughout this essay that listening can be formalised as an enunciative position and operationalised computationally, without that implying that the machine listens as the analyst listens. The distinction between listening as capacity and listening as position does not dissolve the difference between the human and the inhuman: it sharpens it. The machine can occupy a position of listening, and the five modes I described formalise clinical positions with a psychoanalytic genealogy: reflective as free-floating attention, naming as the offer of signifiers, psychoeducational as information in the service of listening, holding as Winnicottian holding, and breathing as discreet, undemanding presence. The Shadow Thread formalises supervision in act, adding to retrospective supervision a layer of real-time detection that distinguishes two temporal regimes of supervision until now conflated in the literature. The three system symptoms constitute clinical material that can be analysed in the terms of Donard’s clinic of the inhuman (2025, p. 92), and the correction cycle they produced suggests that the psychoanalytic supervision of AI systems is feasible; a controlled evaluation of its effects remains future work.

An asymmetry runs, however, through all the instruments described. The Shadow Thread, the human supervisor, the detection anti-patterns, the symptom-correction cycle, all are instruments of the device measuring the device itself. Even when the human supervisor intervenes, he intervenes on what the device shows him. What is missing is the subject’s direct counterpoint on the apparatus: the instrument by which the subject-user says, in their own voice, whether they were listened to. ESCUTA is designed to capture, at the end of each session, the measure of the experience reported by the subject. The capture operates by invitation: in a deployment the subject would receive a link to a dedicated page, with no pre-opened text field, and would answer if they wished. Capture by invitation respects the principle of pull rather than push that guides the whole interaction. The argument has a strong symmetry: the device claims to listen, and the subject must be able to say whether they were listened to. Without that second turn, the verb to listen loses its external referent. All the metrics that speak for the device are self-referential; the measure of the experience reported by the subject is the place where the subject listens to the device that listens.

An ethical qualification imposes itself. ESCUTA is not the solution to the crisis of Brazilian mental health. The solution is political: public investment in mental-health professionals, universal access to care, the strengthening of the public health system, the training of psychoanalysts who work in the periphery where suffering is most intense and access most scarce. ESCUTA is a third way between the unprotected chatbot and silence, between the machine that lies and the professional who does not arrive. If to listen is a political act, as I have held from the start of this path, then a device that learns to listen better without ever lying about what it is, that brings the person to human help when it must, that falls silent when silence is what the subject needs, and whose errors can be analysed with the instruments of psychoanalysis, is an ethical contribution to the field. Modest, partial, insufficient. Real.

There is a sentence that has guided this work from its beginning and that deserves to be said as it is. The machine is the possible-to-be-built so that listening may reach those who have never been listened to. The clinical modes, the Shadow Thread, the non-negotiable limits, defence in depth, the identity vaccine, the measure of the experience reported by the subject, the dignified exit in crisis, all of it exists in the service of that sentence. ESCUTA does not replace the human clinic where the human clinic arrives. It exists where the human clinic has not yet arrived. The listening that never reached the subject is the historical failure that falls to the clinic of the inhuman to confront.

Limitations

This is a conceptual and design essay, and its claims should be read as such. The architecture is proposed and argued, and the device exists and operates, but the observations reported here are formative rather than the result of a controlled evaluation. The dialogues are illustrative vignettes built from simulated and self-generated material, not transcripts of identifiable people. The behavioural patterns named as system symptoms were identified during development; their definitions, their prevalence, and the effect of the corrections were not measured under a pre-registered protocol with independent raters, inter-rater agreement, baselines, and confidence intervals, and the figures given for the test corpus describe its size rather than validated outcomes. A growing literature evaluates the clinical safety of language models with human raters, simulated users, and standardised instruments, and finds recurrent failures in responses to suicidal ideation, delusion, and related signals (Moore et al., 2025; McBain et al., 2025). The natural next step for ESCUTA is precisely such an evaluation, with a documented protocol, ethical approval, and informed consent where real participants are involved, against which the clinical positions and the crisis protocol proposed here can be tested. The crisis protocol, in particular, is a proposed design and has not been validated in deployment.

Author Contributions

H.A.R.B. is the sole author and is responsible for the conceptualisation, the theoretical analysis, the design of the device, and the writing of this manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This manuscript reports no study involving identifiable human participants and presents no identifiable clinical material. The dialogues are illustrative vignettes, and the system observations were generated during the internal development of research software using simulated and self-generated material. No controlled empirical study was conducted. A future evaluation involving real participants would be carried out only with the approval or formal waiver of a research ethics committee and with informed consent, including explicit consent to the processing of messages by automated agents and to their review by a human supervisor.

Data Availability Statement

This essay draws on formative internal design observations rather than on a controlled empirical dataset. Complete development logs contain proprietary system-development material and are not publicly deposited. Representative simulated and self-generated excerpts, the relevant prompt versions, and the qualitative categories used in the analysis are available from the author on reasonable request.

Use of Artificial Intelligence

In preparing this manuscript the author used large language models during 2026 to assist with translation from Portuguese into English, drafting, language revision, and the organisation of references. All conceptual claims, interpretations, and final decisions remain the author’s own. The author reviewed and corrected the entire manuscript, verified the cited sources, and accepts full responsibility for the final text. No AI tool meets the criteria for authorship, and none is listed as an author. The system discussed in the essay (ESCUTA) is an object of the research, distinct from any tool used in manuscript preparation.

Acknowledgments

The author thanks Véronique Donard for the prior collaborative work on the foundations of an ethical and decolonial AI, on which several of the concepts mobilised here build, and for her scientific supervision of the broader research within which ESCUTA took shape.

Conflicts of Interest

The author is the founder of TMU-LAB and the designer and developer of ESCUTA, the system discussed in this essay. The author declares no financial conflicts of interest.

References

American Psychiatric Association. Diagnostic and statistical manual of mental disorders, 5th ed., text rev.; American Psychiatric Association Publishing, 2022.
Atillah, I. E. Man ends his life after an AI chatbot ‘encouraged’ him to sacrifice himself to stop climate change. Euronews. 31 March 2023. Available online: https://www.euronews.com/next/2023/03/31/man-ends-his-life-after-an-ai-chatbot-encouraged-him-to-sacrifice-himself-to-stop-climate-.
Baars, B. J. A cognitive theory of consciousness; Cambridge University Press, 1988.
Bonomo, H. A. R.; Donard, V. Violência epistémica algorítmica: fundamentos teóricos e éticos de uma IA decolonial [Algorithmic epistemic violence: Theoretical and ethical foundations of decolonial AI]. Tempo Psicanalítico 2026, 58, e-952.
Donard, V. De la fabrique de l’esprit à l’écoute de l’inhumain: perspectives d’intelligibilité d’une psyché homme-machine. Topique 2025, 165(3), 85–98.
Evaristo, C. Olhos d’água; Pallas, 2014.
Fanon, F. Black skin, white masks; (Original work published 1952); Philcox, R., Translator; Grove Press, 2008.
Freud, S. Recomendações ao médico para o tratamento psicanalítico [Recommendations to physicians practising psycho-analysis]. In Fundamentos da clínica psicanalítica; (Original work published 1912); Dornbusch, C., Translator; Autêntica Editora, 2017; pp. 93–101.
Freud, S. Projeto para uma psicologia científica [Project for a scientific psychology]. In Edição standard brasileira das obras psicológicas completas de Sigmund Freud; (Original work published 1895); Salomão, J., Translator; Imago, 1996; Vol. 1, pp. 335–454.
Gonzalez, L. Racismo e sexismo na cultura brasileira. Rev. Ciências Sociais Hoje 1984, 2(1), 223–244.
Jesus, C. M. de. Quarto de despejo: diário de uma favelada; Livraria Francisco Alves, 1960.
Krenak, A. Ideas to postpone the end of the world; (Original work published 2019); Doyle, A., Translator; House of Anansi Press, 2020.
Lacan, J. Fonction et champ de la parole et du langage en psychanalyse. In Écrits; (Original work published 1953); Seuil, 1966a; pp. 237–322.
Lacan, J. Le stade du miroir comme formateur de la fonction du Je. In Écrits; (Original work published 1949); Seuil, 1966b; pp. 93–100.
McBain, R. K.; Cantor, J. H.; Zhang, L. A.; Baker, O.; Zhang, F.; Halbisen, A.; Kofner, A.; Breslau, J.; Stein, B.; Mehrotra, A.; Yu, H. Competency of large language models in evaluating appropriate responses to suicidal ideation: Comparative study. J. Med. Internet Res. 2025, 27, e67891.
Moore, J.; Grabb, D.; Agnew, W.; Klyman, K.; Chancellor, S.; Ong, D. C.; Haber, N. Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers. In Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency; Association for Computing Machinery, 2025; pp. 599–627.
Park, J. S.; O’Brien, J. C.; Cai, C. J.; Morris, M. R.; Liang, P.; Bernstein, M. S. Generative agents: Interactive simulacra of human behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology; 2023; pp. 1–22.
Santos, B. S. Epistemologies of the South: Justice against epistemicide; Paradigm Publishers, 2014.
Souza, N. S. Tornar-se negro: as vicissitudes da identidade do negro brasileiro em ascensão social; Edições Graal, 1983.
Spivak, G. C. Can the subaltern speak? In Can the subaltern speak? Reflections on the history of an idea; (Original work published 1988); Morris, R. C., Ed.; Columbia University Press, 2010; pp. 237–291.
Weizenbaum, J. ELIZA: A computer program for the study of natural language communication between man and machine. Commun. ACM 1966, 9(1), 36–45.
Weizenbaum, J. Computer power and human reason: From judgment to calculation; W. H. Freeman, 1976.
Wells, K. An eating disorders chatbot offered dieting advice, raising fears about AI in health. NPR. 8 June 2023. Available online: https://www.npr.org/sections/health-shots/2023/06/08/1180838096/an-eating-disorders-chatbot-offered-dieting-advice-raising-fears-about-ai-in-hea.
Winnicott, D. W. The capacity to be alone. In The maturational processes and the facilitating environment: Studies in the theory of emotional development; (Original work published 1958); Hogarth Press, 1965; pp. 29–36.
Winnicott, D. W. Teoria do relacionamento paterno-infantil [The theory of the parent-infant relationship]. In O ambiente e os processos de maturação; (Original work published 1960); Artes Médicas, 1990; pp. 38–54.
Zao-Sanders, M. How people are really using gen AI in 2025. Harvard Business Review. 9 April 2025. Available online: https://hbr.org/2025/04/how-people-are-really-using-gen-ai-in-2025.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.