Investigating the Neural Basis of the “Uncanny Valley”
David Hanson
The University of Texas at Dallas, Institute for Interactive Arts and Engineering

October 6, 2003

Keywords: Robot, Identity, Uncanny Valley, Realism, Realistic, Realistic Humanoid, Realistic Robot, Verisimilitude, Uncanny, FFA, Face recognition, Social Cognition

Address for author: 6117 Reiger Avenue Dallas Texas, 75214 214-827-9330 323-208-2788 [email protected]

Abstract

Many robot designers have avoided creating realistic depictions of the human face because of regard for the “Uncanny Valley”: Masahiro Mori’s theory that people will respond favorably to cartoons and to perfect realism, but will respond with fear and discomfort to depictions that lie in between. Although this theory has been accepted in robotics engineering for twenty-five years, there has been little attempt to investigate scientifically the anecdotal phenomenon that it describes. This paper proposes that the Uncanny Valley effect arises from a distributed network of brain systems that, in concert, function as an “emergency alarm”. This alarm system becomes acutely enabled by the detection of high-verisimilitude anthropomorphic stimuli, and rings if patterns that signal crisis are detected. The alarm will also ring (provided it is enabled) if certain patterns that signal a healthy social presence are not detected. This revised theory is renamed the theory of Bridge of Engagement (VDR).

Recent brain imaging [LaBar et al, 2003] has found that moving fear expressions shown to test subjects activate a distributed pattern involving the pSTS, the right lateral fusiform gyrus (FFA), and the amygdala. [Kesler-West et al, 2001] found similar results, but also found that “happy” expressions activated a very different distributed pattern than did expressions of negative affect. This pattern did not include elevated activity in the amygdala, but instead involved elevated activity in the medial frontal/cingulate sulcus, an area that has been found to be critical to the initiation of language [Crosson et al, 1999]. These findings support the notion that crisis stimuli such as a fearful expression will trigger a neural alarm of fear, whereas facial stimuli that do not cause fear lead to preparations for social engagement.

[LaBar et al, 2003] also shows that sliding “identity morphs”, which animate identity change from one individual to another, activate distributed neural patterns similar to those activated by expressions of fear, notably with elevated activity in the amygdala. This supports the VDR concept that if high-resolution identity cues fall outside expected patterns, the brain will signal alarm. This paper also proposes future experiments that can further elucidate the structure of this alarm system by supporting or controverting our hypothesis that such a social emergency alarm increases its level of discrimination when exposed to high-resolution, high-verisimilitude cues.

Outline

1. Introduction
2. Background
  2.1 Background Science
  2.2 Philosophical Background
    2.2.1 The Uncanny Freud
    2.2.2 Sociable Robotics debates the Face
    2.2.3 Masahiro Mori’s mistake
3. Broaching a New Theory of the Uncanny
4. First Proof of VDR
  4.1 Scientific hypothesis, in detail, Communications systems
  4.2 Proposed Experiments
5. The Neurophysiology of the VDR
6. VDR Triggers and Factors: Exploring the Dimensions of Verisimilitude
7. Concluding remarks
8. Acknowledgements
9. References

1. Introduction

In recent years, sociable robots and agents have become relatively accepted as tools for investigating social cognition [Fong et al 2003], yet the engineers of these robots have avoided making realistic depictions of the human face because of longstanding regard for the “Uncanny Valley”: Masahiro Mori’s theory that people will respond favorably to cartoons and to perfect realism, but will respond with fear and discomfort to depictions that lie between cartoon and realism. The Uncanny Valley theory further holds that perfect realism is impossible to emulate, because human facial communication is so dense with subtlety. Partially as a result of this theory, roboticists widely believe that the pursuit of socially acceptable levels of realism is a fool’s errand and should not be attempted. Yet, for the goal of investigating human social cognition, it intuitively seems useful for robots to emulate human appearance as realistically as possible.

Although the theory of the Uncanny Valley has been generally accepted as an engineering principle for twenty-five years, there has been little attempt during that time to investigate scientifically the phenomenon that it describes. This paper seeks to push this theory out of the realm of conjecture and into hard science, by investigating several aspects of a neural basis for the Uncanny Valley, describing several related hypotheses, and proposing future experiments. Our central hypothesis holds that the Uncanny Valley effect arises from a distributed network of brain systems that act as an “emergency alarm”. This alarm becomes acutely enabled by the presence of high-verisimilitude anthropomorphic stimuli, and rings if patterns that signal crisis are detected; while enabled, it will also ring if certain patterns that signal a healthy social presence are not detected. This revised theory is renamed the theory of Bridge of Engagement (VDR).
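The logic of the central hypothesis can be illustrated with a minimal sketch. This is only one possible reading of the hypothesis, not a model proposed or tested in this paper: the names `Percept`, `alarm_rings`, and `ENABLE_THRESHOLD`, the 0-to-1 verisimilitude scale, and the threshold value are all hypothetical, and enablement is simplified to a binary switch even though the text suggests it may be graded. Because the text attaches the "while enabled" qualifier only to the missing-cue condition, the sketch lets crisis cues ring the alarm at any level of verisimilitude.

```python
from dataclasses import dataclass

# Arbitrary illustrative value, not an empirical estimate.
ENABLE_THRESHOLD = 0.7


@dataclass
class Percept:
    """A toy description of an anthropomorphic stimulus (all fields hypothetical)."""
    verisimilitude: float        # 0.0 (abstract) to 1.0 (fully realistic)
    crisis_cues: bool            # e.g. a fearful expression is detected
    healthy_presence_cues: bool  # cues of a healthy social presence are detected


def alarm_rings(p: Percept, enable_threshold: float = ENABLE_THRESHOLD) -> bool:
    """Return True if the hypothesized alarm would ring for this percept."""
    # High-verisimilitude stimuli enable the watchdog (binary simplification).
    enabled = p.verisimilitude >= enable_threshold
    # Crisis patterns ring the alarm regardless of enablement (assumed reading).
    if p.crisis_cues:
        return True
    # Missing healthy-presence cues ring the alarm only while it is enabled.
    return enabled and not p.healthy_presence_cues


if __name__ == "__main__":
    realistic_but_lifeless = Percept(0.9, crisis_cues=False, healthy_presence_cues=False)
    cartoon_without_cues = Percept(0.2, crisis_cues=False, healthy_presence_cues=False)
    print(alarm_rings(realistic_but_lifeless), alarm_rings(cartoon_without_cues))
```

Under these assumptions, a realistic face lacking healthy-presence cues rings the alarm, while an equally cue-poor cartoon does not, which is the asymmetry the hypothesis is meant to capture.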
It is well supported [O’Toole et al 2002] that social recognition is divided into two distinct neural activities: one is low-resolution, motion-based face and gesture recognition, and the other is high-resolution, static recognition. This paper argues that this division provides the framework for the bridge of engagement, such that only high-resolution facial recognition will trigger full expectations of face-to-face social exchange and simultaneously enable the full strength of the VDR “watchdog” pattern detectors.

Recent brain imaging [LaBar et al, 2003] has found that moving fear expressions shown to test subjects activate a distributed pattern involving the pSTS, the lateral fusiform gyrus, and the amygdala. [Kesler-West et al 2001] found similar results, but also found that “happy” expressions activated a very different distributed pattern than did expressions of negative affect. This pattern did not include elevated activity in the amygdala, but instead involved elevated activity in the medial frontal/cingulate sulcus, an area found to be critical to the initiation of language [Crosson et al, 1999]. These findings support the notion that crisis stimuli such as a fearful expression will trigger a neural alarm of fear, whereas facial stimuli that do not cause fear lead to preparations for social engagement.

[LaBar et al, 2003] also shows that sliding “identity morphs” (animations of one individual’s identity changing into another’s) activate distributed neural patterns similar to those activated by expressions of fear, notably elevating amygdala activity. This supports the VDR concept that if high-resolution identity cues fall outside expected patterns, the brain will signal alarm. These reflexes would be consistent with some evolutionary scenarios, wherein certain visual patterns, or deviations from normal appearance, may have signified circumstances of danger such as disease, psychosis, physical trauma, psychological trauma, subterfuge, or an interaction with a member of a foreign social group. This paper also proposes that a set of minimal cues may be required to satisfy the VDR watchdog, and that this set is not as dense as has been previously assumed. Section 6 considers possible VDR triggers, and proposes experiments to isolate them.

Structure of the paper

This paper will be braided from two strands: one scientific and the other philosophical. While roughly equivalent attention will be paid to each strand, the philosophical strand will range rather widely, considering aesthetic, cultural, and artistic implications, as well as implications for the design of sociable robots. The scientific strand will stay grounded in prior science, conservatively constructing novel hypotheses to be verified at a later date by executing the proposed experiments. It is hoped that this dual approach will help contextualize the topic, offer improved guidance for future scientific inquiry, and facilitate practical application of the science to the design of sociable robots and agents.

2. Background

Thanks largely to powerful new brain imaging techniques, recent decades have begun to answer many questions (and to open yet more new ones) about how the brain processes human social appearance. This body of science offers strong clues to a possible neural basis of the Uncanny Valley, and provides the foundation for our alternative theory of the Bridge of Engagement.
Section 2.1 surveys relevant prior science.

2.1 Neural Systems Related to Face-based Social Cognition

This section will introduce the major organs related to face-based social cognition, and will survey notable discoveries and debates. Visual data processing occurs in a highly parallel and distributed manner [Haxby et al, 2001; Spiridon and Kanwisher, 2002], but certain organs appear specialized for processing restricted types of data, and may be especially entrained upon prescribed visual patterns. For example, the lateral fusiform gyrus responds especially to facelike forms [Puce et al. 1995; Kanwisher et al. 1997]. To perform real cognitive tasks, such as socialization, specialized face-processing structures operate in close communication with other structures, such as emotion processors (amygdala, anterior cingulate cortex), lexical systems (Broca’s area, frontal lobes, medial frontal/cingulate sulcus), place recognizers (parahippocampal place area and the parietal lobes), and object recognizers [Schultz et al, 2001; LaBar et al, 2003].

Parallelism permeates the human visual system. Subretinal nerve cells first separate high-resolution static data (as processed by parvo cells) from low-resolution motion data (as processed by magno cells). These divisions continue up through the visual system, such that spatial and motion information may stream into the dorsal “where” system, composed of the Medial Temporal, Parietal and Frontal lobes, while high-resolution static-pose data streams into the ventral “what” system, which perceives objects’ invariant aspects and comprises the Occipital Cortex and in particular the Inferior Temporal (IT) lobes [O’Toole et al, 2002; Swiercinsky, 2000; Riesenhuber and Poggio, 2000].

Expert Face Detection

The higher levels of the “what” system expertly detect faces, objects, and places, with invariance to shifts in lighting, rotational transformations, and so on. This feat apparently is achieved by close collections of cells that are tuned to encode various views of an object [Riesenhuber and Poggio, 2000]. One such collection of cells, the right lateral fusiform gyrus, is well demonstrated to respond to invariant aspects of the human face (when contrasted with object or house recognition), and so has been dubbed the Fusiform Face Area (FFA) [Puce et al. 1995; Kanwisher et al. 1997; Kanwisher 2000]. The FFA is intimately interconnected with brain regions that manage lexical information, such as names and other language-related data [O’Toole et al, 2002]; its location (like that of the entire IT “what” system) allows fast matching between expertise in invariant visual attributes and expert language [Lamb, 1999; Adolphs, 2001].
Other regions, notably the Occipito-Cortex zones M100 and M170 [Kalanit Grill-Spector 2002; Liu et al, 2000], and the Occipital Face Area (OFA) [Gauthier 2000; Helpert, 1998; Haxby 1999], have been shown to be responsive to the human face, but without the capacity for expert identification that is exhibited by the FFA. Given that facial recognition may be useful to many social cognitive tasks, such redundant facial responsivity may offer improved localized computational efficiencies.

The interpretation of the lateral fusiform gyrus as innately face-specific is hotly contested by [Gauthier et al, 1998; Gauthier et al, 2000], in several studies showing that the lateral fusiform gyrus may also be trained to be expert in identifying non-facelike objects, such as cars, birds, or “greebles” (specially designed non-facelike characters). Perhaps, though, the right lateral fusiform gyrus is both innately biased to prefer faces and flexible enough to be tuned for more general-purpose visual expertise. Consistent with this conjecture, [Tzourio-Mazoyer et al, 2002] found, in PET scans of eight-week-old infants, strong activity in the FFA precursor areas after the infants were shown images of female faces. Given that eight weeks is also the age at which infants first begin to be able to decipher the inner regions of visual stimuli [Tzourio-Mazoyer et al, 2002], this study implies that the FFA swiftly, if not immediately, favors the human face. Regardless of whether the FFA’s preference for faces is scaffolded upon inborn urges or is entirely conditioned, many studies demonstrate that the FFA plays a principal role in expert face detection.

Additional studies link the FFA to social initiation and engagement. Supporting the FFA’s role in social engagement, [Schultz et al, 2003] found (quite by accident) that approximately 50% of the FFA is strongly activated when subjects view abstract social animations in a social attribution task (SAT) that was devoid of facelike attributes. This finding challenges the assumption that FFA activities are restricted to the perception of faces, or even of expert objects. The authors speculate that when processing social relationships, the FFA may encode abstract semantic information associated with faces, to be accessed for social computations. The FFA’s involvement in the initiation and maintenance of social interactions, and its capabilities for subtle visual expertise, may indicate that the FFA serves a critical role in the hypothetical VDR watchdog network.

In addition to realistic faces, the FFA responds strongly to cartoons, “mooney” faces, and cat faces [Tong et al, 2000], but these abstractions may not carry all the subtle verisimilitude percepts that may be required to enable the VDR alarm system fully. Future work and experiments are needed to test how the FFA responds to multiple dimensions of increasing verisimilitude, and to various violations of verisimilitude as it increases. These points will be further substantiated in sections 3 and 4 of this paper, and experiments will be proposed in sections 4.2 and 5.

Motion Processing

Via the dorsal stream, dynamic attributes of the face are processed separately from the invariant attributes that are handled by the FFA. One portion of the dorsal “where” system, the posterior superior temporal sulcus (pSTS), processes most of the dynamic actions of the face, including gaze-tracking, head orientation, facial movement, the identification of facial expressions, and speech-related motions [Critchley et al. 2000; Hoffman & Haxby 2000; O’Toole et al. 2002].
The pSTS also appears to be involved in some perception of emotion [Schultz, 2003]. Any task of verisimilitude discrimination, then, would be divided among several coordinated task domains, starting with invariant imaging and motion imaging. A third task domain would be the distributed representation, or Object Form Topology.

Distributed Representations

Multiple studies have found that visual social cognition is achieved by coordinated and cascading actions among many structures. [Haxby et al, 2001] found that images of faces, cats, houses, and objects (and noise) each activate distinct, distributed patterns of neural activity, suggesting that these patterns encode representations of the subject. These distributed representations, dubbed “Object Form Topologies” by the authors, reveal that a visual subject, such as a face, is not processed exclusively in an expert area like the FFA, but is also processed in areas previously thought to be primarily dedicated to other subjects. In reproducing and extending this work, [Spiridon and Kanwisher, 2002] found that while distributed patterns do recognize faces, they do not perform expert identification, and so cannot supplant the function of expert modules such as the FFA. This finding is consistent with double dissociation in prosopagnosics, another well-cited stanchion for the face-specific expertise of the right lateral fusiform gyrus. Expressing some surprise, these authors did find that expert modules, such as the FFA, do recognize other visual information types, such as objects and houses, but that these modules did not appear capable of expert identification outside their specializations.

It may be that such distributed patterns encode semantic, relational associations among faces, places, and objects, and the diverse informational aspects of each (for example, a face’s moving and invariant information [O’Toole et al, 2002]). Additionally, distributed redundancy of recognition may more effectively encode nuances of complex visual objects. A distributed network may also serve to synchronize the recall of object data across regions, for swift associative reasoning.

Haxby’s distributed representations provide the basis of [LaBar et al, 2003], which found that visual stimuli of moving expressions (fear, anger) activate distinct distributed patterns, involving varied structures that generally included the pSTS, the amygdala, and, most notably, the FFA. In this study, the FFA responded with greater activity to dynamic expressions than to static expressions, and this response was particularly pronounced when subjects viewed fear expressions. [Kesler-West, 2001] also detected elevated FFA response to various social affects and facial expressions, including happy, sad, fearful, and angry expressions. As with LaBar, the FFA was particularly sensitive to fearful expressions, and the amygdala also showed elevated activity in response to fearful expressions. It is notable that neither the amygdala nor the FFA responded especially to happy expressions. Happy expressions instead activated the medial frontal/cingulate sulcus, an area found to be important in the initiation of language. These findings of Kesler-West and LaBar imply that percepts of fearful expressions engage very different neural systems than do percepts of happy or sad expressions.
More generally, these findings may indicate that visual percepts that indicate social ease will trigger preparations for social engagement, while visual percepts that convey crisis preempt the easy-going social actions, to prepare for danger. [LaBar et al, 2003] also found that similar fear activations occur when subjects view a sliding identity morph, which the authors suggest may be a reaction to implausibility or to signals interpretable as subterfuge, innately associated with crisis via evolution. It does seem possible, however, that this response is caused by simple logical inconsistency, in essence causing a double-take, a rapid error-check. In both cases of crisis reaction, the FFA played a prominent role. It could be that the FFA is involved by expertly responding to crisis-associated patterns, such as fearful expression; this concept is hypothesized in detail in sections 3 and 4 of this paper, with experiments for testing it proposed in section 5. Also, the FFA may coordinate with the superior temporal sulcus and the amygdala to become much more active and attentive to surprising patterns in conjunction with anthropomorphic percepts. This conjecture is also considered further in section 3.

It is widely accepted that the amygdala plays a strong role in face-related social cognition [Schultz et al, 2003]. The amygdala also has been shown to be prominent in the distributed networks reported in [Kesler-West et al, 2001; LaBar et al, 2003; Schultz et al, 2003]. This distributed social network is generally composed of the amygdala, the medial prefrontal cortices, and the fusiform gyri (FG) (which include the FFA), along with the STS when changeable and dynamic aspects are involved. [Schultz et al, 2003] in particular found that the amygdala responds to neutral, non-emotional faces, for the first time implicating the amygdala in non-emotional face processing. Because the ventral cortices are highly integrated with the amygdala [Amaral & Price 1984], data may be exchanged at a high rate between the fusiform gyri and the amygdala. The amygdala also reliably activates when subjects are asked to judge personality characteristics from images of faces [Adolphs et al. 1998; Baron-Cohen et al. 1999; Winston et al. 2002; Schultz et al., 2003]. The amygdala and its gateway, the anterior cingulate cortex, are well known to produce the social “feelings” that are critical to probabilistically learning and managing social complexities [Damasio, 1994; Adolphs et al., 1998].

[Schultz et al., 2003] found that the right lateral fusiform gyrus operates in close tandem with the right amygdala in a distributed network to process social tasks. In tasks of social calculation, the authors suggest that the amygdala guides the FFA towards more fine-tuned, situational face processing. In this light, the findings of LaBar and Kesler-West may indicate that this amygdala-FG feedback elevates with fear-associative face-image stimuli, causing the FG to search more closely for expert signs of danger and safety.

Face Perception and Verisimilitude Sensitivity

The face serves as the primary visual apparatus for social interface, and as our principal marker of individual identity. Given how captivating the face is, and how engrossing a face-to-face conversation can be, we likely evolved vigorous defenses that need to be satisfied before we can relax socially.
The literature on the neural architecture of visual social cognition in humans appears tantalizingly consistent with this concept of face-based social defense, and aligns roughly with the anecdotal concepts described in the theory of the Uncanny Valley. However, further research and experimentation are required to verify that these correlations are in fact causally related; only then can a more comprehensive theory of such a social defense system be considered trustworthy. Many questions stand out in this quest. What are the relevant dimensions of verisimilitude? Do the rules of paralinguistics build upon these basic defensive protocols? Will “mooney” faces trigger different activations in the FFA than “realistic” faces? How about in distributed representations? Does such a defense system relate uniquely to human identity? How minimal may the verisimilitude be for a robot to be both realistic and socially appealing?

Before advancing hypotheses toward a rudimentary theory in section 3, an examination of the Uncanny Valley through the lenses of psychology, the arts, and robot design may help further contextualize our hypotheses.

2.2 Philosophical Background

For thousands of years, the arts have portrayed the human face, in literally billions of variations. These depictions have proliferated into a dense evolutionary bush,

populated by abstractions, cartoons, subtle distortions, and shades of high verisimilitude, in myriad media such as sculpture, painting, and animation. The human form has been rendered as appealingly naturalistic as the Pietà by Michelangelo, as distortedly appealing as Mickey Mouse, and as grotesquely distorted as the computer-animated Gollum of the Lord of the Rings film The Two Towers. Needless to say, the arts have little hesitation about handling both verisimilitude and the “uncanny”. Moreover, the results are generally not repellent, but rather endlessly fascinating, and can even inspire feelings of love in people.

In common with the sociable robot engineer, the artist emulates the communicative function of the face and figure with manipulated technological media. Yet, as smart as this artistic problem-solving must be, it is scarcely formalized at best. Engineering science motivates sociable robotics, and of course, engineering delivers the technology’s profound, if emerging, functionality. And from an engineering perspective, the artist’s methods may appear to be about as legitimate as a sheaf of folk remedies. To reconcile these opposing perspectives on verisimilitude and visual communication, the following sections consider each perspective in some detail. Along the way, we hope to extract some hints about the nature of humanity’s exacting discrimination of human facial aesthetics.

2.2.1 Mori’s Mistake

Sociable robots are coming of age. Though not nearly as capable as humans, many “human emulation” technologies have advanced substantially in the last decade, showing remarkable surges in functionality including face tracking, feature tracking, visual biometric identification, bipedal locomotion, and semantically rich natural language processing [Kurzweil, 1999; Menzel, 2001; Bar-Cohen & Breazeal, 2003].
Sparked largely by the mid-1990s MIT graduate work of Cynthia Breazeal, sociable robots integrate many of these human emulation technologies into singular synthetic organisms, designed to affect naturalistic mannerisms in order to communicate more effectively with people [Breazeal, 2002]. While these robots only crudely simulate social cognition, they are being actively used as modeling tools in cognitive science [Fong et al, 2003]. Since Breazeal’s seminal work, a sizable number of sociable robots have sprung into existence. Although a comprehensive list is beyond the scope of this paper, examples include Ridley at MIT (led by Deb Roy), Nursebot Pearl of CMU, Kismet and Leonardo of MIT (led by Cynthia Breazeal; see figure 1), and Mabel at the University of Rochester (built by a team of undergraduates). Additionally, companies including Panasonic, Sony, and Honda have pursued sociable humanoid robots. Although these robots all seek to achieve bio-inspired communicative interaction with humans, none has a realistic humanlike face.

Figure 1. Sociable robots; left: Leonardo of MIT; right: Nursebot Pearl of CMU.

Several groups use computer simulations of realistic faces for their robots and autonomous agents, for instance Cassell’s Rea at MIT, and Vikea of CMU’s Sociable Robots Group. Perhaps this is because video games have made humanlike simulations more acceptable. Otherwise, the avoidance of anthropomorphic realism has been remarkably pervasive among the hundreds of robotics groups conducting relevant research. Two exceptions stand out: Fumio Hara’s lab at the Science University of Tokyo [Hara, 1998], and the work of the author of this paper at the University of Texas at Dallas [Hanson, 2002; Ferber, 2003]. Each has pointedly pursued realistic human identity in robotics (see figure 2).

Figure 2. High-verisimilitude anthropomorphic robots, left: the Science University of Tokyo; right: the University of Texas at Dallas.

Since the mid-1990s, nearly every debate on the topic of anthropomorphic identity for sociable robots has cited the work of Hara, sometimes pointing out that Hara’s robots are not altogether comforting to behold. In conversation, one robotics researcher stated that Hara’s robots gave her the “creeps”, and were less appealing than more cartoonish incarnations of robots. One may respond to such criticisms by stating that the limits of robotic verisimilitude might be technological and not fundamental. Yet the mentioned researcher’s hesitations echo the general response of the AI and robotics community. While at times, abstraction versus realism is vigorously debated, most prominent literature [Caporael, 1990; Duffy, 2002; Breazeal, 2002; Fong et al, 2003] concludes that the pursuit of anthropomorphic verisimilitude is quixotic, if not downright destructive to the field of robotics. The feeling is that human appearance may cause robots to be prematurely rejected, as realism causes people to be overly discriminating of flaws and to expect more from robots than they can presently deliver. One of the most cited manifestations of this argument, the theory of the Uncanny Valley, appeared in a 1975 essay by Japanese robotics researcher Masahiro Mori [Reichardt, 1978]. Here, Mori speculated that as an anthropomorphic object looks and acts more realistically human, it will receive increasingly favorable reaction, but only up to a limited point of realism (see figure 3). After this node, however, the viewer starts to become more distracted by flaws in humanoid demeanor, such that the object will soon become highly disturbing to a person. This graphic depression in favorable opinion is the valley of Mori’s Uncanny Valley. 
Mori further speculated that should the object increase sufficiently in realism, viewer opinion will eventually rise back out of the valley, cross the neutral threshold of viewer opinion, and ultimately (once appearance rivals realism) turn into complete acceptance. Based upon this theory, though, Dr. Mori concluded that anthropomorphic robot designs should always stop short of the Uncanny Valley to avoid public fear and loathing or, worse yet, the complete rejection of robots by the public.
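The qualitative shape that Mori describes (rising favor, a dip below neutral, then recovery to acceptance) can be sketched as a toy function. Mori supplied neither an equation nor data, so everything here is an assumption made purely for illustration: the function name `toy_affinity`, the 0-to-1 realism scale, the Gaussian form of the dip, and the parameters `center`, `width`, and `depth` are all hypothetical and carry no empirical weight.

```python
import math


def toy_affinity(realism: float,
                 center: float = 0.8,
                 width: float = 0.07,
                 depth: float = 1.5) -> float:
    """Toy curve with the shape Mori describes; positive means favorable.

    realism: 0.0 (fully abstract) to 1.0 (indistinguishable from a human).
    The linear term models steadily rising favor; the Gaussian term
    carves a narrow "valley" around `center`. All values are arbitrary.
    """
    rising_favor = realism
    valley = depth * math.exp(-((realism - center) ** 2) / (2 * width ** 2))
    return rising_favor - valley


if __name__ == "__main__":
    # Print a coarse sweep so the rise, dip, and recovery are visible.
    for step in range(0, 11):
        r = step / 10
        print(f"realism={r:.1f}  affinity={toy_affinity(r):+.3f}")
```

With these arbitrary parameters the curve rises through moderate realism, falls below the neutral line near 0.8, and recovers above its earlier values at 1.0. Changing the parameters moves or removes the dip, which underlines that the curve is a conceptual device rather than a measurement.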

Figure 3: Masahiro Mori’s Uncanny Valley of anthropomorphic rejection (illustration by [Bryant, 2003]).

Although no data has been collected to substantiate the Uncanny Valley theory, it is the closest thing to an engineering principle that exists for guiding the design of anthropomorphic robot identity. It is important to emphasize that the chart provided in Figure 3 is conceptual, and not based on data.

Other aspects of the Uncanny Valley theory are specious as well. For example, in the theory, verisimilitude is not well defined. Many dimensions of aesthetic percept fluctuate widely in the examples given with the theory. As the axis moves towards increasing verisimilitude, a corpse appears as an example. Certainly, disturbing and unhealthy percepts are associated with corpses, which would represent a regression in verisimilitude. With this and the other sickly examples given, the chart actually decreases (or rather, fluctuates wildly) in realism (or in critical subdimensions of verisimilitude) along the axis that putatively increases in realism. And yet, even such analysis is only tenuously legitimate, as none of the examples given are actually quantified by Mori. We haven’t a clue what the examples actually look like, and no experimental method is described. In short, although the Uncanny Valley theory may hint at real phenomena, it is not real science.

To get a better handle on the phenomena, the perceptual dimensions of verisimilitude need to be quantified, and their variable relationships hypothesized and experimentally evaluated. Needless to say, many variables must be at play, operating in concert to compose the one variable of “verisimilitude”. It may be useful to first divide the variables into two classes: those associable with realism, and those that connote disease. Much subdivision and careful discrimination will be required to define the subvariables under each category; a first pass at this task is the ambition of section 6 of this paper.
Another assumption of this theory is that cartoons are inherently outside of the valley. But cartoons may be at least as disturbing as realistic figures, as can be demonstrated by throngs of Disney villains, ghouls, and predators. Even a decapitated stick figure can be disturbing. Another potentially erroneous assumption of the Uncanny Valley theory is that a dip (the valley that defines the theory) unavoidably occurs between abstract representation and realism. This may be anecdotally countermanded by anyone who has seen people walk out of a distance. The people first appear as simple, low resolution shapes with few motion cues; slowly they increase in resolution and density of verisimilitude percepts. The subsequent transition to full resolution is then continuous, yet conspicuously devoid of an uncanny valley: at no point of intermediate resolution does an observer (under common circumstances) plunge into a state fear or discomfort. Moreover, as described in 2.2., the figurative arts provide numerous, wideranging examples of abstraction and near-verisimilitude that perhaps should dip into the valley, but do not. Art may be viewed as another example of a low- to highverisimilitude continuum that can be devoid of an Uncanny Valley. So, perhaps the valley is not an adequate metaphor (this, I claim, is Mori’s mistake). Could it be that the criteria for human social preference simply become

more demanding for forms and motions in which humans are more expert? If so, then a better metaphor could be a road, which thins to a path, which thins into a tightrope as the aesthetic approaches full “realism”. As verisimilitude increases, it may be increasingly challenging to satisfy the finicky tastes of human visual cognition, but it seems premature to conclude that the quest is impossible. The challenges and recent progress in technologically bridging the valley are addressed briefly in sections 2.2.3 through 2.2.5 of this paper.

2.2.2 The Conceptual Heritage of the Uncanny

The concept of the uncanny itself has quite a heritage, which is illuminating to consider. "Uncanny," which etymologically means "not known, not safe, or not comfortable", is defined by Merriam Webster as “seeming to have a supernatural character or origin: eerie, mysterious”, and by the American Heritage Dictionary as “so keen and perceptive as to seem preternatural”. Additionally, “uncanny” is commonly used to describe a reproduction that is so close to the original that it is startling. In 1919, Freud wrote “Das Unheimliche”, or “The Uncanny” [Freud, 1919/1955], stirred by an essay by Ernst Jentsch on the same topic and by the gothic sensibilities of the era’s literary culture. Freud’s essay reduced the “uncanny” to the anxiety that results from various real and imagined surreal circumstances, either conceptual or sensory/aesthetic in origin, which frequently relate to madness, the doppelganger (or double), death, and the confusion of the animate and the inanimate [Masschelein, 2003], themes at play largely in the Gothic oeuvre. Yet, before and since, literature and the arts have timelessly circled the “uncanny” (the concepts, not the word per se); monsters, ghouls, and the walking dead seem to be transcultural iconography, which perhaps indicates the existence of an inherent wariness of certain patterns of mortal threat. 
The 18th and 19th centuries were particularly fascinated by the themes of the uncanny, which inspired the great gothic novels, including Mary Shelley’s pertinent classic Frankenstein. One of the great works to address the uncanny, Shelley’s Frankenstein exhibits special (if not downright uncanny) relevance to many issues raised in the creation of sociable robots. In the novel, humans reject the “monster” solely due to a grotesquely distorted appearance, which was rife with symptoms of mortality [Shelley, 1831]. Driven mad by the pain of abandonment, the monster turns its amazing talents away from creativity, to a spree of annihilation. The mythic rejection of this character mirrors Mori’s fear that a realistic robot would be loathed as if it were a walking corpse. Yet, if we can scientifically and technologically unravel the mysteries of the uncanny effect, there may be hope to spare our future robotic progeny from such painful abandonment, even while empowering them with the full-bandwidth nonverbal expressivity that only a realistic face may provide. 2.2.3 Art Following the finishing stroke on the marble sculpture of the Moses, legend says that Michelangelo Buonarroti stepped back to behold his awesome creation for a

moment; then, with a sudden, batonlike shake of his chisel at the work, he bellowed: “BREATHE!” While the sheer appearance of great art does convey the impression that artists have breathed life into raw material, the technology to cause raw material to behave and think like living creatures has only begun to be a real prospect. The disciplines of Biologically Inspired Engineering channel technology into lifelike forms, and channel bioscience into engineered technology. Bio-inspired Intelligent Robotics merges these pursuits to simulate entire organisms [Bar-Cohen & Breazeal, 2003]. Bio-inspired robotics, and especially sociable robotics, may be anticipated to blossom into one of the great artistic media of all time, coming closer than any other to portraying the spirit of a living, intelligent organism. Note that bioinspired literally means “life breathed in”. As great artists refine sociable robotics (in collaboration with great scientists and engineers), the technology will be driven to new extremes of expressivity. This will require novel formalization of artists’ methods into engineering technique. This convergence of engineering and art will be most effective if done in concert with scientific inquiry into social cognition. As one step in this quest, experiments may use artists’ nuanced distortions and abstractions of the face (as in figure 4) to test subjects’ verisimilitude discrimination response. One such experiment is proposed in section 4 of this paper.

Figure 4 “Mask,” a 5-foot tall self-portrait by artist Ron Mueck, shows the sophistication of artistic aesthetic biomimesis, and demonstrates the surreal effects of controlled verisimilitude deviations; here the dimensions of deviation would be scale and exaggerated brow contortion. Photo by Mark Feldman. The heuristics of artistic talents can be considered a “black box” problem solver for tasks of aesthetic biomimesis [Hanson et al, 2003]. The trained artist translates the natural, internal human faculty for communication out into external tools and materials, presumably by active feedback between neural systems responsible for social visual cognition, and the media at hand.

Artists do this trick in static media (i.e., sculpture, painted surfaces, etc.), but also when designing movement as animation, and in words as well, as when composing narrative. All of these will come into play in the medium of sociable robotics. Artists also report anecdotal evidence of verisimilitude discrimination response. In [Tomlinson, 2000], animator Henry Selick relates to Tomlinson an uncomfortable first pass at animating the motion picture “James and the Giant Peach”. The characters were first designed to be highly realistic, but when the first footage returned from development, viewers reported “a creepy feeling that they were watching a dead child crawling around the set”. The characters were subsequently abstracted a bit (presumably other variables remained largely unchanged), and the audience was much less repelled. A June 2002 Wired article [Weschler, 2002] surveys the struggles of the computer effects industry to simulate a believable face, and the box office rejection of the most realistically animated film to date, “Final Fantasy”. The article concludes that an effective social narrative can make a substantial difference in how acceptable a face may be. This conclusion aligns with [Schultz et al, 2003], which shows that social interactions affect the FFA independently of facelike percepts. Perhaps a distributed verisimilitude discrimination response (VDR) network allows strong social narrative percepts (which may include vocal speech as well as visually represented narrative sequences) to override otherwise compromised visual verisimilitude. Given that “James and the Giant Peach” was created with stop-motion animation (which doesn’t allow for control of as many variables as might a robot) and that “Final Fantasy” is made with a technology that is still in its infancy, the anecdotal problems with verisimilitude likely spring from technological, not fundamental, limitations. 
With the character Gollum in the film Lord of the Rings, the animators succeeded in convincingly portraying a living humanoid; the thing doesn’t look at all like the walking dead (although it does look creepy, to be sure). Yet this creation required millions of dollars, and the efforts of legions of artists, animators, and one highly skilled actor. And it is not an autonomous agent; the elements that made it seem alive are neither formalized nor automated yet. In short, the above anecdotes strongly imply that humans do get much more finicky in response to high-verisimilitude percepts, and that extra effort is required to engineer a humanoid face that satisfies our finicky tastes for verisimilitude. Because a virtual character requires the simulation of an enormous amount of physics, it may actually be simpler to emulate a realistic humanoid face with physical materials, as robots. A robot also may convey benefits by occupying our 3D physical space.

2.2.4 Animatronics

In addition to ancient marbles and modern cinema animation, the fine arts have demonstrated effective aesthetic biomimesis in mechanical “automata”, as far back as the ancient Greeks [Cassell, 2002]. In the 20th century, artisans furthered this tradition of figurative automation by combining modern materials, mechanical principles, and electronics into sophisticated robotic puppets that are commonly called “animatronics”, after Walt Disney’s “Audio-Animatronics”™ (see Figure 5). At the 1964 World’s Fair, when Walt Disney unveiled his first Audio-Animatronic™ figure, a talking, gesticulating Abraham Lincoln, it caused a public sensation. Though computer scientists regarded the creation as more carnival prop than engineered object, people in general thought humanity was suddenly on the verge of creating its mechanical peers. It is true that animatronics such as “Abe” are merely electromechanical puppets, not automated except when enacting a prerecorded script; nevertheless, these devices are relevant here because this technology mimics the appearance and movements of humans, sometimes seeking high verisimilitude.

Figure 5 Audio Animatronics™ in Disney’s Haunted Mansion at Disneyland, designed and implemented in the 1960’s.

There is little sign that the Uncanny Valley has caused the rejection of the Pirates of the Caribbean at Disneyworld, or of hundreds of other figures in theme parks globally, or of the thousands of animatronic figures that have appeared in films including Inspector Gadget, Star Trek Nemesis, and AI. Animatronics seems like a natural match for sociable robotics. Indeed, the merger of animatronics and sociable robotics has begun: in 2002, one of the leading shops in animatronics built the mechanical and aesthetic systems of Cynthia Breazeal’s Leonardo robot [Bar-Cohen & Breazeal, 2003; Landon, 2003]. Yet Leonardo is decidedly not human in form. The Leonardo team’s qualms against realism do seem legitimate: that the realistic animatronic technology is not

quite human enough to be convincing, but just human enough to pop open people’s expectations like a can of worms. Here again, it seems that the Uncanny Valley alludes to real phenomena, underscoring the imminent need for a more formal, data-supported theory on verisimilitude response. While animatronics has greatly matured through the last four decades of development, these devices do not perfectly mimic human expressions. Mimicking the aesthetics of the human face and its dynamic action is not simple. Human faces have very complex modes of expression; wrinkles lie dormant all over the face, invisible until an expression is enacted. Opposing movements will evoke radically different types of folding and bunching of skin. Extremely soft, human skin is intimately affected by the pulling of various internal layers of tissues, each of which may have a different Young’s modulus and other physical characteristics [Hanson, 2002]. Moreover, because human skin is webby, liquid-saturated cellular tissue, it is tricky to emulate with solid elastomer. The molecules of a solid elastomer adhere to their nearest neighbors, and so remain geometrically constrained when distorted, in a manner unlike facial flesh, which behaves more like an array of billions of tiny water balloons. By a quirk of physics, when solid elastomers are elongated, they transduce kinetic energy into spring tension by pulling the disordered polymer molecules into alignment (in effect very efficiently removing entropy). There the energy will be stored until it can be returned into kinetic energy, except for the small amount that is transduced to heat or light. This effect is just close enough to the behavior of flesh to have enticed a half-century of special effects artists into believing that solid elastomers are the right stuff. However, the quirk of physics does not function as neatly when otherwise distorting a solid elastomer, so elastomers require much more force to compress, for example. 
This is because the molecules of a solid elastomer adhere to their nearest neighbors and, so, are geometrically constrained when compressed or otherwise distorted. This behavior is quite unlike that of the molecules of facial flesh, which (being mostly liquid) are not so geometrically constrained. The individual molecules of liquid can flow away from their nearest neighbors into any topological form that the cellular membranes and fascia will tolerate. Thus facial flesh behaves more like an array of billions of tiny water balloons. Hold a water balloon in one hand and a sphere of solid rubber in the other, and the difference in physics becomes quickly apparent. This physical dissimilarity causes elastomers to consume more energy in achieving facelike expressions, and causes them to fold and bunch very differently from facial tissues [Hanson, 2002]. Additionally, because facial tissues are composed mostly of fluid, much less solid material actually needs to be stretched, further reducing the force required to achieve expressions, relative to a solid elastomer. So, alternative materials are needed to simulate human tissues more effectively. The author’s solutions are described in section 2.2.5.

2.2.5 Human Emulation Robots of UTD

The limitations of technological media have impeded prior attempts to emulate human nonverbal expressions. The facial expression robots of the 20th century (both animatronics and the robots of Fumio Hara and his students) have used solid elastomers to simulate facial tissues, but the physics of a solid elastomer are very different from those of facial tissues. To achieve higher degrees of verisimilitude, new materials have been needed. In the spring of 2002, the author of this paper developed a urethane-based foamed elastomer that elongates 700%, yet compresses like a conventional sponge or foam rubber. Dubbed “F’rubber” (a contraction of “face” and “rubber”), this material exhibits physical characteristics that are much closer to those of human skin than a solid elastomer’s. The cells of the material are filled with air rather than liquid, which causes the volume of the simulated tissue to be variable, unlike the practically invariable volume of liquid-filled facial tissues. Nevertheless, these new materials fold, wrinkle, and bunch in ways that are highly naturalistic (see Figure 6), much more so than can be achieved with solid elastomer. In addition to improved verisimilitude in simulated facial expressions, this material decreases the force requirements by an order of magnitude, enabling lower-power, lower-cost expressive robots, and making them applicable to a wider range of art and science.

Figure 6 UTD human emulation robots with F’rubber. More recently, new F’rubber material has been created out of silicone, which is softer, can elongate 1050%, and is tolerant to a wider range of environmental conditions. Also, computer-numeric-controlled (CNC) deposition of thermoplastic elastomer (TPE) into a designed matrix has shown promising preliminary results,

exhibiting elongation up to 1250%. The pores in such a material need not be spherical; they can be shaped into complex manifolds for improved mechanical and expressive behavior. They may even contain closed cells filled with liquid for still more advanced expressive emulation of facial tissues. These new material approaches may help to satisfy people’s discriminating taste for verisimilitude. They may also make it possible to convey more relevant sociable perceptual patterns when deviating from verisimilitude. It is hoped that these tools may be used to scientifically prove that a natural bridge spans the uncanny valley, and prove that artificial bridges may span it as well.

HCI devices, the future of computing terminals

Outside the sheer value of the science, why bother with realism? Are there practical applications? This paper proposes that humans’ visual expertise helps us to receive the full bandwidth of paralinguistic semantics. If this is true, then machines that emulate the faces about which we are so expert will simply be more communicative. Also, people are highly attracted to the human image. Because people respond so predictably to the human face as the primary means of visual communication, it could be argued that the realistic human face and its subtle visual semantics would be the most natural paradigm for Human Computer Interface (HCI) devices, of which sociable robots would be a subset. We are so magnetically drawn to the face, be it in paint, print, sculpture, or film. Why wouldn’t this be true for robots as well? Long-standing biases against mechanized realistic faces, such as the Uncanny Valley theory, have prevented engineers from vigorously assailing the challenge of the face. However, the bias does not stand on science. Considering recent years’ progress in paralinguistics, the neuroscience of face recognition, and robotics, a scientific investigation of the Uncanny Valley is due.

3. Broaching a New Theory of the Uncanny

As discussed in section 2, the Uncanny Valley theory is not consistent with many of the phenomena anecdotally related to human reaction to anthropomorphic depictions. Proposals for more quantified explanation of verisimilitude discrimination are needed. As an alternative to the theory of the Uncanny Valley, this paper proposes the theory of Verisimilitude Discrimination Response (VDR), which holds that the Uncanny Valley effect arises from a distributed network of brain-systems that, in concert, function as an “emergency alarm”. This alarm system becomes acutely enabled by the detection of high-verisimilitude anthropomorphic stimuli, and rings with alarm if patterns that signal crisis are detected. But the alarm will also ring (provided it is enabled) if certain patterns that signal a healthy social presence are not detected. According to the VDR theory, human visual systems respond to visual stimuli containing increasing cues of anthropomorphic verisimilitude by increasing expectations for more advanced verisimilitude cues. If this concert of increased expectations is met, then preparation for social engagement is initiated. If the

demands are not met, a response of fear is triggered. The negative valence of this Verisimilitude Discrimination Response would be enacted by “emergency alarm” neural templates, biased to detect signs of deception, illness, mortal compromise, and strangers. In the model proposed here, these neural templates would passively detect verisimilitude cues; upon detection of all appropriate cues (and in the absence of other crisis-based percepts) the emergency alarm does nothing. The positive valence of the VDR would then be achieved with little effort, in a gliding transition from initial person-recognition into a state of preparation for social engagement. According to this model, the VDR could be a highly efficient gatekeeper for social engagement, as little extra energy would be expended unless signs of crisis present themselves, in the form of either the absence of unconsciously expected cues, or the presence of unexpected, inconsistent, or danger-signifier cues. As a survival-oriented defense system, the VDR would subsume social activities. It seems likely that the two co-evolved, and so are integrally correlated. If so, then defensive gateway protocols would partially determine paralinguistic social entry protocols, making VDR research generally relevant to investigations of social cognition. In this way, a neural confirmation/explanation of VDR systems will advance the general understanding of the neural basis of social cognition. The sections below propose specific hypotheses about various aspects of a possible VDR system, and propose experiments for validating or controverting them. First, section 4 considers whether a VDR effect exists at all. Section 5 contemplates the potential neural physiology of a VDR alarm system, including the prospective roles of the pSTS, the ACC, the amygdala, distributed representations, and especially, the FFA’s potentially prominent role. 
Finally, section 6 postulates the various dimensions of verisimilitude discrimination.

4. First Proof of Verisimilitude Discrimination Response

To formally investigate the Uncanny Valley, one must first check to see that the effect even exists, and assay a few of its basic parameters. Before scheduling time in an fMRI facility to conduct more advanced experiments (as proposed in sections 5 and 6), we may eke out some basic answers using low-tech psychometric experiments. Let us first define “verisimilitude”. This paper defines verisimilitude as the degree of biological accuracy in an anthropomorphic portrayal, along any of several dimensions, including static appearance, dynamic appearance, and interactive dynamics, each of which may be broken into subdimensions. Subdimensions of static verisimilitude can be specified to include standardized accuracy of form and color, biological plausibility, and mortal integrity. Of course, these dimensions are conjectural, and so warrant debate (section 6 of this paper ponders these dimensions further), yet they may still be useful for testing some basic hypotheses. Necessarily, however, these first experiments are incomplete. They focus on a minute number of possible influential dimensions of verisimilitude: the first experiment contrasts line drawings to grey-scale photos, with geometric distortions.
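The alarm model proposed in section 3 can be read as a simple gating rule: expectations grow with verisimilitude, and the alarm rings when a crisis cue is present or an expected cue is missing. The Python sketch below only illustrates that logic and is not a model of neural activity. The numeric verisimilitude score, the cue names, and the levels at which each cue starts to be expected are all hypothetical placeholders introduced here, and treating crisis cues as alarming at any level of verisimilitude is a simplifying assumption.

```python
from dataclasses import dataclass, field

# Hypothetical cues and the verisimilitude level (0.0 to 1.0) at which
# an observer is assumed to start expecting each of them.
EXPECTED_CUES = {"face_outline": 0.2, "eye_motion": 0.6, "skin_folding": 0.8}


@dataclass
class Percept:
    """Toy description of an anthropomorphic stimulus (all fields are assumptions)."""
    verisimilitude: float                             # 0.0 abstract, 1.0 fully realistic
    detected_cues: set = field(default_factory=set)   # healthy-presence cues detected
    crisis_cues: set = field(default_factory=set)     # e.g. {"pallor"}


def expected_cues(verisimilitude: float) -> set:
    """Expectations increase as verisimilitude increases."""
    return {cue for cue, level in EXPECTED_CUES.items() if verisimilitude >= level}


def vdr_response(p: Percept) -> str:
    """Return 'alarm' or 'engage' under the gating rule sketched above."""
    if p.crisis_cues:
        return "alarm"  # signs of crisis are detected
    missing = expected_cues(p.verisimilitude) - p.detected_cues
    return "alarm" if missing else "engage"
```

Under these assumptions, an abstract figure with few cues passes quietly, while a highly realistic figure that lacks one expected cue triggers the alarm, which is the asymmetry the VDR theory describes.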

To effectively evaluate a possible VDR, many variants of such psychometrics need to be administered to test groups.

4.1 Hypothesis 1

This paper hypothesizes that in response to increased static verisimilitude in visual percepts of humanlike faces, humans will react more negatively to certain deviations in facial appearance than to others, and will show no negative reaction to still others. If violation of some verisimilitude percepts elicits discomfort while violation of others does not, this will indicate a verisimilitude discrimination response that selectively increases for certain dimensions of verisimilitude. If all deviations elicit negative reaction, this may indicate that an Uncanny Valley exists. If no deviations elicit negative reaction, then verisimilitude discrimination may not heighten with the verisimilitude percepts targeted in this experiment, but may heighten in ways that will require more advanced tests to detect. This outcome may imply that humans are more highly tolerant to static distortions in high-verisimilitude facial depictions than has been expressed elsewhere. Whatever the outcome of these first few experiments, many other dimensions of verisimilitude (such as dynamic and interactive verisimilitude) remain to be tested.

Proposed Experiment 1

In a first experiment, test subjects will be shown seven triptychs of images of faces on a viewing screen, which will vary in two dimensions of static verisimilitude: geometric distortion, and photo-rendered detail (see figure 7). The first image of each triptych will be a grey-scale photographic depiction, the second will be a cartoonish line drawing traced directly from the photographic depiction so as to retain the essential geometry, and the third will be an intermediate abstract blend. The first photographic image in the first triptych (upper left image in figure 7) is the undistorted control image. All geometric distortions displayed will be within the range of biological plausibility, as a great range of anatomical facial configuration is biologically plausible. 
Yet individual humans are not especially expert at facial configurations that are not frequently encountered [Golby et al, 2001]; hence the geometric distortions in this experiment should be exotic enough to trigger an uncanny effect, if gross geometric distortion does so. It may be that other static factors are more important than gross geometric structures, such as the subtle arrangement of skin around the eyes. This is saved for future work. In the first distortion, the control is morphed toward the key set of features that has been shown to evoke nurturing responses in adults [Eibl-Eibesfeldt, 1972]: proportionally large eyes and forehead, a small jaw, and lips that pucker are elements of this “baby scheme”. The next distortion morphs opposite to this scheme. The third distortion subtly bends the face along the vertical axis. The fourth distortion changes only the symmetric sizing of the eyes, such that one gets smaller and the other larger. The fifth distortion renders the features asymmetrical and unusually

proportioned relative to the face edge. The final distortion shrinks the features in tandem, relative to the face edge.

Figure 7 Verisimilitude violations over two static dimensions. The top-left image is the control (original face image borrowed from [Haxby et al, 2001]). All rows below the top contain geometric distortions. The left column shows high-resolution photo-grade shading cues. The middle column represents extreme abstraction, and the right column is an intermediate abstraction. All images but the upper left are rendered by the author of this paper.

All figures are designed so as to be affectively neutral. This is particularly important for the more highly exotic faces, so as to avoid inadvertently evoking sympathy or fear by emotional expression instead of verisimilitude alone.

Method: ~20 subjects will be recruited and prescreened for psychiatric disorders. A short background questionnaire will be administered so as to divide the subjects randomly into four groups. One group will serve as control and three will serve as variable groups. The test will be administered to each individual in a closed booth, without direct human assistance. The images will be displayed on a flat screen at a viewing distance of ~1m from the test subject. Test subjects will be instructed to imagine that each person depicted in the images is a real person encountered on

the street, and to react to the questions as candidly as possible. Physiological reaction of test subjects will be gauged via galvanic skin response and heart rate, with this data acquisition timed closely with the slide show of images. All data will be acquired by LabVIEW, to facilitate subsequent analysis. Test subjects will also be videotaped with a time-coded tape, to better assess subject reaction to test stimuli, and to corroborate the other data. Test subjects will be shown an image for 5 seconds, and then prompted to answer a questionnaire for each image. The subject will have 30 seconds to answer the questions, during which time the image will remain on the screen. 10 seconds of blank-grey screen will separate each image. The control group will be shown the images randomly. Variable group 1 will be shown the line drawings first, in sequence. Variable group 2 will be shown the photograde images first, and variable group 3 will be shown the blended images first. All variable groups will view the remaining images in a random order.

Questionnaire: How unusual is this face? Rate it from 1 to 5. Does this face make you feel nervous? Rate from 1 to 5. Does this face make you feel comfortable? Rate from 1 to 5. Does this face look alive or dead? Is this face healthy or sickly? Do you find this face: Funny? Comforting? Frightening? Appealing? Lovable? (circle one). Do you want to talk with this person?

Analysis: If the hypothesis is correct, the data for the first 7 images returned by variable group 1 should be substantially different from that returned by variable group 2; the response should be particularly different for the first and second distorted images, which are designed to evoke nurturing feelings and disorientation, respectively. The issue is complex, and it will not be surprising if reversal effects occur, where the photo-rendered frames receive more sympathy than the cartoons. Complex results will be impetus for future work. 
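The presentation protocol of Experiment 1 can be sketched in a few lines of Python. The sketch assumes 21 images (seven triptychs of three renderings each); the row labels, group names, and seed are hypothetical names introduced here for illustration, and "in sequence" is interpreted as top-to-bottom row order.

```python
import random

# Seven triptych rows: the control plus six distortions (labels are illustrative).
ROWS = ["control", "baby_scheme", "anti_baby_scheme", "vertical_bend",
        "eye_size_asymmetry", "feature_asymmetry", "shrunken_features"]
STYLES = ["photo", "line", "blend"]

# Rendering style each group sees first; None means fully random (control group).
FIRST_STYLE = {"control": None, "variable_1": "line",
               "variable_2": "photo", "variable_3": "blend"}

IMAGE_ONLY_S, QUESTION_S, BLANK_S = 5, 30, 10  # seconds, as stated in the method


def presentation_order(group, seed=0):
    """Return the list of (row, style) images in the order a group would see them."""
    rng = random.Random(seed)
    images = [(row, style) for row in ROWS for style in STYLES]
    first = FIRST_STYLE[group]
    if first is None:
        rng.shuffle(images)
        return images
    lead = [(row, first) for row in ROWS]             # shown first, in row order
    rest = [img for img in images if img[1] != first]
    rng.shuffle(rest)                                 # remaining images in random order
    return lead + rest


def session_seconds(n_images):
    """Image time plus questionnaire time per image, with a blank screen between images."""
    return n_images * (IMAGE_ONLY_S + QUESTION_S) + (n_images - 1) * BLANK_S
```

With these timings, `session_seconds(21)` gives 935 seconds, so a single session would run a little under 16 minutes, excluding instructions and sensor setup.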
Humorous reactions may indicate social rejection, and an alternative outlet for fear. Galvanic skin responses immediately prior to humorous outbursts should be particularly noted. The last question on the questionnaire is intended to give a preliminary indication of whether the higher-resolution and/or the less distorted images trigger social preparations. The results of this question may be misleading, as sympathy may motivate the answer, instead of verisimilitude response.

Future work: It may be more effective to display the face renderings in a naturalistic backdrop, on a full body figure rendered with all natural details. People are so used to seeing distortions of the human figure, in cartoons and in popular media (such as Star Trek’s menagerie of special effects creatures), that reactions may be jaded, unless the unusual faces are visually put into the context of contemporary civilization.

If the results are interesting, it may be worthwhile to repeat the experiment using fMRI, to attempt to detect the neural activities underlying the effects.

4.2 Hypothesis 2

This section hypothesizes that a continuum of increasing static visual verisimilitude may be technologically attained, such that high-verisimilitude anthropomorphic representations need not fall into a valley of revulsion. If viewers have no “uncanny”, uncomfortable reaction, then the static dimensions of verisimilitude will be shown to be technologically achievable. If viewers of the continuum experience discomfort at the sight of intermediate verisimilitude depictions, then the experiment does not disprove the requisite plunge into the Uncanny Valley.

Proposed Experiment 2

In this experiment, each subject will be shown one of two sets of images of humanlike figures. The first set of images, the control set, will contain 20 frames that morph from a low-resolution, low-contrast abstraction of a human figure, up to a full-resolution photo-realistic version of the same image, in a manner that is perceptually common (similar to camera-focus, or the approach upon a figure from a distance). The second set will also morph in 20 images, but from a highly distorted cartoon, to the same high-resolution photo-realistic figure (see figure 8). If viewed in an animated sequence, the second set of images will function like an identity morph, which [Schultz et al, 2003] demonstrated causes neural alarm in test subjects. This alarm is likely due to logical inconsistency, rather than verisimilitude discrimination. No such logical inconsistency would be in play for the control set. To ensure that we exclude this identity morph corruption, and detect only possible verisimilitude response, the sequences will be shown in a predetermined, shuffled order that prevents an animation effect. This order will be identical for the control set and the variable set. This sequence will begin with the second highest verisimilitude image. The images will be designed to avoid sparking feelings of discomfort in the viewer. 
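One way to produce such a predetermined order is sketched below in Python. The paper does not specify how an "animation effect" is to be prevented; the rule used here, that no two consecutively shown frames are neighbours in the morph sequence, is an assumption, as are the seed value and the function name.

```python
import random

N_FRAMES = 20  # frames per morph, indexed 0 (most abstract) to 19 (most realistic)


def shuffled_order(n=N_FRAMES, seed=2003, max_tries=10_000):
    """Fixed pseudo-random order that starts at the second-highest verisimilitude
    frame and never shows two neighbouring morph frames back to back."""
    rng = random.Random(seed)       # fixed seed: the same order for both image sets
    first = n - 2                   # second-highest verisimilitude frame
    rest = [i for i in range(n) if i != first]
    for _ in range(max_tries):
        rng.shuffle(rest)
        order = [first] + rest
        # Assumed anti-animation rule: consecutive frames differ by more than one step.
        if all(abs(a - b) > 1 for a, b in zip(order, order[1:])):
            return order
    raise RuntimeError("no suitable order found; relax the adjacency rule")


ORDER = shuffled_order()  # reuse this single order for the control and variable sets
```

Because the generator is seeded, the same order can be regenerated for every subject and for both image sets, which keeps presentation order from confounding the comparison.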
Method: Around ten to twenty subjects will be recruited and prescreened for psychiatric disorders. A short background questionnaire will be administered so as to divide the subjects randomly into two groups: one control group and one variable group. The test will be administered to each individual in a closed booth, without direct human assistance. The images will be displayed on a flat screen at a viewing distance of ~1m from the test subject. Test subjects will be instructed to view the images and to rate how upsetting or appealing each image is, from 1 to 10, with 10 being most appealing and 5 being neutral. An array of push buttons will be mounted below the test screen. Physiological reaction of test subjects will be gauged via galvanic skin response and heart rate, with this data acquisition timed closely with the slide show of images. All data will be acquired by LabVIEW, to facilitate subsequent analysis. Test subjects will also be videotaped with a time-coded tape, to better assess subject reaction to test

stimuli, and to corroborate the other data. Test subjects will be shown an image for 3 seconds, and then be given 5 seconds to press the judgement button. Between images, the screen will display 10 seconds of featureless greytone.
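Because the physiological recordings must be timed closely with the slide show, it may help to lay out the trial timeline explicitly. The sketch below, in Python, uses the durations stated in the Method (3 seconds of image, 5 seconds for the rating, 10 seconds of grey between images); the function name and the event-tuple layout are hypothetical, and the onsets are nominal values rather than measured ones.

```python
IMAGE_S = 3      # image display time, in seconds
RESPONSE_S = 5   # time allowed for the rating button press
GREY_S = 10      # featureless grey screen between images

def build_schedule(order):
    """Return (events, total_seconds); each event is (onset_s, phase, frame)."""
    events = []
    t = 0
    for i, frame in enumerate(order):
        if i > 0:  # grey screen only between images, not before the first
            events.append((t, "grey", None))
            t += GREY_S
        events.append((t, "image", frame))
        t += IMAGE_S
        events.append((t, "response", frame))
        t += RESPONSE_S
    return events, t

# Example with 20 frames in plain ascending order (the real order is shuffled).
events, total = build_schedule(range(1, 21))
```

For 20 images this gives a nominal session of 350 seconds (20 × 8 seconds of image and response, plus 19 × 10 seconds of grey). The image onsets can then serve as markers when aligning the galvanic skin response and heart-rate traces.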

Figure 8 Verisimilitude Distortion Continuum, such as will be used in experiment 2. Jasmine character is courtesy of Disney Enterprises, 2003. Morph achieved with Morpheus software and considerable hand-painting by the author of this paper.

Analysis: If the hypothesis is correct, the data for the control group and the variable group should be statistically equivalent. Should they not be, then further refinement of the experimental methods, and subsequent readministration of the experiment, may be in order. If substantial differences exist between control and variable responses, this may support Mori’s Uncanny Valley model. If the differences are concentrated near the high-verisimilitude end of the continuum, this may support Mori’s conjecture that the Uncanny Valley drops most steeply in the last 1-5% of the approach to verisimilitude.

Future work: If the results are interesting, it may be worthwhile to repeat the experiment using fMRI, to attempt to detect the neural patterns related to the static VDR.

4.3 Hypothesis 3

This section hypothesizes that if static verisimilitude percepts outpace socially rich dynamic verisimilitude, a sense of alarm will be triggered in human observers. If not, then humans may not be as sensitive to combined static and social verisimilitude as expected. If this is the case, perhaps still more subtle evaluations will be called for.

Proposed Experiment 3

To test this hypothesis, three iterations of an SAT (social attribution task) will be employed, wherein all factors remain constant, except for the increasing static facial realism of the figures. To accommodate the three-dimensional qualities of verisimilitude, the SAT will be executed in 3-D rendering software. The first iteration of the SAT will be animated with completely abstract geometric figures operating in a generic boxlike space, true to the spirit of the classic SAT studies [Heider & Simmel, 1944; Castelli et al, 2000; Schultz et al, 2003]. The only significant modifications to these earlier SATs are the addition of rocking motion to connote walking and an extra geometric appendage to connote a head. Also, this appendage will turn just prior to the figure during direction changes, in order to convey a sense of intention to motions. This first iteration is the control. The second iteration of the SAT will replace each geometric head/appendage with a very abstract cartoon face. All other factors remain constant. In the third iteration, the cartoon face is replaced with a highly realistic face, with a neutral but attentive affective expression.

Method: ~20 subjects will be recruited and prescreened for psychiatric disorders. A short background questionnaire will be administered so as to divide the subjects randomly into three groups. One group will serve as control and two will serve as variable groups. The animation and 3-D modeling will be executed in Maya software, with muted-saturation colors and untextured Phong surfaces, except for the realistic texture on the third-iteration models.
The animation will be rendered in advance into a digital video file and played back for the experiment, with the timing of the video carefully orchestrated with the data acquisition. The test will be administered to each individual in a closed booth, without direct human assistance. The SAT animations will be displayed on a flat screen at a viewing distance of ~1m from the test subject. Test subjects will be trained in advance regarding the rules of the test, so as to answer questions in timely fashion. Physiological reaction of test subjects will be gauged via galvanic skin response and heart rate. All data will be collected by LabVIEW, to facilitate subsequent analysis. Test subjects will also be videotaped with a time-coded tape, to better assess subject reaction to test stimuli, and to corroborate the other data. Each subject will be shown three separate 20-second animations, and following each, the subject will be prompted to answer a questionnaire. The subject will have 30 seconds to answer the questions, during which time the images of the “characters” of the SAT will remain on the screen. 15 seconds of blank grey screen will precede each animation.
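The random division of subjects into one control group and two variable groups could be carried out as in the following Python sketch. The function name, the group labels, the subject identifiers, and the seed are hypothetical, and the round-robin dealing after a shuffle is simply one common way of keeping group sizes as equal as possible; the paper does not prescribe a particular procedure.

```python
import random

def assign_groups(subject_ids,
                  labels=("control", "variable_1", "variable_2"),
                  seed=0):
    """Shuffle the subjects, then deal them round-robin into the groups."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)  # seeded so the assignment is reproducible
    groups = {label: [] for label in labels}
    for i, sid in enumerate(ids):
        groups[labels[i % len(labels)]].append(sid)
    return groups

# Example with 20 hypothetical subject identifiers, S01 to S20.
groups = assign_groups([f"S{n:02d}" for n in range(1, 21)])
```

With about 20 subjects, this yields groups of 7, 7, and 6, which is worth bearing in mind when judging how much statistical power the between-group comparison can have.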

The control group will be shown only the geometric SAT animations. Variable group 1 will be shown only SAT animations with abstract character faces. Variable group 2 will be shown only SAT animations with realistic faces. All other factors will be constant among the groups.

Questionnaire:
Do you find these figures: funny? Frightening? Appealing? Lovable?
Do the figures look alive or dead?
Do the figures look healthy or sickly?
Do the figures make you nervous? Rate from 1 to 5.
Do the figures make you feel comfortable? Rate from 1 to 5.

Analysis: If the hypothesis is correct, variable group 2 will find the realistic faces repellent, and the data will be substantially different from that returned by variable group 1 or the control group. If this is the case, this will strongly indicate that a VDR neural security alarm exists, and that a combination of static and dynamic verisimilitude cues must be detected in tandem for a person to feel safe and comfortable in social situations. If the hypothesis is not verified, such that all groups respond similarly to the SAT, then the Uncanny Valley effect is much less daunting than has been previously assumed, and much more subtle tests may be required to detect the phenomenon.

Future work: If the experiment verifies the hypothesis, then a next task would be to reproduce the experiment with further iterations of the SAT, increasing the dimensions of dynamic verisimilitude until the animation pulls out of the valley and becomes appealing. It may be worthwhile to look at the information density of the dimensional factors that overcome the VDR, and relate these to paralinguistic social-entry protocols. It would also be worthwhile to conduct these future experiments using advanced imaging technology, to map the neural activities underlying the effects.
If the experiment does not verify the hypothesis, then further experiments will need to be conducted to confirm these results, and to look for more subtle manifestations of the VDR effect.

5. The Neurophysiology of the VDR

If they exist, VDR gatekeeper cues will be an information-rich, mixed-modal set of stimuli. For this reason, VDR neural templates can be expected to be both strongly localized and distributed, consistent with Haxby’s Object Form Topologies. As discussed in section 2, the FFA, the pSTS, and the amygdala seem likely candidates for handling substantial amounts of the VDR watchdog activity.

The FFA and a Distributed Verisimilitude Discrimination

The FFA’s involvement in the initiation and maintenance of social interactions, and its capabilities for subtle visual expertise, indicate that the FFA serves a critical role in the hypothetical VDR watchdog network. If the VDR effect exists, it must rely largely on memory and instincts that categorize faces as general objects, and that contrast a stimulus against this set of expert neural templates. These templates would need to cover the expressive folds, wrinkles and other idiosyncrasies of the human face, as well as the invariant features in which we are trained experts. This diversity of expertise implies that a diverse network of neural structures must coordinate to achieve the VDR.

According to the model proposed by [Adolphs, 2001], FFA invariant expertise in faces is acquired on the cellular level by encoding a dense multitude of discrete views and models of a given face. Many studies relate FFA activities to the opening of social engagement [Kesler-West et al, 2001; LaBar et al, 2003; Schultz et al, 2003]. The general-purpose expertise of the lateral fusiform gyrus, proposed by [Gauthier et al, 2001], also fits this model of the FFA as a principal VDR gatekeeper. Consider that the more expertly tuned we are to any class, the more we can detect discrepancies. The FFA’s relationship with the amygdala also suggests an effective position as a gatekeeper and watchdog for danger during social activities, mainly by remaining alert to deviations from expert expectations. If the FFA functions as a social gateway, as Schultz contends, it is possible that the FFA’s visual expertise and identity management function help us to receive and manage the full bandwidth of paralinguistic semantics. The basic division of low-resolution, motion-based face/gesture recognition, and high-resolution, static recognition [O’Toole et al 2002] offers a promising model for the shift from initial face recognition to prospective social engagement.
This transition could have been quite dangerous during our evolutionary history. A defense system against foreign social groups and predators might have been advantageous in compensating for such dangers. If this is the case, then as the FFA coordinates with the superior temporal sulcus and the amygdala during initial facial recognition, these systems would become much more active and attentive to surprising patterns in conjunction with anthropomorphic percepts. If all goes well, the distributed social system prepares for social engagement. However, if any surprising patterns or other danger signals are detected (such as the perception of a fearful facial affect), then the preparations for social engagement would be quickly terminated in exchange for a state of alarm. As discussed in section 2, existing literature on the neural basis of visual social cognition provides an effective framework for the verisimilitude discrimination response, such that only high-resolution facial recognition will trigger full expectations of face-to-face social exchange and simultaneously enable the full strength of the VDR “watchdog” pattern-detectors.

5.1 Hypotheses

The central hypothesis of this paper holds that the Uncanny Valley effect arises from a distributed network of brain-systems that act as an “emergency alarm”, which becomes acutely enabled by the presence of high-verisimilitude anthropomorphic stimuli, and rings with alarm if patterns that signal crisis are detected, but also (while the alarm is enabled) will ring if certain patterns that signal a healthy social presence are not detected. In the face of verisimilitude violations, this distributed network will especially activate the FFA, pSTS and the amygdala. If verisimilitude factors are satisfactory, then the alarm does nothing, and the pSTS and medial prefrontal/cingulate sulcus and other social and language systems will prepare for activity.

If this hypothesis is confirmed, then this VDR defense offers a preliminary map of visual social entry protocols, to complement the abundance of research into behavioral paralinguistics. This will enable further research into deciphering the dimensions and parameters of the VDR, quantifying the phenomenon underlying the legend of the Uncanny Valley. If the hypothesis is controverted, this may lay to rest the theory of the Uncanny Valley as a strong neural phenomenon. The phenomenon may exist, not as a neurally based, instinctive defense mechanism, but rather as an effect of technological deficiencies in robotic hardware and artistic craft.

5.2 Proposed Experiments

I propose to repeat the SAT verisimilitude experiments described in section 4 of this paper, this time using an fMRI apparatus to measure the neural activity underlying this verisimilitude discrimination task. Of course, this will only be appropriate if the psychometric and GSR results confirm the VDR in the earlier experiments proposed in section 4. The experiments will target fMRI activity in several regions of interest (ROI), including the FFA, the pSTS, the ACC, the amygdala, the medial prefrontal/cingulate sulcus, and distributed representations throughout the ROI, and also in the IT and STS at large. Just as in 4.3, three iterations of an SAT (social attribution task) will be employed, wherein all factors remain constant, except for the increasing static facial realism of the figures. The test may be modified and improved over the one proposed in 4.3, based on the results obtained there.

Methods: ~20 subjects will be recruited and prescreened for psychiatric and neurological disorders. A short background questionnaire will be administered so as to divide the subjects randomly into three groups. One group will serve as control and two will serve as variable groups. The test subjects will be administered practice tests so as to perform more effectively within the confines of the fMRI device. The practice tests will contain none of the content of the actual tests, but will be stylistically the same, and involve similar physical constraints. Immediately following the subject’s emergence from the fMRI, they will be administered a follow-up questionnaire, which will be the same as described in 4.3.

Analysis: If the hypothesis is correct, variable group 2 will find the realistic faces repellent, and the fMRI data will return patterns that indicate alarm, namely relatively heightened activity in the amygdala, pSTS, and FG, matching that of a visual perception of fearful affect. Variable group 1 and the control group will show signs of social preparations. If the hypothesis is not verified, such that all groups show similar neural activity, then the Uncanny Valley effect may not have a strong, distributed neural basis. This does leave open the possibility that the effect has a more subtle neural basis, one that does not activate substantially different distributed representations or produce large-scale differences in activity in the ROIs.

Future work: If the experiment verifies the hypothesis, then future experiments may increasingly quantify the complex dimensions of dynamic verisimilitude and their neural correlates.

6. VDR Triggers and Factors: Exploring the Dimensions of Verisimilitude

Many researchers have argued that the Uncanny Valley is too dimensionally dense to conquer, and so is not worth assailing. Although this paper holds that this set is not as dense as has been previously assumed, if the Valley is so enormously rich, then therein waits all the more bounty of worthy discoveries. With these motives in mind, this section considers possible VDR triggers, and proposes experiments to strain them out.

Watchdog triggers may be divided into at least two categories: cues that are simply unexpected (violating expertise), and cues that signal crisis or disease. The two may be discriminated at higher levels of cognition, while both may initially trigger the VDR watchdog equivalently. Exactly what cues may trigger each activity? What minimal cues must be met to avoid the “fear” reaction of the “uncanny”? Answering such questions may have great practical relevance for animation designers, roboticists, and designers of naturalistic Human-Computer Interfaces in general, as well as providing deeper understanding of social cognition.
If this paper’s conjectures are correct, the minimum required cues for verisimilitude may be extremely low (such as can be achieved by “faking it”: lots of smiling and nodding, such as a person may affect in the midst of people speaking a foreign language). The maximum cues clearly become extremely complex, dependent upon how deep the naturalistic exchange becomes. Consider an analogy to chess, wherein the rules are simple enough, and a game may be played in a simple way or in an enormously deep way. It is possible that, following a first stage of anthropomorphic recognition (presumably coordinated across several cortical regions), the distributed social system then expects more advanced, expert anthropomorphic cues. This second-stage gatekeeping may look for certain features of “realism” of facial image (presumably both static and dynamic), and the fixed and moving expressions associated with naturalistic social exchange. Following entry into social engagement, this activity likely then transitions to the complex, context-dependent branching expectations that are common to social exchange.

In addition to the “password” verisimilitude cues that the VDR would require to open a social exchange, the VDR would be waiting, like a passive trap, for certain signals interpretable as danger. These would include alien motion or static appearance, fearful expressions, cues of the compromise of physical identity (in evolutionary history this would signify mortal compromise), and perhaps some hardwired cue patterns that signal crisis or disease.

6.1 Possible Future Inquiry

In addition to forming a matrix of possible VDR dimensional variables (a first sketch is shown in figure 9), it will be important to track the sequence of activity. Does the pSTS or the FFA respond to facial verisimilitude violations first or second? What effect do static and dynamic facial expressions have on firing patterns? Do the genders react differently to different “uncanny” imagery? Human emulation robots that have controlled verisimilitude variables may be extremely useful tools for this VDR exploration process, as might be the fine aesthetic discrimination of classical figurative artists.
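As a bookkeeping aid for such a matrix, the cell labels used in figure 9 (1.A through 10.E) can be enumerated programmatically. The Python sketch below copies the row and column names from the figure; the dictionary layout and variable names are hypothetical, and the sketch carries no claim about which cells would actually be tested.

```python
# Columns of figure 9: degree of realism of the depicted face.
REALISM = {
    "A": "Cartoon",
    "B": "Mid-Valley",
    "C": "Almost Real",
    "D": "Real",
    "E": "Sickly",
}

# Rows of figure 9: static or dynamic presentation of an expression.
EXPRESSION = {
    1: "Static Neutral",
    2: "Static Joy",
    3: "Static Sorrow",
    4: "Static Anger",
    5: "Static Fear",
    6: "Dynamic Neutral",
    7: "Dynamic Smile",
    8: "Dynamic Sorrow",
    9: "Dynamic Anger",
    10: "Dynamic Fear",
}

# Map each cell label, e.g. "3.B", to its (expression, realism) pair.
CONDITIONS = {
    f"{row}.{col}": (EXPRESSION[row], REALISM[col])
    for row in EXPRESSION
    for col in REALISM
}
```

Even this first sketch contains 50 cells, which gives a sense of how quickly the stimulus set grows as further dimensions are added.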

Variations/Combinations   A. Cartoon   B. Mid-Valley   C. Almost Real   D. Real   E. Sickly
1. Static Neutral         1.A          1.B             1.C              1.D       1.E
2. Static Joy             2.A          2.B             2.C              2.D       2.E
3. Static Sorrow          3.A          3.B             3.C              3.D       3.E
4. Static Anger           4.A          4.B             4.C              4.D       4.E
5. Static Fear            5.A          5.B             5.C              5.D       5.E
6. Dynamic Neutral        6.A          6.B             6.C              6.D       6.E
7. Dynamic Smile          7.A          7.B             7.C              7.D       7.E
8. Dynamic Sorrow         8.A          8.B             8.C              8.D       8.E
9. Dynamic Anger          9.A          9.B             9.C              9.D       9.E
10. Dynamic Fear          10.A         10.B            10.C             10.D      10.E

Figure 9 A matrix of possible VDR variables.

7. Concluding remarks

With the VDR that we have hypothesized here, a more apt metaphor than the valley may be a juggling act. As realism increases, many subtle dimensions of visual appearance become increasingly important to the viewer, and must be managed, like extra balls tossed in the air. Likewise, unrealistic artifacts become increasingly noticeable at higher levels of realism, and so must be removed from the simulation. At the present time, it seems as though these artifacts are more technological than biological, but biologically, such artifacts do occur (e.g. a twitch or a palsy). The biological relationship between unexpected noise in motion and disease or crisis may mean that humans have a genetically heritable fear that is triggered by unexpected classes of motions. Thus, the VDR defense keeps guard.

The efficiency of multiplicity (and Occam’s razor) implies that many of these expert defensive social systems may also be used for communication. If so, then investigating the theory of VDR will provide insights into the neural basis of elementary greeting protocols, and represent steps towards a formal understanding of the dynamics of neuro-paralinguistics. It is hoped that this line of inquiry will forge some hard science about the Uncanny Valley, either proving the theory or dispelling it as myth. It is also hoped that realistic sculpted robots will prove useful as tools in investigating social cognition, and will help to channel the science of social cognition into innovative and useful HCI devices, and profound works of art that finally hear Michelangelo’s command (“BREATHE!”).

8. Acknowledgements

Special thanks are extended to Alice O’Toole, Tom Linehan, Jochen Triesch, Yoseph Bar-Cohen, Kristen Nelson, Elaine Hanson, and Dan Ferber.

9. References

Adolphs, R., Tranel, D., & Damasio, A. R. (1998). The Human Amygdala in Social Judgment. Nature, 393, 470-474.
Adolphs, R. (2001). The neurobiology of social cognition. Current Opinion in Neurobiology, 11, 231-239.
Bar-Cohen, Y., Breazeal, C. Biologically Inspired Intelligent Robotics, SPIE Press, 2003.
Bar-Cohen, Y., (Ed.), Proceedings of the SPIE’s Electroactive Polymer Actuators and Devices Conf., 6th Smart Structures and Materials Symposium, SPIE Proc. Vol. 3669 (1999), pp. 1-414.
Bar-Cohen, Y., (Ed.), Proceedings of the SPIE’s Electroactive Polymer Actuators and Devices Conf., 7th Smart Structures and Materials Symposium, SPIE Proc. Vol. 3987 (2000), pp. 1-360.
Breazeal, C., Designing Sociable Robots, MIT Press (2002).
Bryant, D. “Why are monster-movie zombies so horrifying and talking animals so fascinating?” http://www.arclight.net/~pdb/glimpses/valley.html, 2003.
Caporael, L. R. Anthropomorphism and mechanomorphism: Two faces of the human machine. Computers in Human Behavior, pp. 185-211, 1990.
Cassell, J. “Embodied Conversational Agents,” AI Magazine, Volume 22 No. 4, winter 2001.
Castelli, F., Happe, F., Frith, U. & Frith, C., Movement and mind: a functional imaging study of perception and interpretation of complex intentional movement patterns. NeuroImage 12, 314–325, 2000.
Crosson, B., Sadek, J. R., Bobholz, J. A., Gökçay, D., Mohr, C. M., Leonard, C. M., Maron, L., Auerbach, E. J., Browd, S. R., Freeman, A. J., Briggs, R.W., Activity in the Paracingulate and Cingulate Sulci during Word Generation: An fMRI Study of Functional Anatomy. Cerebral Cortex, Vol. 9, No. 4, 307-316, June 1999.
Dailey, M.N., Cottrell, G.W., “Organization of face and object recognition in modular neural network models”, Neural Networks 12, 1999.
Damasio, A.R., Descartes' Error. Grosset Putnam, New York, NY (1994).
Damasio, A.R., The Feeling of What Happens: Body and Emotion in the Making of Consciousness, Harcourt Brace, New York, 1999, 2000.
Darwin, C., Ekman, P. (Ed.), The Expression of the Emotions in Man and Animals, Oxford University Press, New York (1998/1872).
Dickinson, M.H., Farley, C.T., Full, R.J., Koehl, M. A. R., Kram, R., and Lehman, S., “How animals move: An integrative view,” Science 288, (2000), pp. 100-106.
Duffy, B.R., Anthropomorphism and The Social Robot, www.medientagemuenchen.de/archiv/pdf_2002/Duffy_12.2.pdf
Ekman and Friesen, Basic Emotions (Ekman & Friesen, 1971).
Ekman, P., “The argument and evidence about universals in facial expressions of emotion,” in Wagner, H., Manstead, A., (Eds), Handbook of psychophysiology, John Wiley, London, 1989.
Ferber, D., “The Man who Mistook his Girlfriend for a Robot”, Popular Science, Sept. 2003.
Fong, T., Nourbakhsh, I., Dautenhahn, K. “A survey of socially interactive robots”, Robotics and Autonomous Systems 42 (2003) 143–166.
Freud, “The Uncanny”, 1919. Reprinted in The Standard Edition of the Complete Psychological Works of Sigmund Freud, ed. & trs. James Strachey, vol. XVII (London: Hogarth, 1953), pp. 219-252.
Gauthier, I., Skudlarski, P., Gore, J.C., & Anderson, A.W. (2000). Expertise for cars and birds recruits brain areas involved in face recognition. Nature Neuroscience, 3(2): 191-197.
Gauthier, I., Williams, P., Tarr, M. J., & Tanaka, J. (1998). Training "Greeble" experts: A framework for studying expert object recognition processes. Vision Research, Special issue on "Models of Recognition", 38: 2401-2428.
Golby, A. J., Gabrieli, J. D. E., Chiao, J. Y., & Eberhardt, J. L. (2001). Differential responses in the fusiform region to same-race and other-race faces. Nature Neuroscience, 4, 845-850.
Hanson, D., Rus, D., Canvin, S., Schmierer, G., Biologically Inspired Robotic Applications, in Biologically Inspired Intelligent Robotics, SPIE Press, 2003.
Hanson, D., “Identity Emulation Facial Expression Robots”, proceedings of American Association for Artificial Intelligence Conference, August, 2002.
Hanson, D. and Pioggia, G., “Entertainment Applications for Electrically Actuated Polymer Actuators,” in Electrically Actuated Polymer Actuators as Artificial Muscles, SPIE PRESS, International Society of Optical Engineers, Washington, USA, Vol. PM98, Ch. 18, March 2001.
Hanson, D., Pioggia, G., Bar-Cohen, Y., De Rossi, D., “Androids: application of EAP as artificial muscles to entertainment industry,” Proc. SPIE’s Electroactive Polymer Actuators and Devices Conf., 7th Smart Structures and Materials Symposium, Newport Beach, USA, 2001.
Hara, F., Kobayashi, H., Iida, F., Tabata, M. Personality characterization of animate face robot through interactive communication with Human. 1st Int’l W/S in Humanoid and Human Friendly Robots, pp. 1-10, 1998.
Haxby, J. V., Gobbini, M.I., Furey, M.L., Ishai, A., Schouten, J.L., Pietrini, P. (2001). Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293, 2425-30.
Heider, F. & Simmel, M. 1944. An experimental study of apparent behavior. Am. J. Psychol. 57, 243–259.
Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302-4311.
Kesler-West, M.L., Andersen, A.H., Smith, C.D., Avison, M.J., Davis, C.E., Kryscio, R.J., Blonder, L.X. (2001). Neural substrates of facial emotion processing using fMRI. Brain Res Cogn Brain Research, 11, 213-26.
Kurzweil, Ray, The Age of Spiritual Machines, Viking Press, 1999.
Lamb, S. M., Pathways of the Brain, The Neurocognitive Basis of Language, Amsterdam & Philadelphia: John Benjamins Publishing Co., (1999).
Levenson, R., Ekman, P., and Friesen, W. “Voluntary facial action generates emotion-specific autonomic nervous system activity.” Psychophysiology, 27(4): 363-383, 1990.
Liu, J., Higuchi, C. M., Marantz, A., Kanwisher, N., “The Selectivity of the Occipitotemporal M170 for Faces”, Neuroreport. 2000 Feb 7;11(2):337-41.
Masschelein, Anneleen, “A Homeless Concept, Shapes of the Uncanny in Twentieth-Century Theory and Culture”, Image & Narrative, Issue 5, January, 2003.
Menzel, P., D’Aluisio, F. Robo sapiens: Evolution of a New Species, Boston, MIT Press, (2000).
Ochsner, K. N., & Lieberman, M. D. (2001). The emergence of social cognitive neuroscience. American Psychologist, 56, 717-734.
O’Toole, A.J., Roark, D. A., Abdi, H., “Recognizing moving faces: A psychological and neural synthesis”, The University of Texas at Dallas, March 13, 2002.
Pizzagalli, D., Koenig, T., Regard, M., Lehmann, D. (1998). Faces and emotions: brain electric field sources during covert emotional processing. Neuropsychologia, 36, 323-32.
Puce, A., Allison, T., Gore, J. C. & McCarthy, G. 1995. Face-sensitive regions in human extrastriate cortex studied by functional MRI. J. Neurophysiol. 74, 1192–1199.
Riesenhuber, M., Poggio, T., “Models of object recognition”, Nature America, 2000.
Reichardt, Jasia, Robots: Fact, Fiction, and Prediction, Harmondsworth, Middlesex, England: Penguin Books, Ltd. 1978.
Reiman, E.M., Lane, R.D., Ahern, G.L., Schwartz, G.E., Davidson, R.J., Friston, K.J., Hart, A.J., Whalen, P.J., Shin, L.M., McInerney, S.C., Fischer, H., & Rauch, S.L. (2000). Differential response in the human amygdala to racial outgroup vs ingroup face stimuli. Neuroreport, 11, 2351-2355.
Shelley, M.W., Frankenstein, 1831.
Spiridon, M. and Kanwisher, N. “How Distributed Is Visual Category Information in Human Occipito-Temporal Cortex? An fMRI Study”, Neuron, Vol. 35, 1157–1165, September 12, 2002.
Swiercinsky, D.P., “Brain Map Visual System”, 2000.
Tarr, M. J., & Gauthier, I. (2000). FFA: a flexible fusiform area for subordinate-level visual processing automatized by expertise. Nature Neuroscience, 3, 764-9.
Thomas, F., Johnston, O., The Illusion of Life: Disney Animation, Hyperion; ISBN: 0786860707; Revised edition, 1995.
Tomkins, S. S., “Determinants of Affects,” AIG-I, Ch. 8, 248-258. New York: Springer, 1962.
Tomlinson, B. 2000. "Dead Technology." Style Vol. 33 No. 2, p. 316-335.
Tong, F., Nakayama, K., Moscovitch, M., Weinrib, O., Kanwisher, N. Response Properties of the Human Fusiform Face Area, Cognitive Neuropsychology, 2000, 17 (1/2/3), 257–279.
Tzourio-Mazoyer, N., De Schonen, S., Crivello, F., Reutter, B., Aujard, Y., Mazoyer, B. “Neural Correlates of Woman Face Processing by 2-Month-Old Infants”, NeuroImage 15, 454–461 (2002).
Weschler, L., “Why is This Man Smiling? Digital animators are closing in on the complex systems that make a face come alive.” Wired Magazine, Issue 10.06 - Jun 2002.
