
Cogn Process (2005) 6: 25–36 DOI 10.1007/s10339-004-0039-6

REVIEW

Pietro Morasso · Alessandra Bottaro · Maura Casadio · Vittorio Sanguineti

Preflexes and internal models in biomimetic robot systems

Received: 5 October 2004 / Revised: 15 November 2004 / Accepted: 24 November 2004 / Published online: 14 January 2005
© Marta Olivetti Belardinelli and Springer-Verlag 2005

Abstract The next generation of neuroprostheses, which aim to restore the natural movement of paralysed body parts or to support natural interaction with external devices, will closely resemble biomimetic robot systems, which attempt to duplicate the organization of the biological motor control system. In this paper, we review some of the organizing principles that have emerged in the last few years and might provide useful guidelines for biomimetic design.

Keywords Preflexes · Biomimetic systems · Motor control · Motor learning · Internal models

Communicated by Irene Ruspantini and Niels Birbaumer

P. Morasso (✉) · A. Bottaro · M. Casadio · V. Sanguineti, University of Genova, DIST, Via Opera Pia 13, 16145 Genova, Italy. E-mail: [email protected] Fax: +39-10-3532154

Introduction

Biomimetic robot systems are thought to be more flexible and effective in non-conventional applications and as haptic interfaces in man–machine interactions. In this paper, we give an overview of some of the experimental evidence on the organization of biological motor control, seen from the computational point of view, which may be fruitful in the design of biomimetic systems. Ultimately, a biomimetic haptic interface, which is also a bidirectional proprioceptive channel, might be integrated with other bidirectional links between humans and artificial devices, whether external or implanted.

In the execution of movements, the motor control system of biological organisms must comply with two types of forces, internal and external, which typically have highly non-linear characteristics. In particular, for the generic ‘‘kinematic chain’’, the vector of motor torques $\tau_m(t)$, which must be delivered at each joint for a desired motion $q(t)$, is given by the following set of equations:

$$\tau_m = I(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + G(q) + J^T(q)\,F_{ext} \qquad (1)$$

The ﬁrst three terms on the right-hand side of the equation express the ‘‘internal forces’’ determined by the distribution of masses of the kinematic chain and the fourth term is related to the ‘‘external force’’ applied to the terminal segment of the kinematic chain as a consequence of a contact with the environment: for example, in the case of standing/walking, the external force is related to the contact with the support surface and is measured by a force platform; in the case of the upper arm, the contact may be mediated by a manipulandum and measured by a wrist sensor. From the point of view of the motor controller, such forces imply complexity which needs to be tamed, because they: 1. May induce instability, as in the case of upright standing. 2. May induce deformation of planned trajectories. In both cases, the ﬁrst line of defence in attempting to minimize such eﬀects is provided by the mechanical properties of the body and the actuators. In insect polypedal locomotion, for example, it has become clear that a fundamental role is played by appropriate ‘‘softness’’—springs and dampers—added to the joints which tend to stabilize the body in an intrinsic fashion and thus greatly simplify control. In mammals there is much more ﬂexibility because the functional parameters of the joint impedance (in particular stiﬀness and viscosity) can be modulated by means of the co-activation of antagonistic groups of muscles and the appropriate presetting of segmental mechanisms. The term preﬂex was ﬁrst coined by Brown and Loeb (1997) to mean the response of the mechanical system due to the intrinsic, non-linear properties of the musculoskeletal system functioning at a joint. In some case, a well-matched preﬂex may be suﬃcient to solve a motor control problem. For example, Wagner and Blickhan (1999) analysed the vertical oscillations of the center of


mass of a human bending his legs and found that the intrinsic mechanical properties of the musculature can stabilize the oscillatory movement without reflexive changes in activation; Wagner and Blickhan (2003) investigated how minor perturbations to the stable patterns of walking or running can be recovered smoothly without disrupting the cycle, and found that pairs of antagonistic muscles are able to stabilize the movements without neuronal feedback. In general, in order to guarantee the self-stabilizing ability of the muscle–skeleton system, muscle properties such as the force–length relationship, the force–velocity relationship and the muscle geometry must be tuned to the geometric properties of the linkage system (Gerritsen et al. 1998).

In other cases, however, although it contributes substantially to the dynamic stability of the musculoskeletal system, stiffness control is insufficient to solve the problem, or is inappropriate because it is energetically inefficient, and thus it must be supplemented by actively produced control patterns. As sketched in Fig. 1, the active control actions can be derived from either feedforward or feedback control modules. In the preflex concept we should also include, at least at the high end of the evolutionary scale, the role of spindle primary afferents, which are sensitive to velocity as well as length and are finely tuned by fusimotor output (Proske and Gregory 2002; Ellaway et al. 2002; Jaax and Hannaford 2002), although some important issues for muscle receptors remain unresolved.

Feedforward control (Fig. 2a) generates the motor commands on the basis of an inverse model of the body dynamics, i.e. a neural network which has been trained to predict the force patterns (net joint torques) that are consistent with Eq. 1, given a planned movement, and thus logically inverts the causal relationship between force and movement. The pros and cons of this control scheme can be summarized as follows: (1) it is fast and its performance does not depend on propagation delays; (2) it is robust from the point of view of stability; and (3) on the other hand, it is computationally expensive, as regards both learning and real-time control.

Fig. 1 Block diagram of the motor control system

Fig. 2 a Feedforward controller; b feedback controller; c anticipative feedback controller

Continuous feedback control generates the motor commands on the basis of a comparison between the desired and the actual movement, which is continuously fed back to the muscles. In the basic implementation of this control scheme (see Fig. 2b), the error signal is processed in a very simple way, for example by adding up three contributions which are proportional, respectively, to (1) the error signal itself; (2) its integral over time; and (3) its time derivative (PID controller). Simplicity is the main positive feature of this scheme, together with the fact that no lengthy learning process is needed. However, its stability is quite sensitive to propagation delays: the Nyquist criterion, which strictly applies only to linear systems (but most non-linear systems can be linearized in the neighbourhood of a working point), states that instability occurs if the loop gain is greater than one for the sinusoidal component which is phase-delayed by 180°. So there is a conflict between precision, which requires high values of the controller gain, and stability, which calls for a lower gain in order to satisfy the Nyquist criterion. A propagation delay $T$ is very dangerous for feedback control systems, because it implies a phase delay $\varphi$ which grows with frequency ($\varphi = \omega T$): for example, in biological systems, a typical value of the frequency band of a movement is 5 Hz and the delay of segmental reflexes is of the order of 100 ms, thus inducing a phase delay of 180° which would make the feedback control of the movement unstable.

However, there are ways to recover the stability of feedback control in the presence of sizable propagation delays: a straightforward approach is based on the introduction, in the feedback loop, of a ‘‘state observer’’ (or Kalman filter) which reconstructs the present state by mixing the delayed state information coming from the sensory system with motor command patterns. In neurophysiological terms this is an example of corollary discharge, which fuses re-afference with an efference copy (von Holst and Mittelstaedt 1950). The observer is a complex internal computational module which has predictive properties and plays the role of a direct model, i.e. a neural network which is trained to predict the movement which is causally related to a given set of forces, also taking into account the initial state of the system. We may call this modification of the basic feedback control scheme a predictive feedback controller. Robustness and stability are obtained at the expense of computational complexity and a suitable learning process. However, predictive feedback control is not useful for external perturbations or for noise distal to the source of the efference copy (e.g. in motoneuron recruitment or muscle fatigue).

The three main control paradigms outlined above are combined in different ways in specific sensorimotor tasks. In the following, we shall analyse in detail a few representative examples.
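As a numerical aside on the delay argument made earlier in this section, the phase lag of a pure propagation delay can be checked with a few lines of Python. This is only a minimal sketch: the helper names are hypothetical, and the sample values (5 Hz, 100 ms) are the figures quoted in the text. It evaluates $\varphi = \omega T = 2\pi f T$ and the frequency at which the delay alone accounts for a 180° lag.

```python
import math


def phase_delay_deg(freq_hz: float, delay_s: float) -> float:
    """Phase lag (in degrees) introduced by a pure delay at a given frequency."""
    omega = 2.0 * math.pi * freq_hz          # angular frequency, rad/s
    return math.degrees(omega * delay_s)     # phi = omega * T


def critical_frequency_hz(delay_s: float) -> float:
    """Frequency at which a pure delay alone produces a 180 degree lag."""
    return 1.0 / (2.0 * delay_s)


if __name__ == "__main__":
    # Values quoted in the text: 5 Hz movement band, 100 ms reflex delay.
    print(phase_delay_deg(5.0, 0.1))
    print(critical_frequency_hz(0.1))
```

With a 100 ms delay the 180° lag is reached at 5 Hz, so under the Nyquist criterion the loop gain at that frequency would have to stay below one.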

Optimal preflex modulation + anticipatory feedback control in the stabilization of upright standing

The upright standing posture (Fig. 3) is under the influence of two counteracting actions, where $\vartheta$ is the sway angle:

– The destabilizing torque due to gravity: $\tau_g = mgh\,\vartheta$;
– The restoring torque due to the ankle muscles, with particular regard to the ankle stiffness (including segmental reflexes): $\tau_a = K_a\,\vartheta$.

Of course, this is a simplification which is justified by the small size of sway movements (this allows linearization) and by the fact that viscosity can be ignored as a first approximation, because it cannot stabilize the unstable ‘‘inverted pendulum’’ but only dampen oscillations, if stability is achieved by other means. If it is assumed that stabilization is achieved by a pure stiffness control mechanism (Winter et al. 1998), then the two equations above define a critical value of ankle stiffness:

$$K_c = mgh \qquad (2)$$

Fig. 3 Standing posture. COP Center of pressure; COM center of mass
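The critical stiffness of Eq. 2 is easy to evaluate for a given subject. The sketch below is illustrative only: the function names are hypothetical, and the body mass (70 kg) and COM height (1.0 m) are assumed sample values, not data from the paper. It computes $K_c = mgh$ and checks whether a given ankle stiffness would be enough to stabilize the linearized inverted pendulum on its own.

```python
G = 9.81  # gravitational acceleration, m/s^2


def critical_stiffness(mass_kg: float, com_height_m: float, g: float = G) -> float:
    """Critical ankle stiffness K_c = m * g * h of Eq. 2, in N m/rad."""
    return mass_kg * g * com_height_m


def stiffness_alone_stabilizes(k_ankle: float, mass_kg: float,
                               com_height_m: float, g: float = G) -> bool:
    """True if the restoring torque K_a * theta exceeds the gravity torque
    m * g * h * theta for any small sway angle theta."""
    return k_ankle > critical_stiffness(mass_kg, com_height_m, g)


if __name__ == "__main__":
    m, h = 70.0, 1.0                      # assumed sample subject
    k_c = critical_stiffness(m, h)
    print(k_c)
    # A hypothetical ankle stiffness equal to 80% of the critical value:
    print(stiffness_alone_stabilizes(0.8 * k_c, m, h))
```

Any stiffness below $K_c$ leaves a net destabilizing torque, which is why an under-critical preflex has to be supplemented by active control.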

Figure 4 shows a typical sway pattern in the anteroposterior plane, which is characterized by the oscillation of the center of pressure (COP: u in Fig. 3) and the center of mass (COM: y in Fig. 3). It also shows the very slow component of the oscillation (the reference position), which is the only one under explicit conscious control.

Fig. 4 Postural sway patterns in the anteroposterior plane. COP Center of pressure; COM center of mass; reference very slow component of the oscillation which is the only one under explicit conscious control

Direct measurements of intrinsic ankle stiffness during standing can be obtained by applying small, random rotational perturbations: by fitting the stimulus-response pattern with a linear second-order model, it is possible to estimate the stiffness and viscous parameters. The problem is that estimates of this kind depend on the size of the perturbation, due to the non-linearity of the muscle properties; thus it is important to use test perturbations with a size range which matches natural perturbations. In the case of standing, the range is between 0.05° and 1°, and in this range it has been found that the intrinsic ankle stiffness varies between 90% (Loram and Lakie 2002b) and 65% (Casadio et al. 2004) of the critical value. Therefore, the stabilizing action of the intrinsic muscle properties (i.e. the ankle preflex) must be supplemented by suitable active control patterns. Peterka (2000) suggested that stabilization is achieved by a simple continuous feedback control action quite similar to a linear PID controller. However, this is not plausible for three main reasons:

1. The model underestimates the influence of the loop delay as regards stability
2. The model does not take into account that the measurement of the sway angle is likely to have a quite low resolution, because the sensors (muscle spindles and articular receptors) are activated near or below threshold
3. If the PID controller hypothesis were true, it should guarantee asymptotic stability, but this is not consistent with the measured sway oscillations, unless we assume a very large level of noise.

The nature of the active control which complements the ankle preflex can be evaluated by subtracting from the total torque, which is given by the following equation (see Fig. 3)

$$\tau_{tot} = mg\,u, \qquad (3)$$

the tonic torque related to the very slowly varying reference angular position (about 1° in normal subjects)

$$\tau_{tonic} = mgh\,\vartheta_{ref} \qquad (4)$$

and the elastic, stiffness-related torque

$$\tau_{elast} = K_a\,(\vartheta - \vartheta_{ref}) \qquad (5)$$

Figure 5 shows a typical decomposition obtained in this manner. In the paper by Jacono et al. (2004), it has been estimated that the total torque is distributed as follows among the different torque elements:

1. Tonic command: 69%
2. Elastic torque: 19%
3. Phasic command: 12%.

Fig. 5 Decomposition of the total ankle torque into (1) a tonic slowly varying torque; (2) an elastic torque (proportional to ankle stiffness); (3) an active torque

It must be emphasized that the phasic command, although the smallest in size, is the critical element for achieving stability. Figure 5 also shows that the phasic command is characterized by a regular sequence of peaks with an average inter-peak time of 0.6–0.7 s. This time should be compared with the simplified biomechanical model of the standing body as an inverted pendulum (Morasso and Schieppati 1999), which is given by the following equation:

$$\ddot{y} = \frac{g}{h}\,(y - u) \qquad (6)$$

From this equation, we can derive the time constant of an uncompensated fall of the pendulum: $T = \sqrt{h/g}$ (about 0.3 s in a typical subject). The effect of ankle stiffness is to reduce the functional value of gravity by about 70% and thus the resulting value of $T$ is quite close to the inter-peak time cited above (0.6–0.7 s). In general, we think that this is consistent with the idea that the postural stabilization process operates as a sampled data control system, synchronized with the sequence of incipient falls: such micro-falls are aborted by small anticipatory torque bursts which supplement the restoring torque due to muscle stiffness. This view is in agreement with what Zatsiorsky and Duarte (2000) call ‘‘rambling and trembling in quiet standing’’, Loram and Lakie (2002a) call ‘‘small, ballistic-like, throw and catch

movements’’, and with the fact that the activation of ankle muscles (in particular, soleus and gastrocnemii) anticipates sway instead of following it (Gatev et al. 1999). In fact, the size of the sway patterns remains a puzzle for postural control models which rely on over-critical ankle stiﬀness values: such models are intrinsically stable and thus cannot explain persistent sway patterns unless suitable ‘‘perturbations’’ are inserted in the model. In contrast, for models, which complement an under-critical ankle stiﬀness with intermittent stabilization bursts, the residual postural oscillations are an integral part, a ‘‘signature’’, of the control process. We may also speculate why the relative value of ankle stiﬀness has evolved to this level. A controller based on over-critical stiﬀness is certainly within the capabilities of the biological hardware but this would require a substantial level of co-activation, because it is known that muscle stiﬀness is approximately proportional to the muscle force, at least for muscle force levels far away from the maximum voluntary contraction (MVC). By considering the data reported by Weiss et al. (1988), who computed the variation of ankle stiﬀness as a function of ankle torque, we may estimate that muscle force should almost double only to reach the critical stiﬀness, with a further increase for assuring the known stiﬀness margin: this would require the ankle muscles to come very close to the MVC. So, although feasible, the stiﬀness control strategy appears to be quite uneconomical from the energetic point of view in the case of the ankle because maintaining a very high level of tonic muscle activity for a long time is certainly fatiguing and undesirable. Moreover, there is no evidence of a substantial co-activation in the ankle muscles during standing. 
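The numbers used in this section can be reproduced with a short calculation. The sketch below is an illustration under stated assumptions, not the authors' analysis code. It reads Eqs. 3–5 as $\tau_{tot} = mgu$, $\tau_{tonic} = mgh\,\vartheta_{ref}$ and $\tau_{elast} = K_a(\vartheta - \vartheta_{ref})$, and treats a stiffness equal to a fraction $r$ of the critical value as scaling gravity by $(1 - r)$ in Eq. 6. The function names and the sample subject (70 kg, COM height 1.0 m) are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def fall_time_constant(com_height_m: float, stiffness_ratio: float = 0.0,
                       g: float = G) -> float:
    """Time constant T = sqrt(h / g_eff) of the inverted pendulum, where
    g_eff = g * (1 - K_a / K_c) is gravity reduced by under-critical stiffness."""
    if not 0.0 <= stiffness_ratio < 1.0:
        raise ValueError("stiffness_ratio must lie in [0, 1)")
    g_eff = g * (1.0 - stiffness_ratio)
    return math.sqrt(com_height_m / g_eff)


def active_torque(mass_kg: float, com_height_m: float, cop_m: float,
                  theta: float, theta_ref: float, k_ankle: float,
                  g: float = G) -> float:
    """Phasic (active) torque: total torque minus tonic and elastic parts.
    Angles in radians, COP position in metres, torques in N m."""
    tau_tot = mass_kg * g * cop_m                        # Eq. 3
    tau_tonic = mass_kg * g * com_height_m * theta_ref   # Eq. 4
    tau_elast = k_ankle * (theta - theta_ref)            # Eq. 5
    return tau_tot - tau_tonic - tau_elast


if __name__ == "__main__":
    print(fall_time_constant(1.0))        # uncompensated fall
    print(fall_time_constant(1.0, 0.7))   # gravity reduced by about 70%
```

For a COM height of 1 m the uncompensated time constant comes out close to 0.32 s, and with the 70% reduction it lengthens to about 0.58 s, in line with the 0.3 s and 0.6–0.7 s figures quoted in the text.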
The strategy suggested above, which is based on a mix of stiﬀness and active control, is certainly more economical, although more expensive from the computational point of view. In the same line of reasoning, we may also explain why the brain did not choose to extremize the ‘‘economical’’ strategy by operating with a much lower stiﬀness level. As a matter of fact, if forced, the brain is able to master situations in which ankle stiﬀness is virtually null: this is what happens, for example, when standing on stilts or in rope walking. The reason of keeping a suitable level of stiﬀness, if possible, is to assure a suﬃcient increase of the falling time constant, from the basic value of about 0.3 s to a larger value which allows ‘‘an internal model of sensorimotor integration’’ to carry out an optimal state estimation, similar to Kalman ﬁltering, with suﬃcient accuracy. In a study with a diﬀerent sensorimotor task, Wolpert et al. (1995) provide empirical evidence that Kalman-like sensorimotor ﬁltering processes are employed commonly and their timing cycles impose speciﬁc constraints on the ability to carry out critical integration tasks, particularly if unstable as upright standing. In fact, unconventional standing tasks like standing on stilts are simpliﬁed by artiﬁcially increasing the height and/or the moment of

inertia: in both cases, the end result is a decrease in the falling time constant. In general, we think that the available evidence suggests that the ankle preﬂex is optimally set in order to meet two competing requirements: (1) minimizing eﬀort and (2) slowing down the fall to a value which is compatible with the timing of the internal body model. Since stiﬀness grows with bias force, it appears that the mechanism used by the brain for implementing the optimal preﬂex level is simply to slightly displace forward the COP with respect to the ankle joint (typically 2–3 cm): the optimal ankle stiﬀness is thus a side eﬀect of the tonic command (Jacono et al. 2004). Is the inverted pendulum model correct? In the previous section, it was implicitly assumed that the standing body could be assimilated to an inverted pendulum, thus reducing postural sway to oscillations of the ankle joint. However, can we neglect the other joints of the kinematic chain, in particular, the hip and the neck joint? In fact, it is known that under pathological conditions or simply as a result of aging, there is a tendency to substitute the ankle strategy with the hip strategy: instead of modulating the ankle torque in order to keep the COM within a safety region, the controller directly operates on the COM, by moving the upper part of the body to regain balance after each incipient fall. In fact, Eq. 6 applies to both cases, because the basic biomechanics is just the same. The so-called ankle strategy is implemented by modulating the position of the COP and thus can be more appropriately labeled as COP strategy; the hip strategy, on the other hand, has the goal of shifting the global position of the COM and thus can be labeled COM strategy (Gagey et al. 2002). 
There is no doubt that the hip strategy is less economical than the ankle strategy and the fact that it only surfaces in pathological conditions is a further proof that stabilization is an active, integrative process, not the pure eﬀect of stiﬀness. This kind of evidence also suggests that a crucial role is played by the receptors in the ankle, the ankle muscles and, moreover, in the foot, without disregarding visual and vestibular which operate on a diﬀerent level. Aging is known indeed to be characterized by a progressive impoverishment of the peripheral sensory information and thus the ‘‘noisy’’ proprioceptive channel must be ‘‘taken over’’ by other channels (visual and vestibular) beyond their usual role and this implies a shift of the focus of control. Unconventional standing tasks, like standing on stilts, are forced to use a COM strategy because the COP is constrained by the narrow surface of contact of the stilt with the ground. Coming back to the original question of this section and restricting our attention to the COP or ankle strategy, which is the stabilization mechanism of the hip and of the other more distal joints, like the neck? A


working hypothesis, in absence of direct measurements which may support it, is that in this case stabilization might be obtained by stiﬀness control, i.e. it is a preﬂex. Equation 2 applies to all the joints of the kinematic chain but the critical values of stiﬀness are likely to be diﬀerent in the diﬀerent joints for obvious biomechanical reasons: h decreases about linearly from foot to head and m decreases in an approximately cubic way. In contrast, the cross-sectional areas of the muscle groups responsible for the stabilization of the diﬀerent joints are comparable and since we know that maximum stiﬀness and force values are approximately proportional to such areas we may expect that in moving rostrally from foot to head there is an increase in the ratio between the actual and critical stiﬀness, possibly beyond unity. In conclusion, we put forward the hypothesis that one reason for accepting the simpliﬁcation of the inverted pendulum model might be that the joints of the upper body are stabilized by automatic preﬂexes. In the simpliﬁed analysis of the postural stabilization system carried out in this section we ignored the modulation of segmental reﬂexes by means of the fusimotor system. This is certainly an important feature of biological motor control in general, but is more likely to have a determinant role in other tasks, like locomotion (Ellaway et al. 2002), which are characterized by a sequencing of subtasks with diﬀerent biomechanical requirements.

Optimal preflex modulation + feedforward control in arm movements

In many skilled movements involving the arm, the success of the task depends on a combination of optimal preflex modulation and some mechanism of anticipatory feedforward compensation. In the following, we review some paradigmatic situations.

Anticipatory modulation of stiffness at impact

A classical experiment by Lacquaniti and Maioli (1989) on catching a ball shows that the goal is achieved by three elements:

1. An anticipatory response, before the impact of the ball (at least 150 ms), which consists of the co-activation of flexors and extensors of the arm and wrist
2. After impact, a transient modification of the stretch reflex
3. A further and more robust co-activation of flexors and extensors about 100 ms after the impact, rather than reciprocal inhibition.

In fact, the essential point in catching a ball is to assure a sufficiently stiff impact, so that the hand does not yield too much in the initial impact phase, and, after the impact, to quickly dissipate the kinetic energy of the ball. These two actions can be associated with the two co-activation episodes found by Lacquaniti and Maioli (1989). In particular, viscosity is likely to play a significant role in the second phase, because energy dissipation and damping mainly depend on the viscous parameter. In fact, it has been demonstrated that by modulating segmental circuits it is possible to tune the damping factor of muscle groups. Thus, in this task, preflex modulation is likely to include the anticipatory modulation of segmental reflexes by means of the fusimotor system, operating on top of the intrinsic mechanical impedance of active muscles and other viscoelastic tissues.

Feedforward compensation of interaction forces

If we consider simple reaching movements of the arm, without interaction with external objects, then we can re-write Eq. 1 as follows:

$$\tau_m = I(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + G(q) \qquad (7)$$

taking into account that the term on the left includes both active muscle torques and the torques contributed by passive properties of muscles, skin and ligaments. The equation implies that in order to drive the arm through a planned trajectory the net joint torques which must be generated must somehow contain a term proportional to acceleration (the inertial term), a term related to speed (in a quadratic way: the Coriolis or ‘‘gyroscopic’’ term), and a term related to gravity, which only depends on position. All together, the three terms are usually named interaction forces, and represent a self-generated disturbance due to the dynamic interaction among the diﬀerent degrees of freedom. Such interactions are highly non-linear: the inertial term dominates movement initiation and termination (when acceleration is high and velocity is low), whereas the Coriolis term is dominant in mid-ﬂight (when acceleration is low and velocity is high). The non-linearity of this process applies in both the spatial and time domain, as is apparent in Fig. 6 which shows the simulation results of a simple arm model (2 degrees of freedom, operating in the horizontal plane, thus eliminating the eﬀect of gravity). Starting from the same initial position, movements in diﬀerent directions are considered: slow in the top panel (duration=1.5 s); fast in the bottom panel (duration=0.75 s). The planned trajectories are straight with a symmetric bell-shaped speed proﬁle, in agreement with experimental evidence in reaching movements (Morasso 1981). The joint torques, computed according to Eq. 7 are visualized as equivalent virtual forces applied to the hand. If there were no interactions among the degrees of freedom and the arm could be assimilated to a point mass, then the virtual forces should be aligned with the trajectory: directed forward in the ﬁrst half of the movement (when acceleration is positive) and backward in the second half of the movement (when acceleration is negative). 
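The structure of Eq. 7 can be made concrete with the textbook dynamics of a two-link planar arm moving in the horizontal plane, where the gravity term vanishes. The sketch below is a generic illustration, not the simulation behind Fig. 6: all segment parameters and function names are hypothetical, and the mapping of joint torques to equivalent hand forces through the Jacobian is omitted. It separates the inertial contribution $I(q)\ddot{q}$ from the velocity-dependent contribution $C(q,\dot{q})\dot{q}$.

```python
import math

# Hypothetical segment parameters (upper arm = 1, forearm = 2), illustrative only.
M1, M2 = 1.9, 1.5        # segment masses, kg
L1 = 0.30                # upper-arm length, m
LC1, LC2 = 0.13, 0.17    # centre-of-mass distance from the proximal joint, m
I1, I2 = 0.015, 0.020    # moments of inertia about the centres of mass, kg m^2


def inertial_torques(q2, qdd1, qdd2):
    """Inertial term I(q) * qdd for shoulder and elbow (q2 = elbow angle, rad)."""
    c2 = math.cos(q2)
    m11 = I1 + I2 + M1 * LC1**2 + M2 * (L1**2 + LC2**2 + 2 * L1 * LC2 * c2)
    m12 = I2 + M2 * (LC2**2 + L1 * LC2 * c2)
    m22 = I2 + M2 * LC2**2
    return (m11 * qdd1 + m12 * qdd2, m12 * qdd1 + m22 * qdd2)


def velocity_torques(q2, qd1, qd2):
    """Coriolis/centripetal term C(q, qd) * qd, quadratic in joint velocity."""
    h = M2 * L1 * LC2 * math.sin(q2)
    return (-h * (2 * qd1 * qd2 + qd2**2), h * qd1**2)


def joint_torques(q2, qd, qdd):
    """Net joint torques of Eq. 7 with G(q) = 0 (horizontal plane)."""
    ti = inertial_torques(q2, qdd[0], qdd[1])
    tv = velocity_torques(q2, qd[0], qd[1])
    return (ti[0] + tv[0], ti[1] + tv[1])


if __name__ == "__main__":
    # Mid-flight sample: high velocity, no acceleration (values are made up).
    print(joint_torques(math.radians(90), qd=(1.0, 2.0), qdd=(0.0, 0.0)))
```

Doubling both joint velocities quadruples the velocity-dependent torques, which is one way to see why faster movements are disturbed disproportionately more than slow ones.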
In fact, such a longitudinal component is present, but there is a lateral component as well, which may be directed either to the right or to the left. This lateral push has the same order of magnitude as the longitudinal (standard) component and depends in a non-linear way on the direction and overall speed of the movement, without any obvious regularity.

Fig. 6 Visualization of the interaction forces, as end-point disturbances, during reaching movements in different directions and with different duration

Fig. 7 Reaching movements in the horizontal plane. Trajectories sampled at 100 Hz (above) and corresponding speed profiles (below)

In spite of this strong non-linearity and anisotropy, human reaching movements are characterized by remarkable space- and time-invariant features (Morasso 1981), which are apparent in Fig. 7. Which mechanism allows the brain to obtain such remarkable results? Years ago, it was suggested that a stiffness strategy was sufficient and that the brain might limit its processing to an equilibrium trajectory (the so-called equilibrium-point hypothesis, Bizzi et al. 1992). Although in this control paradigm there is no critical value of stiffness for achieving the goal (in contrast to the standing task), it is quite clear that the discrepancy between the equilibrium and the real trajectory is a monotonic function of the level of stiffness. The direct measurement of the end-point stiffness during the trajectory (Gomi and Kawato 1996) demonstrated that the physiological stiffness levels were insufficient to compensate adequately for the interaction forces and thus an additional control mechanism was necessary. There are several lines of evidence that this mechanism operates in feedforward, at least in the initial part of the trajectory. One of them comes from the


comparison between the reaching movements of normal subjects and the movements of cerebellar ataxic patients (Sanguineti et al. 2003). As shown in Fig. 8, the ability to solve the task, i.e. to reach the targets, is preserved in the cerebellar patient and the main difference is that the movements of the patient lose the spatiotemporal invariance typical of the normal subjects: the trajectories are distorted and the distortion pattern varies with movement direction, amplitude, and speed. In fact, interaction forces, which depend on speed and acceleration, vanish before starting and after terminating the movements and thus do not prevent the patients from reaching the target if they ignore them when producing the motor response to the presentation of the target. Interaction forces are automatically elicited at movement onset and are thus likely to deflect the planned trajectory sideways immediately, unless compensated. This effect can be estimated by looking at the aiming error, i.e. the angular difference between the initial movement direction and the target direction. This error is significant in cerebellar patients and was found to be consistent with the lateral pattern of interaction forces (Sanguineti et al. 2003). Consider for example the initial part of the movements in Fig. 8. The aiming error is clearly larger in the rightward movements than in the leftward movements, in agreement with the simulation (Fig. 6) in which the lateral component of the interaction force is larger in the former case than in the latter at movement onset. This consideration only applies to the initial part of the movement (roughly up to the time of peak velocity); the final part of the movement can be modified by the incoming feedback, if necessary, and the patients appear to perform secondary corrective movements in order to compensate for the accumulation of errors in the initial part.

Fig. 8 Reaching movements and corresponding speed profiles for a normal subject (left) and for a cerebellar ataxic patient (right)

In conclusion, this kind of result not only suggests that the compensation of interaction forces is achieved by means of a feedforward control, based on an internal model of dynamics equivalent to Eq. 7, but it also suggests that the cerebellum is a likely site for the model. Thus the compensation of interaction forces is achieved by a combination of muscle stiffness (preflex) and feedforward control. If the feedforward control is deficient, the subject can use a combination of backup strategies: (1) increasing stiffness (it reduces deformation); (2) decreasing speed (it reduces interaction forces); and (3) stepping up the role of feedback, particularly in the final part of the movement.

Feedforward compensation of inertia anisotropy

The non-linearity and anisotropy of arm dynamics are also apparent in the analysis of the hand mechanical impedance. By applying small perturbations and measuring the force response (Fig. 9, top left panel), it is possible to estimate the mechanical parameters of the hand (mass, viscosity, stiffness) and investigate how they vary with the position of the arm and the direction of the disturbance (Tsuji et al. 1995). In particular, the dependence on direction can be represented by means of an ellipse: Fig. 9 shows the distribution of the inertia, viscosity and stiffness ellipses, respectively, in different parts of the workspace. The fact that these ellipses are far from round is an index of the degree of anisotropy of the arm

Fig. 9 Measurement scheme of the hand mechanical impedance (top left panel). Inertia ellipses (top right panel). Viscosity ellipses (bottom right panel). Stiffness ellipses (bottom left panel). Note that inertia ellipses are approximately aligned with the forearm and viscosity and stiffness ellipses are approximately aligned in a polar way

musculoskeletal system, again a relevant feature which must somehow be taken into account by the brain and compensated for. Let us consider here the apparent inertia, which is characterized by the fact that the inertia ellipse is approximately aligned with the forearm. This means, for example, that in hitting something the best hitting direction (which allows one to transfer the maximum kinetic energy to the object) is aligned with the forearm. Moreover, in reaching movements the apparent inertia “perceived” by the motor system varies with movement direction. This was made explicit by the experiments on reaching movements carried out by Ghez et al. (1994), who found that the acceleration peaks do vary with movement direction, in agreement with the apparent inertia of the hand. Starting from this experimental basis, Flanagan and Lolley (2001) studied reaching movements of the index finger in different directions with the finger pushing a weight. The friction of the weight with the supporting surface was minimized by means of an air sled, but the subjects had to solve a delicate coordination problem: they pushed the weight not from behind but from the top, so the finger was in danger of slipping after movement onset. The theory of friction says that, in order to avoid slippage, the force component tangential to the contact surface must be less than a given fraction of the perpendicular component (the friction coefficient of the surface). In the pushing experiment by Flanagan and Lolley (2001) the tangential force varied with movement direction in agreement with the variation of the acceleration found by Ghez et al. (1994); as a consequence, the critical value of the normal component also varied with movement direction in a proportional way. Although the task could be solved by setting the value of the normal force according to the “worst case”, the subjects chose the more economical strategy of modulating the normal force in anticipation of the expected inertial force: energetic efficiency was obtained at the expense of computational complexity. Again, this is an example of feedforward control which captures the non-linearity and anisotropy of the biomechanical plant and compensates for it by means of an internal model of movement dynamics.
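The difference between the “worst case” and the anticipatory strategy can be illustrated with a small numerical sketch. It is only an illustration of the slip constraint (tangential force no larger than the friction coefficient times the normal force): the friction coefficient, the load mass, the safety margin and the direction-dependent acceleration profile are all invented values, not data from Flanagan and Lolley (2001) or Ghez et al. (1994).

```python
import math

MU = 0.6          # assumed friction coefficient between finger and object
LOAD_MASS = 0.5   # assumed mass of the pushed weight (kg)
MARGIN = 1.2      # assumed safety margin applied to the normal force


def peak_acceleration(theta, a_max=6.0, a_min=3.0, forearm=math.radians(45)):
    """Hypothetical direction-dependent peak acceleration (m/s^2).

    The acceleration is lowest along the forearm axis, where the apparent
    inertia of the hand is largest, and highest perpendicular to it.
    """
    delta = theta - forearm
    return a_min + (a_max - a_min) * math.sin(delta) ** 2


def min_normal_force(tangential):
    """Smallest normal force that avoids slip: tangential <= MU * normal."""
    return tangential / MU


directions = [math.radians(d) for d in range(0, 360, 45)]
tangential = [LOAD_MASS * peak_acceleration(t) for t in directions]

# Anticipatory strategy: scale the normal force with the expected load.
anticipatory = [MARGIN * min_normal_force(ft) for ft in tangential]
# Worst-case strategy: one fixed normal force that is safe in every direction.
worst_case = MARGIN * min_normal_force(max(tangential))

for t, fn in zip(directions, anticipatory):
    print(f"{math.degrees(t):5.0f} deg  anticipatory {fn:5.2f} N  "
          f"worst case {worst_case:5.2f} N")
```

In this toy setting both strategies prevent slip in every direction, but the anticipatory one never uses more normal force than the fixed worst-case setting, at the price of having to predict the load for each direction.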

Motor learning in dynamic environments

In the most general situation, arm movements occur in a dynamic environment in which, in addition to the unavoidable internal forces considered in the previous section, there are also external forces, characterized by more or less complex non-linearities and/or anisotropies. As regards internal forces, we summarized the experimental evidence that, in a large variety of situations, the motor control problem is solved by complementing the mitigating effect of preflexes with the feedforward control provided by an internal model of the self-generated force disturbances. The fundamental question is: can we extend the same paradigm to external forces as well and, if so, what are the limits of this approach? A very general setup for studying problems of this kind is a robotized haptic interface (Fig. 10, above). It consists of a manipulandum, usually a planar arm with 2 degrees of freedom, which can apply external forces to the hand, with arbitrary direction and intensity, operating as a bi-directional proprioceptive channel: mechanical energy flows back and forth between the


Fig. 10 Robotized haptic interface (above). Arm reaching movements carried out in a viscous force field in the initial phase of learning (below)

biological and the artificial systems, providing on-line multidimensional information about shape, surface and mass characteristics which are essential for skilled manipulation. In the most general situation, the external force vector is generated as a function of the hand position and kinematics:

$$F_{ext} = f(x, \dot{x}, \ddot{x}) \qquad (8)$$

having defined x as the hand position/orientation in the workspace. In this way, it is possible to generate force fields which simulate arbitrary paradigms of external forces, contacts, etc. For example, Burdet et al. (2001) generated an unstable force field which simulated a sort of inverted pendulum. The subjects were required to perform reaching movements between two targets (A and B). The haptic interface generated lateral forces (perpendicular to the line joining A and B) which were proportional to the lateral displacement and therefore null only along the nominal trajectory: the rate of growth of this toppling force set a critical value of the stiffness, as defined in the case of the standing posture. Initially the subjects were unable to reach the targets because of the unexpected lateral push and the insufficient hand stiffness. Gradually, however, they learned to compensate for the toppling forces by “optimally” matching the end-point stiffness to the destabilizing force field. This was obtained not only by an overall increase in the size of the stiffness ellipses, which could be obtained simply by proportionally co-activating all the arm muscles, but also by a re-arrangement of the global muscle activation patterns which was equivalent to an “optimal rotation” of the stiffness ellipses: a rotation which aligned the long axis of the ellipses perpendicular to the nominal trajectory. This example shows that preflexes can be conditioned in new and previously unexperienced situations and that a well-conditioned preflex can solve novel interaction problems.

In the example above, the force field vanishes if the subject is able to reproduce the nominal trajectory and in fact, after learning, the reorganization of motor control is characterized both by an optimal impedance matching and by a reduction of the variability of the generated trajectory: the result is a “super-normal” performance with very regular trajectories, which virtually “hide” the effect of the field. In other dynamic environments learning is different, in the sense that it is impossible to “hide” the field after learning and it is necessary to resist it in a direct way. For example, Fig. 10 (below) shows the effect of a viscous field which generates leftward lateral forces proportional to the speed of the movement: in this field the external force is null at the beginning and the end of the movements and reaches its peak value in mid-flight. In the initial trials of exposure to this field, the deformation of the reaching movement is substantial, but with suitable practice (a few hundred trials) the subjects have no difficulty in recovering the unperturbed spatiotemporal patterns.
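The two kinds of force field discussed above are special cases of Eq. 8 and can be sketched in a few lines. The fragment below is a hedged illustration, not the controller of any actual haptic interface: the gains `k` and `b`, the function names, and the bell-shaped (minimum-jerk) speed profile used to drive the viscous field are all assumptions introduced for the example.

```python
def divergent_field(lateral_disp, k=300.0):
    """Position-dependent unstable field (hypothetical gain k in N/m).

    The lateral force pushes the hand further away from the straight
    line joining the targets and vanishes on the nominal trajectory.
    """
    return k * lateral_disp


def is_stable(hand_stiffness, k=300.0):
    """The lateral hand stiffness must exceed the field's rate of growth."""
    return hand_stiffness > k


def viscous_field(speed, b=15.0):
    """Velocity-dependent field (hypothetical gain b in N s/m).

    Produces a leftward (negative) lateral force proportional to speed.
    """
    return -b * speed


def bell_speed(t, duration=0.6, distance=0.2):
    """Speed of an assumed minimum-jerk reach of given duration and length."""
    s = t / duration
    return (distance / duration) * 30.0 * s ** 2 * (1.0 - s) ** 2


DURATION = 0.6
times = [i * DURATION / 10 for i in range(11)]
lateral_force = [viscous_field(bell_speed(t, DURATION)) for t in times]

for t, f in zip(times, lateral_force):
    print(f"t = {t:4.2f} s   lateral force = {f:6.2f} N")
```

With these assumed profiles the viscous force is zero at movement onset and termination and largest in magnitude at mid-movement, whereas the divergent field is zero only when the lateral displacement is zero, so it can be “hidden” by a sufficiently straight and stiff movement.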
In a computational framework for describing how humans learn to modulate arm impedance for interaction with a novel force field, an essential component of the adaptive system can be formulated in terms of internal models (IMs) which predict the temporal pattern of sensory feedback that should be received for the pattern of muscle activations that has been commanded. The error or mismatch is computed at the level of reflexes and serves not only to modulate the impedance at the individual muscle level but, perhaps more importantly, to provide crucial information for adaptation of the IM. This adaptation is the key element which allows the human limb to work smoothly in a wide variety of mechanical environments with a sufficient degree of “generalization” (Goodbody and Wolpert 1998; Karniel and Mussa-Ivaldi 2003; Davidson and Wolpert 2004; Della Maggiore et al. 2004; among others). The pattern of generalization of the IM after training on a small data space suggests that the basis functions used by the brain to represent the IM are state dependent and not time dependent, and that the domain of these basis functions has a coordinate system close to that of the muscle sensors as opposed to visual sensors (Shadmehr and Mussa-Ivaldi 1994). Furthermore, the ability to learn control in multiple environments is made possible by a gradual, time-dependent process during which the parameters of the recently learned IM become stable and resist change (Brashers-Krug et al. 1996). This appears to be an organizing mechanism for the adaptive controller, allowing it to separate the dynamics of the environment into distinct classes and presumably to facilitate future identification after only a few movements in the field.
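The idea of an IM built from state-dependent basis functions and adapted by a prediction error can be sketched as follows. This is a generic delta-rule (LMS) illustration and not the model proposed in any of the cited studies: the Gaussian basis over hand speed, the learning rate, the number of trials and the viscous test field are all assumptions chosen for the example.

```python
import math

CENTERS = [i * 0.1 for i in range(8)]  # assumed basis centres over speed (m/s)
WIDTH = 0.1                            # assumed width of each Gaussian basis


def basis(speed):
    """State-dependent basis: Gaussian bumps over hand speed, not over time."""
    return [math.exp(-((speed - c) / WIDTH) ** 2 / 2.0) for c in CENTERS]


def predict(weights, speed):
    """Force predicted by the internal model for the given hand speed."""
    return sum(w * g for w, g in zip(weights, basis(speed)))


def true_field(speed, b=15.0):
    """Hypothetical viscous field that the internal model has to learn."""
    return -b * speed


def train(n_trials=300, rate=0.05):
    """Trial-by-trial adaptation driven by the force prediction error."""
    weights = [0.0] * len(CENTERS)
    speeds = [i * 0.05 for i in range(13)]  # speeds sampled within a movement
    errors = []
    for _ in range(n_trials):
        total = 0.0
        for v in speeds:
            err = true_field(v) - predict(weights, v)  # mismatch signal
            g = basis(v)
            weights = [w + rate * err * gi for w, gi in zip(weights, g)]
            total += abs(err)
        errors.append(total / len(speeds))
    return weights, errors


weights, errors = train()
print(f"mean |error| on first trial: {errors[0]:.3f} N")
print(f"mean |error| on last trial:  {errors[-1]:.3f} N")
```

Because the basis functions depend on the state (here, speed) rather than on time, what is learned at the sampled speeds carries over to nearby speeds that were never visited, which is one simple way to picture the local generalization mentioned above.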

Conclusions

We have reviewed the crucial role of preflexes and the fact that they must be optimally matched to the specific interaction modalities of any given task. In some cases, this optimization can be carried out off-line, as a result of phylogeny or engineering design; in other cases, it must be carried out on-line as an integral part of the learning process. Furthermore, there is a complex system of internal models for implementing feedforward and feedback control paradigms, but the lesson for the designer of biomimetic control systems is that the peripheral part of the system is really crucial and biomimesis must begin from there. This is the first general guideline for biomimetic design. The second is that in no way must the “control software” be conceived and designed independently of the physical characteristics of the body-environment interaction as regards sensors, actuators and materials. On the contrary, just as the nervous tissue is optimally matched to the sensor and muscle tissue and, ultimately, to the external world, so the computational machinery must be carefully matched to the overall system of interactions.

References

Bizzi E, Hogan N, Mussa-Ivaldi FA, Giszter SF (1992) Does the nervous system use equilibrium-point control to guide single and multiple movements? Behav Brain 15:603–613
Brashers-Krug T, Shadmehr R, Bizzi E (1996) Consolidation in human motor memory. Nature 382:252–255
Brown IE, Loeb GE (1997) A reductionist approach to creating and using neuromusculoskeletal models. In: Winters JM, Crago PE (eds) Biomechanics and neural control of movement. Springer, Berlin Heidelberg New York, pp 148–163
Burdet E, Osu R, Franklin DW, Milner TE, Kawato M (2001) The central nervous system stabilizes unstable dynamics by learning optimal impedance. Nature 414:446–449
Casadio M, Morasso P, Sanguineti V (2004) Braccio di ferro: a new robotized haptic interface. Gait Posture (in press)
Davidson PR, Wolpert DM (2004) Scaling down motor memories: de-adaptation after motor learning. Neurosci Lett 370:102–107
Della Maggiore V, Malfait N, Ostry DJ, Paus T (2004) Stimulation of the posterior parietal cortex interferes with arm trajectory adjustments during the learning of new dynamics. J Neurosci 24:9971–9976
Ellaway P, Taylor A, Durbaba R, Rawlinson S (2002) Role of the fusimotor system in locomotion. Adv Exp Med Biol 508:335–342
Flanagan JR, Lolley S (2001) The inertial anisotropy of the arm is accurately predicted during movement planning. J Neurosci 21:1361–1369
Gagey PM, Bizzo G, Ouaknine M, Weber B (2002) Two mechanical models for postural stabilization: the tactics of the center of gravity and the tactics of the center of pressure. http://perso.club-internet.fr/pmgagey/TactiqueDuPied-a.htm
Gatev P, Thomas S, Thomas K, Hallet M (1999) Feedforward ankle strategy of balance during quiet stance in adults. J Physiol 514:915–928
Gerritsen KG, van den Bogert AJ, Hulliger M, Zernicke RF (1998) Intrinsic muscle properties facilitate locomotor control - a computer simulation study. Motor Contr 2:206–220
Ghez C, Gordon J, Ghilardi MF, Sainburg RL (1994) Contributions of vision and proprioception to accuracy in limb movements. In: Gazzaniga MS (ed) The cognitive neurosciences. MIT, Cambridge, pp 549–564
Gomi H, Kawato M (1996) Equilibrium-point control hypothesis examined by measured arm stiffness during multijoint movement. Science 272:117–120
Goodbody SJ, Wolpert DM (1998) Temporal and amplitude generalization in motor learning. J Neurophysiol 79:1825–1838
von Holst E, Mittelstaedt H (1950) Das Reafferenzprinzip. Wechselwirkungen zwischen Zentralnervensystem und Peripherie. Naturwissenschaften 37:464–476
Hunter IW, Kearney RE (1982) Dynamics of human ankle stiffness: variation with mean torque. J Biomech 15:747–752
Jaax KN, Hannaford B (2002) A biorobotic structural model of the mammalian muscle spindle primary afferent response. Ann Biomed Eng 30:84–96
Jacono M, Casadio M, Morasso P, Sanguineti V (2004) The sway density curve and the underlying postural stabilization process. Motor Contr 8:292–311
Karniel A, Mussa-Ivaldi FA (2003) Sequence, time, or state representation: how does the motor control system adapt to variable environments? Biol Cybern 89:10–21
Lacquaniti F, Maioli C (1989) The role of preparation in tuning anticipatory and reflex responses during catching. J Neurosci 9:134–148
Loram ID, Lakie M (2002a) Human balancing of an inverted pendulum: position control by small, ballistic-like, throw and catch movements. J Physiol 540:1111–1124
Loram ID, Lakie M (2002b) Direct measurement of human ankle stiffness during quiet standing: the intrinsic mechanical stiffness is insufficient for stability. J Physiol 545:1041–1053
Morasso P (1981) Spatial control of arm movement. Exp Brain Res 42:223–227
Morasso P, Sanguineti V (2002) Ankle stiffness alone cannot stabilize upright standing. J Neurophysiol 88:2157–2162
Morasso P, Schieppati M (1999) Can muscle stiffness alone stabilize upright standing? J Neurophysiol 82:1622–1626
Peterka RJ (2000) Postural control model interpretation of stabilogram diffusion analysis. Biol Cybern 83:335–343
Proske U, Gregory JE (2002) Signalling properties of muscle spindles and tendon organs. Adv Exp Med Biol 508:5–12
Sanguineti V, Morasso P, Baratto L, Brichetto G, Mancardi GL, Solaro C (2003) Cerebellar ataxia: quantitative assessment and cybernetic interpretation. Human Move 22:189–205
Shadmehr R, Mussa-Ivaldi FA (1994) Adaptive representation of dynamics during learning of a motor task. J Neurosci 14:3208–3224
Tsuji T, Morasso P, Goto K, Ito K (1995) Human hand impedance characteristics during maintained posture in multi joint arm movements. Biol Cybern 72:475–485
Wagner H, Blickhan R (2003) Stabilizing function of antagonistic neuromusculoskeletal systems: an analytical investigation. Biol Cybern 89:71–79
Wagner H, Blickhan RR (1999) Stabilizing function of skeletal muscles: an analytical investigation. J Theor Biol 199:163–179
Weiss PL, Hunter IW, Kearney RE (1988) Human ankle joint stiffness over the full range of muscle activation levels. J Biomech 21:539–544
Winter DA, Patla AE, Prince F, Ishac M (1998) Stiffness control of balance in quiet standing. J Neurophysiol 80:1211–1221
Wolpert DM, Ghahramani Z, Jordan MI (1995) An internal model for sensorimotor integration. Science 269:1880–1882
Zatsiorsky VM, Duarte M (2000) Rambling and trembling in quiet standing. Motor Contr 4:185–200
