
Int. J. Vehicle Autonomous Systems, Vol. 6, Nos. 1/2, 2008

Learning to drive the human way: a step towards intelligent vehicles

Michel Pasquier* and Richard J. Oentaryo
Centre for Computational Intelligence, Nanyang Technological University, Nanyang Avenue, Singapore 639798
Fax: +65-679-26-559
E-mail: [email protected]
E-mail: [email protected]
*Corresponding author

Abstract: This paper describes a series of works on the development of an intelligent driving system that learns from example. In this approach, inspired by the control mechanisms in the human cerebellum, driving skills are modelled as continuous decision-making processes using approximate rules that map sensory input onto control output. Since designing such a rule set is difficult, we aim at capturing human expertise by automatically extracting the rules from sample data. A driving simulator provides both scenarios and data collection features, while a learning system comprising several self-organising neuro-fuzzy rule-based subsystems realises the driving model. Driving skills successfully achieved so far include operational manoeuvres such as reverse/parallel parking and U-turn, validated both in simulation and using a microprocessor-controlled model car, as well as lane-following and lane-changing. Also discussed is the emergence of tactical driving skills, such as deciding when to overtake, to automatically handle various traffic situations.

Keywords: autonomous driving; brain-inspired cognitive architecture; neuro-fuzzy system; Generic Self-organising Fuzzy Neural Network realising Yager; GenSoFNN-Yager; intelligent transportation systems; parallel parking; pseudo outer-product fuzzy neural network using the compositional rule of inference; POPFNN-CRI; reverse parking; tactical driving; U-turn.

Reference to this paper should be made as follows: Pasquier, M. and Oentaryo, R.J. (2008) 'Learning to drive the human way: a step towards intelligent vehicles', Int. J. Vehicle Autonomous Systems, Vol. 6, Nos. 1/2, pp.24–47.

Biographical notes: Michel Pasquier received a diploma in Electrical Engineering and a PhD in Computer Science in 1985 and 1988, respectively, from the National Polytechnic Institute of Grenoble, France. From 1989 to 1994, he worked as a Researcher in Tsukuba, Japan, at the ElectroTechnical Laboratory and then at Sanyo Electric. In 1994, he joined Nanyang Technological University, Singapore, where he teaches Artificial Intelligence and other Computer Science courses; he is the Cofounder and Director of the Centre for Computational Intelligence. His research interests include cognitive systems, adaptation and learning, and nature-inspired systems, with applications in intelligent transportation, robotics and automation.

Copyright © 2008 Inderscience Enterprises Ltd.


Richard J. Oentaryo received a BE (first class honours) in Computer Engineering from Nanyang Technological University (NTU), Singapore, in 2004. He also received the Information Technology Management Association Gold Medal cum Book Prize Award for the most outstanding Final Year Project in 2004. Presently, he is pursuing a PhD at the Centre for Computational Intelligence, NTU. His main research interests include cognitive architectures, fuzzy systems, neural networks and computational neuroscience.

1 Introduction

Traffic statistics show that human errors remain by far the primary cause of accidents. According to a study by the US Department of Transportation, lapses in driver attention contribute to the majority of traffic crashes (Wang et al., 1996). Improving safety is thus a major concern, and the development of in-car technologies for monitoring, prevention and guidance has accordingly become a key research area (Aparicio et al., 2005). Despite the challenge, the potential benefit of automated driving systems is slowly being recognised, with the dual aim of reducing the burden on the human driver while providing both safe and smooth vehicle operation.

This paper presents an overview of the work conducted since 2000 at the Centre for Computational Intelligence (C2i), Nanyang Technological University (NTU), towards the design of intelligent transportation systems that learn to operate entirely from example (Pasquier et al., 2001). In this approach, driving skills are modelled as continuous decision-making processes involving a set of approximate rules that relate sensory inputs to control outputs, as plausibly justified from the perspective of the control mechanisms in the human cerebellum (Kandel et al., 2000). Since designing such rules is notoriously difficult, and a well-known issue in rule-based and fuzzy systems (Lin and Lee, 1996), one main objective of our research is to make use of human expertise implicitly, which requires automatically extracting the control rules from sample training data. For this purpose, a driving simulator has been realised to provide both scenarios and data collection capabilities. Subsequently, a number of self-organising neuro-fuzzy rule-based systems have been developed and employed over the years to realise the desired driving model. These are generic cognitive systems that can be and have been applied to many other application domains (Oentaryo and Pasquier, 2006; Quek et al., 2001, 2006; Quek and Zhou, 2002; Tung and Quek, 2005; Tung et al., 2004).

The skills successfully achieved so far include operational manoeuvres such as low-speed reverse and parallel parking as well as U-turn, validated both in simulation and using a microprocessor-controlled model car, then high-speed lane-keeping, lane-changing and overtaking behaviours, and finally specific tasks such as negotiating red-light crossings. The latest developments include the emergence of tactical driving skills that allow the intelligent vehicle to automatically handle various traffic situations by deciding, for instance, when to stay put and when to overtake. The following sections describe the experiments conducted and results obtained so far, and end with a discussion of current issues and future work. While fully automated robot taxis and autonomous vehicles may still be far away, the technologies developed as a result of this research could readily be used in intelligent guidance and assistance systems.


2 Learning and memory systems

Vehicle driving involves complex decision-making processes that are characterised by multiple, interdependent stages involving mappings between sensory inputs and control outputs. To attain some degree of expertise in such processes, a certain amount of practice is required. This exemplifies the notion of decision-making as a form of cognitive skill that can be developed through practice. Cognitive skill involves the ability to effectively exploit one's knowledge in the execution of cognitive processes (Anderson, 1981). In this respect, learning from examples has been established as a critical factor in facilitating the gradual transition from a novice's slow and laborious execution to an expert's rapid and accurate execution of a skilled behaviour (Tomporowski, 2003). Such proficiency development is achieved through the accumulation, recognition and refinement of salient features from past experiences (Anderson, 1981). Central to this development is thus the ability to acquire new knowledge (i.e. learning) and to retain that knowledge for later retrieval (i.e. memory).

As part of our endeavour to model the acquisition of driving skills, a brain-inspired cognitive architecture modelling learning and memory systems is being developed to describe the mechanisms underlying cognitive skill acquisition. Of particular interest in this cognitive architecture is the cerebellum, a major site for procedural memory in the brain that supports the acquisition of a variety of habit and control skills (Kandel et al., 2000). In this project, a number of neuro-fuzzy systems have been employed to identify and extract from data the relevant control rules and to subsequently emulate the functionality of the cerebellum. Neuro-fuzzy systems synergise fuzzy logic and neural network technologies by combining the human-like reasoning style of the former with the learning and adaptation capabilities of the latter (Lin and Lee, 1996). Such hybrid systems allow bridging the gap between low-level neural mechanisms and high-level symbolic cognition (using fuzzy IF-THEN linguistic rules), and thus constitute crucial components of the proposed cognitive architecture. In this application, the neuro-fuzzy systems developed are found capable of learning and realising various vehicle manoeuvres in a similar manner to humans (in terms of car trajectory, speed/brake adjustment, etc.).

2.1 POPFNN-CRI

Our initial work on the automated vehicle system employed the Pseudo Outer-Product Fuzzy Neural Network using the Compositional Rule of Inference (POPFNN-CRI) (Ang et al., 2003) to realise several vehicle manoeuvres, as reported in Section 4. POPFNN-CRI is a five-layer neural network realising a multi-input, multi-output fuzzy system. It is part of the POPFNN family of neuro-fuzzy systems that share the same architecture and self-organising method while realising different fuzzy inference schemes (Ang et al., 2003; Quek and Singh, 2005; Quek and Zhou, 1999; Quek and Zhou, 2001). The structure of the POPFNN-CRI network is depicted in Figure 1.

Nodes in layer 1 (input layer), called input linguistic nodes, represent the input variables such as velocity, distance information (from proximity sensors and vision system), etc. In this layer, the non-fuzzy inputs are fuzzified into a fuzzy singleton and then transmitted directly to layer 2. The layer 2 (condition layer) nodes are called input label nodes and correspond to the input linguistic labels such as 'slow', 'moderate' and 'fast' for the velocity input. Nodes in layer 3 (rule-based layer) are called rule nodes, each of which represents a fuzzy rule (e.g. 'IF velocity is Slow and route is Clear THEN accelerator is High') and implements the CRI scheme. The layer 4 (consequence layer) nodes are called output label nodes and correspond to the output linguistic labels such as 'Low', 'Medium' and 'High' for the accelerator control. Each node in layers 2 and 4 is represented by a trapezoidal membership function. Lastly, nodes in layer 5 (output layer), called output linguistic nodes, represent the output variables and perform output defuzzification (Lin and Lee, 1996) to compute the final output values from the resultant output fuzzy sets. More details about the layer operations can be found in Ang et al. (2003).

Figure 1 The POPFNN-CRI network

Source: Adapted from Ang et al., 2003.
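To make the five-layer flow above concrete, the following sketch runs a toy version of such an inference pass. The trapezoid parameters, the single-antecedent rules and the weighted-average defuzzifier are illustrative assumptions, not the exact POPFNN-CRI operations (which are detailed in Ang et al., 2003).

```python
# A minimal sketch of a five-layer fuzzy inference pass of the kind described above.
# Labels, rules and the defuzzifier are assumed for illustration only.
def trapezoid(x, a, b, c, d):
    """Layer 2/4 label membership: trapezoidal fuzzy set (a, b, c, d)."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

# layer 2: input labels for velocity (normalised 0..1): slow / moderate / fast
velocity_labels = {"slow": (0.0, 0.0, 0.2, 0.4), "moderate": (0.2, 0.4, 0.6, 0.8),
                   "fast": (0.6, 0.8, 1.0, 1.0)}
# layer 4: output labels for the accelerator: low / medium / high (centres for defuzzification)
accel_labels = {"low": 0.2, "medium": 0.5, "high": 0.8}
# layer 3: rule nodes, e.g. "IF velocity is slow THEN accelerator is high"
rules = [("slow", "high"), ("moderate", "medium"), ("fast", "low")]

def infer(velocity):
    # layer 1: singleton fuzzification of the crisp input; layer 2: label memberships
    mu = {name: trapezoid(velocity, *p) for name, p in velocity_labels.items()}
    # layer 3: rule firing strengths (single-antecedent rules, so just the membership)
    firing = [(mu[ante], cons) for ante, cons in rules]
    # layers 4-5: weighted-average defuzzification over the output label centres
    num = sum(f * accel_labels[c] for f, c in firing)
    den = sum(f for f, _ in firing) or 1.0
    return num / den

print(round(infer(0.15), 3))   # low velocity -> accelerator pushed towards "high"
```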

The learning process of the POPFNN-CRI network comprises two stages. The first stage aims at deriving the fuzzy membership functions of the input and output variables using the Fuzzy Kohonen Partition (FKP) or Pseudo Fuzzy Kohonen Partition (PFKP) algorithm (Ang, 2000). The former is a supervised learning algorithm that uses known information about the clusters of the data samples to generate fuzzy partitions, whereas the latter is an unsupervised learning algorithm that uses competitive learning to construct fuzzy partitions. In the second stage, a novel POP algorithm (Quek and Zhou, 2001) is employed to identify the fuzzy rules. The POP algorithm is a simple one-pass learning algorithm that can identify relevant rules in an intuitive manner. In this scheme, among the links between the rule nodes and the output linguistic labels, the link with the highest weight is selected and the rest are pruned. The remaining rule nodes after the link selection process constitute the final rules identified by the system.
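As an illustration of the one-pass link-selection step just described, the sketch below keeps, for each rule node, only the link to the output label with the highest accumulated weight and prunes the rest. The weight matrix and the pruning threshold are assumed for the example; the actual POP algorithm is specified in Quek and Zhou (2001).

```python
# A minimal sketch (assumed weight matrix, not the actual POPFNN implementation) of the
# link-selection idea: keep the strongest rule-to-output-label link, prune the others.
import numpy as np

rng = np.random.default_rng(0)
n_rules, n_output_labels = 6, 3
# link_weights[r, k]: accumulated strength between rule node r and output label k,
# e.g. summed firing strengths over one pass of the training data (assumption)
link_weights = rng.random((n_rules, n_output_labels))

selected = link_weights.argmax(axis=1)                 # winning output label per rule node
pruned = np.zeros_like(link_weights)
pruned[np.arange(n_rules), selected] = link_weights[np.arange(n_rules), selected]

# rule nodes whose strongest link is still negligible could be dropped entirely (assumption)
keep = pruned.max(axis=1) > 0.1
print("rules kept:", np.flatnonzero(keep), "consequent labels:", selected[keep])
```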

2.2 GenSoFNN-Yager

A more recent development of the automated vehicle system at C2i focuses on the use of a Generic Self-organising Fuzzy Neural Network realising Yager inference (GenSoFNN-Yager) to perform various manoeuvres, as described in Section 4. GenSoFNN-Yager is a dynamically-evolving system that can automatically formulate a consistent rule base, in which each fuzzy label is uniquely represented by one fuzzy set (the order of which is unchanged) and no ambiguous or obsolete rules remain upon completion of training. It belongs to the GenSoFNN family of neuro-fuzzy systems that have similar structure and self-organising features but realise different inference methods (Oentaryo and Pasquier, 2006; Quek et al., 2006; Tung and Quek, 2002, 2005). The network models the Yager fuzzy inference scheme (Keller et al., 1992). The main benefit of this scheme is that when the input matches the antecedent exactly, the resultant output is exactly the consequence of the rule, which is not the case in the CRI scheme and other conventional approaches. The scheme is also conceptually clear and maps strongly to Boolean/crisp logic, which in a way resembles the human intuitive way of logical reasoning. As such, the Yager inference provides the GenSoFNN-Yager system with a stronger logical foundation than other systems employing conventional inference methods. An experimental validation of this conjecture is provided in Oentaryo and Pasquier (2006).

The GenSoFNN-Yager network, illustrated in Figure 2(a), comprises five layers of neurons. As with the POPFNN-CRI, the input fuzzification and output defuzzification in the GenSoFNN-Yager network are performed in layer 1 (input layer) and layer 5 (output layer), respectively. Nodes in layer 2 (antecedent layer) are called antecedent nodes; they represent the input linguistic labels (fuzzy sets) and compute the membership values indicating the degree of matching between the inputs and the fuzzy sets. Each node in layer 3 (rule layer) is a rule node that computes the degree of fulfilment of the fuzzy rule it represents. Layer 4 (consequence layer) consists of consequence nodes that represent the output linguistic labels (fuzzy sets). Detailed operations of the network layers are given in Oentaryo (2005) and Oentaryo and Pasquier (2004).

The learning mechanism of the GenSoFNN-Yager network, as shown in Figure 2(b), consists of three stages: self-organising, rule mapping and parameter learning, all of which take place in a single pass of the training data (i.e. online learning). A Discrete Incremental Clustering (DIC) algorithm (Oentaryo, 2005; Tung and Quek, 2002) is employed in the self-organising phase to craft the input and output label nodes. In the rule formulation phase, the RuleMAP algorithm (Oentaryo, 2005; Tung and Quek, 2002) is used to construct the fuzzy rules linking the derived input and output labels. Finally, the parameter learning stage employs the back-propagation algorithm (Rumelhart et al., 1986) to update the parameters of the input and output labels.

Figure 2 The GenSoFNN-Yager network, (a) architecture of the GenSoFNN-Yager network and (b) learning process of the GenSoFNN-Yager network

Source: Adapted from Oentaryo and Pasquier (2006).
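The single-pass, self-organising character of this learning scheme can be conveyed with the toy learner below. It is not the actual DIC, RuleMAP or back-propagation procedure: the triangular memberships, the plateau threshold for creating a new label and the centre-nudging update are all assumptions, kept only to show how labels and rules can be grown in one pass over the data.

```python
# A minimal sketch (not the actual DIC/RuleMAP/backprop algorithms) of single-pass,
# self-organising rule formation: labels are created on demand per feature, rules link
# the winning input labels to the winning output label, and centres are nudged per sample.
import numpy as np

class OnePassRuleLearner:
    def __init__(self, n_inputs, plateau=0.5, lr=0.05):
        self.n_inputs = n_inputs
        self.plateau = plateau          # minimum membership before a new label is created
        self.lr = lr                    # step size for tuning label centres
        self.in_labels = [[] for _ in range(n_inputs)]   # per-feature label centres
        self.out_labels = []                             # output label centres
        self.rules = {}                 # tuple of input label ids -> output label id

    @staticmethod
    def _membership(x, centre, width=0.2):
        return max(0.0, 1.0 - abs(x - centre) / width)   # triangular membership

    def _winning_label(self, labels, x):
        if labels:
            memberships = [self._membership(x, c) for c in labels]
            best = int(np.argmax(memberships))
            if memberships[best] >= self.plateau:
                return best
        labels.append(x)                # no label fits well enough: create a new one
        return len(labels) - 1

    def observe(self, x, y):
        """Process one (input vector, output value) sample in a single pass."""
        antecedent = tuple(self._winning_label(self.in_labels[i], x[i])
                           for i in range(self.n_inputs))
        consequent = self._winning_label(self.out_labels, y)
        self.rules.setdefault(antecedent, consequent)    # map antecedent to consequent once
        # crude parameter tuning: move the winning centres towards the sample
        for i, lab in enumerate(antecedent):
            self.in_labels[i][lab] += self.lr * (x[i] - self.in_labels[i][lab])
        self.out_labels[consequent] += self.lr * (y - self.out_labels[consequent])

# usage: stream normalised (sensor, control) pairs once through the learner
learner = OnePassRuleLearner(n_inputs=2)
for x, y in [([0.1, 0.9], 0.2), ([0.8, 0.2], 0.7), ([0.12, 0.88], 0.25)]:
    learner.observe(np.array(x), y)
print(len(learner.rules), "rules after one pass")
```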

3 Driving simulator and data collection

A 3D OpenGL-based simulation software, depicted in Figure 3(a), has been developed to collect control data from a human driver, to train the neuro-fuzzy systems to perform various manoeuvres under different road scenarios, and to subsequently assess their driving capabilities (Pasquier et al., 2001). Feedback information used for training includes sensory data, such as the distances from obstacles, road edges and lanes, and control signals consisting of acceleration, brake and gear ratio. Sensory data are obtained from vision inputs, which process images of the road from the front camera, and from proximity sensors (ultrasonic or laser), all of which are simulated (except of course when using the model car). Other sensory data, such as sound, vibrations and forces experienced by the driver during turning and acceleration, are not used at the moment.

A Thrustmaster NASCAR Pro Racing hardware console is provided for the human driver to control the vehicle and generate the training data. As illustrated in Figure 3(b), the console comprises a steering wheel, an acceleration pedal and a brake pedal. The accelerator, indicating the Throttle Position (TPS) control, is used together with the gear signal and brake input to determine the overall velocity of the vehicle. The steering wheel determines the desired steering angle relative to the current bearing.

The diagram of the training data collection process from the human driver is given in Figure 3(c). The driver uses visual feedback from the simulator to decide the next action to be taken (e.g. turn left). The log file records all the actions taken, including the sensory information, at each simulation time interval. An illustration of one vehicle model realised in the simulator, including the list of its input proximity sensors (e.g. SFLS = Side Front Left Sensor, SBRS = Side Back Right Sensor, MFS = Mid Front Sensor, etc.), is given in Figure 3(d).
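A sketch of such a logging loop is given below. The field names, sampling interval and the simulator/console interfaces are hypothetical stand-ins for the actual OpenGL simulator and Thrustmaster console; the point is simply that each time step appends one row of sensory readings and driver controls to the training log.

```python
# A minimal sketch (hypothetical field names and interfaces) of the data-collection loop
# described above: at every simulation time step the driver's control actions and the
# simulated sensory readings are appended to a log file used later as training data.
import csv, time

FIELDS = ["t", "velocity", "mfs", "sfls", "sbrs", "steer", "tps", "brake", "gear"]

def collect(simulator, console, path="drive_log.csv", dt=0.05, steps=1200):
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        for k in range(steps):
            sensors = simulator.read_sensors()      # proximity/vision-derived distances
            controls = console.read_controls()      # steering wheel, TPS pedal, brake, gear
            writer.writerow({"t": k * dt, **sensors, **controls})
            simulator.step(controls, dt)            # advance the vehicle model
            time.sleep(dt)                          # keep roughly real time for the driver

# tiny stand-ins so the sketch runs on its own (assumptions, not the real interfaces)
class _StubSim:
    def read_sensors(self): return {"velocity": 0.0, "mfs": 10.0, "sfls": 3.0, "sbrs": 3.0}
    def step(self, controls, dt): pass
class _StubConsole:
    def read_controls(self): return {"steer": 0.0, "tps": 0.2, "brake": 0.0, "gear": 1}

collect(_StubSim(), _StubConsole(), steps=10, dt=0.0)
```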

Figure 3 The vehicle driving simulator system, (a) software interface, (b) hardware interface, (c) training data collection processes and (d) the vehicle model

4 Simulation results

In our initial work, POPFNN-CRI was employed as the first prototype to realise auto-driving, parallel parking and U-turn manoeuvres, mainly owing to its simple and intuitive learning algorithms. GenSoFNN-Yager was later developed to address the limitations of POPFNN-CRI; it is able to perform online learning and yields a more concise, consistent rule base. We then applied GenSoFNN-Yager to realise new types of manoeuvres, including vision-based auto-driving/overtaking, tactical driving and reverse parking.


4.1 Parking and U-turn manoeuvres

One of the manoeuvres learnt by the GenSoFNN-Yager using the vehicle simulator is a left-in reverse parking, realised on a narrow road track with a small parking slot. Work related to a similar manoeuvre is described in Holve and Protzel (1996). The data processing of the TPS/brake, steer and parking slot detection control outputs is summarised in Figure 4(a)–(c), respectively. Instead of using a single network to deal with the four outputs simultaneously, a 'divide-and-conquer' approach was used to reduce the system complexity. In this method, four GenSoFNN-Yager networks were constructed and trained separately, each with a single output corresponding to one control output. The manoeuvre comprises three stages, the data collection for which takes 1–2 min. In the first stage, the vehicle controlled by the trained networks follows the road track until it finds a vacant parking slot of suitable size. Next, the vehicle moves forward and adjusts its position to maintain a constant distance to the road edges, so as to align itself at a favourable position to park. Finally, the vehicle performs the manoeuvre and stops at the specified site. An example of a successful manoeuvre is shown in Figure 5.

Figure 4 Data processing in the reverse parking system, (a) throttle position or brake subsystem, (b) steer subsystem and (c) parking slot detection subsystem
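The 'divide-and-conquer' set-up above amounts to fitting one single-output model per control signal on the same sensory inputs. The sketch below illustrates this wiring with a trivial least-squares stand-in in place of the GenSoFNN-Yager network and synthetic data in place of the recorded parking logs.

```python
# A minimal sketch of the 'divide-and-conquer' set-up described above: one single-output
# model per control (TPS, brake, steer, slot detection), each trained separately on the
# same sensory inputs. The regressor is a stand-in, not the GenSoFNN-Yager network.
import numpy as np

def train_control_networks(X, targets, make_model):
    """X: (n_samples, n_features) sensory data; targets: dict name -> (n_samples,) control."""
    return {name: make_model().fit(X, y) for name, y in targets.items()}

class LinearStandIn:
    def fit(self, X, y):
        self.w, *_ = np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)
        return self
    def predict(self, X):
        return np.c_[X, np.ones(len(X))] @ self.w

# usage with synthetic data (e.g. five proximity-sensor distances as inputs)
rng = np.random.default_rng(1)
X = rng.random((200, 5))
targets = {"tps": X[:, 0], "brake": 1 - X[:, 1], "steer": X[:, 2] - X[:, 3],
           "slot_detection": (X[:, 4] > 0.5).astype(float)}
controllers = train_control_networks(X, targets, LinearStandIn)
print({k: round(float(m.predict(X[:1])[0]), 2) for k, m in controllers.items()})
```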

A rule firing strength study was subsequently conducted to evaluate the capability of the GenSoFNN-Yager system in maintaining the consistency of its rule base. The rule base structures of the resulting GenSoFNN-Yager networks are presented in Table 1. The third row shows the number of labels/clusters (e.g. Near, Far) per input feature (e.g. rear left distance), while the fourth shows the number of fuzzy rules (e.g. for the slot detection network: 'IF rear left distance is Medium AND mid left distance is Medium AND front left distance is Medium, THEN slot detection is On'). A consistent rule base is best exemplified by a wide spread of the rules being fired over the total number of rules for all possible parking situations. The results of the rule firing strength of the steer, brake, TPS and slot detection networks are given in Figure 6(a)–(d), respectively. A large variation of rule firing strengths was observed in the steer and TPS subsystems, signifying the high complexity of both controls. This is reasonable as many steering and speed adjustments are required within the multiple turns of the manoeuvre, as in the case of human driving. The brake and slot detection subsystems, on the other hand, exhibit a smaller variation than that of the steer and TPS subsystems. This can be justified from the reasoning that they are usually applied for a short period and remain inactive for the rest of the time.

Figure 5 Left-in reverse parking using GenSoFNN-Yager network, (a) parking slot detection, (b) parking adjustments, (c) reverse parking manoeuvre and (d) completed parking
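The consistency analysis above boils down to checking how widely firing is spread over the rule base. A minimal way to tally this from a firing log is sketched below; the firing strengths here are synthetic, whereas in the experiments they come from the trained steer, TPS, brake and slot-detection networks during parking runs.

```python
# A minimal sketch of the rule-base consistency check described above: record which rule
# fires most strongly at each time step, then look at how widely firing is spread over
# the rule base. The firing log is synthetic for illustration.
from collections import Counter
import numpy as np

rng = np.random.default_rng(2)
n_rules, n_steps = 45, 500                           # e.g. the 45-rule steer subsystem of Table 1
firing_strengths = rng.random((n_steps, n_rules))    # per-step firing strength of every rule

winners = firing_strengths.argmax(axis=1)            # dominant rule at each time step
freq = Counter(winners)
coverage = len(freq) / n_rules                       # fraction of rules that ever dominate
print(f"{len(freq)} of {n_rules} rules fired as winners (coverage {coverage:.0%})")
```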

Table 1 Structure of the GenSoFNN-Yager network in reverse parking

Parameter                                Steering system   TPS system    Brake system   Detection system
No. of input labels                      35                83            25             13
No. of output labels                     9                 23            2              3
No. of input labels per input feature    (2,7,6,11,9)      (33,30,20)    (8,10,7)       (5,5,3)
No. of fuzzy IF-THEN rules               45                112           46             13

Figure 6 Rule firing strengths of the GenSoFNN-Yager network in reverse parking, (a) steer control, (b) brake control, (c) throttle position control and (d) parking slot detector

Another kind of manoeuvre realised within our system is parallel parking, which involves parking a vehicle parallel to other parked cars (Paromtchik and Laugier, 1996). Here, the experiment mainly focuses on left-in parking, with a road scenario and data collection duration similar to those of the reverse parking previously discussed. In this case, only three control outputs were considered: steer, TPS and brake. Using the same 'divide-and-conquer' approach, three POPFNN-CRI networks were constructed, corresponding to the three control outputs. The data processing for the steer and TPS/brake controls is given in Figure 7(a) and (b), respectively. An additional input, Angle-To-Track (ATT), is defined as the angle between the road edge and the car, which can be estimated from sensor data.

Figure 7 Data processing in the parallel parking system, (a) steer subsystem and (b) throttle position/brake subsystem


The experimental setting of the parallel parking consists of only a single stage, that is, direct execution of the manoeuvre from a predefined initial vehicle position. As opposed to the reverse parking, no parking slot detection or multiple vehicle adjustment was performed. The results of the four sample runs with different initial vehicle positions are presented in Figure 8(a)–(d). Figure 8(a) shows the resulting trajectory when the vehicle started at a location far away from the parking slot. In Figure 8(b), the starting position of the vehicle is nearer to the slot. Figure 8(c) and (d) use the same setup as Figure 8(b) but with different vehicle orientation, that is, ATT = +5° and –5°, respectively.

Figure 8 Parallel parking using POPFNN-CRI network, (a) initial position is far from the slot (ATT = 0°), (b) initial position is near to the slot (ATT = 0°), (c) initial position is near to the slot (ATT = +5°) and (d) initial position is near to the slot (ATT = –5°)

Similar to the parallel parking, the U-turn (three-point turn) manoeuvre realised in the simulator employs three control outputs: steer, TPS and brake, with a data collection duration of 1–2 min. The data processing for the steer and TPS/brake controls is given in Figure 9(a) and (b), respectively. As in the parallel parking case, three POPFNN-CRI systems were trained separately to control the three outputs. The manoeuvre consists of three stages. From its initial position (on the left side of the road), the car starts to move to the right and then stops when the road bank is close to its front body. Next, the car sets its reverse gear, moves backward while steering to the left and then stops when its rear reaches the road bank. Finally, the car toggles the gear and moves forward again while reducing its steering angle until it is nearly parallel to the road edge.

Figure 9 Data processing in the U-turn system, (a) steer control subsystem and (b) throttle position/brake subsystem

Experiments have been carried out using road tracks of various widths. However, training on two sample tracks, 5 and 8 metres wide and termed the narrow and wide tracks, respectively, proved sufficient, as the system can generalise to other widths. The vehicle trajectories for the two track datasets from time t0 to t3 are depicted in Figure 10(a) and (c), respectively. The vehicle was able to perform the U-turn proper using either an independent rule base, formulated by training the network on each individual track dataset, or the combined rule base, formulated by training on the combination of both datasets. The resulting traces of the steer subsystem for the narrow and wide tracks are given in Figure 10(b) and (d), respectively. It can be seen that from time t0 to t1, the steer was turned fully to the right (negative values). From t1 to t2, the steer was turned in the opposite direction. Finally, from t2 to t3, the steer was turned again to the right and then adjusted the vehicle orientation by slowly reducing its angle.

The rule firing frequencies of the steer control network for the narrow and wide tracks are presented in Figure 11(a)–(b) and (c)–(d), respectively. The rule ID denotes the rule number and the frequency indicates how many times a rule is fired. As shown in Figure 11(a) and (b), seven rules were constructed and fired at different frequencies for both the independent and combined rule bases. The only difference between them lies in the distribution of the firing frequencies. Similarly, eight rules were formulated regardless of the rule base (see Figure 11(c)–(d)). Comparing Figure 11(a) and (c) (or Figure 11(b) and (d)), it can be observed that the rules crafted are very similar (e.g. rules 6, 28, 35 and 42). This is reasonable since, although the input data (e.g. distances to the road edges) may vary for different tracks, the variation is relatively small. In such a case, the likelihood is that a similar set of rules would be formulated, as shown in Figure 11(a)–(d).


Figure 10 U-turn manoeuvre using POPFNN-CRI network, (a) vehicle trajectory for the narrow track, (b) steer control trace for the narrow track, (c) vehicle trajectory for the wide track and (d) steer control trace for the wide track

Figure 11 Rule firing frequencies of the POPFNN-CRI network in U-turn manoeuvre, (a) independent rule base for the narrow track, (b) combined rule base for the narrow track, (c) independent rule base for the wide track and (d) combined rule base for the wide track

4.2 Driving and overtaking

Automated driving manoeuvres involve a sequence of actions that lead the vehicle along a given test route without crashing into the road banks (Maeda and Murakami, 1983). Our initial work on the auto-driving manoeuvre utilised the POPFNN-CRI (Ang, 2000) and ANFIS (Jang, 1993) networks as the driving agents. The data processing involves three outputs: steer, TPS and brake, as given in Figure 12(a)–(c), respectively, and the data collection takes on average 1–2 roundtrips with a total duration of 3–5 min. Accordingly, three POPFNN-CRI and three ANFIS networks were separately trained. Two sample tracks used in the experiment are depicted in Figure 13. The figure also shows a comparison among the (anti-clockwise) trajectories produced by the human driver, the POPFNN-CRI and the ANFIS network. Based on the total deviation from the human driver's trajectory, it was concluded that the POPFNN-CRI performed better than the ANFIS model. The results also showed that the POPFNN-CRI system was able to produce a smoother vehicle trajectory than that of the human driver.

Figure 12 Data processing in the auto-driving system, (a) steer subsystem and (b) throttle position/brake subsystem

Figure 13 Auto-driving using POPFNN-CRI and ANFIS networks, (a) Track I and (b) Track II
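The comparison metric mentioned above, the total deviation of an automated trajectory from the human driver's trajectory, can be computed as sketched below. The arc-length resampling used to align the two paths before taking point-wise distances is an assumption; the paper does not specify how the deviation was measured.

```python
# A minimal sketch (assumed resampling scheme) of a total-deviation measure between an
# automated trajectory and the human driver's trajectory over the same track.
import numpy as np

def resample(path, n=200):
    path = np.asarray(path, dtype=float)
    s = np.r_[0.0, np.cumsum(np.linalg.norm(np.diff(path, axis=0), axis=1))]
    u = np.linspace(0.0, s[-1], n)
    return np.c_[np.interp(u, s, path[:, 0]), np.interp(u, s, path[:, 1])]

def total_deviation(human_path, auto_path):
    h, a = resample(human_path), resample(auto_path)
    return float(np.linalg.norm(h - a, axis=1).sum())   # sum of point-wise distances

# usage with synthetic paths: the automated trace is offset 0.1 m from the human's
human = [(t, np.sin(t / 5.0)) for t in range(60)]
auto = [(t, np.sin(t / 5.0) + 0.1) for t in range(60)]
print(round(total_deviation(human, auto), 2))
```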

Recent development of the auto-driving system at C2i incorporates a vision system to detect the road lanes and subsequently uses the extracted image features as inputs to train the GenSoFNN-Yager networks. The features extracted from the image include the lateral offset and the look-ahead curvature, which are defined, respectively, as the distance of the vehicle centre from the approximated lane centre and the rate of change in the angle between a road curve and a tangent to the road curve (see Figure 14). The data processing in the steer and TPS/brake subsystems is shown in Figure 15(a) and (b), respectively.
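A sketch of how these two image features could be computed from already-extracted lane-boundary points is given below. The vehicle-frame coordinates, the quadratic lane-centre fit and the standard curvature estimate are assumptions standing in for the actual vision module.

```python
# A minimal sketch (assumed inputs: lane-boundary points already extracted from the front
# camera image, in vehicle coordinates with x ahead and y to the left) of the two features
# described above: lateral offset from the lane centre and a look-ahead curvature estimate.
import numpy as np

def lane_features(left_pts, right_pts, look_ahead=15.0):
    """left_pts, right_pts: (N, 2) arrays of (x, y) lane-boundary samples."""
    centre = (left_pts + right_pts) / 2.0                 # approximated lane centre line
    lateral_offset = -centre[0, 1]                        # vehicle centre sits at y = 0
    # fit the centre line with a quadratic y(x) and take its curvature at the look-ahead point
    a, b, _ = np.polyfit(centre[:, 0], centre[:, 1], 2)
    dy, d2y = 2 * a * look_ahead + b, 2 * a
    curvature = d2y / (1 + dy ** 2) ** 1.5
    return lateral_offset, curvature

# usage with a synthetic, gently curving lane 3.6 m wide
x = np.linspace(1, 30, 15)
left = np.c_[x, 1.8 + 0.002 * x ** 2]
right = np.c_[x, -1.8 + 0.002 * x ** 2]
print(lane_features(left, right))
```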


Figure 14 Overview of vision-based auto-driving system

Figure 15 Data processing in the vision-based auto-driving system, (a) steer control subsystem and (b) throttle position/brake subsystem

The study of the vision-based auto-driving system focused on two types of manoeuvres: lane-changing and lane-following. The trajectories for the lane-changing manoeuvre at different velocities are presented in Figure 16. Figure 16(a) and (b) describe the recall (training) results for lane-changing to the left and to the right, respectively, while Figure 16(c) and (d) describe the generalisation (testing) results for lane-changing to the left and right, respectively. As shown in Figure 16(a) and (b), the trained networks were able to perform the left and right lane changes properly, and the trajectories of the auto-driver closely follow the human's. This demonstrates that the steer control network is able to recall and mimic the human driver's behaviours. Figure 16(c) and (d) also indicate smooth trajectories for both left and right lane-changing at speeds of 50–80 km/h. At 100 km/h, however, the vehicle failed to realign to the lane centre and collided with the road bank. As in the real case, this incident may occur in a vehicle operating at high speed on a narrow road. It also shows there is a limit to the generalisation ability of the system.


Figure 16 Lane-changing using GenSoFNN-Yager network, (a) recall result for left-lane changing, (b) recall result for right-lane changing, (c) generalisation result for left-lane changing and (d) generalisation result for right-lane changing

The traces of the steer, TPS and brake control outputs for the lane-following manoeuvre are shown in Figure 17(a), (c) and (e), respectively. The squared errors of the outputs were also computed and are shown in Figure 17(b), (d) and (f), respectively. It can be observed from Figure 17(a) and (b) that the output of the steer subsystem was able to follow closely that produced by the human driver, and maintain the vehicle position within the same lane while driving around left and right bends. The brake and TPS subsystems operate in a cooperative manner to control the vehicle velocity. The TPS subsystem, as indicated in Figure 17(c) and (d), reduced the TPS level at the left and right road curves, in a manner similar to the human driver. Similarly, the plots in Figure 17(e) and (f) show an increased brake level especially when a left or a right bend is encountered. Again, this observation can be justified by comparing the network output with the human actions in the real-world case; the brakes are rarely applied except when encountering road bends.

The rule firing strengths of the steer, TPS and brake control networks for the lane-following and lane-changing manoeuvres are shown in Figure 18. The description of the trained GenSoFNN-Yager network structures is presented in Table 2. One essential rule (Rule 1) was identified for the steer subsystem, as per Figure 18(a). This rule reflects human driving behaviour: as in the real-world situation, the steering wheel is turned only when we intend to change our heading direction and is kept straight for most of the time. In the case of the TPS subsystem, four major rules (Rules 5, 6, 7 and 8) were identified (see Figure 18(b)). During the simulation, the TPS was kept constant while driving on a straight route and reduced only when a road bend was identified or when the vehicle stopped. As a result, the rules that maintain a constant TPS level were fired more often than those that decrease the TPS level. A large variation in the distribution of rule firing strength was observed in the brake subsystem. A total of 65 rules were crafted for the brake subsystem (see Figure 18(c)), indicating the complexity of this subsystem. The high firing frequency of the rules in the brake subsystem can be explained from the observation that the vehicle had to adjust its brake frequently as it met the road bends.

Figure 17 The trace of control outputs in the lane-following manoeuvre, (a) steer control output, (b) steer control errors, (c) throttle position control output, (d) throttle position control errors, (e) brake control output and (f) brake control errors


Figure 18 Rule firing strengths of the GenSoFNN-Yager network in lane-changing manoeuvre, (a) steer control, (b) throttle position control and (c) brake control

Table 2 Structure of the GenSoFNN-Yager network for combined lane-following and changing

Parameter                                Steering system   TPS system    Brake system
Input labels                             9                 21            24
Output labels                            4                 6             7
No. of input labels per input feature    (6,3)             (7,7,4,3)     (9,3,7,5)
Rules                                    8                 25            65

4.3 Tactical driving

The latest developments of our vehicle autonomous driving system involve the realisation of a tactical driving module, to develop an intelligent vehicle that can drive competently in dynamic traffic environments. Tactical driving involves high-level decision processes in response to the changing surroundings, such as deciding which manoeuvre to perform in a specific situation, given partial information about the current traffic configuration (Michon, 1985). Figure 19 shows the data processing in a hybrid tactical driving system comprising two subsystems, each of which was realised using a GenSoFNN-Yager network. The first subsystem, termed the overtaking decision subsystem, decides whether the vehicle should change lane to perform overtaking. The resulting overtaking decision is then used as input to the second subsystem, termed the lane-changing subsystem, which instructs the vehicle which lane to switch to. As tactical driving involves a combination of different behaviours, the data collection duration depends on the number and duration of each (sub)manoeuvre. Two additional inputs, left-lane clear and right-lane clear, were used to indicate the occupancy state of the adjacent left lane and right lane, respectively.
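The cascade of the two subsystems can be pictured with the stand-in below: an overtaking-decision stage feeds a lane-changing stage. In the actual system both stages are trained GenSoFNN-Yager networks; the crisp thresholds and signal ranges used here (e.g. treating a lane-clear value of 2 or more as safe) are illustrative assumptions only.

```python
# A minimal sketch of the two-stage tactical pipeline described above: an overtaking-decision
# module feeds a lane-changing module. Both would be trained networks in the real system;
# the crisp rules below are illustrative stand-ins.
def overtaking_decision(time_blocked, min_front_distance, desired_gap=20.0, patience=5.0):
    """1 = try to overtake, 0 = stay in lane (stand-in for the first subsystem)."""
    return 1 if (time_blocked > patience and min_front_distance < desired_gap) else 0

def lane_change(overtake, left_clear, right_clear, desired_lane_offset):
    """Returns (left_switch, right_switch); stand-in for the second subsystem."""
    if overtake and right_clear >= 2:                  # right lane reported safe enough
        return 0, 1
    if desired_lane_offset == -1 and left_clear >= 2:  # drift back towards the desired lane
        return 1, 0
    return 0, 0

# usage: blocked for 8 s behind a car 12 m ahead, right lane reported very clear (3)
decision = overtaking_decision(time_blocked=8.0, min_front_distance=12.0)
print(lane_change(decision, left_clear=0, right_clear=3, desired_lane_offset=0))
```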


Figure 19 Data processing in the hybrid tactical driving system, (a) overtaking decision tactical subsystem and (b) lane-changing tactical subsystem

Various test scenarios have been devised to evaluate the performance of the tactical driving system. One such scenario is illustrated in Figure 20. The intelligent vehicle (dark grey) was initially placed in the middle lane, with another vehicle (#3) ahead, as shown in Figure 20(a). The adjacent left lane was occupied by two vehicles (#1 and #2). When the intelligent vehicle is driving slower than its desired speed and has been waiting behind vehicle #3 for a period of time, it may decide to switch to the right lane to overtake. Once the manoeuvre is completed, it switches back to the middle lane. Figure 20(b)–(d) show the resulting trajectory sequence of the intelligent vehicle. In Figure 20(b), the vehicle decided to switch lanes to overtake vehicle #3. The vehicle then performed the overtaking and decided to change back to the desired lane (middle lane) afterwards, as shown in Figure 20(c). Finally, Figure 20(d) shows the orientation of the intelligent vehicle upon completion of the manoeuvre.

The simulation results of the lane-changing tactical subsystem are provided in Figure 22. Figure 22(a) shows that when the overtaking decision was initiated (overtaking decision is 1), the lane-changing tactical subsystem would move the vehicle to the right-lane (right-lane switch is 1) only when the right-lane clear input indicated that it was safe to go to the right-lane. A value of 3 in the right-lane clear input implies that it is very safe to switch to the right-lane, whereas a 0 implies otherwise. This was subsequently validated, as shown in Figure 22(b). The desired lane output value of –1 indicates that the vehicle should change to the adjacent left-lane. However, it did not do so since the left-lane clear input did not indicate that the left-lane is free. Conversely, the lane-changing tactical subsystem would output a one for the left-lane switch only if the left-lane clear input indicates that the left-lane is vacant, as illustrated in Figure 21. Observing Figure 22(c), when the lane-changing tactical subsystem initiated a switch to the right-lane (i.e. right-lane switch is 1), the intelligent vehicle changed its operation from lane-following mode to lane-changing mode. When this happened, the fluctuation in the lateral-offset was reduced to a flat 1.

Figure 21 The trace of inputs and control outputs in overtaking decision tactical subsystem, (a) time being blocked, (b) minimum front sensor distance, (c) overtaking decision and (d) lane position

Figure 22 The trace of input and control outputs in lane-changing tactical subsystem, (a) overtaking decision, right-lane switch and right-lane clear, (b) desired lane, left-lane switch and left-lane clear and (c) lateral offset and right-lane switch


4.4 Experiments using a hardware model car

Following up on the success of our driving skill learning approach in simulation, further validation tests were carried out using a hardware realisation of the vehicle model. A microprocessor-controlled model car was built for this purpose. The model car does not host the neuro-fuzzy system or other computational modules; instead, an on-board microcontroller (Handyboard) serves as an execution module and transmission link between the (formerly remote-controlled) model car and a personal computer. The latter hosts the neuro-fuzzy system and performs all the functions required in the automated control of the vehicle. The computer receives all the sensory data from the model car, computes the necessary control signals and outputs these back to the model car via the same transmission link to actuate the rear and front motors, resulting in the desired motion of the vehicle. Figure 23 illustrates the organisation of the interaction between the software simulator and the hardware model car, as well as the circuitry of the model car.

An example of a reverse parking sequence using the model car is depicted in Figure 24. When the manoeuvre is initiated, the on-board microcontroller retrieves data from all the infrared or ultrasonic sensors. It then puts the sensor values into a vector and sends it to the host computer via the (half-duplex) transmission link. The computer preprocesses the received data and presents the appropriate information as input to the trained neuro-fuzzy system. Next, the neuro-fuzzy system's rule base processes the input data and outputs two control signals for adjusting the front and rear motors. These control signals are then transmitted back to the microcontroller via the same half-duplex link. Finally, the microcontroller actuates the front and rear motors based on the received control signals. Some safety logic is also embedded in the microcontroller. The results using the model car are more or less similar to those obtained using the software model. Owing to car control issues (e.g. low servo resolution, no brake control, etc.), however, the performance of the hardware system may not be as good as that of the software system.

Figure 23 The microprocessor-controlled model car, (a) architecture of the software-hardware interface, (b) external circuit and (c) internal circuit
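The sensing-computing-actuation loop just described could be organised as in the sketch below. The serial message layout, the number of sensor readings per frame and the 8-bit motor commands are assumptions; only the overall round trip (sensor vector up to the PC, two control signals back down to the Handyboard) follows the description above.

```python
# A minimal sketch (assumed serial protocol and message layout) of the host-side control
# loop described above: read a vector of sensor readings from the microcontroller, let the
# trained neuro-fuzzy controller map it to two control signals, and write them back over
# the same half-duplex link to drive the front (steering) and rear (drive) motors.
import struct

N_SENSORS = 6          # number of infrared/ultrasonic readings per frame (assumption)

def control_loop(link, controller, steps=1000):
    """link: object with read(n)/write(bytes); controller: trained model with predict()."""
    for _ in range(steps):
        frame = link.read(2 * N_SENSORS)                    # one 16-bit value per sensor
        sensors = struct.unpack("<%dH" % N_SENSORS, frame)
        steer, drive = controller.predict(sensors)          # two control signals
        # clamp to the 8-bit range expected by the motor driver (assumption) and send back
        packet = struct.pack("<2B", int(max(0, min(255, steer))),
                             int(max(0, min(255, drive))))
        link.write(packet)
```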

Figure 24 The hardware model car implementation of the reverse parking manoeuvre, (a) initial position, (b) reverse parking (backward adjustment), (c) reverse parking (forward adjustment) and (d) completed parking

5 Conclusion and future work

The success of the developed neuro-fuzzy learning memory systems in realising various vehicle manoeuvres is certainly very promising. Many simulations have been conducted to investigate the performance of these systems, the results of which exemplify their robustness and recall/generalisation capabilities. However, several limitations remain. Chiefly, operational success largely depends on the choice of parameters and data features used in the training process. Determining the optimal system parameters is still a trial-and-error process (although some optimisation techniques are being developed), while the relevance of features to the given problem may not be known a priori. In order to resolve these issues, the incorporation of higher-level (meta-cognitive) mechanisms complementing the neuro-fuzzy systems is presently being investigated, leading to the design of a novel brain-inspired cognitive architecture that includes capabilities such as attention focus, goal setting, planning, parameter tuning, etc. Employing this architecture will allow a more comprehensive modelling of human driving skill acquisition.

As a complementary step in this endeavour, an Electroencephalography (EEG) study of the human driver's brain is being conducted to measure brainwaves, that is, the electrical activity of the brain, while driving, by recording from electrodes placed on the scalp or cerebral cortex area (Lutz et al., 2004). A set of tactile detectors for the steering, brake and TPS control outputs is used together with the EEG equipment to capture the human user's behaviours. A neuro-fuzzy system is subsequently used to establish relevant associations between the observed brain activities and the control actions. Another experimental study has been conducted to model the human driver's attention focus using a camera-based computer vision module to track the eye's point of regard.


The overall aim is to build a user-modelling tool capable of capturing the basic aspects of attention, decision and control underlying the meta-cognitive mechanisms in the human driving task.

References

Anderson, J.R. (1981) Cognitive Skills and Their Acquisition, Lawrence Erlbaum Associates.

Ang, K.K. (2000) 'POPFNN-CRI(S): a fuzzy neural network based on the compositional rule of inference', M.Phil. dissertation, School of Computer Engineering, Nanyang Technological University, Singapore.

Ang, K.K., Quek, C. and Pasquier, M. (2003) 'POPFNN-CRI(S): pseudo outer product based fuzzy neural network using the compositional rule of inference and singleton fuzzifier', IEEE Transactions on Systems, Man and Cybernetics, Part B, Vol. 33, No. 6, pp.838–849.

Aparicio, F., Páez, J., Moreno, F. and Jiménez, F. (2005) 'Discussion of a new adaptive speed control system incorporating the geometric characteristics of the roadway', International Journal of Vehicle Autonomous Systems, Vol. 3, No. 1, pp.47–64.

Holve, R. and Protzel, P. (1996) 'Reverse parking of a model car with fuzzy control', Proceedings of the 4th European Congress on Intelligent Techniques and Soft Computing, Aachen, Germany, pp.2171–2175.

Jang, J-S.R. (1993) 'ANFIS: adaptive-network-based fuzzy inference system', IEEE Transactions on Systems, Man and Cybernetics, Vol. 23, No. 3, pp.665–685.

Kandel, E.R., Schwartz, J.H. and Jessel, T.M. (2000) Principles of Neural Science, 4th edition, New York: McGraw-Hill, Health Professions Division.

Keller, J.M., Yager, R.R. and Tahani, H. (1992) 'Neural network implementation of fuzzy logic', Fuzzy Sets and Systems, Vol. 45, No. 1, pp.1–12.

Lin, C.T. and Lee, C.S.G. (1996) Neural Fuzzy Systems: A Neuro-Fuzzy Synergism to Intelligent Systems, Upper Saddle River, NJ: Prentice Hall.

Lutz, A., Greischar, L.L., Rawlings, N.B., Ricard, M. and Davidson, R.J. (2004) 'Long-term meditators self-induce high-amplitude gamma synchrony during mental practice', Proceedings of the National Academy of Sciences, Vol. 101, No. 46, pp.16369–16373.

Maeda, M. and Murakami, S. (1983) 'Vehicle speed control using fuzzy logic controller', Proceedings of the 9th System Symposium, pp.7–11.

Michon, J. (1985) 'A critical view of driver behavior models: what do we know, what should we do?', in L. Evans and R. Schwing (Eds.), Human Behavior and Traffic Safety, Plenum.

Oentaryo, R.J. (2005) Automated Driving Based on Self-Organizing GenSoYager Neuro-Fuzzy System, Technical Report No. C2i-TR-002/05, Singapore: Centre for Computational Intelligence, School of Computer Engineering, Nanyang Technological University.

Oentaryo, R.J. and Pasquier, M. (2004) 'Self-trained automated parking system', Proceedings of the 8th IEEE International Conference on Control, Automation, Robotics and Vision, Vol. 2, pp.1005–1010.

Oentaryo, R.J. and Pasquier, M. (2006) 'GenSoFNN-Yager: a novel hippocampus-like learning memory system realizing Yager inference', Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN 2006), Vancouver, BC, Canada, pp.1684–1691.

Paromtchik, I.E. and Laugier, C. (1996) 'Autonomous parallel parking of a nonholonomic vehicle', Proceedings of the IEEE Intelligent Vehicles Symposium, pp.13–18.

Pasquier, M., Quek, C. and Toh, M. (2001) 'Fuzzylot: a novel self-organising fuzzy-neural rule-based pilot system for automated vehicles', Neural Networks, Vol. 14, No. 8, pp.1099–1112.

Quek, C., Pasquier, M. and Lim, B. (2006) 'POP-TRAFFIC: a novel fuzzy-neural approach to road traffic analysis and prediction', IEEE Transactions on Intelligent Transportation Systems, Vol. 7, No. 2.


Quek, C. and Singh, A. (2005) 'POP-Yager: a novel self-organizing fuzzy neural network based on the Yager inference', Expert Systems with Applications, Vol. 29, No. 1, pp.229–242.

Quek, C., Tan, K.B. and Sagar, V.K. (2001) 'Pseudo-outer product based fuzzy neural network fingerprint verification system', Neural Networks, Vol. 14, No. 3, pp.305–323.

Quek, C. and Zhou, R.W. (1999) 'POPFNN-AAR(S): a pseudo outer-product based fuzzy neural network', IEEE Transactions on Systems, Man and Cybernetics, Part B, Vol. 29, No. 6, pp.859–870.

Quek, C. and Zhou, R.W. (2001) 'The POP learning algorithms: reducing work in identifying fuzzy rules', Neural Networks, Vol. 14, No. 10, pp.1431–1445.

Quek, C. and Zhou, R.W. (2002) 'Antiforgery: a novel pseudo-outer product based fuzzy neural network driven signature verification system', Pattern Recognition Letters, Vol. 23, No. 14, pp.1795–1816.

Rumelhart, D.E., Hinton, G.E. and Williams, R.J. (1986) 'Learning representations by back-propagating errors', Nature, Vol. 323, No. 6088, pp.533–536.

Tomporowski, P.D. (2003) The Psychology of Skill: A Life-Span Approach, Praeger.

Tung, W.L. and Quek, C. (2002) 'GenSoFNN: a generic self-organizing fuzzy neural network', IEEE Transactions on Neural Networks, Vol. 13, No. 5, pp.1075–1086.

Tung, W.L. and Quek, C. (2005) 'GenSo-FDSS: a neural-fuzzy decision support system for pediatric ALL cancer subtype identification using gene expression data', Artificial Intelligence in Medicine, Vol. 33, No. 1, pp.61–88.

Tung, W.L., Quek, C. and Cheng, P. (2004) 'GenSo-EWS: a novel neural-fuzzy based early warning system for predicting bank failures', Neural Networks, Vol. 17, No. 4, pp.567–587.

Wang, J-S., Knipling, R.R. and Goodman, M.J. (1996) 'The role of driver inattention in crashes: new statistics from the 1995 crashworthiness data system', Proceedings of the Association for the Advancement of Automotive Medicine, Vancouver, BC, pp.1–16.
