
Imitation Learning for Vision-based Lane Keeping Assistance
ITSC'17 Workshop on Deep Learning for Autonomous Driving
October 16-19, 2017 - Yokohama, Japan

Christopher Innocenti∗, Henrik Lindén∗, Ghazaleh Panahandeh∗, Lennart Svensson†, Nasser Mohammadiha∗†
Zenuity AB∗, Chalmers University of Technology†, Gothenburg, Sweden

Outline

1. Introduction
2. Proposed Methods
3. Experimental Results
4. Summary

Introduction

Problem Statement and Motivation

1. How to predict lateral control signals from camera images?
   • Modular system approach
     • Explicit control of data flow through sub-modules
   • Holistic system approach
     • Intermediate levels of processing abstracted away
     • Does not require explicit feature engineering
     • Does not require object/semantics-annotated data for learning

2. How to evaluate control signals and performance in a good way?
   • Vehicle tests
     • Realistic
   • Closed-loop simulation
     • Fast, inexpensive, safe


Background and Contributions

• End to end learning for self-driving cars [Bojarski et al., 2016b]
  • PilotNet, 9-layer CNN with ∼250k trainable parameters
  • Images captured from 3 front cameras for training
  • Lane following with 98% level of autonomy

• Our contributions
  • Experimentally show that data augmentation might not be necessary for learning LKA functionality (using a model based on PilotNet)
  • Propose 2 metrics for numeric evaluation based on safety (positioning) and comfort (trajectory smoothness)

Proposed Methods

Approach

• Model a driving policy π_θ as a convolutional neural network
• Find θ∗ by supervised learning:

\[
\theta^{*} = \arg\min_{\theta} \; \mathbb{E}_{(s,a) \sim d_{\pi^{*}}} \left[ l\big(a, \pi_{\theta}(s)\big) \right]
\]

• Learn a mapping from states to actions using state–action pairs sampled from regular highway driving, without data augmentation

[Figure: training loop. The state s is fed to the policy π_θ, which outputs the action π_θ(s); the loss l(a, π_θ(s)) against the expert action a drives the policy adjustment.]
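The supervised objective on this slide can be illustrated with a small sketch. It assumes a squared-error loss for l and swaps the CNN for a linear policy so that it runs without a deep learning framework; the synthetic data and all names (`policy`, `fit`, `true_weights`) are hypothetical and not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for expert demonstrations (ASSUMPTION): states s are
# feature vectors and expert actions a come from a hidden linear expert.
# In the slides, states are camera images and the policy is a CNN.
n_samples, n_features = 1000, 8
states = rng.normal(size=(n_samples, n_features))
true_weights = rng.normal(size=n_features)
expert_actions = states @ true_weights


def policy(theta, s):
    """Linear policy pi_theta(s), a simplification of the CNN policy."""
    return s @ theta


def loss(theta, s, a):
    """Empirical estimate of E[l(a, pi_theta(s))] with squared error as l."""
    return np.mean((a - policy(theta, s)) ** 2)


def fit(s, a, lr=0.05, epochs=200):
    """Plain gradient descent on the mean squared imitation loss."""
    theta = np.zeros(s.shape[1])
    for _ in range(epochs):
        residual = policy(theta, s) - a
        grad = 2.0 * s.T @ residual / len(a)
        theta -= lr * grad
    return theta


theta_hat = fit(states, expert_actions)
initial_loss = loss(np.zeros(n_features), states, expert_actions)
final_loss = loss(theta_hat, states, expert_actions)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.2e}")
```

The loop mirrors the diagram: the policy acts on a state, the loss compares its action with the expert's, and the gradient step is the policy adjustment.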


Data

• 640×480 images from Volvo Cars test expeditions
• Dataset selection of approximately 2.5M images sampled at 20Hz


Actions

• Actions (and states) pruned to be more "uniformly" distributed
• Resulting dataset of approximately 1.4M state–action pairs (27h)
• Steering wheel angle → curvature

[Figure: two histograms over steering wheel angle, SWA [rad], each spanning −1 to 1.]
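The slide does not say how the pruning was done. One common way to make a peaked distribution more uniform is to cap the number of samples kept per histogram bin; the sketch below shows that idea only, as an assumption. The function name `prune_to_uniform`, the bin count, the default cap, and the synthetic steering angles are all hypothetical.

```python
import numpy as np


def prune_to_uniform(values, n_bins=20, cap=None, seed=0):
    """Return sorted indices of samples kept after capping each histogram bin.

    values : 1-D array of the quantity to balance (e.g. steering wheel angle).
    cap    : maximum samples kept per bin; defaults to the median count
             over non-empty bins (an arbitrary illustrative choice).
    """
    rng = np.random.default_rng(seed)
    edges = np.linspace(values.min(), values.max(), n_bins + 1)
    # Digitizing against the interior edges yields bin ids 0 .. n_bins - 1.
    bin_ids = np.digitize(values, edges[1:-1])
    counts = np.bincount(bin_ids, minlength=n_bins)
    if cap is None:
        cap = int(np.median(counts[counts > 0]))
    keep = []
    for b in range(n_bins):
        idx = np.flatnonzero(bin_ids == b)
        if len(idx) > cap:
            # Over-represented bin: keep a random subset of size cap.
            idx = rng.choice(idx, size=cap, replace=False)
        keep.append(idx)
    return np.sort(np.concatenate(keep))


# Synthetic steering angles peaked around zero, as in highway driving.
rng = np.random.default_rng(1)
swa = np.clip(rng.normal(0.0, 0.2, size=50_000), -1.0, 1.0)
kept = prune_to_uniform(swa, n_bins=20)
print(f"kept {len(kept)} of {len(swa)} samples")
```

Capping removes mostly near-zero angles, so straight driving no longer dominates the training set while rare large angles are kept in full.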


CNN Architecture

• Based on PilotNet [Bojarski et al., 2016b]
• 264k trainable parameters
• Trained for 9 epochs of the dataset
• Performance?

[Figure: network diagram, input to output]

Stage    Output size
IMAGE    (camera image)
PRE.     68 × 182 × 1
C1       32 × 89 × 24
C2       14 × 43 × 36
C3       5 × 20 × 48
C4       3 × 18 × 64
C5       1 × 16 × 76
FLAT.    1216
FC1      100
FC2      50
FC3      10
FC4      1 (output: 1/r)
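The layer sizes in the diagram can be checked against the stated 264k trainable parameters. The sketch below assumes the PilotNet kernel layout (5×5 kernels with stride 2 for C1 to C3, 3×3 kernels with stride 1 for C4 and C5, no padding, one bias per output unit); the slide gives only the output sizes, so the kernels and strides are assumptions.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus biases of a square-kernel convolution layer."""
    return kernel * kernel * c_in * c_out + c_out


def conv_out(size, kernel, stride):
    """Output length of an unpadded convolution along one dimension."""
    return (size - kernel) // stride + 1


# (kernel, stride, out_channels); kernels and strides ASSUMED from PilotNet,
# channel counts taken from the slide's diagram.
conv_layers = [(5, 2, 24), (5, 2, 36), (5, 2, 48), (3, 1, 64), (3, 1, 76)]
fc_layers = [100, 50, 10, 1]  # FC1 .. FC4 output sizes from the slide

h, w, c = 68, 182, 1  # preprocessed input size from the slide
total = 0
shapes = []
for kernel, stride, c_out in conv_layers:
    total += conv_params(kernel, c, c_out)
    h, w, c = conv_out(h, kernel, stride), conv_out(w, kernel, stride), c_out
    shapes.append((h, w, c))

n_in = h * w * c  # flattened size, 1216
for n_out in fc_layers:
    total += n_in * n_out + n_out
    n_in = n_out

print(shapes)  # matches the C1 .. C5 sizes on the slide
print(total)   # 264343
```

Under these assumptions the shapes reproduce every size in the diagram and the total comes to 264,343, which agrees with the 264k figure.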


Positioning Penalty

• Penalty width w and shape factor β are road and situation dependent

\[
e_p(d; w, \beta) =
\begin{cases}
1 & d < 0 \\
(\beta w)^{d/w} - \beta d & 0 \le d \le w \\
0 & d > w
\end{cases}
\]

[Figure: positioning penalty (0 to 1) versus distance to lane marking d (0 to w) for β = 1, 0.1 and 0.01; lane diagram of the vehicle with labels wL, wL0, wl, wr, wv, dl and dr.]
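A minimal sketch of the positioning penalty follows. It assumes the middle branch reads (βw)^(d/w) − βd on 0 ≤ d ≤ w, which is one reading of the slide's formula; that form equals 1 at d = 0 and 0 at d = w, so it joins the two constant branches continuously. The function name and the sample distances are hypothetical.

```python
import numpy as np


def positioning_penalty(d, w, beta):
    """Piecewise positioning penalty e_p(d; w, beta).

    d    : distance to the lane marking in metres (negative when crossed).
    w    : penalty width in metres.
    beta : shape factor; smaller values make the penalty decay faster.
    ASSUMPTION: middle branch is (beta * w) ** (d / w) - beta * d.
    """
    d = np.asarray(d, dtype=float)
    d_mid = np.clip(d, 0.0, w)  # keeps the power well defined outside [0, w]
    middle = (beta * w) ** (d_mid / w) - beta * d_mid
    return np.where(d < 0, 1.0, np.where(d > w, 0.0, middle))


# Hypothetical distances to the left lane marking along a short drive.
d_left = np.array([-0.05, 0.0, 0.1, 0.2, 0.4, 0.9])
penalties = positioning_penalty(d_left, w=0.4, beta=0.01)
average_penalty = penalties.mean()  # corresponds to an "Avg e_p" style metric
print(penalties, average_penalty)
```

Averaging the penalty over a drive gives a single positioning score per side, the form in which the results table reports it.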


Discomfort Penalty

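• Based on penalty from a vehicle motion model [Sörstedt et al., 2011]
• Comfort level g ≈ 1.8 m/s² (or 1.8 m/s³) [Felipe and Navin, 1998, Xu et al., 2015]

\[
e_d(y; g) =
\begin{cases}
\ldots & \text{if } y < g \\
\ldots & \text{if } y \ge g
\end{cases}
\]

[Equation note: the two branch expressions of e_d did not survive extraction; only the conditions on the comfort level g are legible.]

[Figure: level of discomfort (0 to 10) versus lateral acceleration (m/s²) or jerk (m/s³) over 0 to 3, with the comfortable and uncomfortable regions marked.]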

Experimental Results

Reality Gap

Example: Visual backpropagation [Bojarski et al., 2016a]
• Pixel regions that contribute most to control decision/prediction

[Figure panels: Volvo (real), CarMaker (synthetic), Unity (synthetic)]


Discomfort

Example: Lateral acceleration and jerk

[Figure: lateral acceleration (m/s²) and jerk (m/s³) versus road distance (2,900 m to 3,400 m) for the learnt policy π_θ and the expert π∗.]
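The slides do not state how lateral acceleration and jerk were computed. A common kinematic approximation, used here purely as an assumption, takes lateral acceleration as speed squared times path curvature and jerk as its time derivative. The function names and the synthetic speed and curvature profiles are hypothetical.

```python
import numpy as np


def lateral_acceleration(speed, curvature):
    """a_lat = v^2 * kappa in m/s^2 (speed in m/s, curvature in 1/m)."""
    return np.asarray(speed) ** 2 * np.asarray(curvature)


def lateral_jerk(a_lat, dt):
    """Time derivative of lateral acceleration by finite differences, m/s^3."""
    return np.gradient(a_lat, dt)


dt = 1.0 / 20.0                      # 20 Hz, the dataset's sampling rate
t = np.arange(0.0, 10.0, dt)
speed = np.full_like(t, 25.0)        # constant 25 m/s, illustrative only
curvature = 0.002 * np.sin(0.5 * t)  # gentle S-bend, illustrative only

a_lat = lateral_acceleration(speed, curvature)
jerk = lateral_jerk(a_lat, dt)

g = 1.8  # comfort level from the slides
print(f"max |a_lat| = {np.abs(a_lat).max():.2f} m/s^2 (comfort level {g})")
print(f"max |jerk|  = {np.abs(jerk).max():.2f} m/s^3 (comfort level {g})")
```

Finite differences amplify noise, so jerk from a real drive is usually much rougher than acceleration, which fits the slides' observation that instantaneous decisions give noisy behaviour.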


Performance

Example: 34km road geometry, β = 0.01, wl = wr = 0.4m, g = 1.8
• Positioning: badly positioned only ∼2% of the time
• Acceleration: ∼9% more uncomfortable, but still comfortable
• Jerk: average penalty ∼2.8 times the expert's (ratio 2.783), but still comfortable

Penalty metric          πθ        π∗        πθ/π∗
Avg ep(dl; wl, β)       0.006     0         –
Avg ep(dr; wr, β)       0.013     0         –
Avg ed(yacc; g)         0.152     0.140     1.086
Max ed(yacc; g)         13.283    11.890    1.117
Avg ed(yjerk; g)        0.064     0.023     2.783
Max ed(yjerk; g)        19.816    11.230    1.765
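The ratio column can be recomputed from the two value columns as a quick consistency check. The sketch below uses only the numbers in the table; the dictionary layout is illustrative. The positioning rows are left out because the expert's value is 0, which leaves the ratio undefined.

```python
# (learnt policy, expert) values copied from the results table.
metrics = {
    "Avg e_d(y_acc; g)": (0.152, 0.140),
    "Max e_d(y_acc; g)": (13.283, 11.890),
    "Avg e_d(y_jerk; g)": (0.064, 0.023),
    "Max e_d(y_jerk; g)": (19.816, 11.230),
}

# Ratio of the learnt policy's penalty to the expert's, rounded as in the table.
ratios = {
    name: round(learnt / expert, 3) for name, (learnt, expert) in metrics.items()
}

for name, ratio in ratios.items():
    # Relative increase over the expert, in percent.
    print(f"{name}: ratio {ratio:.3f}, {100 * (ratio - 1):.0f}% higher")
```

A ratio of 1.086 is about 9% higher than the expert, while a ratio of 2.783 means the average jerk penalty is nearly three times the expert's.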

Summary

Conclusions

• The learnt policy seems to provide robust behaviour in simulated environments without data augmentation.
• Instantaneous decisions give noisy behaviour; filtering based on previous decisions improves the driving behaviour.
• More research is needed on safety and verification, through understanding the internal representations of networks.

More videos at: http://goo.gl/MKKnuF


References

Bojarski, M., Choromanska, A., Choromanski, K., Firner, B., Jackel, L., Muller, U., and Zieba, K. (2016a). VisualBackProp: visualizing CNNs for autonomous driving. arXiv preprint arXiv:1611.05418.

Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L. D., Monfort, M., Muller, U., Zhang, J., et al. (2016b). End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316.

Felipe, E. and Navin, F. (1998). Canadian researchers test driver response to horizontal curves. Road Management & Engineering Journal, TranSafety, Inc, 1.

Sörstedt, J., Svensson, L., Sandblom, F., and Hammarstrand, L. (2011). A new vehicle motion model for improved predictions and situation assessment. IEEE Transactions on Intelligent Transportation Systems, 12(4):1209–1219.

Xu, J., Yang, K., Shao, Y., and Lu, G. (2015). An experimental study on lateral acceleration of cars in different environments in Sichuan, southwest China. Discrete Dynamics in Nature and Society, 2015.
