Hardware Fault Tolerance through Artificial Immune System

Vadim Kataev

University of Paderborn, 2007

Present-day hardware systems

● Rapidly growing complexity
● Reliability
● Demands of safety-critical systems

www.digitalmicroscope.com

Natural Immune System

● Complexity
● Reliability

www.infections.bayer.com

Hardware Fault Tolerance

● Fault avoidance is not practical
● Faults are better tolerated

Fault Tolerant System

● Monitoring
● Error detection
● Self-repair

Fault detection

● Partial match length c

V = (v1, v2, v3, ..., vn)
U = (u1, u2, u3, ..., un)

dist(V, U) < c
dist(T, U) > c

[Figure: vectors V, U and T, and the match length c]
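The slide does not say how dist is computed. The following is a minimal sketch, assuming dist is the Hamming distance (the number of positions in which two equal-length bit vectors differ). The function name, the sample vectors and the value of c are illustrative and not taken from the slides.

```python
def hamming_distance(v, u):
    """Count the positions in which two equal-length vectors differ."""
    if len(v) != len(u):
        raise ValueError("vectors must have the same length")
    return sum(a != b for a, b in zip(v, u))


# Illustrative vectors and threshold (assumptions, not from the slides).
V = [1, 0, 1, 1]
U = [1, 0, 0, 1]
T = [0, 1, 1, 0]
c = 2

print(hamming_distance(V, U) < c)  # True: V differs from U in one position
print(hamming_distance(T, U) > c)  # True: T differs from U in all four positions
```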

Fault detection

● Negative selection algorithm

[Flowchart: Self S and immature random tolerance conditions R → "Match in c contiguous positions?" → match: Reject; matches < c: matured tolerance conditions R]
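The censoring step of the flowchart can be sketched as follows. This is only an illustration under stated assumptions: candidates are uniformly random bit vectors, a match means agreement in at least c contiguous positions, and the names matches_contiguous and generate_detectors, the seed and max_tries are hypothetical.

```python
import random


def matches_contiguous(a, b, c):
    """True if a and b agree in at least c contiguous positions."""
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= c:
            return True
    return False


def generate_detectors(self_set, length, c, count, rng, max_tries=10000):
    """Keep random candidates that match no self vector (negative selection)."""
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == count:
            break
        candidate = [rng.randint(0, 1) for _ in range(length)]
        if any(matches_contiguous(candidate, s, c) for s in self_set):
            continue  # reject: the candidate would flag a valid (self) vector
        if candidate not in detectors:
            detectors.append(candidate)  # matured tolerance condition
    return detectors


self_set = [[1, 0, 1, 1], [1, 1, 1, 0]]
detectors = generate_detectors(self_set, length=4, c=3, count=2,
                               rng=random.Random(0))
print(detectors)
```

A fixed seed is used only to make the run repeatable; any surviving candidate is, by construction, one that matches no self vector in c contiguous positions.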

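Fault detection

● Negative selection

    selfVectors = [[1, 0, 1, 1], [1, 1, 1, 0]]
    detectors = [[1, 0, 0, 0], [0, 0, 1, 0]]

    def nonselfDetected():
        print("non-self detected")

    # Report every vector that coincides with a detector.
    for vector in selfVectors:
        if vector in detectors:
            nonselfDetected()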

Systems of state machines

● Hardware design
● Finite state machines in hardware

[Figure: finite state machine with states s1, s2, s3 and transitions t1, t2, t3, t4]
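A finite state machine of this kind can be sketched as a transition table. The figure's actual arrows are not recoverable from the slide text, so the table below is hypothetical; only the state and transition names come from the slide. Any pair that is not listed leads to an invalid state.

```python
# Hypothetical transition table: (state, transition) -> next state.
TRANSITIONS = {
    ("s1", "t1"): "s2",
    ("s2", "t2"): "s3",
    ("s3", "t3"): "s1",
    ("s3", "t4"): "s3",
}


def step(state, transition):
    """Return the next state, or "invalid" for an unlisted pair."""
    return TRANSITIONS.get((state, transition), "invalid")


def run(start, transitions):
    """Apply transitions in order and stop at the first invalid state."""
    state = start
    for t in transitions:
        state = step(state, t)
        if state == "invalid":
            break
    return state


print(run("s1", ["t1", "t2", "t3"]))  # s1
print(run("s1", ["t2"]))              # invalid
```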

Entity feature mapping

Immune system  | Hardware fault tolerance
Self           | Acceptable state
Non-Self       | Invalid state
Antibody       | Error tolerance conditions
Genes          | Variables forming tolerance conditions
Paratope       | Invalid state verification conditions
Epitope        | Valid state verification conditions
T-Helper       | Recovery procedure activator
Memory cell    | Sets of tolerance conditions

Process feature mapping

Immune system            | Hardware fault tolerance
Recognition of Self      | Recognition of valid states
Recognition of Non-Self  | Recognition of invalid states
Ontogenetic learning     | Learning correct states
Clonal detection         | Isolation of self-recognizing tolerance conditions
Inactivation of antigen  | Return to normal operation
Life of an organism      | Operation lifetime

Design principles

● Immunotronics
● Embryonics
● Self-repair

messe-muenchen.de

Immunological system

[Figure: state machines A, B and C, each with states s1, s2, s3 and an invalid state; a states bus and a control line; the immune system monitor]

Creating tolerance conditions

[Figure: input data generator, state machine (states s1, s2, s3; transitions t1, t2, t3) with input data and output data, output data analyser, self vectors, immune system monitor, tolerance conditions, greedy detector generator]

Fault recognition

[Figure: state machine (states s1, s2, s3 and an invalid state) passing state data to the immune system; response execution: Continue or Stop]
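The Continue/Stop response can be sketched as a check of the observed state data against the detector set. This is a simplification under stated assumptions: it uses exact comparison, as the negative selection code on the earlier slide does, rather than partial matching, and the function name respond is hypothetical.

```python
# Detectors as listed on the negative selection slide.
detectors = [[1, 0, 0, 0], [0, 0, 1, 0]]


def respond(state_data, detectors):
    """Return "Stop" if the state data equals a detector, else "Continue"."""
    return "Stop" if state_data in detectors else "Continue"


print(respond([1, 0, 1, 1], detectors))  # Continue: a self vector
print(respond([0, 0, 1, 0], detectors))  # Stop: equals a detector
```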

Antigen inactivation

[Figure: B cells (state recognition, response activation, self / non-self recognition); T cells (signaling peptide, confirmation, costimulation); content addressable memory holding valid tolerance (epitope) and error tolerance (paratope)]

Design principles

● Immunotronics
● Embryonics
● Self-repair

Official logo of embryonics

Embryonic array

● Multiple distributed state machines
● No central control unit
● Lymphatic network

Embryonic system

[Figure labels: Embryonic, Lymphatic, Trans-layer, Embryonic, Immune]

Immune-Embryonic cell interaction

[Figure labels: Control, States]

Immune cell implementation

Design principles

● Immunotronics
● Embryonics
● Self-repair

www.menshealth.com

Self-repair technique

● Return to full functionality
● Elimination of faulty cells
● Redundancy through additional cells
● Cellular differentiation

Self-repair through column shift

● Simplicity
● Deactivating the column containing the faulty cell
● Right shift
● Spare columns required

[Figure: cell arrays with functions f1 to f6 before and after the column shift]
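The column shift can be sketched on a single row of cells. This is a minimal illustration, assuming each logical column is placed on the next healthy physical column to its right; the function name shift_columns, the "X" marker for a deactivated column and the sample sizes are hypothetical.

```python
def shift_columns(functions, n_physical, faulty):
    """Place each logical column on the next healthy physical column.

    functions: one label per logical column, left to right.
    Returns one entry per physical column: a label, "X" for a
    deactivated column, or None for an unused spare.
    """
    healthy = [p for p in range(n_physical) if p not in faulty]
    if len(healthy) < len(functions):
        raise RuntimeError("not enough spare columns")
    layout = ["X" if p in faulty else None for p in range(n_physical)]
    for label, p in zip(functions, healthy):
        layout[p] = label
    return layout


print(shift_columns(["f1", "f2", "f3"], 4, set()))  # ['f1', 'f2', 'f3', None]
print(shift_columns(["f1", "f2", "f3"], 4, {1}))    # ['f1', 'X', 'f2', 'f3']
```

With one spare column, a second faulty column raises an error, which mirrors the requirement that spare columns be available.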

Results

● Decade counter
● 2-bit binary counter

University of York

Decade counter. Structure

● 0 to 9 counter
● 10 states
● 4 bits of data
● 2 inputs (CEN: count enable, RST: reset)
● Operation:
  Incremental count (CEN=1, RST=0)
  Hold (CEN=0, RST=0)
  Reset (CEN=X, RST=1)

Decade counter. Data representation

Previous state | CEN RST | Current state
0001           | 1   0   | 0010
0010           | 1   0   | 0011
0011           | 1   0   | 0100
...
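Each row combines the previous state, the two inputs and the current state into one 10-bit string. A minimal sketch of that encoding follows, assuming the fields are concatenated in the order of the column headers; the function name encode is hypothetical.

```python
def encode(previous, cen, rst, current):
    """Concatenate previous state (4 bits), CEN, RST and current state (4 bits)."""
    return f"{previous:04b}{cen}{rst}{current:04b}"


# A counting step from state 1 to state 2 with CEN=1, RST=0.
print(encode(1, 1, 0, 2))  # 0001100010
```

Strings built this way for every valid transition would form the self set against which detectors are censored.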

Decade counter. Results

● The ideal match length c has to be predefined
● The percentage difference between total and single-cycle detectable faults depends on the length c
● An error has to be detected in a single clock cycle
● A 7% increase in failure probability occurs when single-cycle detections are considered

Results

● Decade counter
● 2-bit binary counter

University of the West of England, Bristol

2-bit binary counter. Structure

● 15 embryonic cells
● One spare row
● One spare column
● 15 immune cells
● One-to-one supervising
● Non-self checking

2-bit binary counter. Cell-elimination

[Figure: 2-bit counter built from D flip-flops (D, Q, clk) with outputs Q1 and Q0, shown in two configurations]

2-bit binary counter. Results

● One immune cell for each embryonic cell
● Monitoring more embryonic cells leads to concurrent monitoring
● Hardware overheads of the immune cell may become prohibitively large
● Three layers for fault tolerance:
  1. Built-in self-test inside each embryonic cell
  2. Off-line error checking
  3. Negative selection

Conclusion

● Implementation in hardware
● Future research

Questions?
