Integration, Architecture, and Applications of 3D CMOS-Memristor Circuits
K.-T. Tim Cheng and Dmitri Strukov
Univ. of California, Santa Barbara
ISPD 2012
3D Hybrid ‐ CMOS/NANO add-on
• CMOS stack + nano add-on
• nanowire crossbar of two-terminal devices (memristors)
[Figure: CMOS layer (CMOS stack) with a nanodevices layer on top; top nanowire level and bottom nanowire level, with similar two-terminal nanodevices at each crosspoint]
Resistive Switching “Memristive” Devices (latching switches, a.k.a. resistive switches, a.k.a. programmable diodes, a.k.a. memristive switches)
+ Wide range of material systems and physical phenomena
[Figure: I-V curve of a Pt / TiO2 / TiOx / Pt device; labels: 50 nm hp, <50 ns; axes: Current (uA), -200 to 200, vs. Voltage (V), -2 to 2]
J. Yang et al., Nature Nano. (2008)
Area-Distributed “CMOL” Interfaces
[Figure: cross-section showing the Si wafer with MOSFETs, the CMOS stack (just a cartoon), interface vias (“pins”, tip radii 2-10 nm), gold nanowire levels (nanoimprint), and nanodevices (latching switches)]
K. Likharev (2004, 2005); D. Strukov and K. Likharev (2006)
http://www.oxfordplasma.de/process/sibo_wtc.htm
AFOSR‐MURI HyNano: 3D Hybrid CMOS‐Nano Circuits
The HyNano Team
Michael Chabinyc Materials, UCSB
Wei Lu EECS, Michigan
Tim Cheng ECE, UCSB (Director)
Susanne Stemmer Materials, UCSB
Marivi Fernandez-Serra Physics, Stony Brook
Konstantin K. Likharev Physics, Stony Brook
Dmitri Strukov ECE, UCSB
Luke Theogarajan ECE, UCSB
Qiangfei Xia ECE, UMass Amherst
Project Overview
APPLICATIONS
• 3D hybrid memories
• information processing
• 3D hybrid SoC
ARCHITECTURES/CIRCUITS
• 3D CMOS/nano circuits w. area-distributed interface
• mixed-signal CrossNets
• compact models
• optical lithography, e-beam, nanoimprint
DEVICES
• reproducible, high-performance, high-endurance devices
• drift diffusion and ab-initio modeling
MATERIALS
• a-Si
• metal oxide
• organic
• solid electrolyte
Thrust Area #1: Application/Architecture/Ckt Exploration
• Memory arrays for high-performance computing
• CMOL-based FPGA
• Neuromorphic networks for bio-inspired information processing
• Evolvable analog circuits
• Tunable bias network for analog design
• Weighted multiply and add circuits
• High precision Digital-to-Analog converter
“CMOL” Interface – Integrating CMOS with Crossbar Memory Array
[Figure: same cross-section cartoon as for the area-distributed interface: Si wafer, MOSFETs, CMOS stack, interface vias (“pins”), gold nanowire levels (nanoimprint), nanodevices (latching switches)]
Addressing Crossbar Memory Array
• There are two types of pins: blue and red
• Each array of pins has its own decoding scheme
Double decoding scheme:
• An array of N^2 blue pins is uniquely accessed with 2N control signals.
• Another 2N control signals address the corresponding N^2 red pins.
Double Decoding Scheme
• Four decoders
[Figure: memory cell array surrounded by a demux, two select decoders, and a mux/demux for data I/O]
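The double decoding scheme can be sketched as follows. This is only an illustration, assuming a flat address is split into four base-N fields, one per decoder (row and column select for the blue pin array, row and column select for the red pin array). The function names and the field order are hypothetical and do not come from the slides.

```python
def split_address(addr: int, n: int) -> tuple[int, int, int, int]:
    """Split a flat address into (blue_row, blue_col, red_row, red_col).

    Hypothetical mapping: each field drives one of four N-way decoders,
    so 4N control lines select one (blue pin, red pin) pair.
    """
    if not 0 <= addr < n ** 4:
        raise ValueError("address out of range for N^4 locations")
    blue, red = divmod(addr, n * n)       # index of the blue pin, index of the red pin
    blue_row, blue_col = divmod(blue, n)  # 2N signals pick 1 of N^2 blue pins
    red_row, red_col = divmod(red, n)     # 2N signals pick 1 of N^2 red pins
    return blue_row, blue_col, red_row, red_col


def control_line_count(n: int) -> int:
    """Total select lines: 2N for the blue array plus 2N for the red array."""
    return 4 * n
```

With N = 4, the 16 control lines distinguish all 4^4 = 256 (blue pin, red pin) pairs, since every address maps to a distinct 4-tuple.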
Crossbar Construction – Top View
Crossbar Construction – Side View
[Figure: side view of the stack: Si wafer, MOSFETs, CMOS stack, interface vias (“pins”), gold nanowire levels (nanoimprint), nanodevices (latching switches)]
Crossbar Construction – Bottom Level
Crossbar Construction – Top Level
Crossbar Construction
Connectivity Domain
Unused Address Space
The red pin can only interact with blue pins in its connectivity domain.
Address space provided by yellow cells is wasted!
Key Geometric Parameters
• Distance between nanowires is 2F_nano
• Size of cell is 2βF_CMOS
• β^2 = r^2 + 1, where r is an odd integer > 1
• Crossbar is tilted by an angle α = arctan(1/r) with respect to the pins
• Number of reachable crosspoints per wire segment is r^2 - 1
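The geometric relations above can be evaluated numerically. The sketch below simply applies the three stated formulas (β^2 = r^2 + 1, α = arctan(1/r), r^2 - 1 crosspoints per segment); the function name and the returned dictionary keys are illustrative choices, not part of the original material.

```python
import math


def cmol_geometry(r: int) -> dict:
    """Derived quantities for a given r, using the relations listed on the slide."""
    if r <= 1 or r % 2 == 0:
        raise ValueError("r must be an odd integer greater than 1")
    return {
        "beta": math.sqrt(r * r + 1),                 # from beta^2 = r^2 + 1
        "alpha_deg": math.degrees(math.atan(1 / r)),  # crossbar tilt angle
        "crosspoints_per_segment": r * r - 1,         # reachable crosspoints
    }
```

For r = 3 this gives 8 reachable crosspoints per wire segment and a tilt of roughly 18.4 degrees.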
Adding a Second Crossbar Layer
Connectivity domain in the first crossbar layer
Connectivity domain in the second crossbar layer
The mapping is done through pin translation wires
Blue pins are common to all crossbar layers. Red pins are "redefined" for each layer using the pin translation wires.
First layer of red pins.
First layer of red and blue pins.
Layer of (bluish) wires connected to the blue pins.
Single (orange) wire connected to a red pin. The cross‐points with the bottom wires are shown in green.
First complete crossbar layer.
A single pin translation wire (in yellow).
Every orange wire is “translated” into another point using the same type of pin translation wire.
The first crossbar layer, with its pin translation wires, is then “buried” in SiO2.
We start to build the next crossbar layer (bluish wires)
We add the orange wires (the cross‐points are formed)
And we add the pin translation wires and repeat the process…
Maximum Number of Layers
• Each layer has N^2 cells.
• There are r^2 - 1 crosspoints per cell.
• That gives us a total of N^2(r^2 - 1) crosspoints per layer.
• The double decoding scheme allows us to address up to N^4 locations.
• This means that we can (potentially) have up to N^2/(r^2 - 1) crossbar layers.
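The layer-count argument is plain arithmetic and can be checked with a few lines. This sketch assumes integer N and r and uses floor division where N^2 is not a multiple of r^2 - 1; the function names are illustrative.

```python
def crosspoints_per_layer(n: int, r: int) -> int:
    """N^2 cells per layer, each with r^2 - 1 crosspoints."""
    return n * n * (r * r - 1)


def max_layers(n: int, r: int) -> int:
    """Address space of N^4 locations divided by the crosspoints in one layer."""
    return n ** 4 // crosspoints_per_layer(n, r)
```

For example, N = 16 and r = 3 give 2048 crosspoints per layer and an address space large enough for 32 layers, which equals N^2/(r^2 - 1) = 256/8.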
How Does it Stand Up as a Memory?

                       Memristor   PCM      STTRAM   DRAM     Flash     HDD
Density (F^2)          <4          8–16     37–64    6–8      4–6       2/3
Energy per bit† (pJ)   0.1–3       2–27     0.1      2        10000     1–10x10^9
Read time (ns)         10-100(?)   20–70    10–30    10–50    25000     5–8x10^6
Write time (ns)        ~10         50–500   13–95    10–50    200000    5–8x10^6
Retention              years       years    weeks?   <        years     years
Endurance (cycles)     >10^12      10^7     10^15    10^15    10^6      10^4
If Successful, 3D Hybrids Can Achieve...
• Unprecedented memory density
  – Footprint of a nano-device is 4F_nano^2/K, for K vertically integrated crossbar layers
  – Potentially up to 10^14 bits on a single 1-cm^2 chip
• Enormous memory bandwidth
  – Potentially up to 10^18 bits/second/cm^2
• At manageable power dissipation
• With abundant redundancy for yield/reliability
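The density claim follows from the per-device footprint 4F_nano^2/K. The sketch below only evaluates that formula; the example values used afterwards (F_nano = 5 nm, K = 100 layers) are assumptions picked to make the arithmetic easy to follow and are not figures from the talk.

```python
def bits_per_cm2(f_nano_nm: float, k_layers: int) -> float:
    """Areal density implied by a per-device footprint of 4*F_nano^2/K."""
    f_cm = f_nano_nm * 1e-7                   # 1 nm = 1e-7 cm
    footprint_cm2 = 4 * f_cm ** 2 / k_layers  # area per device, in cm^2
    return 1.0 / footprint_cm2
```

Under these assumed values, a single layer at F_nano = 5 nm yields about 10^12 bits/cm^2, and 100 stacked layers reach about 10^14 bits/cm^2.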
CMOL-Based FPGA
• Programming of crosspoint memristors is similar to that of CMOL digital memories
• Uniform fabric with CMOS inverter cells
• Crossbar wires for routing
[Figure: CMOL FPGA cell; labels: A, B, nanodevices, A+B, F, CMOS inverter, R_ON, C_wire, R_pass]
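The figure labels suggest that signals A and B reach a CMOS inverter through nanodevices on a shared wire. A purely logical sketch of such a cell follows, assuming the shared wire acts as a wired-OR of the inputs whose nanodevices are programmed ON and that the inverter output is its complement (a NOR). This reading of the figure, and all names below, are assumptions; R_ON, C_wire and R_pass are not modeled.

```python
def cmol_cell_output(inputs: list[bool], device_on: list[bool]) -> bool:
    """Logical model of one cell: inverter fed by a wired-OR of connected inputs.

    inputs[i]    -- logic value on input wire i
    device_on[i] -- True if the crosspoint nanodevice for input i is programmed ON
    """
    wired_or = any(x and on for x, on in zip(inputs, device_on))
    return not wired_or  # CMOS inverter
```

Programming a nanodevice OFF removes that input from the gate, which is how the same uniform fabric could be configured into different functions in this simplified picture.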
Density: CMOS vs. CMOL

Metrics (units)              2009    2010    2011    2012    2013    Comments
Half-pitch F_CMOS (nm)       50      45      40      36      32      In accordance with ITRS
Half-pitch F_nano (nm)       20      18      16      14      12      -
CMOS memories (Gbits/cm^2)   6.7     8.2     10.5    13      16      Follows ITRS (with A = 6F_CMOS^2)
CMOL memories (Gbits/cm^2)   4       10      23      36      67      Initial progress impacted by q
CMOS FPGA (Mgates/cm^2)      0.4     0.5     0.6     0.8     1.0     Rescaled from 0.18 μm rules
CMOL FPGA (Mgates/cm^2)      625     775     1,000   1,200   1,500   -

Metrics (units)              2016    2019    2022    2025    2028    Comments
Half-pitch F_CMOS (nm)       30      28      26      24      22      Grows slower than in ITRS
Half-pitch F_nano (nm)       10      6       4       3.5     3       -
CMOS memories (Gbits/cm^2)   18      21      25      29      35      Follows A = 6F_CMOS^2
CMOL memories (Gbits/cm^2)   100     350     900     1,200   1,700   Spectacular progress at lower q
CMOS FPGA (Mgates/cm^2)      1.1     1.3     1.5     1.7     2.1     Rescaled from 0.18 μm rules
CMOL FPGA (Mgates/cm^2)      1,700   2,000   2,300   2,700   3,200
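The "CMOS memories" row can be approximately reproduced from the comment that the cell area is A = 6F_CMOS^2. The sketch below assumes one bit per cell and no peripheral overhead, which is an assumption rather than something stated in the table; the function name is illustrative.

```python
def cmos_memory_density_gbits_per_cm2(f_cmos_nm: float) -> float:
    """Density of a memory with cell area 6*F^2, in Gbits/cm^2."""
    f_cm = f_cmos_nm * 1e-7        # 1 nm = 1e-7 cm
    cell_area_cm2 = 6 * f_cm ** 2  # A = 6 F_CMOS^2
    return 1.0 / cell_area_cm2 / 1e9
```

For F_CMOS = 50 nm and 45 nm this evaluates to about 6.7 and 8.2 Gbits/cm^2, matching the 2009 and 2010 columns; some other columns differ from this simple estimate in the last digit.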
Thrust Areas:
#2: High-Performance/-Yield Devices
#3: 3D Hybrids Integration
Integrating CMOS with devices of different materials:
• a-Si
• Metal oxide
• Organic
• Solid-state electrolyte
Using:
• Nanoimprint
• E-beam lithography
• Optical lithography
• Heterogeneous wafer-level integration
[Figure: panels (a) and (b); captions: E-Beam Crossbar Arrays (Lu), <20 nm Overlay Alignment (Xia); scale bars: 50 μm, 100 μm]
Integrated Crossbar Array/CMOS System
PI: Lu
[Figure: integrated crossbar/CMOS chip with probe card attached; labels: crossbar array, CMOS]
Kim et al., Nano Lett. 12, 389–395 (2012).
Performance of a-Si and Metal-Oxide Device Array
[Figure: device images labeled “on” and “off”, with filament; scale bar 100 nm]
• Tight distribution from 256 devices measured
• Devices show excellent on/off and intrinsic diode characteristics
BACKUP SLIDES
Thrust Area #3: High-Performance/-Yield/-Reproducibility Devices
• a-Si (Lu)
• Metal oxide (Stemmer, Xia)
• Organic (Chabinyc)
• Solid-state electrolyte (Lu)
a-Si Memristive Devices and Arrays
PI: Lu
[Figure: I-V curves (1st cycle, after 2nd cycle), current (in units of 100 nA, and on a log scale from 10^-12 to 10^-7) vs. Voltage (V); histogram of # of devices vs. Vth (V), 2.6 to 4.4 V; device images labeled “on” and “off”, with filament; scale bar 100 nm]
Lu et al., Nano Lett. (2008, 2009)
Project Organization
UCSB
• CMOS circuit design for CMOL integration (Cheng, Strukov, Theogarajan)
• MBE fabrication of memristive devices (Stemmer)
• Organic memristive devices (Chabinyc)
• Memristive device modeling (Strukov)
• Digital and analog 3D hybrid circuit architectures (Cheng, Strukov, Theogarajan)
U. Michigan
• a-Si & solid electrolyte devices; 3D integration with CMOS (Lu)
UMass
• Metal oxide memristive devices; 3D integration with CMOS (Xia)
Stony Brook University
• Ab-initio simulation of memristive devices (Fernandez-Serra, Likharev)
• Mixed-signal neuromorphic 3D hybrid circuit architectures (Likharev)
[Figure: the groups are arranged along two spans labeled "experiment" and "theory/modeling"]
Crossbar Architecture
- Xbar to preserve density
- Passive (no transistors) but nonlinear I-V
- Common way (from periphery)
[Figure: crossbar with top (nano)wire level, bottom (nano)wire level, and similar two-terminal devices at each crosspoint; device I-V curve with -v_w, v_r, v_w marked on the voltage axis; bias diagrams for Write (V = Vw, V = Vw/2) and Read (V = Vr, V = Vr/2, label A); CMOS for decoding and sensing]
July 2011 MURI Kickoff
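The bias labels in the figure (V = Vw with Vw/2, and V = Vr with Vr/2) are consistent with a half-select biasing scheme. The sketch below is one possible reading, assuming the selected top wire is driven to V, the selected bottom wire is held at 0, and every other wire sits at V/2. The exact wire assignment is not recoverable from the slide, so it is an assumption, as are all names.

```python
def crosspoint_voltages(n_rows: int, n_cols: int,
                        sel_row: int, sel_col: int, v: float) -> list[list[float]]:
    """Voltage across every crosspoint device under an assumed V/2 scheme."""
    # Assumed biasing: selected row at v, unselected rows at v/2.
    row_v = [v if i == sel_row else v / 2 for i in range(n_rows)]
    # Assumed biasing: selected column at 0, unselected columns at v/2.
    col_v = [0.0 if j == sel_col else v / 2 for j in range(n_cols)]
    return [[row_v[i] - col_v[j] for j in range(n_cols)] for i in range(n_rows)]
```

Under this assumption only the selected device sees the full voltage, devices sharing a wire with it see half, and all others see zero. A passive crossbar relies on the nonlinear I-V for this reason: the half-selected devices should conduct far less than half the current of the selected one.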
Generic Memory Array
• Asserting a word line causes the access element to place the contents of the memory element on the bit line. A particular bit is then selected with a MUX.
[Figure: memory array with decoder, multiplexer, access element, and memory element]
• An array of N^2 memory elements can be uniquely accessed using 2N control signals (word + bit lines).
[Figure: another representation of the same array]
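The 2N-signal claim can be illustrated with one-hot control lines. In this sketch a flat address is turned into N word-line signals and N bit-select signals; the row-major address layout and the function name are assumptions made for illustration.

```python
def control_signals(addr: int, n: int) -> tuple[list[int], list[int]]:
    """Return (word_lines, bit_select), two one-hot lists of length N each."""
    if not 0 <= addr < n * n:
        raise ValueError("address out of range for an N x N array")
    row, col = divmod(addr, n)                              # assumed row-major layout
    word_lines = [1 if i == row else 0 for i in range(n)]   # decoder output
    bit_select = [1 if j == col else 0 for j in range(n)]   # MUX select
    return word_lines, bit_select
```

For N = 4, address 5 asserts word line 1 and selects bit line 1, and the 8 control signals together distinguish all 16 elements.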
Area-Distributed “CMOL” Interfaces (II)
Most important feature: pin array tilt by angle α = arcsin(F_nano/F_CMOS) = arctan(1/r)
[Figure: tilted pin array; labels: pin 1, pin 2A, pin 2B, A, B; dimensions: 2F_CMOS, 2rF_nano, 2F_nano]
Every nanowire (and hence every crosspoint) may be addressed from CMOS!
K. Likharev (2004, 2005); D. Strukov and K. Likharev (2006)
A Possible Solution
With this particular connectivity domain geometry (r = 3), we can cover the whole plane... but that is not always the case.
The pin translation wires are another layer of wires on top of the crossbar.
We can add more crossbar layers by simply inserting a layer of pin translation wires between them.
Crossbar Analysis
The crossbar is rotated by an angle α such that:
tan α = 1/r
where r is an odd integer greater than 1. Once we set r and β (the CMOS cell complexity), the angle α and F_nano are fixed, as are the length of the wires in the crossbar and the number of memristive devices reachable per wire segment.
The parameter r also sets the maximum, minimum, and average paths that electric signals have to travel to access a bit (a memristive device). These paths are given by:
Maximum (worst) case: 2F_nano (r^2 - r + 1)
Minimum (best) case: 2F_nano r
Average (real) case: 2F_nano (r^2 + 1)/2
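The three path-length expressions can be evaluated directly. The sketch below only encodes the formulas as given; the function name is illustrative and the result is in the same length unit as the F_nano argument.

```python
def access_path_lengths(r: int, f_nano: float) -> dict:
    """Worst, best, and average signal path to a crosspoint, per the slide formulas."""
    return {
        "max": 2 * f_nano * (r * r - r + 1),
        "min": 2 * f_nano * r,
        "avg": 2 * f_nano * (r * r + 1) / 2,
    }
```

For r = 3 and F_nano = 1 (arbitrary units) the paths are 14, 6, and 10. The average expression is exactly the mean of the worst and best cases, since (r^2 - r + 1 + r)/2 = (r^2 + 1)/2.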
3D Hybrid Integration with Multi-Layer Crossbars
[Figure: stack of crossbar layers alternating with via translation layers above a CMOS layer (N data/control lines, N^2 access devices/vias); layers and vias labeled 1 to 5 and A to E; crosspoint devices in the 1st and 2nd layers; connectivity domains in the 1st and 2nd layers; via translation wires; ~N^2 β^2 crosspoint devices per layer (out of N^4 total)]
D. Strukov and R. S. Williams (2009)
Project Overview
APPLICATIONS (Cheng, Likharev, Strukov, Theogarajan)
• information processing
• 3D hybrid memories
• 3D hybrid SoC
ARCHITECTURES/CIRCUITS (Cheng, Likharev, Strukov, Theogarajan)
• 3D CMOS/nano circuits w. area-distributed interface
• mixed-signal CrossNets
• compact models (Likharev, Strukov)
• optical lithography (Strukov, Stemmer)
• e-beam (Lu)
• nanoimprint (Xia, Chabinyc)
DEVICES
• reproducible, high-performance, high-endurance devices
• drift diffusion and ab-initio modeling (Fernandez-Serra, Strukov)
MATERIALS
• a-Si (Lu)
• metal oxide (Xia, Stemmer)
• organic (Chabinyc)
• solid electrolyte (Lu)
IC Applications Continue to Demand More Memory and Higher Bandwidth
For most applications running on high-end SoCs, the amount of available memory and the memory bandwidth have been and will continue to be the bottlenecks.
[Figure: chart of IC types by Chip Size and I/O (I/O axis ticks 100 to 600; both axes run toward "High"); labels: GPU, CPU, Chipset & FPGA; Networking; Application Processor; Memory; Baseband; BT/WiFi; PM; Transceiver; Peripheral I/O Controller; PA; Switch; Discrete]
Bistable Two-Terminal Devices (latching switches, a.k.a. resistive switches, a.k.a. programmable diodes, a.k.a. memristive switches)
Demonstrated with many materials; no clear winner yet; few reproducibility reports, e.g.:
• Si / α-Si / M: S. H. Jo and W. Lu (2008)
• Ti / Pt / TiO2 / Pt: Q. Xia et al. (2009); J. Borghetti et al. (2010)
Device Requirements Vary for Different Ckts/Architectures/Applications

Dynamic range of resistance   Signal: DC           Signal: AC
Small                         Tuning               -
Large                         Memory, FPGA, DAC    MAC
CMOS-CMOL Integration: Initial Demonstration
PI: Xia
[Figure: panels (a) to (e)]
Xia, Strukov et al. (2009)