What Devices do Data Centers Need?

Cedric F. Lam, Hong Liu, Ryohei Urata
Google, 1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA
{clam, hongliu, ryohei}@google.com

Abstract: We discuss trends in fiber optic technology development to fulfill the scaling requirements of datacenter networks.
OCIS codes: (060.0060) Fiber Optics and Optical Communications; (250.0250) Optoelectronics

1. Introduction

Cloud computing has been driving the need for ever larger datacenters [1] with higher and higher bandwidth network fabrics. As datacenter networks scale, optics is becoming ever more important, from ultra-long-distance transmission between datacenters to short-reach interconnects inside datacenters. In this paper, we discuss the roles of fiber optic devices and the development trends in these devices to fulfill the scaling requirements of datacenter networks.

2. Inter-datacenter Network Architecture

Figure 1 shows the generic architecture of a typical wide-area datacenter network. The bottom layer is a private backend backbone, which provides sparse, high-capacity and ultra-long-haul point-to-point transport links between mega-scale datacenters [2]. This network transports machine-generated traffic and data copies between datacenters and does not directly face the public Internet. Above the private backend network is a transport backbone, which interconnects datacenter operators to the public Internet through peering, so that users can access datacenter services and facilities. It should be emphasized that the private backend backbone is usually architecturally simple, with point-to-point links, but is much larger in capacity than the publicly facing backbone network [3].

Figure 1. Layered Inter-datacenter Network Architecture

This inter-datacenter private backend network relies on scarce and expensive long-distance fiber for transmission. The public-facing transport network, on the other hand, contains many high-capacity metro transport links to interconnect with other carrier networks. Metro transport networks also connect carrier networks to the edge cache systems used by datacenter operators and content providers, which improve the content distribution experience with faster access while sparing the expensive backbone transport network. Fast-growing over-the-top (OTT) services such as YouTube and Netflix have accelerated the deployment of edge caches and metro optical transport systems in recent years.

In addition, hardware and software technologies that maximize the utilization of backbone fiber infrastructure and simplify backbone transport network operation are highly desirable. In terms of new physical technologies, spectrally efficient coherent transponders with flexible bit rates [4] not only maximize fiber capacity, but also simplify transport network operation by reducing the variety of transceivers operators have to maintain. These transponders automatically adapt the transmission rate to channel conditions and maximize the link capacity accordingly. Studies have also shown that such transponders can produce significant cost savings in a reach-diverse environment [4]. Advances in high-speed electronics and integrated coherent receivers in recent years have made these transponders a reality. Moore's law helps to continuously drive down the cost and power of long-haul transponders while offering new enhanced capabilities such as soft-decision error correction codes. Other technologies that help maximize long-haul fiber link capacity include: (1) WDM multiplexing techniques that leave no guard bands in the optical spectrum, maximizing system capacity; (2) Raman amplifiers, which improve the optical signal-to-noise ratio; and (3) large effective-area fiber, which reduces optical non-linearity [3]. In terms of network control and management, SDN (Software-Defined Networking) has been demonstrated [2] to significantly improve overall system utilization and availability through centralized traffic engineering.

3. Network Fabrics inside Datacenters

3.1 Optical Interconnects

Inside a datacenter, vast numbers of servers work in unison to run each application. These servers are interconnected through networking fabrics with extremely large bisection bandwidth. Modern datacenters scale out using switching fabrics built from low-cost commodity switching silicon [5, 6] and state-of-the-art high-speed interconnects. Figure 2 shows the fat-tree cluster fabric topology typically deployed inside datacenters. Realizing such scale-out network clusters with large bisection bandwidth requires a vast number of efficient high-speed interconnect links. Optics plays a crucial role in forming these interconnects, not only in signal transmission performance, but also in economics and operation.

Figure 2. Intra-datacenter Network Architecture
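The scale-out arithmetic behind such fat-tree fabrics can be sketched quickly. The snippet below assumes an idealized three-tier k-ary fat-tree built entirely from identical k-port switches, a common textbook model in the spirit of [6] rather than the exact fabric shown in Figure 2:

```python
def fat_tree_capacity(k):
    """Size an idealized three-tier k-ary fat-tree of identical
    k-port switches (k must be even) at full bisection bandwidth."""
    assert k % 2 == 0
    hosts = k ** 3 // 4           # servers supported at full bisection
    core = (k // 2) ** 2          # core switches
    pods = k                      # each pod: k/2 edge + k/2 aggregation
    switches = core + pods * k    # total switch count
    # one host-facing link plus two inter-tier links per host
    links = 3 * hosts             # host<->edge, edge<->agg, agg<->core
    return hosts, switches, links

# 48-port commodity switch silicon: 27,648 servers, 82,944 links
hosts, switches, links = fat_tree_capacity(48)
```

Even with 48-port commodity silicon, the idealized fabric already requires on the order of 80,000 interconnect links, which is why per-link transceiver and fiber costs dominate at scale.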

Modern mega-datacenters employ large numbers of 10 Gb/s interconnects, from servers to switches and between switches [5], over distances ranging from 2 m (server-to-switch connections) to 2 km (switch-to-switch connections between buildings). At such speeds and distances, copper connections struggle to meet the performance requirements. As datacenter capacities increase, both the speed and the number of interconnects must increase. A scale-out datacenter fabric typically involves thousands of interconnect links [6]. For the same bisection bandwidth, the efficiency and cost of the fabric depend strongly on the port count of each switch chassis, which is in turn limited by the bandwidth of each switch silicon and the front-panel transceiver density. To achieve the best front-panel I/O density, low-cost, low-power and low-profile optical transceivers are thus of utmost importance in scaling out datacenter infrastructure. To maintain low cost and low power, it is important not to overdesign the performance of optical interconnects used inside datacenters. For short-reach interconnects up to a few hundred meters at 10 Gb/s, VCSEL-based multi-mode optical transceivers offer very low cost and very low power consumption. To meet the 2 km transmission distance requirement, single-mode transceivers are preferred. Moreover, because of the large number of interconnects and the much longer reaches involved in a scale-out cluster fabric, the cost of the optical fiber itself is significant in modern mega-scale datacenters. Single-mode fiber is intrinsically much cheaper than multi-mode fiber, so single-mode transceivers are becoming the trend for datacenter interconnects: they not only save fiber cost, but also improve cabling efficiency and provide much longer reach and future-proofing inside datacenters.
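The reach-driven media choice described above can be condensed into a simple selection rule. The thresholds below are illustrative assumptions distilled from the discussion (copper to a few meters, VCSEL/multi-mode to a few hundred meters, single-mode beyond), not hard standards limits:

```python
def pick_10g_interconnect(reach_m):
    """Illustrative media selection for a 10 Gb/s datacenter link.
    Thresholds are rough assumptions from the text, not spec values."""
    if reach_m <= 5:
        return "direct-attach copper"
    if reach_m <= 300:
        return "VCSEL + multi-mode fiber (lowest cost and power)"
    return "single-mode transceiver (2 km reach, future-proof)"
```

In practice the trend noted above pushes the last branch downward: as single-mode transceiver costs fall and fiber cost dominates, single-mode becomes attractive even at reaches multi-mode could serve.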

3.2 Scaling Optical Interconnects

To scale the interconnect speed beyond 10 Gb/s, transmitting signals in a single lane becomes exponentially more difficult. Parallel-lane transmission, in the form of ribbon cables or WDM interfaces [5], helps realize higher-bandwidth transmission. The ribbon-fiber approach increases the cost of the interconnect cable infrastructure and makes upgrades difficult, because new fiber ribbons must be installed whenever the interconnect speed and the number of transmission lanes increase. This is especially true in clusters prewired with structured cables between racks. Single-mode fiber not only solves the transmission distance problem, but also lends itself easily to capacity upgrades through parallel WDM lanes without introducing new fiber cables. In fact, technology advancements in the past few years have reduced the cost of optical transceivers so significantly that the cost of fiber cables has become a dominating part of the interconnect infrastructure [7], and low-cost integrated WDM array transceivers with built-in multiplexers/demultiplexers will play a significant role in reducing overall system cost and improving the performance and efficiency of the networking fabric.

The design considerations for short-reach WDM optical interconnects are very different from those for long-haul WDM transmission links. The choice of wavelength plan and channel spacing for short-reach integrated WDM transceivers directly impacts the cost, size and power consumption of the resulting transceiver module. As an example, an uncooled solution is preferred to eliminate the thermo-electric cooler (TEC) and reduce power. For 2 km transmission distances at baud rates below 10 Gbaud, direct modulation with on-off keying is simple, low-power and cost-effective. Dispersion is usually not a limiting factor at transmission distances under 2 km.
However, as link speeds increase from 10G to 100G (4x25 Gb/s) and 400G (16x25 Gb/s, or 4x100 Gb/s), direct modulation with on-off keying may no longer be the most effective way to support the required transmission rate and reach. Novel modulation schemes and digital signal processing (DSP) will be needed for datacenter interconnects, leveraging previous work done for long-haul transmission. Tradeoffs between cost, power consumption and complexity will ultimately determine the optimal transmission scheme at higher speeds. To overcome the extra insertion loss introduced by the multitude of patch panels used in structured wiring inside datacenters, datacenter transceivers need to support higher losses than the distances they cover would normally require. At low baud rates, datacenter transceiver performance is usually loss-budget limited. As the baud rate scales up, dispersion penalties will no longer be negligible and must be considered in interconnect designs.

4. Outlook of Optics in Next Generation Systems

Innovations in photonic integrated circuits (PICs) and optical packaging techniques are necessary to maintain the scalability of next-generation datacenter systems. Optics has evolved from long-distance communication to short-reach interconnects linking servers to switches, and switches to switches, in modern datacenters. As datacenter switching systems get faster, signal speeds will keep increasing, leading to even more challenges in integration and packaging: higher speed means higher baud rates and/or more signal lanes, as well as higher power consumption. Optical interconnects outperform their electrical counterparts at longer distances and higher signal speeds [7], and will become ever more ubiquitous as data rates increase. However, compared to electronic design and manufacturing, optics still lags far behind in integration and automation.
The pent-up demand for bandwidth will drive innovations in integrated photonics and packaging design. In the long run, photonic integrated circuits hold the key to enhancing system functionality and reducing the size and power consumption of optical transceivers, from short-reach interconnects to long-haul coherent transmission [8].

5. References

[1] L. A. Barroso and U. Hoelzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Morgan & Claypool Publishers, 2009.
[2] S. Jain, et al., "B4: experience with a globally-deployed software defined WAN," SIGCOMM 2013.
[3] C. Lam, et al., "Fiber optic communication technologies: what's needed for datacenter network operators," IEEE Communications Magazine, July 2010, pp. 32-39.
[4] X. Zhou, "Rate-adaptable optics for next generation long-haul transport networks," IEEE Communications Magazine, March 2013, pp. 41-49.
[5] H. Liu, et al., "Optical interconnects for scale-out data centers," Chapter 2 in Optical Interconnects for Future Datacenter Networks, Springer, 2013.
[6] A. Vahdat, M. Al-Fares, N. Farrington, R. N. Mysore, G. Porter, S. Radhakrishna, "Scale-out networking in the data center," IEEE Micro, July/August 2010, pp. 29-41.
[7] H. Liu, et al., "Scaling optical interconnects in datacenter networks: opportunities and challenges for WDM," IEEE Hot Interconnects 2010.
[8] ECOC 2013 workshop, "Low-cost access to photonic ICs: 5th European photonic integration forum," http://www.ecoc2013.org/workshopproposals.html#ws2
