Software Defined Networking at Scale Bikash Koley on behalf of Google Technical Infrastructure BTE 2014

Software Defined Networks require Software Defined Operations

Google has made great progress in the SDN data and control planes.

It is time to transform the management plane with the industry!

Google Confidential and Proprietary

Warehouse Scale Computers

Source: Google, 2012

100 Billion searches per month on google.com

Google’s Global CDN

B4: Software Defined inter-Datacenter WAN

History of B4 WAN

[Figure: timeline of the B4 WAN. Milestone labels: Exit testing "opt in" network; SDN Rollout; SDN Fully Deployed; Central TE Deployed.]

B4: SDN Architecture

Mixed SDN Deployment

[Figure: mixed SDN deployment at a B4 site. Labels: Cluster Border Router; Data Center Network; EBGP; IBGP/ISIS to remote sites; Quagga; OFC; RCS; Paxos (x3); OFA (x8); TE Server.]

● Ready to introduce new network function virtualization (NFV)

B4: SDN Equipment

● The only way to get well defined control and data plane APIs on routing HW at that time was to build it ourselves
  ○ Built from merchant silicon
  ○ OpenFlow support
  ○ Does not have all features
  ○ Multiple chassis per site
  ○ Fully centralized, software controlled

Why SDN?

● SDN ⇏ Cheap Hardware
● SDN = programmatic decomposition of control, data and management planes
● Well defined APIs ⇒ fundamentally easier operational model
● Separation of control and data planes ⇒ much higher uptime
● Network function virtualization ⇒ new functions rolled out in days (vs years)

Virtual Network ⇔ Physical Network

Many Networks → One Network

[Figure: a layer-cake network compared with a software defined network.]

Layer-cake Network: separate Optical, MPLS and IP layers, each with its own App, NMS, EMS, control plane and data plane (optical transport, LSR, router)

• Heterogeneous control plane
• Heterogeneous network apps
• Large inefficiencies

Software Defined Network: many Apps on one Network Operating System, over a data plane of optical transport, LSR and router

• Common network OS
• Common network apps
• Global view of network states

Anatomy of a Software Defined Network

[Figure: three stacked planes with the interfaces between them.]

Management Plane: Topology Model, Config Model, Config, Workflow, Analytics, Telemetry

Control Plane: BGP, IGP, TE, Optical Restoration

Data Plane: switches/routers, Optical Transport

Interface labels: Config API???; SNMP; OpenFlow/PCE-P/...

Anatomy of a Software Defined Network

[Figure: the same three planes, with candidate technologies marked on the models and interfaces.]

Management Plane: Topology Model, Config Model, Config, Workflow, Analytics, Telemetry

Control Plane: BGP, IGP, TE, Optical Restoration

Data Plane: switches/routers, Optical Transport

Model and interface labels: YANG/..?; Netconf/JSON/..?; JSON PUB/SUB?; OpenFlow/PCE-P/...
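The "JSON PUB/SUB?" label suggests telemetry that is pushed to subscribers rather than polled. The slide does not describe a design, so the following is only a minimal in-process sketch of the idea; the `TelemetryBus` class, the topic name and the sample fields are all hypothetical.

```python
import json


class TelemetryBus:
    """Toy in-process publish/subscribe bus carrying JSON-encoded samples."""

    def __init__(self):
        self._subscribers = {}  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        # Register a callback that receives every sample published on topic.
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, sample):
        # Encode once; each subscriber gets its own decoded copy.
        message = json.dumps(sample, sort_keys=True)
        for callback in self._subscribers.get(topic, []):
            callback(json.loads(message))
        return message


bus = TelemetryBus()
received = []
bus.subscribe("interfaces/counters", received.append)
bus.publish(
    "interfaces/counters",
    {"device": "router-1", "port": "port-1", "in_octets": 1024},
)
```

A real deployment would put a network transport between publisher and subscriber; the point here is only that the device pushes structured samples as they change.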

Software Defined Network Configuration

[Figure: configuration stack. The Config Model and Topology Model sit on top of these layers:]

Content [config data]
Operations
RPC
Transport Protocol [ssh, https,..]
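The content / operations / RPC / transport layering matches the way NETCONF (RFC 6241) structures a request. As an illustration, the sketch below builds an `<edit-config>` RPC with Python's standard library. The base namespace and the `<rpc>`, `<edit-config>`, `<target>` and `<config>` elements come from RFC 6241; the payload namespace `urn:example:net-config` and its `interface`, `name` and `mtu` elements are hypothetical stand-ins for a real data model, and the transport layer (for example SSH) is left out.

```python
import xml.etree.ElementTree as ET

NETCONF_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"  # RFC 6241 base namespace
EXAMPLE_NS = "urn:example:net-config"  # hypothetical data-model namespace


def build_edit_config(message_id, config_payload, target="candidate"):
    """Wrap config content in an <edit-config> operation inside an <rpc>."""
    # RPC layer: one request envelope carrying a message-id.
    rpc = ET.Element(f"{{{NETCONF_NS}}}rpc", {"message-id": message_id})
    # Operations layer: edit-config against the chosen datastore.
    edit = ET.SubElement(rpc, f"{{{NETCONF_NS}}}edit-config")
    tgt = ET.SubElement(edit, f"{{{NETCONF_NS}}}target")
    ET.SubElement(tgt, f"{{{NETCONF_NS}}}{target}")
    # Content layer: the configuration data itself.
    cfg = ET.SubElement(edit, f"{{{NETCONF_NS}}}config")
    cfg.append(config_payload)
    return ET.tostring(rpc, encoding="unicode")


interface = ET.Element(f"{{{EXAMPLE_NS}}}interface")
ET.SubElement(interface, f"{{{EXAMPLE_NS}}}name").text = "port-1"
ET.SubElement(interface, f"{{{EXAMPLE_NS}}}mtu").text = "9000"
request = build_edit_config("101", interface)
```

Writing to the `candidate` datastore assumes the device advertises that capability; passing `target="running"` edits the running configuration directly.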

Towards Declarative Transactional Semantics

● Good progress in control plane -> data plane APIs and protocols (OpenFlow, PCE-P, ...)
● Limited progress in management plane -> control plane protocols and APIs
  ○ Netconf (RFC 6241) is promising, need universal adoption
● Very limited progress in standard network data model definition
  ○ YANG as modeling language is promising
  ○ No vendor-neutral data model yet to describe network/device configuration
  ○ No standard network topology model
● No progress in streaming transfer of bulk-variable/data
  ○ SNMP is clunky and not that simple ☺
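"Declarative transactional" can be read as: the operator states the whole desired configuration, and the system either applies all of it or none of it. The sketch below illustrates that reading only; it is not Google's implementation. The `ConfigStore` class, the `check_mtu` validator and the dictionary layout are hypothetical.

```python
import copy


def check_mtu(config):
    """Hypothetical validator: every interface MTU must lie in 576..9216."""
    return [
        f"{name}: mtu {intf['mtu']} out of range"
        for name, intf in config.get("interfaces", {}).items()
        if not 576 <= intf["mtu"] <= 9216
    ]


class ConfigStore:
    """Holds a running config and replaces it only as a whole."""

    def __init__(self, running):
        self.running = copy.deepcopy(running)

    def commit(self, desired, validators=(check_mtu,)):
        # Work on a private candidate so the running config is never half-edited.
        candidate = copy.deepcopy(desired)
        errors = [msg for validate in validators for msg in validate(candidate)]
        if errors:
            # Reject the whole transaction; running config stays untouched.
            raise ValueError("; ".join(errors))
        previous, self.running = self.running, candidate
        return previous  # returned so a caller could roll back


store = ConfigStore({"interfaces": {"port-1": {"mtu": 1500}}})
store.commit({"interfaces": {"port-1": {"mtu": 9000}}})
try:
    store.commit({"interfaces": {"port-1": {"mtu": 64}}})
except ValueError:
    pass  # invalid intent rejected, previous commit still in effect
```

The caller never issues step-by-step commands; it hands over the desired end state and lets validation decide whether the switch happens.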

Towards a Common Network Model

● Network Config model to describe declarative configuration
  ○ Google is working on a rich vendor-neutral network data model described in YANG
● Network Topology model to describe multi-layer network topology (Layer-0 - 7)
  ○ Google made significant progress in structured hierarchical description of multi-layer connected graphs using protocol buffers* (aka protobuf)
● We welcome collaboration in developing common config and topology models as the basis of true software defined network operation

* http://code.google.com/p/protobuf/
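To make "multi-layer connected graph" concrete, here is a small sketch in which each link records the lower-layer links it rides on, so a failure at one layer can be traced upward. The slides say Google's model is written with protocol buffers and do not show its schema; this sketch uses plain Python dataclasses instead, and the `Link` fields and all link names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    name: str
    layer: int               # e.g. 0 = fiber, 1 = wavelength, 3 = IP
    endpoints: tuple
    supported_by: tuple = () # names of lower-layer links this link rides on


def affected_links(links, failed):
    """Names of all links that depend, directly or transitively, on `failed`."""
    affected = set()
    frontier = [failed]
    while frontier:
        current = frontier.pop()
        for link in links:
            if current in link.supported_by and link.name not in affected:
                affected.add(link.name)
                frontier.append(link.name)
    return affected


topology = [
    Link("fiber-1", 0, ("site-a", "site-b")),
    Link("fiber-2", 0, ("site-a", "site-c")),
    Link("wave-1", 1, ("site-a", "site-b"), supported_by=("fiber-1",)),
    Link("wave-2", 1, ("site-a", "site-c"), supported_by=("fiber-2",)),
    Link("ip-1", 3, ("site-a", "site-b"), supported_by=("wave-1",)),
    Link("ip-2", 3, ("site-a", "site-c"), supported_by=("wave-2",)),
]
impact = affected_links(topology, "fiber-1")
```

With the layers tied together in one model, a question such as "which IP links go down if this fiber is cut?" becomes a graph traversal instead of a lookup across separate management systems.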

SDN: Beyond the Network Boundaries

● Goal
  ○ Exchange traffic optimally between provider networks (ASNs)
● Limitations today
  ○ Mutual intents of traffic exchange are expressed via BGP as *hints*
  ○ Suboptimal traffic exchange as the peer networks *guess* optimality
● SDN advantage
  ○ A common network model and a rich pub/sub API, leveraging cloud
  ○ Declarative intent expressed by an ISP:
    ■ e.g. deliver 10.20.30.0/24 to Denver, 10.20.31.0/24 to San Francisco
      do_not_deliver traffic in {Portland, Los Angeles},
      avoid_congestion in topology_A, use augmented_topology_B
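The intent example above can be written down as data rather than as BGP hints. The sketch below is one hypothetical encoding, not a proposed API: the `TrafficIntent` class and its field names are invented here, and only the `deliver` and `do_not_deliver` parts of the slide's example are modeled (the congestion and topology clauses are left out).

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class TrafficIntent:
    deliver: dict = field(default_factory=dict)        # prefix -> preferred metro
    do_not_deliver: set = field(default_factory=set)   # metros to avoid

    def conflicts(self):
        """Prefixes whose preferred metro is also on the avoid list."""
        return sorted(
            prefix for prefix, metro in self.deliver.items()
            if metro in self.do_not_deliver
        )

    def metro_for(self, address):
        """Preferred hand-off metro for an address, by longest-prefix match."""
        addr = ipaddress.ip_address(address)
        matches = [
            (ipaddress.ip_network(prefix), metro)
            for prefix, metro in self.deliver.items()
            if addr in ipaddress.ip_network(prefix)
        ]
        if not matches:
            return None  # the intent says nothing about this destination
        return max(matches, key=lambda match: match[0].prefixlen)[1]


intent = TrafficIntent(
    deliver={"10.20.30.0/24": "Denver", "10.20.31.0/24": "San Francisco"},
    do_not_deliver={"Portland", "Los Angeles"},
)
metro = intent.metro_for("10.20.30.7")
```

Because the intent is explicit data, the receiving network can check it for contradictions and act on it directly instead of guessing what the peer wants.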

SDN: Beyond the Network Boundaries

We welcome collaboration with the ISPs in developing programmatic traffic exchange

Questions? [email protected]
