Inside Azure datacenter architecture with Mark Russinovich - BRK3347

Afternoon, everybody. There we go. How's it going? How's Ignite going? First, how many first-timers at Ignite? First-timers? OK, quite a few of you. Welcome to Ignite. My name is Mark Russinovich. I'm Chief Technology Officer of Microsoft Azure. How many people have heard of Azure? Just kidding, but actually, when I first started doing this talk, that was a real question, so we've come a long way. What I'm going to show you is some of the latest innovations that we have in Azure, give you a little look at maybe some of the things that are coming, and along the way frame that in a very high-level overview of the Azure architecture. So this will take you inside various aspects of the Azure platform.

Here's an agenda for what I've got. It's a ton of material, and I think one of the most difficult things for me about this talk is that I start out with a whole collection of demos, cool features, and internal information that would end up being like three of these sessions in a row, and so I've got to trim it all down to just a few. It's just painful cutting it down, and I'm still going to cover a lot, so I'm going to be moving very quickly through this. I've got ten demos for you, and there are some pretty cool shock-and-awe kind of demos in here.

But first I'm going to give you a little background on our data centers: talk about our data center architecture, our philosophy about how we place data centers, where we're going with data centers, and some future-looking aspects of data centers. Then I'll get into physical networking, which is how we connect the data centers together and what we do within the data center with networking, and talk about logical networking, which is the virtual networking capabilities.
Then I'll go into the server design. I'm going to take you on a history of Azure servers, going all the way back to 2010, which is when I joined Azure, and show you what servers looked like then and the kind of servers that we're deploying now.

Then I'll go into our compute, which includes how we deploy virtual machines as well as containers, because containers are becoming a very critical aspect of our platform, and I'll show you some really cool innovations we've got, including some that we announced this week here at Ignite. I'll talk about Azure Storage, one of the fundamental building blocks of any cloud application. We've done a number of things this year to improve the scalability of our storage platform as well as the performance of our disk platform, so I'll cover that. And then, finally, a brief look at our data platform: I'm going to show you some things that are going to be coming in Azure database that should excite any of you that have large databases in Azure database, and then talk a little bit about Cosmos DB and how that's architected.

So let's go ahead and get started by taking a look inside of our data centers. Like I said, I'm going to start with how we place data centers. We divide the world into geos, or geographies. These used to be large areas of the world, like Western Europe or North America, but more and more a geo is becoming a country, and you'll see our geos now mapping to countries. Our philosophy about where we decide to create one of these geos, and how we define its borders, is that customers are generally comfortable having their data anywhere within the border of that geo.

Within a geo, where we place data centers we call those regions. A region is defined by a latency envelope of roughly two milliseconds, so we place the data centers close enough together that data can be synchronously replicated within a region.
We also go in with a region-pair architecture for almost all of our regions. There's only one exception right now, and that's South America: the Brazilian data center region, which is paired with South Central US. Everywhere else in the world we've gone in with two regions, and the reason we do that is that we want to create the ability for customers to do disaster recovery (DR) from one region to the other. So we separate them by hundreds of miles so that they can tolerate large catastrophic events like a power grid failure or a hurricane, which is top of mind for a lot of people these days.

Also, to ensure high availability of those region pairs, we update software to them independently. We'll do one and then the other, so that if a problem with a software update brings down part of a region, we don't bring down both regions of the pair, and customers can fail away.

One of the asks that we've gotten for a long time is resiliency to more localized kinds of faults within a region, like a data center flood or a power outage in a data center, and we've had some of those in our past. So to let customers protect against those, we've started to roll out something called availability zones. Availability zones take a region and split it into three separate, isolated data center areas that have non-correlated fault modes. By that I mean that they're on independent power, they're on independent water, and they're independent physical facilities. So if there's a flood in a data center, it's only going to impact one of those AZs, not the other two.

The reason we go in with three is that we want customers, and us, to be able to create persistent storage services that replicate data. To replicate data with high durability, you need a quorum, or a vote, on what is the most durable data. That means you need at least three to achieve that, so that you can lose one and still be able to write data with a vote of two of three, with the third one not participating.

Customers will see three zones, though regions could potentially have more than that, and so we're doing a logical-to-physical mapping. Here's a hypothetical region that has four zones, and two customers with different subscriptions might have different mappings of the zones that they see, zones one, two, and three, to the physical zones underneath.
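The two-of-three vote described above can be illustrated with a small sketch. This is not how Azure storage services are implemented; `ZoneReplica` and `quorum_write` are hypothetical names, and the model assumes a write counts as durable once a strict majority of zone replicas acknowledge it.

```python
from dataclasses import dataclass, field


@dataclass
class ZoneReplica:
    """One replica of a storage service in a single availability zone (hypothetical model)."""
    name: str
    healthy: bool = True
    data: dict = field(default_factory=dict)

    def write(self, key, value):
        """Store the value and acknowledge, unless the zone is down."""
        if not self.healthy:
            return False
        self.data[key] = value
        return True


def quorum_write(replicas, key, value):
    """Report success only if a strict majority of replicas acknowledged.

    Simplification: replicas that acknowledged keep the value even when the
    quorum is not reached; a real system would need to roll back or repair.
    """
    acks = sum(1 for replica in replicas if replica.write(key, value))
    return acks >= len(replicas) // 2 + 1


# Three zones, one of them lost: two of three still form a quorum.
three_zones = [ZoneReplica("zone-1"), ZoneReplica("zone-2"), ZoneReplica("zone-3", healthy=False)]
print(quorum_write(three_zones, "blob", "payload"))  # True

# Two zones, one of them lost: one of two is not a majority.
two_zones = [ZoneReplica("zone-1"), ZoneReplica("zone-2", healthy=False)]
print(quorum_write(two_zones, "blob", "payload"))  # False
```

With only two zones, losing either one blocks writes, which is why three is the minimum for this kind of voting.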
Now, that's kind of the logical view of how we place data centers. Here's the physical view. I've got a few pictures here of some of the data centers in our regions. Here are two data centers. They're actually two data centers that each consist of four independent 8-megawatt colos, so a total of 64 megawatts of capacity is what you're looking at. This is in Quincy, Washington, near the Columbia River, and that reflects our goal to power our data centers off renewable energy sources like the Columbia River.

Here's another look, at Cheyenne, Wyoming. You can see the similar data center architecture there. This architecture is called DC 2015, so it's a few years old; we've created a couple of generations of data center since this one. But this is one of our larger regions, the kind of region that we're creating to basically be able to expand indefinitely in terms of the footprint that we can put there: lots of land, lots of power.

Here's a look inside that data center. Here's a hallway. It kind of looks like the Death Star hallway, but that's the way it looks: relatively boring. Actually, what's kind of weird is walking inside one of those things, having contributed to the software that's running on these things, and having no idea what's running on those particular servers. They're just humming away doing who knows what.

Now here's a look at Dublin, Ireland. This is another one of our larger regions. You can see that there are two data centers in this picture, and they're slightly different data center architectures. We've been building out in Dublin for the last eight or so years, so data center architectures have changed several times since then. Here's another look at Dublin. In this picture you can see another data center facility in addition to the two we were just looking at, this time from a different angle: that long one across the top, yet another data center architecture.
We're always iterating on data center design, and the reason we do so is to optimize the efficiency of our servers and try to get PUE (power usage effectiveness) down, which is a measure of the wasted electricity. You want to maximize the use of that electricity so that the servers are cheaper to run. One of the ways we're looking at, with Microsoft Research, for running data centers extremely cheaply is by putting them on the ocean floor. This is a project called Natick. How many people have heard of Natick? So quite a few of you. A couple of years ago there was version one of Natick, which made a story in the New York Times. Microsoft Research took a very small cylinder, put one rack in it, dropped it in about 30 feet of water for a month, and then pulled it up and said, "Servers still working. Cool."

We didn't learn a tremendous amount from that, but what we hope to learn are some aspects of how we can optimize the energy efficiency of the data center and maximize the lifetime of the servers. I'll get to that in a second, after I show you just what we did in this experiment. What you were just seeing there is the tube. It was about 12 meters long and 3 meters wide and had 12 racks of servers in it, and that is then going to be dropped to the ocean floor. You can see this gantry now, which is going to be floated out one kilometer offshore and then drop the cylinder to the ocean floor in about a hundred feet of water. This is off the coast of northern Scotland.

Once it's dropped on the floor, we need to connect it to electricity and networking, and so you can see this is the tube that comes from land that we connect up to the cylinder. These cylinders can go from being manufactured, from us saying we want one, to actually being dropped on the ocean floor in 30 days, and that's exactly what we did in this case. The cylinder was made in France, and then we had it shipped to Scotland and dropped it on the ocean floor 30 days later.

The reason we think this can potentially be interesting, besides the obvious thing that it's cooled by the ocean water (it's liquid cooled: those servers are encased, and we have water from the ocean coming in and cooling the servers), is that this is a lights-out operation. We're not touching the servers. The theory is that if we don't touch the servers, if humans aren't manipulating them, if the temperature remains constant, if the water vapor is extracted from the air and the air is extracted from the cylinder so it's basically a vacuum, and if it's isolated from the outside world, then those servers will last longer. This experiment is going to last about three years.
They're sitting there on the ocean floor, and we expect to have a pretty good view of server reliability after about a year of operating this.

Now, we were concerned about potential impact to the environment. We've done analysis that shows that we could put lots of these on the ocean floor and we're just not going to affect even the local surroundings with the heat that's dissipated from them; there's just so much ability for the ocean to absorb heat. Nevertheless, we do want to understand if there's going to be any impact that we're not aware of, and so what we've done is mount cameras down there. This is actually a feed from one of the cameras. It's pumping images up into an Azure Machine Learning pipeline, which is using a fish classifier that marine biologists are now looking at, and we're trying to track the health of the ecosystem around these tanks. There's actually a live feed on the website for Natick, but if you go there, at this point the camera's covered by seaweed. So we do have to have some people go down there occasionally, not to service the servers but to clean the seaweed off the cameras.

Now let's take a look at physical networking. Our physical networking architecture is designed to connect our regions together over our WAN. You can see that we connect to consumers, and we connect to enterprises through ExpressRoute, direct connects, and internet exchanges. Then we've got a regional network, which I'll talk about, which connects the data centers in a region together and then to the WAN. This networking infrastructure supports all of our networking services, and you can see a bunch of networking services, both logical and physical, there across the bottom. That set of services continues to expand; in fact, we've got a couple I'm going to talk about that we announced here at Ignite.
That dark fiber backbone is one of the largest in the world. In fact, we think it's one of the two largest dark fiber network backbones connecting data centers together on the entire planet, and you can see a map of it here: literally tens of thousands of miles of cable connecting our data centers together.

That includes some very expensive projects that we've gone in on with other companies to lay cables across the Pacific and across the Atlantic. Just very recently, about a year ago, we announced the Marea subsea cable connecting Bilbao, Spain (I think; somewhere on the coast of Spain) and Virginia. Most of the subsea cables in the Atlantic connect the UK and New York, so this provides an extra level of redundancy for our cables that are there. Plus, it's 160 terabits per second of capacity, making it the highest-bandwidth subsea cable crossing the Atlantic. We also have lots of points of presence. We've got 3,500 peering points with partners and 130 points of presence out there, and I'll come back to those points of presence later and talk about some of the things we're doing with them.

One of the ways that we make a region scalable is by not directly connecting the data centers together. If you imagine some of our regions as they continue to expand, they'll have potentially dozens of data centers. If you think of having those data centers all be able to talk to each other, well, you could crossbar them together, but that's going to be a tremendous amount of complexity and a tremendous amount of cable to connect them all together. So what we do is create what are called regional network gateways, which are redundant. You can see there are two physical facilities, and all of the data centers are connected to both of them. Traffic within a region can now stay within that region by going through the regional network gateways between any two of those data centers, and then those regional network gateways are connected to the backbone. That simplifies the topology compared with what we would otherwise have if we directly connected the data centers together.
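To put rough numbers on the cabling argument above, here is a small sketch that compares a full crossbar with a design where every data center connects only to two regional gateways. The function names and the data center counts are illustrative assumptions, and the gateway-to-backbone links are left out.

```python
def full_mesh_links(datacenters: int) -> int:
    """Links needed if every data center connects directly to every other one."""
    return datacenters * (datacenters - 1) // 2


def gateway_links(datacenters: int, gateways: int = 2) -> int:
    """Links needed if every data center connects only to each regional gateway."""
    return datacenters * gateways


# Print: data center count, full-mesh links, gateway links.
for n in (4, 12, 24):
    print(n, full_mesh_links(n), gateway_links(n))
```

The crossbar grows quadratically while the gateway design grows linearly: at 24 data centers that is 276 links against 48, which is the kind of gap the talk is pointing at when a region reaches dozens of data centers.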
Even with that simpler topology, managing the routes and managing the ACLs across all of the physical devices we've got, across all the 150-plus data centers that we've got operational right now, is extremely complicated. The way that we used to do it was the way that traditional IT does it: we had spreadsheets and files that showed the configuration for the routers, the ACLs, and the routes, and then we would go and program the routers when we updated or expanded our capacity or changed our routes. What happens in that case, though... How many of you in here are network engineers? So a few of you. Have you ever misrouted something and black-holed traffic? This is one of the big risks: the system becomes so complicated that nobody understands anymore the effect of a single configuration change, and the wrong one could literally impact all of our regions around the world by sinkholing traffic.
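The failure mode described here, a configuration change that black-holes traffic, is something that can be checked mechanically on a model of the network before the change is deployed. The sketch below assumes a toy model in which each device has a single next hop toward one destination; the device names and the `trace` and `find_failures` helpers are hypothetical and are not Azure tooling.

```python
def trace(next_hop, source, destination):
    """Follow next-hop entries from source toward destination.

    Returns "delivered", "black hole" (a device has no route), or "loop".
    """
    node, seen = source, set()
    while node != destination:
        if node in seen:
            return "loop"
        seen.add(node)
        node = next_hop.get(node)
        if node is None:
            return "black hole"
    return "delivered"


def find_failures(next_hop, sources, destination):
    """Return {source: problem} for every source that cannot reach the destination."""
    failures = {}
    for source in sources:
        result = trace(next_hop, source, destination)
        if result != "delivered":
            failures[source] = result
    return failures


# Current configuration: both data centers reach the backbone through gw1.
current = {"dc1": "gw1", "dc2": "gw1", "gw1": "backbone"}
# Proposed change that accidentally drops gw1's route to the backbone.
proposed = {"dc1": "gw1", "dc2": "gw1"}

print(find_failures(current, ["dc1", "dc2"], "backbone"))   # {}
print(find_failures(proposed, ["dc1", "dc2"], "backbone"))  # both sources black-holed
```

A real network has many destinations, multiple paths, and ACLs on top of routes, so a check like this only shows the shape of the idea: apply the proposed configuration to a model first and verify connectivity there.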

So a few years ago we started a project with Microsoft Research to create an emulation of our entire network, a high-fidelity one that runs in Azure virtual machines, where the actual physical devices are represented with emulators that we can give the exact configuration we'd give to the real devices and have them operate the same way they would, with bug-for-bug compatibility. In this way, any time we make a change to our WAN, our routes, or our ACLs, we can deploy it to this, run the simulation, and make sure that everything is still connected the way we want it to be. We've caught numerous bugs before they've gone out into production, and this is really the only way we could be operating at the scale we're at while keeping the network highly available in the face of configuration changes. You can see we've spent over 12 million core hours simulating changes to our network in the last year.

So that's a look at physical networking. Let's talk about logical. If you take a look at how we architected Azure from a logical perspective, it's useful to contrast that with the traditional enterprise network architecture, which is based on appliances that have the management plane, the control plane, and the data plane all within the same box. When Azure started, that's the way we did things. Within a year or two after Azure launched, we quickly found this to be highly complicated, fragile, and expensive, and it lacked high availability. Those boxes have no scale-out architecture, so you just have to buy bigger and bigger boxes. So what we did was start working on software-defined networking before it became a really hip term. With software-defined networking, you take those layers and blow them out into software programs that run on standard servers.
So we took management: that's Azure management, Azure Resource Manager, which is how you program what you want for your virtual network, your virtual machines, your IP addresses, and your network security groups. Then we implemented controllers that take the high-level management APIs and translate them into the goal state, the kind of state that is desired for a collection of virtual machines. The controllers then take that information and plumb it down into all the distributed devices, in this case servers, that implement the routes and the ACLs for those virtual machines. So this is virtualized networking at massive scale.
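The flow described above, where a controller translates a high-level request into goal state and plumbs it down to the hosts, can be sketched roughly as follows. Everything here is a simplified assumption for illustration: `compute_goal_state`, `HostAgent`, and the rule strings are hypothetical and do not reflect the actual Azure controllers or agents.

```python
from collections import defaultdict


def compute_goal_state(vnet_acls, vm_placement):
    """Translate the high-level request into per-host goal state.

    vnet_acls:    {vnet name: [ACL rules]}, as requested through the management API
    vm_placement: {vm name: (host name, vnet name)}
    Returns {host name: {vm name: [ACL rules]}}.
    """
    goal = defaultdict(dict)
    for vm, (host, vnet) in vm_placement.items():
        goal[host][vm] = list(vnet_acls.get(vnet, []))
    return dict(goal)


class HostAgent:
    """Stand-in for the agent on a server that programs rules for its VMs."""

    def __init__(self):
        self.programmed = {}

    def apply(self, desired):
        """Converge on the desired state; return (changed VMs, removed VMs)."""
        changed = sorted(vm for vm, rules in desired.items() if self.programmed.get(vm) != rules)
        removed = sorted(vm for vm in self.programmed if vm not in desired)
        self.programmed = {vm: list(rules) for vm, rules in desired.items()}
        return changed, removed


acls = {"vnet-a": ["allow tcp/443 inbound", "deny all inbound"]}
placement = {"vm1": ("host-1", "vnet-a"), "vm2": ("host-2", "vnet-a")}
goal = compute_goal_state(acls, placement)

agent = HostAgent()
print(agent.apply(goal["host-1"]))  # (['vm1'], [])
print(agent.apply(goal["host-1"]))  # ([], [])
```

The second call changes nothing because the host already matches the goal state. Describing the desired end state, rather than sending individual commands, is what lets a controller safely resend state to hosts.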

An example of the operations is creating a tenant, or creating a virtual network, that then has ACLs plumbed into it. The control plane takes those ACLs and gives them to the servers on which those virtual machines are running. The key to this flexibility is host-level software-defined networking.

Host-level software-defined networking will make sense in the context of the overall Azure architecture, which you can see in this diagram. At the very top you can see the portal, the command-line interface, and third-party tools. By the way, PowerShell is not in there. It should be, but I just wanted to annoy Jeffrey Snover, so I took it out. Then there's Azure Resource Manager, and you can see container orchestrators that run on top of virtual machines, along with compute, network, and storage services. Then there's the Azure fabric controller, which manages the deployment of virtual machines onto our servers, and a hardware manager, running on top of the hardware infrastructure, responsible for managing the health and the provisioning of those servers.

If we just blow out the network parts of this, you can see that there's a network resource provider that plugs into Azure Resource Manager. It's responsible for implementing the network APIs, and it sits side by side with the compute resource provider, which implements the virtual machine APIs. I'll talk about how they coordinate in a little bit. The network resource provider has components underneath it that manage those controllers that I talked about in the SDN slide, like the regional network manager, which then talks to the network state manager; the directory service, which keeps track of the virtual network physical-to-customer address mappings; and a software load balancer, which does the load balancing.
If you take a look inside one of our servers, it roughly looks like this. You've got what we call containers, which are virtual machines, and those virtual machines can have containers within them that are now considered first-class objects. There's a node agent sitting on the host, which manages the creation of those virtual machines, and then there are a number of other agents, including network agents like the network manager agent and the load balancer agent. Those program something called the Virtual Filtering Platform, which is responsible for encapsulating and decapsulating traffic and applying routes and firewall rules.

You can see what we do when we deploy a virtual machine: commands come down to deploy the virtual machine, and then the network stack finds out where that virtual machine is and talks to its agent to wire that virtual machine up to the network.

A number of other logical services that we're building out as part of the overall logical network capabilities include something called the Azure Front Door service, which we announced here at Ignite. I talked about those 130 points of presence, and we're launching Azure Front Door service here at Ignite in about 70 of those. With this, you can terminate SSL traffic at our front doors, and you can do global HTTP load balancing through our front doors. It's integrated with App Service and Azure Web Apps, so if you're using those, they're using our front doors for application acceleration as well. We've been using Azure Front Door inside Microsoft across our properties, like Dynamics 365, Office 365, Bing, and Xbox Live, for the last five years, so this is us bringing that internal service out and making it available for customers to use.

Another service that we announced this week is something called Azure Virtual WAN. Azure Virtual WAN is going after the problem where you've got lots of different users running around: some of them are located near your corporate data center, some of them are in branch offices, and some of them are out using mobile devices. You want control over their traffic, you want to be able to filter their traffic, and you want to be able to do security inspection on their traffic, but getting a handle on it is hard. You could make them all VPN and send all the traffic back into your corporate network, but then you're overloading your corporate network, and your firewall devices there get overloaded.
So what we're doing with Virtual WAN is making it so you can very easily plug your branch offices and your VPN gateways into Azure, into a backbone, the Virtual WAN backbone, through hubs, so that all the traffic goes up and gets routed to the right places in the cloud.

Related to that is the general availability, which we announced here at Ignite, of the firewall service, another logical networking service. In this case, what happens behind the scenes when you create a virtual network firewall is that it's launching virtual machines. It's kind of serverless, because you don't see them: we're launching them, and we're scaling them out based on your traffic load and the level of service that you've requested. This firewall can implement level 2 all the way up to level 7 policies, so you can use FQDN kinds of rules to block traffic or route traffic, and you can use IP ACLs as well. You can have the firewall's IP addresses be allocated from a specific subnet, making it very easy to set up firewall rules on your own local corporate firewalls that will scale with your cloud deployments.

So I mentioned that virtual machines are first-class citizens in Azure networking, obviously, but up to this point containers have not been. We're working on a project here called Swift, which we're rolling out right now for our own services and will make available to third parties, that lets an orchestrator program the Azure virtual network directly. The idea is that you've got a bunch of containers that you're deploying to a cluster, and they belong to different applications that should be in different virtual networks. So how do you do that? Because up to now, a virtual machine is in a single virtual network. Well, with Swift, what you can do is delegate management of a subnet to an orchestrator.
The delegation covers the orchestrator and the set of virtual machines the orchestrator is managing. At that point, the orchestrator can call into the Azure network agent on the server and assign IP addresses from that subnet range to particular VMs. Then, within the VM, it runs a container network plug-in that takes that IP address and exposes it to the container, or maps it to whatever it is the container sees.
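The delegation idea above can be sketched with Python's standard `ipaddress` module. This is only an illustration of an orchestrator handing out addresses from a subnet it has been given; the `DelegatedSubnet` class, the address range, and the choice to keep a container's address when it is placed on another VM are all assumptions, not the Swift implementation. Real platforms also typically reserve some addresses in each subnet, which this sketch ignores.

```python
import ipaddress


class DelegatedSubnet:
    """Hypothetical orchestrator-side bookkeeping for a delegated subnet."""

    def __init__(self, cidr):
        self._free = list(ipaddress.ip_network(cidr).hosts())
        self.assigned = {}  # container name -> (VM name, IP address)

    def place(self, container, vm):
        """Record that a container runs on a VM and return the IP to plumb there.

        Assumption: a container keeps its address if it is placed again
        on a different VM.
        """
        if container in self.assigned:
            _, ip = self.assigned[container]
        else:
            if not self._free:
                raise RuntimeError("delegated subnet is exhausted")
            ip = self._free.pop(0)
        self.assigned[container] = (vm, ip)
        return ip


subnet = DelegatedSubnet("10.0.1.0/29")
print(subnet.place("web", "worker-vm-1"))   # 10.0.1.1
print(subnet.place("api", "worker-vm-1"))   # 10.0.1.2
print(subnet.place("web", "worker-vm-2"))   # 10.0.1.1 again, now on another VM
```

In this model the value returned by `place` is what the orchestrator would pass to the host's network agent so that the address gets wired up on the VM where the container is running.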

In this way, as a container moves from virtual machine to virtual machine, the orchestrator knows which virtual machine it's on and can tell the agent on the host (here's the picture) which IP addresses from the subnet it's managing. This delegated network controller here is on the master nodes, and it will tell the agents that are deploying containers which IP addresses they should plumb into the host, to say, "Hey, this container's here, and the IP address should be this." That might be kind of hard to follow, so let's go ahead and take a look at a demo of that in action.

What I'm going to do here on the demo machine: I've got a Kubernetes cluster, and I'm going to show you that the cluster consists of two nodes, a worker and a master. So here's the worker, and what I'm going to do for this worker now is deploy some pods to it. But let's first show that there are no pods running on this. Get pods: no pods running. So I'm going to run a little script. What this script is doing is delegating a subnet to the Kubernetes master and saying, "You can use the IP addresses from this subnet to assign them to containers on that worker node, that worker VM." In fact, we're assigning two virtual networks to Kubernetes. So now, with two virtual networks, we can start to deploy containers into one virtual network or the other, and they're isolated from one another just in the same way two virtual machines would be isolated from one another, even though these containers happen to be running on the same virtual machine.

To demonstrate that in action, let's see if this thing has come up with IP addresses. And it does. They both have the same IP address even though they're both running, and these are real Azure IP addresses that are assigned to virtual networks, so they're accessible from other virtual machines talking to those IP addresses.
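Two containers on one VM can share an IP address without conflict if an endpoint is identified by its virtual network together with its address, rather than by the address alone. The sketch below illustrates that idea only; `HostEndpointTable`, the network names, and the address are hypothetical and say nothing about how the Azure host networking stack is actually built.

```python
class HostEndpointTable:
    """Hypothetical per-host table of container endpoints, keyed by (vnet, IP)."""

    def __init__(self):
        self._endpoints = {}

    def attach(self, vnet, ip, container):
        """Register a container endpoint; the same IP may recur in other vnets."""
        key = (vnet, ip)
        if key in self._endpoints:
            raise ValueError(f"{ip} is already in use in {vnet}")
        self._endpoints[key] = container

    def deliver(self, source_vnet, dest_ip):
        """Return the container reachable at dest_ip from source_vnet, or None."""
        return self._endpoints.get((source_vnet, dest_ip))


table = HostEndpointTable()
table.attach("vnet-1", "10.0.0.4", "container-with-web-server")
table.attach("vnet-2", "10.0.0.4", "container-without-web-server")

print(table.deliver("vnet-1", "10.0.0.4"))  # container-with-web-server
print(table.deliver("vnet-2", "10.0.0.4"))  # container-without-web-server
print(table.deliver("vnet-3", "10.0.0.4"))  # None
```

A client in one virtual network only ever resolves the address to the container in its own network, which matches the behavior the demo sets out to show.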
So what I'm going to do is launch a web server on one of them.

I'm going to actually kubectl exec -it into this first container. Whoops. Copy, paste. Bash. And to show you that this is not a canned demo, what message do you want me to write? "Hello Azure." Let me be correct here: "Hello, Azure." Oh, "Hello, Azure" with a capital. OK. Now let's start up a web server. 23... 0.15, I think that's it. Let's put it on port 80. So now that web server is running inside one of these containers, which is mapped to one of those virtual networks. Now, from another virtual machine that's outside of these, on a different node from the Kubernetes worker, and that's part of the same virtual network that web server is in, I'm going to see if I can get that file. And there it goes.

Now, this other virtual machine is in that second subnet, the one with the container that doesn't have the web server running in it. You can see that I can ping that second container, and to prove that it's actually a different container, even though it's got the same IP address, in a different virtual network... There: it's not running the web server. So what we've just done is show two containers that have the same IP address, exposed to two different virtual networks, in the same virtual machine. This now allows containers to have the same network policies applied to them as virtual machines do: first-class citizens in the Azure network.

So let's take a look inside of Azure servers now. Again, I said I was going to take you on a walk down memory lane, starting with our Gen 2 machine. You can see I've got ratios of cores and RAM there at the top to give you an idea of how we've grown over time, because in the original days we started with kind of scale-out commodity servers. So this server has 32 gigabytes of RAM in it, it's got a one-gigabit NIC in it, and it's got no SSD in it. It was an awesome machine back in 2010.
Then we introduced our Gen 3 machines. They got slightly larger. We introduced 10-gigabit networking, then 40-gigabit networking, and started to introduce SSDs. Then we introduced a scale-up machine. We called this one Godzilla, because at the time it was one of the largest servers in the public cloud. It had 512 gigabytes of RAM in it, so you could have a virtual machine with roughly 450 gigabytes of RAM in it. This was to run scale-up SAP HANA workloads and other in-memory databases; we had lots of enterprises starting to migrate HANA.

Then we introduced our Gen 5 general-purpose machine, and you can see it's got half the memory, or more than half the memory, of our Godzilla, so we were starting to get bigger general-purpose machines. Then here's our Gen 6, which we're in the process of deploying now.

But then we had customers coming to us and saying, "We've got even bigger SAP HANA workloads. They don't fit in your Godzilla. What are you going to do for us?" So we introduced this. This is called the Beast. It's got four terabytes of RAM in it. You're impressed at that? Well, they came and said, "OK, that's good, but we've got even bigger SAP workloads." So we've just announced this: Beast V2. It's got 12 terabytes of RAM in it. Now, you might ask why it's called Beast V2. Why didn't we call it Son of Beast? Son of a Beast, I guess, would be better. Or Beauty and the Beast; turns out there's some kind of movie that has that name. Anyway, we just came up with Beast V2. But let me show you Beast V2. Oh, by the way, we also have Sphere. That's another one.

OK, so here's Beast. I'm going to show you one of my favorite tools and show you just how much RAM this thing has in it, for real. All right, whoever wrote this... it's cut off. It's got a negative number here. This is a sad, sad day. I've got to run this tool instead. So how much RAM does this have in it? There it is, right there.
So Notepad is amazingly fast on this thing, and you can actually open seven Chrome tabs in it without any problem.

OK. Now, we've also got special-purpose servers. Here's an HPC SKU. We started to introduce GPUs, and one of our first GPU series had a K80 processor for general-purpose compute on GPUs. When we introduced high-performance computing VMs, they had InfiniBand networking; in fact, they've got 100-gigabit InfiniBand networking connecting them together.

introduced high-performance computing SKUs that are compute-intensive, so they run at higher frequency and have more cores. And then we started to introduce deep learning SKUs, so the ND v2. This one has eight NVIDIA V100s cross-linked together with NVLink in a 4U server box, so those things consume quite a bit of electricity.

And then here's the Lv2, which we're also announcing. This is a high-density SSD SKU for database-type applications, like Cassandra or MongoDB, that store data locally on the SSDs and want high performance. We provide high performance with this SKU using something that's part of the server called NVMe Direct. We expose the SSDs' I/O queues directly into the virtual machines, mapping them directly into the virtual machine, so the VM can directly access the I/O queues, putting commands into them and reading the results out of them. But we put filters on the command queue so that the VM cannot do something like flash the firmware, or other things that it shouldn't be doing to the SSD. So this provides security while giving the VM native SSD performance.

Let's just see what kind of performance we can get from one of these things. So this tab is an NVMe Direct VM. We're actually looking at an Azure server host here, with Task Manager running on the host. This is the VM, with an application running in it. Then I'm going to launch... you can see Perfmon here, and you can see that we've got zero IOPS going right now. But I'm going to press Enter to run this little script, which is going to start to hammer the disk. And this is going to settle here at about 3.7 million IOPS off of one disk. Sorry, one server; there's a bunch of disks in here. So this is actually cloud-leading. Right now there's no other cloud server
that delivers IOPS this high off of local SSDs. And you can see evidence that it's mapped directly into the VM, because this is the host Task Manager: no activity on the host. Here's inside the VM that's processing all those I/Os; it's burning a lot of cores to do that. So that's a look at NVMe.

Going into Azure compute now, let's take a look at the architecture again, but this time we're going to focus in on the compute stack. Here you can see the different layers in the compute stack. You can see the global Azure Resource Manager, and resource providers that operate at the regional level. I've already talked about NRP, the network resource provider; we're focused on CRP, the compute resource provider, here. But you can see all these other regional controllers: RNM, the regional network manager; the regional directory service; the SLB service for the region. All of those are going to be coordinating with cluster-level services, which have a smaller blast radius, managing a small subset of the servers, and then talking to the nodes. And then when you deploy a virtual machine, this is the kind of flow of information across all these different components, because they're all orchestrating to get that data down to the servers so you get your virtual machine up with a disk attached to it and networking. So we're going to spend the next 10 minutes or so going through this. I'm just joking, we're not going to do that. Do you want to? Yeah? All right, we'll skip that.

But we've been doing a bunch of things to make virtual machines easier to manage, and one of the things we're excited to announce this week at Ignite is giving you serial console access to your virtual machines. So, if you're not impressed... right, okay. Hold your applause, hold your applause. I'll show you this in action. Okay, so we go back here and go to the browser. I'm going to refresh these.
And we're going to come back and I'll show you what those are in a second. But what I want to show you here is I've got a Linux virtual machine here running in Azure. And you can see that it's got two network interfaces on it. One of them is the Ethernet network interface, which is connected to the Azure network, and I'm going to do something that's going to simulate a problem that

will usually come because of some corruption, some bug. But... actually, I'm glad I didn't do that, because I wanted to ping this virtual machine first to show you. So I'm pinging that virtual machine. I mistyped that. And what this command does is take out that Ethernet interface. So the second I do this, those pings stop, and at this point your VM is completely inaccessible. Or is it? If we go back to the serial console, I'm SSH'd into my VM through the serial port on the virtual machine, by leveraging an agent in the Azure host, the server. So if I do this... boom, I just fixed the network.

Now, that's not just for Linux; it's for Windows too. I'm going to crash Windows here using another one of my favorite tools. This is a tool called NotMyFault that I wrote for Windows Internals, and you can crash the system in lots of cool ways. I'm going to crash it in a simple way. I've just pressed the Crash button, and this thing is gone. So now we want to see what happened to this. If we go to the serial console for my Windows machine, you can see what happened: it crashed in myfault.sys. This information is spit out at the crash through the serial port, and now we can see it. And also, because we're capturing screenshots of the VM into Azure Storage, we can show you that in the portal too, so you can see here what happened with that VM. So that's a look at serial console access.

So another way that we're advancing the state of the art with containers, besides giving them first-class citizenship in the Azure network, is to take them serverless. And this is something we announced here; you might have seen Scott Guthrie talk about it, or Corey talk about it. It's Virtual Kubelet. What this is, is effectively an agent that plugs into a Kubernetes cluster, and behind it is whatever it wants. It could
be a cluster of another type of compute, but to Kubernetes it just presents itself as a node, another server, one that has effectively infinite capacity. We created a Virtual Kubelet that connects with Azure Container Instances, which is our serverless container service. You can just launch a Docker container into ACI. You don't have to create a virtual machine ahead of time, and the container will just launch into a serverless infrastructure. And so a Virtual Kubelet on top of ACI effectively makes Kubernetes serverless. If you had a Kubernetes cluster with zero nodes in it except for a Virtual Kubelet backed by ACI, you could start deploying pods to it and they would automatically scale out onto ACI.

We're also taking Service Fabric serverless. How many people have heard of Service Fabric, out of curiosity? So quite a few of you. This is our own microservices orchestrator. It supports higher-level programming models that support stateful microservices. We've built a tremendous amount of not just Azure but Microsoft on top of Service Fabric, and you can see a list of some of the services here. Those core Azure resource providers, the compute, network, and storage, are all built on Service Fabric. Cortana. Cosmos DB is built on Service Fabric. Skype for Business, Service Bus, Event Hubs, Event Grid, Power BI, Azure DB, Intune, and the list goes on and on, all of these operating at massive scale for years on top of Service Fabric. We made this available as open source for Linux and Windows. We've made it available as a service in Azure, so you can launch a Service Fabric cluster. But we're taking it serverless: we're announcing this week a preview of Service Fabric Mesh. With

Service Fabric Mesh, we manage Service Fabric clusters in the background. You launch Service Fabric apps to the Mesh service. You don't specify virtual machines; you just specify them as containerized microservices, and then we launch them and manage the virtual machines underneath them, and do all the wiring and load balancing and everything else that's necessary to keep them running in a highly available way.

I'm going to show you a quick demo of Service Fabric Mesh now, and show you just the kind of scale capabilities that you can get with this. Oh, actually, that's in a few minutes. That's a different demo. You're free to stand out here if you want. No, I'm just... yeah, it's going to be the next demo after this, sorry.

So here I've got a Service Fabric Mesh application, and it consists of a web front end plus three microservices that are the same. They're fireworks microservices, and they each have a replica count of one. Here's one of them: replica count of one. You can see we pass in which color the fireworks should be as environment variables, so that's the red one. If we scroll down, we're going to see the green one, and you can see a replica count of one, and then finally a blue one. And what this is doing is this, right here. That's what this app is doing: just firing a few every second, red, blue, green, kind of randomly. We want to get something a little more exciting than this. So what we're going to do is launch this version of it, where all we did was take that application and set the replica count to 500. And I'm going to launch it right now as an update to that existing one, which is running on Service Fabric Mesh. So in somewhere around 30 seconds or so, we're going to start to see this thing scale out. Should be coming along. There we go, we're scaling out the number of replicas.
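To make the update in this demo concrete, here is a minimal sketch of an application description in which the only change is the replica count. The dictionary layout, the field names (`replica_count`, `env`, `COLOR`), and the helper `scale_services` are hypothetical and chosen for illustration; they are not the Service Fabric Mesh resource schema, and applying the count of 500 to each of the three fireworks services is an assumption about the demo.

```python
import copy

# Hypothetical application description. Field names are illustrative only,
# not the actual Service Fabric Mesh resource format.
fireworks_app = {
    "name": "fireworks",
    "services": [
        {"name": "web", "replica_count": 1, "env": {}},
        {"name": "red", "replica_count": 1, "env": {"COLOR": "red"}},
        {"name": "green", "replica_count": 1, "env": {"COLOR": "green"}},
        {"name": "blue", "replica_count": 1, "env": {"COLOR": "blue"}},
    ],
}


def scale_services(app, names, replica_count):
    """Return a copy of `app` with `replica_count` set on the named services."""
    updated = copy.deepcopy(app)  # leave the running description untouched
    for service in updated["services"]:
        if service["name"] in names:
            service["replica_count"] = replica_count
    return updated


# The "update" is the same application with only the replica counts changed.
scaled_app = scale_services(fireworks_app, {"red", "green", "blue"}, 500)
total_replicas = sum(s["replica_count"] for s in scaled_app["services"])
print(total_replicas)
```

Nothing in the description names a virtual machine or a node count, which is the point being made in the demo: the platform decides where the replicas run.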
It's serverless like this: we didn't have to go say how many virtual machines we want, and we're only paying for those containers as they run. If we shut them down, we're not paying anything more for any compute, memory, or anything else. So this is the power of serverless right there.

So the other thing we're doing with containers, and microservices in general, is taking them to the edge. You've heard about intelligent cloud, intelligent edge, our vision for computing that spans both, and we want the computing to be consistent across both. So that Service Fabric application: we want to make it possible for you to take that and deploy it on Raspberry Pi-class devices. Now, you probably wouldn't really want to do that, but the point is that we want to make it easy for you to take a microservices app that can run in the cloud and also run it on on-prem edge devices, or devices out in the real world, with the same set of tooling, and orchestration of the launching and updating of those applications through a consistent cloud experience.

So what I'm going to show you is deploying a highly available cluster to a group of edge devices. I've got... okay, you can come out now. I've got this set up up here. Now, this is a very expensive piece of IoT machinery. It's extremely expensive and very dangerous. It's got a spinning wheel on it that, you know, if I put my hand in it, I would get in trouble. And you can see that it's got a sensor that is preventing me from hurting myself. Now, if I go to the demo machine, I'll show you that I've actually got that app running up here in the cloud, and this should be showing me that telemetry coming from that thing. It's not. What we should be seeing here is data spewing out from that device here in this cloud-based portal, with these applications and the machine learning algorithms all running up here in the cloud. But
what I'm going to do is push that application down to the devices. So if I go to the IoT devices here, what I'm going to do is create this highly available cluster out of the collection of those three devices.

For some reason I'm not seeing those devices. Hmm. Seems like I've got a network problem, apparently, that's preventing this demo from working. And it is connected to the network, so I should be seeing the devices here, but I'm not for some reason. So this is a pre... oh, here we go. Sorry, they're here. There they are. I was looking in the wrong place. So what I'm going to do is select them all and then say Create Edge Cluster. I'm going to call this Ignite edge cluster. This is going to create a highly available cluster. You can see there it's provisioning, so it's deploying a cluster to those three IoT devices that are managing, that are connected to, that very expensive piece of machinery. And now I'm going to deploy a bunch of apps too. So: run web app, run on Ignite, this edge. So I'm deploying the machine learning protection app, which is using vision recognition to determine if there's some object obstructing in front of the device, and it's going to turn it off. So this will finish deploying down to those devices, hopefully, in a second here.

And the whole idea here is I want this thing to be highly available, because this machine can't be offline, but it needs to be safe too. So I need to run that machine learning algorithm on that device, and I need to have it be able to tolerate failures of components like those Raspberry Pis. It's taking a while to deploy, so this might be lagging. No way. Let me do a refresh here. And yeah, okay, they're running. Their status is Running. So let's go back now, and what I'm going to do to demonstrate this high availability is unplug this device, and the network hub this device is connected to, from the network. So I've just unplugged it right there. And not only that, but once I've done that, if I come back over to the machine, I've got another view of the portal running on this system that is talking to those devices, and you can see that
it stops collecting telemetry. Wait, no. I don't know what just happened; that's not what I meant to do. Oh, that's the wrong... oh, here we go. So you can see that the machine learning app is running on edge device number three, and so what I'm going to do is go unplug edge device number three. And at this point we should see this controller, this monitor, see that that device is offline. We're going to see an automatic failover that the highly available cluster is doing. We can have both Kubernetes and Service Fabric apps, you can see, on this thing. And there, the machine learning app moved, and you can see that I'm still safe. So that is a disconnected, highly available deployment from the cloud, using microservices that are also consistent across the cloud and on-prem. So that's where we're taking that. Thanks.

Now, one of the other things we're doing with compute is that we want to protect your data. If you came to my session earlier in Microsoft Mechanics, you saw me talk about something called confidential computing. The idea is that there are lots of threats to your data, and there are even potentially different kinds of threats when you put your data in the cloud. There are

third parties that might request access to your data. There are hackers that might breach the infrastructure and get access to your data. And so we take lots of steps in defense. We spend a billion dollars on cybersecurity, but we also implement capabilities like encryption at rest for our services and encryption in transit. Applications can also encrypt their own data in transit. But what's been missing up to now is protecting the data while it's in use. And the cloud is not very useful unless you can process that data to get insights out of it, with machine learning and analytics, and if you can't do that while fully protecting it, then you're missing one of these three.

So with confidential computing, which is based off of this black-box kind of technology called trusted execution environments, you can stick your data in a TEE, and then it's isolated from everything outside of it. There are two types of TEE: software and hardware. An example of a hardware-based one is Intel SGX, where the processor itself creates these black boxes. They're encrypted portions of RAM where the CPU is in a special mode, and nothing can get into that mode except for the code that is intended to run in that enclave. What you can do then is get an attestation from the enclave: what code are you running? It will produce a quote, signed by the processor, that says, "This is the code I'm running." Then, from the outside, you can verify that that truly was an Intel SGX processor by looking at the public key for that processor, and then verify that it is the code that you trust. Once it is, you establish a secure channel with that code and hand it keys, so that it can decrypt data into its enclave and then process it safely, protected by this black box. And the kinds of scenarios... so actually, for confidential computing, we've

just announced the availability of the DC series, which are SGX-enabled virtual machines. You can go launch them right now in Europe and the US. And we're also announcing the release of an SDK to help you write confidential applications. One of the coolest scenarios that this opens up is something called multi-party machine learning, and what I'm going to show you is the power of multi-party machine learning here. Oh, you know what? Proof that that is a real demo. Yep. Okay, so, connecting back to our demo VM. And now we're going to go into a virtual machine that is one of those DC-series virtual machines. And that DC-series virtual machine is right here.

So I've got two hospitals. They're represented by different desktops, and they both have breast cancer training data that they don't want to share with one another. But they'd like to combine them to get deeper insights. So the first thing they're going to do is encrypt their data to the enclave. So I've just encrypted it, and then they're going to upload it to the DC-series VM. And we're going to do that with user B as well. Here's user B's desktop. We're going to drag this, encrypt it, and upload it. There. So that's user B. And this data is encrypted so that only that machine learning enclave can see it. Now it's doing a training run over it. We're going to switch back to hospital A, because the goal here is to get back a trained machine learning model. So we're going to go to Model, and we're going to download it. Now we're going to go evaluate the model. We do an evaluation on the local model, and we have accuracy of about 84 percent. But the model we just downloaded from the cloud, from that combined-data-set machine learning with the data from the other hospital, you can see it's at 97 percent accuracy. At no point did either hospital expose its data to the other one, or to anything else, including the cloud. Azure
administrators don't have access to it. The hypervisor doesn't have access to it. The other hospital, of course, doesn't have access to it. And so that's the promise of confidential computing.

So let's talk about storage now. The Azure Storage architecture is a tiered architecture. You have APIs on the front end, and those API sets continue to grow. Blob was one of the first ones that we had. We have Azure Queues, also built on top of Azure Storage, Azure Files, and then we've got SMB as well. There are load balancers on the front end, load balancing traffic and sending those requests to front ends. There's a table tier, or partition layer; this is where the data is partitioned, and the partitions map to the stream layer, or extent layer, down here at the bottom, this distributed file service across the servers. So a particular blob might be broken up into chunks that are on a bunch of different servers, and every one of those chunks is replicated three times for high availability. And then if you use a GRS storage account, which allows you to replicate your data into another region, then that's copied asynchronously to another stamp in that other region.

We just recently announced Azure Data Lake Storage Gen2, which is a massive scale-out data lake service. It's built on top of Azure Storage, but now it includes an HDFS front end. The Gen2 API and the Blob API support hierarchical file systems; this is something that we added on top of Azure Storage. And the reason that we could scale out the way we did to support these data lake-type scenarios, which are massive files, massive accounts of lots of big files, is by re-architecting storage to take what used to be vertical monoliths of a storage stamp, where the front end was mapped to the partition table, which mapped to the stream servers, all on the same set of servers, and
separate them out into different software tiers that can scale independently. And you can see the kind of scale this gives us. For a blob, the throughput limit up until this was 60 megabytes per second. With this, it's 50 gigabytes per second. And so you can now upload a file that used to take 15 hours in just a few minutes.
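The upload comparison can be checked with a little arithmetic. The sketch below uses only the numbers quoted in the talk (60 megabytes per second, 15 hours, and a new limit of 50), and works the new limit both ways, as gigabytes per second and as gigabits per second, since the transcript alone does not settle which unit was meant. Decimal units (1 GB = 10^9 bytes) are an assumption.

```python
MB = 10**6  # decimal megabyte, an assumption
GB = 10**9  # decimal gigabyte, an assumption


def transfer_seconds(size_bytes, rate_bytes_per_s):
    """Time to move `size_bytes` at a sustained rate, ignoring overheads."""
    return size_bytes / rate_bytes_per_s


# A file that takes 15 hours at the old 60 MB/s per-blob limit.
old_rate = 60 * MB
file_size = old_rate * 15 * 3600

# The new limit, read two ways.
t_if_gigabytes = transfer_seconds(file_size, 50 * GB)      # 50 gigabytes/s
t_if_gigabits = transfer_seconds(file_size, 50 * GB / 8)   # 50 gigabits/s

print(f"file size: {file_size / 10**12:.2f} TB")        # file size: 3.24 TB
print(f"at 50 GB/s: {t_if_gigabytes:.1f} s")            # at 50 GB/s: 64.8 s
print(f"at 50 Gb/s: {t_if_gigabits / 60:.1f} min")      # at 50 Gb/s: 8.6 min
```

Either way, a 15-hour upload of roughly 3.24 TB drops to about a minute or to under ten minutes, which matches the "just a few minutes" claim in order of magnitude.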

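The storage description above says a blob may be broken into chunks spread over many servers, with each chunk replicated three times. A minimal sketch of that idea follows. The chunk size, the server names, and the round-robin placement rule are all assumptions made for illustration; the talk does not describe the actual placement policy of the stream layer.

```python
def split_into_chunks(blob_size, chunk_size):
    """Return (offset, length) pairs that cover a blob of `blob_size` bytes."""
    return [
        (offset, min(chunk_size, blob_size - offset))
        for offset in range(0, blob_size, chunk_size)
    ]


def place_replicas(num_chunks, servers, copies=3):
    """Assign each chunk to `copies` distinct servers, round-robin.

    Round-robin is a stand-in policy chosen for simplicity, not the real one.
    """
    if len(servers) < copies:
        raise ValueError("need at least as many servers as copies")
    return [
        [servers[(chunk + k) % len(servers)] for k in range(copies)]
        for chunk in range(num_chunks)
    ]


# Example: a 10-byte "blob" in 4-byte chunks over five hypothetical servers.
chunks = split_into_chunks(blob_size=10, chunk_size=4)
placement = place_replicas(len(chunks), ["s0", "s1", "s2", "s3", "s4"])
for (offset, length), replicas in zip(chunks, placement):
    print(offset, length, replicas)
```

Because each chunk's three copies land on different servers, losing any single server leaves at least two copies of every chunk readable.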
We've also expanded our SSD offerings. We have standard hard disk as an offering, we've got Premium SSD as an offering, and we've just recently introduced Standard SSD, which has performance that's more like hard disk but with much lower latency. So it's 500 IOPS and 60 megabytes per second, like the hard disk offering, but it's single-digit latency rather than the multiple-digit latency that you'd get from a hard disk.

And then we've also introduced something called Ultra SSD, which we just announced here at Ignite. This is our next-generation disk storage, a purpose-built block storage service. It's codenamed Direct Drive, because the hosts that the disks are mounted to have information about which servers in the Direct Drive cluster have the relevant pieces of the files, of the VHDs, and so they can go talk directly to them to write and read data, rather than going through the load balancers in the front end and the partition table servers that even Premium SSD traffic has to go through.

So how many people saw Corey's Ultra SSD demo? Raise your hands. All right, so a lot of you haven't seen it, but I'll show it to you really quickly here. So here I've got an Ultra SSD server. By the way, these support up to 64-terabyte drives. I'm going to launch this tool called Iometer, which hammers the disk as fast as it can. Now, this particular virtual machine is provisioned for 160,000 IOPS, which is the most you can get out of a single disk in any cloud. And to show you how consistent that performance is, here's Iometer's IOPS. You can see it's hovering right around 160,000, and the latency is a millisecond or less. Incidentally, that's what Corey showed, which is pretty impressive. But how about this: I've got a special server with
the special disks on it that will show you that this technology is capable of going higher. We're working on making even bigger, faster disks, and this is a preview of that that I wanted to show you. I asked the team, begged them, so that I could one-up Corey. So let's fire this one. That's 250,000 IOPS, and the latency is still about one millisecond. So we're going even further with our disks than what I just showed you.

Okay, so now I want to take you inside of our data platforms and talk a little bit about the database architectures that we've got. When you take a look at many of our data services, including Azure SQL Database, Azure SQL Data Warehouse, and Cosmos DB, they have this similar type of high-level architecture, where you go from regions, and each region consists of multiple data centers, like I already talked about. There's the concept of stamps, which are deployment groups of a number of servers, typically on the order of thousands of servers. Those stamps have what are called fault domains, and a fault domain is a group of servers that have a common single point of failure. In our data centers that would be a top-of-rack router or a power distribution unit. If one of those fails, you lose those servers, and so we need to make sure that those storage services understand fault domains so they can spread data out among them. I mentioned three copies of data. Well, actually, with a premium Azure database you get four copies of data, spread out across different fault domains, so you can lose two servers and still be able to write to your Azure database.

Now, within a machine, you can see that there's a bunch of agents. There are containers that are implementing those database instances. You can see there are resource governors. There's
a transport layer that allows that database to connect to the outside world. There's admission control that throttles traffic on the way in, so that the

user gets the throughput that they provisioned. And then there's the database engine itself, and the database engine of course has a query processor, language runtimes, and so on. That's the overall architecture.

Now, how many of you have used Azure SQL Database? Raise your hands. So quite a few of you. How many of you have ever wanted to raise the request units on your database, only to experience the fact that you've got downtime as the database gets resized? Anybody? It's just one person. So I'm talking to you then; everybody else is going to ignore this section. Well, I'll explain the kind of experience that the few people that raised their hands have had. Up to now, what we've done is co-locate the database with the data that it manages on the same server. There are multiple replicas; like I mentioned, in the case of premium there are four, so there are four servers, but one of them is the write master out of those four. When it writes, it writes to its local disk, and it also sends the data to the other replicas for them to write to their local disks. But you can't scale up the compute independent of the storage, because there's a one-to-one mapping between these replicas and the data on their local disks.

So what we've done is re-architect Azure database to break that connection between compute and storage. And this is actually a trend across data services in our cloud in general: to separate compute and data, which allows you to independently scale the compute and the data. In fact, you can shut down the compute and still preserve the data, something you couldn't do with this previous architecture. Here's a diagram of this new architecture. We call it Socrates, codenamed Socrates, but the official name that we're thinking of is actually Azure SQL Database flexus scale. And you can see that there's the compute tier at the top. You
can see that there are four replicas, like I mentioned. One of them is the write master, and that's the one on the left; you can read and write to it. The other ones you can read from, so it scales in reads as well, and you can scale out this tier, so you can have any number of read replicas to scale out the amount of data queries that you can do against

the database.

Now, the way that we get this separation is by storing the log on a Premium Storage SSD. The log gets spit out to the SSD, and that log is then read by the storage stamps, which then do the commits, the serialization of the transactions, in the data store. So at any failure, at any point, you either have the logs or you have the data in storage. And now the data in storage can be read into any of those read replicas, or the write replica, for read operations and, for writes, transaction processing.

So like I said, we could take that compute tier and shut it down, and that data is still sitting in storage. Or we could take that compute tier and scale it out, and that's exactly what I'm going to show you now with the demo of Azure Flex DB. It's the first time it's been shown publicly: the dynamic scaling out of a database. And what this also does is let us break the database size barrier. So does anybody know the maximum size of an Azure SQL database today? Anybody? Anybody? Four? You could ask Cortana and she'd probably tell you. Four terabytes. So what I'm going to show you here: here's my Socrates database, and it's 50 terabytes. And we're going to 100. So this is where we are right now, we're working on scaling, so we're at 50; I wanted to give you a preview of what this is. You can see this is a premium L1 database with 250 DTUs. The DTUs are the amount of query processing that this database will support. In the past, like I mentioned, if you had an L1, to scale up to an L2 what had to happen was you had to shut down that L1, create the L2 on a bigger server, and then copy the data off of the L1 servers onto the L2 servers, an operation which could take tens of minutes or hours. And with a 50-terabyte database, that's going to be a while. But we want to be able to scale while this database is running, and without
having to copy the database. So what I'm going to do is go into... this is not... I'm getting lost in my demos. So that is over here. Here we go. So I'm going to send a SQL command into that database to scale it up to an L5. Execute that. Oops, I need to select the whole line. And this is going to take about two minutes, so we're going to come back and look at that in just a couple of minutes. But we're going to see this thing dynamically scale out without the hours that it would have taken otherwise.

So finally, I'm going to conclude by taking a look at Cosmos DB. Cosmos DB is a relatively new database. It's a NoSQL database, but I think it really represents the pinnacle, or kind of the cutting edge, of cloud-native databases. It's got a number of different characteristics which are unique to it in the public cloud. For one, it's a multi-model database. It supports a number of different APIs, from the Azure Tables API to a SQL JSON API to MongoDB. We announced Cassandra this week, and you're

likely to see even more. There's the Gremlin graph API. But besides supporting multiple APIs, it also supports multiple consistency levels. On a standard database you get strong consistency, but you sacrifice performance for that. So the NoSQL movement was saying, "Hey, we're going to be inconsistent, asynchronous, and that will let us scale better." Cosmos DB lets you go fully asynchronous or fully synchronous, and has three levels in between, including the default, which is session consistency. That's typically what most applications need: anybody participating in a session with the database sees the reads and the writes in the same order, but different applications with different sessions that aren't interacting with one another can be operating in parallel, and that allows your database to get much higher perform, ev
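The idea of session consistency described above can be illustrated with a toy model. This is a sketch under stated assumptions, not how Cosmos DB is implemented and not its SDK: the classes `Primary`, `Replica`, and `Session` and the integer session token are invented for illustration. The five level names in the enum come from the Cosmos DB documentation; the talk itself only names the session level.

```python
from enum import Enum


class Consistency(Enum):
    """The five documented levels, strongest first."""
    STRONG = "strong"
    BOUNDED_STALENESS = "bounded staleness"
    SESSION = "session"
    CONSISTENT_PREFIX = "consistent prefix"
    EVENTUAL = "eventual"


class Primary:
    def __init__(self):
        self.log = []  # ordered list of (key, value) writes

    def write(self, key, value):
        self.log.append((key, value))
        return len(self.log)  # version after this write


class Replica:
    def __init__(self):
        self.version = 0  # number of log entries applied so far
        self.data = {}

    def catch_up(self, primary, upto):
        upto = min(upto, len(primary.log))
        for key, value in primary.log[self.version:upto]:
            self.data[key] = value
        self.version = max(self.version, upto)


class Session:
    """Reads never go backwards past what this session has written or seen."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas
        self.token = 0  # highest version this session depends on

    def write(self, key, value):
        self.token = self.primary.write(key, value)

    def read(self, key):
        for replica in self.replicas:
            if replica.version >= self.token:
                self.token = replica.version  # keep later reads monotonic
                return replica.data.get(key)
        # No replica is current enough for this session: bring one up to date.
        replica = self.replicas[0]
        replica.catch_up(self.primary, self.token)
        return replica.data.get(key)


primary = Primary()
session_a = Session(primary, [Replica()])
session_b = Session(primary, [Replica()])

session_a.write("x", 1)
print(session_b.read("x"))  # a separate session may still see the old state
print(session_a.read("x"))  # the writing session always sees its own write
```

The second session is allowed to read from a lagging replica because it has no dependency on the write, which is where the extra parallelism and performance come from.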

2018-10-08 11:03


Comments:

The fun part starts @ 28:51

seven chrome tabs :DDDDDDDDDDDDDDDDDDDDDDDDD Mark, you are the best :D

Mark I have been following you before twitter since Sysinternals in the 90s! You rule man! Thanks for the great work!

hoooooooooooly
