# Using pre-built AI to solve business challenges | AIML20


It's all done through this filtering. So let's start with a simple image. This might have been the image of a bicycle, but just to make things simple, let's use this very simple image of a cross. It's nine pixels by nine pixels; the white pixels are plus ones and the black pixels are minus ones. We're going to apply a filter to that image, just like what happens in each of the nodes of a neural network.

To transform that image, we're going to apply a three-by-three grid of weights, and that represents our filter. Small grids like this, three by three or five by five, are pretty typical for computer vision systems. The only difference is that in real computer vision systems the weights aren't round numbers like the minus one and plus one I have in this grid. They'll typically be numbers in that range, but they'll look pretty random.

To apply that grid of weights to this image of a cross, we overlay the grid, centered on a particular pixel in the image. Then we multiply each of the weights by the corresponding pixel value and take the average. That average then becomes the pixel in the transformed image at the next layer down in the neural network. Pretty simple, right?

One thing you've probably noticed is that we can't use the very edges of the source image, because the weights grid wouldn't overlay them without running off the edge of the picture. That's one of the reasons the picture gets smaller and smaller as we go through the network. There are other reasons as well, but that's basically why we get smaller images as we go down.

Here's another example. I've moved that grid of weights two pixels to the right and two pixels down and overlaid it on the image again. Now when we multiply the weights by the pixel values, we get a different average: this time 0.55. That means in our output image we'd get a gray pixel. We then repeat that process over the entire image, and for all the copies of the image that flow through the neural network.
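The filtering step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exact cross image and filter from the slides are not reproduced in the transcript, so the diagonal cross, the diagonal filter, and the function name `apply_filter` are all hypothetical stand-ins.

```python
import numpy as np

def apply_filter(image, weights):
    """Slide `weights` over `image`; each output pixel is the average of
    the elementwise products between the weights and the pixels under them."""
    k_rows, k_cols = weights.shape
    out_rows = image.shape[0] - k_rows + 1  # edges are lost, so the output shrinks
    out_cols = image.shape[1] - k_cols + 1
    out = np.empty((out_rows, out_cols))
    for r in range(out_rows):
        for c in range(out_cols):
            patch = image[r:r + k_rows, c:c + k_cols]
            out[r, c] = np.mean(patch * weights)
    return out

# Assumed 9x9 image of a diagonal cross: +1 for white pixels, -1 for black.
image = -np.ones((9, 9))
for i in range(9):
    image[i, i] = 1.0
    image[i, 8 - i] = 1.0

# Assumed 3x3 filter: +1 along the diagonal, -1 elsewhere.
weights = np.where(np.eye(3) == 1, 1.0, -1.0)

transformed = apply_filter(image, weights)
print(transformed.shape)  # (7, 7): smaller than the 9x9 input
```

Wherever the patch under the grid matches the filter exactly, the average is 1.0; partial matches give values in between, which is where the gray pixels come from.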

By the way, this process of sweeping the weights across the image, row by row and column by column, has a fancy mathematical name: convolution. It's a really simple process, as we've just seen, but it's why these are called convolutional neural networks.

So now we know that each of those nodes, each of those circles in the neural network, is a transformation of its input images, and that transformation is in turn determined by a grid of weights. The trick to training a neural network to recognize objects, therefore, is to pick those weights in such a way that the right numbers come out at the end. The way we do that is with training data. We provide the neural network with lots and lots of images: in our case, lots of images of dogs, lots of images of bicycles, lots of images of apples, and lots of images of tennis balls. We know ahead of time what each of those images represents, because a human has looked at them and labeled them. Then all we need to do is pass all of those labeled images through the neural network and choose all the weights in such a way that the right label comes out with the highest number at the end each time. Often that's not possible, but at least we can choose weights so that the images are labeled correctly most of the time. That's the training process.

Now, in the real world there might be millions of weights to choose, and there might be millions of labeled images to process through the neural network. So how do we compute all of those weights? This is the point where most talks on machine learning and most books on machine learning start diving straight into the mathematics. This is where we'd start talking about things like backpropagation, learning rates, and cost functions. But unless you're an AI researcher, you can ignore all of that, for a couple of reasons.

First, if you did want to dive into the math, there are lots of great tools that will do those computations for you, and they'll do it by taking advantage of big computing resources and things like GPUs. You might have heard of tools like TensorFlow or PyTorch that can do those kinds of things. I'm not going to talk about those, but the AIML40 talk later in the day by Ari Bornstein will, so check that out if that's what you're interested in.

Also, even if you wanted to make use of those tools, you would need some good computing resources, as I just mentioned. You'd also need a team of AI researchers who could make use of them. But instead of setting up that team and all those resources, you could just make use of somebody else's research department.

We've done this just by using a web form: uploading a picture and looking at the results on the screen. But if you want to embed this capability into an app, you'll want to do it programmatically, so let's look at how we can do that now.

You can interface with the Cognitive Services APIs using any language that can connect to an HTTP endpoint. What I have here is a bash script that uses the Azure CLI to create the resources and keys I need, and then connects to the Computer Vision API using curl. Of course, you can install the Azure CLI in your local shell, but here I'm using the Azure Account extension in Visual Studio Code to launch a Cloud Shell, which means I haven't had to install anything in my local environment. Once that shell is ready, I can execute commands like this directly from the script.

With this command, I'm just creating a resource group, which is going to hold the keys I need to access Cognitive Services. Now, with az cognitiveservices, I'm generating those keys, running that code directly in the Cloud Shell. Now I just need to display those keys so I can get access to them. As Seth said, I've invalidated all these keys, so no need to copy those down.

Let's copy that first key (you can use either of those two keys) and put it into an environment variable for use in the next part of the script. I also need the endpoint where I can access the cognitive service online. That's provided to us by the service itself, and it depends on which region you're running it in. Now I can grab the URL of an image. That's the same image of the man in the hard hat we saw just a moment ago; I'm just grabbing it off the web instead of from a file this time. Then I'm passing the key to the endpoint, with the URL of the image as a JSON string, and then having a look at the results we get back from curl. You can see what we have there, and that's exactly the same output you saw in the web interface a moment ago, where the first two things detected were "man" and "headdress". Now you can see them.

Let's try that again with a different image, since we're doing things programmatically this time. All I need to do is change my environment variable for the image to a different image. In this case, we're going to point to a web version of that picture of the drill. Now let's pass that different image into the same code to run Cognitive Services on that new image. If we scroll up to the top to have a look at this output, you'll see what the Cognitive Services API returned for us. There we go. Looking at the tags for the image, you can see the first one it has classified for us is "camera", and that's also incorrect. We wanted to get "drill" back from that so we could search for it in the API.

So what went wrong there, and how can we fix it? Long story short, the basic problem is that we're using too powerful a neural network. It can detect so many different kinds of objects that it can mistake one thing for another. But in our particular case, we just want to build an app that can recognize a specific set of objects.
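The talk makes this call with curl from a shell script. The same request can be sketched in Python, since any language that can reach an HTTP endpoint will do. Treat this as a sketch under assumptions rather than the script from the demo: the `v3.2` path is one published version of the Analyze Image operation and may differ from the version used in the talk, and the function name, the `VISION_ENDPOINT` and `VISION_KEY` environment variable names, and the placeholder URLs are all hypothetical.

```python
import json
import os
import urllib.request

def build_analyze_request(endpoint, key, image_url):
    """Build (but do not send) a POST request asking Computer Vision to tag an image."""
    # Assumed API path and version; check the endpoint your own resource reports.
    url = endpoint.rstrip("/") + "/vision/v3.2/analyze?visualFeatures=Tags"
    body = json.dumps({"url": image_url}).encode("utf-8")  # image URL as a JSON string
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # one of the two keys from the resource
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

# Placeholder values; in practice the key and endpoint come from your own resource.
request = build_analyze_request(
    os.environ.get("VISION_ENDPOINT", "https://example.invalid"),
    os.environ.get("VISION_KEY", "placeholder-key"),
    "https://example.invalid/drill.jpg",
)

# Sending is left commented out because it needs a real key and endpoint:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["tags"])
```

Switching to a different picture only means changing the image URL argument, which mirrors changing the environment variable in the demo.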

Let's say Tailwind Traders sells only five different kinds of objects. So let's train a specific neural network that's just meant to detect those kinds of things and won't get things mixed up. Fortunately, we can fix it using neural networks once again.

What if I told you that there's a way we can start with a vision model that's been trained on many, many thousands of images, just as we saw, but then adapt it to identify just the objects you're interested in for your application? Even if those objects weren't part of the original training, even if the original neural network wasn't able to detect those kinds of objects. They might be something completely unique, like your own brand or a brand-new product. You can train one of these computer vision systems to detect even those. Let's see how it works.

Let's go back to the neural networks. We've got the same convolutional neural network from before, but something is different. As you can see, we've stripped off that final layer, so the final layer with the object classifications has gone. What we're left with is just the images that were generated at that second-to-last layer. I've labeled them F1, F2, F3, and so forth. Let's even forget the fact that they're images. Rather than thinking of them as the three-by-three images they were by the time we got down to there, let's just think of them as data: nine data points.

So what we can do is feed a new image in on the left-hand side and do all the same calculations with the same weights as before. At the end of the day, what we get out is not a classification but a sequence of nine data vectors that somehow represent that image. Now, we don't really know what these features are in any real sense. All we know is that those features were useful in classifying the millions of images that the original neural network was trained to classify, and it turns out they're also useful for classifying
other kinds of images as well. Just as examples of what those features might do: one feature might detect greenness in an image. That feature turned out to be useful in detecting things like tennis balls or lawns, but it's also useful for detecting other kinds of green things. Another one of those features might detect circular shapes in an image. That turned out to be useful for detecting bicycles, but it's also useful for detecting hula hoops and other kinds of things as well. So we can use that fact, and here's the trick: we can take those features that are generated by the old neural network, collect the values with new images, and use that to build a new model, as we'll see here.

Let's suppose we wanted to train a brand-new model, based on the old neural network, to identify hammers and hard hats. We can pass in an image of a hammer on the left, run it through the neural network, and collect all the features on the right. That'll give us eight data vectors, one for each feature. It also gives us a binary variable, a 1 or a 0, which we'll get for each image we put through. Same thing with hard hats: take a picture of a hard hat, process it through the network, collect the features, and get another binary variable, 1 or 0, for hard hat or not. Let's do that for a few dozen or a few hundred images. So what do we get?

Now, if you're a statistician or a data scientist, you probably know where I'm going with this. After we process each of our few hundred images, we get a sequence of values: features and a binary indicator. What could we do with that? We could build a logistic regression model. We could build a one-layer neural network, a very simple predictive model. We could take those generated features and our binary values and build a new predictor just from that. And it turns out that works really well. You don't even need a lot of data. A few dozen images will often do the trick, as long as the things you're trying to detect are visually distinct and those features can identify them. You also don't need a lot of computing power. It's a simple logistic regression or a one-layer neural network, so you don't need a lot of power to do those computations.

Now, this is obviously a toy example. You'll likely want to identify more than two objects, and the underlying neural network will generate many more than just eight features at its second-to-last layer. But the principle remains: you can do this with modest data and modest computing power, and it often works really well.
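The transfer learning idea described above, a logistic regression fitted to features from the second-to-last layer, can be sketched as follows. This is a toy under loud assumptions: no real network is involved, the eight "features" per image are synthetic random numbers standing in for whatever a pre-trained network would produce, and names such as `train_logistic_regression` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for features collected from a pre-trained network:
# 50 "hammer" images and 50 "hard hat" images, eight features each.
n_per_class, n_features = 50, 8
hammers = rng.normal(loc=-1.0, scale=1.0, size=(n_per_class, n_features))
hard_hats = rng.normal(loc=1.0, scale=1.0, size=(n_per_class, n_features))

X = np.vstack([hammers, hard_hats])
y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])  # 1 = hard hat

def train_logistic_regression(X, y, learning_rate=0.1, steps=500):
    """Fit weights and a bias by plain gradient descent on the log loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probability of hard hat
        w -= learning_rate * (X.T @ (p - y)) / len(y)
        b -= learning_rate * np.mean(p - y)
    return w, b

w, b = train_logistic_regression(X, y)
predictions = (X @ w + b) > 0  # True means "hard hat"
accuracy = np.mean(predictions == (y == 1))
print(f"training accuracy: {accuracy:.2f}")
```

Only the small final model is trained here; the expensive network that produced the features would be left untouched, which is why modest data and modest computing power can be enough.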
Of course, you don't have to build that transfer learning model (that's what it's called) by yourself. You can use the advanced vision models we just saw in Computer Vision as the base, then provide your own images and classifications, and use the service called Custom Vision. Just like Computer Vision, you can train transfer learning models using a web-based UI, and that's what I'm about to do here. But you can also do it programmatically using curl, directly through the API. So now I'm going to use Microsoft Cognitive Services Custom Vision to train a model for that Shop by Photo feature we saw earlier on the Tailwind Traders website.

Here I am in the Custom Vision web-based interface, which gives us this nice UI where we can provide new images for the transfer learning and analysis. As you can see, in this project I've already uploaded a number of pictures. We've got pictures of screwdrivers here. Further down we've got pictures of pliers. I've also got drills and hammers. I'm going to use those to train my custom model. Let's also add one other classification, which is hard hats.

I'm going to click on the Add images button up there to provide some new images directly from a folder on my hard drive. There we go. I'm going to browse to the hard hats folder, press Ctrl-A to select them all, and click Open. Now it's uploading those images of the hard hats to the service as well. But I've got to label them, as a human, so I'm going to put the hard hat label associated